
Master PyTorch for deep learning through hands-on coding, covering fundamentals, neural networks, computer vision, transfer learning, and deployment, with milestone Food Vision projects to build your professional portfolio.
Discover how deep learning fits within machine learning and gain hands-on experience writing PyTorch code to find patterns in data through supervised learning.
Join our online classroom to boost accountability, engage with a learning community on Discord, form accountability buddies, and finish the course with support from instructors and mentors.
Discover free ZTM resources for learners: coding challenges, open source projects, a Discord server, and campus events. Access cheat sheets, a blog, and monthly industry newsletters.
Learn why machine learning turns inputs into numbers and is valuable for complex problems where writing exhaustive rules is impractical, such as self-driving cars.
Apply Google's rule that if you don't need machine learning, don't use it, and learn when machine learning or deep learning is a good fit: problems with long rule sets, changing environments, or large collections of data.
Explore how traditional machine learning handles structured data with gradient boosted machines like XGBoost, while deep learning uses neural networks for unstructured data, all built with PyTorch.
Explore the anatomy of neural networks, from input and hidden layers to output, and how data are numerically encoded into learned representations or features.
Focus on supervised learning and transfer learning with PyTorch, contrasting labeled data with unlabeled data, and introducing self-supervised and reinforcement learning paradigms.
Explore what deep learning can do in real-world tasks, from recommendations to translation, speech recognition, computer vision, and natural language processing, with notes on sequence-to-sequence, classification, regression, and PyTorch.
Learn why PyTorch is the most popular research deep learning framework, offering Python-based, GPU-accelerated modeling with access to Torch Hub models and a complete stack from data preprocessing to deployment.
Learn that tensors are building blocks of deep learning, encoding images, text, and audio, processed by a neural network, and converted into outputs humans understand with PyTorch.
Explore PyTorch basics, focusing on tensors, tensor operations, and preprocessing data into tensors. Build, train, and evaluate models, use pre-trained neural networks, make predictions on custom data, and save models.
Learn to approach this course by writing PyTorch code and coding along. Explore, experiment, and visualize data; ask questions, do the exercises, and share your work.
Access three core resources: the GitHub materials, the course Q&A, and the online book. Rely on the PyTorch website and forums as supplementary references.
Set up PyTorch code using Google Colab, where notebooks come pre-installed with PyTorch and common data science packages, and practice on CPU or GPU with CUDA considerations.
Explore PyTorch tensors, from scalars to matrices, and learn dimensions, shape, and indexing while coding in Colab with optional GPU.
Explore how tensors in PyTorch represent data from scalars to multi-dimensional arrays, create random tensors with height, width, and color channels, and understand that networks start with random numbers.
Create tensors of zeros and ones in PyTorch, explore using size or default shape, and learn how masking works via multiplying by zero.
Explore creating ranges and tensors shaped like other tensors in PyTorch, using torch.arange instead of the deprecated torch.range, and matching shapes with zeros_like and the other _like methods.
Learn to create tensors in PyTorch, set data types (float32 and float16), and manage device and requires_grad to control precision, memory use, and gradient tracking.
Discover how to inspect tensor attributes in PyTorch, including data type, shape, and device, and navigate common data type errors that arise during training.
Master tensor operations in PyTorch, including addition, subtraction, multiplication, division, and matrix multiplication. Understand how tensor data type, shape, and device influence neural network computations and learning.
Learn the difference between element-wise and matrix multiplication (dot product) in neural networks, with practical PyTorch examples comparing for loops versus vectorized torch.matmul for speed.
Learn matrix multiplication in PyTorch by applying two rules: inner dimensions must match, and the result has the shape of the outer dimensions, using the @ operator or torch.matmul.
Learn how matrix multiplication requires matching inner dimensions, fix shape errors by transposing tensors in PyTorch (torch.mm), and visualize how 3x2 and 2x3 yield a 3x3 result.
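As a minimal sketch of the two shape rules and the transpose fix, assuming two arbitrary 3x2 tensors (the names tensor_a and tensor_b and their values are illustrative, not taken from the course):

```python
import torch

# Hypothetical example tensors; the values are arbitrary.
tensor_a = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])     # shape (3, 2)
tensor_b = torch.tensor([[7.0, 10.0], [8.0, 11.0], [9.0, 12.0]])  # shape (3, 2)

# tensor_a @ tensor_b would raise a shape error: (3, 2) @ (3, 2) has inner dimensions 2 and 3.
# Transposing tensor_b gives shape (2, 3), so the inner dimensions match (2 and 2).
result = torch.matmul(tensor_a, tensor_b.T)

# The result takes the outer dimensions: (3, 2) @ (2, 3) -> (3, 3).
print(result.shape)
```

The @ operator computes the same product, so tensor_a @ tensor_b.T could be used interchangeably here.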
Learn tensor aggregation in PyTorch by computing min, max, mean, and sum with built-in methods, illustrated with a range tensor and data type adjustments to float32.
Learn to locate the positional min and max of tensors using argmin and argmax, managing data types to ensure operations like mean and softmax run correctly.
Master reshaping, viewing, and stacking tensors in PyTorch, using reshape, view, vstack, and hstack, and apply squeeze and permute to manage tensor dimensions.
Explore how to manipulate tensor dimensions in PyTorch using squeeze, unsqueeze, and permute, including shapes, memory sharing, and applying these techniques to image data with height, width, and color channels.
Master indexing in PyTorch to select data from multi-dimensional tensors, using examples like x[0], x[0,0], and x[0,0,0], and learn how views share memory across tensors.
Explore converting numpy arrays to torch tensors with torch.from_numpy and back using tensor.numpy, and understand how numpy float64 and torch float32 affect dtype and memory, guiding necessary conversions.
Learn to achieve reproducible experiments in PyTorch by controlling pseudo-randomness with a random seed, creating reproducible random tensors, and understanding why deterministic computers produce only pseudo-random numbers.
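One way to see the effect of a random seed is to create the same "random" tensor twice. This is a small sketch; the seed value 42 and the tensor shape are arbitrary choices for illustration:

```python
import torch

RANDOM_SEED = 42  # any fixed integer works; 42 is an arbitrary choice

# Seed the generator, then draw a random tensor.
torch.manual_seed(RANDOM_SEED)
random_a = torch.rand(3, 4)

# Reset the seed before the second draw so the generator starts from the same state.
torch.manual_seed(RANDOM_SEED)
random_b = torch.rand(3, 4)

# With the seed reset, both tensors hold identical values.
same = torch.equal(random_a, random_b)
print(same)
```

Without the second torch.manual_seed call, the generator would continue from its advanced state and the two tensors would differ.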
Learn how to run PyTorch tensors on GPUs for faster computation using CUDA and Nvidia hardware, and compare options like Google Colab, own GPU, and cloud services, with device-agnostic code.
Learn to write device-agnostic PyTorch code and move tensors and models between CPU and GPU, using CUDA, Colab, and simple device checks.
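A common device-agnostic pattern can be sketched as follows, assuming a tiny stand-in linear model (the model and input here are placeholders, not the course's code):

```python
import torch
from torch import nn

# Use the GPU when CUDA is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder data and model, both moved to the chosen device.
x = torch.arange(4, dtype=torch.float32).unsqueeze(dim=1).to(device)
model = nn.Linear(in_features=1, out_features=1).to(device)

# Forward pass on whichever device was selected.
with torch.inference_mode():
    preds = model(x)

# Move results back to the CPU before handing them to NumPy or Matplotlib.
preds_cpu = preds.cpu()
print(device, preds_cpu.shape)
```

Because the device is chosen once and reused, the same script runs unchanged on Colab with or without a GPU runtime.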
Practice PyTorch fundamentals with guided exercises, documentation reading, and extra-curriculum tasks, including creating random tensors, performing matrix multiplications, and using Colab templates and the GitHub extras.
Explore the PyTorch workflow from preparing data as tensors to selecting a loss function, building a training loop, evaluating, and saving or exporting trained models.
Learn an end-to-end PyTorch workflow in Colab, from data loading and building a model to training, evaluating or inferring, and saving or loading the trained model with reference notebooks.
Explore data preparation and loading in PyTorch, encoding inputs into numerical representations and training a simple linear regression model to learn weights and bias.
Split data into training and test sets to train models on seen data and assess generalisation. Note training proportions (60–80%) and test (10–20%), with an optional validation set.
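A simple index-based 80/20 split might look like the sketch below. The straight-line data and the weight and bias values (0.7 and 0.3) are assumptions chosen for illustration:

```python
import torch

# Illustrative straight-line data: 50 samples with assumed parameters.
weight, bias = 0.7, 0.3
X = torch.arange(50, dtype=torch.float32).unsqueeze(dim=1) * 0.02
y = weight * X + bias

# Take the first 80% of samples for training and the remainder for testing.
train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]

print(len(X_train), len(X_test))
```

For data where order matters less, a shuffled split (for example with scikit-learn's train_test_split) is often preferable to slicing by index.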
Visualize training and test data with a custom plot function, plotting blue training points, green test points, and red predictions to compare model outputs against actual values.
Create a linear regression model in PyTorch by implementing a neural network module with weights, bias, and a forward method. Train it with gradient descent and backpropagation on your data.
Explore how PyTorch builds a simple linear regression model with random weight and bias, tracks gradients via autograd, and uses gradient descent to fit parameters.
Explore core PyTorch model building classes, including nn.Module and the forward method, and learn how optimizers drive gradient descent for backpropagation in neural networks with data sets and loaders.
Explore inspecting a PyTorch model's internals by examining parameters, weights, and biases, using a random seed to ensure reproducible linear regression training.
Make predictions with a randomly initialized PyTorch model using inference mode. Observe the faster forward passes on X_test and compare gradient tracking to no_grad during inference.
Explore how a loss function measures prediction error and guides an optimizer to adjust model parameters, moving from random initialization toward better data representation through training and testing loops.
Explore how to train a PyTorch model by setting up an L1 loss (mean absolute error) and an optimizer such as SGD to minimize the loss via a training loop.
Explore how a PyTorch training loop uses a loss function and an optimizer to adjust model parameters through forward and backward passes, guided by learning rate and gradient descent.
Implement the forward pass in a PyTorch training loop, compute the mean absolute error, then zero gradients, backpropagate, and take an optimizer step to update the model.
Explore the training loop in PyTorch, detailing forward pass, loss computation, zeroing gradients, backward propagation, and optimizer step across epochs to converge toward minimum loss.
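The five steps of the loop can be sketched on a toy linear regression problem. The data, learning rate, and epoch count below are illustrative assumptions rather than the course's exact settings:

```python
import torch
from torch import nn

torch.manual_seed(42)

# Illustrative straight-line data (assumed weight 0.7, bias 0.3).
X = torch.arange(50, dtype=torch.float32).unsqueeze(dim=1) * 0.02
y = 0.7 * X + 0.3

model = nn.Linear(in_features=1, out_features=1)
loss_fn = nn.L1Loss()                                     # mean absolute error
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

losses = []
for epoch in range(100):
    model.train()
    y_pred = model(X)           # 1. forward pass
    loss = loss_fn(y_pred, y)   # 2. compute the loss
    optimizer.zero_grad()       # 3. zero the gradients from the previous step
    loss.backward()             # 4. backpropagation
    optimizer.step()            # 5. gradient descent update
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f} | last loss: {losses[-1]:.4f}")
```

Zeroing the gradients each iteration matters because PyTorch accumulates gradients across backward calls by default.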
Watch the training loop update model parameters through backpropagation and gradient descent, then evaluate predictions with the testing loop and loss function choices.
Learn to write a testing loop in PyTorch with inference mode and no grad, run forward passes on test data, and compute test loss across epochs.
Master the training and testing loop in PyTorch by tracking epoch losses, refining model parameters with backpropagation and gradient descent, and visualizing loss curves to gauge progress.
Learn to save and load PyTorch models by saving the state dict with torch.save, loading with torch.load, and applying model.load_state_dict for reuse and inference.
Learn to save a PyTorch model, load its state dict with torch.load, instantiate a new model, apply load_state_dict, and verify predictions in evaluation mode.
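The save-and-reload round trip can be sketched as below. The file name and the temporary directory are hypothetical choices so that the example cleans up after itself:

```python
import tempfile
from pathlib import Path

import torch
from torch import nn

torch.manual_seed(42)
model = nn.Linear(in_features=1, out_features=1)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name; .pth and .pt are common extensions.
    model_path = Path(tmp_dir) / "linear_model.pth"

    # Save only the learned parameters (the state dict), not the whole object.
    torch.save(obj=model.state_dict(), f=model_path)

    # Instantiate a fresh model of the same class, then load the saved parameters.
    loaded_model = nn.Linear(in_features=1, out_features=1)
    loaded_model.load_state_dict(torch.load(f=model_path))

# Verify that both models make identical predictions in evaluation mode.
model.eval()
loaded_model.eval()
x = torch.tensor([[0.5]])
with torch.inference_mode():
    match = torch.equal(model(x), loaded_model(x))
print(match)
```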
Revisit the PyTorch workflow from data preparation to training with a loss function and optimizer. Practice saving and loading a model and writing device-agnostic code for CPU or CUDA GPUs.
Create a dummy data set and build a PyTorch linear model to learn from training data using y = weight times features plus bias, with 80/20 train-test split and plots.
Build a linear regression model by subclassing torch.nn.Module and using a linear layer with one in feature and one out feature, implementing the forward pass.
Train a PyTorch linear model on CUDA or CPU with device-agnostic code, using L1 loss and SGD, implementing a full training and evaluation loop and monitoring progress.
Make predictions on test data in evaluation mode, visualize results, and move tensors to CPU for plotting, preparing you for saving and loading the model state in the next part.
Learn to save and load a trained PyTorch model by saving the state dict, creating model paths, loading to the correct device, and validating predictions.
Recognize imposter syndrome as a symptom of learning that signals growth, and reinforce progress through practice while teaching others on Discord.
Explore the PyTorch workflow with exercises and extra curriculum, including notebooks, Colab templates, and solutions, to solidify data preparation, model building, training, and saving and loading trained models.
Explore neural network classification with PyTorch, covering binary, multiclass, and multi-label problems, input/output shapes, features and labels, and training workflows including loss functions, optimizers, and model saving or loading.
See how images are numerically encoded as tensors with width, height, and color channels, then mapped to prediction probabilities across three classes like sushi, steak, or pizza.
Explore the typical architecture of a classification neural network in PyTorch, from input feature vectors to probability outputs, covering binary and multiclass setups, activations, and cross entropy losses.
Explore creating a toy classification dataset with PyTorch in Colab, using the make_circles dataset from the scikit-learn library to practice binary neural network classification.
Convert numpy data to PyTorch tensors and set input/output shapes for a neural network, then split into training and test sets using train_test_split.
Learn to set up device-agnostic PyTorch code, build a model by subclassing nn.Module, and craft a training loop with a loss function and optimizer to classify data using 80/20 split.
Learn to implement device agnostic code in PyTorch by subclassing nn.Module to build a small neural network with two linear layers, a forward method, and deployment on a target device.
Build and visualize a multi-layer neural network with two input features, a five-neuron hidden layer, and one output, illustrating the forward pass, activations, and learning parameters.
Compare two ways to build a PyTorch model: a simple two-layer neural network in nn.Sequential and a subclassed module, and inspect the weights, shapes, and forward passes.
Learn to choose and implement loss functions and optimizers for a binary classification network in PyTorch, using binary cross entropy with logits, sigmoid activation, SGD, and Adam, with accuracy evaluation.
Convert logits to prediction probabilities with sigmoid or softmax, then derive prediction labels from those probabilities, choose between BCE and BCE with logits loss, and implement a full PyTorch training loop.
Learn to train a PyTorch classification model with a full training and testing loop, converting logits to predictions using sigmoid and BCE with logits loss.
Train a PyTorch classification model for 100 epochs and visualize its predictions with a helper function to plot the decision boundary, diagnosing why learning stagnates on balanced data.
Improve a PyTorch neural network by adding layers and hidden units, training, and adjusting learning rate and loss, testing optimizers like SGD or Adam to separate blue and red dots.
Explore how to improve a model from a model perspective by adjusting hyperparameters, adding layers and hidden units, and extending training epochs, while tracking experiments.
Upgrade circle model v0 to v1 by adding hidden units and an extra layer, then train with BCE with logits loss and SGD to assess performance improvements.
Create a straight line dataset to test if the PyTorch model can learn anything, using 80/20 train/test splits, linear regression, and data visualization to troubleshoot learning.
Build and train an nn.Sequential model to fit a straight line data set, adjust input features for regression, apply L1 loss and SGD, and evaluate on the test data.
Demonstrate model two's capacity to learn with straight line data by adjusting the loss function and learning rate in a device-agnostic training loop, then perform inference, predict, and plot results.
Explore why non-linearity unlocks neural network power by combining linear and nonlinear functions to model circular data, using PyTorch, activation functions, and train-test splits.
Build a classification model with non-linearity by using two hidden layers of ten units each in PyTorch, inserting non-linear activations like ReLU and sigmoid between layers.
Train a non-linear PyTorch model for binary classification using BCE loss with logits and sigmoid on blue and red dots; explore learning rate and ReLU activation effects on training.
Apply non-linear activation to make predictions with model three, evaluate performance by visualizing decision boundaries on training and test data, and compare linear versus non-linear models to improve accuracy.
Replicate nonlinear activation functions like ReLU and Sigmoid using pure PyTorch. Visualize outputs, compare with PyTorch's ReLU and Sigmoid, and see how linear and nonlinear functions build neural networks.
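A minimal sketch of hand-written activations, checked against PyTorch's built-ins. The function names relu and sigmoid and the input range are illustrative:

```python
import torch

def relu(x: torch.Tensor) -> torch.Tensor:
    # ReLU keeps positive values and replaces negatives with zero.
    return torch.maximum(torch.tensor(0.0), x)

def sigmoid(x: torch.Tensor) -> torch.Tensor:
    # Sigmoid squashes any real number into the range (0, 1).
    return 1 / (1 + torch.exp(-x))

# Illustrative input: a straight line from -10 to 9.
A = torch.arange(-10, 10, 1, dtype=torch.float32)

# Compare the hand-written versions with the built-in functions.
relu_matches = torch.allclose(relu(A), torch.relu(A))
sigmoid_matches = torch.allclose(sigmoid(A), torch.sigmoid(A))
print(relu_matches, sigmoid_matches)
```

Plotting relu(A) and sigmoid(A) with Matplotlib shows how each function bends the straight-line input.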
Create a four-class dataset with two features using make_blobs, visualize the data, and split it into train and test sets. Apply softmax and cross-entropy to train a multiclass classifier in PyTorch.
Create a PyTorch multiclass classification model that handles four classes, defines input features, hidden units, and output features, and uses softmax, cross entropy loss, and device agnostic training.
Set up a multiclass classification with PyTorch by defining a cross entropy loss and SGD optimizer, configure input and output features, and explore training loop basics.
Learn to convert multiclass model logits to prediction probabilities with softmax, then to prediction labels via argmax, and build training and testing loops in PyTorch.
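The logits to probabilities to labels conversion can be sketched with hand-picked values. The logits below are made up for illustration, standing in for the raw outputs of a four-class model on two samples:

```python
import torch

# Hypothetical raw model outputs (logits) for 2 samples and 4 classes.
logits = torch.tensor([[2.0, 0.5, -1.0, 0.1],
                       [-0.3, 0.2, 3.1, 0.0]])

# Softmax across the class dimension turns logits into probabilities that sum to 1.
pred_probs = torch.softmax(logits, dim=1)

# Argmax picks the index of the highest probability as the predicted class.
pred_labels = pred_probs.argmax(dim=1)

print(pred_probs.sum(dim=1))
print(pred_labels)
```

Since softmax preserves the ordering of values, taking argmax directly on the logits yields the same labels; the softmax step is useful when the probabilities themselves are needed.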
Train a multiclass PyTorch model end-to-end, converting logits to probabilities with softmax, building a training and testing loop, and debugging data type and shape issues to improve accuracy.
Learn how to make predictions with a multiclass PyTorch model, convert logits to probabilities, determine the predicted classes, and evaluate using classification metrics and visualizations.
Explore beyond accuracy by examining precision, recall, F1 score, confusion matrices, and classification reports for multiclass classification in PyTorch, with guidance on imbalanced data and using torchmetrics.
Practice hands-on coding with seven PyTorch classification exercises and an extra curriculum, including skeleton code, datasets, and data frame preparation, plus reference notebooks for solutions.
Explore what makes a computer vision problem and how PyTorch and torchvision power end-to-end CNN workflows for multiclass image classification, object detection, segmentation, training, and evaluation.
Explore how computer vision converts images into tensors, with 224x224x3 inputs and three-class outputs (sushi, steak, pizza), and build CNNs in PyTorch with proper data pipelines.
Explore how convolutional neural networks process images by typical CNN architecture with input and output shapes, convolutional layers, nonlinear activation, pooling, and PyTorch code-first implementation.
Explore the core PyTorch computer vision stack by importing torchvision, datasets, transforms, and pre-trained models; learn to prepare images as tensors and set up data loaders for vision tasks.
Learn to acquire and prepare computer vision datasets with PyTorch using torchvision, FashionMNIST and other MNIST variants, including the ToTensor transform and train/test splits; understand input and output shapes.
Explore random grayscale 28x28 fashion‑MNIST samples from Zalando Research in PyTorch, learn input and output shapes with color channels first, and visualize and inspect class labels for model readiness.
Turn the clothing dataset of 60,000 training and 10,000 test images into mini-batches of 32 with a data loader, shuffling the training data so the model cannot learn the data order.
Turn datasets into data loaders by configuring a batch size of 32, enabling shuffle for training, and inspecting train and test loaders to understand batch shapes.
Create a baseline computer vision model with a flatten layer and two linear layers, using 28 by 28 images flattened to 784 features, producing ten class logits.
Set up a multi-class loss with cross-entropy, configure SGD optimizer, and evaluate accuracy to train a baseline PyTorch model for clothing type recognition on 28x28 grayscale images.
Create a timing function using Python's time module to measure training time and compare CPU versus GPU performance while tracking loss and accuracy for your PyTorch model.
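A timing helper along these lines might look like the sketch below. The function name print_train_time and the stand-in workload are assumptions for illustration:

```python
import time

def print_train_time(start: float, end: float, device: str = "cpu") -> float:
    """Prints and returns the difference between two timestamps in seconds."""
    total_time = end - start
    print(f"Train time on {device}: {total_time:.3f} seconds")
    return total_time

# Time a stand-in workload; in practice this would wrap the training loop.
start_time = time.perf_counter()
_ = sum(i * i for i in range(100_000))
end_time = time.perf_counter()

elapsed = print_train_time(start=start_time, end=end_time, device="cpu")
```

Passing the device name keeps CPU and GPU timings easy to tell apart when comparing experiments.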
Create a PyTorch training loop that trains on batched data via a data loader, updates per batch, and evaluates with train and test losses and accuracy across epochs.
Learn to build a reusable evaluation function in PyTorch that computes loss and accuracy on a data loader and returns a results dictionary for model comparisons.
Set up device-agnostic PyTorch code to run on GPU when available, perform CUDA checks, and run experiments on Google Colab with a small dataset and nonlinear functions.
Build a nonlinear model with ReLU activations to capture nonlinear data, experimenting with a multi-layer sequential neural network and comparing it to a linear baseline using device-agnostic code.
Explore model 1 by adding nonlinear layers, then create a cross entropy loss and SGD optimizer for multiclass classification. Build training and testing steps to assess accuracy and guide experiments.
Turn your training and evaluation loops into reusable functions; implement train step and test step with a model, data loader, loss function, optimizer, and accuracy metric, enabling device-agnostic, epoch-based training.
Turn the testing loop into a reusable test_step function using a model, test data loader, and cross-entropy loss, evaluated in inference mode to compute average loss and accuracy.
Combine training and test steps into an optimization loop for CPU and GPU. Measure training time across epochs with a data loader, loss, optimizer, and accuracy.
Create a results dictionary for model one using the test data loader, loss, and accuracy functions, and fix a runtime device mismatch to compare results with model zero.
Explore a convolutional neural network (CNN) architecture for image data, detailing input, convolutional and pooling layers, nonlinear activations, and a linear output layer, with PyTorch coding in the next video.
Build your first convolutional neural network in PyTorch by coding a TinyVGG CNN with two blocks, nn.Conv2d, nn.MaxPool2d, and a final classifier, showcasing kernel size, stride, and padding.
Explore conv2d and maxpool2d operations in PyTorch by stepping through a conv2d layer with a 3x3 kernel, stride, padding, and tensor shapes, using dummy data to illustrate input–output transformations.
Explore how nn.MaxPool2d operates in a convolutional neural network, tracing input shapes, kernel size, and the max selection to compress features.
Learn to determine the input and output shapes of every layer in a PyTorch CNN by using a dummy tensor and printing intermediate shapes during a forward pass.
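The dummy-tensor technique can be sketched on a single convolutional block. The layer sizes below (10 output channels, a 3x3 kernel, a 28x28 grayscale input) are illustrative assumptions:

```python
import torch
from torch import nn

# A small illustrative block: convolution, non-linearity, then max pooling.
conv_block = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),
)

# Dummy input shaped (batch, color_channels, height, width).
dummy = torch.randn(1, 1, 28, 28)

# Pass the dummy tensor through one layer at a time and record each output shape.
shapes = []
x = dummy
with torch.inference_mode():
    for layer in conv_block:
        x = layer(x)
        shapes.append(tuple(x.shape))
        print(f"{layer.__class__.__name__}: {tuple(x.shape)}")
```

With padding of 1 the 3x3 convolution preserves height and width, and the 2x2 max pool halves them, which is the kind of information needed to size the first linear layer of a classifier.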
Set up a cross-entropy loss and SGD optimizer for a CNN, train on grayscale fashion images, and implement the training loop with train and test steps.
Train a first convolutional neural network using reusable train and test steps, measure GPU training time with CUDA, and assess model accuracy on the test set after three epochs.
Compare results across modelling experiments using a pandas dataframe to contrast accuracy, loss, and training time for the baseline, the GPU model with non-linearity, and the TinyVGG CNN.
Visualize predictions from the best model by sampling random test images, converting logits to probabilities with softmax, and comparing predicted labels against true labels to evaluate performance.
Visualize model predictions by plotting random test samples in a 3x3 grid with predicted and true labels, then color the title green for correct and red for incorrect.
Visualize and evaluate a multiclass PyTorch model by building and plotting a confusion matrix, using torchmetrics and mlxtend, and generating predictions on the full test set.
Import torchmetrics and mlxtend, compute and plot a confusion matrix to compare predictions to targets on the test data for ten classes, revealing misclassifications.
Save and load your best performing PyTorch model by exporting its state_dict to a models directory and reloading it to verify results with torch.isclose.
Recap end-to-end computer vision with PyTorch, covering data loading, the baseline model, CNNs, evaluation with confusion matrices, saving models, and the exercises plus extra curriculum resources.
Explore how to load and preprocess a custom dataset with PyTorch to build a Food Vision Mini model that classifies pizza, sushi, and steak.
Learn how to import torch, verify PyTorch version, and set up device-agnostic code to run on CPU or GPU in Google Colab, including enabling CUDA when available.
Download and prepare a pizza, steak, and sushi dataset for PyTorch, based on Food 101, and practice building a small 3-class, 10% subset from a zip file.
Prepare and explore data for PyTorch with the Food 101 dataset and a train‑test folder structure. Demonstrate image folder data formats, walking directories, and visualizing samples to understand data layout.
Visualize a random image from a labeled train and test dataset, determine its class by its parent folder, and inspect its metadata using Python, Pillow, and pathlib.
Visualize a random image with Matplotlib by converting it to a NumPy array. Inspect its height, width, and color channels, and compare channel orders for PyTorch workflows.
Transform data from images into tensors using torchvision transforms, resizing to 64x64, applying random horizontal flips, and loading via torch.utils.data Dataset and DataLoader.
Visualize transformed images using the transforms module and compose, compare original versus transformed images, and learn how data augmentation and tensor conversion prepare data for a model.
Load image data using ImageFolder, apply transforms to resize, flip, and convert images to tensors for model input, and inspect train and test datasets with class names.
Index into train dataset to retrieve an image and its label, convert it to a torch tensor, map the label to pizza, and plot the result.
Turn your custom PyTorch datasets into train and test data loaders, set batch sizes and workers, shuffle training data, and manage memory by loading data in batches of 32.
Create a custom PyTorch dataset class to load images from a directory and obtain class names as a list and dictionary for use with a data loader.
Learn to build a helper function to extract class names from a directory using os.scandir, returning a sorted list and a class-to-index map, and raising an error if none found.
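Such a helper could be sketched as follows. The function name find_classes and the temporary folder layout are assumptions used to make the example self-contained:

```python
import os
import tempfile
from typing import Dict, List, Tuple

def find_classes(directory: str) -> Tuple[List[str], Dict[str, int]]:
    """Finds class folder names in a directory and maps each one to an index."""
    # Each subdirectory name is treated as a class name.
    classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir())
    if not classes:
        raise FileNotFoundError(f"Couldn't find any classes in {directory}.")
    class_to_idx = {class_name: i for i, class_name in enumerate(classes)}
    return classes, class_to_idx

# Demonstrate on a temporary directory with three hypothetical class folders.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ["sushi", "pizza", "steak"]:
        os.mkdir(os.path.join(tmp_dir, name))
    classes, class_to_idx = find_classes(tmp_dir)

print(classes)
print(class_to_idx)
```

Sorting the names keeps the class-to-index mapping stable regardless of the order in which the file system lists the folders.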
Learn how to build a PyTorch custom dataset by subclassing torch.utils.data.Dataset, overriding __len__ and __getitem__ to load image samples, apply transforms, and map class names to indices.
Learn to implement a custom image folder dataset in PyTorch that mirrors the original class, including 64x64 resizing and train-test transforms.
Learn to visualize random images from a custom PyTorch dataset by building a helper function that samples with a seed and plots with matplotlib, displaying class names and image shapes.
Turn your custom dataset into a PyTorch data loader by wrapping a custom dataset class with train and test loaders, using batch size 32 and configurable workers, with training shuffle.
Explore state-of-the-art data augmentation with torchvision transforms, including TrivialAugment, resize, and crop, to artificially diversify custom datasets and improve model generalization.
Load and transform data for a baseline computer vision model using 64 by 64 color images of pizza, steak, and sushi, with simple transforms, image folder datasets, and data loaders.
Replicate the TinyVGG architecture from scratch for color images, building a two-block convolutional model with pooling, a classifier, and a forward pass that checks shapes.
Build a baseline PyTorch model for color images and run a forward pass on a single image to verify shapes. Troubleshoot shape mismatches by tracing through conv blocks to classifier.
Learn how to use torchinfo to inspect a PyTorch model, revealing layer shapes, total parameters, and model size via a forward pass on a single-batch input.
Learn to implement generic train and test step functions in PyTorch for custom datasets, handling models, data loaders, loss, optimizer, device, and accuracy.
Create a reusable train function that combines train and test steps to train and evaluate a model, tracking loss and accuracy with a progress bar.
Train and evaluate model zero on a custom pizza, steak, and sushi dataset using a TinyVGG CNN with cross-entropy loss and an optimizer, across five epochs with a random seed of 42.
Plot loss curves for a convolutional neural network trained on a custom dataset, visualizing training and test loss and accuracy across epochs to assess model progress.
Explore how loss curves reveal whether a model overfits or underfits, then learn practical remedies (more data, data augmentation, transfer learning, and learning rate scheduling) to achieve the just-right balance.
Create augmented training datasets and data loaders for model 1 using data augmentation, transforms, and 64×64 resizing, then prepare train and test loaders for TinyVGG.
Construct and train model one with augmented training data, compare its performance to a baseline without augmentation, and evaluate a reproducible PyTorch training workflow with fixed seeds and five epochs.
Plot the loss curves of model one to evaluate performance over time, identify underfitting and overfitting, and test improvements with more epochs, data augmentation, or adjusting layers.
Compare multiple models by plotting training loss and training accuracy against test loss and test accuracy, using hard coding or tools like TensorBoard, Weights & Biases, and MLflow.
Predict on a custom image by downloading it programmatically from a raw GitHub link in Colab, then pass it through your PyTorch model to infer pizza or other foods.
Learn to load a custom image into a PyTorch tensor, align its shape and data type to the model, and prep it on the correct device for prediction.
Learn to predict on a custom image by converting it to a 64x64 RGB float32 tensor on the correct device, with proper resizing, normalization, and a batch dimension.
Convert raw model outputs to prediction probabilities with softmax, then derive multiclass labels by argmax for custom images. Create a reusable function to load an image, predict, and plot.
Build a custom image prediction function in PyTorch, loading and transforming a single image, running inference on a trained model, and visualizing the prediction with label and probability.
Preprocess custom data to match model expectations by aligning data type, device, and shape, including a batch dimension, then explore PyTorch vision, audio, and text tools with datasets and exercises.
Learn how to go modular in PyTorch by turning notebook code into Python scripts for data loading, model building, training, and evaluation, using a practical food image classifier workflow.
Explore going modular by turning notebook code into reusable Python scripts, moving from cell mode to script mode, building data loaders, a TinyVGG model, and end-to-end training in PyTorch.
Download and unzip a pizza, steak, and sushi image dataset and set up training and testing paths. Convert notebook code into reusable Python scripts for data loaders in script mode.
Turn notebook code into a reusable Python script for data loading. Create data_setup.py with a create_dataloaders function that takes train and test directories, builds datasets, and applies transforms.
Create a reusable PyTorch data loader script that builds train and test loaders from image folders, applying a transform and pinning memory for faster GPU transfer.
Turn your PyTorch model code into a Python script by creating model_builder.py and importing it for a dummy forward pass of a TinyVGG model. Modularize training steps into engine.py.
Turn your training and testing step functions and the train function from notebooks into a standalone engine.py Python script, then organize with type annotations and docstrings for clarity.
Turn training, evaluation, and saving routines into modular scripts by creating utils.py and train.py, and persist model weights with PyTorch's torch.save.
Train a PyTorch image classifier with one line of code by modularizing prior scripts such as data setup, engine, model, and utils, using device-agnostic training and saving the model.
Explore going modular by converting notebook workflows into python scripts for PyTorch projects, with hands-on exercises on data scripting, parameterization via argparse, and predicting with a saved model.
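Parameterizing a training script with argparse might look like the sketch below. The flag names and defaults are hypothetical, chosen only to illustrate the idea:

```python
import argparse

def parse_args(argv=None) -> argparse.Namespace:
    """Parses hypothetical hyperparameter flags for a training script."""
    parser = argparse.ArgumentParser(description="Illustrative training script arguments.")
    parser.add_argument("--num_epochs", type=int, default=5, help="number of training epochs")
    parser.add_argument("--batch_size", type=int, default=32, help="samples per batch")
    parser.add_argument("--learning_rate", type=float, default=0.001, help="optimizer learning rate")
    return parser.parse_args(argv)

# Simulate a command line such as: python train.py --num_epochs 10 --learning_rate 0.003
args = parse_args(["--num_epochs", "10", "--learning_rate", "0.003"])
print(args.num_epochs, args.batch_size, args.learning_rate)
```

Calling parse_args() with no arguments reads from the real command line, so the same function serves both the script and quick tests.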
Discover transfer learning using pre-trained models from large datasets like ImageNet and Wikipedia to fine-tune for vision and language tasks, achieving strong results with less data.
Explore where to find pre-trained models in PyTorch domain libraries—vision, text, and audio—and use transfer learning on a food image classifier for pizza, steak, and sushi.
Learn to set up PyTorch for deep learning in Google Colab, upgrade to nightly torch and torchvision, and use the latest APIs to enable transfer learning with pre-trained models.
Clone the going modular repository and import the reusable PyTorch scripts to set up device-agnostic code, data loaders, and training utilities for transfer learning in this deep learning bootcamp.
Download the pizza, steak, and sushi data from GitHub, unzip the zip file into training and test image folders, then convert these folders into data loaders.
Learn to turn images from training and testing folders into PyTorch data loaders using a manually created transform pipeline, aligning image normalization with pre-trained models for transfer learning.
Master data preparation for pre-trained models using automatic transforms from torchvision 0.13+, aligning preprocessing with the model, and creating data loaders with auto transforms for EfficientNet-B0.
Explore how to choose a pretrained model for transfer learning in PyTorch by balancing speed, size, and performance, with mobile versus server deployment examples like Food Vision.
Set up a pretrained EfficientNet-B0 with torchvision using the weights API and adapt the classifier from 1000 to three classes via transfer learning, using feature extraction and average pooling.
Examine transfer learning types around an EfficientNet-B0 feature extractor: keep base layers frozen, train only the output head on new data, and apply fine-tuning for larger datasets.
Apply torchinfo to summarize your model, inspect input/output shapes and parameters, and implement feature extraction by freezing base layers and adapting the classifier head to three classes.
Learn to freeze base layers of a pre-trained EfficientNet-B0 and replace the classifier head for a three-class problem, creating a feature extraction model with a trainable output layer.
Train our first transfer learning model with a frozen EfficientNet-B0 feature extractor, pre-trained on ImageNet, and a new three-class classifier head for pizza, steak, and sushi.
Explore plotting loss curves to evaluate transfer learning models, compare train and test loss, discuss ideal loss curves, overfitting, and verifying performance via predictions on test images.
Learn to make predictions on test and custom images by ensuring the same shape, data type, device, and transforms, and implement a pred and plot image function for inference.
Create a pred and plot image function in PyTorch that loads a trained model, processes images with transforms, predicts labels, and plots the image with the top class and probability.
Randomly sample three test image paths and run them through the trained transfer learning model to generate predictions and plots on unseen data.
Demonstrate predicting on a custom image with a transfer learning Food Vision model built on an EfficientNet-B0 backbone, comparing results to a TinyVGG model.
Learn how transfer learning in PyTorch enables strong results with limited data, identify existing well-performing models, and apply proper data transforms for training and prediction, with practical exercises and templates.
Track and compare multiple modeling experiments in PyTorch for the Food Vision mini project, using TensorBoard and other tools to identify the best model and visualize results.
Set up a GPU-enabled Google Colab notebook, ensure nightly PyTorch and torchvision versions, and import modular torch libraries to enable experiment tracking with TensorBoard.
Create a reusable download data function to fetch, unzip, and organize image datasets (pizza, steak, sushi) into a PyTorch workflow, returning the image path for downstream loaders.
Turn downloaded food images into reproducible PyTorch data loaders using manual transforms, applying ImageNet normalization to align with pre-trained models for transfer learning.
Apply automatic transforms from pre-trained model weights to build data loaders that match the training data format, enhancing transfer learning with EfficientNet-B0.
Prepare a pretrained model for your problem by freezing base layers and adapting the classifier head to three outputs using the new weights API. Inspect trainable parameters with torchinfo.
Learn to set up a single PyTorch model experiment and track it with TensorBoard using a summary writer, logging loss and accuracy while freezing base layers and training the classifier.
Train a single model in PyTorch with a loss function and optimizer, log metrics to TensorBoard, save results in a runs directory, and view experiments in a notebook.
Explore viewing and analyzing model training results with TensorBoard in Colab, learn to compare runs, inspect model architecture, and monitor overfitting using loss and accuracy curves.
Create a function to instantiate a summary writer that saves to a unique log directory per experiment, enabling programmatic tracking of multiple models with timestamped folders.
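One way to build a unique, timestamped log directory per experiment is sketched below. The helper name make_log_dir, its parameters, and the runs/DATE/experiment/model layout are assumptions made for illustration. The resulting path could then be passed as the log_dir argument of torch.utils.tensorboard.SummaryWriter.

```python
import os
from datetime import datetime
from typing import Optional


def make_log_dir(experiment_name: str,
                 model_name: str,
                 extra: Optional[str] = None,
                 root: str = "runs") -> str:
    """Return a path of the form root/YYYY-MM-DD/experiment_name/model_name[/extra]."""
    # Day-level timestamp: runs from the same day are grouped under one folder.
    timestamp = datetime.now().strftime("%Y-%m-%d")
    parts = [root, timestamp, experiment_name, model_name]
    if extra:
        parts.append(extra)
    return os.path.join(*parts)


# Hypothetical experiment and model labels.
log_dir = make_log_dir("data_10_percent", "effnetb0", extra="5_epochs")
print(log_dir)
```

Because the timestamp only goes down to the day, two runs with identical names on the same date would share a folder. Adding hours and minutes to the format string is one way to separate them.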
Adapt the train function to support multiple experiments by passing a writer and logging to TensorBoard, enabling two runs with 5 and 10 epochs on the pizza, steak, sushi data.
Explore a series of Food Vision experiments in PyTorch, testing hyperparameters like epochs, data size, data augmentation, and learning rate on pizza, steak, and sushi images from the Food-101 dataset using transfer learning.
Adjust three dials (model size, dataset size, and training time) across EfficientNet-B0 and B2, with 5 or 10 epochs and 10% or 20% of the data, keeping the test set fixed.
Run quick experiments on Food-101 by starting with 10% of the data and five epochs, then gradually double data, model size, and epochs to scale while keeping runtime mobile friendly.
Transform custom data into ImageNet-style inputs by resizing to 224, normalizing to match ImageNet, and building data loaders. Provide two functions to instantiate EfficientNet-B0 and B2 feature extractors.
Create two EfficientNet feature extractors (B0, B2) by freezing base layers and adjusting the classifier head for three classes, using pre-trained ImageNet weights, for 10% and 20% data loaders.
Set up and run a reproducible series of transfer learning experiments using different data loaders, 10% and 20% training data, and EfficientNet feature extractors, saving each model to file.
Run eight quick modelling experiments in five minutes to compare data, training time, and model size for Food Vision tasks, save models, and visualize results with TensorBoard.
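The eight experiments follow from combining two models, two data sizes, and two epoch counts. A minimal sketch of enumerating that grid is shown here. The string labels are illustrative assumptions, and a real loop would build the model and data loaders for each configuration before training.

```python
from itertools import product

# The three "dials": model size, dataset size, and training time.
model_names = ["effnetb0", "effnetb2"]
data_fractions = ["10_percent", "20_percent"]
epoch_counts = [5, 10]

# Every combination of the three dials becomes one experiment configuration.
experiments = [
    {"model": model, "data": data, "epochs": epochs}
    for data, epochs, model in product(data_fractions, epoch_counts, model_names)
]

for number, config in enumerate(experiments, start=1):
    print(f"Experiment {number}: {config}")
```

Listing the configurations up front makes it easy to give each run its own log directory and saved model file name.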
Explore modelling experiments in TensorBoard to compare accuracy, loss, and training time across data sizes, models, and epochs, highlighting EfficientNet-B2 with 20% data and ten epochs.
Load the best EfficientNet-B2 model from its saved state dictionary, run predictions on random pizza, steak, and sushi test images, and note its 29 megabyte size.
Predict on our own custom image with the best model, testing it on a pizza dad photo to show high confidence and improvements across eight models in five minutes.
Master PyTorch workflow fundamentals, from data preparation and pre-trained models to experiment tracking and evaluation, then complete practical exercises.
Explore how the Vision Transformer paper uses self-attention for image recognition and learn the anatomy of a machine learning research paper from abstract to references while building a transformer with PyTorch.
Replicate machine learning research papers to practice building skills as a machine learning engineer, turning papers into code, loading and preprocessing data, modeling, and deploying results.
Discover where to find machine learning research papers and code using arXiv, Papers with Code, and GitHub repositories like lucidrains' ViT implementation to replicate models.
Replicate the vision transformer in PyTorch for the Food Vision Mini problem, turning images into patches, encoding with transformer layers, and classifying with an MLP head.
Set up your Google Colab environment for PyTorch coding by importing torch and torchvision 0.13, cloning the going modular repo, and configuring a CUDA-ready device for vision transformer replication.
Learn to download pizza, steak, and sushi images, create train and test paths, and build data loaders to replicate a vision transformer for the Food Vision mini project in PyTorch.
Turn Food Vision images into PyTorch datasets and data loaders with torchvision transforms, set the image size to 224 and the batch size to 32, and visualize images while replicating the Vision Transformer paper.
Visualize a single image from a batch by turning the data loader into an iterator, extracting the image and label, inspecting shapes, and plotting with matplotlib as color channels last.
Learn to replicate the vision transformer architecture in PyTorch by breaking the model into inputs, layers, blocks, and a final model, using self-attention and patch-based processing for image classification.
Break down the vision transformer architecture by mapping inputs and outputs, examining figure one, four equations, and table one, and describe patch embedding and transformer encoder blocks.
Break down the four equations behind the transformer, explain patch embeddings and equation details, and share a practical trick for reading papers to replicate methods.
Break down equation 1 by detailing patch embeddings, a learnable class token, and position embeddings that form the input sequence to the transformer encoder.
Explore how the transformer encoder uses alternating multi-head self-attention and MLP blocks with layer norm before each block and residual connections, mapping to equation two and equation three.
Demystify equation four by connecting layer norm and the MLP-based classification head within a transformer framework, then translate the concepts into usable PyTorch code.
Break down table one of the vision transformer by detailing model variants and hyperparameters, including patch size, patch embeddings, class token, position embedding, MSA blocks, and layer counts.
Learn to split a 224x224 image into 16x16 patches, embed each patch, and compute the patch embedding output shape for a vision transformer, preparing for class and position embeddings.
Visualize turning a single image into patches and build the patch embedding in PyTorch by computing input and output shapes from the top row using a 16x16 patch size.
Turn an input image into a grid of 196 patches of 16x16 pixels, visualized as a 14-by-14 patch grid, with each patch ready for embedding later.
Learn to create image patches with a conv2d layer, turning each patch into a learnable embedding, then flatten to a sequence for transformer input.
Create patch embeddings by applying a Conv2d layer with kernel size and stride 16, turning images into a 14 by 14 grid of patches, then flatten to a sequence of 768-dimensional embeddings for a vision transformer.
Transform convolutional feature maps into a sequence of patch embeddings for a transformer encoder by flattening spatial dimensions with PyTorch, yielding batch × 196 patches × 768 embeddings.
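The shapes quoted above can be checked with a little arithmetic. The sketch below assumes a 224x224 image with three color channels and a 16x16 patch size used as both kernel size and stride. The helper name conv_output_size is hypothetical.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Output length along one spatial dimension of a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


image_size = 224      # height and width in pixels
patch_size = 16       # used as both kernel size and stride
color_channels = 3    # assumed RGB input

grid = conv_output_size(image_size, kernel=patch_size, stride=patch_size)
num_patches = grid * grid
values_per_patch = patch_size * patch_size * color_channels

print(grid, num_patches, values_per_patch)  # 14 196 768
```

For these settings the 768-dimensional embedding has the same number of values as one flattened 16x16x3 patch. In general the embedding size is a hyperparameter of the model and does not have to match.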
Visualize a single image's patch embeddings by converting patches with a convolutional layer and flattening them into a sequence for a transformer encoder.
Learn to implement the patch embedding layer for a vision transformer in PyTorch by building a reusable module that converts images into flattened patch embeddings with correct shapes.
Learn to create a learnable class token prepended to the patch embeddings, ensure image shape compatibility with patch size, and manage batch and embedding dimensions with nn.Parameter.
Learn how to add a learnable class token to the patch embedding in a transformer, verify input shapes, and prepare the patch sequence for classification using PyTorch.
Discover how to create a learnable class token, compute the sequence of patch embeddings, and add position embeddings in PyTorch for deep learning.
Integrate image to patch embeddings in one cell, using patch embedding, a learnable class token, and position embedding to realize equation one in an end-to-end PyTorch workflow.
Replicate the transformer encoder by implementing multi-head self-attention on patch embeddings, with layer norm and residual connections, per equation two, linking to the MLP block of equation three.
Explores multi-head self-attention and layer normalization in PyTorch, explaining query, key, value, embedding patches, and how layer normalization improves training speed and generalization.
Turn equation two into code by building a multi-head self-attention block in PyTorch: configure embedding and heads, apply layer norm, and implement a forward pass with query, key, and value.
Test the MSA block that combines multi-head self-attention and layer norm by passing patch and position embeddings, preparing a vision transformer and the MLP block for equation three.
Learn to replicate the MLP block inside the transformer encoder, with residual connections, layer norm, and dropout, using a two-layer MLP with GELU nonlinearity in PyTorch.
Turn equation three into reusable PyTorch code by building a transformer encoder with embedding dimension, MLP with dropout, layer norm, and a forward pass to combine MSA and MLP blocks.
Assemble a transformer encoder by alternating MSA and MLP blocks with norm layers and residual connections to turn embedded image patches into a learnable sequence representation.
Combine equations two and three to build a PyTorch transformer encoder, using MSA and MLP blocks with residual connections, embedding dim 768, and 3072 MLP size.
Explore building a transformer encoder block, then replace it with PyTorch's built-in transformer encoder layer to compare self-attention, feedforward, and performance in vision transformers.
Gather patch embeddings, a class token, and position embeddings, then stack transformer encoder blocks to build the vision transformer and implement the MLP head forward method.
Implement the forward method of a vision transformer, handling image patch embeddings, class token expansion, position embedding, dropout, transformer encoder, and final classifier output.
Build and test a custom vision transformer in PyTorch, run a forward pass on a random image, train with an optimizer and loss function, and generate a visual model summary.
Train a vision transformer on the Food Vision Mini data, using the Adam optimizer with betas 0.9 and 0.999, 0.1 weight decay, learning rate 0.001, and cross entropy loss.
Train a vision transformer on Food Vision mini data. Set seeds for reproducibility, run ten epochs with an optimizer and loaders, and compare training and test accuracy to detect overfitting.
Analyze why the vision transformer underperforms at this data scale and consider remedies such as more data, longer training, learning rate warmup, learning rate decay, and gradient clipping to improve loss curves and combat overfitting.
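Learning rate warmup and decay can be illustrated with a small schedule function. The sketch below assumes a linear warmup followed by a linear decay to zero. That shape, the function name lr_at_step, and the step counts are illustrative assumptions, not settings taken from the course.

```python
def lr_at_step(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0.

    Assumes total_steps > warmup_steps.
    """
    if step < warmup_steps:
        # Warmup: ramp the learning rate up in proportion to the step count.
        return base_lr * step / warmup_steps
    # Decay: shrink the learning rate in proportion to the steps remaining.
    remaining = max(total_steps - step, 0)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 50, 100, 550, 1000):
    print(step, lr_at_step(step, base_lr=0.001, warmup_steps=100, total_steps=1000))
```

Starting small gives the optimizer time to settle before taking full-size steps, and shrinking the rate later lets training make finer adjustments.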
Learn to diagnose underfitting and overfitting by plotting loss curves and applying data-efficient strategies, including transfer learning with pre-trained vision transformers (ViT-B/16) on the Food Vision Mini data.
Use a pre-trained vision transformer from torchvision, freeze the base, and replace the classifier head for a 3-class problem, leveraging transfer learning.
Prepare data for a pretrained vision transformer by applying automatic ViT transforms, building data loaders, and setting up feature extraction to compare performance with a custom model.
Train a pretrained ViT feature extractor for food vision mini using transfer learning, fine-tuning the head with cross-entropy loss, and saving the best model for deployment.
Save the pre-trained ViT feature extractor to disk, inspect its size, and compare 327 MB to the 29 MB EfficientNet-B2 to assess deployment on web or mobile.
Compare a 327 MB pre-trained ViT feature extractor with a 29 MB EfficientNet-B2, balancing test accuracy and test loss to weigh deployment trade-offs for in-browser, no-GPU production.
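Inspecting a saved model's size on disk needs only the standard library. In this minimal sketch the helper name file_size_mb is hypothetical, and a throwaway 2 MB file stands in for a real saved model.

```python
import tempfile
from pathlib import Path


def file_size_mb(path) -> float:
    """Return the size of a file on disk in megabytes (1 MB = 1024 * 1024 bytes)."""
    return Path(path).stat().st_size / (1024 * 1024)


# Demonstrate on a dummy 2 MB file; a real check would point at the saved model file.
with tempfile.TemporaryDirectory() as tmp_dir:
    dummy_path = Path(tmp_dir) / "dummy_model.pth"
    dummy_path.write_bytes(b"\0" * (2 * 1024 * 1024))
    dummy_size = file_size_mb(dummy_path)

print(f"{dummy_size:.1f} MB")
```

Comparing this number across candidate models is a quick first check on whether a model is practical to ship to a browser or mobile device.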
Explore making predictions on a custom image using a pretrained vision transformer (ViT), comparing model size, deployment considerations, and achieving high confidence on a pizza image.
Explore the main takeaways from replicating a PyTorch research paper by turning text, images, and math into runnable code, using pre-trained models and transfer learning, with exercises and extra resources.
What is PyTorch and why should I learn it?
PyTorch is a machine learning and deep learning framework written in Python.
PyTorch enables you to craft new and use existing state-of-the-art deep learning algorithms like neural networks powering much of today’s Artificial Intelligence (AI) applications.
Plus it's so hot right now, so there are lots of jobs available!
PyTorch is used by companies like:
Tesla to build the computer vision systems for their self-driving cars
Meta to power the curation and understanding systems for their content timelines
Apple to create computationally enhanced photography.
Want to know what's even cooler?
Much of the latest machine learning research is done and published using PyTorch code so knowing how it works means you’ll be at the cutting edge of this highly in-demand field.
And you'll be learning PyTorch in good company.
Graduates of Zero To Mastery are now working at Google, Tesla, Amazon, Apple, IBM, Uber, Meta, Shopify + other top tech companies at the forefront of machine learning and deep learning.
This can be you.
By enrolling today, you’ll also get to join our exclusive live online community classroom to learn alongside thousands of students, alumni, mentors, TAs and Instructors.
Most importantly, you will be learning PyTorch from a professional machine learning engineer, with real-world experience, and who is one of the best teachers around!
What will this PyTorch course be like?
This PyTorch course is very hands-on and project based. You won't just be staring at your screen. We'll leave that for other PyTorch tutorials and courses.
In this course you'll actually be:
Running experiments
Completing exercises to test your skills
Building real-world deep learning models and projects to mimic real life scenarios
By the end of it all, you'll have the skillset needed to identify and develop modern deep learning solutions to the kinds of problems Big Tech companies encounter.
Fair warning: this course is very comprehensive. But don't be intimidated, Daniel will teach you everything from scratch and step-by-step!
Here's what you'll learn in this PyTorch course:
1. PyTorch Fundamentals — We start with the barebone fundamentals, so even if you're a beginner you'll get up to speed.
In machine learning, data gets represented as a tensor (a collection of numbers). Learning how to craft tensors with PyTorch is paramount to building machine learning algorithms. In PyTorch Fundamentals we cover the PyTorch tensor datatype in-depth.
2. PyTorch Workflow — Okay, you’ve got the fundamentals down, and you've made some tensors to represent data, but what now?
With PyTorch Workflow you’ll learn the steps to go from data -> tensors -> trained neural network model. You’ll see and use these steps wherever you encounter PyTorch code as well as for the rest of the course.
3. PyTorch Neural Network Classification — Classification is one of the most common machine learning problems.
Is something one thing or another?
Is an email spam or not spam?
Is a credit card transaction fraud or not fraud?
With PyTorch Neural Network Classification you’ll learn how to code a neural network classification model using PyTorch so that you can classify things and answer these questions.
4. PyTorch Computer Vision — Neural networks have changed the game of computer vision forever. And now PyTorch drives many of the latest advancements in computer vision algorithms.
For example, Tesla use PyTorch to build the computer vision algorithms for their self-driving software.
With PyTorch Computer Vision you’ll build a PyTorch neural network capable of seeing patterns in images and classifying them into different categories.
5. PyTorch Custom Datasets — The magic of machine learning is building algorithms to find patterns in your own custom data. There are plenty of existing datasets out there, but how do you load your own custom dataset into PyTorch?
This is exactly what you'll learn with the PyTorch Custom Datasets section of this course.
You’ll learn how to load an image dataset for FoodVision Mini: a PyTorch computer vision model capable of classifying images of pizza, steak and sushi (am I making you hungry to learn yet?!).
We’ll be building upon FoodVision Mini for the rest of the course.
6. PyTorch Going Modular — The whole point of PyTorch is to be able to write Pythonic machine learning code.
There are two main tools for writing machine learning code with Python:
A Jupyter/Google Colab notebook (great for experimenting)
Python scripts (great for reproducibility and modularity)
In the PyTorch Going Modular section of this course, you’ll learn how to take your most useful Jupyter/Google Colab Notebook code and turn it into reusable Python scripts. This is often how you’ll find PyTorch code shared in the wild.
7. PyTorch Transfer Learning — What if you could take what one model has learned and leverage it for your own problems? That’s what PyTorch Transfer Learning covers.
You’ll learn about the power of transfer learning and how it enables you to take a machine learning model trained on millions of images, modify it slightly, and enhance the performance of FoodVision Mini, saving you time and resources.
8. PyTorch Experiment Tracking — Now we're going to start cooking with heat by starting Part 1 of our Milestone Project of the course!
At this point you’ll have built plenty of PyTorch models. But how do you keep track of which model performs the best?
That’s where PyTorch Experiment Tracking comes in.
Following the machine learning practitioner’s motto of “experiment, experiment, experiment!”, you’ll set up a system to keep track of various FoodVision Mini experiment results and then compare them to find the best.
9. PyTorch Paper Replicating — The field of machine learning advances quickly. New research papers get published every day. Being able to read and understand these papers takes time and practice.
So that’s what PyTorch Paper Replicating covers. You’ll learn how to go through a machine learning research paper and replicate it with PyTorch code.
At this point you'll also undertake Part 2 of our Milestone Project, where you’ll replicate the groundbreaking Vision Transformer architecture!
10. PyTorch Model Deployment — By this stage your FoodVision model will be performing quite well. But up until now, you’ve been the only one with access to it.
How do you get your PyTorch models in the hands of others?
That’s what PyTorch Model Deployment covers. In Part 3 of your Milestone Project, you’ll learn how to take the best performing FoodVision Mini model and deploy it to the web so other people can access it and try it out with their own food images.
What's the bottom line?
Machine learning's growth and adoption are exploding, and deep learning is how you take your machine learning knowledge to the next level. More and more job openings are looking for this specialized knowledge.
Companies like Tesla, Microsoft, OpenAI, Meta (Facebook + Instagram), Airbnb and many others are currently powered by PyTorch.
And this is the most comprehensive online bootcamp to learn PyTorch and kickstart your career as a Deep Learning Engineer.
So why wait? Advance your career and earn a higher salary by mastering PyTorch and adding deep learning to your toolkit!