AI A-Z [2026]: Agentic AI, Gen AI, Prompt Engineering and RL

Rating: 4.4 (50036 reviews)

Combine the power of Agentic AI, Generative AI, Prompt Engineering and Deep RL to build powerful AIs with AWS and Python

Created by Hadelin de Ponteves, Kirill Eremenko, SuperDataScience Team, Luka Anicin, Ligency

Last updated 5/2026

English

What you'll learn

Understand the theory behind Artificial Intelligence
Build 12 different AIs for 12 different applications
Master state-of-the-art AI models
Solve real-world problems with AI
Prompt Engineering
Generative AI
Image Generation
Foundation Models Fine-Tuning
Retrieval-Augmented Generation (RAG)
Agentic AI
Q-Learning
Deep Q-Learning
Deep Convolutional Q-Learning
A3C (Asynchronous Advantage Actor-Critic)
PPO (Proximal Policy Optimization)
SAC (Soft Actor-Critic)
LLMs
Transformers
Low-Rank Adaptation (LoRA) and Quantization (QLoRA)
Responsible AI

Course content

25 sections • 163 lectures • 17h 22m total length

Get the Codes, Datasets and PDFs here (0:21)
How to Build Your First AI Chatbot Using AWS PartyRock | No Coding Required (5:54)
If you want to know:
- How can I create an AI chatbot without any coding experience?
- What is AWS PartyRock and how can I use it for chatbot development?
- Can I build a custom AI chatbot for free?
- How to create a conversational AI assistant in under 5 minutes?
- What are the steps to develop a Master Yoda-style chatbot using AWS tools?
Then this lecture is for you!

Learn how to build your first AI chatbot using AWS PartyRock, a powerful no-code platform for generative AI development. This hands-on tutorial demonstrates how to create a custom conversational AI assistant that mimics Master Yoda's speaking style, all without writing a single line of code. Using AWS PartyRock's all-in-one platform, you'll discover how to set up your development environment, implement natural language processing, and deploy your chatbot in minutes. The step-by-step guide covers account creation, chatbot configuration, and testing your AI assistant with real-time user interactions. Perfect for beginners exploring artificial intelligence and large language models (LLMs), this practical demonstration shows how to leverage advanced chatbot technology through a user-friendly interface, completely free and without requiring an AWS account or credit card information.
Prizes for Learning (0:07)

Welcome to Part 2 - Generative AI (0:29)
Fundamentals of Generative AI (3:35)
Generative AI for Image Generation (4:12)
Foundation Models Overview (3:14)
Foundation Models Lifecycle (3:45)
Data Selection (3:14)
Foundation Models Selection (3:33)
Training vs. Inference (4:47)
Context Window (1:46)
Tokens and Embeddings (5:12)
Transformers (3:37)
Foundation Models Training (2:08)
Foundation Models Fine-Tuning (4:02)
Foundation Model Fine-Tuning [Hands-On] - Short Version (7:59)
Foundation Model Fine-Tuning [Hands-On] - Long Version (0:15)
Foundation Models Evaluation (3:33)
Retrieval-Augmented Generation (RAG) (1:43)
Retrieval Augmented Generation (RAG) [Hands-On] - Short Version (6:42)
Retrieval Augmented Generation (RAG) [Hands-On] - Long Version (0:16)

Deep Learning Fundamentals: Neural Networks & Activation Functions Explained (4:03)
If you want to know:
- What are the fundamental building blocks of neural networks?
- How do biological neurons inspire artificial neural networks?
- What are activation functions and how do they work?
- Which activation functions should you use in different neural network layers?
- How do neural networks learn and process information?
- What makes gradient descent and stochastic gradient descent effective for neural network training?
Then this lecture is for you!

This comprehensive lecture on Deep Learning fundamentals provides a thorough introduction to neural networks and activation functions. Starting with the biological inspiration behind artificial neurons, you'll learn how the human brain's structure influences neural network design. The lecture covers essential concepts including various activation functions, their applications in different network layers, and practical implementation considerations. Through a simplified real estate price prediction example, you'll understand neural network operations before diving into learning mechanisms. The session concludes with detailed explanations of gradient descent, stochastic gradient descent, and backpropagation techniques, providing you with a complete foundation in neural network architecture and training methodologies. This structured approach ensures a clear understanding of both theoretical concepts and practical applications in deep learning systems.
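
To make the ideas in this description concrete, here is a minimal sketch of two common activation functions, a single artificial neuron, and plain gradient descent on a one-weight model. It uses only the Python standard library. The function names, the toy data, and the learning rate are illustrative assumptions and are not taken from the course materials.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z: float) -> float:
    """Pass positive values through and clamp negative values to zero."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


def fit_weight(data, learning_rate=0.1, steps=50):
    """Gradient descent on mean squared error for the model y = w * x.

    `data` is a list of (x, y) pairs. All values here are hypothetical.
    """
    w = 0.0
    for _ in range(steps):
        # Derivative of mean((w*x - y)^2) with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= learning_rate * grad
    return w


# Toy data generated by y = 2x, so the fitted weight should approach 2.
toy_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
learned_w = fit_weight(toy_data)
```

Stochastic gradient descent would follow the same loop but compute the gradient from one sample (or a small batch) per step instead of the whole dataset.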
How Reinforcement Learning Works: A Beginner's Guide to AI Training Methods (11:26)
If you want to know:
- What is reinforcement learning and how does it differ from traditional programming?
- How do agents learn from their environment through rewards and actions?
- What is the relationship between states, actions, and rewards in AI training?
- How does reinforcement learning compare to training real-world entities like dogs?
- What makes reinforcement learning particularly effective for robotics applications?
Then this lecture is for you!

This comprehensive introduction to reinforcement learning (RL) explores the fundamental concepts of AI training methods through practical examples and real-world applications. The lecture covers the core components of reinforcement learning, including state-action pairs, reward systems, and decision-making processes. You'll understand how AI agents interact with their environment, learn from feedback, and optimize their behavior through trial and error. The discussion includes practical examples ranging from maze navigation to robotic movement, demonstrating how reinforcement learning differs from traditional programming approaches. Special attention is given to the Robodog example, illustrating how RL enables machines to develop optimal solutions without explicit programming. The lecture also introduces important theoretical concepts that form the foundation of modern AI training methods, complemented by recommended readings from influential papers in the field. Whether you're interested in artificial intelligence, machine learning, or practical applications of RL in robotics, this lecture provides essential knowledge for understanding how AI systems learn and adapt through experience.
Bellman Equation in Reinforcement Learning: A Step-by-Step Introduction (18:25)
If you want to know:
- What is the Bellman equation and why is it fundamental to reinforcement learning?
- How does the value function work in Markov Decision Processes?
- Why is the discount factor important in reinforcement learning algorithms?
- How do agents learn optimal policies using the Bellman equation?
- What role does the Bellman equation play in decision-making processes?
Then this lecture is for you!

This comprehensive lecture introduces the Bellman equation, a cornerstone concept in reinforcement learning and artificial intelligence. Starting with fundamental concepts of states, actions, and rewards in Markov Decision Processes (MDPs), the lecture systematically builds understanding through practical maze-solving examples. You'll learn how the Bellman equation enables agents to make optimal decisions by calculating state values and incorporating the discount factor. The lecture explains value functions, optimal policies, and the mathematical foundations behind reinforcement learning algorithms. Through clear visualizations and step-by-step explanations, you'll understand how agents learn to maximize cumulative rewards and make intelligent decisions in complex environments. Special attention is given to the discount factor's role in determining state values and guiding action selection. This foundational knowledge is essential for anyone interested in AI, machine learning, or developing reinforcement learning applications.
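
As a rough illustration of the deterministic Bellman equation, V(s) = max over a of [R(s, a) + gamma * V(s')], the sketch below runs value iteration on a tiny one-dimensional corridor rather than the course's maze. The corridor layout, the reward of +1 for reaching the goal, and the discount factor of 0.9 are assumptions chosen for the example.

```python
GAMMA = 0.9          # discount factor (assumed value)
N_STATES = 4         # states 0..3 laid out in a corridor
GOAL = 3             # terminal state; entering it pays +1
ACTIONS = (-1, +1)   # move left or move right


def step(state, action):
    """Deterministic transition: move one cell, staying inside the corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def value_iteration(sweeps=50):
    """Repeatedly apply V(s) = max_a [R(s, a) + GAMMA * V(s')]."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(N_STATES):
            if s == GOAL:
                continue  # the terminal state keeps a value of 0
            outcomes = (step(s, a) for a in ACTIONS)
            values[s] = max(r + GAMMA * values[s2] for s2, r in outcomes)
    return values


state_values = value_iteration()
```

The state next to the goal ends up with value 1, and each step further away multiplies that value by the discount factor, which is how the discount factor encodes distance to the reward.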
From State Values to Optimal Plans: Bellman Equation in AI Decision Making (2:12)
If you want to know:
- How does the Bellman Equation help AI make optimal decisions?
- What's the relationship between state values and action plans in reinforcement learning?
- How do agents create navigation maps from state values?
- What's the difference between plans and policies in Markov Decision Processes?
- How does an AI agent determine the best action to take in each state?
Then this lecture is for you!

This lecture explores the fundamental connection between state values and optimal planning in reinforcement learning systems. Learn how the Bellman Equation transforms numerical state values into actionable navigation maps for AI agents. Discover the mathematical principles behind value-based decision-making in Markov Decision Processes (MDPs), including how agents evaluate different states and select optimal actions. The lecture demonstrates practical examples using a maze environment, showing how an AI agent converts value functions into concrete movement plans. Understanding these concepts is crucial for implementing effective reinforcement learning algorithms and developing intelligent decision-making systems. The session concludes with an introduction to stochastic environments and the distinction between deterministic plans and probabilistic policies, setting the foundation for advanced reinforcement learning concepts.
Markov Decision Processes in Reinforcement Learning: A Complete Guide (16:26)
If you want to know:
- What are Markov Decision Processes (MDPs) and how do they work in reinforcement learning?
- How does the Bellman equation change when dealing with non-deterministic environments?
- What's the difference between deterministic and stochastic processes in AI?
- How do probability and randomness affect decision-making in reinforcement learning?
- What are real-world applications of Markov Decision Processes?
Then this lecture is for you!

This comprehensive lecture explores Markov Decision Processes (MDPs) in reinforcement learning, bridging the gap between theoretical concepts and practical applications. Starting with the fundamental distinction between deterministic and non-deterministic searches, the lecture explains how probability and randomness influence decision-making in AI systems. You'll learn how the Bellman equation evolves to handle stochastic environments, incorporating expected values and probability distributions. The lecture covers the Markov property, its significance in reinforcement learning, and how MDPs provide a mathematical framework for modeling real-world decision-making scenarios. Through practical examples ranging from population dynamics to financial investments, you'll understand how MDPs are applied in various fields. The lecture concludes with a discussion of the modified Bellman equation for non-deterministic environments, demonstrating how reinforcement learning algorithms handle uncertainty and randomness in practical applications.
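
In a stochastic environment the Bellman equation replaces the single next-state value with an expectation: V(s) = max over a of [R(s, a) + gamma * sum over s' of P(s, a, s') * V(s')]. The following sketch evaluates that expression for hand-written transition tables. The 80/10/10 probabilities, the state values, and the function names are hypothetical and only serve the example.

```python
def expected_action_value(reward, transitions, gamma=0.9):
    """R(s, a) + gamma * sum_s' P(s, a, s') * V(s').

    `transitions` is a list of (probability, next_state_value) pairs.
    """
    total_probability = sum(p for p, _ in transitions)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("transition probabilities must sum to 1")
    return reward + gamma * sum(p * v for p, v in transitions)


def state_value(action_models, gamma=0.9):
    """V(s): the best expected value over all available actions.

    `action_models` maps an action name to (reward, transitions).
    """
    return max(
        expected_action_value(reward, transitions, gamma)
        for reward, transitions in action_models.values()
    )


# Hypothetical state: "up" usually reaches a good state but can slip sideways.
models = {
    "up": (0.0, [(0.8, 1.0), (0.1, 0.5), (0.1, -1.0)]),
    "left": (0.0, [(0.8, 0.5), (0.1, 1.0), (0.1, 0.5)]),
}
value_up = expected_action_value(*models["up"])
best_value = state_value(models)
```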
RL Tutorial: Optimal Policy vs Fixed Plans in AI Decision Making (12:55)
If you want to know:
- What's the key difference between optimal policies and fixed plans in AI?
- How does the Bellman equation adapt to stochastic environments?
- Why do value functions change when uncertainty is introduced?
- How does reinforcement learning handle unexpected outcomes?
- What makes optimal policies more robust than deterministic plans?
Then this lecture is for you!

This lecture explores the fundamental distinction between optimal policies and fixed plans in reinforcement learning, focusing on decision-making under uncertainty. Through practical examples using Markov Decision Processes (MDPs), you'll understand how the Bellman Optimality Equation adapts to stochastic environments. The lecture demonstrates how value functions change when introducing randomness and probability, showcasing real-world applications of reinforcement learning algorithms. You'll learn why optimal policies often outperform deterministic plans, especially in environments with uncertain outcomes. Using a maze navigation example, the lecture illustrates how AI can develop counter-intuitive yet optimal strategies through reinforcement learning, highlighting the power of mathematical optimization in artificial intelligence. Special attention is given to the role of discount factors, expected rewards, and state-action pairs in developing robust decision-making processes that can handle environmental uncertainties.
Living Penalty in Reinforcement Learning: Optimize AI Agent Decision Making (9:47)
If you want to know:
- How does living penalty affect AI agent decision-making in reinforcement learning?
- Why is continuous reward important in real-world reinforcement learning applications?
- How do different living penalty values impact optimal policy selection?
- What role does the Bellman equation play in calculating living penalties?
- How can negative rewards influence an agent's behavior in reinforcement learning?
Then this lecture is for you!

This lecture explores the concept of living penalty in reinforcement learning, a crucial mechanism for optimizing AI agent decision-making. Learn how continuous rewards affect agent behavior throughout the learning process, moving beyond simple end-state rewards. The lecture demonstrates how different living penalty values (-0.04, -0.5, -2.0) dramatically influence optimal policy selection and agent pathfinding strategies. Using practical examples, you'll understand how the Bellman equation incorporates ongoing rewards and how negative reinforcement shapes agent behavior. The session covers real-world applications of living penalties in various scenarios, from gaming AI to robotics, illustrating how this technique helps create more sophisticated reinforcement learning algorithms. Perfect for those looking to deepen their understanding of advanced reinforcement learning concepts and practical implementation strategies.
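
A small back-of-the-envelope sketch shows why the size of the living penalty can flip the preferred route. It compares a long path that ends at a +1 goal with a short path that ends in a -1 penalty state, using the three penalty values quoted above. The path lengths, the terminal rewards, and the choice to ignore discounting are simplifying assumptions made for this example only.

```python
def path_return(n_steps, terminal_reward, living_penalty):
    """Undiscounted return of a path.

    Every step before the last one pays the living penalty, and the last
    step collects the terminal reward.
    """
    return living_penalty * (n_steps - 1) + terminal_reward


def preferred_path(living_penalty):
    """Pick between two hypothetical routes out of the same start state."""
    safe = path_return(5, +1.0, living_penalty)    # long route to the goal
    risky = path_return(2, -1.0, living_penalty)   # short route to a penalty state
    return "goal" if safe > risky else "penalty state"


choices = {penalty: preferred_path(penalty) for penalty in (-0.04, -0.5, -2.0)}
```

With a mild penalty the agent still heads for the goal. With a penalty of -2.0 per step, ending the episode quickly becomes the better option even though the terminal reward is negative.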
Q-Learning in Reinforcement Learning: From V-Values to Q-Values Explained (14:45)
If you want to know:
- What is Q-learning and how does it differ from traditional V-value approaches?
- How do Q-values relate to state values in reinforcement learning?
- How does the Bellman equation adapt when moving from V-values to Q-values?
- Why is Q-learning considered a breakthrough in reinforcement learning algorithms?
- How do agents use Q-values to make optimal decisions in stochastic environments?
Then this lecture is for you!

This comprehensive lecture explores the fundamental transition from V-values to Q-values in reinforcement learning, explaining how Q-learning revolutionizes the way AI agents make decisions. Starting with the classic Bellman equation, we demonstrate how Q-learning transforms state-value assessments into action-value evaluations, providing a more direct approach to optimal decision-making in artificial intelligence. The lecture covers the mathematical foundations of Q-learning, including the derivation of Q-value equations, their relationship to V-values, and their application in Markov Decision Processes. You'll understand how agents leverage Q-values to evaluate and select optimal actions, making this approach particularly powerful for deep reinforcement learning applications. The session concludes with practical insights into implementing Q-learning algorithms and their significance in modern AI applications, from robotics to dynamic pricing systems. Perfect for those looking to deepen their understanding of advanced reinforcement learning techniques and their real-world applications.
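
The step from V-values to Q-values can be written out as follows. This is the standard textbook form and may differ in notation from the lecture slides: s is a state, a an action, s' a next state, R the reward, P the transition probability, and gamma the discount factor.

```latex
\begin{align}
V(s)   &= \max_{a} \Big[ R(s,a) + \gamma \sum_{s'} P(s,a,s')\, V(s') \Big] \\
Q(s,a) &= R(s,a) + \gamma \sum_{s'} P(s,a,s')\, V(s') \\
V(s)   &= \max_{a} Q(s,a) \\
Q(s,a) &= R(s,a) + \gamma \sum_{s'} P(s,a,s') \max_{a'} Q(s',a')
\end{align}
```

The second line names the bracketed term of the first line, the third line follows directly, and substituting the third into the second gives a recursion expressed purely in Q-values.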
Temporal Difference in Q-Learning: A Complete Guide for Reinforcement Learning (19:26)
If you want to know:
- What is temporal difference in Q-learning and why is it crucial for reinforcement learning?
- How does temporal difference help in updating Q-values effectively?
- What role does the learning rate (alpha) play in Q-learning algorithms?
- How do stochastic environments affect Q-value calculations?
- When does a Q-learning algorithm achieve convergence?
Then this lecture is for you!

This comprehensive guide to temporal difference in Q-learning explores the fundamental concepts of reinforcement learning algorithms. Learn how temporal difference serves as the core mechanism for updating Q-values in stochastic environments, and understand the mathematical foundations behind Q-learning optimization. The lecture covers the Bellman equation, the role of learning rates, and practical implementations of temporal difference in reinforcement learning frameworks. Discover how agents learn from experience by calculating the temporal difference between the current Q-value estimate and a target built from the observed reward and the discounted best Q-value of the next state, and understand when algorithms achieve convergence. Perfect for AI practitioners and machine learning enthusiasts looking to deepen their understanding of advanced reinforcement learning techniques. The lecture includes practical examples of Q-learning applications and references to seminal works in the field, including Richard Sutton's influential research on temporal difference methods.
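
The temporal difference update can be sketched in a few lines of tabular Q-learning. The code below applies Q(s, a) <- Q(s, a) + alpha * TD, where TD = r + gamma * max over a' of Q(s', a') - Q(s, a). The state and action labels and the default values of alpha and gamma are placeholders chosen for the example.

```python
from collections import defaultdict


def td_update(q, state, action, reward, next_state, actions,
              alpha=0.1, gamma=0.9):
    """Apply one Q-learning update in place and return the temporal difference.

    `q` maps (state, action) pairs to Q-values. Unseen pairs default to 0.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    temporal_difference = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * temporal_difference
    return temporal_difference


# Hypothetical single transition: moving right from s0 to s1 pays +1.
q_table = defaultdict(float)
first_td = td_update(q_table, "s0", "right", 1.0, "s1", ("left", "right"))
```

The learning rate alpha controls how far each update moves the old estimate toward the new target. As the estimates settle, the temporal difference shrinks toward zero on average, which is the sense in which the algorithm converges.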
Quiz 1

Deep Learning Fundamentals: Neural Networks & Activation Functions Explained (2:17)
Deep Q-Learning vs Traditional Q-Learning: Key Differences Explained (15:15)
If you want to know:
- What are the fundamental differences between traditional Q-learning and Deep Q-learning?
- How does Deep Q-learning leverage neural networks to enhance reinforcement learning?
- Why is Deep Q-learning more effective for complex environments than traditional Q-learning?
- How does the temporal difference learning concept evolve in Deep Q-learning?
- What makes Deep Q-Networks (DQN) more powerful for solving advanced RL problems?
Then this lecture is for you!

This comprehensive lecture explores the evolution from traditional Q-learning to Deep Q-learning in reinforcement learning. You'll understand how Deep Q-Networks (DQN) integrate neural networks to handle complex state spaces and learn optimal policies. The lecture covers the transformation of temporal difference learning in deep architectures, explaining how neural networks predict Q-values and utilize backpropagation for learning. You'll learn why Deep Q-learning excels in sophisticated environments like self-driving cars and Atari games, where traditional Q-learning falls short. The session details the mathematical foundations of DQN, including loss calculation, target Q-values, and weight updates through stochastic gradient descent. Special attention is given to the practical aspects of implementing Deep Q-learning, making complex reinforcement learning concepts accessible and applicable to real-world scenarios.
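
The loss calculation mentioned above can be illustrated without a deep learning library. In this sketch the network's outputs are replaced by plain numbers, the target is r + gamma * max over a' of Q(s', a'), and the loss is the mean squared difference between target and prediction. The `done` flag for terminal transitions and the use of a mean (rather than a sum) are assumptions of this example, not details confirmed by the description.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """Target Q-value for one transition.

    When the episode has ended there is no next state to bootstrap from,
    so the target is just the reward.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def dqn_loss(batch, gamma=0.99):
    """Mean squared error between predicted and target Q-values.

    `batch` is a list of (predicted_q, reward, next_q_values, done) tuples,
    where predicted_q is the network output for the action actually taken.
    """
    squared_errors = [
        (td_target(reward, next_q, done, gamma) - predicted) ** 2
        for predicted, reward, next_q, done in batch
    ]
    return sum(squared_errors) / len(squared_errors)
```

In a real Deep Q-Network this loss would be backpropagated through the network, and the weights would be adjusted by stochastic gradient descent.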
How Deep Q-Learning Works: Neural Networks & Reinforcement Learning Explained (6:06)
If you want to know:
- How does Deep Q-Learning combine neural networks with reinforcement learning?
- What are the key differences between traditional Q-Learning and Deep Q-Learning?
- How does the action selection process work in Deep Q-Learning?
- What role do Q-values play in the decision-making process?
- How does the learning process occur in Deep Q-Learning systems?
Then this lecture is for you!

This comprehensive lecture breaks down the fundamental concepts of Deep Q-Learning (DQN), bridging the gap between traditional reinforcement learning and neural networks. You'll learn how states are encoded into vectors for neural network processing, understand the two crucial phases of Deep Q-Learning - the learning phase and action selection phase, and master the process of Q-value computation and optimization. The lecture covers practical implementations using PyTorch, explaining how agents learn through experience replay and value function approximation. Special attention is given to action selection policies, including the Softmax function, and how they influence the agent's decision-making process. Whether you're new to deep reinforcement learning or looking to strengthen your understanding of DQN architecture, this lecture provides both theoretical foundations and practical insights for implementing effective deep Q-learning solutions.
Experience Replay in Deep Q-Learning: How it Works & Why it Matters (15:45)
If you want to know:
- What is Experience Replay and why is it crucial for Deep Q-Learning?
- How does Experience Replay solve the problem of correlated sequential experiences?
- Why do rare experiences matter in reinforcement learning?
- How does Experience Replay improve learning efficiency in limited environments?
- What are the key advantages of implementing Experience Replay in DQN?
Then this lecture is for you!

Experience Replay is a fundamental technique in Deep Q-Learning (DQN) that significantly enhances the learning process of reinforcement learning agents. This lecture explores how Experience Replay breaks the pattern of correlated sequential experiences by storing and randomly sampling from past interactions, enabling more efficient and stable learning. You'll understand how this mechanism helps preserve rare but valuable experiences, prevents biased learning from sequential states, and accelerates training in environments with limited experiences. The lecture covers practical implementation aspects of Experience Replay, including batch processing, rolling window approaches, and uniform distribution sampling. Advanced concepts like Prioritized Experience Replay from DeepMind's 2016 research are also introduced, providing insights into cutting-edge developments in deep reinforcement learning. Through practical examples using a self-driving car simulation, you'll learn how Experience Replay addresses key challenges in deep Q-learning and improves overall agent performance.
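
A replay buffer with a rolling window and uniform sampling fits in a short class. The sketch below is one possible shape for it, built on the standard library only. The class name, the field names, and the capacity used in the usage lines are illustrative and do not reproduce the course's own implementation.

```python
import random
from collections import deque, namedtuple

# One stored interaction with the environment.
Experience = namedtuple(
    "Experience", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size memory: the oldest experiences fall out once it is full."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)  # rolling window

    def push(self, state, action, reward, next_state, done):
        """Store one experience, evicting the oldest if at capacity."""
        self.memory.append(Experience(state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a batch uniformly at random, without replacement.

        Random sampling breaks the correlation between consecutive steps.
        """
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


# Usage with made-up transitions: five pushes into a buffer of capacity three.
buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.push(state=t, action=0, reward=0.0, next_state=t + 1, done=False)
batch = buffer.sample(batch_size=2)
```

Prioritized Experience Replay would replace the uniform draw in `sample` with one weighted by how surprising each stored experience was.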
Q-Learning: Guide to Epsilon-Greedy & Softmax Action Selection Algorithms (16:23)
If you want to know:
- What are the main action selection policies in reinforcement learning?
- How do epsilon-greedy and softmax algorithms work in Q-learning?
- Why is the balance between exploration and exploitation crucial in RL?
- How does softmax transform Q-values into action probabilities?
- What makes softmax different from epsilon-greedy in deep reinforcement learning?
Then this lecture is for you!

This comprehensive guide explores action selection policies in reinforcement learning, focusing on epsilon-greedy and softmax algorithms. Learn how these crucial mechanisms balance exploration and exploitation in Q-learning environments. The lecture explains the mathematical foundations of softmax, demonstrating how it transforms Q-values into probability distributions for action selection. You'll understand why random exploration is essential for avoiding local maxima and how different selection policies affect agent behavior. The session covers practical implementations, comparing epsilon-greedy's simple randomization approach with softmax's more sophisticated probability-based selection method. Advanced concepts include adaptive exploration strategies and their applications in deep Q-learning networks (DQN). Perfect for machine learning practitioners looking to optimize their reinforcement learning algorithms and understand the nuances of action selection in AI systems.
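
The two selection policies can be compared side by side in plain Python. The sketch below is a minimal version under stated assumptions: the `temperature` parameter of the softmax and the function names are choices made for this example, and the course code may be organised differently.

```python
import math
import random


def softmax(q_values, temperature=1.0):
    """Turn Q-values into a probability distribution over actions.

    Subtracting the maximum first keeps the exponentials numerically stable.
    """
    highest = max(q_values)
    exps = [math.exp((q - highest) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def epsilon_greedy_action(q_values, epsilon):
    """With probability epsilon explore at random, otherwise exploit."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_action(q_values, temperature=1.0):
    """Sample an action in proportion to its softmax probability."""
    probabilities = softmax(q_values, temperature)
    return random.choices(range(len(q_values)), weights=probabilities, k=1)[0]
```

Epsilon-greedy treats all non-greedy actions alike when it explores, while softmax explores more often among actions whose Q-values are close to the best one.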

Get the Codes here (0:10)
Step 1 - Deep Q-Learning Environment Setup: From Gmail to Lunar Lander Training (6:59)
If you want to know:
- How do you set up a Deep Q-Learning environment from scratch?
- What tools do you need to start training a Lunar Lander AI?
- How can you use Google Colab for reinforcement learning projects?
- Why is Gymnasium (formerly OpenAI Gym) essential for AI training?
- What's the best way to prepare your development environment for DQN training?
Then this lecture is for you!

This comprehensive setup guide walks you through creating a complete Deep Q-Learning environment for training a Lunar Lander AI agent. Starting with Gmail account creation for Google Colab access, you'll learn how to configure PyTorch, Gymnasium (formerly OpenAI Gym), and essential deep learning libraries. The lecture covers environment initialization for the LunarLander-v2 challenge, explaining why Google Colab is preferred for hassle-free deep reinforcement learning projects. You'll get hands-on experience with the Gymnasium platform, understanding its various environments including Classic Control, Box2D, and Atari games. The session provides a foundation for implementing Deep Q-Networks (DQN) while avoiding common setup pitfalls, preparing you for practical AI training in subsequent modules.
Google Colab Setup: Deep Q-Learning for Lunar Lander Tutorial (6:24)
If you want to know:
- How do you set up Google Colab for Deep Q-Learning projects?
- What libraries are essential for implementing Lunar Lander with PyTorch?
- How do you create a working copy of a read-only Colab notebook?
- How do you install Gymnasium and its environments in Google Colab?
- What are the necessary dependencies for Deep Q-Learning in Python?
Then this lecture is for you!

This tutorial guides you through the essential setup process for implementing Deep Q-Learning in Google Colab, specifically for the LunarLander-v2 environment. Learn how to create your personal copy of the notebook in Google Drive, install crucial dependencies including Gymnasium, Atari, and Box2D environments, and import necessary Python libraries such as PyTorch, NumPy, and specialized reinforcement learning modules. The lecture covers the complete environment setup, from handling read-only notebooks to preparing your workspace for deep reinforcement learning implementations. Perfect for machine learning enthusiasts looking to start practical deep Q-learning projects using Gymnasium environments and the PyTorch framework. This foundational setup prepares you for building and training an AI agent capable of mastering the Lunar Lander challenge.
Step 3 - PyTorch DQN Architecture: Building the AI Brain for OpenAI Lunar Lander (8:23)
If you want to know:
- How do you build a Deep Q-Network (DQN) architecture for reinforcement learning?
- What's the optimal neural network structure for OpenAI's LunarLander-v2 environment?
- How do you implement PyTorch layers for a DQN agent?
- What are the key components of a neural network for lunar landing tasks?
- How do you connect input states to action outputs in a DQN?
Then this lecture is for you!

In this comprehensive PyTorch tutorial, learn how to build the neural network architecture for a Deep Q-Learning agent tackling OpenAI's LunarLander-v2 environment. The lecture covers the implementation of a three-layer neural network using PyTorch's nn.Module, specifically designed for reinforcement learning tasks. You'll understand how to structure the network with optimal layer sizes (8 inputs, 64-neuron hidden layers, 4 outputs), handle state observations, and prepare action outputs. The tutorial explains the complete architecture setup, including inheritance from nn.Module, initialization methods, and the strategic placement of fully connected layers. Perfect for developers looking to implement deep reinforcement learning solutions using PyTorch for complex control tasks like lunar landing.
PyTorch Deep Q-Learning: Implementing Forward Method for Neural Nets (3:58)
If you want to know:
- How do you implement the forward method in PyTorch for Deep Q-Learning?
- What's the process of forward propagation in a neural network for reinforcement learning?
- How do you build a neural network architecture for LunarLander-v2 using PyTorch?
- What role do fully connected layers and activation functions play in DQN implementation?
- How do you connect input states to output actions in a Deep Q-Network?
Then this lecture is for you!

In this comprehensive PyTorch tutorial, we dive deep into implementing the forward method for Deep Q-Learning neural networks. Learn how to construct a complete neural network architecture using PyTorch's powerful framework, specifically designed for reinforcement learning applications. The lecture covers the step-by-step implementation of forward propagation, including input layer handling, multiple fully connected layers, and ReLU activation functions. You'll understand how to properly structure the network to process state inputs and generate action outputs, essential for Deep Q-Network (DQN) applications like OpenAI Gym's LunarLander-v2 environment. This hands-on tutorial demonstrates practical PyTorch implementation techniques, focusing on neural network architecture design for reinforcement learning tasks, making it valuable for both beginners and intermediate practitioners in machine learning.
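
The architecture and forward method described in these two lectures (8 inputs, 64-neuron hidden layers, 4 outputs, fully connected layers with ReLU) might look roughly like the sketch below. It assumes PyTorch is installed. The class name, the attribute names, and the `seed` argument are guesses made for illustration and may not match the course notebook.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Network(nn.Module):
    """Maps a state vector to one Q-value per action."""

    def __init__(self, state_size=8, action_size=4, hidden_size=64, seed=42):
        super().__init__()
        torch.manual_seed(seed)  # reproducible initial weights (assumed choice)
        self.fc1 = nn.Linear(state_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.fc3 = nn.Linear(hidden_size, action_size)

    def forward(self, state):
        """Forward propagation: two ReLU hidden layers, then a linear output."""
        x = F.relu(self.fc1(state))
        x = F.relu(self.fc2(x))
        # No activation on the output layer: Q-values can be any real number.
        return self.fc3(x)
```

Calling `Network()(torch.zeros(1, 8))` returns a tensor of shape `(1, 4)`, one Q-value for each of the four actions.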
Step 5 - Configure LunarLander-v2 Environment Parameters for DQN Training (5:45)
If you want to know:
- How do you set up the LunarLander-v2 environment for DQN training?
- What are the essential parameters needed for Deep Q-Learning implementation?
- How do you configure state shape and action space for reinforcement learning?
- What are the key environment variables needed for training a lunar landing AI?
- How do you prepare the gymnasium environment for deep reinforcement learning?
Then this lecture is for you!

In this comprehensive tutorial on Deep Q-Network (DQN) implementation, you'll learn how to properly configure the LunarLander-v2 environment using gymnasium. The lecture covers essential setup procedures for deep reinforcement learning, including importing necessary libraries and establishing crucial environment parameters. You'll understand how to define state shapes, determine state sizes, and configure action spaces specifically for DQN training. The tutorial demonstrates practical implementation of experience replay setup, showing you how to extract and utilize the environment's observation space and action space parameters. By the end, you'll have a properly configured environment ready for training your deep Q-learning agent, with all necessary parameters established for the neural network architecture. This foundational setup is crucial for successful implementation of reinforcement learning algorithms in the lunar landing task.
DQN Hyperparameters: Learning Rate & Replay Buffer Setup Guide (Step 6) (3:55)
If you want to know:
- How do you set optimal learning rates for DQN algorithms?
- What's the ideal replay buffer size for deep reinforcement learning?
- How do you configure hyperparameters for effective experience replay?
- What are the best practices for initializing DQN training parameters?
- How do discount factors impact deep Q-learning performance?
Then this lecture is for you!

This comprehensive guide focuses on configuring essential hyperparameters for Deep Q-Networks (DQN) implementation. Learn how to set up crucial parameters including learning rate (0.0005), minibatch size (100), and discount factor (0.99) for optimal DQN performance. The lecture covers experience replay buffer configuration with detailed explanations of the replay memory size (100,000) and interpolation parameter (0.001) settings. Understanding these parameters is crucial for training stable and efficient deep reinforcement learning models. Perfect for AI practitioners and researchers working with deep Q-learning algorithms, this tutorial provides practical insights based on extensive experimentation and real-world testing in lunar landing environments. The session delivers concrete values and explanations for each hyperparameter, ensuring your deep reinforcement learning implementation is properly configured for optimal training results.
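One way to keep the values listed above in a single place is a small frozen dataclass. The numbers come from the lecture description; the class name and field names are assumptions made for this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DQNHyperparameters:
    learning_rate: float = 5e-4            # 0.0005
    minibatch_size: int = 100
    discount_factor: float = 0.99
    replay_buffer_size: int = 100_000
    interpolation_parameter: float = 1e-3  # 0.001, used by the soft update

hp = DQNHyperparameters()
```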
Step 7: Implementing Experience Replay Memory in DQN with Python (2:44)
If you want to know:
- How does Experience Replay Memory improve Deep Q-Learning performance?
- What is the role of ReplayMemory class in DQN implementation?
- How to implement a memory buffer for storing and sampling experiences?
- How to break correlations between consecutive experiences in DQN?
- What are the key components of Experience Replay implementation in Python?
Then this lecture is for you!

In this comprehensive lecture on implementing Experience Replay Memory in Deep Q-Networks (DQN), you'll learn how to create a robust ReplayMemory class using Python and PyTorch. The lecture covers essential deep reinforcement learning concepts, focusing on the implementation of experience replay buffer - a crucial component that enhances DQN training stability. You'll discover how to initialize the memory capacity, handle GPU/CPU device selection, and structure the memory buffer to store state-action pairs, rewards, and transitions. The implementation includes practical code examples demonstrating how to break correlations between consecutive experiences, a fundamental aspect of successful deep Q-learning algorithms. This hands-on tutorial is perfect for AI practitioners and researchers looking to master advanced reinforcement learning techniques through practical implementation.
Step 8: DQN Push Method - Adding Experiences to Replay Memory Buffer (3:07)
If you want to know:
- How does the DQN replay memory buffer work in deep reinforcement learning?
- What's the best way to implement experience storage in a DQN algorithm?
- How do you manage memory capacity in deep Q-learning implementations?
- How can you efficiently add experiences to a DQN replay buffer?
- What's the optimal way to handle old experiences in DQN memory management?
Then this lecture is for you!

This lecture focuses on implementing the crucial push method for DQN's replay memory buffer, a fundamental component of deep Q-learning algorithms. You'll learn how to properly add experience tuples (state, action, reward, next state, done) to the replay memory while maintaining optimal buffer size. The implementation covers memory capacity management, handling overflow situations, and the strategic removal of oldest experiences when the buffer reaches its capacity limit. This practical Python implementation demonstrates essential concepts for building stable and efficient deep reinforcement learning systems, particularly useful for training DQN agents in complex environments. The lecture provides hands-on experience with experience replay mechanisms, a critical technique for stabilizing deep Q-network training and improving learning efficiency in reinforcement learning applications.
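A minimal sketch of a replay memory with a `push` method, under the assumption that the buffer is a plain Python list and that the oldest experience is dropped once capacity is exceeded. The course's own class may differ in details such as device handling.

```python
class ReplayMemory:
    """Fixed-capacity buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = []

    def push(self, event):
        """Append an experience and evict the oldest one if the buffer overflows."""
        self.memory.append(event)
        if len(self.memory) > self.capacity:
            del self.memory[0]

    def __len__(self):
        return len(self.memory)

memory = ReplayMemory(capacity=3)
for step in range(5):
    # Dummy experience: state, action, reward, next_state, done
    memory.push((step, 0, 1.0, step + 1, False))
```

After five pushes into a buffer of capacity 3, only the three most recent experiences remain. A `collections.deque(maxlen=capacity)` would give the same eviction behavior with less code.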
Step 9: Coding DQN Memory Sampling - PyTorch Experience Replay Tutorial (8:17)
If you want to know:
- How does experience replay memory work in Deep Q-Networks?
- What's the best way to implement memory sampling in DQN using PyTorch?
- How to properly handle state, action, reward, and next state transitions in DQN?
- How to convert NumPy arrays to PyTorch tensors for DQN training?
- What are the key components of implementing a replay buffer in deep reinforcement learning?
Then this lecture is for you!

This lecture provides a detailed walkthrough of implementing the crucial memory sampling mechanism in Deep Q-Networks (DQN) using PyTorch. You'll learn how to create an efficient ReplayMemory class with a sample method that randomly selects batches of experiences from the memory buffer. The tutorial covers essential DQN components including state-action-reward transitions, proper data type handling, and tensor conversions from NumPy to PyTorch. You'll master the implementation of experience replay, a fundamental technique that stabilizes deep reinforcement learning training by breaking temporal correlations in the training data. The lecture demonstrates how to properly handle different data types, manage GPU/CPU device allocation, and prepare data structures for neural network training. This hands-on implementation serves as a crucial building block for creating robust DQN agents capable of learning complex environments.
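The sampling step described above can be sketched without PyTorch as follows. The function name `sample` and its signature are assumptions for this example. Where the lecture converts each column to a PyTorch tensor, this version returns plain lists.

```python
import random

def sample(memory, batch_size, rng=random):
    """Draw a random minibatch and regroup it column by column."""
    experiences = rng.sample(memory, k=batch_size)
    states, actions, rewards, next_states, dones = zip(*experiences)
    return (list(states),
            list(actions),
            [float(r) for r in rewards],
            list(next_states),
            [float(d) for d in dones])  # booleans become 0.0 / 1.0

# Ten dummy experiences: (state, action, reward, next_state, done)
buffer = [([float(i)], i % 4, 1.0, [float(i + 1)], i == 9) for i in range(10)]
states, actions, rewards, next_states, dones = sample(buffer, 4, random.Random(0))
```

Sampling uniformly at random, rather than taking the most recent transitions, is what breaks the correlation between consecutive experiences.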
DQN Tutorial: Initialize Q-Networks, Optimizer & Replay Memory Buffer (7:39)
If you want to know:
- How do you initialize Q-networks for deep reinforcement learning?
- What are the essential components of a DQN agent implementation?
- How do you set up replay memory buffers for DQN algorithms?
- How do you properly configure optimizers for deep Q-learning?
- What's the proper way to implement target and local networks in DQN?
Then this lecture is for you!

This lecture covers the fundamental implementation of a Deep Q-Network (DQN) agent class, focusing on essential initialization components for deep reinforcement learning. You'll learn how to properly set up local and target Q-networks using PyTorch, configure the Adam optimizer for neural network training, and implement an experience replay memory buffer. The lecture demonstrates practical implementation steps including device setup for CPU/GPU compatibility, state and action space initialization, and time step counter configuration. This hands-on tutorial is part of a larger series on implementing DQN for solving complex reinforcement learning environments, specifically designed for training an AI to master lunar landing scenarios. Perfect for developers and AI enthusiasts looking to understand the core architecture of deep Q-learning systems.
Step 11: DQN Step Method - Store & Learn from Experiences in Python (7:35)
If you want to know:
- How does the DQN step method work in deep reinforcement learning?
- What's the proper way to store and learn from experiences in DQN?
- How do you implement experience replay in Python for deep Q-learning?
- When should an agent learn from stored experiences in DQN?
- How do you handle minibatch learning in deep Q-networks?
Then this lecture is for you!

This lecture demonstrates the implementation of the crucial step method in Deep Q-Network (DQN) algorithms using Python. You'll learn how to properly store experiences in replay memory and manage the learning process in deep reinforcement learning. The tutorial covers essential DQN components including experience replay implementation, minibatch sampling, and time step management for learning intervals. You'll understand how to structure the step method to handle state-action-reward transitions, implement proper memory storage mechanisms, and control when the agent should learn from accumulated experiences. The lecture provides practical insights into handling experience tuples, managing replay buffers, and coordinating the learning process with minibatch sizes in deep Q-learning implementations. Perfect for developers and AI enthusiasts working on reinforcement learning projects who want to master DQN implementation details.
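The store-then-maybe-learn logic described above could be sketched like this. The learning interval of 4 steps is an assumption, and `learn` is a stub that only counts calls so the control flow can be checked in isolation.

```python
import random

class Agent:
    """Sketch of the step logic only; networks and optimizer are omitted."""

    def __init__(self, minibatch_size=100, learn_every=4):
        self.memory = []
        self.minibatch_size = minibatch_size
        self.learn_every = learn_every  # assumed learning interval
        self.t_step = 0
        self.learn_calls = 0

    def learn(self, experiences):
        # Stub: a real agent would update the local Q-network here.
        self.learn_calls += 1

    def step(self, state, action, reward, next_state, done):
        # 1. Store the experience.
        self.memory.append((state, action, reward, next_state, done))
        # 2. Every `learn_every` steps, learn if enough experiences are stored.
        self.t_step = (self.t_step + 1) % self.learn_every
        if self.t_step == 0 and len(self.memory) > self.minibatch_size:
            experiences = random.sample(self.memory, k=self.minibatch_size)
            self.learn(experiences)

agent = Agent(minibatch_size=4, learn_every=4)
for t in range(12):
    agent.step([float(t)], 0, 0.0, [float(t + 1)], False)
```

In this run the counter reaches zero at steps 4, 8, and 12, but at step 4 the memory holds only 4 experiences, which is not more than the minibatch size, so learning happens twice.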
Step 12: DQN Action Selection - State Processing to Policy Implementation (8:59)
If you want to know:
- How does action selection work in Deep Q-Networks?
- What's the process of converting state data for DQN processing?
- How do you implement epsilon-greedy policy in deep reinforcement learning?
- What role does batch dimension play in state processing?
- How do you switch between evaluation and training modes in DQN?
Then this lecture is for you!

This comprehensive lecture explores the implementation of action selection in Deep Q-Networks (DQN), focusing on state processing and policy implementation. Learn how to convert NumPy arrays to PyTorch tensors, handle batch dimensions, and implement epsilon-greedy action selection policy. The lecture covers essential deep reinforcement learning concepts, including state preprocessing, network evaluation modes, and action-value predictions. You'll understand how to use torch.no_grad() for inference, manage training/evaluation modes, and implement optimal action selection strategies. Perfect for practitioners working with complex environments and deep Q-learning applications, this lecture bridges theoretical knowledge with practical implementation, demonstrating how agents learn to interact with the environment effectively through deep neural networks.
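A minimal sketch of epsilon-greedy selection, assuming the Q-values for the current state have already been computed (in the course this is a network forward pass under `torch.no_grad()`). The function name and signature are assumptions for this example.

```python
import random

def act(q_values, epsilon, rng=random):
    """Return the greedy action with probability 1 - epsilon, else a random one."""
    if rng.random() > epsilon:
        # Exploit: index of the largest Q-value.
        return max(range(len(q_values)), key=q_values.__getitem__)
    # Explore: uniform random action.
    return rng.randrange(len(q_values))

rng = random.Random(0)
greedy_action = act([0.1, 0.9, 0.3, 0.2], epsilon=0.0, rng=rng)
random_action = act([0.1, 0.9, 0.3, 0.2], epsilon=1.0, rng=rng)
```

With `epsilon=0.0` the agent picks the highest-valued action, and with `epsilon=1.0` it always explores.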
Step 13: Deep Q-Network Training - Implementing Learn Method for RL (8:55)
If you want to know:
- How does the learn method work in Deep Q-Networks?
- What is the process of updating Q-values in reinforcement learning?
- How are target networks and local networks used in deep Q-learning?
- How does backpropagation work in DQN training?
- What role does the optimization step play in updating model parameters?
Then this lecture is for you!

This comprehensive tutorial focuses on implementing the crucial learn method in Deep Q-Network (DQN) training, a fundamental component of deep reinforcement learning. You'll master the process of updating Q-values based on sampled experiences, including state-action pairs and rewards. The lecture covers essential concepts like computing Q-targets using the target network, calculating expected Q-values with the local network, and implementing the loss function for optimization. You'll learn how to perform backpropagation, execute optimization steps, and understand the soft update mechanism for target network parameters. This hands-on implementation demonstrates how deep neural networks integrate with Q-learning to create powerful reinforcement learning agents capable of handling complex environments. The tutorial provides practical insights into the training process, including proper tensor manipulation, loss computation, and gradient updates, making it essential for anyone looking to master deep Q-learning implementation.
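The arithmetic behind the Q-targets and the loss can be sketched in plain Python. This assumes the standard DQN target, reward plus discounted maximum next-state Q-value from the target network (zeroed for terminal states), and a mean squared error loss. The backpropagation and optimizer step that PyTorch performs are not shown, and the function names are assumptions.

```python
def q_targets(rewards, next_q_values, dones, gamma=0.99):
    """Target for each experience: r + gamma * max_a' Q_target(s', a') * (1 - done)."""
    return [r + gamma * max(next_q) * (1.0 - d)
            for r, next_q, d in zip(rewards, next_q_values, dones)]

def mse_loss(expected, targets):
    """Mean squared error between local-network Q-values and the targets."""
    return sum((e - t) ** 2 for e, t in zip(expected, targets)) / len(expected)

# Two sampled experiences; the second one ends its episode (done = 1.0).
rewards = [1.0, -1.0]
next_q_values = [[0.5, 2.0], [3.0, 1.0]]   # target-network outputs for next states
dones = [0.0, 1.0]

targets = q_targets(rewards, next_q_values, dones, gamma=0.5)
loss = mse_loss([1.0, -1.0], targets)      # expected values from the local network
```

The `(1 - done)` factor removes the bootstrapped term for terminal transitions, so the second target is just the reward.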
Step 14 - Deep Q-Network Implementation: Soft Update Method for Stable Training (6:47)
If you want to know:
- How does soft update improve stability in Deep Q-Networks?
- What's the best way to implement target network updates in deep reinforcement learning?
- How can you optimize the training process for complex environments?
- Why is parameter updating crucial for deep Q-learning success?
- How do target and local networks interact in DQN implementation?
Then this lecture is for you!

This comprehensive lecture focuses on implementing the soft update method in Deep Q-Networks (DQN), a crucial technique for stable reinforcement learning training. Learn how to effectively manage parameter updates between local and target Q-networks using interpolation parameters and weighted averages. The lecture covers the implementation of the soft update method, explaining how it prevents abrupt changes that could destabilize the training process. You'll understand the intricate relationship between local Q-networks for action selection and target Q-networks for Q-value calculation. The implementation includes practical code examples demonstrating parameter management, data copying techniques, and the mathematical foundations behind soft updates. This knowledge is essential for developing robust deep reinforcement learning systems that can effectively learn and adapt in complex environments. Perfect for practitioners looking to optimize their DQN implementations and achieve more stable training results in reinforcement learning applications.
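The weighted average behind the soft update can be sketched on flat lists of numbers. In PyTorch the same formula would be applied to each pair of parameter tensors; the function name and the list representation are assumptions for this example.

```python
def soft_update(local_params, target_params, tau=1e-3):
    """Blend target parameters toward local ones: tau * local + (1 - tau) * target."""
    return [tau * local + (1.0 - tau) * target
            for local, target in zip(local_params, target_params)]

local = [1.0, 2.0]
target = [0.0, 0.0]

halfway = soft_update(local, target, tau=0.5)   # exaggerated tau for illustration
small_step = soft_update(local, target)         # tau = 0.001 as in the hyperparameters
```

With a small interpolation parameter such as 0.001, the target network moves only 0.1% of the way toward the local network on each update, which is what keeps the Q-targets from shifting abruptly.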
Step 15: Creating Your First AI Agent - Deep Q-Network (DQN) Tutorial (2:09)
If you want to know:
- How do you create your first AI agent using Deep Q-Network (DQN)?
- What are the essential components needed to initialize a DQN agent?
- How does reinforcement learning integrate with neural networks in practice?
- What's the first step in building an AI model that can learn from its environment?
- How do you transition from architecture to actual AI implementation?
Then this lecture is for you!

In this comprehensive deep dive into reinforcement learning, you'll learn how to create your first functional AI agent using Deep Q-Network (DQN) architecture. The lecture demonstrates the practical implementation of machine learning concepts by showing you how to initialize a DQN agent with just a few lines of code. You'll understand how to properly set up state and action parameters, create an instance of the Agent class, and prepare your AI model for training. This tutorial bridges the gap between theoretical neural network architecture and practical AI implementation, setting the foundation for deep reinforcement learning applications. The session concludes by preparing you for the next stage: training your AI's brain using essential methods like step, act, learn, and soft update functions.
Step 16 - Epsilon-Greedy Strategy: Initializing AI Training Hyperparameters (5:38)
If you want to know:
- How do you initialize hyperparameters for AI training in reinforcement learning?
- What are the key parameters needed for Epsilon-Greedy strategy implementation?
- How do you set up exploration vs exploitation parameters in AI training?
- What are the optimal starting values for training an AI agent using reinforcement learning?
- How do you configure episode lengths and time steps in deep reinforcement learning?
Then this lecture is for you!

In this comprehensive deep dive into AI training initialization, we explore the crucial hyperparameters needed for implementing the Epsilon-Greedy strategy in reinforcement learning. The lecture covers essential training parameters including episode counts, maximum time steps, and the complete setup of Epsilon-Greedy action selection policy. You'll learn how to configure starting values, decay rates, and ending values for the exploration-exploitation trade-off. The session demonstrates practical implementation of training windows using double-ended queues and explains how to optimize these parameters for effective AI agent training. This fundamental knowledge is crucial for anyone working with deep reinforcement learning algorithms and machine learning models. Perfect for developers and AI enthusiasts looking to understand the initialization phase of AI training systems.
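The initialization described above might look like the following sketch. All concrete numbers here (2000 episodes, 1000 time steps, epsilon from 1.0 down to 0.01 with a 0.995 decay, a window of 100 scores) are assumptions chosen for illustration and are not stated in the lecture description.

```python
from collections import deque

number_episodes = 2000                 # assumed
max_timesteps_per_episode = 1000       # assumed
epsilon_start = 1.0                    # assumed: explore fully at first
epsilon_end = 0.01                     # assumed: floor on exploration
epsilon_decay = 0.995                  # assumed: multiplicative decay per episode
scores_window = deque(maxlen=100)      # keeps only the most recent 100 scores

def decay_epsilon(epsilon, decay=epsilon_decay, floor=epsilon_end):
    """Shrink epsilon after an episode without going below the floor."""
    return max(floor, decay * epsilon)

# Epsilon value used in each episode.
schedule = []
epsilon = epsilon_start
for _ in range(number_episodes):
    schedule.append(epsilon)
    epsilon = decay_epsilon(epsilon)

# A deque with maxlen silently discards the oldest entries.
for score in range(150):
    scores_window.append(float(score))
```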
Step 17: Deep Q-Learning Training Loop - Complete Lunar Lander Guide (9:39)
If you want to know:
- How do you implement a complete training loop for Deep Q-Learning?
- What are the essential components of a DQN training process?
- How does epsilon-greedy policy work in reinforcement learning?
- How can you track and update rewards in a lunar lander environment?
- What's the proper way to handle state transitions in Deep RL?
Then this lecture is for you!

This comprehensive guide walks you through implementing a complete training loop for Deep Q-Learning using the Lunar Lander environment from OpenAI Gym. Learn how to structure the essential components of Deep Reinforcement Learning (DRL), including state initialization, action selection, and reward processing. The lecture covers crucial implementation details such as epsilon-greedy policy application, state transitions, and reward accumulation. You'll understand how to properly implement the learning process using DQN (Deep Q-Network), manage episode termination conditions, and handle score tracking. The tutorial provides practical insights into hyperparameter management, including epsilon decay for exploration-exploitation trade-off, making it an essential resource for building robust reinforcement learning models. Perfect for those looking to master deep Q-learning implementation with real-world applications.
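The overall shape of such a training loop can be sketched with stand-ins. `ToyEnv` and `RandomAgent` are hypothetical and exist only so the loop runs without gymnasium or PyTorch. The toy `reset` and `step` signatures are simplified compared with the real gymnasium API, which returns extra values such as `info`.

```python
import random

class ToyEnv:
    """Hypothetical environment: episodes last 5 steps and action 1 earns +1."""
    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= 5
        return [float(self.t)], reward, done

class RandomAgent:
    """Hypothetical agent: acts randomly and only counts stored experiences."""
    def __init__(self, rng):
        self.rng = rng
        self.stored = 0

    def act(self, state, epsilon):
        return self.rng.randrange(2)

    def step(self, state, action, reward, next_state, done):
        self.stored += 1  # a DQN agent would store and possibly learn here

def train(env, agent, number_episodes, max_timesteps,
          epsilon=1.0, epsilon_end=0.01, epsilon_decay=0.995):
    scores = []
    for _ in range(number_episodes):
        state = env.reset()                      # state initialization
        score = 0.0
        for _ in range(max_timesteps):
            action = agent.act(state, epsilon)   # epsilon-greedy action selection
            next_state, reward, done = env.step(action)
            agent.step(state, action, reward, next_state, done)
            state = next_state                   # state transition
            score += reward                      # reward accumulation
            if done:                             # episode termination
                break
        scores.append(score)
        epsilon = max(epsilon_end, epsilon_decay * epsilon)  # epsilon decay
    return scores, epsilon

agent = RandomAgent(random.Random(0))
scores, final_epsilon = train(ToyEnv(), agent, number_episodes=10, max_timesteps=100)
```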
Step 18: DQN Training Visualization - Dynamic Score Tracking Implementation (20:26)
If you want to know:
- How do you implement dynamic score tracking in DQN training?
- What's the best way to visualize real-time training progress in reinforcement learning?
- How can you monitor and display average scores during DQN model training?
- What techniques are used to implement overwriting effects in training visualization?
- How do you determine when a reinforcement learning model has successfully solved an environment?
Then this lecture is for you!

This lecture demonstrates the implementation of dynamic score tracking visualization for Deep Q-Network (DQN) training in reinforcement learning. Learn how to create a printing system that displays real-time average scores and overwrites the previous console line, enabling efficient monitoring of model performance. The implementation covers score calculation over episodes, dynamic console updates, and automated model checkpoint saving when reaching target performance thresholds. You'll understand how to track training progress using moving averages, use carriage returns for clean in-place updates, and set up condition-based model saving. The lecture showcases practical techniques for monitoring DQN training in OpenAI Gym environments, helping you evaluate and optimize your reinforcement learning models effectively. Perfect for those looking to enhance their deep RL implementations with professional-grade training visualization capabilities.
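A minimal sketch of this kind of progress printing, assuming a moving average over a `deque` and a solved threshold of 200 points. The function names, the message format, and the threshold are assumptions for this example. Saving the model checkpoint is left out because it depends on PyTorch.

```python
from collections import deque

def progress_line(episode, scores_window):
    """Build a status line; the leading carriage return moves the cursor to line start."""
    average = sum(scores_window) / len(scores_window)
    return f"\rEpisode {episode}\tAverage Score: {average:.2f}"

def is_solved(scores_window, threshold=200.0):
    """True once the moving average reaches the (assumed) target score."""
    return sum(scores_window) / len(scores_window) >= threshold

scores_window = deque([100.0, 300.0], maxlen=100)
line = progress_line(7, scores_window)

# end="" keeps the cursor on the same line so the next print overwrites it.
print(line, end="")
if is_solved(scores_window):
    print(f"\nEnvironment solved at episode {7}")
```

Printing with `end=""` and a leading `\r` keeps the console output to a single line that refreshes in place. A common variation prints a normal newline every 100 episodes to leave a permanent record.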
Step 19: Visualizing Deep Q-Learning - AI Perfects Lunar Lander Landing (5:19)
If you want to know:
- How does Deep Q-Learning perform in real-world scenarios like lunar landing?
- What does successful reinforcement learning visualization look like in practice?
- How can you evaluate a trained DQN model's performance?
- What's the difference between training and inference mode in Deep RL?
- How does OpenAI Gym help visualize reinforcement learning results?
Then this lecture is for you!

This lecture demonstrates the practical visualization of a Deep Q-Learning model successfully mastering the Lunar Lander environment. Students will observe the trained deep reinforcement learning agent executing perfect landings in OpenAI Gym's simulation. The session covers the transition from training to inference mode, explaining how the trained DQN model applies learned policies without further optimization. You'll understand the key differences between training and evaluation phases, see real-time visualization of the agent's decision-making process, and learn how to interpret the model's performance metrics. The lecture includes practical demonstrations of saving model parameters, generating visualization frames, and creating downloadable video outputs of the trained agent's performance. This hands-on demonstration provides valuable insights into deep reinforcement learning model evaluation and real-world application visualization.
ChatGPT vs Custom DQN: Comparing Deep RL Implementations (5:44)
If you want to know:
- How does ChatGPT's DQN implementation compare to custom-built Deep Q-Learning models?
- What are the key differences between manual and AI-generated reinforcement learning code?
- Why is understanding DQN implementation fundamentals important for AI engineers?
- Can ChatGPT replace manual Deep Q-Learning implementation in real-world applications?
- What are the trade-offs between using ChatGPT and custom DQN implementations?
Then this lecture is for you!

This lecture provides a comprehensive comparison between manually implemented Deep Q-Learning (DQN) models and ChatGPT-generated solutions using the OpenAI Gym Lunar Lander environment. Through practical demonstrations in PyTorch, the lecture explores the architectural differences, performance variations, and implementation nuances between both approaches. Students will gain valuable insights into deep reinforcement learning fundamentals, including epsilon-greedy strategies, experience replay, and neural network architectures. The session emphasizes the importance of understanding core DQN concepts for optimization and parameter tuning, highlighting why manual implementation knowledge gives AI engineers a competitive advantage. Real-world performance evaluations and practical code analysis demonstrate the trade-offs between using AI-generated code versus custom implementations, providing essential insights for developing robust reinforcement learning models.

Requirements

High School Maths
Basic Python knowledge

Description

Welcome to Artificial Intelligence A-Z!

This course is structured in 10 parts:

Part 1 - Prompt Engineering: Prompt Engineering & Prompt Templates, Prompt Engineering Techniques, The 4 Elements of a (good) prompt, Inference Parameters
Part 2 - Generative AI: Fundamentals of Generative AI, Image Generation, Foundation Models Overview, Foundation Models Lifecycle, Data Selection, Foundation Models Selection, Training vs. Inference, Context Window, Tokens and Embeddings, Transformers, Foundation Models Training, Foundation Models Fine-Tuning, Foundation Models Evaluation, Retrieval-Augmented Generation (RAG) for Cooking Assistance
Part 3 - Agentic AI: AI Agents, Building a Cloud-powered AI Agent for Business Assistance
Part 4 - Fundamentals of Reinforcement Learning: Q-Learning Intuition, Q-Learning Implementation
Part 5 - Deep Q-Learning: Deep Q-Learning Intuition, Deep Q-Learning Implementation for Moon Landing
Part 6 - Deep Convolutional Q-Learning: Deep Convolutional Q-Learning Intuition, Deep Convolutional Q-Learning Implementation for Pac-Man
Part 7 - A3C: A3C Intuition, A3C Implementation for Kung Fu
Part 8 - PPO and SAC: Proximal Policy Optimization, Soft Actor-Critic, Build and Train the PPO & SAC models for Self-Driving Cars
Part 9 - LLMs: The Ingredients of an LLM, Who invented LLMs, How LLMs generate text, Understand what's inside an LLM, The LLM Parameters, The LLM Context Window, How to Fine-Tune LLMs for Medical Assistance
Part 10 - Responsible AI: Features of Responsible AI, Guardrails in Generative AI, Legal Risks of Generative AI, AWS Tools for Responsible AI, Amazon SageMaker Clarify and Monitor, Amazon Augmented AI [Amazon A2I], Interpretability vs. Explainability, SageMaker Model Cards

All along this journey, you will learn key AI concepts with intuition lectures to get you quickly up to speed with all things AI and practice them by building 12 different AIs:

Build a ChatBot App that speaks like Master Yoda in 5 Minutes.
Build a Movie Script Generator by leveraging advanced Prompt Engineering.
Build Your Custom LLM with Amazon Bedrock, Databricks, and Hugging Face.
Build a RAG-powered Generative AI application with Amazon Bedrock and Knowledge Bases.
Build an AI Agent with a Foundation Model (LLM) for business assistance, all powered by the Cloud.
Build an AI with a Q-Learning model and train it to optimize warehouse flows in a Process Optimization case study.
Build an AI with a Deep Q-Learning model and train it to land on the moon.
Build an AI with a Deep Convolutional Q-Learning model and train it to play the game of Pac-Man.
Build an AI with an A3C (Asynchronous Advantage Actor-Critic) model and train it to fight Kung Fu.
Build an AI with a PPO (Proximal Policy Optimization) model and train it for a Self-Driving Car.
Build an AI with a SAC (Soft Actor-Critic) model and train it for a Self-Driving Car.
Build an AI by fine-tuning a powerful pre-trained LLM (Llama by Meta) with Hugging Face and re-train it to chat with you about medical terms. Simply put, we build here an AI Doctor Chatbot.

Some of these AIs will be built in AWS, and the others will be built in Python and PyTorch.

But that's not all... Once you complete the course, you will get 3 extra AIs: DDPG, Full World Model, and Evolution Strategies & Genetic Algorithms. We build these AIs with ChatGPT for a Self-Driving Car and a Humanoid application. For each of these extra AIs you will get a long video lecture explaining the implementation, a mini PDF, and the Python code.

Besides, you will get a free 3-hour extra course on Generative AI and LLMs with Cloud Computing as a prize for completing the course.

And last but not least, here is what you will get with this course:

1. Complete beginner to expert AI skills: Learn to code self-improving AI for a range of purposes. In fact, we code together with you. Every tutorial starts with a blank page and we write up the code from scratch. This way you can follow along and understand exactly how the code comes together and what each line means.

2. Hassle-Free Coding and Code templates: We will build all our AIs in Google Colab, which means that we will have absolutely NO hassle installing libraries or packages because everything is already pre-installed in Google Colab notebooks. Plus, you’ll get downloadable Python code templates (in .py and .ipynb) for every AI you build in the course. This makes building truly unique AI as simple as changing a few lines of code. If you unleash your imagination, the potential is unlimited.

3. Intuition Tutorials: Where most courses simply bombard you with dense theory and set you on your way, we believe in developing a deep understanding of not only what you’re doing, but why you’re doing it. That’s why we don’t throw complex mathematics at you, but focus on building up your intuition in AI for much better results down the line.

4. Real-world solutions: You’ll reach your goal with not just one AI model but 5. Each module varies in structure and difficulty, so you’ll be skilled enough to build AI adaptable to any real-life environment, rather than just passing a glorified “test and forget” memory exercise as in most other courses. Practice truly does make perfect.

5. In-course support: We’re fully committed to making this the most accessible and results-driven AI course on the planet. This requires us to be there when you need our help. That’s why we’ve put together a team of professional Data Scientists to support you in your journey, meaning you’ll get a response from us within 48 hours maximum.

So, are you ready to embrace the fascinating world of AI?

Come join us, never stop learning, and until then, enjoy AI!

Who this course is for:

Anyone interested in Artificial Intelligence, Machine Learning or Deep Learning

Course content

Welcome to the course! (3 lectures • 6min)

--- Part 1: Prompt Engineering --- (7 lectures • 30min)

--- Part 2: Generative AI --- (19 lectures • 1hr 4min)

--- Part 3: Agentic AI --- (3 lectures • 3min)

--- Part 4: Fundamentals of Reinforcement Learning --- (1 lecture • 1min)

Q-Learning Intuition (9 lectures • 1hr 49min)

Q-Learning Implementation (1 lecture • 1min)

--- Part 5: Deep Q-Learning --- (1 lecture • 1min)

Deep Q-Learning Intuition (5 lectures • 56min)

Deep Q-Learning Implementation (21 lectures • 2hr 19min)
