Dr. Soper presents a complete walkthrough of a Q-learning-based AI system written in Python. The video demonstrates how to define the environment's states, actions, and rewards, and how to train an AI agent to identify an optimal policy using Q-learning. The business problem presented in the video is for an AI agent to learn to control warehouse robots so that a robot can take the shortest path from any point in the warehouse to the item packaging area, while also learning to avoid crashing into shelves or other item storage locations.

Jupyter notebook for this lesson:
Previous lesson (Foundations of Q-Learning):
Next lesson (Foundations of Artificial Neural Networks & Deep Q-Learning):
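
For readers who want a feel for the approach before watching, here is a minimal sketch of tabular Q-learning on a small warehouse grid. It is not the code from the video or the notebook: the 5x5 layout, the reward values (-100 for a shelf, -1 for an aisle step, 100 for the packaging area), the hyperparameters, and the names `train`, `step`, and `shortest_path` are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 5x5 warehouse (not the layout used in the video).
ROWS, COLS = 5, 5
ACTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left
GOAL = (0, 2)  # assumed location of the item packaging area
AISLES = [(1, c) for c in range(COLS)] + [(2, 2)] + [(3, c) for c in range(COLS)]

# Rewards: every cell is a shelf (-100) unless marked as an aisle (-1) or the goal (100).
rewards = np.full((ROWS, COLS), -100.0)
for cell in AISLES:
    rewards[cell] = -1.0
rewards[GOAL] = 100.0


def is_terminal(cell):
    """Shelves and the packaging area end an episode; aisles do not."""
    return rewards[cell] != -1.0


def step(cell, action):
    """Move one cell in the chosen direction, staying put at the outer walls."""
    dr, dc = ACTIONS[action]
    r = min(max(cell[0] + dr, 0), ROWS - 1)
    c = min(max(cell[1] + dc, 0), COLS - 1)
    return (r, c)


def train(episodes=1000, epsilon=0.1, alpha=0.9, gamma=0.9, seed=0):
    """Learn a Q-table with epsilon-greedy exploration (assumed hyperparameters)."""
    rng = np.random.default_rng(seed)
    q = np.zeros((ROWS, COLS, len(ACTIONS)))
    for _ in range(episodes):
        cell = AISLES[rng.integers(len(AISLES))]  # random aisle start
        while not is_terminal(cell):
            if rng.random() < epsilon:
                action = int(rng.integers(len(ACTIONS)))  # explore
            else:
                action = int(np.argmax(q[cell]))  # exploit
            nxt = step(cell, action)
            # Temporal-difference update toward reward + discounted best next value.
            target = rewards[nxt] + gamma * np.max(q[nxt])
            q[cell[0], cell[1], action] += alpha * (target - q[cell[0], cell[1], action])
            cell = nxt
    return q


def shortest_path(q, start):
    """Follow the greedy policy from start until a terminal cell is reached."""
    path, cell = [start], start
    while not is_terminal(cell) and len(path) < ROWS * COLS:
        cell = step(cell, int(np.argmax(q[cell])))
        path.append(cell)
    return path


q_table = train()
print(shortest_path(q_table, (3, 4)))
```

The -1 reward on every aisle step is what pushes the agent toward short routes, while the large negative shelf reward teaches it to avoid collisions.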