Google Cloud Developer Advocate Nikita Namjoshi introduces how distributed training can dramatically reduce machine learning training times, explains how to make use of multiple GPUs with Data Parallelism vs Model Parallelism, and explores Synchronous vs Asynchronous Data Parallelism.

Mesh TensorFlow →
Distributed Training with Keras tutorial →
GCP Reduction Server Blog →
Multi Worker Mirrored Strategy tutorial →
Parameter Server Strategy tutorial →
Distributed training on GCP Demo →

Chapters:
0:00 - Introduction
00:17 - Agenda
00:37 - Why distributed training?
1:49 - Data Parallelism vs Model Parallelism
6:05 - Synchronous Data Parallelism
18:20 - Asynchronous Data Parallelism
23:41 - Thank you for watching

Watch more ML Tech Talks →
Subscribe to TensorFlow →

#TensorFlow #MachineLearning #ML
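As a rough sketch of the synchronous data parallelism approach discussed in the talk, the snippet below shows how a Keras training loop can be wrapped in tf.distribute.MirroredStrategy, which replicates the model across the GPUs on one machine and all-reduces gradients each step. It is not taken from the video; the MNIST dataset, model architecture, and batch sizes are illustrative choices only.

```python
import tensorflow as tf

# Synchronous data parallelism on a single machine: MirroredStrategy
# creates one model replica per visible GPU and averages gradients
# across replicas with an all-reduce at every training step.
strategy = tf.distribute.MirroredStrategy()
print("Number of replicas:", strategy.num_replicas_in_sync)

# Build and compile the model inside the strategy scope so variables
# are created as mirrored (per-replica) variables.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )

# Scale the global batch size by the number of replicas so each GPU
# still processes a reasonable per-replica batch (illustrative values).
per_replica_batch = 64
global_batch = per_replica_batch * strategy.num_replicas_in_sync

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

# model.fit works unchanged; Keras splits each global batch across
# the replicas and aggregates the results.
model.fit(x_train, y_train, batch_size=global_batch, epochs=2)
```

Moving from one machine to several workers is largely a matter of swapping in tf.distribute.MultiWorkerMirroredStrategy (still synchronous) or tf.distribute.experimental.ParameterServerStrategy (asynchronous), as covered in the linked tutorials.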