Title of the talk
Multi-GPU ML training using PyTorch DDP
Description
As the scale and complexity of deep learning models continue to grow, efficient training strategies have become crucial for accelerating innovation and pushing the boundaries of AI research and deployment. Multi-GPU training has emerged as a game-changer, enabling faster model convergence and the ability to handle larger datasets and models. Among the various approaches available, PyTorch’s Distributed Data Parallel (DDP) stands out as a powerful and efficient solution designed for scalability and performance.
Table of contents
Topics of interest include, but are not limited to:
- Introduction to PyTorch DDP (Distributed Data Parallel)
- Best practices for setting up and using PyTorch DDP for multi-GPU training.
- Practical demo on training a simple neural network on the MNIST dataset using PyTorch DDP (see the sketch after this list).
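
To give a feel for what the demo covers, here is a minimal sketch of a DDP MNIST run. This is not the speakers' demo code; it assumes a `torchrun --nproc_per_node=<num_gpus> train_ddp_mnist.py` launch with the NCCL backend, and the model, hyperparameters, and data path are illustrative.

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler
from torchvision import datasets, transforms


def main():
    # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Simple fully connected network for 28x28 MNIST images (illustrative).
    model = nn.Sequential(
        nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10)
    ).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    dataset = datasets.MNIST(
        "./data", train=True, download=True, transform=transforms.ToTensor()
    )
    # DistributedSampler shards the dataset so each rank sees a unique slice.
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards across epochs
        for images, labels in loader:
            images, labels = images.cuda(local_rank), labels.cuda(local_rank)
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()  # gradients are all-reduced across GPUs here
            optimizer.step()
        if dist.get_rank() == 0:
            print(f"epoch {epoch} done, last loss {loss.item():.4f}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Each process owns one GPU, the `DistributedSampler` keeps the data shards disjoint, and DDP synchronizes gradients during `backward()`, which is what makes this approach scale with the number of GPUs.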
Duration (including Q&A)
25 mins
Prerequisites
None
Speaker bio
- Amita Sharma (Red Hat OpenShift AI - Kubeflow Training Team: Technical Project Manager)
- Abhijeet Dhumal (Red Hat OpenShift AI - Kubeflow Training Team: Engineer) @abhijeet-dhumal
The talk/workshop speaker agrees to
- Share the slides, code snippets and other material used during the talk
- If the talk is recorded, grant permission to release the video on PythonPune's YouTube channel under the CC-BY-4.0 license
- Not do any hiring pitches during the talk and follow the Code of Conduct