Harnessing the Power of Multi-GPU Training with PyTorch Distributed Data Parallel (DDP) #191

@abhijeet-dhumal

Description

Title of the talk

Multi-GPU ML training using PyTorch DDP

Description

As the scale and complexity of deep learning models continue to grow, efficient training strategies have become crucial for accelerating innovation and pushing the boundaries of AI research and deployment. Multi-GPU training has emerged as a game-changer, enabling faster model convergence and the ability to handle larger datasets and models. Among the various approaches available, PyTorch’s Distributed Data Parallel (DDP) stands out as a powerful and efficient solution designed for scalability and performance.

Table of contents

Topics of interest include, but are not limited to:

  • Introduction to PyTorch DDP (Distributed Data Parallel)
  • Best practices for setting up and using PyTorch DDP for multi-GPU training.
  • Practical demo on training a simple neural network on the MNIST dataset using PyTorch DDP (a minimal sketch of such a setup follows this list).
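
For reference, below is a minimal sketch of the kind of MNIST demo described above. It assumes torch and torchvision are installed and that the script is launched with torchrun on a multi-GPU node; the SimpleNet model, hyperparameters, and data path are illustrative choices rather than the exact demo code from the talk.

```python
# A minimal sketch of multi-GPU training with PyTorch DDP on MNIST.
# Assumes launch via torchrun, which sets RANK, LOCAL_RANK and WORLD_SIZE
# for each spawned process. Model and hyperparameters are illustrative.
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler
from torchvision import datasets, transforms


class SimpleNet(nn.Module):
    """A small fully connected network for 28x28 MNIST images."""

    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128),
            nn.ReLU(),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        return self.layers(x)


def main():
    # One process per GPU; NCCL is the usual backend for GPU training.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # DistributedSampler gives each process a non-overlapping shard of the data.
    transform = transforms.Compose(
        [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
    )
    dataset = datasets.MNIST("./data", train=True, download=True, transform=transform)
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)

    # Wrapping the model in DDP makes gradients get all-reduced across GPUs.
    model = SimpleNet().cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for images, labels in loader:
            images, labels = images.cuda(local_rank), labels.cuda(local_rank)
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()  # DDP synchronizes gradients during backward
            optimizer.step()
        if dist.get_rank() == 0:
            print(f"epoch {epoch} done, last loss {loss.item():.4f}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Saved as, say, mnist_ddp.py (a hypothetical filename), the sketch could be launched on a single node with two GPUs via `torchrun --nproc_per_node=2 mnist_ddp.py`; torchrun spawns one process per GPU and sets the environment variables that init_process_group and the script rely on.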

Duration (including Q&A)

25 mins

Prerequisites

No prerequisites are required.

Speaker bio

  • Amita Sharma (Red Hat OpenShift AI, Kubeflow Training Team: Technical Project Manager)
  • Abhijeet Dhumal (Red Hat OpenShift AI, Kubeflow Training Team: Engineer) @abhijeet-dhumal

The talk/workshop speaker agrees to

    Labels

    scheduled, talk-proposal
