Distributed Training
Sherlock edited this page Mar 12, 2021 · 1 revision
- Understand NCCLAllReduce
- Get familiar with DDP usage/setup
- Read https://arxiv.org/abs/1910.02054 (the ZeRO paper)
  - ZeRO-1
    - Understand ReduceScatter/AllGather
    - Understand how optimizer state is partitioned
  - ZeRO-2
  - ZeRO-3
    - Understand All2All
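
As a study aid for the collectives named in the list above, here is a minimal pure-Python sketch of what AllReduce, ReduceScatter, AllGather, and All2All compute, followed by one way optimizer state can be partitioned across ranks. It is a simulation only: each "rank" is a list entry, nothing here calls NCCL, ORT, or DDP, and every function name (`all_reduce`, `zero1_momentum_step`, and so on) is hypothetical. The partitioned step assumes SGD with momentum and a reduce-scatter/all-gather schedule, which is one common way to realize this idea and not necessarily how ORT implements it.

```python
# Pure-Python simulation of collectives; bufs[r] is the buffer held by rank r.
# Assumes every buffer has the same length, divisible by the number of ranks.

def all_reduce(bufs):
    """Every rank ends up with the element-wise sum over all ranks."""
    total = [sum(vals) for vals in zip(*bufs)]
    return [list(total) for _ in bufs]

def reduce_scatter(bufs):
    """Rank r ends up with only the r-th shard of the element-wise sum."""
    world = len(bufs)
    total = [sum(vals) for vals in zip(*bufs)]
    shard = len(total) // world
    return [total[r * shard:(r + 1) * shard] for r in range(world)]

def all_gather(shards):
    """Every rank ends up with the concatenation of all ranks' shards."""
    full = [x for s in shards for x in s]
    return [list(full) for _ in shards]

def all_to_all(bufs):
    """Rank src sends its dst-th chunk to rank dst; rank dst concatenates
    the chunks it receives in source-rank order."""
    world = len(bufs)
    chunk = len(bufs[0]) // world
    return [
        [x for src in range(world)
           for x in bufs[src][dst * chunk:(dst + 1) * chunk]]
        for dst in range(world)
    ]

def zero1_momentum_step(params, grads_per_rank, momentum_shards,
                        lr=0.1, beta=0.9):
    """One step with partitioned optimizer state (hypothetical sketch).

    Rank r stores momentum only for its own parameter shard
    (momentum_shards[r], updated in place), updates that shard, and the
    updated shards are then all-gathered into full parameters.
    """
    world = len(grads_per_rank)
    shard = len(params) // world
    grad_shards = reduce_scatter(grads_per_rank)  # summed grads, one shard per rank
    new_shards = []
    for r in range(world):
        updated = []
        for i, g_sum in enumerate(grad_shards[r]):
            g = g_sum / world  # average over data-parallel ranks
            momentum_shards[r][i] = beta * momentum_shards[r][i] + g
            updated.append(params[r * shard + i] - lr * momentum_shards[r][i])
        new_shards.append(updated)
    return all_gather(new_shards)[0]  # identical on every rank

grads = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]  # two ranks
print(all_reduce(grads))      # [[11.0, 22.0, 33.0, 44.0], [11.0, 22.0, 33.0, 44.0]]
print(reduce_scatter(grads))  # [[11.0, 22.0], [33.0, 44.0]]
print(all_to_all(grads))      # [[1.0, 2.0, 10.0, 20.0], [3.0, 4.0, 30.0, 40.0]]
```

Note that `all_gather(reduce_scatter(grads))` gives the same result as `all_reduce(grads)`. That decomposition is why partitioning the optimizer state this way need not add communication volume compared with a plain AllReduce of gradients.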
Please use the learning roadmap on the home wiki page to build a general understanding of ORT.