A detailed blog post on the various distributed training strategies can be read here.
To run the standalone (single-process) PyTorch training script:
python train.py
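As a point of reference for the distributed variants below, here is a minimal sketch of what a standalone script like train.py might contain; the toy linear model, random data, and hyperparameters are illustrative assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def main():
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = nn.Linear(10, 1).to(device)          # toy model (assumption)
    dataset = TensorDataset(torch.randn(1024, 10), torch.randn(1024, 1))
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for epoch in range(3):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()

if __name__ == "__main__":
    main()
```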
To run the DataParallel PyTorch training script:
python train_dataparallel.py
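A sketch of the DataParallel variant, assuming the same toy model as above: the only change from the standalone loop is wrapping the model so each batch is split across all visible GPUs in a single process.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                          # toy model (assumption)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)                # replicate the model on every visible GPU
model = model.to("cuda" if torch.cuda.is_available() else "cpu")
# The training loop itself is unchanged: DataParallel scatters each input batch
# across GPUs and gathers the outputs back on the default device.
```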
To run the DistributedDataParallel (DDP) PyTorch training script (one node, four processes):
torchrun --nnodes=1 --nproc-per-node=4 train_ddp.py
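A minimal sketch of a torchrun-launched DDP script such as train_ddp.py might look like the following; the model and data are placeholder assumptions. Each of the four processes owns one GPU, torchrun supplies the rank environment variables, and gradients are all-reduced automatically during backward().

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def main():
    dist.init_process_group(backend="nccl")       # torchrun sets RANK/WORLD_SIZE/LOCAL_RANK
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = DDP(nn.Linear(10, 1).cuda(local_rank), device_ids=[local_rank])
    dataset = TensorDataset(torch.randn(1024, 10), torch.randn(1024, 1))
    sampler = DistributedSampler(dataset)         # each rank sees a distinct shard of the data
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for epoch in range(3):
        sampler.set_epoch(epoch)                  # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()       # DDP all-reduces gradients here
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```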
To run the FullyShardedDataParallel (FSDP) PyTorch training script (one node, four processes):
torchrun --nnodes=1 --nproc-per-node=4 train_fsdp.py
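A sketch of the FSDP variant under the same assumptions (toy model, random batches): instead of replicating the full model on every rank as DDP does, FSDP shards parameters, gradients, and optimizer state across the four ranks. Note the optimizer is created after wrapping, so it tracks the sharded parameters.

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1)).cuda(local_rank)
    model = FSDP(model)                           # shard params, grads, and optimizer state
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # built after FSDP wrapping
    loss_fn = nn.MSELoss()

    for step in range(10):
        x = torch.randn(32, 10, device=local_rank)
        y = torch.randn(32, 1, device=local_rank)
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()           # FSDP gathers/reshards params around compute
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```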