Detailed Course Outline
Introduction
- Meet the instructor.
- Create an account at courses.nvidia.com/join
 
Introduction to Training of Large Models
- Learn about the motivation behind and key challenges of training large models.
- Get an overview of the basic techniques and tools needed for large-scale training.
- Get an introduction to distributed training and the Slurm job scheduler.
- Train a GPT model using data parallelism (a minimal sketch follows this list).
- Profile the training process and understand execution performance.
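
The workshop trains a GPT model with data parallelism; purely as a minimal, framework-agnostic sketch of that idea (not the workshop code), the example below wraps a toy PyTorch model in DistributedDataParallel. The model, data, and hyperparameters are placeholders, and a torchrun launch (under Slurm, typically invoked via srun) is assumed.

```python
# Minimal data-parallel training sketch (PyTorch DDP), assuming a launch such as:
#   torchrun --nproc_per_node=<num_gpus> ddp_sketch.py
# The tiny MLP and random data are placeholders standing in for GPT and a real dataset.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE in the environment.
    dist.init_process_group(backend="nccl" if torch.cuda.is_available() else "gloo")
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    device = torch.device(f"cuda:{local_rank}" if torch.cuda.is_available() else "cpu")

    model = torch.nn.Sequential(
        torch.nn.Linear(512, 2048), torch.nn.GELU(), torch.nn.Linear(2048, 512)
    ).to(device)
    model = DDP(model, device_ids=[local_rank] if torch.cuda.is_available() else None)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        # Each rank draws its own shard of data; DDP all-reduces gradients in backward().
        x = torch.randn(8, 512, device=device)
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if dist.get_rank() == 0:
            print(f"step {step}: loss {loss.item():.4f}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```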
 
Model Parallelism: Advanced Topics
- Increase the model size using a range of memory-saving techniques.
- Get an introduction to tensor and pipeline parallelism (a toy tensor-parallel sketch follows this list).
- Go beyond natural language processing and get an introduction to DeepSpeed.
- Auto-tune model performance.
- Learn about mixture-of-experts models.
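
As a toy illustration of the tensor-parallel idea covered in this module, the sketch below splits a linear layer's weight matrix column-wise between two workers and gathers the partial outputs. It runs in a single process for clarity; the layer sizes and two-way split are arbitrary, and real frameworks shard across GPUs and replace the local concatenation with collective communication (an all-gather).

```python
# Toy column-parallel linear layer: the core idea behind tensor parallelism.
import torch

torch.manual_seed(0)
x = torch.randn(4, 1024)       # a batch of activations
W = torch.randn(1024, 4096)    # full weight matrix (imagine it is too large for one GPU)
b = torch.randn(4096)

# Reference: the unsharded computation.
y_full = x @ W + b

# Tensor parallelism: split W and b column-wise across two "workers".
W0, W1 = W.chunk(2, dim=1)     # each worker holds a 1024 x 2048 shard
b0, b1 = b.chunk(2, dim=0)
y0 = x @ W0 + b0               # partial output on worker 0
y1 = x @ W1 + b1               # partial output on worker 1

# Gathering the partial outputs reconstructs the full result.
y_parallel = torch.cat([y0, y1], dim=1)
print(torch.allclose(y_full, y_parallel, atol=1e-5))  # True
```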
 
Inference of Large Models
- Understand the challenges of deployment associated with large models.
- Explore techniques for model size reduction.
- Learn how to use TensorRT-LLM.
- Learn how to use Triton Inference Server (a client sketch follows this list).
- Understand the process of deploying a GPT checkpoint to production.
- See an example of prompt engineering.
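
To make the deployment step concrete, the sketch below sends a request to a running Triton Inference Server using the standard Python HTTP client. The server URL, the model name "gpt", and the tensor names "input_ids" and "logits" are assumptions for illustration; the actual names come from the model repository configuration produced during deployment.

```python
# Minimal Triton Inference Server client sketch (HTTP).
# Model name and tensor names are placeholders, not the workshop's actual config.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# A toy batch of token IDs standing in for a tokenized prompt.
input_ids = np.array([[101, 2023, 2003, 1037, 3231, 102]], dtype=np.int32)

infer_input = httpclient.InferInput("input_ids", list(input_ids.shape), "INT32")
infer_input.set_data_from_numpy(input_ids)

response = client.infer(model_name="gpt", inputs=[infer_input])
logits = response.as_numpy("logits")
print(logits.shape)
```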
 
Final Review
- Review key learnings and answer questions.
- Complete the assessment and earn a certificate.
- Complete the workshop survey.