Detailed Course Outline
Introduction
- Meet the instructor.
- Create an account at courses.nvidia.com/join
 
Introduction to Training of Large Models
- Learn about the motivation behind and key challenges of training large models.
- Get an overview of the basic techniques and tools needed for large-scale training.
- Get an introduction to distributed training and the Slurm job scheduler.
- Train a GPT model using data parallelism (a minimal sketch follows this list).
- Profile the training process and understand execution performance.
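
The workshop trains a GPT model with data parallelism; purely as a minimal, framework-agnostic sketch of that idea (not the workshop code), the example below wraps a toy PyTorch model in DistributedDataParallel. The model, data, and hyperparameters are placeholders, and a torchrun launch (under Slurm, typically invoked via srun) is assumed.

```python
# Minimal data-parallel training sketch (PyTorch DDP), assuming a launch such as:
#   torchrun --nproc_per_node=<num_gpus> ddp_sketch.py
# The tiny MLP and random data are placeholders standing in for GPT and a real dataset.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE in the environment.
    dist.init_process_group(backend="nccl" if torch.cuda.is_available() else "gloo")
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    device = torch.device(f"cuda:{local_rank}" if torch.cuda.is_available() else "cpu")

    model = torch.nn.Sequential(
        torch.nn.Linear(512, 2048), torch.nn.GELU(), torch.nn.Linear(2048, 512)
    ).to(device)
    model = DDP(model, device_ids=[local_rank] if torch.cuda.is_available() else None)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        # Each rank draws its own shard of data; DDP all-reduces gradients in backward().
        x = torch.randn(8, 512, device=device)
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if dist.get_rank() == 0:
            print(f"step {step}: loss {loss.item():.4f}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```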
 
Model Parallelism: Advanced Topics
- Increase the model size using a range of memory-saving techniques.
- Get an introduction to tensor and pipeline parallelism (a toy tensor-parallel sketch follows this list).
- Go beyond natural language processing and get an introduction to DeepSpeed.
- Auto-tune model performance.
- Learn about mixture-of-experts models.
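
As a toy illustration of the tensor-parallel idea covered in this module, the sketch below splits a linear layer's weight matrix column-wise between two workers and gathers the partial outputs. It runs in a single process for clarity; the layer sizes and two-way split are arbitrary, and real frameworks shard across GPUs and replace the local concatenation with collective communication (an all-gather).

```python
# Toy column-parallel linear layer: the core idea behind tensor parallelism.
import torch

torch.manual_seed(0)
x = torch.randn(4, 1024)       # a batch of activations
W = torch.randn(1024, 4096)    # full weight matrix (imagine it is too large for one GPU)
b = torch.randn(4096)

# Reference: the unsharded computation.
y_full = x @ W + b

# Tensor parallelism: split W and b column-wise across two "workers".
W0, W1 = W.chunk(2, dim=1)     # each worker holds a 1024 x 2048 shard
b0, b1 = b.chunk(2, dim=0)
y0 = x @ W0 + b0               # partial output on worker 0
y1 = x @ W1 + b1               # partial output on worker 1

# Gathering the partial outputs reconstructs the full result.
y_parallel = torch.cat([y0, y1], dim=1)
print(torch.allclose(y_full, y_parallel, atol=1e-5))  # True
```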
 
Inference of Large Models
- Understand the challenges of deployment associated with large models.
- Explore techniques for model size reduction.
- Learn how to use TensorRT-LLM.
- Learn how to use Triton Inference Server (a client sketch follows this list).
- Understand the process of deploying a GPT checkpoint to production.
- See an example of prompt engineering.
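
To make the deployment step concrete, the sketch below sends a request to a running Triton Inference Server using the standard Python HTTP client. The server URL, the model name "gpt", and the tensor names "input_ids" and "logits" are assumptions for illustration; the actual names come from the model repository configuration produced during deployment.

```python
# Minimal Triton Inference Server client sketch (HTTP).
# Model name and tensor names are placeholders, not the workshop's actual config.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# A toy batch of token IDs standing in for a tokenized prompt.
input_ids = np.array([[101, 2023, 2003, 1037, 3231, 102]], dtype=np.int32)

infer_input = httpclient.InferInput("input_ids", list(input_ids.shape), "INT32")
infer_input.set_data_from_numpy(input_ids)

response = client.infer(model_name="gpt", inputs=[infer_input])
logits = response.as_numpy("logits")
print(logits.shape)
```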
 
Final Review
- Review key learnings and answer questions.
- Complete the assessment and earn a certificate.
- Complete the workshop survey.