CMSC 35200 – Deep Learning Systems

This class will be held online.

Course Staff

Instructors

Ian Foster (foster@uchicago.edu) and Rick Stevens (rick.stevens@uchicago.edu).

TA

Peng Ding (dingpeng@uchicago.edu)

Office hours: By appointment.

Course Description

Deep learning is emerging as a major technique for solving problems in a variety of fields, including computer vision, personalized medicine, autonomous vehicles, and natural language processing. Critical to success in these domains is the development of learning systems: deep learning frameworks that support both training complex models and performing inference with those models, while targeting many devices, including CPUs, GPUs, mobile devices, edge devices, computer clusters, and scalable parallel systems. The systematic study of how to build and optimize such systems is an active area of research.

This course is intended as an introduction to this topic. We will cover various aspects of deep learning systems, including: basics of deep learning; programming models for expressing machine learning models; automatic differentiation methods used to compute gradients for training; memory optimization; scheduling; data-parallel, model-parallel, and distributed learning; hardware acceleration; domain-specific languages; workflows for large-scale machine learning, including hyperparameter optimization and uncertainty quantification; and training data management and model serving. Many of these topics intersect with existing research directions in databases, systems and networking, architecture, and programming languages. The goal is to present a comprehensive picture of how current deep learning systems work, to discuss and explore research opportunities for extending and building on existing frameworks, and to dive deeply into the accelerators being developed by numerous startups to address the performance needs of the machine learning community.

We will split our time between concepts and practice, with a typical week having one lecture on a specific aspect of deep learning systems and one lab/discussion session in which technologies such as Keras, TensorFlow, CNTK, MXNet, and PyTorch are used to address that aspect. Some guest lectures may cover emerging computer architectures for next-generation deep learning accelerators.
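
To give a flavor of these lab sessions, below is a minimal sketch (assuming PyTorch) of a complete training loop: a small network fit to synthetic data, with reverse-mode automatic differentiation computing the gradients for each update. The model, data, and hyperparameters are illustrative only, not actual course material.

    import torch
    import torch.nn as nn

    # Synthetic regression data: 64 samples, 10 features, 1 target.
    X = torch.randn(64, 10)
    y = torch.randn(64, 1)

    # A small two-layer fully connected network.
    model = nn.Sequential(
        nn.Linear(10, 32),
        nn.ReLU(),
        nn.Linear(32, 1),
    )
    loss_fn = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(100):
        optimizer.zero_grad()        # clear gradients from the previous step
        loss = loss_fn(model(X), y)  # forward pass
        loss.backward()              # reverse-mode automatic differentiation
        optimizer.step()             # stochastic gradient descent update

    print(f"final loss: {loss.item():.4f}")

The same loop structure recurs across frameworks; what differs is how the computation graph is expressed and how gradients are derived, which is roughly the design space this course examines.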

Specific topics to be covered:

  • Introduction to deep learning models

  • Functional content of deep learning frameworks

  • Software architecture and design of frameworks

  • Abstraction layers for deep learning

  • Performance and benchmarking deep learning systems

  • Hardware architectures for accelerating deep learning

  • Parallelism (model, data, ensemble)

  • Portable representations and translations of models

  • Optimization for training, inference

  • Workflows for machine learning and workflow tools

  • Hyperparameter optimization and ensembles

  • Uncertainty quantification

At the end of the quarter, students will:

  1. Understand the purpose of deep learning systems.

  2. Be able to use common deep learning tools such as Keras, TensorFlow, and PyTorch.

  3. Understand major challenges in efficient deep learning and how those challenges are addressed in different systems.

  4. Have basic knowledge of research challenges in deep learning system design and implementation.

For more details, please see the syllabus.