Course Syllabus

Probabilistic Foundations of Deep Learning

Class notes, announcements, and other information can be found here. 

Instructor: Anastasios Matzavinos, amatzavinos@uc.cl 

Class meeting times: Monday & Wednesday 4:10 pm - 5:20 pm in room AP503.

Instructor's office hours: Thursday 12:30 pm - 1:30 pm or by appointment. 

Course description: This semester, IMT 3801 will focus on the analysis of neural networks and deep learning. Using analytic and probabilistic tools, we will address fundamental questions such as:

  • What does the loss surface of an artificial neural network look like?
  • Why do neural networks typically generalize well, that is, why do they often perform well on unseen data?
  • Why is stochastic gradient descent, a local optimization algorithm, so effective in training deep neural networks? How should we think about optimizing the training process?

We will see that probability theory provides powerful tools for answering these questions. For example, we will show that the presence of “bad” local minima, namely local minima whose cost is significantly higher than that of global minima, becomes a low-probability event as the number of hidden units increases.

The course is organized in two parts. The first part is fast-paced and provides an engineering-oriented introduction to core topics, including regression models, neural networks, stochastic gradient descent, backpropagation, regularization and dropout, batch normalization, feature importance, and automatic differentiation.
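As a taste of the first part's material, here is a minimal sketch (not from the course materials) of stochastic gradient descent applied to a simple linear regression model with squared loss; the data, learning rate, and model are illustrative assumptions.

```python
import numpy as np

# Synthetic data from y = 2*x + 1 plus small Gaussian noise.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + 1.0 + 0.1 * rng.normal(size=100)

# One epoch of SGD on the model pred = w*x + b with loss 0.5*(pred - y)^2.
w, b, lr = 0.0, 0.0, 0.1
for i in rng.permutation(len(x)):   # visit samples in random order
    pred = w * x[i] + b
    grad = pred - y[i]              # d(loss)/d(pred)
    w -= lr * grad * x[i]           # chain rule: d(pred)/dw = x[i]
    b -= lr * grad                  # d(pred)/db = 1
```

After a single epoch, `w` and `b` are already close to the true parameters 2 and 1, illustrating why such a simple local method can be surprisingly effective.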

The second part develops rigorous mathematical results on the performance and asymptotic behavior of neural network architectures. After reviewing classical results such as universal approximation theorems and convergence results for stochastic gradient descent, we will turn to more recent research topics, including the neural tangent kernel regime and mean-field limits of neural networks. These topics require advanced probabilistic tools, such as the Skorokhod topology and related asymptotic results for stochastic processes, which will be developed carefully in class. This asymptotic analysis provides a principled probabilistic framework for understanding and rigorously characterizing the geometry, optimization dynamics, and generalization properties of deep neural networks.

Course textbook: We will mainly use the following reference.

  • K. Spiliopoulos, R. Sowers, and J. Sirignano. Mathematical Foundations of Deep Learning Models and Algorithms. American Mathematical Society, 2025.

Grading policy: The final grade will be based on attendance (5% of the grade), homework assignments (35%), a mid-term exam (30%), and a final take-home exam (30%).

Homework assignments: Homework problems will be handed out on a regular basis. Discussion of homework assignments with other students is encouraged, but what is handed in should be your own work. 

Announcements and other information about the class can be found here. A PDF copy of the syllabus can be found here: IMT_3801.pdf
