Course Review: Probability - The Science of Uncertainty and Data MITx (6.431x)

Aug 23, 2021

After finishing Probability and Statistics by Stanford Online, I wanted to improve my knowledge of statistics further. The logical choice was Probability - The Science of Uncertainty and Data MITx (6.431x), which was starting at the time. While there is an MIT OCW version of the course, I decided to take the edX version: the videos are shorter, it is more structured, it has tight deadlines instead of self-paced study, and because the course was just starting, many other students were participating at the same time. The deadlines especially helped to motivate me, as there was graded coursework due every week that had to be completed on time to count towards the overall score.

Overview

The course builds foundational knowledge for data science through an introduction to probabilistic models, including random processes and the basic elements of statistical inference. It is an introductory course in probability theory. The following is copied from the course description:

What you’ll learn

The world is full of uncertainty: accidents, storms, unruly financial markets, noisy communications. The world is also full of data. Probabilistic modeling and the related field of statistical inference are the keys to analyzing data and making scientifically sound predictions.

Probabilistic models use the language of mathematics. But instead of relying on the traditional “theorem-proof” format, we develop the material in an intuitive – but still rigorous and mathematically-precise – manner. Furthermore, while the applications are multiple and evident, we emphasize the basic concepts and methodologies that are universally applicable.

The course covers all of the basic probability concepts, including:

multiple discrete or continuous random variables, expectations, and conditional distributions
laws of large numbers
the main tools of Bayesian inference methods
an introduction to random processes (Poisson processes and Markov chains)

Syllabus

Unit 1: Probability models and axioms
Probability models and axioms
Mathematical background: Sets; sequences, limits, and series; (un)countable sets.
Unit 2: Conditioning and independence
Conditioning and Bayes’ rule
Independence
Unit 3: Counting
Counting
Unit 4: Discrete random variables
Probability mass functions and expectations
Variance; Conditioning on an event; Multiple random variables
Conditioning on a random variable; Independence of random variables
Unit 5: Continuous random variables
Probability density functions
Conditioning on an event; Multiple random variables
Conditioning on a random variable; Independence; Bayes’ rule
Unit 6: Further topics on random variables
Derived distributions
Sums of independent random variables; Covariance and correlation
Conditional expectation and variance revisited; Sum of a random number of independent random variables
Unit 7: Bayesian inference
Introduction to Bayesian inference
Linear models with normal noise
Least mean squares (LMS) estimation
Linear least mean squares (LLMS) estimation
Unit 8: Limit theorems and classical statistics
Inequalities, convergence, and the Weak Law of Large Numbers
The Central Limit Theorem (CLT)
An introduction to classical statistics
Unit 9: Bernoulli and Poisson processes
The Bernoulli process
The Poisson process
More on the Poisson process
Unit 10 (Optional): Markov chains
Finite-state Markov chains
Steady-state behavior of Markov chains
Absorption probabilities and expected time to absorption

Format

The course is targeted at students who are encountering probability for the first time. It assumes next to no background in probability theory, but it does require a background in algebra and analysis. The schedule and setup are similar to those of a college course. The course material is organized into units, each containing between one and three lecture sequences. There is one unit due per two weeks. Each lecture sequence covers a single topic in several videos of around 10 minutes each. Between videos, there are graded quizzes that range from very easy to quite hard. These should be completed before moving on to the next video. After the lessons for a week are completed, there are videos with solved exercises or additional theoretical background. Finally, there is graded homework of around 6 exercises. These are quite difficult and require a substantial time investment to solve.

You are given two weeks to complete each week’s exercises. The videos amount to around 200 minutes per week. MIT students who take the corresponding residential class typically report an average of 11-12 hours spent each week, including lectures, recitations, readings, homework, and exams. I watched the videos at 1.5x speed and think that I needed around 5-10 hours per week to watch the videos and do the quizzes and exercises. There are two exams, a midterm and a final. The final exam is only available if you paid. I took the midterm exam and it was really difficult; while I reached a “passing” score, I did not score well overall. Paying students can earn a certificate if they finish above a certain score, but I took the free version and just did the exercises to test myself.

Conclusion

I had covered the basics of probability theory several times during school and university, but this course went into much more depth. It started out familiar and got difficult pretty quickly. I especially enjoyed the proofs of almost everything, the graphical explanations, and the bite-sized videos, which made it easy to follow along and to find the time. Solutions are provided for everything, in case you do not manage to solve a quiz or exercise.

Unit 7 on Bayesian inference was really hard, and many students complained about it. It was said that one needs to study the accompanying course book before watching the videos, as they do not explain the material very well. Even though I had taken courses on Statistical Machine Learning before, I really struggled with the quizzes and exercises of that unit. In the future, I will revisit it and see whether the issue was the explanation in the course or me.

Units 9 and 10 were mostly new to me and very interesting. I had always wondered about stochastic processes, and after finishing the course, I now have a good basic understanding of them. I liked the Markov chains in particular, as they are somewhat related to classical Reinforcement Learning. Overall, I really enjoyed the course and already feel its effects, especially when reading papers that use probability theory.

I am very happy to have taken the course; I find it a good foundation and I will continue to build on it. The next course I want to take is Fundamentals of Statistics (MITx - 18.6501x), which is scheduled to start on Jan 24, 2022. It was offered at the same time as this course, but the lecturers did not recommend taking two courses at once, and it would not have been possible for me anyway, as it would have taken too much of my time each week. So I look forward to taking it in January! The next run of Probability - The Science of Uncertainty and Data MITx (6.431x) also seems to be due in January, around the same time.
