Cheat Sheet: Properties of Probability Distributions

Here is a probability distribution cheat sheet that I like to keep around for reference. It focuses on the “big picture” properties of some well-known PDFs. The goal is to collect properties that help me decide when it’s appropriate to use a particular distribution. Beta Distribution: Used in task duration modeling (e.g. […]

How to diagnose problems with your statistical learning algorithm.

In this post I cover a few tricks for diagnosing problems with a statistical learning algorithm. I start with tricks for uncovering high bias and high variance, then discuss a method that can tell whether the problem lies with your objective function or with the optimization algorithm employed. These are lecture notes from Andrew Ng’s on […]

Evidence approximation in linear regression: A method that produces “automatically regularized” solutions.

Summary: In this post, I look at a Bayesian treatment of the linear regression problem. Making use of basis functions allows you to model non-linear patterns in data; however, taking this route usually requires that you regularize your solution. Finding the best regularization parameter often requires cross-validation, but by looking at a framework […]