
A Visual Exploration of Gaussian Processes

Acknowledgments: Khan Academy, the German Research Foundation, the Research Unit, McHutchon et al., Carla Avolio, Marc Spicker, and Jonas Körner.





The variances \sigma_i^2 of the individual random variables lie on the diagonal of the covariance matrix, while the off-diagonal entries describe the covariance between pairs of variables. Gaussian distributions are widely used to model the real world, for example as a surrogate when the original distribution is unknown. In particular, given a normal probability distribution P(X,Y) over vectors of random variables X and Y, we can determine their marginalized probability distributions in the following way:

X \sim \mathcal{N}(\mu_X, \Sigma_{XX}), \quad Y \sim \mathcal{N}(\mu_Y, \Sigma_{YY})

The interpretation of this equation is straightforward: each partition X and Y depends only on its corresponding entries in \mu and \Sigma. To marginalize out a random variable from a Gaussian distribution, we can simply drop its entries from \mu and \Sigma. Conditioning works similarly:

X \mid Y \sim \mathcal{N}\left(\mu_X + \Sigma_{XY}\Sigma_{YY}^{-1}(Y - \mu_Y),\; \Sigma_{XX} - \Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{YX}\right)

The way to interpret this equation is that if we are interested in the probability of X given an observed value of Y, conditioning yields a new Gaussian distribution whose mean and covariance have been updated by that observation.

Now that we have recalled some of the basic properties of multivariate Gaussian distributions, we will combine them to define Gaussian processes and show how they can be used to tackle regression problems. The key idea behind Gaussian processes is that all function values stem from a multivariate Gaussian distribution: the joint probability distribution P(X,Y) spans the space of possible values of the function that we want to predict. As in Bayesian inference, we update this distribution with observed information; in the case of Gaussian processes, this information is the training data. But how do we set up this distribution and define the mean \mu and the covariance matrix \Sigma? Before we come to this, let us reflect on how multivariate Gaussian distributions can be used to estimate function values. Given a set of test points, the corresponding multivariate Gaussian distribution has one dimension per test point, and making a prediction with a Gaussian process ultimately boils down to drawing samples from this distribution. The covariance matrix not only describes the shape of the distribution, but ultimately determines the characteristics of the function that we want to predict.
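The marginalization and conditioning rules above can be sketched in a few lines of NumPy. This is a minimal illustration on a bivariate Gaussian, assuming the partitioned joint distribution described in the text; the variable names are illustrative.

```python
import numpy as np

# Joint Gaussian P(X, Y) with mu = [mu_X, mu_Y] and partitioned
# covariance Sigma = [[S_XX, S_XY], [S_YX, S_YY]].
mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.8],
                  [0.8, 2.0]])

# Marginalization: simply drop the other variable's entries.
mu_X, var_X = mu[0], Sigma[0, 0]

# Conditioning on an observation Y = y:
#   mu'  = mu_X + S_XY S_YY^{-1} (y - mu_Y)
#   var' = S_XX - S_XY S_YY^{-1} S_YX
y = 2.0
mu_cond = mu[0] + Sigma[0, 1] / Sigma[1, 1] * (y - mu[1])
var_cond = Sigma[0, 0] - Sigma[0, 1] / Sigma[1, 1] * Sigma[1, 0]
# mu_cond ≈ 0.4, var_cond ≈ 0.68: observing Y shifts the mean of X
# and shrinks its variance below the marginal variance of 1.
```

Note that the conditional variance (≈ 0.68) is smaller than the marginal variance (1.0): observing a correlated variable always reduces uncertainty.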
This follows from the definition of the multivariate Gaussian distribution, which states that \Sigma_{ij} describes the covariance between the i-th and the j-th random variable. The following figure shows examples of some common kernels for Gaussian processes.

As we have mentioned earlier, Gaussian processes define a probability distribution over possible functions. Because this distribution is a multivariate Gaussian distribution, the distribution over functions is normal. The following figure shows samples of potential functions from prior distributions that were created using different kernels.

Combining the test and training data yields a multivariate Gaussian distribution with dimensions |Y| + |X|. Through marginalization of each random variable, we can extract the respective mean function value \mu'_i and standard deviation \sigma'_i = \sqrt{\Sigma'_{ii}} for the i-th test point. While the prior typically has zero mean, conditioning the joint distribution of the test and training data on the observations will most likely yield a distribution with non-zero mean \mu' \neq 0. As described earlier, the power of Gaussian processes lies in the choice of the kernel function: using Gaussian processes, we can define a kernel function that fits our data and add uncertainty estimates to the predictions.
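The full pipeline, from a kernel to prior samples to the posterior mean \mu' and standard deviation \sigma', can be sketched as follows. This is a minimal illustration assuming an RBF (squared exponential) kernel; the function name `rbf` and the sine-shaped training data are hypothetical choices for the example.

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    """RBF kernel: k(x, x') = exp(-(x - x')^2 / (2 l^2))."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

X_train = np.array([-2.0, 0.0, 1.5])   # observed inputs
y_train = np.sin(X_train)              # observed function values
X_test = np.linspace(-3.0, 3.0, 50)    # points where we predict

# Covariance blocks of the joint (|Y| + |X|)-dimensional Gaussian.
K = rbf(X_train, X_train) + 1e-8 * np.eye(len(X_train))  # jitter
K_s = rbf(X_train, X_test)
K_ss = rbf(X_test, X_test)

# Samples from the prior: functions drawn before seeing any data.
rng = np.random.default_rng(0)
prior = rng.multivariate_normal(
    np.zeros(len(X_test)), K_ss + 1e-8 * np.eye(len(X_test)), size=3)

# Conditioning on the training data gives the posterior mean mu'
# and covariance Sigma'; sigma'_i = sqrt(Sigma'_ii) per test point.
K_inv = np.linalg.inv(K)
mu_post = K_s.T @ K_inv @ y_train
Sigma_post = K_ss - K_s.T @ K_inv @ K_s
std_post = np.sqrt(np.clip(np.diag(Sigma_post), 0.0, None))
```

Near the training inputs `std_post` collapses toward zero, while far away it returns to the prior standard deviation of the kernel, which is exactly the behavior of the uncertainty bands in the figures.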

As said by Görtler, Jochen; Kehlbeck, Rebecca; and Deussen, Oliver in "A Visual Exploration of Gaussian Processes", Distill, 2019.