Expectations and covariance

Once we know the distribution of a set of random variables $X = \{x_1, x_2, \ldots, x_N\}$, what we are typically interested in for real-life applications is estimating the average values of these random variables and the correlations between them. These are computed formally using the following expressions:

$$E[x] = \int x \, P(x) \, dx$$
$$\mathrm{cov}[x, y] = E\big[(x - E[x])(y - E[y])\big] = E[xy] - E[x]\,E[y]$$

For example, in the case of a two-dimensional normal distribution, if we are interested in finding the correlation between the variables $x_1$ and $x_2$, it can be formally computed from the joint distribution using the following formula:

$$\mathrm{cov}[x_1, x_2] = E[x_1 x_2] - E[x_1]\,E[x_2] = \int x_1 x_2 \, P(x_1, x_2) \, dx_1 \, dx_2 - E[x_1]\,E[x_2]$$
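In practice, these quantities are usually estimated from samples. The following is a minimal R sketch (assuming the MASS package is available for mvrnorm) that draws samples from a two-dimensional normal distribution and estimates the expectations and the covariance empirically:

# Draw samples from a two-dimensional normal distribution and estimate
# the expectations and the covariance empirically (assumes the MASS package).
library(MASS)

mu <- c(1, 2)                                  # true means of x1 and x2
Sigma <- matrix(c(1, 0.7, 0.7, 2), nrow = 2)   # true covariance matrix
samples <- mvrnorm(n = 10000, mu = mu, Sigma = Sigma)

colMeans(samples)                  # estimates of E[x1] and E[x2]
cov(samples[, 1], samples[, 2])    # estimate of cov[x1, x2]

With 10,000 samples, the estimates should be close to the true values of 1, 2, and 0.7, respectively.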

Binomial distribution

A binomial distribution is a discrete distribution that gives the probability of obtaining k heads in n independent trials, where each trial has one of two possible outcomes, heads or tails, with the probability of heads being p. Each of the trials is called a Bernoulli trial. The functional form of the binomial distribution is given by:

$$P(k \mid n, p) = \binom{n}{k} p^{k} (1 - p)^{n - k}$$

Here, $P(k \mid n, p)$ denotes the probability of having k heads in n trials. The mean of the binomial distribution is given by $np$ and the variance is given by $np(1 - p)$. Have a look at the following graphs:

[Graphs: binomial distribution for p = 0.7 with n = 100 and n = 1000]

The preceding graphs show the binomial distribution for two values of n, 100 and 1000, with p = 0.7. As you can see, when n becomes large, the binomial distribution becomes sharply peaked. It can be shown that, in the large n limit, a binomial distribution can be approximated by a normal distribution with mean $np$ and variance $np(1 - p)$. Many discrete distributions share this characteristic: in the large n limit, they can be approximated by continuous distributions.
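Plots of this kind can be reproduced with the built-in dbinom function; the following is a minimal R sketch, with purely illustrative plotting parameters:

# Binomial probability mass function for p = 0.7 with n = 100 and n = 1000.
# As n grows, the distribution becomes sharply peaked around np.
p <- 0.7
par(mfrow = c(1, 2))
for (n in c(100, 1000)) {
  k <- 0:n
  plot(k, dbinom(k, size = n, prob = p), type = "h",
       xlab = "k", ylab = "P(k | n, p)",
       main = paste0("n = ", n, ", p = ", p))
}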

Beta distribution

The Beta distribution, denoted by $\mathrm{Beta}(x \mid \alpha, \beta)$, is a function of powers of $x$ and its reflection $(1 - x)$, and is given by:

$$\mathrm{Beta}(x \mid \alpha, \beta) = \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha, \beta)}, \quad 0 \le x \le 1$$

Here, $\alpha, \beta > 0$ are parameters that determine the shape of the distribution function, and $B(\alpha, \beta)$ is the Beta function given by the ratio of Gamma functions: $B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta) / \Gamma(\alpha + \beta)$.

The Beta distribution is a very important distribution in Bayesian inference. It is the conjugate prior probability distribution (which will be defined more precisely in the next chapter) for binomial, Bernoulli, negative binomial, and geometric distributions. It is used for modeling the random behavior of percentages and proportions. For example, the Beta distribution has been used for modeling allele frequencies in population genetics, time allocation in project management, the proportion of minerals in rocks, and heterogeneity in the probability of HIV transmission.
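To see how $\alpha$ and $\beta$ control the shape of the distribution, here is a minimal R sketch using the built-in dbeta function; the parameter pairs are chosen purely for illustration:

# Beta densities for a few illustrative (alpha, beta) pairs.
x <- seq(0.001, 0.999, length.out = 500)
params <- list(c(0.5, 0.5), c(2, 2), c(2, 5), c(5, 2))
plot(x, dbeta(x, 2, 2), type = "n", ylim = c(0, 3),
     xlab = "x", ylab = "density", main = "Beta(alpha, beta)")
for (p in params) {
  lines(x, dbeta(x, shape1 = p[1], shape2 = p[2]))
}

Values of $\alpha, \beta < 1$ push probability mass toward the endpoints, equal values give a symmetric density, and unequal values skew the density toward 0 or 1.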

Gamma distribution

The Gamma distribution, denoted by $\mathrm{Gamma}(x \mid \alpha, \beta)$, is another common distribution used in Bayesian inference. It is used for modeling waiting times, such as in survival analysis. Special cases of the Gamma distribution are the well-known Exponential and Chi-Square distributions.

In Bayesian inference, the Gamma distribution is used as a conjugate prior for the inverse of the variance of a one-dimensional normal distribution, or for parameters such as the rate $\lambda$ of an exponential or Poisson distribution.

The mathematical form of a Gamma distribution is given by:

$$\mathrm{Gamma}(x \mid \alpha, \beta) = \frac{\beta^{\alpha} x^{\alpha - 1} e^{-\beta x}}{\Gamma(\alpha)}, \quad x > 0$$

Here, $\alpha$ and $\beta$ are the shape and rate parameters, respectively (both take values greater than zero). There is also a form in terms of the scale parameter $\theta = 1/\beta$, which is common in econometrics. Another related distribution is the Inverse-Gamma distribution, which is the distribution of the reciprocal of a variable that is distributed according to the Gamma distribution. It is mainly used in Bayesian inference as the conjugate prior distribution for the variance of a one-dimensional normal distribution.
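In R, the built-in dgamma function supports both parameterizations; the following minimal sketch checks that the rate form and the scale form ($\theta = 1/\beta$) describe the same density, with illustrative parameter values:

# Gamma density in terms of the rate parameter beta, and the equivalent
# scale parameterization theta = 1 / beta.
alpha <- 2    # shape
beta  <- 3    # rate
x <- seq(0.01, 3, length.out = 200)

d_rate  <- dgamma(x, shape = alpha, rate = beta)
d_scale <- dgamma(x, shape = alpha, scale = 1 / beta)
all.equal(d_rate, d_scale)   # TRUE: both forms give the same density

plot(x, d_rate, type = "l", xlab = "x", ylab = "density",
     main = "Gamma(shape = 2, rate = 3)")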

Dirichlet distribution

The Dirichlet distribution is a multivariate analogue of the Beta distribution. It is commonly used in Bayesian inference as the conjugate prior distribution for the multinomial and categorical distributions. The main reason for this is that it is easy to implement inference techniques, such as Gibbs sampling, on the Dirichlet-multinomial distribution.

The Dirichlet distribution of order $K \ge 2$ is defined over an open $(K - 1)$-dimensional simplex as follows:

$$\mathrm{Dir}(x_1, \ldots, x_K \mid \alpha_1, \ldots, \alpha_K) = \frac{1}{B(\boldsymbol{\alpha})} \prod_{i=1}^{K} x_i^{\alpha_i - 1}$$

Here, $x_i > 0$ with $\sum_{i=1}^{K} x_i = 1$, $\alpha_i > 0$, and $B(\boldsymbol{\alpha}) = \prod_{i=1}^{K} \Gamma(\alpha_i) \big/ \Gamma\!\left(\sum_{i=1}^{K} \alpha_i\right)$ is the multivariate Beta function.
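Base R does not ship a Dirichlet sampler (packages such as gtools or MCMCpack provide one), but samples can be drawn with the standard trick of normalizing independent Gamma variables. A minimal sketch, with an illustrative helper name:

# Draw samples from a Dirichlet distribution by normalizing independent
# Gamma(alpha_i, 1) draws; each row of the result lies on the simplex.
rdirichlet_simple <- function(n, alpha) {
  k <- length(alpha)
  g <- matrix(rgamma(n * k, shape = alpha, rate = 1), ncol = k, byrow = TRUE)
  g / rowSums(g)
}

alpha <- c(2, 5, 3)
x <- rdirichlet_simple(1000, alpha)
rowSums(x)[1:5]   # each sample sums to 1
colMeans(x)       # close to alpha / sum(alpha)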

Wishart distribution

The Wishart distribution is a multivariate generalization of the Gamma distribution. It is defined over symmetric, non-negative definite matrix-valued random variables. In Bayesian inference, it is used as the conjugate prior for the inverse of the covariance matrix $\Sigma^{-1}$ (the precision matrix) of a normal distribution. Recall that, when we discussed the Gamma distribution, we said it is used as a conjugate prior for the inverse of the variance of the one-dimensional normal distribution.

The mathematical definition of the Wishart distribution is as follows:

$$\mathcal{W}(X \mid V, n) = \frac{|X|^{(n - p - 1)/2} \exp\!\left(-\tfrac{1}{2}\,\mathrm{tr}\!\left(V^{-1} X\right)\right)}{2^{np/2} \, |V|^{n/2} \, \Gamma_p\!\left(\tfrac{n}{2}\right)}$$

Here, $|X|$ denotes the determinant of the matrix $X$ of dimension $p \times p$, $V$ is the $p \times p$ scale matrix, $\Gamma_p$ is the multivariate Gamma function, and $n$ is the number of degrees of freedom.

A special case of the Wishart distribution, with $p = 1$ and $V = 1$, corresponds to the well-known Chi-Square distribution with $n$ degrees of freedom.
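R's stats package provides an rWishart sampler; the following minimal sketch draws Wishart matrices and checks the known property that the mean of the distribution is $nV$, together with the Chi-Square special case (parameter values are illustrative):

# Draw samples from a Wishart distribution using stats::rWishart and
# check that the average of the sampled matrices is approximately n * V.
V <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)         # scale matrix
n_df <- 5                                        # degrees of freedom
draws <- rWishart(10000, df = n_df, Sigma = V)   # a 2 x 2 x 10000 array

apply(draws, c(1, 2), mean)   # approximately n_df * V

# Special case: p = 1 and V = 1 reduces to a Chi-Square distribution
# with n_df degrees of freedom.
w1 <- rWishart(10000, df = n_df, Sigma = matrix(1))
mean(w1)                      # approximately n_df, the Chi-Square mean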

Wikipedia gives a list of more than 100 useful distributions that are commonly used by statisticians (reference 1 in the Reference section of this chapter). Interested readers should refer to this article.
