Training Systems using Python Statistical Modeling
By Curtis Miller. Published in May 2019 by Packt. ISBN-13: 9781838823733. 290 pages. 1st Edition.

Diving into Bayesian analysis

Welcome to the first section on Bayesian analysis. This section discusses the basic concepts used in Bayesian statistics. This branch of statistics addresses many of the same problems as classical statistics and requires more knowledge of mathematics and probability, but it is particularly popular in computer science. This section will get you up to speed with what you need to know to understand and perform Bayesian statistics.

All Bayesian statistics is based on Bayes' theorem; in Bayesian statistics, we treat an event or parameter as a random variable. For example, suppose that we're talking about a parameter: we give the parameter a prior distribution, along with a likelihood of observing a certain outcome given the value of the parameter. Bayes' theorem then lets us compute the posterior distribution of the parameter, which we can use to reach conclusions about it. In its proportional form, Bayes' theorem is as follows:

    P(θ | D) ∝ P(D | θ) × P(θ)

Here, θ is the parameter and D is the observed data. All Bayesian statistics is an exercise in applying this theorem. The ∝ symbol means "proportional to"; that is, the two sides differ by a multiplicative factor (the normalizing constant that makes the posterior integrate to 1).
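Because the two sides differ only by a constant, we can compute the right-hand side for every candidate parameter value and then normalize. The following is a minimal sketch of this idea in Python; the coin-flip data and the flat prior are illustrative assumptions, not part of the text:

    import numpy as np

    # A sketch of "posterior is proportional to likelihood times prior",
    # assuming a coin-flip setting: theta is the unknown probability of heads
    theta = np.linspace(0, 1, 101)        # grid of candidate parameter values
    prior = np.ones_like(theta)           # flat prior: every value equally plausible
    heads, flips = 7, 10                  # hypothetical data: 7 heads in 10 flips
    likelihood = theta**heads * (1 - theta)**(flips - heads)

    unnormalized = likelihood * prior              # right-hand side of the formula
    posterior = unnormalized / unnormalized.sum()  # normalizing recovers the constant

    print(theta[np.argmax(posterior)])    # posterior mode, about 0.7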

How Bayesian analysis works

Suppose that we are interested in the value of a parameter, such as a mean or a proportion. We start by giving this parameter a prior distribution, quantifying our beliefs about where the parameter is located before we collect any data. There are many ways to pick the prior; for example, we could pick an uninformative prior that says little about the parameter's value. Alternatively, we could use a prior that encodes beliefs based on, say, previous studies, thereby biasing the estimate of the parameter toward those values.
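To make these two choices concrete, here is a small sketch using scipy.stats; the specific Beta distributions are illustrative assumptions:

    from scipy.stats import beta

    # Two candidate priors for a proportion (both chosen for illustration):
    # Beta(1, 1) is flat over [0, 1] and says little about the parameter,
    # while Beta(20, 20) concentrates belief near 0.5, as a prior based on
    # previous studies might
    uninformative = beta(1, 1)
    informative = beta(20, 20)

    print(uninformative.interval(0.95))   # (0.025, 0.975): almost anything goes
    print(informative.interval(0.95))     # about (0.35, 0.65): pulled toward 0.5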

Then, we collect data and use it to compute the posterior distribution of the parameter, which represents our updated belief about its location after seeing the evidence. This posterior distribution is then used to answer all of our questions about the parameter's location. Note that the posterior distribution answers every question with a probability: rather than saying whether or not the parameter lies in a particular region, we give the probability that it does. In general, the posterior distribution is difficult to compute, and we often need to rely on computationally intensive methods, such as Monte Carlo simulation, to estimate posterior quantities. To illustrate the idea, suppose that the posterior for a proportion worked out to a Beta(8, 4) distribution (an illustrative assumption, matching the coin-flip sketch shown earlier); we could then estimate the probability that the parameter exceeds 0.5 by drawing many samples from the posterior, as follows:
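    import numpy as np

    # Estimate a posterior probability by simulation, assuming (for
    # illustration) that the posterior for the proportion is Beta(8, 4)
    rng = np.random.default_rng(0)
    samples = rng.beta(8, 4, size=100_000)   # draws from the posterior

    print((samples > 0.5).mean())            # Monte Carlo estimate, about 0.89

Now, let's examine a very simple example of Bayesian analysis.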

Using Bayesian analysis to solve a hit-and-run

In this case, we're going to solve a hit-and-run. In a certain city, 95% of cabs are owned by the Yellow Cab Company, and 5% are owned by Green Cab, Inc. Recently, a cab was involved in a hit-and-run accident, injuring a pedestrian. A witness saw the accident and claimed that the cab that hit the pedestrian was a green cab. Tests by investigators revealed that, under similar circumstances, this witness correctly identifies a green cab 90% of the time and correctly identifies a yellow cab 85% of the time. This means that they incorrectly call a yellow cab green 15% of the time, and incorrectly call a green cab yellow 10% of the time. So, the question is: should we pursue Green Cab, Inc.?

In this setting, Bayes' theorem takes the following form:

    P(H | G) = P(G | H) × P(H) / [P(G | H) × P(H) + P(G | not H) × P(not H)]

Here, H represents the event that a green cab hit the pedestrian, while G represents the event that the witness claims to have seen a green cab. The denominator is simply P(G), the overall probability that the witness claims to have seen a green cab, expanded using the law of total probability.

So, let's encode these probabilities in Python; a minimal sketch, with illustrative variable names, is as follows:
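    # Probabilities taken from the problem statement
    p_green = 0.05             # prior: 5% of the city's cabs are green
    p_yellow = 0.95            # prior: 95% of the city's cabs are yellow
    p_claim_green_given_green = 0.90    # witness says "green" when the cab is green
    p_claim_green_given_yellow = 0.15   # witness says "green" when the cab is yellow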

Now that we have these probabilities, we can use Bayes' theorem to compute the posterior probability, as follows:
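    # Continuing the sketch: P(H | G) = P(G | H) * P(H) / P(G), where P(G)
    # comes from the law of total probability
    p_claim_green = (p_claim_green_given_green * p_green
                     + p_claim_green_given_yellow * p_yellow)
    p_green_given_claim = p_claim_green_given_green * p_green / p_claim_green

    print(p_green_given_claim)   # about 0.24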

So, the prior probability that the cab was actually green was 0.05, which is very low. The posterior probability, that is, the probability that the cab that hit the pedestrian was green given that the witness said it was green, works out to 24%: we have 0.90 × 0.05 = 0.045 in the numerator, 0.90 × 0.05 + 0.15 × 0.95 = 0.1875 in the denominator, and 0.045 / 0.1875 = 0.24. This is noticeably higher than the prior, but still well below 50%.

Additionally, since this city's cabs are all either yellow or green, this tells us that even though the witness claimed to have seen a green cab, green cabs are rare enough, and the witness is fallible enough, that the claim cannot overcome the low base rate. In other words, it is still more likely that the pedestrian was hit by a yellow cab and that the witness made a mistake.

Now, let's take a look at some useful applications of Bayesian analysis. We will go through topics similar to those seen previously, but from the Bayesian perspective.
