Probability
Probability is a way to measure how likely something is to happen. As mentioned previously, in data science, ML, and decision-making, we often deal with uncertain events or outcomes. Probability helps us understand and quantify that uncertainty.
For example, when we flip a coin, we don’t know whether it will land heads or tails. The probability of it landing heads is 50%, and the probability of it landing tails is also 50%.
Probability distribution
A probability distribution is a way to show the likelihood of each possible outcome. For example, when we roll a six-sided die, the probability of getting each number is the same – 1/6. This means that the probability distribution is equal for each outcome.
Conditional probability
Conditional probability is the likelihood of an event or outcome happening, given that another event or outcome has already occurred. For example, if we know that a person is over six feet tall, the conditional probability of them being a basketball player is higher than the probability of a randomly selected person being a basketball player.
Let’s say there were two different events, A and B, which had some probability of occurring, within what is known as a sample space, S, of all possible events occurring.
For example, A could be the event that a consumer purchases a particular brand’s product, and B could be the event that a consumer has visited the brand’s website. In the following diagram, the probability of event A, P(A), and the probability of event B, P(B), are represented by the shaded areas in the following Venn diagram. The probability of both A and B occurring is represented by the shaded area where A and B overlap. In mathematical notation, this is written as P(A ∩ B), which means the probability of the intersection of A and B. This intersection simply means both A and B occur:
Figure 1.3: A Venn diagram visualizing the probability of two events (A and B) occurring in a sample space (S)
The conditional probability of A occurring, given that B has occurred, can be calculated as follows:
In our example, this would be the probability of a consumer purchasing a brand’s product, given they have visited the brand’s website. By understanding the probabilities of different events and how they are related, we can calculate things such as conditional probabilities, which can help us understand the chance of events happening based on our data.