Random variables and probability distributions
We start this section by introducing a new concept that is necessary to describe the randomness we find in data.
A new concept – random variables
In computer code, when we want to use a variable, we type something such as x=5. In many programming languages, we may change the value of the variable x. We even use the word variable to indicate that its value may change. However, those changes are caused by us or by code we have written, and so typically they happen in a deterministic way; that is, we compute when the changes should happen, and we can compute the new value of the variable.
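As a minimal sketch of what deterministic change looks like, the Python snippet below assigns x, changes it on purpose, and then computes a new value from it. The helper name update and the variable y are hypothetical and chosen only for illustration; the point is that every value can be worked out in advance by reading the code.

```python
# An ordinary program variable: its value changes only when our code changes it.
x = 5        # initial assignment, as in the text
x = x + 1    # we decide when the change happens and what the new value is


def update(value):
    """Hypothetical helper: a fixed rule that maps an old value to a new one."""
    return 2 * value + 1


y = update(x)

# Calling the rule again with the same input always yields the same result.
same_again = update(x)
print(x, y, same_again == y)
```

Running this code any number of times gives the same values for x and y, which is exactly the sense in which these changes are deterministic.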
For data that contains a random component, we need a new concept. Remember – random means unpredictable. When we record, observe, or capture a value from such data, that value is not predetermined. Instead, it could be any one of a number of possible values. The new concept we need is that of a random variable. A random variable is a variable...