All data is random
If you read only one chapter in this book, read this one. Why? Well, because it explains the most important math concept in data science – all data is random. Or, more precisely, all data contains a random component.
Is this really the case? To answer that, we must first say what we mean by random. I’m not going to give some dry technical definition here, expressed in mathematical symbols. I’m going to give a technical but intuitive definition: random means unpredictable.
What do we mean by that? Precisely what it says. If something is random, its value can’t be calculated in advance.
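To make the definition concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the formula 2x + 1, and the noise range of plus or minus 0.5 are all made up for the example. The first function is fully predictable, while the second adds a component whose value cannot be calculated before it is drawn.

```python
import random


def deterministic_value(x):
    """Predictable: the same input always gives the same output."""
    return 2 * x + 1


def value_with_random_component(x, rng):
    """The predictable part plus a component we cannot calculate in advance.

    The noise range (-0.5 to 0.5) is an arbitrary choice for illustration.
    """
    return deterministic_value(x) + rng.uniform(-0.5, 0.5)


# SystemRandom draws from the operating system's entropy source and
# cannot be seeded, so its output is not reproducible from run to run.
rng = random.SystemRandom()

print(deterministic_value(3))               # always 7
print(value_with_random_component(3, rng))  # near 7, but differs on each run
```

Note that we can still say something about the second value (it lies within 0.5 of 7), even though we cannot say exactly what it will be. That is the sense in which data contains a random component rather than being random through and through.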
A little example
I have an old ship’s barometer that belonged to my father (he was a ship’s captain). The barometer is damaged and a bit temperamental, so the measurement is imprecise. This means the measured atmospheric pressure is not the same as the actual atmospheric pressure but deviates from it, possibly...