3.3 Water quality
Suppose we want to analyze the quality of water in a city, so we divide the city into neighborhoods and take samples from each one. We may think we have two options for analyzing this data:
Study each neighborhood as a separate entity
Pool all the data together and estimate the water quality of the city as a single big group
You have probably already noticed the pattern here. We can justify the first option by saying we obtain a more detailed view of the problem, one that could become invisible or less evident if we average the data. The second option can be justified by saying that if we pool the data, we obtain a bigger sample size and hence a more precise estimate. But we already know we have a third option: we can build a hierarchical model!
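To make the trade-off between the first two options concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the three neighborhood names, the "true" rates, the sample size of 30, and the choice to record each sample as a binary outcome (1 for acceptable water, 0 otherwise). It is not the dataset or the model used later in this section, and it does not implement the hierarchical option.

```python
import math
import random

random.seed(123)  # fixed seed so the simulation is reproducible

# Hypothetical setup (assumed for illustration only): three neighborhoods,
# each sample is 1 (acceptable water) or 0 (not acceptable).
true_rates = {"A": 0.7, "B": 0.5, "C": 0.3}
n_per_neighborhood = 30

# Simulate binary samples for each neighborhood
samples = {
    name: [1 if random.random() < rate else 0 for _ in range(n_per_neighborhood)]
    for name, rate in true_rates.items()
}


def proportion_and_se(values):
    """Return the sample proportion and its (binomial) standard error."""
    n = len(values)
    p = sum(values) / n
    return p, math.sqrt(p * (1 - p) / n)


# Option 1: study each neighborhood as a separate entity
separate = {name: proportion_and_se(vals) for name, vals in samples.items()}

# Option 2: pool all the data and treat the city as a single big group
pooled_values = [v for vals in samples.values() for v in vals]
pooled = proportion_and_se(pooled_values)

for name, (p, se) in separate.items():
    print(f"{name}: p = {p:.2f}, se = {se:.3f}")
print(f"pooled: p = {pooled[0]:.2f}, se = {pooled[1]:.3f}")
```

With rates like these, the pooled standard error will typically be smaller than the per-neighborhood ones, because it is based on 90 samples instead of 30. The price is that the single pooled number says nothing about how the neighborhoods differ from each other.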
For this example, we are going to use synthetic data. I love using synthetic data; it is a great way to understand things. If you don’t understand something, simulate it! There are many uses for synthetic data. Here, we are...