Suppose we want to analyze the quality of water in a city, so we divide the city into neighborhoods and take samples from each one. We may think we have two options for analyzing this data:
- Study each neighborhood as a separate entity
- Pool all the data together and estimate the water quality of the city as a single big group
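
As a rough sketch of these two options, assume each sample is recorded as 1 if it met some quality standard and 0 if it did not. The neighborhood names and values below are made up purely for illustration, and the plain averages stand in for whatever estimate we would actually compute:

```python
from statistics import mean

# Hypothetical data: 1 = sample met the quality standard, 0 = it did not
samples = {
    "north": [1, 1, 0, 1],
    "center": [1, 0, 0, 0],
    "south": [1, 1, 1, 0],
}

# Option 1: study each neighborhood as a separate entity
unpooled = {name: mean(values) for name, values in samples.items()}

# Option 2: pool all the data and estimate the city as a single group
all_values = [v for values in samples.values() for v in values]
pooled = mean(all_values)

print(unpooled)
print(pooled)
```

The per-neighborhood estimates each rest on only four samples, while the pooled estimate uses all twelve but hides any difference between neighborhoods.
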
Both options could be reasonable, depending on what we want to know. We can justify the first option by saying that it gives a more detailed view of the problem; differences between neighborhoods could become invisible or less evident if we average the data. The second option can be justified by saying that pooling the data gives a bigger sample size and hence a more precise estimate. Both are good reasons, but we can also do something in between. We can build a model to estimate the water quality of each neighborhood and, at the same time, estimate...