Histograms
One of the main disadvantages of using aggregators such as avg() is that they can hide interesting behaviors in the data. Picture the following scenario. You have access to the scores of five thousand students in four subjects (Science, Mathematics, English, and Literature), and each subject is graded with two exams. To condense all these records and show only high-level figures, you decide to calculate an average. However, after analyzing the results, you start wondering: are there some extremely good students raising the global average, or are there alarmingly bad ones bringing it down? Are they all consistent? Can we separate them into groups (good, normal, or bad)? How did most of them fare?
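To make the concern concrete, here is a minimal Python sketch of two classes that share the same average but behave very differently. The scores, the 0-100 scale, the group thresholds, and the helper name group_counts are all assumptions made up for illustration; they are not taken from the scenario's real data.

```python
from statistics import mean

# Hypothetical scores on a 0-100 scale (made-up data for illustration only).
consistent = [70] * 10             # every student scores the same
polarized = [100] * 5 + [40] * 5   # half excel, half struggle


def group_counts(scores, low=50, high=85):
    """Count scores in three bands; the thresholds are illustrative assumptions."""
    counts = {"bad": 0, "normal": 0, "good": 0}
    for score in scores:
        if score < low:
            counts["bad"] += 1
        elif score < high:
            counts["normal"] += 1
        else:
            counts["good"] += 1
    return counts


# Both classes have the same average, so the average alone cannot tell them apart.
print(mean(consistent), mean(polarized))

# Counting how many students fall into each band reveals the difference.
print(group_counts(consistent))
print(group_counts(polarized))
```

Both averages come out to 70, yet the first class sits entirely in the "normal" band while the second is split evenly between "good" and "bad". Counting values per band instead of collapsing them into a single number is what exposes the difference.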
One of the best ways to answer these kinds of questions is to create a histogram, a visualization focused on distributions instead of magnitudes. In this chart, the x-axis represents the exam grade, while the y-axis counts the number of students who received that grade. As...