Taking the Measure of Your Data
Within a week of receiving a new dataset, at least one person is likely to ask us a familiar question: “So, how does it look?” The question is not always asked in a relaxed tone, and others are not usually excited to hear about all of the red flags we have already found. There might be a sense of urgency to declare the data ready for analysis. Of course, if we sign off on it too soon, we can create much larger problems: presenting invalid results, misinterpreting relationships between variables, and having to redo major chunks of our analysis. The key is sorting out what we need to know about the data before we explore anything else in it. The recipes in this chapter offer techniques for determining whether the data is in good enough shape to begin the analysis, so that even if we cannot say, “It looks fine,” we can at least say, “I’m pretty sure I have identified the main issues, and here they are.”...