Summing it all up
So, what have you learned so far from some quick slicing and dicing of the data? You learned that the dataset does not have a complete history for every weather station over the period. You identified that records are probably sent only when there is something to report, so there is not a record for every 15 minutes of every day. You also found some stations that report only once a month, on the 1st day.
You detected a potentially useful pattern in the accumulated level of precipitation over time. You identified an extreme event in the data and verified externally that it actually occurred. You also learned the statistical distribution of values for each field in the data using R.
You also explored the geographical distribution of stations and learned that they are not evenly spaced across the state of Colorado (although the coverage is not too bad for the size of the area). Finally, you identified some data values that appear to act as indicators rather than measurements (-9999, 999.990...