Digging in with R
Using R, we can run a variety of queries against the data. The results of one query will often raise further questions and prompt further queries, and eventually yield data that is ready for visualizing.
Let's start with a few simple profile queries. I always start my data profiling by time boxing the data.
The following R scripts work well for this, although, as mentioned earlier, there are many ways to accomplish the same objective:
# --- read our file into a temporary R table
tmpRTable4TimeBox<-read.table(file="C:/Big Data Visualization/Chapter 3/sampleHCSurvey02.txt", sep=",")
# --- convert to an R data frame and filter it to just include
# --- the 2nd column or field of data
data.df <- data.frame(tmpRTable4TimeBox)
data.df <- data.df[,2]
# --- provides a sorted list of the years in the file
YearsInData = substr(substr(data.df[],(regexpr('/',data.df[])+1),11),( regexpr('/',substr(data.df[],(regexpr('/',data...
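The same time-boxing idea can also be tried without the survey file. What follows is only a minimal sketch, not the script above: it assumes the 2nd column holds a date written as month/day/four-digit year, and it uses a few made-up rows in place of the real data. The names sample_lines, survey, date_field, years, years_in_data, and records_per_year are hypothetical, and a regular expression stands in for the nested substr() and regexpr() calls.

```r
# Hypothetical sample rows standing in for the survey file. The real
# file's layout is an assumption: date in the 2nd column, as m/d/yyyy.
sample_lines <- c(
  "1001,01/15/2015 09:30,Female",
  "1002,11/02/2016 14:05,Male",
  "1003,3/7/2015 08:45,Female",
  "1004,12/31/2016 23:10,Male"
)

# read.table() can read inline text through its `text` argument.
survey <- read.table(text = sample_lines, sep = ",", stringsAsFactors = FALSE)

# Keep only the 2nd column (the date field).
date_field <- survey[, 2]

# Extract the four digits that follow the second "/" in each value.
years <- sub("^[^/]*/[^/]*/([0-9]{4}).*$", "\\1", date_field)

# Sorted list of the distinct years, plus a count of records per year.
years_in_data <- sort(unique(years))
records_per_year <- table(years)

print(years_in_data)
print(records_per_year)
```

Counting the records per year alongside the list of years gives a quick sense of how evenly the data is spread across the time box, which is often the next question a profile query raises.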