Let's start this section with some background on R. R is a language and environment for statistical computing that is relatively easy to learn and very flexible, making it a great choice for manipulating, cleaning, and summarizing data, producing probability statistics, and so on.
In addition, here are a few more reasons to use R for data cleaning:
- It is used by a large number of data scientists, so it's not going away anytime soon
- R is platform independent, so what you create will run almost anywhere
- R has awesome help resources: just Google it and you'll see!