Nowadays, we are overwhelmed by large amounts of information (see Shi, Zhang, and Khan (2017), or Fang and Zhang (2016)), and the catchphrase for this is big data. However, its definition remains contested, since many competing explanations are available. Davenport and Patil (2012) suggest that if your organization stores multiple petabytes of data, if the information most critical to your business resides in forms other than rows and columns of numbers, or if answering your biggest question would involve a mashup of several analytical efforts, you've got a big data opportunity.
Many practitioners of data science or data analytics are learning several programming languages, such as R and Python, but how can they use both at the same time? If John is using R while his teammate is using Python, how do they communicate with each other? How do team members share their packages, programs, and even their working environments? In this book, we try our best to offer a solution to all of these challenges by introducing Anaconda, since it possesses several wonderful properties.
Generally speaking, R is a programming language for statistical computing and graphics that is supported by the R Foundation for Statistical Computing. Python is an interpreted, object-oriented programming language similar to Perl that has gained popularity because of its clear syntax and readability. Julia is a language for numerical computing with an extensive mathematical function library, and it is designed for parallelism and cloud computing, while Octave is a mathematics-oriented, batch-oriented language for numerical computation. All four languages, R, Python, Julia, and Octave, are free.