Introduction to big data analysis in R
Big data refers to situations in which the volume, velocity, or variety of data exceeds our computational capacity to process, store, and analyze it. Big data analysis has to deal not only with large datasets but also with computationally intensive analyses, simulations, and models with many parameters.
Leveraging large data samples can provide significant advantages in quantitative finance: we can relax the assumptions of linearity and normality, build better prediction models, or identify low-frequency events.
However, the analysis of large datasets raises two challenges. First, most tools for quantitative analysis have limited capacity to handle massive data, so even simple calculations and data-management tasks can be challenging to perform. Second, even without the capacity limit, computation on large datasets may be extremely time-consuming.
Although R is a powerful and robust program with a rich set of statistical...