High performance computing
First, measure which lines of code take the most computation time, and then concentrate on speeding up those individual calculations. In R this can often be done by vectorization or, often better, by writing the critical pieces of code in a compiled language such as C, C++*, or Fortran**.
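As a minimal sketch of what vectorization means, the following compares an explicit loop with a vectorized expression for the same calculation. The data and the function names sum_sq_loop and sum_sq_vec are made up for illustration and are not part of the original example.

```r
# Illustrative data (an assumption, not taken from the text)
x <- as.numeric(1:1e6)

# Loop version: R interprets one iteration per element
sum_sq_loop <- function(v) {
  total <- 0
  for (i in seq_along(v)) {
    total <- total + v[i]^2
  }
  total
}

# Vectorized version: the element-wise work is done in compiled code
sum_sq_vec <- function(v) sum(v^2)

res_loop <- sum_sq_loop(x)
res_vec  <- sum_sq_vec(x)

# Both versions agree up to floating-point tolerance
isTRUE(all.equal(res_loop, res_vec))
```

The vectorized version is usually considerably faster, but the actual gain depends on the calculation and the machine, so it should be measured rather than assumed.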
In addition, some calculations can be parallelized and accelerated through parallel computing.
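A small sketch of parallel computing, assuming the parallel package that ships with R and a machine on which two worker processes can be started. The function slow_square is a hypothetical stand-in for an expensive, independent calculation.

```r
library(parallel)

# Hypothetical stand-in for an expensive calculation
slow_square <- function(i) {
  Sys.sleep(0.01)  # simulate work
  i^2
}

# Sequential version
res_seq <- lapply(1:20, slow_square)

# Parallel version: distribute the 20 independent tasks over 2 workers
cl <- makeCluster(2)
res_par <- parLapply(cl, 1:20, slow_square)
stopCluster(cl)

# Same results, computed in parallel
identical(res_seq, res_par)
```

This only pays off when the individual tasks are independent of each other and expensive enough to outweigh the overhead of starting the workers and moving data to them.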
Profiling to detect computationally slow functions in code
Suppose you have written code for your data analysis, but it runs (too) slowly. Most likely, not all lines of code are slow, and only a few need improvement in terms of computation time. It is therefore very important to know exactly which step in the code takes the most computation time.
The easiest way to find this out is to work with the R function system.time. We will compare two models:
data(Cars93...