To get the most out of this book
To get the most out of the content presented in this book, you should have a basic knowledge of programming concepts (creating variables, loops, and functions) and have already worked with R. A basic knowledge of data science concepts is also welcome and can help you understand the tutorials and projects.
All the code in this book was written and tested using RStudio on Windows 10. If you want to code along with the examples, you will need to install R and RStudio on your local machine. To do that, go to https://cran.r-project.org/, click on Download R for Windows (or for your operating system), then click on base, and finally, click on Download R-X.X.X for Windows. This will download the R installer to your machine. Then, you can double-click on the file to install, accepting the default selections.
Next, you need to install RStudio, the IDE developed by Posit (the company itself was named RStudio until it was renamed in 2022). The software can be downloaded here: https://posit.co/download/rstudio-desktop/. Click on Download and look for the version for your operating system. RStudio has a free version, which you can install by accepting the default options once again.
The main libraries used in the tutorials from this book are indicated as follows:
Software/Library | Version
R | 4.1.0
RStudio | 2022.02.3+492 for Windows
Tidyverse | 1.3.1
Tidytext | 0.3.2
Gutenbergr | 0.2.1
Patchwork | 1.1.1
wordcloud2 | 0.2.1
ROCR | 1.0-11
Shinythemes | 1.2.0
Plotly | 4.10.0
Caret | 6.0-90
Shiny | 1.7.1
Skimr | 2.1.4
Lubridate | 1.8.0
randomForest | 4.7-1
data.table | 1.14.2
To install any library in RStudio, just use the following code snippet:
    # Installing libraries to RStudio
    install.packages("package_name")
    # Loading a library to a session
    library(package_name)
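As a minimal sketch of how installing and loading could be combined for several libraries at once, the snippet below checks which packages are missing before installing anything. The helper name `missing_packages` and the short package list are illustrative assumptions, not code from the book; `requireNamespace()`, `install.packages()`, and `getRversion()` are standard R functions. Keep in mind that package names are case-sensitive, and on CRAN these packages are spelled in lowercase (for example, `tidyverse`).

```r
# Hypothetical helper (not from the book): return the names of the
# packages in `pkgs` that are not installed on this machine.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# A few of the libraries listed in the table, chosen for illustration
wanted <- c("tidyverse", "tidytext", "shiny")
to_install <- missing_packages(wanted)

# Install only what is missing, and only in an interactive session
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}

# Check that the running R version is at least the one used in the book
getRversion() >= "4.1.0"
```

Checking first avoids reinstalling packages you already have, and the last line is a quick way to confirm that your R version is not older than the one listed in the table.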
In R, it helps to keep two code snippets in mind. The first is how to write a for loop, which executes a piece of code once for each element of a sequence and stops when the sequence is exhausted:
    for (num in 1:5) {
      print(num)
    }
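The same pattern works for any vector, not only a range of numbers. The following sketch, with a made-up vector named `words` (an assumption for illustration), loops over the positions of the vector with `seq_along()` and stores one result per element:

```r
# Illustrative data, not taken from the book
words <- c("data", "science", "with", "R")

# Pre-allocate the result vector, one slot per word
word_lengths <- integer(length(words))

# seq_along(words) yields 1, 2, 3, 4, one index per element
for (i in seq_along(words)) {
  word_lengths[i] <- nchar(words[i])
}

print(word_lengths)
```

Using `seq_along()` rather than `1:length(words)` is a common precaution, since it also behaves correctly when the vector is empty.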
The other one is the skeleton of a function written in R, where we provide the input variables and the code that says what should be done with them, returning the result of the calculation:
    custom_sum_function <- function(var1, var2) {
      # Function code
      my_sum <- sum(var1 + var2)
      return(my_sum)
    }
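To see how such a function behaves once defined, here is a short usage sketch. The function is repeated so the snippet runs on its own, and the input values are arbitrary examples:

```r
# Same skeleton as above, repeated so this snippet is self-contained
custom_sum_function <- function(var1, var2) {
  # Add the inputs element by element, then sum the result
  my_sum <- sum(var1 + var2)
  return(my_sum)
}

# Two single numbers: 2 + 3
result_scalars <- custom_sum_function(2, 3)

# Two vectors: (1 + 3) + (2 + 4)
result_vectors <- custom_sum_function(c(1, 2), c(3, 4))

print(result_scalars)
print(result_vectors)
```

Because `var1 + var2` is computed element by element before `sum()` is applied, the function accepts vectors as well as single numbers.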
If you are using a digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository. Doing so helps you avoid errors caused by copying and pasting code, such as broken line breaks or altered quotation marks.