In our clustering model, discussed in Chapter 6, Visualizing Economic Problems in the European Union, using self-organizing maps, we used all the available data. Now, to train a model that predicts sovereign ratings, we need to split the data into two samples: train and test.
This is not new for us. When we developed different models to predict bank failures, we used the caTools package to split the data while taking the target variable into account, so that both samples keep roughly the same proportion of each class.
The same procedure is used again here:
library(caTools)
# Split on the target variable: 75% of the observations go to the train sample
index <- sample.split(macroeconomic_data$RatingMayT1, SplitRatio = 0.75)
train_macro <- subset(macroeconomic_data, index == TRUE)
test_macro <- subset(macroeconomic_data, index == FALSE)
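To make the idea of splitting "while considering the target variable" more concrete, the following is a minimal base R sketch of a stratified 75/25 split. It is not the internal implementation of sample.split, only an illustration of the same principle. The data frame toy, its made-up values, and the names train_toy and test_toy are hypothetical stand-ins for macroeconomic_data and its samples; the fixed seed is likewise an assumption added for reproducibility.

```r
# Hypothetical toy data standing in for macroeconomic_data (values are made up)
set.seed(42)  # assumed seed, only so the illustration is reproducible
toy <- data.frame(
  id = 1:40,
  RatingMayT1 = rep(c("A", "B", "C", "D"), times = c(16, 12, 8, 4))
)

# Draw 75% of the row indices separately within each rating class
train_rows <- unlist(lapply(
  split(seq_len(nrow(toy)), toy$RatingMayT1),
  function(rows) rows[sample.int(length(rows), size = round(0.75 * length(rows)))]
))

train_toy <- toy[sort(train_rows), ]
test_toy <- toy[-train_rows, ]

# Compare the class shares in the full data and in both samples
round(prop.table(table(toy$RatingMayT1)), 2)
round(prop.table(table(train_toy$RatingMayT1)), 2)
round(prop.table(table(test_toy$RatingMayT1)), 2)
```

Because the rows are sampled within each class, the three tables should show matching class shares. A plain random split over all rows would not guarantee this, which matters when some ratings are rare.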
Now, you can print the following statements:
print(paste("The number of observations in the train...