Using kernels with PCA
With some data, no set of principal components built from linear combinations of the features will make the observations linearly separable. This is often hard to see before we start modeling. Fortunately, we can use model selection tools to compare kernels, a linear kernel among them, and choose the one that yields the best results. Kernel PCA with a linear kernel should perform similarly to standard PCA, because a linear kernel is just the ordinary dot product between observations.
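To make these two claims concrete, here is a minimal sketch using scikit-learn. It does not use the dataset from this section. The synthetic `make_moons` data, the logistic regression classifier, the `GridSearchCV` search, and the candidate `gamma` values are all assumptions chosen for illustration. The sketch first checks that linear-kernel PCA reproduces standard PCA scores up to sign, and then lets a cross-validated search pick between a linear and an RBF kernel.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.decomposition import PCA, KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (an assumption, not the UNDP data):
# two interleaved half-moons that a straight line cannot separate.
X, y = make_moons(n_samples=150, noise=0.1, random_state=0)

# 1) A linear kernel should give the same component scores as standard PCA.
pca_scores = PCA(n_components=2).fit_transform(X)
kpca_scores = KernelPCA(
    n_components=2, kernel="linear", eigen_solver="dense"
).fit_transform(X)

# The sign of each component is arbitrary, so compare absolute values.
linear_matches_pca = np.allclose(
    np.abs(pca_scores), np.abs(kpca_scores), atol=1e-6
)
print("Linear kernel PCA matches PCA up to sign:", linear_matches_pca)

# 2) Let cross-validation choose the kernel (and gamma for the RBF kernel).
pipe = Pipeline([
    ("kpca", KernelPCA(n_components=2)),
    ("clf", LogisticRegression()),
])
param_grid = [
    {"kpca__kernel": ["linear"]},
    {"kpca__kernel": ["rbf"], "kpca__gamma": [1, 5, 15]},  # illustrative values
]
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```

On data like this, a nonlinear kernel will often score higher than the linear one, though that is not guaranteed. The point of the search is that we do not need to know in advance which kernel suits the data.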
In this section, we will use kernel PCA for feature extraction with data on labor force participation rates, educational attainment, teenage birth frequency, and participation in politics by gender at the country level.
Note
This dataset on gender-based differences in educational and labor force outcomes is made available for public use by the United Nations Development Programme at https://www.kaggle.com/datasets/undp/human-development. There is one record per country with aggregate employment, income, and education data by gender for 2015.
Let’s start building the model:
-
...