Data mining and machine learning
We are going to focus on three kinds of problems: classification, dimensionality reduction, and clustering. Each of them is used in both data mining and machine learning to draw conclusions about the data. We explain each of these settings in its own section.
Classification
Classification is an example of supervised learning. We start with a set of training data in which one attribute assigns each record to one of several categories. The goal is to predict the value of that attribute for new data. For example, with our running database, we could use all the data from the year 2013 to learn which financial complaints were resolved in the customer's favor, which were resolved without relief, and which remained in progress. This would give us good insight into, for instance, which companies are faster to respond positively to consumer complaints, whether there are states where complaints are less likely to be resolved, and so on.
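To make the supervised setting concrete, here is a minimal sketch of the train-then-predict workflow. It is not the method developed in this section: it is a simple majority-vote baseline that predicts, for a given company, the outcome seen most often for that company in the training data. The records, the company names, the outcome labels, and the helper functions `fit` and `predict` are all hypothetical and invented for illustration; they do not come from the actual complaints database.

```python
from collections import Counter, defaultdict

# Hypothetical training records: (company, state, outcome).
# These rows are made up for illustration only.
TRAIN = [
    ("Bank A", "CA", "with relief"),
    ("Bank A", "NY", "with relief"),
    ("Bank A", "TX", "without relief"),
    ("Bank B", "CA", "without relief"),
    ("Bank B", "TX", "without relief"),
    ("Bank C", "NY", "in progress"),
]


def fit(records):
    """Count outcomes per company and overall (the 'training' step)."""
    per_company = defaultdict(Counter)
    overall = Counter()
    for company, _state, outcome in records:
        per_company[company][outcome] += 1
        overall[outcome] += 1
    return per_company, overall


def predict(model, company):
    """Return the most frequent training outcome for this company.

    Companies absent from the training data fall back to the
    most frequent outcome over all records.
    """
    per_company, overall = model
    counts = per_company.get(company, overall)
    return counts.most_common(1)[0][0]


model = fit(TRAIN)
print(predict(model, "Bank A"))  # a company seen during training
print(predict(model, "Bank D"))  # an unseen company, uses the fallback
```

The category attribute (the outcome) is known for every training record and unknown for new ones, which is what makes the problem supervised. A baseline like this is mainly useful as a reference point against which more capable classifiers can be compared.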
Let's start by finding...