Creating a dashboard showing multiple plots
We have explored categorical and numerical data, as well as text data. We have learned how to extract various features from text data, and we built aggregated features from some of the numerical ones. Let’s now build two more features by grouping Title and Family Size. We will create two new features:
- Titles: By clustering together similar titles (like Miss with Mlle., or Mrs. and Mme.) or rare (like Dona., Don., Capt., Jonkheer, Rev., and Countess) and keeping the most frequent ones – Mr., Mrs., Master, and Miss
- Family Type: By creating three clusters from the Family Size values – Single for a family size of 1, Small for families of 2 to 4 members, and Large for families with more than 4 members
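The two groupings described above can be sketched as a pair of small helper functions. This is only a minimal sketch under stated assumptions: the function names, the "Rare" label for uncommon titles, and the handling of the trailing dot are illustrative choices, and the title lists contain only the examples named in the text, so the complete lists may differ.

```python
# Assumption: only the titles named as examples in the text are listed here.
SIMILAR_TITLES = {"Mlle": "Miss", "Mme": "Mrs"}
RARE_TITLES = {"Dona", "Don", "Capt", "Jonkheer", "Rev", "Countess"}


def group_title(title: str) -> str:
    """Map a raw title to its grouped title (hypothetical helper)."""
    # Ignore surrounding whitespace and a trailing dot, so "Mlle." and "Mlle" match.
    key = title.strip().rstrip(".")
    if key in RARE_TITLES:
        return "Rare"  # assumed label for the cluster of rare titles
    # Similar titles are merged; frequent titles (Mr, Mrs, Master, Miss) pass through.
    return SIMILAR_TITLES.get(key, key)


def family_type(family_size: int) -> str:
    """Map a family size to Single, Small, or Large (hypothetical helper)."""
    if family_size <= 1:
        return "Single"
    if family_size <= 4:
        return "Small"
    return "Large"


# Made-up values for illustration only, not taken from the dataset
for raw_title, size in [("Mr.", 1), ("Mlle.", 2), ("Mme.", 4), ("Rev.", 5)]:
    print(raw_title, "->", group_title(raw_title), "|", size, "->", family_type(size))
```

With a pandas DataFrame, these helpers could be applied column-wise using `Series.map`, for example on columns assumed to be named `Title` and `Family Size`, to produce the new Titles and Family Type features.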
Then, we will represent, on a single graph, several simple or derived features that we learned have an important predictive value. We show the passengers’ survival rates for Sex, Passenger...