Machine learning models are mathematical models that require numeric values for computation. Most such models can't work directly on categorical features, which is why we often need to convert categorical features into numerical ones. The encoding technique we choose affects the performance of the machine learning model. When categories are encoded as integer labels, the values range from 0 to N-1, where N is the number of categories.
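To make the 0 to N-1 idea concrete, here is a minimal sketch in plain Python. The helper name label_encode and the sample color values are hypothetical and chosen only for illustration; sorting the categories is an assumption made so that the resulting codes are reproducible.

```python
def label_encode(values):
    """Map each distinct category to an integer code from 0 to N-1."""
    # Sort the distinct categories so the codes are assigned in a fixed order
    categories = sorted(set(values))
    mapping = {category: code for code, category in enumerate(categories)}
    # Replace every value with the code of its category
    return [mapping[value] for value in values], mapping


# Hypothetical sample data with N = 3 categories
colors = ["red", "green", "blue", "green"]
codes, mapping = label_encode(colors)

print(mapping)
print(codes)
```

With three distinct categories, the codes can only take the values 0, 1, and 2. Note that these integers imply an order (blue < green < red here) that the categories themselves do not have, which is one reason other encodings are often preferred for nominal features.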
One-hot encoding
One-hot encoding assigns a label to each category of a categorical column and then splits the column into multiple columns, one per category. The labels are replaced by binary values: a 1 in the column that matches the row's category and 0s in all the others. For example, let's say that the color variable has three categories: red, green, and blue. These three categories are labeled and encoded into three binary columns, as shown in the following diagram:
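In code, the red, green, and blue example could look like the following sketch. It uses plain Python only to show the mechanics; the helper name one_hot_encode is hypothetical, and the alphabetical column order is an assumption made for reproducibility.

```python
def one_hot_encode(values):
    """Return the category columns and one binary row per input value."""
    # One output column per distinct category, in a fixed (sorted) order
    categories = sorted(set(values))
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


# The three categories from the color example
colors = ["red", "green", "blue"]
columns, rows = one_hot_encode(colors)

print(columns)
for color, row in zip(colors, rows):
    print(color, row)
```

Each row contains exactly one 1, in the column of its own category, so no artificial order is introduced between the categories.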
One-hot encoding can also be performed using the pandas get_dummies() function. Let's look at an example:
# Import pandas
import pandas as pd
# Read the data
data = pd.read_csv('employee.csv')
# Dummy...