The dataframe is almost complete, but one issue must be addressed before building the neural network. Rather than keeping the gender value as a string, it is better to convert it to an integer for calculation purposes; the reason will become more evident as this chapter progresses.
Manipulating columns in a PySpark dataframe
Getting ready
This section requires the following import:
- from pyspark.sql import functions
How to do it...
This section walks through the steps for converting the string value to a numeric value...