Summary
In this chapter, we learned how Optimus groups similar string values in a column using key collision and nearest-neighbor methods, and how to replace each group with a single value that represents it better.
Once the clusters were created, we learned how to explore the suggestions, modify them, and apply them to our data.
We also learned about the different algorithms available in Optimus and how to choose among them depending on the type of data we're handling and how accurate or fast the clustering needs to be.
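To make the key collision idea concrete, here is a minimal sketch in plain Python of fingerprint-based clustering followed by applying the suggested replacements. It deliberately does not use the Optimus API: the function names (fingerprint, cluster_by_fingerprint, apply_clusters) and the sample data are hypothetical, and picking the most frequent spelling as the suggestion is an assumption made for illustration.

```python
import string
import unicodedata
from collections import Counter, defaultdict


def fingerprint(value):
    """Build a key-collision key (hypothetical helper, not an Optimus function)."""
    # Fold accented characters to ASCII, then lowercase and trim.
    text = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    text = text.lower().strip()
    # Drop punctuation so "Inc." and "Inc" produce the same key.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Sort the unique tokens so word order does not matter.
    return " ".join(sorted(set(text.split())))


def cluster_by_fingerprint(values):
    """Group values that share a fingerprint; suggest the most frequent spelling."""
    groups = defaultdict(Counter)
    for value in values:
        groups[fingerprint(value)][value] += 1
    # Keep only groups with more than one distinct spelling.
    return {
        counts.most_common(1)[0][0]: sorted(counts)
        for counts in groups.values()
        if len(counts) > 1
    }


def apply_clusters(values, clusters):
    """Replace every variant with the suggested value of its cluster."""
    mapping = {
        variant: suggestion
        for suggestion, variants in clusters.items()
        for variant in variants
    }
    return [mapping.get(value, value) for value in values]


# Hypothetical sample column with inconsistent spellings.
names = ["Acme Inc.", "acme inc", "Inc. Acme", "Acme Inc.", "Globex", "Initech"]
clusters = cluster_by_fingerprint(names)
cleaned = apply_clusters(names, clusters)
```

In this sketch, the three spellings of the same company collapse onto one key, so they form a single cluster whose suggestion can be reviewed and edited before it is applied. A nearest-neighbor method would instead compare pairs of values with a distance measure, which catches typos that change the key but costs more to compute.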
In the next chapter, we will start applying feature engineering to our dataset as an introduction to the machine learning (ML) chapter.
Sum of Length | Minimum Rating
≤ 4 | 5
4 < sum ≤ 7 | 4
7 <... |