Summary
This was an interesting chapter. We finally got to work with the Dataset APIs on real data, and we also got a glimpse of how the APIs are organized. Datasets and their associated classes have many more functions for you to explore. The Python APIs are very similar to the Scala APIs, and sometimes a little easier to use. The IPython notebook is available at https://github.com/xsankar/fdps-v3/blob/master/extras/003-DataFrame-For-DS.ipynb. Data wrangling with Python, especially in Python notebooks, is the way many data scientists prefer to work.