Wrangling data with pandas
Data wrangling is one of the most important topics in data science interviews. For starters, data is rarely presented in an analysis-ready format, so it usually has to be preprocessed for modeling and checked for data quality problems first. As a result, data scientists can spend upward of 80% of their time cleaning and wrangling data [1].
Furthermore, data wrangling skills demonstrate your comfort and fluency with computer programming. The ability to write functions and loops, index and filter data, aggregate it, and build calculations will serve you well in your data science journey, enabling you to complete work quickly and efficiently. These skills are also fundamental to extract, transform, load (ETL) activities, querying data, data modeling, descriptive statistics, reporting, and a host of other data tasks.
In this section, we will review several common data wrangling challenges, including handling missing data, filtering data, merging, and aggregating...
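As a quick preview, the four tasks named above can be sketched in a few lines of pandas. This is only an illustrative sketch: the `orders` and `customers` tables, their column names, the median fill, and the threshold of 20 are all assumptions made up for the example, not data or choices from this book.

```python
import pandas as pd

# Hypothetical example data (invented for illustration).
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "customer_id": [10, 11, 10, 12, 11],
    "amount": [25.0, None, 40.0, 15.0, 30.0],  # one missing value
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "region": ["East", "West", "East"],
})

# 1. Handle missing data: fill the missing amount with the median
#    of the observed amounts (one of several reasonable strategies).
orders["amount"] = orders["amount"].fillna(orders["amount"].median())

# 2. Filter: keep only orders with an amount of at least 20.
large_orders = orders[orders["amount"] >= 20]

# 3. Merge: attach each customer's region to their orders.
merged = large_orders.merge(customers, on="customer_id", how="left")

# 4. Aggregate: total amount and number of orders per region.
summary = merged.groupby("region", as_index=False).agg(
    total_amount=("amount", "sum"),
    order_count=("order_id", "count"),
)

print(summary)
```

With this made-up data, the missing amount is filled with 27.5, the order of 15 is filtered out, and the summary shows two orders per region, totaling 65.0 for East and 57.5 for West. Whether a median fill is appropriate depends on why the values are missing, so treat that step as a placeholder rather than a recommendation.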