This chapter walks step by step through building a Python data pipeline in PyCharm, using a hands-on example. The term data pipeline generally denotes the set of steps used to collect, process, and analyze data. It is widely used in the industry to describe a reliable workflow for turning raw data into actionable insights.
On a smaller scale, this includes working with and maintaining the data for your data science projects, applying pre-processing methods, and visualizing the data. Beyond the practical know-how of using PyCharm in this process, you will also learn the general workflow and common practices of a complete data science project.
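To make the collect, process, and analyze stages concrete, here is a minimal sketch of a pipeline using only the Python standard library. The dataset, the column names, and the function names `collect`, `process`, and `analyze` are hypothetical and chosen purely for illustration; they are not the example developed in this chapter.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data standing in for a real source (file, API, database).
RAW_CSV = """city,temperature
Oslo,4.5
Cairo,
Lima,19.0
Oslo,5.5
"""


def collect(raw_text):
    """Collect: parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def process(rows):
    """Process: drop rows with a missing temperature and convert the rest to float."""
    cleaned = []
    for row in rows:
        if row["temperature"]:
            cleaned.append(
                {"city": row["city"], "temperature": float(row["temperature"])}
            )
    return cleaned


def analyze(rows):
    """Analyze: compute the mean temperature for each city."""
    by_city = {}
    for row in rows:
        by_city.setdefault(row["city"], []).append(row["temperature"])
    return {city: mean(values) for city, values in by_city.items()}


# Chain the three stages: each stage consumes the previous stage's output.
result = analyze(process(collect(RAW_CSV)))
print(result)
```

Keeping each stage as a separate function means any one of them can be tested or replaced independently, for example swapping the in-memory string for a real file without touching the cleaning or analysis steps.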
The following topics will be covered in this chapter:
- Working with and maintaining datasets
- Data cleaning and...