Personally, I find the field of natural language processing very exciting. The vast majority of our knowledge as humans is contained in books, documents, and web pages. Knowing how to automatically extract this information and organize it with the help of machine learning is essential to our scientific progress and to our endeavors in automation. This is why multiple scientific fields, such as information retrieval, statistics, and linguistics, borrow ideas from each other and try to solve the same problem from different angles. In this chapter, we borrowed ideas from all of these fields and learned how to represent textual data in formats suitable for machine learning algorithms. We also learned about the utilities that scikit-learn provides to aid in building and optimizing end-to-end solutions. Finally, we encountered concepts such as transfer learning, and we were able to seamlessly incorporate spaCy's language models into scikit-learn.
From the next chapter, we...