Introduction to Beautiful Soup 4 and Web Page Parsing
The ability to read and understand web pages is essential for anyone who collects and formats data. For example, consider the task of gathering data about movies and formatting it for a downstream system. Movie data is often obtained from websites such as IMDb, and it does not come pre-packaged in convenient formats such as CSV or JSON, so you need to know how to download and read a web page.
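To make "reading a web page" concrete, here is a minimal sketch that uses only Python's standard-library html.parser module. The HTML fragment, the class="rating" attribute, and the names RatingExtractor and extract_ratings are hypothetical, chosen for illustration; they do not reflect the markup of IMDb or any real site. Beautiful Soup can use this same standard-library parser as its backend.

```python
from html.parser import HTMLParser

# Hypothetical page fragment standing in for a downloaded movie page.
SAMPLE_HTML = """
<html>
  <head><title>Example Movie (1999)</title></head>
  <body>
    <h1 class="title">Example Movie</h1>
    <span class="rating">8.1</span>
  </body>
</html>
"""


class RatingExtractor(HTMLParser):
    """Collect the text found inside <span class="rating"> elements."""

    def __init__(self):
        super().__init__()
        self._in_rating = False
        self.ratings = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == "span" and ("class", "rating") in attrs:
            self._in_rating = True

    def handle_endtag(self, tag):
        # Simplification: assumes rating spans are not nested.
        if tag == "span":
            self._in_rating = False

    def handle_data(self, data):
        if self._in_rating:
            self.ratings.append(data.strip())


def extract_ratings(html):
    """Return the text of every rating span in the given HTML string."""
    parser = RatingExtractor()
    parser.feed(html)
    parser.close()
    return parser.ratings


ratings = extract_ratings(SAMPLE_HTML)
print(ratings)
```

Even this small query needs hand-written state tracking for a single tag, and it ignores nested elements and multi-valued class attributes.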
You also need to understand the structure of a web page so that you can design a system that searches (queries) a whole page for a particular piece of information and extracts its value. This involves understanding the grammar of markup languages and being able to write something that can parse them. Doing this for something like HTML, with all the edge cases in mind, is already incredibly complex, and if you extend the scope of the bespoke markup language to include...