Chapter 2. Web Scraping
The amount of data created each day on the Internet is staggering. Much of it comes from social media websites and individual blogs, and more is generated by our cell phones, tablets, and wearable devices. According to http://www.livevault.com/2-5-quintillion-bytes-of-data-are-created-every-day/, in 2015 IBM reported that approximately 2.5 quintillion bytes of data are created per day on average. Any organization would benefit from getting its hands on this data and making sense of it. This is where web scraping comes into play.
Simply put, web scraping is a technique for extracting data from websites, manipulating that data into a structured format, and then saving it to local files for consumption and reporting. We have all probably done some form of web scraping in the past, even though we may not have known it at the time.
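To make the three steps concrete (extract, structure, save), here is a minimal sketch in Python that uses only the standard library. It is an illustration under stated assumptions rather than the approach this chapter goes on to build: the HTML snippet, the `TableParser` class, the `scrape_table` and `save_csv` helpers, and the `products.csv` file name are all hypothetical, and a real scraper would download the page from a website instead of reading it from a string.

```python
import csv
from html.parser import HTMLParser

# Hypothetical page content standing in for a downloaded web page.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # completed rows, each a list of cell strings
        self._row = None       # row currently being read, or None
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def scrape_table(html):
    """Step 1 and 2: extract the cells and structure them as records.

    Assumes the first row of the table holds the column names.
    """
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def save_csv(records, path):
    """Step 3: save the structured records to a local CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)


records = scrape_table(SAMPLE_HTML)

if __name__ == "__main__":
    save_csv(records, "products.csv")
```

The records are kept as a list of dictionaries so that the same structure could later be handed to a reporting tool or written to a different file format without changing the extraction step.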
In the previous chapter, Introduction to Practical Business Intelligence...