Chapter 12
What is HTML?
HTML stands for Hypertext Markup Language, the standard markup language for building web pages and web applications.
What are HTTP requests?
Most communication over the internet (more specifically, the World Wide Web) uses HTTP. In HTTP, a request method (such as GET, POST, or HEAD) tells the server what the client wants done with a resource, and therefore what data the server should send back.
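As a small illustration of request methods, the sketch below builds three request objects with the standard library's urllib.request and prints the method each one would use. Nothing is sent over the network, and the URL and form data are placeholders invented for the example. The standard library is used here only to keep the sketch free of dependencies.

```python
from urllib.request import Request

# Placeholder URL for illustration only; no request is actually sent.
URL = "https://example.com/items"

# With no body and no explicit method, the request defaults to GET.
get_req = Request(URL)

# HEAD asks for the response headers only, without the body.
head_req = Request(URL, method="HEAD")

# POST submits data (here, made-up form data) to the server.
post_req = Request(URL, data=b"name=widget", method="POST")

for req in (get_req, head_req, post_req):
    print(req.get_method(), req.full_url)
```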
What are HTTP response status codes?
HTTP response status codes are three-digit numbers that signify the state of communication between a server and its client. They are sorted into five categories by their first digit: 1xx (informational), 2xx (successful), 3xx (redirection), 4xx (client error), and 5xx (server error).
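The grouping by first digit can be sketched in a few lines. The function name status_class and the category labels are choices made for this example, not part of any library.

```python
# The five status code classes, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "informational",
    2: "successful",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the category name for a three-digit HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError("not a valid HTTP status code")
    # Integer division by 100 keeps only the first digit.
    return STATUS_CLASSES[code // 100]
```

For example, status_class(200) returns "successful" and status_class(404) returns "client error".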
How does the requests module help with making web requests?
The requests module manages the communication between a Python program and a web server through HTTP requests.
What is a ping test and how is one typically designed?
A ping test is a tool typically used by web administrators to make sure that their sites are still available to clients. It does this by making requests to the websites under consideration and analyzing the returned response status codes.
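A minimal sketch of such a ping test follows. It assumes the caller passes in a session object with a get(url, timeout=...) method that returns a response exposing status_code, which is how requests.Session behaves. The function names and the verdict strings are invented for this example.

```python
def interpret_status(code: int) -> str:
    """Map a response status code to a coarse availability verdict."""
    if 200 <= code < 300:
        return "up"
    if 300 <= code < 400:
        return "redirected"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "unknown"


def ping(url: str, session, timeout: float = 5.0) -> str:
    """Request `url` once and report a verdict based on the status code.

    `session` is assumed to be a requests.Session, or any object with a
    compatible get() method.
    """
    try:
        response = session.get(url, timeout=timeout)
    except OSError as exc:
        # requests.RequestException is a subclass of OSError, so connection
        # failures and timeouts raised by requests end up here.
        return f"unreachable ({type(exc).__name__})"
    return interpret_status(response.status_code)
```

With the requests module installed, a call might look like ping("https://www.python.org", requests.Session()). Note that requests follows redirects by default, so the "redirected" verdict would normally appear only when redirects are disabled.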
Why is concurrency applicable in making web requests?
Both the process of making different requests to a web server and the process of parsing and processing downloaded HTML source code are independent across separate requests. Because no request depends on the result of another, these tasks can be run concurrently.
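One way to exploit that independence is a thread pool, sketched below with concurrent.futures from the standard library. The helper fetch_all and the stand-in simulated_fetch are hypothetical names for this example. In a real program, the fetch argument would be a function that makes the actual web request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=5):
    """Call fetch(url) for every URL on a thread pool; return {url: result}."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() runs the calls concurrently but yields results in input order.
        results = pool.map(fetch, urls)
        return dict(zip(urls, results))


def simulated_fetch(url):
    """Stand-in for a real request: wait briefly, then return a fake status."""
    time.sleep(0.1)  # pretend to wait for the server
    return 200
```

Because each call spends its time waiting rather than computing, the waits overlap across threads, so fetching several URLs this way should take roughly as long as the slowest single request rather than the sum of all of them.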
What are the considerations that need to be made when developing web scraping applications?
The following considerations should be made when developing applications that make concurrent web requests:
- The terms of service and data-collecting policies
- Error handling
- Updating your program regularly
- Avoiding over-scraping
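Two of the points above, error handling and avoiding over-scraping, can be sketched together: pause between requests, and retry a failed request a limited number of times instead of crashing or hammering the server. The function name, its parameters, and the default delay are assumptions made for this example. The loop is kept sequential for clarity, and the same idea applies when limiting the number of worker threads in a concurrent program.

```python
import time


def polite_fetch_all(urls, fetch, delay=1.0, retries=2, sleep=time.sleep):
    """Fetch each URL in turn, pausing between requests and retrying failures.

    `fetch` is any callable taking a URL. If it still fails after `retries`
    extra attempts, the exception is stored as that URL's result.
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)  # throttle: wait before contacting the server again
        for attempt in range(retries + 1):
            try:
                results[url] = fetch(url)
                break
            except OSError as exc:
                # Network errors (including those raised by requests) are
                # OSError subclasses; record the failure once retries run out.
                if attempt == retries:
                    results[url] = exc
                else:
                    sleep(delay)
    return results
```

The sleep parameter exists only so the pause can be replaced in tests. In normal use it can be left at its default.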