When we talk about web application scanning, we often come across the crawlers built into automated scanning tools such as Burp Suite, Acunetix, and WebInspect. These tools have excellent crawlers that walk through a web application and try various attack vectors against the discovered URLs. In this chapter, we are going to understand how a crawler works and what happens under the hood. The objective is to enable the reader to understand how a crawler collects information and builds the attack surface for various attacks. The same knowledge can later be used to develop a custom tool that automates web application scanning. In this chapter, we are going to create a custom web crawler (a simple illustrative sketch follows the list below) that will crawl through a website and give us a list containing the following:
- Web pages
- HTML forms...
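Before we build the full crawler, the following minimal sketch illustrates the basic idea: starting from a seed URL, fetch each page, record the HTML forms it contains, and queue any same-site links for later visits. The libraries used here (requests and BeautifulSoup) and the target URL are assumptions for illustration only, not necessarily what the chapter's finished tool will use.

```python
# A minimal, illustrative crawler sketch, not the chapter's final code.
# Assumes the requests and beautifulsoup4 libraries are installed; the
# target URL below is a hypothetical example.
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup


def crawl(start_url, max_pages=50):
    """Breadth-first crawl of a single site, collecting pages and forms."""
    visited = set()          # pages we have already fetched
    forms = []               # (page URL, form action, form method) tuples
    queue = [start_url]
    base_host = urlparse(start_url).netloc

    while queue and len(visited) < max_pages:
        url = queue.pop(0)
        if url in visited:
            continue
        visited.add(url)

        try:
            response = requests.get(url, timeout=10)
        except requests.RequestException:
            continue  # skip pages that fail to load

        soup = BeautifulSoup(response.text, "html.parser")

        # Collect every HTML form on the page.
        for form in soup.find_all("form"):
            forms.append((url, form.get("action", ""), form.get("method", "get")))

        # Queue links on the same host for later crawling.
        for link in soup.find_all("a", href=True):
            next_url = urljoin(url, link["href"])
            if urlparse(next_url).netloc == base_host and next_url not in visited:
                queue.append(next_url)

    return visited, forms


if __name__ == "__main__":
    pages, discovered_forms = crawl("http://example.com")  # hypothetical target
    print(f"Discovered {len(pages)} pages and {len(discovered_forms)} forms")
```

The crawler we develop in this chapter follows the same pattern, but extends it to produce the complete attack surface described in the preceding list.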