Finding broken links in a website
Some people manually check every page of a website for broken links. That is feasible for a site with only a few pages, but becomes difficult as the page count grows. The task becomes easy once we automate the process with HTTP manipulation tools. Let's see how to do it.
Getting ready
To identify the links and find the broken ones among them, we can use lynx and curl. lynx has an option, -traversal, which recursively visits pages on the website and builds a list of all of its hyperlinks. We can then use cURL to verify whether each link is broken.
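For instance (a minimal sketch, assuming lynx and curl are installed; http://example.com is a placeholder URL), the two tools work together like this:

# Crawl the site; lynx records the URLs it encountered in
# traverse.dat and reject.dat in the current directory.
lynx -traversal http://example.com > /dev/null

# Verify a single link: fetch only the headers (-I, silently
# with -s) and look for an OK status line.
curl -I -s http://example.com | grep "HTTP/.*OK"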
How to do it...
Let's write a Bash script with the help of the curl command to find the broken links on a website:
#!/bin/bash
#Filename: find_broken.sh
#Desc: Find broken links in a website

if [ $# -ne 1 ];
then
  echo -e "Usage: $0 URL\n"
  exit 1;
fi

echo Broken links:
mkdir /tmp...
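The listing above is truncated after mkdir /tmp. Based on the approach described in the Getting ready section, the remaining steps would create a scratch directory, run the lynx traversal there, and test each collected link with curl. The following completion is a sketch under those assumptions; in particular, naming the directory after the shell PID ($$) and reading the link list from lynx's reject.dat file are assumptions:

mkdir /tmp/$$.lynx    # scratch directory named with the shell PID (assumption)
cd /tmp/$$.lynx

lynx -traversal $1 > /dev/null    # lynx writes its link lists (e.g. reject.dat) here
count=0;

sort -u reject.dat > links.txt    # deduplicate the collected links

while read link;
do
  # HEAD request only; an empty result means no "200 OK" status line came back
  output=`curl -I $link -s | grep "HTTP/.*OK"`
  if [[ -z $output ]];
  then
    echo $link;
    let count++
  fi
done < links.txt

[ $count -eq 0 ] && echo No broken links found.

Note that the grep pattern matches HTTP/1.x status lines such as HTTP/1.1 200 OK; servers answering over HTTP/2 report just HTTP/2 200, so the pattern may need loosening on modern sites.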