Image crawler and downloader
Image crawlers are very useful when we need to download all the images that appear in a web page. Instead of going through the HTML sources and picking all the images, we can use a script to parse the image files and download them automatically. Let's see how to do it.
How to do it...
Let's write a Bash script to crawl and download the images from a web page, as follows:
#!/bin/bash #Desc: Images downloader #Filename: img_downloader.sh if [ $# -ne 3 ]; then echo "Usage: $0 URL -d DIRECTORY" exit -1 fi for i in {1..4} do case $1 in -d) shift; directory=$1; shift ;; *) url=${url:-$1}; shift;; esac done mkdir -p $directory; baseurl=$(echo $url | egrep -o "https?://[a-z.]+") echo Downloading $url curl -s $url | egrep -o "<img src=[^>]*>" | sed 's/<img src=\"\([^"]*\).*/\1/g' > /tmp/$$.list sed -i "s|^/|$baseurl/|" /tmp/$$.list cd $directory; while read filename; do echo Downloading $filename curl -s -O "$filename" --silent done...