There are many cases where we simply want to save scraped content as a local copy for archival, backup, or later bulk analysis. We might also want to save media from those sites for later use. I've built scrapers for advertisement compliance companies, where we would track and download advertisement media on websites to verify proper usage, and also store it for later analysis, compliance checks, and transcoding.
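As a minimal sketch of what such an archiving step can look like, assuming Python with the boto3 AWS SDK (the bucket name, key layout, and helper names here are illustrative, not from any particular production system):

```python
from datetime import datetime, timezone


def archive_key(site: str, url_path: str) -> str:
    """Build a date-partitioned S3 key so archived pages stay organized by site and day."""
    day = datetime.now(timezone.utc).strftime("%Y/%m/%d")
    return f"archive/{site}/{day}/{url_path.lstrip('/')}"


def upload_page(bucket: str, site: str, url_path: str, html: bytes) -> str:
    """Upload one scraped page to S3 and return the object key it was stored under."""
    import boto3  # imported lazily so the helpers above work without AWS credentials

    key = archive_key(site, url_path)
    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=html, ContentType="text/html")
    return key


if __name__ == "__main__":
    # Hypothetical usage: archive one page into a bucket you own.
    upload_page("my-scrape-archive", "example.com", "/index.html", b"<html>...</html>")
```

Partitioning keys by date keeps later bulk analysis simple: a single prefix such as `archive/example.com/2024/01/` selects one site's pages for one day.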
The storage required for these types of systems can be immense, but with cloud storage services such as AWS S3 (Simple Storage Service), this becomes much easier and more cost effective than managing a large SAN (Storage Area Network) in your own IT department. S3 can also automatically transition data from hot to cold storage, and then to long-term archival storage such as Amazon S3 Glacier, which can save you even more money.
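These transitions are configured with S3 lifecycle rules. A sketch of such a tiering policy, again assuming boto3 (the rule ID, prefix, and day thresholds are illustrative choices, not required values):

```python
def tiering_rules(prefix: str = "archive/") -> dict:
    """Lifecycle rules: move objects to Infrequent Access after 30 days, Glacier after 90."""
    return {
        "Rules": [
            {
                "ID": "scrape-archive-tiering",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


def apply_tiering(bucket: str) -> None:
    """Attach the lifecycle configuration to a bucket."""
    import boto3  # imported lazily so building the rules requires no AWS credentials

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=tiering_rules()
    )
```

Once the policy is attached, S3 applies it automatically; your scraper keeps writing to the same prefix and the storage cost drops as objects age.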