Leaving stray pages in Google's index
If you have Flash files and PDF files on your site, Google will eventually find and index them, even if you never intended them to be searchable. Over time, you'll have a slew of ugly, unappealing indexed entries in Google's search results. These pages will often compete in search results for your primary terms. You can check Google's index of your site by entering the site:yourdomain.com query in the Google search box.
As shown in the screenshot, a site: query for the above-referenced site reveals a large number of unattractive and pointless PDF files in Google's index.
You should monitor the index that Google maintains for your site to make sure that no low-value pages have made their way into it. To keep such files out of Google's index, create an entry in your robots.txt file instructing the search engines not to crawl particular pages or directories.
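As a rough illustration of such an entry, the sketch below embeds a small robots.txt and uses Python's standard urllib.robotparser module to check which URLs it blocks. The /pdfs/ and /flash/ directories and the file names are assumptions made up for this example; substitute the paths where your own low-value files live.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The /pdfs/ and /flash/ directories are
# assumptions for illustration, not paths taken from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdfs/
Disallow: /flash/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Example URLs on the placeholder domain used in the text.
    for url in (
        "https://yourdomain.com/pdfs/brochure.pdf",
        "https://yourdomain.com/flash/intro.swf",
        "https://yourdomain.com/index.html",
    ):
        status = "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked"
        print(url, "->", status)
```

The standard-library parser matches rules by simple path prefix, which is why the sketch blocks whole directories rather than file extensions. Keep in mind that a Disallow rule stops compliant crawlers from fetching a file; entries that are already indexed may take some time to drop out of the results.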