Joomla! Search Engine Optimization

Drive people to your site with this supercharged guide to Joomla! and Search Engine Optimization.

Product type Paperback
Published in Jul 2012
Publisher Packt
ISBN-13 9781849518765
Length 116 pages
Edition 1st Edition
Author (1): Ric Shreves

How search engines assess sites


Search engines all function in approximately the same fashion: a software agent, known as a bot, a spider, or a crawler, visits a page, gathers the content, and stores it in the search engine's data repository. Once the information is in the repository, it is indexed. The crawling and indexing processes are constant and ongoing. Each of the major search engines maintains multiple crawlers that work tirelessly to refresh its index. The spiders find new pages by a variety of methods, typically including XML sitemaps, URLs already in the index, links to pages discovered while indexing, and URLs submitted for inclusion by users. How frequently they visit a specific site, and how deeply they spider the site on each visit, varies.
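The crawl loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real search engine is implemented: the `FAKE_WEB` dictionary stands in for the live web, the `example.com` URLs are invented, and `crawl` is a hypothetical helper name. It shows how seed URLs (for instance, from an XML sitemap or a user submission) and links found on visited pages both feed the list of pages still to visit.

```python
from collections import deque

# Hypothetical in-memory "web" so the sketch runs without network access:
# each URL maps to (page text, links found on that page).
FAKE_WEB = {
    "http://example.com/": (
        "Welcome to our Joomla site",
        ["http://example.com/about", "http://example.com/blog"],
    ),
    "http://example.com/about": ("About our team", ["http://example.com/"]),
    "http://example.com/blog": (
        "Joomla SEO tips",
        ["http://example.com/blog/post-1"],
    ),
    "http://example.com/blog/post-1": ("Optimizing page titles", []),
    # No page links here, so it is only found if it is supplied as a seed.
    "http://example.com/orphan": ("Nobody links here", []),
}


def crawl(seed_urls, max_pages=100):
    """Visit pages breadth-first and store their text in a repository."""
    frontier = deque(seed_urls)   # URLs waiting to be visited
    repository = {}               # URL -> stored page text
    while frontier and len(repository) < max_pages:
        url = frontier.popleft()
        if url in repository or url not in FAKE_WEB:
            continue              # already stored, or page does not exist
        text, links = FAKE_WEB[url]
        repository[url] = text
        # Links discovered on this page become candidates for later visits.
        frontier.extend(link for link in links if link not in repository)
    return repository
```

Calling `crawl(["http://example.com/"])` stores the four pages reachable by links from the home page. The orphan page only enters the repository when its URL is included in the seed list, which is the role a sitemap or a manual submission plays. The `max_pages` limit is a crude stand-in for the fact that crawl depth varies from visit to visit.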

When a user visits the search engine and runs a search, the search engine extracts from its index a list of pages that are relevant to the query, and then displays that list to the user. The output on the search results page is determined by each search engine's own criteria. The ranking methodology each engine uses is the product of its own secret algorithm.
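To make the index and query steps concrete, here is a minimal sketch of an inverted index with a toy ranking rule. Real ranking algorithms are secret and far more sophisticated, so the scoring used here (a simple count of matching words, with ties broken alphabetically by URL) is purely an assumption for illustration. The function names `build_index` and `search` are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(repository):
    """Map each word to the pages containing it, with occurrence counts."""
    index = defaultdict(Counter)
    for url, text in repository.items():
        for word in text.lower().split():
            index[word][url] += 1
    return index


def search(index, query):
    """Return URLs matching any query word, highest toy score first."""
    scores = Counter()
    for word in query.lower().split():
        # .get() avoids adding empty entries to the index for unknown words.
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Sort by descending score, then by URL so the order is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, _ in ranked]
```

The point of the sketch is the separation of concerns: the index is built ahead of time from the stored pages, and a query only consults the index, never the pages themselves.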

The search engine's crawler is primarily interested in certain types of information on the page, particularly the URL, the text, and the links on the page. Formatting is not indexed. Images and other media are indexed by most search engines, but to varying degrees of depth. Some types of media, such as Flash or attached files, are rarely indexed, though there are exceptions.
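As a rough illustration of what a crawler keeps and what it discards, the following sketch uses Python's standard `html.parser` module to pull the visible text and the links out of a page while ignoring formatting. It is an assumption-laden simplification: the `PageExtractor` class and `extract` function are hypothetical names, and a real crawler handles many more cases (images, metadata, malformed markup).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageExtractor(HTMLParser):
    """Collect visible text and link targets; ignore styling and scripts."""

    SKIPPED = {"script", "style"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.text_parts = []
        self.links = []
        self._skip_depth = 0  # greater than 0 while inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def extract(url, html):
    """Return the URL, text, and links of a page."""
    parser = PageExtractor(url)
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "text": " ".join(parser.text_parts),
        "links": parser.links,
    }
```

Note that tags such as `<b>` and inline `style` attributes contribute nothing to the result: only the words and the link targets survive, which mirrors the statement that formatting is not indexed.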

If you have a Google Webmaster Tools account, you can see a web page exactly as Googlebot (the name of the Google crawler) sees it. To do this, log in to Google Webmaster Tools and click on a site profile. In the navigation menu on the left-hand side, select the Diagnostics menu and then select the Fetch as Googlebot option. Type the URL of the page you want to see, and the system will produce the results. The following screenshots show a web page, followed by Googlebot's view of the same page:

Here's the spider's view of the same page:
