Configuring the Alfresco search engine
The Alfresco search engine is configurable and highly scalable. This section provides information about the underlying search engine and the process for configuring it.
The theory behind the search engine
Alfresco supports full-text search through Apache Lucene (http://lucene.apache.org), an open source, fast, and highly scalable search engine library. Lucene powers search in the discussion groups at Fortune 100 companies, in commercial issue trackers, in email search from Microsoft, and in the Nutch web search engine (which scales to billions of pages).
Lucene searches a document based on its text content, which makes it independent of the file format. Any kind of file (PDF, HTML, Microsoft Word documents, and so on) can be indexed, as long as its textual information can be extracted.
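To make the idea of format independence concrete, the following is a minimal sketch of a toy inverted index written in Python. It is not Lucene's implementation or API; the names ToyIndex, tokenize, add, and search are hypothetical and exist only for this illustration. The sketch assumes that text has already been extracted from each file, and it shows that once this is done, the index treats a PDF, an HTML page, and a Word document in exactly the same way.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyIndex:
    """Toy inverted index (illustrative only, not Lucene):
    maps each term to the set of document IDs that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, extracted_text):
        # The index only ever sees plain text. The original file format
        # (PDF, HTML, Word, ...) no longer matters once text is extracted.
        for term in tokenize(extracted_text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return the sorted IDs of documents containing every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        matches = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(matches)


# Hypothetical documents; the strings stand in for text extracted from each file.
index = ToyIndex()
index.add("report.pdf", "Quarterly sales report for the search team")
index.add("notes.html", "Meeting notes: search engine scalability")
index.add("memo.doc", "Sales memo")

print(index.search("search"))        # ['notes.html', 'report.pdf']
print(index.search("sales report"))  # ['report.pdf']
```

A real search engine adds much more on top of this (analyzers, relevance scoring, on-disk index segments), but the separation is the same: text extraction handles the file format, and the index works only with the resulting terms.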
Lucene stores the search indexes and related data in a back-end file system, similar to Alfresco...