Understanding the Search Engine Process

There are several steps in the search engine process that are important to understand if you hope to push your site up in the SERPs (search engine results pages). From the moment a site goes live, these are the steps that make that site (or one or more of its pages) available to users via search engines.

  • Crawling the Web

In this phase, pages are discovered and quickly scanned by automated programs, called bots or spiders, which find pages via the hyperlink structure of the web and “crawl” them to get a rough outline of what each site is about. This process runs continuously, in an effort to capture the estimated 20 billion pages that exist on the web.
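As a rough illustration (a minimal sketch in Python using only the standard library, not a description of how any particular engine is implemented), a crawler can be reduced to a queue of URLs to visit: fetch each page, extract its hyperlinks, and feed them back into the queue.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag encountered on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_url, max_pages=10):
    """Breadth-first crawl starting from seed_url, following hyperlinks."""
    queue = deque([seed_url])
    seen = set()
    pages = {}  # url -> raw HTML: the crawler's rough picture of each page
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", errors="replace")
        except Exception:
            continue  # unreachable or non-HTML pages are simply skipped
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            queue.append(urljoin(url, link))  # resolve relative links
    return pages
```

Production crawlers also respect robots.txt, throttle their requests and distribute the queue across many machines; the loop above only captures the core idea of following the web’s link structure.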

  • Indexing Pages

Once a page has been crawled, its contents can be indexed. This is the phase in which the data contained on a page is stored in a massive database comprising all the information the search engine has indexed. Exactly how that data is managed is not publicised, but it must be quite sophisticated to allow billions of documents to be searched and sorted in fractions of a second.
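The classic data structure behind this step is an inverted index, which maps each term to the documents containing it. The toy version below (a hypothetical sketch that builds on the crawl output above, not a description of any engine’s actual storage) shows the idea:

```python
import re
from collections import defaultdict


def build_index(pages):
    """Builds an inverted index: each term maps to the set of URLs containing it.

    `pages` is a dict of {url: raw_html}, e.g. the output of the crawl sketch above.
    """
    index = defaultdict(set)
    for url, html in pages.items():
        text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(url)
    return index
```

An index like this answers “which pages contain this word?” in a single lookup, which is what makes searching billions of documents in fractions of a second feasible, albeit with far more elaborate compression, sharding and update machinery than shown here.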

  • Processing a Query

When one of the hundreds of millions of queries submitted each day is entered into the search engine, the engine searches its index for documents that correspond to the requested terms and retrieves those its algorithms determine to match the query. A number of operators are available to refine a query; Google, for instance, recognises quotation marks for exact phrases and the minus sign for excluding terms. Any documents in the index that match the user’s query are then eligible to appear in the displayed SERPs.
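Continuing the sketch, processing a query amounts to looking each term up in the inverted index and combining the resulting document sets. The snippet below (hypothetical, using simple AND semantics) shows one way to do it:

```python
def search(index, query):
    """Returns the set of URLs whose indexed text contains every query term.

    `index` is the {term: set_of_urls} mapping produced by build_index above.
    """
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())  # AND semantics: every term must match
    return results
```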

  • Ranking of the Results

Once the search engine has compiled a set of responsive documents, its algorithms sort them and present them in order of relevance to the query that was entered. This is one reason SEO specialists place so much emphasis on the various aspects of search engine optimisation: making their pages appear as high in the SERPs as possible, the premise being that a user is likely to click the first result that appears to match their needs. Pages relegated to the second page of the SERPs are far less likely to receive traffic than the first few documents presented.
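Real engines combine a great many signals (link popularity, freshness, personalisation and so on) whose exact weighting is not public. Purely as a placeholder, the sketch below orders the matching pages by how often the query terms appear in them:

```python
import re


def rank(pages, results, query):
    """Orders matching URLs by a crude relevance score: total query-term frequency.

    `pages` is {url: raw_html}; `results` is the set of matching URLs from search().
    """
    terms = query.lower().split()

    def score(url):
        text = re.sub(r"<[^>]+>", " ", pages[url]).lower()
        return sum(text.count(term) for term in terms)

    return sorted(results, key=score, reverse=True)
```

Chained together, `rank(pages, search(build_index(pages), "seo basics"), "seo basics")` would return the crawled URLs in descending order of that naive score (the query string here is purely illustrative).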

As can be seen, search engines have an enormous amount of information to filter at a very rapid pace. The aggregate computing power of the large search engines, such as Google, Yahoo and Bing, is staggering: they perform millions of such calculations per second to help users find the information they seek.