
How Search Engines Index Websites

by S. Saminaden

For your online business to be successful, it needs to be optimized and updated on a regular basis so that it maintains a good position on the major search engines. Positioning your website competitively, through SEO and other methods of online marketing, starts with understanding what a search engine is and how one works.


What is a Search Engine?

The term "search engine" is often used to describe both crawler-based search engines and human-edited web directories. However, these two search systems gather their listings in very different ways.

Human-edited directories such as Dmoz.org rely on submissions from humans to populate the directory’s categories. Crawler-based search engines such as Google use special software called “bots” or “spiders” to “crawl” the Internet and look for webpages to add to their listings.

Directories contain information about websites, whereas crawler-based search engines gather information from webpages. They don’t necessarily grab all the information on each webpage, but they take a significant amount and apply complex algorithms to index the information.


The Parts of a Search Engine

All crawler-based search engines consist of three parts: a spider that gathers webpages, a database that stores them, and search engine software that decides how search results are ranked and displayed.

First, a spider visits a webpage, reads it and then follows the links to other pages within the website. All the data that the spider has gathered is stored in the database, which contains a copy of every webpage that the spider has found. The spider will often return regularly to a website to look for any changes and update the database accordingly.
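The crawl described above can be sketched in a few lines of Python. This is only an illustration, not how any real search engine is implemented: the `crawl` function, the `fetch` callable and the `PAGES` dictionary (a tiny in-memory website standing in for real HTTP requests) are all hypothetical names made up for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Visit start_url, then follow links to other pages on the same site.

    `fetch` is a hypothetical callable that returns the HTML for a URL,
    or None if the page cannot be retrieved.
    """
    site = urlparse(start_url).netloc
    database = {}  # URL -> stored copy of the page
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in database:
            continue  # already stored on an earlier visit
        html = fetch(url)
        if html is None:
            continue
        database[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if urlparse(target).netloc == site and target not in database:
                queue.append(target)
    return database


# A tiny in-memory "website" (assumption: stands in for real HTTP requests).
PAGES = {
    "http://example.com/": '<a href="/about.html">About</a> <a href="/contact.html">Contact</a>',
    "http://example.com/about.html": '<a href="/">Home</a>',
    "http://example.com/contact.html": '<a href="http://other.example.org/">Partner</a>',
    "http://example.com/orphan.html": "No page links here.",
}

database = crawl("http://example.com/", PAGES.get)
print(sorted(database))
```

In this sketch the spider stores three pages. It never reaches `orphan.html`, because no page links to it, and it skips the link to `other.example.org` because that page belongs to a different website.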

The search engine software sifts through the millions of webpages stored in the database, identifying the body text, links and other content on each page. It does this so that it can find matches to a search and rank them in order of relevancy. Page titles, meta-descriptions and various other elements play a part in determining where a website should be ranked.
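To make the idea of matching and ranking concrete, here is a toy example in Python. Real search engines use far more elaborate and unpublished algorithms; the `DATABASE` contents, the `score` and `search` functions and the `TITLE_WEIGHT` value are assumptions invented for illustration. The only idea taken from the text is that a match in the page title can count for more than a match in the body.

```python
import re
from collections import Counter

# Hypothetical stored pages: URL -> (title, body text).
DATABASE = {
    "/web-design": ("Montreal Web Design", "We build websites for small businesses."),
    "/seo": ("Search Engine Optimization", "Improve how your website ranks on a search engine."),
    "/contact": ("Contact Us", "Call us about web design or search engine marketing."),
}

TITLE_WEIGHT = 3  # arbitrary assumption: a title match counts three times a body match


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, title, body):
    """Count query-word occurrences, weighting title matches more heavily."""
    title_counts = Counter(tokenize(title))
    body_counts = Counter(tokenize(body))
    return sum(TITLE_WEIGHT * title_counts[w] + body_counts[w] for w in tokenize(query))


def search(query, database):
    """Return the URLs of matching pages, highest score first."""
    scored = [(score(query, title, body), url) for url, (title, body) in database.items()]
    matches = [(s, url) for s, url in scored if s > 0]
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in matches]


results = search("search engine", DATABASE)
print(results)  # ['/seo', '/contact']
```

The `/seo` page comes first because the search words appear in its title as well as its body, while `/web-design` is left out because it contains neither word.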

One advantage of having spiders revisit your site is that you can make changes to your webpage in order to appear higher in their relevancy rankings, then see if your changes worked. This is what search engine optimization is all about.

Although every search engine is made up of these three parts, there are differences and biases in how webpages are evaluated. This explains why the same search phrase will yield different results on different search engines. Google uses over 150 criteria to evaluate a webpage.

Some pages, however, are excluded from the database, either by policy or because the spiders cannot access them. Examples include Flash pages and webpages whose URLs contain special characters such as question marks (?) or ampersands (&). These factors prevent a website from being viewed as search-engine friendly.
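If you want to check your own URLs against this rule of thumb, a short Python script can flag the ones that carry a query string (the part after the question mark, where ampersands separate the parameters). The `has_query_string` helper and the sample URLs are hypothetical; the script only applies the rule described above and says nothing about how a particular search engine actually treats such URLs.

```python
from urllib.parse import urlparse


def has_query_string(url):
    """Return True if the URL carries a query string (the part after '?')."""
    return bool(urlparse(url).query)


# Sample URLs made up for this example.
urls = [
    "http://example.com/products/blue-widget.html",
    "http://example.com/product.php?id=42&colour=blue",
]

flagged = [u for u in urls if has_query_string(u)]
print(flagged)
```

Only the second URL is flagged, since it is the one containing a question mark and an ampersand.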


The Importance of Linking

If a webpage is never linked to from any other webpage, the spiders will never find it. The only ways a brand new webpage can get indexed by the search engines are for it to be linked from within the website or for its URL to be submitted to the search engine companies as a request for inclusion in their index.

It’s important to know that the links that count in terms of search rankings are the ones pointing to your website.
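A simple way to picture this is to count, for each page, how many other pages link to it. The Python sketch below does that for a made-up link graph; `LINKS` and `inbound_counts` are hypothetical names, and a plain count is only an illustration, not the way search engines actually weigh links.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "/": ["/about", "/services"],
    "/about": ["/"],
    "/services": ["/", "/about"],
    "/new-page": ["/"],
}


def inbound_counts(links):
    """Count how many links from other pages point to each page."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in targets:
            if target != source:  # a page linking to itself does not count
                counts[target] += 1
    return counts


counts = inbound_counts(LINKS)
orphans = sorted(page for page, n in counts.items() if n == 0)
print(dict(counts))
print(orphans)  # ['/new-page']
```

Here `/new-page` links out to the home page but nothing links back to it, so a spider following links would never discover it.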


Online Resources about Search Engines

Articles about search engines abound on the Internet. However, one of the more interesting ones is a Search Engines Tutorial at the UC Berkeley Online Library.

This tutorial looks at the features of all the major search engines (Google, Yahoo, Ask.com) and includes a document summarizing what makes a good search engine.

Danny Sullivan, former editor of SearchEngineWatch.com, wrote a series of detailed articles explaining how search engines work. The articles are a little outdated but still relevant, and they also explain how to make your website pages more search-engine friendly.
