Hubbry Logo
search
logo

Sitemaps

logo
Community Hub0 Subscribers
Write something...
Be the first to start a discussion here.
Be the first to start a discussion here.
See all
Sitemaps

Sitemaps is a protocol in XML format meant for a webmaster to inform search engines about URLs on a website that are available for web crawling. It allows webmasters to include additional information about each URL: when it was last updated, how often it changes, and how important it is in relation to other URLs of the site. This allows search engines to crawl the site more efficiently and to find URLs that may be isolated from the rest of the site's content. The Sitemaps protocol is a URL inclusion protocol and complements robots.txt, a URL exclusion protocol.

Google first introduced Sitemaps 0.84 in June 2005 so web developers could publish lists of links from across their sites. Google, Yahoo! and Microsoft announced joint support for the Sitemaps protocol in November 2006. The schema version was changed to "Sitemap 0.90", but no other changes were made.

In April 2007, Ask.com and IBM announced support for Sitemaps. Also, Google, Yahoo, MSN announced auto-discovery for sitemaps through robots.txt. In May 2007, the state governments of Arizona, California, Utah and Virginia announced they would use Sitemaps on their web sites.

The Sitemaps protocol is based on ideas from "Crawler-friendly Web Servers," with improvements including auto-discovery through robots.txt and the ability to specify the priority and change frequency of pages.

Sitemaps are particularly beneficial on websites where:

The Sitemap Protocol format consists of XML tags. The file itself must be UTF-8 encoded. Sitemaps can also be just a plain text list of URLs. They can also be compressed in .gz format.

A sample Sitemap that contains just one URL and uses all optional tags is shown below.

The Sitemap XML protocol is also extended to provide a way of listing multiple Sitemaps in a 'Sitemap index' file. The maximum Sitemap size of 50 MiB (uncompressed) or 50,000 URLs means this is necessary for large sites.

See all
User Avatar
No comments yet.