
Sitemap

An XML or HTML file listing the pages of a website, used to help search engines discover and index content efficiently.

Also known as: XML sitemap, sitemap.xml, HTML sitemap

A sitemap is a file that lists the pages of a website so that search engines can discover and index its content. The most common form is an XML sitemap at /sitemap.xml, a machine-readable file designed for search engine crawlers. An HTML sitemap is a human-readable page listing the site’s content; it is less common today but still used on some larger sites.

XML sitemap structure

An XML sitemap follows a defined schema (sitemaps.org). A minimal example:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-04-23</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2026-03-10</lastmod>
  </url>
</urlset>

Each <url> entry can include:

  • <loc>, the page URL (required)
  • <lastmod>, last modification date in W3C Datetime format, for example 2026-04-23 (recommended; search engines such as Google use it when the value is consistently accurate)
  • <changefreq>, how often the page changes (largely ignored by modern crawlers)
  • <priority>, relative importance (largely ignored by modern crawlers)
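
The elements listed above can also be written by a script rather than by hand. The following is a minimal sketch in Python using only the standard library; the function name build_sitemap and the (loc, lastmod) input shape are assumptions made for illustration, not part of the sitemap protocol. It emits only <loc> and <lastmod>, since the other two elements are largely ignored.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as bytes for an iterable of (loc, lastmod) pairs.

    `lastmod` may be None, in which case the element is omitted.
    (Hypothetical helper; the input shape is an assumption.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # required element
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod  # optional element
    # ElementTree escapes characters such as "&" in URLs automatically.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        ("https://example.com/", "2026-04-23"),
        ("https://example.com/about", "2026-03-10"),
    ])
    print(xml_bytes.decode("utf-8"))
```

Building the tree with an XML library instead of string concatenation keeps the output well-formed even when URLs contain characters that need escaping.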

What sitemaps do

Sitemaps help search engines:

  • Discover pages they might not find through normal crawling
  • Understand site structure at a glance
  • Prioritize crawling of recently updated pages (via <lastmod>)
  • Index large sites more efficiently
  • Cover orphan pages (pages not linked from elsewhere)

For small sites with good internal linking, a sitemap may add little beyond what crawling alone provides. For larger or more complex sites, a sitemap can significantly improve how completely the site is indexed.
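
To illustrate how <lastmod> can drive crawl prioritization, here is a rough sketch of a consumer that reads a sitemap and orders its URLs newest first. It is not how any particular search engine works; the function name urls_by_recency is hypothetical, and the string sort assumes every <lastmod> value uses the same format (for example YYYY-MM-DD).

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_by_recency(sitemap_xml):
    """Return (loc, lastmod) pairs, newest lastmod first, undated entries last.

    Assumes all lastmod values share one format so that string order
    matches chronological order.
    """
    root = ET.fromstring(sitemap_xml)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod.strip() if lastmod else None))
    dated = sorted((e for e in entries if e[1]), key=lambda e: e[1], reverse=True)
    undated = [e for e in entries if not e[1]]
    return dated + undated
```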

Sitemap variants

Variant               | Purpose
Standard XML sitemap  | Lists pages
Image sitemap         | Lists images for image search indexing
Video sitemap         | Lists videos with metadata
News sitemap          | For Google News submissions
Sitemap index         | Master file listing multiple individual sitemaps; useful for sites with more than 50,000 URLs

A sitemap index lets a large site split its URLs across multiple sitemaps:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-blog.xml</loc>
  </sitemap>
</sitemapindex>

Where to put the sitemap

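Conventional location: https://example.com/sitemap.xml (the root of the site). Other locations also work, as long as the sitemap is referenced from robots.txt or submitted directly through the search engines' webmaster tools. Under the sitemaps.org protocol, a sitemap placed in a subdirectory is by default expected to list only URLs within that directory, which is one reason the root is preferred.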

How sitemaps are submitted to search engines

Several ways:

  1. Reference in robots.txt. Add Sitemap: https://example.com/sitemap.xml to robots.txt
  2. Google Search Console. Submit the sitemap URL in the Sitemaps section
  3. Bing Webmaster Tools. Submit through the Sitemaps section
  4. Yandex Webmaster. Similar process for the Russian search engine
  5. Ping (legacy). Search engines historically accepted ping URLs that notified them of sitemap updates; Google retired its ping endpoint in 2023, and the method is now largely deprecated

After submission, search engines fetch the sitemap on a schedule (often daily for active sites).
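
The robots.txt reference in option 1 is easy to verify with a script. As a small sketch, Python's standard urllib.robotparser module (Python 3.8 or later) can read the Sitemap: lines from the text of a robots.txt file; the wrapper function sitemaps_from_robots is a hypothetical name used here for illustration.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    example = "User-agent: *\nDisallow: /admin/\nSitemap: https://example.com/sitemap.xml\n"
    print(sitemaps_from_robots(example))
```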

Sitemap limits

  • Maximum 50,000 URLs per sitemap file
  • Maximum 50 MB uncompressed file size
  • For larger sites, use sitemap indexes to combine multiple sitemaps
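
One way to stay within these limits is to split the URL list into chunks and point a sitemap index at the resulting files. The sketch below assumes a hypothetical naming scheme (sitemap-1.xml, sitemap-2.xml, ...) and hypothetical helper names; only the two limits come from the list above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000              # per-file URL limit
MAX_BYTES = 50 * 1024 * 1024   # per-file uncompressed size limit


def chunk_urls(urls, size=MAX_URLS):
    """Yield consecutive lists of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def fits_size_limit(xml_bytes):
    """Check a serialized sitemap against the uncompressed size limit."""
    return len(xml_bytes) <= MAX_BYTES


def build_index(base_url, chunk_count):
    """Return (index_xml_bytes, child_sitemap_urls) for `chunk_count` files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    child_urls = []
    for n in range(1, chunk_count + 1):
        child = f"{base_url}/sitemap-{n}.xml"  # assumed naming scheme
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = child
        child_urls.append(child)
    xml_bytes = ET.tostring(index, encoding="UTF-8", xml_declaration=True)
    return xml_bytes, child_urls
```

A file can hit the size limit before the URL limit when URLs are long or extra metadata is included, so checking both is worthwhile.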

Generating sitemaps

Most modern web platforms generate sitemaps automatically:

  • WordPress. Generates a sitemap by default since version 5.5; SEO plugins (Yoast, Rank Math) replace it with their own
  • Static site generators and frameworks. Astro, Hugo, Eleventy, Next.js, Gatsby, etc. offer sitemap generation as plugins or built-in features
  • Hosted CMS. Squarespace, Wix, Webflow, Shopify all generate sitemaps automatically
  • Custom sites. Generated by build scripts or libraries

Manual sitemap creation is rare; automation is standard.

What to include

  • All canonical, indexable URLs of the site
  • Pages that should appear in search results

What to exclude:

  • Pages with noindex meta tags
  • Duplicate or near-duplicate pages
  • Admin, login, or internal pages
  • URL parameter variations (filtered, sorted, paginated versions)
  • Pages requiring authentication
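
These inclusion and exclusion rules can be expressed as a filter in a build script. The sketch below is illustrative only: the Page record, its fields, and the excluded path prefixes are assumptions, and a real site would derive them from its own CMS or routing data.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Page:                    # hypothetical record for illustration
    url: str
    canonical: str             # canonical URL declared by the page
    noindex: bool = False
    requires_login: bool = False


# Assumed paths for admin, login, and internal pages.
EXCLUDED_PREFIXES = ("/admin/", "/login", "/internal/")


def belongs_in_sitemap(page):
    """Return True if the page is a canonical, indexable, public URL."""
    parts = urlsplit(page.url)
    if page.noindex or page.requires_login:
        return False
    if page.url != page.canonical:   # duplicate or alternate version
        return False
    if parts.query:                  # filtered, sorted, or other parameter variants
        return False
    if parts.path.startswith(EXCLUDED_PREFIXES):
        return False
    return True
```

Treating any query string as a parameter variation is a simplification; sites whose canonical URLs legitimately carry parameters would need a narrower rule.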

Sitemap and SEO

Sitemaps assist indexing but do not guarantee ranking improvements. They are a discovery mechanism, not a ranking factor. The pages must still earn rankings through content quality, site authority, and other signals.

For new sites, well-structured sitemaps can speed up initial indexing. For established sites, the impact is usually smaller.

Common misconceptions

  • “Submitting a sitemap guarantees indexing.” Search engines may still choose not to index pages they consider low-quality or duplicate.
  • “Higher priority values improve rankings.” The <priority> element is largely ignored by modern crawlers.
  • “Every page needs to be in the sitemap to rank.” Pages can rank without being in the sitemap if they are discovered through links.
  • “Sitemaps replace internal linking.” They complement internal linking rather than replace it; well-linked pages are crawled more frequently and treated as more important.