
What Is an XML Sitemap?

An XML sitemap is a file that lists the URLs on your website that you want search engines to crawl and index. It acts as a roadmap for crawlers such as Googlebot, telling them which pages exist, when they were last updated, and (optionally) how you rate their importance relative to each other. Submitting a sitemap to Google Search Console can speed up discovery and helps Google find pages it might otherwise miss.

72% of top-ranking pages have a sitemap submitted to Search Console (Source: Ahrefs, 2023)
Fact-checked against 2 sources · Last updated 8 June 2026
Key Takeaways
  • Only include canonical, indexable URLs in your sitemap — no noindex pages, no redirects.
  • Submit your sitemap to Google Search Console on day one of going live.
  • Keep your sitemap accurate — stale or broken sitemaps can confuse Googlebot.
  • Large sites should use sitemap index files to split into multiple sitemaps (50,000 URL limit per file).
  • Next.js and most CMSs can auto-generate sitemaps — don't maintain them manually.

What Goes in an XML Sitemap?

A sitemap is an XML file listing your URLs with optional metadata: last modification date (lastmod), change frequency (changefreq), and priority (0.0–1.0).
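As a rough illustration of that structure, here is a minimal sketch that builds a sitemap with Python's standard library. The function name `build_sitemap` and the example.com URLs are placeholders for this example, and changefreq and priority are left out to keep it short.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (bytes) for (url, last_modified) pairs.

    last_modified is a datetime.date, or None to omit <lastmod>.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url
        if last_modified is not None:
            # lastmod uses W3C Datetime; a plain YYYY-MM-DD date is valid.
            ET.SubElement(node, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical pages, for illustration only.
xml_bytes = build_sitemap([
    ("https://example.com/", date(2024, 1, 15)),
    ("https://example.com/about", None),
])
```

In practice a CMS or framework plugin usually generates this file for you; the sketch is only meant to show what ends up inside it.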

Only include pages you want indexed. Exclude: noindex pages, redirect URLs, paginated duplicates (keep only canonical), and admin or login pages.
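The exclusion rules above can be sketched as a simple filter. This assumes you already have crawl data for each page; the `Page` record, its fields, and the excluded path prefixes are hypothetical names invented for this example, not part of any particular tool.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit

# Assumed path prefixes for admin and login areas; adjust for your site.
EXCLUDED_PREFIXES = ("/admin/", "/login")


@dataclass
class Page:
    url: str
    status: int = 200
    noindex: bool = False
    canonical: Optional[str] = None  # None is treated as self-canonical


def sitemap_urls(pages):
    """Return the URLs that are safe to list in a sitemap."""
    keep = []
    for page in pages:
        if page.status != 200:
            continue  # redirects (301/302) and error pages
        if page.noindex:
            continue  # pages you have asked search engines not to index
        if page.canonical not in (None, page.url):
            continue  # duplicates whose canonical points elsewhere
        if urlsplit(page.url).path.startswith(EXCLUDED_PREFIXES):
            continue  # admin or login pages
        keep.append(page.url)
    return keep
```

A paginated duplicate such as `/shoes?page=2` with a canonical of `/shoes` would be dropped, while `/shoes` itself would be kept.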

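The priority value is relative to other pages on your own site, and Google's documentation says it ignores both priority and changefreq. The lastmod date is the most useful signal: when it is consistently accurate, it tells Google a page was updated and is worth recrawling.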

How to Submit Your Sitemap to Google

In Google Search Console, go to Sitemaps in the left nav. Enter your sitemap URL (usually /sitemap.xml or /sitemap_index.xml) and click Submit.

Google will show you how many URLs were submitted vs how many were indexed. If the numbers diverge significantly, investigate why — it usually means you have noindex tags on some URLs, canonical issues, or duplicate content.

Resubmit whenever you make significant structural changes to your site.

✓ DO
  • Include only canonical, indexable URLs in your sitemap
  • Update lastmod dates accurately when page content meaningfully changes
  • Use a sitemap index file to split large sites into multiple child sitemaps
  • Keep each individual sitemap file under 50,000 URLs and 50MB uncompressed
  • Submit your sitemap via Google Search Console and reference it in robots.txt

✗ DON'T
  • Include URLs with noindex tags or canonical tags pointing elsewhere
  • List redirect URLs (301 or 302); only include final destination URLs
  • Set fake or static lastmod dates that never update
  • Include session IDs, tracking parameters, or faceted navigation duplicates
  • Assume submitting a sitemap guarantees indexation of every listed URL
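One of the DON'Ts above, session IDs and tracking parameters, is easy to check automatically. The sketch below flags suspicious query parameters in a URL; the parameter names in `SUSPECT_PARAMS` and `SUSPECT_PREFIXES` are common examples chosen for illustration, not an exhaustive or official list.

```python
from urllib.parse import parse_qsl, urlsplit

# Illustrative lists only; extend them for your own analytics stack.
SUSPECT_PARAMS = {"sessionid", "sid", "phpsessid", "gclid", "fbclid"}
SUSPECT_PREFIXES = ("utm_",)


def suspect_params(url):
    """Return the query parameter names that look like tracking or session data."""
    found = []
    for name, _value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        lowered = name.lower()
        if lowered in SUSPECT_PARAMS or lowered.startswith(SUSPECT_PREFIXES):
            found.append(name)
    return found


flagged = suspect_params(
    "https://example.com/shoes?utm_source=news&color=red&PHPSESSID=abc"
)
```

Any URL for which this returns a non-empty list is a candidate for removal from the sitemap in favour of its clean, canonical version.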

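  • 50,000: max URLs per sitemap file (Google limit)
  • 50MB: max uncompressed sitemap file size
  • ~48hrs: typical Google discovery time after sitemap submission
  • 50,000: max sitemap files in a single sitemap index (500 is the number of sitemap index files you can submit per site in Search Console)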
XML SITEMAP HEALTH CHECKLIST
  • Sitemap is referenced in your robots.txt file via the Sitemap: directive
  • All listed URLs return a 200 HTTP status code
  • No noindex pages are included in the sitemap
  • lastmod values reflect actual content update dates, not server timestamps
  • Sitemap is submitted and verified in Google Search Console
  • Large sites use a sitemap index file linking to child sitemaps
  • Image or video sitemaps are used if multimedia discovery is a priority
  • Submitted URL count vs indexed URL count gap is investigated and understood
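The first checklist item can be verified with Python's built-in robots.txt parser. This is a minimal sketch that works on robots.txt content you have already fetched; the helper name `sitemaps_in_robots` and the sample file are made up for this example, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared via Sitemap: lines in robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present.
    return parser.site_maps() or []


# Hypothetical robots.txt content, for illustration only.
sample = """User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap_index.xml
"""
declared = sitemaps_in_robots(sample)
```

An empty result means the file has no Sitemap: directive, so crawlers that do not use Search Console will not learn about your sitemap from robots.txt.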
XML SITEMAP VS HTML SITEMAP
| XML Sitemap | HTML Sitemap |
| --- | --- |
| Intended audience: search engine crawlers | Intended audience: human visitors |
| Machine-readable XML format | Human-readable HTML page |
| Submitted directly to Google Search Console | Linked from site navigation or footer |
| Includes metadata: lastmod, priority, changefreq | No metadata, just hyperlinks |
| Not visible or useful to regular site visitors | Helps users navigate large sites |
| Critical for crawl efficiency on large sites | Modest SEO benefit; mainly UX-focused |
REAL-WORLD EXAMPLE
E-Commerce Site: Sitemap Index Structure

A mid-size e-commerce store with 80,000 product pages cannot fit all URLs in a single sitemap file (limit: 50,000). Their solution: a sitemap index file at /sitemap_index.xml referencing four child sitemaps — /sitemap-products-1.xml (50,000 URLs), /sitemap-products-2.xml (30,000 URLs), /sitemap-categories.xml (1,200 URLs), and /sitemap-blog.xml (800 URLs). Only the index URL is submitted to Google Search Console. Google crawls each child sitemap automatically. After submission, they noticed 12,000 product URLs were submitted but not indexed — investigation revealed those pages had thin content and were being ignored by Google despite being in the sitemap, prompting a content quality review.
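The split described in this example can be sketched in a few lines. The code below chunks 80,000 placeholder product URLs into files of at most 50,000 and builds the index that references them; the helper names and the example.com URLs are assumptions for illustration, and writing the child sitemap files themselves is left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit cited above


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(child_sitemap_urls):
    """Return sitemap index XML (bytes) listing each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for child in child_sitemap_urls:
        node = ET.SubElement(index, "sitemap")
        ET.SubElement(node, "loc").text = child
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Placeholder product URLs standing in for the store's catalogue.
products = [f"https://example.com/products/{i}" for i in range(80_000)]
chunks = chunk_urls(products)
child_files = [
    f"https://example.com/sitemap-products-{n}.xml"
    for n in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(child_files + [
    "https://example.com/sitemap-categories.xml",
    "https://example.com/sitemap-blog.xml",
])
```

Chunking by a fixed size keeps every child file under the limit automatically as the catalogue grows, so only the index needs to be submitted once.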

XML SITEMAP TERMINOLOGY
Sitemap Index

A parent XML file that lists multiple child sitemap files, used when a site exceeds 50,000 URLs or needs to segment sitemaps by content type.

lastmod

An XML sitemap element specifying when a URL was last meaningfully updated. Google uses it to prioritize recrawling of recently changed pages, provided the value is consistently accurate.

changefreq

An optional sitemap element suggesting how often a page changes (e.g., daily, weekly). Google's documentation says it ignores this value.

priority

An optional sitemap element with a relative value between 0.0 and 1.0 indicating a URL's importance within your site. Google's documentation says it ignores this value, so it has no ranking impact there.

Crawl Budget

The number of pages Googlebot will crawl on your site within a given timeframe. A clean, accurate sitemap helps allocate crawl budget toward your most important pages.


Frequently Asked Questions

Does an XML sitemap improve rankings?

Not directly. A sitemap helps Google find and index your pages faster, but it doesn't influence ranking signals. The SEO value is in faster indexation, especially for new content and large sites where discovery via crawling alone would be slow.

What is the difference between an XML sitemap and an HTML sitemap?

An XML sitemap is a machine-readable file for search engines. An HTML sitemap is a human-readable page linking to your site structure. It's less common now but can help users navigate large sites. For SEO, focus on the XML sitemap.

Sources & Further Reading
  • 1. Google Search Central: Sitemaps documentation
  • 2. Ahrefs Blog: Sitemap Best Practices