What is Crawl Budget?

Crawl budget is the number of pages Googlebot will crawl on your website within a given timeframe. It's determined by two factors: crawl rate limit (how fast Google can crawl without overloading your server) and crawl demand (how much Google wants to crawl your pages based on their popularity and freshness). Managing crawl budget is critical for large sites where not all pages get crawled and indexed in a reasonable time.

500ms: server response time threshold above which Googlebot crawls less.
Source: Google Search Central, 2024
Fact-checked against 3 sources
Last updated 8 June 2026

Key Takeaways

  • Crawl budget = crawl rate limit × crawl demand, so optimise both factors.
  • Mostly matters for sites with thousands of URLs or more; small sites are usually crawled in full by default.
  • Blocking low-value pages (tag archives, filtered URLs) frees budget for pages that matter.
  • Server response time is one of the most impactful crawl budget levers: aim for under 200ms and fix anything above 500ms.
  • Check your crawl stats in Google Search Console → Settings → Crawl Stats.

What Determines Your Crawl Budget?

Google allocates crawl budget based on two things: how fast your server can handle requests without slowing down, and how much value Google sees in your pages.

Crawl rate limit is Google's self-imposed throttle. If your server responds slowly, Googlebot backs off to avoid degrading your site's performance for real users. Faster servers = more crawling.

Crawl demand is driven by how popular your URLs are (backlinks, traffic) and how frequently they change. A popular homepage may be crawled daily. A deep archive page from 2012 might be crawled monthly, or never.

Signs You Have a Crawl Budget Problem

Most sites don't have a crawl budget problem. If you have fewer than a few hundred pages, Google will crawl all of them regularly.

You need to care about crawl budget if: new pages take weeks to appear in search results, you have thousands of URLs but low organic traffic, or the Crawl Stats report in Google Search Console shows a sudden drop in crawl requests or a spike in server errors.

The clearest signal is in Search Console's Crawl Stats report — compare daily crawl volume against your total page count. If Googlebot is visiting a fraction of your pages, you have a problem worth solving.
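
The comparison can be reduced to a quick calculation. The sketch below is illustrative only: the function name `crawl_coverage` and the sample figures (4,000 crawl requests a day, 120,000 pages) are assumptions, not numbers from Search Console.

```python
import math


def crawl_coverage(avg_daily_crawls: float, total_pages: int) -> dict:
    """Summarise how much of a site Googlebot could cover per day."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    daily_share = avg_daily_crawls / total_pages
    # Best case: every request hits a different URL. Real crawls revisit
    # URLs, so a full pass usually takes longer than this.
    if avg_daily_crawls <= 0:
        days_for_full_pass = math.inf
    else:
        days_for_full_pass = math.ceil(total_pages / avg_daily_crawls)
    return {"daily_share": daily_share, "days_for_full_pass": days_for_full_pass}


# Hypothetical figures read off the Crawl Stats report and a site crawl.
report = crawl_coverage(avg_daily_crawls=4_000, total_pages=120_000)
print(
    f"{report['daily_share']:.1%} of URLs per day, "
    f"at least {report['days_for_full_pass']} days for a full pass"
)
```

A low daily share is not a problem in itself; it becomes one when the pages being skipped are the ones you need indexed.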

How to Optimise Crawl Budget

The fastest wins come from blocking what shouldn't be crawled. Use robots.txt to block tag archives, filtered product pages, and admin URLs. Add noindex to thin pages you can't delete.
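
Before deploying new robots.txt rules, it helps to test them against URLs you do and do not want crawled. The following is a minimal sketch using Python's standard `urllib.robotparser`; the paths (`/tag/`, `/admin/`, `/shop/filter/`) and the `example.com` URLs are hypothetical. Note that this parser does simple prefix matching and does not implement the `*` and `$` wildcards Googlebot understands, so the sketch sticks to prefix rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; adjust the paths to your own URL structure.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
Disallow: /admin/
Disallow: /shop/filter/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for url in (
    "https://example.com/products/blue-dress",
    "https://example.com/tag/summer",
    "https://example.com/shop/filter/red",
):
    print(url, "->", "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked")
```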

Consolidate duplicate content with canonical tags. Fix redirect chains so Googlebot doesn't waste hops. Remove dead URLs from your sitemap — only include canonical, indexable pages.
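
Redirect chains are easy to find once you have a source-to-target map, for example exported from a crawler. This sketch assumes such a map already exists as a dictionary; `follow_chain`, `flatten` and the sample paths are illustrative names, not part of any tool.

```python
def follow_chain(redirects: dict[str, str], start: str, max_hops: int = 10) -> list[str]:
    """Return every URL visited from `start` to its final destination."""
    chain = [start]
    while chain[-1] in redirects:
        nxt = redirects[chain[-1]]
        if nxt in chain or len(chain) > max_hops:
            raise ValueError(f"redirect loop or overly long chain from {start}")
        chain.append(nxt)
    return chain


def flatten(redirects: dict[str, str]) -> dict[str, str]:
    """Point every redirecting URL straight at its final destination."""
    return {src: follow_chain(redirects, src)[-1] for src in redirects}


# Hypothetical redirect map: /old-shoes takes two hops to reach /shoes.
redirects = {
    "/old-shoes": "/shoes-2019",
    "/shoes-2019": "/shoes",
    "/sale": "/offers",
}

hops = len(follow_chain(redirects, "/old-shoes")) - 1
flat = flatten(redirects)
print(hops, flat["/old-shoes"])
```

The flattened map is what you would load back into your server or CMS so each legacy URL redirects in a single hop.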

On the server side: improve response time, reduce server errors, and ensure your most important pages are reachable within 3 internal links from the homepage.
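
Link depth can be checked with a breadth-first search over your internal link graph. The sketch below assumes you already have that graph as a dictionary of page to outgoing links (for instance from a crawler export); the URLs are made up for illustration.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str = "/") -> dict[str, int]:
    """Breadth-first search: minimum number of clicks from `home` to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link graph.
links = {
    "/": ["/shoes", "/blog"],
    "/shoes": ["/shoes/trainers"],
    "/shoes/trainers": ["/shoes/trainers/page-2"],
    "/shoes/trainers/page-2": ["/shoes/trainers/retro-runner"],
    "/blog": [],
}

depths = click_depths(links)
too_deep = sorted(url for url, depth in depths.items() if depth > 3)
print(too_deep)
```

Pages that never appear in `depths` are orphans: no internal link path reaches them from the homepage at all.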

CRAWL BUDGET FORMULA
Crawl Budget = Crawl Rate Limit × Crawl Demand

Crawl Rate Limit is the maximum speed Googlebot can crawl without overloading your server, measured in parallel connections and requests per second. Crawl Demand reflects how much Google prioritises your URLs based on popularity (backlinks, traffic) and freshness (how recently content changed). Both factors must be high to maximise the pages Googlebot processes in a given window. Treat the multiplication as shorthand: Google describes crawl budget as the set of URLs Googlebot can and wants to crawl, not as a literal product.

✓ DO

Submit an XML sitemap containing only canonical, indexable URLs

Use robots.txt to block faceted navigation, session IDs, and admin directories

Improve server response times (target TTFB under 200ms) to increase crawl rate limit

Keep your most important pages within 3 clicks of the homepage

Monitor the Crawl Stats report in Google Search Console weekly for anomalies

✗ DON'T

Include redirected or noindexed URLs in your XML sitemap

Block JavaScript or CSS files in robots.txt — Googlebot needs them to render pages

Let redirect chains exceed two hops, wasting crawl budget on intermediate URLs

Rely on crawl budget fixes alone if your content quality is the real indexing issue

Ignore 5xx server errors — they signal Googlebot to back off and crawl less frequently
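
The sitemap rules in both lists can be enforced when the file is generated. Here is a minimal sketch, assuming you have per-URL crawl data with status code, canonical target and noindex flag; the `Page` class and the sample URLs are hypothetical.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Page:
    url: str
    status: int
    canonical: str
    noindex: bool = False


def sitemap_worthy(page: Page) -> bool:
    """Keep only 200-status, indexable pages that are their own canonical."""
    return page.status == 200 and not page.noindex and page.canonical == page.url


def build_sitemap(pages: list[Page]) -> str:
    """Serialise the qualifying pages as a sitemap <urlset> document."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for page in pages:
        if sitemap_worthy(page):
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = page.url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical crawl export.
pages = [
    Page("https://example.com/shoes", 200, "https://example.com/shoes"),
    Page("https://example.com/old-shoes", 301, "https://example.com/shoes"),
    Page("https://example.com/shoes?sort=price", 200, "https://example.com/shoes"),
    Page("https://example.com/thin", 200, "https://example.com/thin", noindex=True),
]

print(build_sitemap(pages))
```

Only the first page survives the filter: the others are a redirect, a non-canonical variant and a noindexed page.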

SITES THAT NEED CRAWL BUDGET MANAGEMENT VS. SITES THAT DON'T

| Crawl Budget Is Critical | Crawl Budget Is Not a Priority |
| --- | --- |
| 10,000+ indexable URLs | Fewer than ~1,000 pages |
| New pages take weeks to appear in Search Console | New pages indexed within 1–3 days |
| Large e-commerce with faceted navigation | Small business or blog site |
| News or content sites publishing dozens of articles daily | Static brochure site updated monthly |
| GSC Crawl Stats show only a fraction of pages crawled per day | GSC shows consistent daily crawl coverage |
| Multiple redirect chains and legacy URL structures | Clean, flat URL architecture |

CRAWL BUDGET AUDIT CHECKLIST

  • Check Google Search Console Crawl Stats: compare average daily crawls to total page count
  • Audit your XML sitemap and remove redirected, noindexed, and 4xx URLs
  • Identify and flatten redirect chains longer than one hop
  • Block low-value URL patterns in robots.txt (filters, tags, pagination, session parameters)
  • Add noindex to thin or duplicate pages you cannot remove entirely
  • Review internal link depth and ensure key pages are reachable within 3 clicks
  • Measure server TTFB and resolve any response time issues above 500ms
  • Check the Page indexing report in GSC (formerly Coverage) for "Crawled - currently not indexed" pages as a waste signal
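
For the TTFB item, a handful of measurements can be graded against the two thresholds this guide uses (200ms target, 500ms ceiling). This is a rough sketch: `ttfb_verdict` is a hypothetical helper, the samples are invented, and using the median is a design choice to stop one slow request from skewing the result.

```python
from statistics import median


def ttfb_verdict(samples_ms: list[float], target_ms: float = 200, limit_ms: float = 500) -> str:
    """Grade the median TTFB against a target and a hard limit."""
    typical = median(samples_ms)
    if typical < target_ms:
        return "ok"
    if typical <= limit_ms:
        return "improve"
    return "fix first"


# Hypothetical TTFB samples in milliseconds for three page templates.
print(ttfb_verdict([120, 180, 150]))
print(ttfb_verdict([300, 450, 900]))
print(ttfb_verdict([600, 700, 800]))
```
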
REAL-WORLD EXAMPLE
How an E-commerce Site Recovered Indexing by Cutting 400,000 URLs

A large fashion retailer had over 600,000 URLs — the majority generated by faceted navigation (colour, size, and sort-order filter combinations). Googlebot was spending the bulk of its crawl budget on these parameter-driven pages, leaving thousands of new product and category pages unindexed for months. After blocking filter parameters in robots.txt and consolidating duplicate filtered pages with canonical tags, the crawlable URL pool dropped to under 200,000. Within eight weeks, Google Search Console showed a 3× increase in new product pages indexed, and organic traffic to new seasonal lines improved materially within the same quarter.
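
To see whether your own site has the same pattern, you can measure what share of crawled URLs carry filter parameters. A small sketch, assuming you have a list of URLs Googlebot requested (for example from server logs); the parameter names `colour`, `size` and `sort` and the sample URLs are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical faceted-navigation parameter names.
FILTER_PARAMS = {"colour", "size", "sort"}


def is_filter_url(url: str) -> bool:
    """True if the URL's query string contains any known filter parameter."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return any(name in FILTER_PARAMS for name in query)


def filter_share(crawled_urls: list[str]) -> float:
    """Fraction of crawled URLs that are filter variants."""
    if not crawled_urls:
        return 0.0
    return sum(is_filter_url(url) for url in crawled_urls) / len(crawled_urls)


# Hypothetical sample of URLs requested by Googlebot.
crawled = [
    "/dresses?colour=red&size=10",
    "/dresses?sort=price",
    "/dresses",
    "/dresses/linen-midi",
]

share = filter_share(crawled)
print(f"{share:.0%} of crawled URLs are filter variants")
```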

HOW GOOGLE'S APPROACH TO CRAWL BUDGET HAS EVOLVED
2009
Crawl-Delay Directive Introduced

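Some search engines, such as Bing and Yandex, honour the crawl-delay directive in robots.txt, giving site owners a basic mechanism to throttle crawlers and protect server performance. Googlebot has never supported crawl-delay; it adjusts its crawl rate automatically based on how your server responds.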

2017
Google Officially Defines 'Crawl Budget'

Google's Gary Illyes published the first official blog post on the topic, "What Crawl Budget Means for Googlebot". It explained crawl budget as crawl rate limit and crawl demand taken together (the URLs Googlebot can and wants to crawl), giving SEOs a formal framework for the first time.

2020
Crawl Stats Report Rebuilt in Search Console

Google launched a rebuilt Crawl Stats report in Search Console, surfacing daily crawl data, response codes, and file-type breakdowns. This made basic crawl budget analysis possible without log file analysis.

2022
Google Updates Crawl Budget Documentation for JavaScript

Google clarified that JavaScript-heavy pages go through a two-stage crawl-and-render process, so client-side rendered content can cost more crawl resources than server-side HTML.

Frequently Asked Questions

Does crawl budget affect rankings?

Not directly. Crawl budget affects indexation, which affects whether your pages can rank at all. If a page isn't crawled, it can't be indexed, and if it isn't indexed, it can't rank. So indirectly, yes: poor crawl budget management can suppress rankings on large sites.

How do I check my crawl budget?

Go to Google Search Console → Settings (bottom left) → Crawl Stats. You'll see average daily crawl volume, response time breakdown, and crawl requests by response code. Compare the daily crawl number against your total indexed page count.

Should I use robots.txt to block pages and save crawl budget?

Use robots.txt for URLs that genuinely waste crawl budget: parameter URLs, faceted navigation, session IDs. Keep in mind that robots.txt prevents crawling, not indexing: Google may still index a blocked URL if external links point to it. For pages you want out of the index, use noindex instead, and leave them crawlable so Googlebot can actually see the directive.
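
A common mistake is to combine the two: a noindex directive on a URL that robots.txt blocks is never seen, because Googlebot cannot fetch the page. The sketch below flags that conflict using the standard `urllib.robotparser`; the rules and URLs are hypothetical, and the parser only does prefix matching, so wildcard rules would need a different checker.

```python
from urllib.robotparser import RobotFileParser


def hidden_noindex(robots_txt: str, noindex_urls: list[str], agent: str = "Googlebot") -> list[str]:
    """Return noindexed URLs that robots.txt blocks, so the noindex goes unseen."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in noindex_urls if not parser.can_fetch(agent, url)]


# Hypothetical rules and a list of pages tagged noindex.
robots_txt = """\
User-agent: *
Disallow: /tag/
"""
noindex_urls = [
    "https://example.com/tag/summer",
    "https://example.com/thin-page",
]

conflicts = hidden_noindex(robots_txt, noindex_urls)
print(conflicts)
```

For each URL it reports, pick one approach: unblock the URL so the noindex can take effect, or keep the block and drop the noindex.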

Sources & Further Reading
  1. Google Search Central, Crawl Budget documentation
  2. Screaming Frog SEO Spider documentation
  3. Advanced Web Ranking CTR Study, 2024