Distribution Readiness · Severity: medium

Sitemap Directive in robots.txt

A Sitemap: directive in robots.txt tells crawlers exactly where to find your sitemap, removing guesswork. AI and traditional search crawlers benefit alike. SaaSalyst checks that robots.txt contains at least one Sitemap: line and that the referenced URL resolves to a valid sitemap.

What SaaSalyst Checks

SaaSalyst fetches /robots.txt, parses it for Sitemap: directives, and verifies the referenced URL returns 200 with valid XML (root element <urlset> or <sitemapindex>). The check passes when both conditions hold. It warns when the directive exists but the URL fails. It fails when no directive exists at all.
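
For reference, the pass/warn/fail logic is simple to reproduce. Below is a minimal Python sketch (an illustration of the logic described above, not SaaSalyst's actual implementation); it assumes the third-party requests library and counts the check as a pass if any declared sitemap URL validates:

    import xml.etree.ElementTree as ET
    from urllib.parse import urljoin

    import requests  # third-party; pip install requests

    def check_sitemap_directive(site: str) -> str:
        """Return 'pass', 'warn', or 'fail' for the Sitemap: directive check."""
        robots = requests.get(urljoin(site, "/robots.txt"), timeout=10)
        if robots.status_code != 200:
            return "fail"  # no robots.txt means no directive to find

        # Collect every Sitemap: line; the directive name is case-insensitive.
        declared = [
            line.split(":", 1)[1].strip()
            for line in robots.text.splitlines()
            if line.strip().lower().startswith("sitemap:")
        ]
        if not declared:
            return "fail"  # robots.txt exists but declares no sitemap

        # Pass if any declared URL returns 200 with a <urlset> or
        # <sitemapindex> root element; otherwise warn.
        for url in declared:
            try:
                resp = requests.get(url, timeout=10)
                root = ET.fromstring(resp.content)
                tag = root.tag.rsplit("}", 1)[-1]  # strip XML namespace
                if resp.status_code == 200 and tag in ("urlset", "sitemapindex"):
                    return "pass"
            except (requests.RequestException, ET.ParseError):
                continue
        return "warn"  # directive present, but no URL resolved to valid XML

Calling check_sitemap_directive("https://yourdomain.com") mirrors the three outcomes above: fail with no directive, warn when the directive points at a dead or malformed URL, and pass when it resolves to valid sitemap XML.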

Why This Matters

Crawlers can find sitemaps in two ways: by probing well-known paths (/sitemap.xml, /sitemap_index.xml, /wp-sitemap.xml) or by reading robots.txt for a Sitemap: directive. The directive is more reliable because it works regardless of where the sitemap lives, and it documents intent.

For AI crawlers in particular, robots.txt is often the first file they fetch. A Sitemap: directive there means the crawler doesn't have to guess paths, and it avoids cases where a non-standard sitemap path (e.g., /sitemap-2024.xml) would otherwise be missed.

This check pairs with sitemap_xml_reachable: this one verifies your robots.txt declares a sitemap; the other verifies the sitemap itself is fetchable and valid.

How to Fix It

  1. Add a Sitemap: line to robots.txt pointing at your sitemap URL, e.g. Sitemap: https://yourdomain.com/sitemap.xml (see the example after this list).
  2. Multiple sitemaps are allowed — add one Sitemap: line per file, or use a sitemap index that references multiple sitemaps.
  3. The URL should be absolute (https://yourdomain.com/...), not relative (/sitemap.xml); some crawlers reject relative paths.
  4. Verify the sitemap URL returns 200 with <urlset> or <sitemapindex> XML — see the sitemap_xml_reachable check.
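
Putting steps 1 to 3 together, a minimal robots.txt that allows crawling and declares one sitemap looks like this (yourdomain.com is a placeholder):

    User-agent: *
    Allow: /

    Sitemap: https://yourdomain.com/sitemap.xml

The Sitemap: line is independent of User-agent groups, so it can appear anywhere in the file.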

Frequently Asked Questions

Do I need a Sitemap: directive if my sitemap is at /sitemap.xml?

Most crawlers will probe /sitemap.xml automatically, so it's not strictly required. But adding the directive makes intent explicit, supports non-standard paths, and is the canonical signal AI crawlers respect. SaaSalyst rates the directive as medium because the cost is one line and the upside is removing crawl ambiguity.

Can I have multiple Sitemap: directives?

Yes — and it's common. SaaSalyst accepts each Sitemap: line independently. Use multiple lines for separate sitemap files, or one line pointing at a sitemap index that references children.
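
Both patterns, with placeholder URLs, look like this. Either list each file on its own line:

    Sitemap: https://yourdomain.com/sitemap-pages.xml
    Sitemap: https://yourdomain.com/sitemap-blog.xml

Or declare a single index whose XML references the children:

    Sitemap: https://yourdomain.com/sitemap_index.xml

where sitemap_index.xml contains something like:

    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap>
        <loc>https://yourdomain.com/sitemap-pages.xml</loc>
      </sitemap>
      <sitemap>
        <loc>https://yourdomain.com/sitemap-blog.xml</loc>
      </sitemap>
    </sitemapindex>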

Why is this paired with sitemap_xml_reachable?

Because they answer two different questions: "is the sitemap declared in robots.txt?" and "is the sitemap actually fetchable and valid?" Both must hold for crawlers to use it. SaaSalyst surfaces both checks so you can fix the right thing.

Check Your SaaS Now | Free

SaaSalyst scans your website in 30 seconds and checks for Sitemap Directive in robots.txt along with 101+ other business readiness signals.
