“Missing from Site” identifies URLs listed in your sitemap.xml that weren’t found during the crawl. These are typically pages that return 404 errors, have been deleted, or are inaccessible—yet they’re still declared in your sitemap as valid pages.
Sitemap vs Reality
═══════════════════════════════════════════════════════════
sitemap.xml declares:          Actual site returns:
─────────────────────          ────────────────────
/about          ────────────→  200 OK ✓
/products       ────────────→  200 OK ✓
/old-product    ────────────→  404 Not Found ✗
/deleted-page   ────────────→  404 Not Found ✗
/typo-url       ────────────→  404 Not Found ✗
                               │
                               └─ These are "Missing from Site"
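The comparison in the diagram can be approximated with a short script. The following is a minimal sketch, not the crawler's actual logic: it assumes a standard sitemaps.org `<urlset>` file, and the function names (`sitemap_urls`, `http_status`, `missing_from_site`) and the `sitemap-check/0.1` user agent are made up for illustration. Some servers reject HEAD requests, so treat unexpected codes as a prompt to check manually.

```python
import xml.etree.ElementTree as ET
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value declared in a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def http_status(url, timeout=10):
    """Return the final HTTP status for url, or None if no response arrived.

    urlopen follows redirects, so a 301 that lands on a live page reports 200.
    """
    request = Request(url, method="HEAD",
                      headers={"User-Agent": "sitemap-check/0.1"})  # hypothetical UA
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code          # e.g. 404, 410, 403
    except OSError:
        return None              # DNS failure, refused connection, timeout


def missing_from_site(xml_text, get_status=http_status):
    """Map each sitemap URL that does not answer 200 to the status it returned."""
    results = {}
    for url in sitemap_urls(xml_text):
        status = get_status(url)
        if status != 200:
            results[url] = status
    return results
```

Passing `get_status` as a parameter keeps the comparison logic separate from the network call, so it can be exercised against a fixed status table before pointing it at a live site.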
Listing non-existent pages in your sitemap causes several problems:
| Issue | Impact |
| --- | --- |
| Wasted crawl budget | Search engines spend resources on dead pages |
| Poor indexing signals | Indicates poor site maintenance |
| Sitemap trust erosion | Search engines may deprioritize your sitemap |
| User experience | Users following sitemap-based links hit 404s |
| SEO authority loss | Broken pages can’t pass link equity |
Google recommends listing only canonical URLs that return a 200 status in your sitemap. Including 404s goes against that guidance and can reduce crawl efficiency.
WordPress Example:
─────────────────────────────────────────
Plugin auto-adds all posts to sitemap  ✓
Post is trashed                        ✓
Post is deleted                        ✓
Sitemap still references post          ✗ ← Plugin didn't update
<!-- These shouldn't be in production sitemap -->
<url>
  <loc>https://example.com/test-page-123</loc>
</url>
<url>
  <loc>https://example.com/staging-preview</loc>
</url>
<!-- REMOVE these entries from sitemap.xml -->

<!-- Deleted page - remove entirely -->
<url>
  <loc>https://example.com/deleted-page</loc>
  <lastmod>2024-01-15</lastmod>
</url>

<!-- Moved page - remove old URL (redirect handles SEO) -->
<url>
  <loc>https://example.com/old-url</loc>
  <lastmod>2024-01-15</lastmod>
</url>

<!-- ADD the new location if it moved -->
<url>
  <loc>https://example.com/new-url</loc>
  <lastmod>2024-12-01</lastmod>
</url>
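If the sitemap is a static file rather than plugin-generated, removing the dead entries can be scripted. This is a rough sketch under that assumption; `prune_sitemap` is a hypothetical helper, and it only handles a plain `<urlset>` file (not a sitemap index). Note that the standard-library parser discards XML comments and the output omits the `<?xml ...?>` declaration, so add it back when writing the file.

```python
import xml.etree.ElementTree as ET

SITEMAP_URI = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Serialize sitemap elements without an "ns0:" prefix.
ET.register_namespace("", SITEMAP_URI)


def prune_sitemap(xml_text, dead_urls):
    """Return the sitemap XML with every <url> whose <loc> is in dead_urls removed."""
    root = ET.fromstring(xml_text)
    dead = set(dead_urls)
    # Copy the list first so removing children does not disturb iteration.
    for url_el in list(root.findall(f"{{{SITEMAP_URI}}}url")):
        loc = url_el.find(f"{{{SITEMAP_URI}}}loc")
        if loc is not None and loc.text and loc.text.strip() in dead:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")
```

For a moved page, pass the old URL in `dead_urls` and make sure the new location is already listed; the script does not add entries.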
If you believe a page is incorrectly flagged, manually verify it exists by visiting the URL directly. Our crawler may have been blocked or rate-limited.
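A manual re-check can also be done from a script. The sketch below is illustrative only: `recheck` and `interpret_status` are hypothetical names, the user-agent string is a placeholder, and the mapping from status codes to advice reflects general HTTP conventions (403 and 429 commonly indicate blocking or throttling), not this crawler's internal rules.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def recheck(url, timeout=10):
    """Fetch url once and return its HTTP status, or None if no response arrived."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0 (manual-recheck sketch)"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code
    except OSError:
        return None  # DNS failure, refused connection, timeout


def interpret_status(status):
    """Suggest what a re-check result means for a URL flagged as missing."""
    if status is None:
        return "unreachable: network or timeout problem, retry later"
    if status == 200:
        return "page exists: the crawler was probably blocked or rate-limited"
    if status in (401, 403, 429):
        return "refused or throttled: the page may exist but rejects automated requests"
    if status in (404, 410):
        return "genuinely missing: remove the URL from the sitemap"
    return f"unexpected status {status}: inspect manually"
```

For example, `interpret_status(recheck("https://example.com/old-product"))` returns a one-line suggestion for that URL.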