Question 1

What is a sitemap extractor?

Accepted Answer

A sitemap extractor is a tool that reads an XML sitemap file and returns a plain list of every page URL inside it. A sitemap lists each page as a `<url>` entry whose address sits in a `<loc>` tag, alongside optional metadata such as `lastmod`, `changefreq`, and `priority`; the extractor parses that XML and strips everything down to the raw URLs so you can copy, filter, or feed them into an SEO crawler. This tool also follows sitemap index files (sitemaps that link to other sitemaps) automatically, which you would otherwise have to do by hand when working from view-source.
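
To make the idea concrete, here is a minimal sketch of that extraction step in Python, using only the standard library. It is an illustration, not this tool's actual implementation: the function name `extract_urls` and the sample sitemap are made up, and the code assumes the document declares the standard sitemaps.org 0.9 namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (assumed to be present).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(xml_text: str) -> list[str]:
    """Return the text of every <url><loc> element, in document order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text  # skip empty <loc/> elements
    ]


# Hypothetical sample sitemap for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/blog/hello</loc>
  </url>
</urlset>"""

if __name__ == "__main__":
    print("\n".join(extract_urls(SAMPLE)))
```

The metadata tags are simply ignored here; only the `<loc>` values survive, which is the "plain list" behaviour described above.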

Question 2

What's the difference between a sitemap extractor, scraper, and parser?

Accepted Answer

They overlap, but there's a useful distinction. A sitemap **parser** reads the XML structure and hands back structured data (URL plus `lastmod`, `changefreq`, `priority`). A sitemap **extractor** specifically pulls out the URL list — the most common use case. A sitemap **scraper** usually implies crawling the live pages after extraction to collect additional data like titles or status codes. This tool is an extractor with a built-in parser: you get the URLs plus their metadata, not a full crawl.
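
The parser side of that distinction can be sketched as follows. This is a hedged example rather than the tool's real code: `SitemapEntry` and `parse_entries` are hypothetical names, and treating `priority` as a float is a choice made for this sketch.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

# Clark-notation prefix for the sitemaps.org 0.9 namespace.
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


@dataclass
class SitemapEntry:
    """One <url> entry; every field except loc is optional in the protocol."""
    loc: str
    lastmod: Optional[str] = None
    changefreq: Optional[str] = None
    priority: Optional[float] = None


def _text(parent: ET.Element, tag: str) -> Optional[str]:
    """Return the stripped text of a child tag, or None if missing or empty."""
    value = parent.findtext(f"{NS}{tag}")
    return value.strip() if value and value.strip() else None


def parse_entries(xml_text: str) -> list[SitemapEntry]:
    """Parse a <urlset> document into structured entries."""
    entries = []
    for url in ET.fromstring(xml_text).iter(f"{NS}url"):
        loc = _text(url, "loc")
        if loc is None:
            continue  # an entry without <loc> is not usable
        priority = _text(url, "priority")
        entries.append(SitemapEntry(
            loc=loc,
            lastmod=_text(url, "lastmod"),
            changefreq=_text(url, "changefreq"),
            priority=float(priority) if priority is not None else None,
        ))
    return entries


# Hypothetical sample data.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/pricing</loc>
    <lastmod>2024-03-02</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

if __name__ == "__main__":
    for entry in parse_entries(SAMPLE):
        print(entry)
```

An extractor in the narrow sense would then be `[e.loc for e in parse_entries(xml_text)]`; a scraper would go on to request each of those URLs.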

Question 3

How do I download a sitemap as a list of URLs?

Accepted Answer

Paste the sitemap URL (or just the domain) into the input above and click **Extract**. Once the URLs load, click **Export** to download them as a `.txt` file — one URL per line — or **Copy All** to push them to your clipboard for pasting into Excel, Google Sheets, or Screaming Frog. You can filter the list first if you only want URLs matching a path like `/blog/` or `/products/`.
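
If you prefer to script the same filter-then-export step, a rough Python equivalent might look like the sketch below. The helper names `filter_by_path` and `to_txt` are invented for this example, and matching on a path prefix is an assumption; the tool's own filter may match differently.

```python
from urllib.parse import urlsplit


def filter_by_path(urls: list[str], prefix: str) -> list[str]:
    """Keep only URLs whose path component starts with the given prefix."""
    return [u for u in urls if urlsplit(u).path.startswith(prefix)]


def to_txt(urls: list[str]) -> str:
    """Format URLs as plain text, one URL per line."""
    return "".join(f"{u}\n" for u in urls)


# Hypothetical extracted list.
URLS = [
    "https://example.com/",
    "https://example.com/blog/first-post",
    "https://example.com/products/widget",
    "https://example.com/blog/second-post",
]

if __name__ == "__main__":
    blog_only = filter_by_path(URLS, "/blog/")
    print(to_txt(blog_only), end="")
```

To save the result instead of printing it, `pathlib.Path("urls.txt").write_text(to_txt(blog_only), encoding="utf-8")` writes the same one-per-line text to a file.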

Question 4

Does this tool handle XML sitemap index files?

Accepted Answer

Yes. If you paste a sitemap index (a sitemap that references other sitemaps — common on large sites, ecommerce stores, and WordPress installs using Yoast's `sitemap_index.xml`), the extractor detects the nested structure, fetches each child sitemap, and merges all URLs into a single list. You don't have to extract each child sitemap separately.
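
The detect, fetch, and merge steps can be sketched in Python as below. This is one plausible way to do it, not a description of the tool's internals: `collect_urls` is a hypothetical name, the `fetch` callable is injected so the example runs without network access, and the in-memory `FAKE_SITE` stands in for real HTTP responses.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def collect_urls(sitemap_url: str,
                 fetch: Callable[[str], str],
                 seen: Optional[set] = None) -> list[str]:
    """Return page URLs from a sitemap, following sitemap indexes recursively."""
    seen = set() if seen is None else seen
    if sitemap_url in seen:  # guard against indexes that reference each other
        return []
    seen.add(sitemap_url)

    root = ET.fromstring(fetch(sitemap_url))
    if root.tag == f"{NS}sitemapindex":
        merged = []
        for loc in root.findall(f"{NS}sitemap/{NS}loc"):
            if loc.text:
                merged.extend(collect_urls(loc.text.strip(), fetch, seen))
        return list(dict.fromkeys(merged))  # de-duplicate, keep order
    return [loc.text.strip()
            for loc in root.findall(f"{NS}url/{NS}loc") if loc.text]


# Hypothetical site: one index pointing at two child sitemaps.
XMLNS = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"'
FAKE_SITE = {
    "https://example.com/sitemap_index.xml":
        f"<sitemapindex {XMLNS}>"
        "<sitemap><loc>https://example.com/post-sitemap.xml</loc></sitemap>"
        "<sitemap><loc>https://example.com/page-sitemap.xml</loc></sitemap>"
        "</sitemapindex>",
    "https://example.com/post-sitemap.xml":
        f"<urlset {XMLNS}>"
        "<url><loc>https://example.com/blog/a</loc></url>"
        "<url><loc>https://example.com/blog/b</loc></url>"
        "</urlset>",
    "https://example.com/page-sitemap.xml":
        f"<urlset {XMLNS}>"
        "<url><loc>https://example.com/about</loc></url>"
        "</urlset>",
}

if __name__ == "__main__":
    index = "https://example.com/sitemap_index.xml"
    print("\n".join(collect_urls(index, FAKE_SITE.__getitem__)))
```

In real use, `fetch` would be a function that downloads the URL and returns the response body, with timeouts and error handling that this sketch leaves out.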

Question 5

What if I don't know the sitemap URL for a website?

Accepted Answer

Paste the bare domain (e.g. `example.com`) and the extractor will try `/sitemap.xml` first, then check `/robots.txt` for a `Sitemap:` directive pointing to a non-standard location. If neither check finds a sitemap, which happens on sites that use custom sitemap paths without declaring them in `robots.txt`, use our [Sitemap Finder](/sitemap-finder), which probes a wider list of common locations (`/sitemap_index.xml`, `/sitemap1.xml`, `/wp-sitemap.xml`, and more).
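
The lookup order described above can be approximated with the following Python sketch. The names `sitemaps_from_robots` and `discover_sitemaps` are hypothetical, as are the assumptions that the site is served over `https` and that `fetch` returns `None` when a URL cannot be retrieved; the tool itself may behave differently.

```python
from typing import Callable, Optional


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the value of every 'Sitemap:' line in a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        # Split at the first colon only, so the URL's own "https:" stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def discover_sitemaps(domain: str,
                      fetch: Callable[[str], Optional[str]]) -> list[str]:
    """Try /sitemap.xml first, then fall back to robots.txt directives."""
    base = f"https://{domain}"  # assumption: the site is reachable over https
    default = f"{base}/sitemap.xml"
    if fetch(default) is not None:
        return [default]
    robots = fetch(f"{base}/robots.txt")
    return sitemaps_from_robots(robots) if robots else []


# Hypothetical site with no /sitemap.xml but a robots.txt directive.
ROBOTS = (
    "User-agent: *\n"
    "Disallow: /admin/\n"
    "Sitemap: https://example.com/custom/sitemap-main.xml\n"
)
FAKE_SITE = {"https://example.com/robots.txt": ROBOTS}

if __name__ == "__main__":
    print(discover_sitemaps("example.com", FAKE_SITE.get))
```

An empty result corresponds to the case where neither location yields a sitemap and a broader probe is needed.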

Extract URLs from Sitemap

Extract Every URL From Any XML Sitemap

How to Extract URLs From a Sitemap

1. Use the extractor above (fastest)

2. View-source and grep the `<loc>` tags

3. Use `curl` and a regex one-liner

What the Extractor Handles

How to Find a Website's Sitemap URL

Why Extract URLs From a Sitemap

Frequently Asked Questions