Robots.txt Generator
Robots.txt Generator creates a correctly formatted robots.txt file — the plain-text file at the root of your site that tells search engine crawlers which paths they may and may not request. Choose a crawling policy, list the paths to block, point to your sitemap, and copy the result.
A well-formed robots.txt keeps crawlers out of admin panels, carts and faceted-search URLs while making sure your important pages stay crawlable, and it advertises your sitemap so engines discover your URLs faster.
User-agent: *
Disallow: /admin/
Disallow: /cart/
Disallow: /*?
Save this as robots.txt in your site's root so it is served at https://yourdomain.com/robots.txt.
How to use Robots.txt Generator
1. Pick a crawling policy
Allow the whole site and block specific paths, or block the entire site while it is in development.
2. List paths to disallow and add your sitemap
Enter one path per line for each area crawlers should skip, and paste your sitemap URL so engines can find it.
3. Save it as robots.txt at your site root
Copy the output and upload it as robots.txt so it is served at https://yourdomain.com/robots.txt.
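The three steps above can be sketched in a few lines of Python. This is a minimal illustration of the kind of output the generator produces, not its actual implementation: the function name build_robots_txt, its parameters, and the sitemap URL https://yourdomain.com/sitemap.xml are all assumptions made for the example.

```python
def build_robots_txt(disallow_paths, sitemap_url=None, block_all=False, user_agent="*"):
    """Return robots.txt text for a single user-agent group (hypothetical helper)."""
    lines = [f"User-agent: {user_agent}"]
    if block_all:
        # Policy "block the entire site", e.g. while it is in development.
        lines.append("Disallow: /")
    else:
        # Policy "allow the site, block specific paths": one Disallow line per path.
        paths = [p.strip() for p in disallow_paths if p.strip()]
        if paths:
            lines.extend(f"Disallow: {p}" for p in paths)
        else:
            # An empty Disallow value means "allow everything".
            lines.append("Disallow:")
    if sitemap_url:
        # The Sitemap line is independent of the user-agent group above it.
        lines.append("")
        lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


# Example input: the sitemap URL is an assumed placeholder.
robots_txt = build_robots_txt(
    ["/admin/", "/cart/", "/*?"],
    sitemap_url="https://yourdomain.com/sitemap.xml",
)
print(robots_txt)
```

Writing the returned string to a file named robots.txt and uploading it to the site root corresponds to step 3.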
What robots.txt does — and does not do
robots.txt is a crawling directive, not a security or privacy control. Compliant crawlers like Googlebot read it and obey the Allow and Disallow rules, which keeps them from spending crawl budget on low-value sections. But the file is public, and badly behaved bots can ignore it, so never use it to hide confidential URLs; protect those with authentication instead.
Importantly, Disallow stops a page from being crawled, not necessarily from being indexed: if other sites link to a blocked URL, Google can still list it without a description. To reliably keep a page out of search results, allow it to be crawled and add a noindex robots meta tag, or protect it behind a login.
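To see the crawling side of this from a compliant crawler's point of view, the sketch below feeds a small rule set to Python's standard-library urllib.robotparser and asks whether two URLs may be fetched. The rule set and the URLs are assumed examples, and may_crawl is a hypothetical helper. Only plain prefix rules are used, because this parser's handling of wildcards may differ from that of the major search engines.

```python
from urllib.robotparser import RobotFileParser

# Assumed example rules, using plain path prefixes only.
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def may_crawl(url, user_agent="Googlebot"):
    """Return True if the parsed rules let this user agent request the URL."""
    return parser.can_fetch(user_agent, url)


print(may_crawl("https://yourdomain.com/admin/login"))  # False
print(may_crawl("https://yourdomain.com/blog/post"))    # True
```

The answer only describes what a well-behaved crawler would request. It says nothing about whether a blocked URL ends up indexed, and nothing stops a bot that ignores the file.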
Syntax essentials
Each block starts with a User-agent line naming the crawler the rules apply to ("*" means all crawlers), followed by Disallow and optional Allow lines listing path prefixes. An empty Disallow value means "allow everything". Paths are case-sensitive and matched from the start of the URL path; the * wildcard and $ end-anchor are supported by major engines.
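The matching rules in this section can be illustrated with a small Python sketch. It translates a rule path into a regular expression under the assumptions stated above (case-sensitive, matched from the start of the path, * as a wildcard, a trailing $ as an end anchor). The helper rule_matches is hypothetical, and real crawlers may differ in edge cases such as percent-encoding.

```python
import re


def rule_matches(rule_path, url_path):
    """Return True if a robots.txt rule path matches the given URL path."""
    if not rule_path:
        # An empty value matches nothing, so an empty Disallow blocks nothing.
        return False
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape regex metacharacters, then turn each escaped * back into a wildcard.
    regex = re.escape(rule_path).replace(r"\*", ".*")
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start of the path and is case-sensitive by default.
    return re.match(regex, url_path) is not None


print(rule_matches("/admin/", "/admin/login"))                 # True: prefix match
print(rule_matches("/Admin/", "/admin/login"))                 # False: case differs
print(rule_matches("/*?", "/search?q=shoes"))                  # True: any URL with a query string
print(rule_matches("/*.pdf$", "/docs/guide.pdf"))              # True: path ends in .pdf
print(rule_matches("/*.pdf$", "/docs/guide.pdf?download=1"))   # False: $ requires the end
```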
The Sitemap directive is independent of user-agent and gives the absolute URL of your sitemap — include it so engines can discover all your pages. Place the finished file at the very root of the domain; a robots.txt in a subfolder is ignored.
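As a small illustration of the placement rule, the following Python sketch derives the one location where crawlers look for the file from any page URL on the site. The function name robots_txt_url and the example page URL are assumptions for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the root-level robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, and replace path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yourdomain.com/blog/2024/post?ref=x"))
# https://yourdomain.com/robots.txt
```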
Frequently asked questions
- Where does the robots.txt file go?
- At the root of your domain, served at https://yourdomain.com/robots.txt. Crawlers only look there — a robots.txt placed in a subdirectory has no effect.
- Does Disallow remove a page from Google?
- Not reliably. Disallow blocks crawling, but a blocked URL can still be indexed if other pages link to it. To keep a page out of results, let it be crawled and add a noindex meta tag, or require login.
- Should I block my whole site with Disallow: /?
- Only for staging or development sites you do not want crawled yet. On a live site, blocking everything stops compliant crawlers from fetching any of your pages and will cost you your search visibility, so remove that rule before launch.