Robots.txt in 2026: how to create, test, and avoid breaking indexation
A practical robots.txt guide: what to block, what not to block, why robots.txt does not remove pages from Google, and how to connect it to your sitemap.
Robots.txt decision tree
Robots.txt manages crawling. It is not a privacy or deindexing tool.
Robots.txt is a small file with large risk. One line can block important pages from crawling, while another can leave filters, parameters, and technical folders open for crawl waste.
The main rule: robots.txt controls crawling; it does not guarantee removal from search. Google explicitly warns against using robots.txt as a way to hide web pages from search results.
What robots.txt is good for
| Job | Use robots.txt? | Note |
|---|---|---|
| Reduce crawl waste | yes | filters, parameters, technical folders |
| Block CSS/JS | usually no | Google needs these files to render the page |
| Remove a page from index | no | use noindex or remove the URL |
| Point to sitemap | yes | useful for search engines |
| Hide private data | no | use authentication, not robots |
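As the table notes, removal from the index is a job for noindex, not robots.txt. The page must carry the directive itself, and it must stay crawlable: if robots.txt blocks the URL, Google never fetches the page and never sees the noindex. A standard form for HTML pages (for non-HTML files, the equivalent is an `X-Robots-Tag: noindex` HTTP response header):

```html
<!-- In the page's <head>: ask search engines not to index this URL -->
<meta name="robots" content="noindex">
```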
Basic example
```
User-agent: *
Disallow: /wp-admin/
Disallow: /*?sort=
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/sitemap.xml
```
Before publishing, check that you are not blocking service pages, blog pages, images, CSS, or JavaScript needed for rendering.
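One quick pre-publish check can be scripted with Python's standard library. A minimal sketch with hypothetical rules and URLs; note that `urllib.robotparser` follows the original robots.txt spec, so Google-specific wildcard patterns like `/*?sort=` need a dedicated tester instead:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration (no wildcards, which
# urllib.robotparser does not interpret the way Google does).
rules = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /search/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Pages you rely on should stay crawlable; blocked paths should not.
print(parser.can_fetch("*", "https://example.com/blog/post"))          # True
print(parser.can_fetch("*", "https://example.com/wp-admin/users.php")) # False
```

Running a handful of known-important URLs through a check like this after every robots.txt change is a cheap way to catch an accidental site-wide block.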
Common competitor mistakes
- blocking the entire site with `Disallow: /`;
- blocking theme resources and hurting rendering;
- blocking URLs and expecting them to disappear from the index;
- forgetting the Sitemap line;
- not re-checking robots.txt after a redesign or migration.
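Several of the mistakes above can be caught with a small lint pass over the file before deploying. A minimal sketch; the checks and warning messages are illustrative, not an official tool:

```python
def audit_robots(text: str) -> list[str]:
    """Flag common robots.txt mistakes; returns a list of warnings."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    warnings = []

    # Site-wide block: the single most destructive line.
    if "disallow: /" in (line.lower() for line in lines):
        warnings.append("Disallow: / blocks the entire site")

    # Missing sitemap reference.
    if not any(line.lower().startswith("sitemap:") for line in lines):
        warnings.append("no Sitemap: line")

    # Rules that may block rendering resources (themes, CSS, JS).
    for line in lines:
        low = line.lower()
        if low.startswith("disallow:") and any(
            hint in low for hint in ("/wp-content/", ".css", ".js")
        ):
            warnings.append(f"may block rendering resources: {line}")

    return warnings


bad = "User-agent: *\nDisallow: /\nDisallow: /wp-content/themes/"
print(audit_robots(bad))  # three warnings for the file above
```

Hooking a check like this into a deploy pipeline means a redesign or migration cannot silently ship a broken robots.txt.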
Next step: if you are unsure, start with a technical audit, check the site in UNmiss, and add sem.chat so users do not get stuck after arriving from search.