Google has proposed an official internet standard for the rules included in robots.txt files.
Those rules, outlined in the Robots Exclusion Protocol (REP), have been an unofficial standard for the past 25 years.
While the REP has been adopted by search engines, it's still not official, which means it's open to interpretation by developers. Further, it has never been updated to cover today's use cases.
It’s been 25 years, and the Robots Exclusion Protocol never became an official standard. While it was adopted by all major search engines, it didn’t cover everything: does a 500 HTTP status code mean that the crawler can crawl anything or nothing?
— Google Webmasters (@googlewmc) July 1, 2019
As Google says, this creates a challenge for website owners because the ambiguously written de facto standard makes it difficult to write the rules correctly.
To eliminate this challenge, Google has documented how the REP is used on the modern web and submitted it to the Internet Engineering Task Force (IETF) for review.
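To make the idea of robots.txt rules concrete, here is a minimal sketch of how a crawler might check them, using Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` user agent, the example.com URLs, and the `is_allowed` helper are all made up for illustration. This shows how one parser reads the rules, not how Google or any other search engine interprets them, which is exactly the kind of difference a formal standard is meant to remove.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this parser considers `url` crawlable for `user_agent`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" and the URLs below are assumptions, not real crawlers or pages.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/private/report.html"))  # False
print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/blog/post.html"))  # True
```

The sketch only covers the case where the file was fetched successfully. What a crawler should do when the request for robots.txt itself fails, for example with a 500 status code, is the sort of question the tweet above raises.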
What do you think?