I stumbled upon the “Heuristic/ Available/Active Opensearch System”.
I think the OpenSearch integration is a phenomenal idea, since it reduces the need to index websites that offer it. What I don’t get is why the address of the OpenSearch endpoint must be allowed in the robots.txt. That kind of defeats the purpose.
Wikipedia, for example, has an OpenSearch integration, but the path to it is not explicitly allowed in its robots.txt, while the parent path is disallowed.
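To make the situation concrete, here is a minimal sketch of how such a robots.txt check plays out, using Python's standard `urllib.robotparser`. The host `wiki.example`, the rule sets, and the helper `is_allowed` are made up for illustration; they are not copied from Wikipedia's actual robots.txt, and this is not necessarily how the feature itself performs the check.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Assumed rules: the parent path of the search endpoint is disallowed
# and the endpoint itself is not mentioned.
ROBOTS_PARENT_DISALLOWED = """\
User-agent: *
Disallow: /w/
"""

# Assumed rules: same as above, but the endpoint is explicitly allowed.
# The Allow line comes first because this parser applies the first
# matching rule.
ROBOTS_ENDPOINT_ALLOWED = """\
User-agent: *
Allow: /w/api.php
Disallow: /w/
"""

# Hypothetical OpenSearch-style query URL below the disallowed parent path.
OPENSEARCH_URL = "https://wiki.example/w/api.php?action=opensearch&search=test"

if __name__ == "__main__":
    print(is_allowed(ROBOTS_PARENT_DISALLOWED, OPENSEARCH_URL))  # False
    print(is_allowed(ROBOTS_ENDPOINT_ALLOWED, OPENSEARCH_URL))   # True
```

With the first rule set the search endpoint is treated like any other blocked page, which is exactly the case I am wondering about.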
Now it could be argued that with that robots.txt Wikipedia is saying they don’t want the OpenSearch endpoint to be used by a crawler, and that would probably be correct. But using the search instead of a crawler is IMO something completely different (and it is the approach I would take).
What are your thoughts on that?