Wow, this is fantastic.
One question, to start with.
There seems to be a discrepancy between the YaCy list of OAI-PMH servers here:
http://localhost:8090/IndexImportOAIPMH_p.html
which totals 7081 servers in all, and the list provided by openarchives.org, which currently has only 4224.
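For what it's worth, here is a rough sketch of how I would compare the two lists without clicking on anything. It assumes each list has already been copied out by hand into a plain Python list of base URLs; the function names `normalize` and `compare_lists` are my own and are not part of YaCy or openarchives.org.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a base URL to host + path so trivial variants compare equal.

    Assumption: scheme (http vs https), host case, and a trailing slash
    do not distinguish two OAI-PMH servers.
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    path = parts.path.rstrip("/")
    return host + path


def compare_lists(yacy_urls, registry_urls):
    """Return which normalized URLs appear in only one list or in both."""
    yacy = {normalize(u) for u in yacy_urls if u.strip()}
    registry = {normalize(u) for u in registry_urls if u.strip()}
    return {
        "only_in_yacy": sorted(yacy - registry),
        "only_in_registry": sorted(registry - yacy),
        "in_both": sorted(yacy & registry),
    }
```

Calling `compare_lists(yacy_urls, registry_urls)` on the two copied lists would show how much of the difference comes from entries that exist only in the YaCy list.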
Also, I might report something of a bug, or annoyance.
While attempting to compare the two lists, I was trying to highlight a URL in the YaCy list and suddenly a download started. I was a little alarmed because I did not intend this, had no idea how big the file might be and could find no way to cancel the process. I ended up just letting it run. I have no idea what exactly I downloaded but the files do not seem to have been too big.
I’m not sure why the download started, as I was only trying to highlight the link. The “bug” that seems like it could be a bigger problem is that, once the download finished, I was transferred to another page showing the results of the download.
I hit the back button in the browser (Firefox) to get back to the list I was reviewing and the same download started all over again!
So now it appears that I have the same index downloaded twice, but with two different “token” identifiers.
It seems like this could be a bigger problem if I had downloaded numerous files and then hit the back button by mistake.
Does this sort of thing result in duplicate files in the index?
Apparently Google does not support the OAI-PMH format.
So is the YaCy list more extensive than the openarchives.org list, or does it perhaps include outdated, unofficial, or no-longer-existent resources?
It is also difficult to assess what a resource might contain.
I was perhaps lucky to have accidentally transferred a relatively small resource. What if I had accidentally or unknowingly clicked on, or started a download of, a 50-gigabyte file or something?
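One way to peek at a server before harvesting it might be to query it directly. OAI-PMH defines the verbs `Identify`, `ListSets` and `ListIdentifiers`, and a `resumptionToken` element may carry an optional `completeListSize` attribute. The sketch below only builds the request URL and reads that attribute from a response that has already been fetched. The helper names are hypothetical, this is not how YaCy does it internally, and the count is a number of records, not bytes.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace of OAI-PMH 2.0 responses, in ElementTree's {uri}tag notation.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def probe_url(base_url, verb="Identify", **params):
    """Build an OAI-PMH request URL, e.g. <base>?verb=Identify."""
    query = urlencode({"verb": verb, **params})
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + query


def complete_list_size(response_xml):
    """Return the advertised record count, or None if the server omits it.

    completeListSize is optional in the protocol, so many servers will
    not report it at all.
    """
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None:
        return None
    size = token.get("completeListSize")
    return int(size) if size is not None else None
```

For example, `probe_url(base, "ListIdentifiers", metadataPrefix="oai_dc")` gives a URL that can be opened in a browser tab, and the first page of the response can then be passed to `complete_list_size` to get a rough idea of how many records a full import would pull in.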
Anyway, the fact that YaCy supports this kind of open index/resource sharing blows my mind!
What about mod_oai?