I’ve been trying to use YaCy for many years, but it has never worked well. In fact, you can’t use YaCy with the default installation of a few GB, even with a lot of memory on your computer; not just anybody can install and use it the way the homepage says.
You need a lot of memory. If you want millions of URLs, YaCy needs many GB of memory (not the default YaCy memory settings) before you can even expect it not to crash. And the same problems always come back: once you have “a lot” of URLs (thousands, not even millions), crawling becomes very slow. I’ve tried changing a lot of settings and nothing changes; even setting the crawl delay to 0 ms in the advanced configuration never changes anything, and the other parameters do nothing. It can crash, and when you restart you need to delete the freeworld index folder, because otherwise it always crashes again…
Another problem is that search is slow: when I look at the search results in the logs, it takes 10 or even 15 seconds for 10 or 11 results found (!).
I’ve lost a lot of time trying to make YaCy work. I use a 1 TB SSD just for YaCy and it changes nothing: after a few thousand pages are indexed it becomes very slow. In fact, YaCy is made for very small indexes or very powerful workstations.
And there are a lot of other problems that have already been reported multiple times in forums and messages about YaCy but never solved!
It needs to be completely rewritten to be easy to use on ordinary computers, as the YaCy homepage claims. You can install it easily, but it will not work, at least not on personal computers.
I understand your frustration; I have a very similar one. I’ve spent the last 2 years experimenting with YaCy and it is still unsatisfactory.
Memory is IMHO the biggest problem of YaCy. I solved it personally by throwing a huge amount of RAM at YaCy and using an external Solr server on the same machine, and it works so-so now.
Still, it is the only decentralised search engine we have, so I decided to go on. A rewrite in some more efficient language is something I dream of, but I’m not able to do it by myself. Do you have some programming skills?
Probably the main problem is the lack of support; issues stay unresolved for quite a long time. And the lack of developers! How many of them are here, actually? @Orbiter is probably caught up in a new project; thkoch contributed recently, but not systematically. Who has repository rights on GitHub? Are there any other developers around?
On at least one point I have to disagree with you: some things did get fixed, at least in the GitHub repo. If you’re able to build YaCy yourself from source, you might be surprised to find some things solved (even some memory and efficiency issues).
It seems to me that YaCy is an amazing one-man project that is being left to fall into oblivion because development was not handed over to the community when its author moved on to another project. If there is any actual dev community, that is. Am I wrong?
Same from my side. I have also spent the last months with YaCy.
I will stay with YaCy because of the decentralized architecture. It’s an alternative to the big players, and YaCy returns search results that other search engines would never return.
But yes, YaCy is resource hungry. For now I have a dedicated server with a lot of RAM and CPU, and of course I spent a lot of time optimizing.
But you cannot compare it to the big players.
For me it’s OK now: indexing runs smoothly, postprocessing is running, and I regularly recrawl the pages. The result is not the best, but also not the worst.
Test it, I have published my peer via reverse proxy (TLS 1.3 and HTTP/2 support) on
I hope the project goes on and the community grows.
I had a Pentium 4 running YaCy about 10 years ago, and yes, the search was slow.
If you crawl with a high depth then yes, you need lots of memory.
YaCy will only crawl at 120 ppm (pages per minute) per domain or site, in order not to overload the site.
I have had the crawler slow down, but check the logs: it’s usually waiting for time to expire before continuing.
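To put a number on that waiting: here is a rough back-of-the-envelope sketch in Python, assuming the 120 ppm per-domain limit mentioned above really is the only bottleneck. The function names are made up for illustration and are not part of YaCy.

```python
# Rough arithmetic only; these helpers are hypothetical, not YaCy code.

def min_delay_ms(pages_per_minute: float) -> float:
    """Smallest average gap between two requests to the same domain."""
    return 60_000 / pages_per_minute


def min_crawl_minutes(pages_on_domain: int, pages_per_minute: float = 120) -> float:
    """Lower bound on the time needed to fetch every page of one domain."""
    return pages_on_domain / pages_per_minute


if __name__ == "__main__":
    print(min_delay_ms(120))          # gap between requests, in milliseconds
    print(min_crawl_minutes(12_000))  # minutes for a 12,000-page site
```

So at 120 ppm the crawler cannot hit one site more often than every 500 ms on average, and a single 12,000-page site takes at least 100 minutes no matter how much RAM you have. If most of your crawl queue points at a handful of domains, this limit alone can make the whole crawl look stalled.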
If you want the latest version to run on Windows, you can clone and build on Linux, then copy it to Windows. There are issues with the Solr index version, so you have to export and import your data.
Last update 24 aug.