Here is my user experience as a YaCy noob (I am 55 and have been a software developer all my life): I have set up YaCy on my home server (Mac Mini). The first few tries caused my JVM to freeze after about three hours because of poor YaCy memory setup and usage. Suggestion: this should work without having to customize the setup first. All the rest is very interesting. The first search experience is not as good as Google, but it can be improved, e.g. by crawling interesting sites yourself.
Thank you for trying out YaCy!
The best default memory settings are hard to find: some people think they are too high, some too low.
But you are right, people should not find out after only a few hours that the installation needs maintenance!
I will increase the default memory setting.
My Mac Mini server is from 2009 and has only 4GB RAM. Setting the crawling speed to 70 PPM seems to work, but it is pretty slow. The initial setting of 6000 PPM does not seem useful when the RAM setting for the JVM is 600MB. My latest try is 90 PPM crawling speed, reading URLs from a file, a maximum of 25k words in the index cache and a maximum of 1600MB RAM for the JVM. Obviously, wrong settings cause the JVM to freeze and stay at high CPU load. I know it is difficult, but the best thing would be: here is your RAM, hard disk space and list of URLs to crawl: go ahead without freezing and give me a daily overview of the statistics…
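To put the PPM numbers in this post into perspective, here is a small Python sketch of the arithmetic, assuming PPM means pages per minute. The function name `crawl_rate` is made up for illustration and is not part of YaCy; it only converts a PPM value into pages per second and the average time between page fetches.

```python
def crawl_rate(ppm: int) -> tuple[float, float]:
    """Return (pages per second, seconds between fetches) for a PPM value.

    Hypothetical helper, not a YaCy API. Assumes PPM = pages per minute.
    """
    if ppm <= 0:
        raise ValueError("ppm must be positive")
    pages_per_second = ppm / 60
    seconds_per_page = 60 / ppm
    return pages_per_second, seconds_per_page


# Compare the values mentioned in the post.
for ppm in (70, 90, 6000):
    per_second, interval = crawl_rate(ppm)
    print(f"{ppm:>5} PPM = {per_second:7.2f} pages/s, one page every {interval:.3f} s")
```

So 6000 PPM asks for 100 pages per second, while 70 to 90 PPM is roughly one page every 0.7 to 0.9 seconds, which is a very different load for a small machine.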
I concur. I started with YaCy yesterday and installed it in a Docker container on a Raspberry Pi 4 with 4GB of RAM.
It works, but it is quite slow. Sometimes the admin interface freezes and webpages are no longer served. That is a bit frustrating. It basically freezes the whole Pi and even other services are no longer available. Like I can’t even SSH into the Pi anymore. Is it not possible to allocate the required RAM more dynamically?
I also find the admin console quite confusing. There are menu items on the left, and on top, and no breadcrumbs. When you look at a page, it is not obvious where on the menu you are.
I haven’t really tried to search for anything yet, as I expect I have to crawl some websites first. But I’d love to see YaCy as an alternative to the SearX instance I have been using for some time now.
SearX does not crawl, does not index and does not store. All of that is hard, I/O-intensive and needs a lot of RAM.
It is possible to run YaCy even on an RPi3, but with strong restrictions.