Hardening YaCy for Public Use


I'm exposing port 8090. It looks like most settings require a password, but am I doing this wrong? I just want to provide YaCy as a search engine for the public to search a large static website.




Especially since YaCy is a Java program, I too am interested in essential hardening steps. Which out-of-the-box settings should I check?

My own digging has brought up:

Admin Panel > Portal Configuration > Remote search encryption
and check the box for “Prefer https for search queries on remote peers. When SSL/TLS is enabled on remote peers, https should be used…”

Use Case & Accounts > Network Configuration > Protocol operations encryption
check the box for “Prefer HTTPS for outgoing connexions to remote peers.”

RAM/Disk Usage & Updates > Download System Update > automatic update
and set it up so that updates are made within fixed cycles.

System Administration > Referer Policy Settings
In the Global policy section, move the referer radio button upward in the list to decrease the header data exposed to sites being crawled. Keep in mind that this may prompt some sites to block your crawler's access to them.

Also run YaCy as an unprivileged user. On Linux, set that user up under its own group, with access only to the YaCy installation files.

I will post more as I find additional hardening options.

One user can own the YaCy binaries and configuration. Restrict each folder's owner and group permissions to the necessary minimum.
A second user, the one that actually runs YaCy, must belong to the first user's group.

If installed from a package, the installer will likely create user 1 but not user 2. The service definition could be updated to use user 2, or you could disable the service and create your own service definition.

Just a suggestion, please comment…
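
To check whether the permissions really are at the "necessary minimum", something like the following could be run against the install directory. This is only a sketch: the function name find_loose_permissions and the example path are made up for illustration, and it only looks at write bits for group and others, nothing YaCy-specific.

```python
import os
import stat


def find_loose_permissions(root, allow_group_write=False):
    """Return sorted (path, problem) pairs for entries under root
    that are writable by the group or by everyone."""
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # permission bits on symlinks are not meaningful
            if mode & stat.S_IWOTH:
                findings.append((path, "world-writable"))
            elif mode & stat.S_IWGRP and not allow_group_write:
                findings.append((path, "group-writable"))
    return sorted(findings)


if __name__ == "__main__":
    # Hypothetical install location; adjust to wherever YaCy lives on your system.
    for path, problem in find_loose_permissions("/opt/yacy"):
        print(f"{problem}: {path}")
```

In the two-user layout described above, the running user gets access through the group, so group-writable entries may be intentional for data directories. Pass allow_group_write=True to report only world-writable entries.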

Basic Configuration > Section 4. > check “with SSL (https enabled on port 8443)”

I am not sure whether this requires some kind of certificate setup or whether YaCy just uses a predefined one. Either way, you may have to restart the peer before a lock icon shows up next to its name.
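
After changing the port settings and restarting, it can help to confirm from another machine which ports actually accept connections. A minimal sketch using only the Python standard library follows. The host name search.example.org is a placeholder, and 8090 and 8443 are simply the ports mentioned in this thread. It only tests TCP reachability and does not validate the certificate.

```python
import socket


def reachable_ports(host, ports, timeout=2.0):
    """Return the subset of ports on host that accept a TCP connection."""
    open_ports = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:
            pass  # refused, timed out, or unreachable
    return open_ports


if __name__ == "__main__":
    # Placeholder host; 8090 (HTTP) and 8443 (HTTPS) are the ports named above.
    print(reachable_ports("search.example.org", [8090, 8443]))
```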

Also, if you’re really paranoid it may technically be more secure to continue running in Junior Peer mode.

Have a blacklist in place for your instance. You want to avoid crawling malware domains, ad domains and probably explicit content.

Filters & Blacklists > Import/Export > plain text file:

You can adapt list files, containing one domain ( example.com/* ) per line, from many popular blocklists. Or, if you’re not sure, at least consider importing a blocklist from some peer.
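
Many public blocklists ship as hosts files ("0.0.0.0 ads.example.com") rather than one domain per line, so a small conversion step may be needed before importing. Here is a rough sketch of that conversion. The function name to_yacy_blacklist is made up, and the "/*" suffix is taken from the format mentioned above. If your YaCy version expects a different path pattern, change the path_pattern argument.

```python
import re

# Loose hostname check: dot-separated labels, with a final label that starts
# with a letter (so bare IP addresses are rejected).
_HOSTNAME = re.compile(
    r"^(?=.{1,253}$)([a-z0-9-]{1,63}\.)+[a-z][a-z0-9-]{1,62}$"
)


def to_yacy_blacklist(lines, path_pattern="*"):
    """Convert hosts-file or plain-domain lines to 'domain/pattern' entries."""
    seen = set()
    entries = []
    for raw in lines:
        # Drop comments and surrounding whitespace, normalize case.
        line = raw.split("#", 1)[0].strip().lower()
        if not line:
            continue
        # Hosts-file lines look like "0.0.0.0 ads.example.com";
        # only the last field on the line is used.
        host = line.split()[-1]
        if not _HOSTNAME.match(host) or host in seen:
            continue  # skip non-hostnames (e.g. "localhost") and duplicates
        seen.add(host)
        entries.append(f"{host}/{path_pattern}")
    return entries


if __name__ == "__main__":
    sample = ["# ad servers", "0.0.0.0 ads.example.com", "tracker.example.net"]
    print("\n".join(to_yacy_blacklist(sample)))
```

Write the result to a plain text file and import it through Filters & Blacklists > Import/Export as described above.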