Clarity needed: exposed ports, ssl/https, LetsEncrypt, Certificates, Docker, nginx, proxy, inter-host comms and user security and privacy

Hi all,

First post so please be nice :wink:

I’d like to get some clarity on exactly what is required to achieve a privacy-respecting deployment of YaCy, preferably using the (possibly unsupported) Docker deployment. To that end, could I please ask the following questions? I’d love to get involved, and I could then contribute the answers back to the documentation section of YaCy :slight_smile:

  1. Deploying YaCy on Docker makes sense for me - there is a version that has over 1 million downloads and has run reliably for a long while - but is there an officially supported version planned, and if so can we document it?

  2. Deploying YaCy at home or in the enterprise requires ports to be forwarded, but exactly what information is sent over the default non-SSL port 8090? And why, in 2019, is a non-SSL/TLS default still a thing? Can we make port 8091 with SSL/TLS (using a trusted or self-signed certificate) the default?

  3. Can I remove http ports entirely and only use https/SSL ports (8091) - even if I don’t want to do point 4 below?

  4. If I want to use a proxy in combination with Certbot to provide a free, trusted LetsEncrypt certificate, so that users of YaCy only get search access via an encrypted SSL/TLS search page in their browser, I’m left with a confusing scenario. Users come in over my proxy on 443 (https), but YaCy complains if the default unencrypted 8090 is unreachable from the internet, as it should be (after all, they’re being proxied over 443 for a reason!). If I then open 8090 so that YaCy is reachable from the public internet and stops complaining, I’m left with a search engine that users can browse over an unencrypted http channel as well as over 443 through the proxy. This is neither expected nor good practice. Is there a way to let YaCy know that it’s being proxied, so that it sends ALL traffic through the proxy (and not through a port on the YaCy application itself that’s directly internet facing) with the benefit of a valid, trusted LetsEncrypt certificate?

  5. YaCy hosts/nodes presumably communicate machine-to-machine over port 8090 (see point 2 above), but this isn’t private or secure: it leaves inter-node communications and their search terms open to capture by ISPs/‘bad guys’, who could potentially poison search results or illegally/immorally spy on YaCy users. Can anyone correct me if I’m wrong, and point to ways this can be made better, more secure and more private?
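
To see which of the ports discussed in points 3 and 4 are actually exposed, a small probe can help. The following is only a minimal sketch: it assumes the ports mentioned above (443 for the proxy, 8090 and 8091 for YaCy), the hostname `yacy.example.org` in the usage comment is hypothetical, and the helper names are made up for illustration. It only tests whether a TCP connection succeeds; it says nothing about what YaCy itself sends over those ports. To test public exposure, run it from a machine outside your own network.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def exposure_report(host, ports):
    """Map each labelled port to whether it accepts TCP connections."""
    return {label: port_is_open(host, port) for label, port in ports.items()}


# Example usage (hypothetical hostname, ports taken from the questions above):
#
#   ports = {"proxy https": 443, "yacy http": 8090, "yacy https": 8091}
#   for label, is_open in exposure_report("yacy.example.org", ports).items():
#       print(label, "open" if is_open else "closed or filtered")
```

In the setup described in point 4, the desired result would be that only 443 reports as open from the outside.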

To reiterate: I’m not here to rain on anyone’s parade - this is a great, much appreciated tool that I’d like to help make better :slight_smile: Whilst I’m not a coder, I do have experience in cyber-security and user privacy, so I’m happy to help if that’s useful and contributions are well received :slight_smile:



This has been posted for two months … I have bookmarked it so that I can see an answer … With that said, I will NOT install until an appropriate reply is made. … do hurry!!


I share some of your security concerns.

I really love YaCy, but I also recognize it was released nearly 15 years ago as free software. There are, I think, some language barriers. The developer continues to drop in occasionally, but is likely too busy to address long, complex, involved questions.

I don’t know enough to answer your questions, but I like YaCy enough that I’m currently studying up on Java. Maybe in a year or two, I might know enough to address your issues.

In the meantime, I’ve been running YaCy on a Live Linux installed on a Flash drive, plugged into an old, otherwise useless laptop.

Overall, I assume any privacy issues are much less with YaCy than with the various available, centralized, corporate, online search portals.

I’ve found I have better luck getting questions answered if I keep them short and to the point, as, among other things, it may require translation. It seems the majority of YaCy enthusiasts/coders that are still around or interested in the project speak German, which also is the language of much of the available documentation, tutorial videos, etc.

You have ideas for possible ways to improve security, which I like. I would also like to see more documentation and tutorials. YaCy is a surprisingly complex and highly sophisticated program. It is certainly more complex, versatile, and feature rich than a simple online search portal like Google.

Edit: One question I think I can answer with some confidence: I believe YaCy can very easily be configured to use whatever port you like. 8090 is the default, but this can be changed during installation and setup if desired.

I’m not sure how or why you would surmise “YaCy complains if the default unencrypted 8090 is unreachable from the internet” if you haven’t installed it yet.

The recently updated FAQ, I notice, may also address some of your security concerns, for example:

" Can other people find-out about my browsing log/history?

There’s no way to browse the pages that are stored on a peer. A search of the pages is only possible on a precise word. The words are themselves dispatched over the peers thanks to the distributed hash tables (DHT). Then the hash tables of the peers are mixed, which makes retrieving the history of browsing of a certain peer impossible."
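
The idea in that FAQ excerpt, words being spread over peers by hash, can be illustrated with a toy example. This is only a sketch under stated assumptions: it uses a generic consistent-hashing ring with SHA-256, not YaCy’s actual DHT code, and the peer names and function names are hypothetical. It shows that the words of a single page end up scattered across different peers according to their hashes.

```python
import hashlib
from bisect import bisect_left


def key_hash(text):
    """Hash a string to a large integer position on the ring."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def build_ring(peer_names):
    """Return (position, peer) pairs sorted by position on the hash ring."""
    return sorted((key_hash(name), name) for name in peer_names)


def responsible_peer(word, ring):
    """Pick the first peer at or after the word's position, wrapping around."""
    positions = [pos for pos, _ in ring]
    index = bisect_left(positions, key_hash(word)) % len(ring)
    return ring[index][1]


def distribute(words, peer_names):
    """Assign every word to exactly one peer and return the placement."""
    ring = build_ring(peer_names)
    placement = {name: [] for name in peer_names}
    for word in words:
        placement[responsible_peer(word, ring)].append(word)
    return placement


# Example usage with hypothetical peers and the words of one indexed page:
#
#   distribute(["privacy", "search", "index"], ["peer-a", "peer-b", "peer-c"])
```

Whether this scattering gives the privacy properties the FAQ describes depends on details of the real implementation that this sketch does not model.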