Non-English characters

Hello All,

Thank you for this great software. I like it :)

I want to use it to search Turkish documents.

When I search for a keyword like İstanbul, it returns no results because of the İ.

Is there a solution for this?

Thank you in advance.

Hi ufukayyildiz,

Good question. I tried it, and at first I did not find anything either. However, I am pretty sure that YaCy is able to process all UTF-8 characters. So I looked for web pages that actually contain İstanbul as you spell it and found https://www.istdergi.com/, which I put into my crawler to verify that the page can then be found with the word İstanbul. And it worked!

So - there is no problem with YaCy with such letters. There is simply not enough content to find. Just put in the web pages that you want to be found!
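
To repeat that check after a crawl without the web interface, a small script can build the search request. This is only a sketch under assumptions: it assumes a local peer on the default port 8090 and the `yacysearch.json` endpoint with a `query` parameter, and `build_search_url` is a made-up helper name. The point is that the İ has to be sent as UTF-8 percent-encoding.

```python
from urllib.parse import urlencode

# Assumption: a local YaCy peer listening on the default port.
BASE_URL = "http://localhost:8090"

def build_search_url(term: str, base_url: str = BASE_URL) -> str:
    """Build a search URL with the term percent-encoded as UTF-8.

    Hypothetical helper; the endpoint name and the "query" parameter
    are assumptions about the peer's search interface.
    """
    return f"{base_url}/yacysearch.json?" + urlencode({"query": term})

if __name__ == "__main__":
    # The URL can be opened in a browser or fetched with urllib.request.urlopen.
    print(build_search_url("İstanbul"))
```

If the crawled page shows up in the response, the index handles the character and the earlier empty result was only due to missing content.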

Hi @Orbiter

Thank you so much.

I am trying to search PDF files. Could the problem be related to PDF files?

Could you please check the video on the link?

https://drive.google.com/file/d/1iDEfm0WD4gacS7u05Gf9dRydYiOTBTwT/view?usp=sharing

Thanks again.

Are these characters interchangeable?
E.g. I would expect the same results for both İstanbul (with the Turkish İ) and Istanbul (as everyone outside Turkey would type it).

Based on my experience, they are not interchangeable.
E.g. try searching for ‘Árvíztűrő tükörfúrógép’ vs. ‘Arvizturo tukorfurogep’.
The first brings up this exact match, among others: https://hu.wikipedia.org/wiki/%C3%81rv%C3%ADzt%C5%B1r%C5%91_t%C3%BCk%C3%B6rf%C3%BAr%C3%B3g%C3%A9p but the second returns 0 results.
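
For illustration, making both spellings match would mean folding diacritics on the indexed text and on the query. The sketch below is not YaCy's code, just a minimal example of the technique using Python's standard `unicodedata` module; `fold_diacritics` is a hypothetical name.

```python
import unicodedata

def fold_diacritics(text: str) -> str:
    """Return a case-insensitive, accent-free form of text for comparison.

    Illustrative helper, not part of YaCy.
    """
    # NFD splits letters such as İ or ő into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a stronger lower() intended for caseless matching.
    return stripped.casefold()

if __name__ == "__main__":
    print(fold_diacritics("İstanbul") == fold_diacritics("Istanbul"))
    print(fold_diacritics("Árvíztűrő tükörfúrógép"))
```

With this folding, İstanbul and Istanbul compare equal, and the Hungarian phrase reduces to its unaccented spelling. Letters that have no Unicode decomposition, such as the Turkish dotless ı, pass through unchanged and would need an explicit mapping.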

Is there any way to change this?