This repository has been archived by the owner on Sep 7, 2023. It is now read-only.

Yandex Engine via XML #3210

Open
ghost opened this issue Apr 16, 2022 · 7 comments

Comments

@ghost

ghost commented Apr 16, 2022

Working URL to the engine
https://yandex.com

Why do you want to add this engine?
Most unfiltered and complete search engine I could find

Features of this engine
They don't filter results
Superior results compared to Google/Bing

How can Searx fetch the information from this engine?
Yandex.XML is a service that lets you send queries to the Yandex search engine and get responses in XML format:
https://yandex.com/dev/xml/doc/dg/concepts/about.html

You need to register an IP address, but having the option would be amazing!
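
A rough sketch of what the fetch-and-parse step could look like is below. It is not tested against the live service: the endpoint, the `user`/`key`/`query`/`page` parameter names, and the `doc`/`url`/`title`/`passage` element names are assumptions that should be checked against the Yandex.XML documentation linked above, and `build_url` and `parse_results` are hypothetical helpers, not existing searx functions.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumed endpoint; verify against the Yandex.XML documentation.
BASE_URL = "https://yandex.com/search/xml"


def build_url(query, user, key, page=0):
    """Build a request URL. The parameter names are assumptions."""
    params = {"user": user, "key": key, "query": query, "page": page}
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def _text(element):
    """Return the element's text with nested markup flattened."""
    if element is None:
        return ""
    return "".join(element.itertext()).strip()


def parse_results(xml_text):
    """Extract url/title/content dicts from an assumed response layout."""
    root = ET.fromstring(xml_text)
    results = []
    for doc in root.iter("doc"):
        url = doc.findtext("url")
        if not url:
            continue  # skip entries without a link
        results.append({
            "url": url,
            "title": _text(doc.find("title")),
            "content": _text(doc.find(".//passage")),
        })
    return results


# Hand-written sample in the assumed layout, not a real response.
SAMPLE = """<yandexsearch version="1.0">
  <response>
    <results>
      <grouping>
        <group>
          <doc>
            <url>https://example.org/</url>
            <title>Example <hlword>Domain</hlword></title>
            <passages><passage>An <hlword>example</hlword> page.</passage></passages>
          </doc>
        </group>
      </grouping>
    </results>
  </response>
</yandexsearch>"""

if __name__ == "__main__":
    print(build_url("searx engine", "my-user", "my-key"))
    for result in parse_results(SAMPLE):
        print(result)
```

The dicts use `url`, `title` and `content` keys because that is the shape searx engines generally return for general results; the account name and key would have to come from the instance's settings.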

Applicable category of this engine
general, files, images, it, map, music, news, science, social media and videos.

Additional context
There is also the SerpApi service, so they found a way to parse Yandex.

@unixfox
Member

unixfox commented Apr 17, 2022

The number of queries allowed per day is very limited unless you register a phone number, though.

Moreover, this requires registering an account, which is very bad for privacy, and only a few searx instances will do it.

Screenshot_20220417-085315_Bromite_1

@ghost
Author

ghost commented Apr 17, 2022

We can use a prepaid SIM card to register an account and get the key.

I think many will be willing to use the engine; at least having the option would be great 👍

@unixfox
Member

unixfox commented Apr 17, 2022

We can use a prepaid SIM card to register an account and get the key.

I think many will be willing to use the engine; at least having the option would be great 👍

Well, you could contribute the code for this new engine :).

@br4nnigan
Contributor

I'm fine with 10 queries per day, I'm running my own instance.
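
For a single-user instance, a small guard like the one below could keep the engine within such a daily limit. It is only a sketch: `DailyQuota` is a hypothetical helper, not part of searx, and the default of 10 simply reuses the figure mentioned above.

```python
import datetime


class DailyQuota:
    """Allow at most `limit` calls per calendar day (hypothetical helper)."""

    def __init__(self, limit=10):
        self.limit = limit
        self.day = None   # date the counter currently applies to
        self.used = 0     # calls already made on that date

    def try_acquire(self, today=None):
        """Return True and count the call if quota remains for today."""
        today = today or datetime.date.today()
        if today != self.day:
            # A new day has started: reset the counter.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


if __name__ == "__main__":
    quota = DailyQuota(limit=10)
    if quota.try_acquire():
        print("quota available, send the request")
    else:
        print("daily quota exhausted, skip this engine")
```

The counter lives in memory only, so it resets when the process restarts; a real instance would need to persist it somewhere.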

@dmigis

dmigis commented Oct 30, 2022

We can use a prepaid SIM card to register an account and get the key.

I think many will be willing to use the engine; at least having the option would be great +1

IMO, it is better to use virtual SIM services such as https://onlinesim.ru/ or sms-activate.org (depending on your budget). That removes the need for a physical prepaid SIM and the cell phone itself.

@dmigis

dmigis commented Oct 30, 2022

Also, they show only the first 1000 results per request (https://yandex.com/dev/xml/doc/dg/concepts/restrictions-new.html).
I would prefer to parse Yandex using Selenium + headless Chrome, but I guess it is resource-consuming and, in any case, not suitable for a public instance: multiple requests from one IP, even with headless Chrome configured for stealth operation, will lead to a CAPTCHA or an instant ban. That can be solved with a pool of proxies (preferably with Russian IPs) or with mobile proxies (for example, https://mobileproxy.space/en/), but even that is no guarantee, and not every public instance owner will pay for such a service, so some will prefer to switch off Yandex support completely.
If the developers are potentially willing to merge a PR for Selenium-based search engines, I'll take a shot at implementing it as a POC.

@unixfox
Member

unixfox commented Oct 30, 2022

The current searx developers are not interested in adding any anti-privacy features (see https://github.com/searx/searx#is-searx-in-maintenance-mode), and a full browser falls into this category.
