Improvement of searx.network #2684

dalf · 2023-08-26T17:15:42Z

dalf
Aug 26, 2023

There is a WIP / unfinished PR about the network stack, see #2685.

I am opening this discussion to collect ideas about possible improvements to the network stack.

What would you like to be able to do regarding the network?
What bothers you in the current implementation?

dalf · 2023-09-12T19:53:40Z

dalf
Sep 12, 2023
Author

TLDR: are there use cases for the multi_requests function other than the baidu engine?

A note about async HTTP client and the multi_requests function

As its name says, the function sends multiple requests at the same time. The function returns all the responses at once.
The implementation relies on an async HTTP client to send all the HTTP requests in parallel. This is the perfect use case for async.
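
As a minimal sketch of that pattern (not the actual searx.network code), the snippet below starts several requests at once and returns all responses together. The names `fake_fetch` and `multi_requests_sketch` are hypothetical, and the HTTP call is simulated with `asyncio.sleep` so the example stays self-contained.

```python
import asyncio


async def fake_fetch(url: str, delay: float) -> str:
    # Stand-in for an async HTTP GET; the sleep simulates network latency.
    await asyncio.sleep(delay)
    return f"response for {url}"


async def multi_requests_sketch(urls: list[str]) -> list[str]:
    # Start every request at once and wait for all of them.
    # gather() returns the results in the same order as the inputs.
    return await asyncio.gather(*(fake_fetch(u, 0.05) for u in urls))


# Placeholder URLs, used only for illustration.
urls = [f"https://example.org/{i}" for i in range(3)]
responses = asyncio.run(multi_requests_sketch(urls))
```

The caller gets one list back once the slowest request has finished, which is the "all the responses at once" behaviour described above.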

All other HTTP requests can run in the sync world: there is no parallelism in the engine code.

However, in the master branch, all outgoing HTTP requests are sent through async HTTP clients. There are bridges from sync to async and vice versa. There was a time when SearX(NG) was supposed to switch to async entirely: https://github.com/searx/searx/wiki/Milestones#milestone-12---async
This has never happened and will not.

Async is (nearly|for sure) an anti-pattern in a Flask application: if we need parallelism, the way to go is threads.

Also, async makes the HTTP streaming complex. After each change, I ask someone on the team (usually Paul) to deploy it on a public instance and monitor for memory leaks.

So #2685 drops all async code and relies only on a sync HTTP client ... and multi_requests is dropped too: this function is not used (the Bing engine used it, but not anymore).

The only potential use case for the multi_requests function is the Baidu engine. The URLs returned by Baidu are something like http://www.baidu.com/link?url=. We can implement multi_requests using only threads: each call creates one thread per URL. In this scenario, one user query needs about 11 HTTP requests (1 for the search, and about 10 for the results) = 11 threads per user request. If there are ten user queries simultaneously, that's about 110 threads.
I don't know whether creating a lot of threads is a problem. If it is:

either we say that we can't implement the Baidu engine and similar engines (currently, there is no other such engine),
or we find a way to implement multi_requests.
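
One possible way to keep the thread count bounded, sketched under assumptions: instead of one new thread per URL, all calls share a fixed-size pool. The pool size of 8, the names `POOL`, `fake_resolve` and `multi_requests_threads`, and the placeholder URLs are all hypothetical; the blocking HTTP call is simulated with `time.sleep`.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared pool: its size caps the number of worker threads,
# no matter how many user queries arrive at the same time.
POOL = ThreadPoolExecutor(max_workers=8)


def fake_resolve(url: str) -> str:
    # Stand-in for a blocking HTTP request that resolves a redirect link.
    time.sleep(0.01)
    return url + "#resolved"


def multi_requests_threads(urls):
    # map() runs the calls on the pool threads and yields results in input order.
    return list(POOL.map(fake_resolve, urls))


# Placeholder links, used only for illustration.
links = [f"https://example.org/link?id={i}" for i in range(10)]
resolved = multi_requests_threads(links)
```

With this design, ten simultaneous queries of 11 requests each would still mean about 110 requests, but they would queue on 8 threads rather than spawn 110. The trade-off is extra latency while requests wait for a free worker.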

So my question is: Are there other use cases for the multi_requests function?

1 reply

return42 Sep 13, 2023
Maintainer

So my question is: Are there other use cases for the multi_requests function?

I am not aware of any case.

BTW: thanks a lot for the detailed explanation .. it was very helpful to me 🚀

czaky · 2024-05-17T11:43:46Z

czaky
May 17, 2024

One significant improvement in terms of latency and reliability can be achieved by sending multiple requests at once and waiting only for the first answer. This is easily done using asyncio. It works like a charm with Tor, which is otherwise slow and often blocked by downstream engines.

E.g.: PR: Proxy Request Redundancy #3491
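
A minimal sketch of the "first answer wins" idea with plain asyncio follows. It is not the code from the PR mentioned above: the proxy names, the delays and the functions `fake_fetch` and `first_response` are hypothetical, and the requests are simulated with `asyncio.sleep`.

```python
import asyncio


async def fake_fetch(proxy: str, delay: float) -> str:
    # Stand-in for the same request sent through a given proxy.
    await asyncio.sleep(delay)
    return f"answer via {proxy}"


async def first_response(candidates: dict[str, float]) -> str:
    tasks = [asyncio.create_task(fake_fetch(p, d)) for p, d in candidates.items()]
    # Return as soon as any one task has finished.
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Let the cancelled tasks finish so none is left dangling.
    await asyncio.gather(*pending, return_exceptions=True)
    return done.pop().result()


winner = asyncio.run(
    first_response({"proxy-a": 0.3, "proxy-b": 0.02, "proxy-c": 0.2})
)
```

This sketch treats the first task to complete as the winner even if it failed. A real implementation would probably want to skip failed attempts and keep waiting for the next one.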

0 replies

glanham-jr · 2024-06-22T21:06:49Z

glanham-jr
Jun 22, 2024

As its name says, the function sends multiple requests at the same time. The function returns all the responses at once.
The implementation relies on an async HTTP client to send all the HTTP requests in parallel. This is the perfect use case for async.

All other HTTP requests can run in the sync world: there is no parallelism in the engine code.

From my understanding, async is not the same as parallel. Even if it's one task, not blocking on an HTTP request gives back resources so other threads can run for the same application.

Async is (nearly|for sure) an anti-pattern in a Flask application: if we need parallelism, the way to go is threads.

I think I need clarification - async doesn't mean parallel for me. Async to me means non-blocking and giving back resources for long-running tasks, while parallel means multi-thread/multi-core processing.

Sync will definitely be faster on instances that aren't handling many requests, but on high-volume instances I'd expect async to allow better sharing of resources.
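
To illustrate the distinction between concurrent and parallel, here is a small sketch. Five simulated IO waits overlap in time, yet every coroutine runs on the same thread. The function names are hypothetical and `asyncio.sleep` stands in for waiting on a network response.

```python
import asyncio
import threading
import time


async def io_task(name: str, seen_threads: set) -> str:
    # Record which OS thread runs this coroutine.
    seen_threads.add(threading.get_ident())
    # Yield control to the event loop while "waiting on IO".
    await asyncio.sleep(0.1)
    return name


async def main():
    seen_threads = set()
    start = time.perf_counter()
    names = await asyncio.gather(
        *(io_task(f"task-{i}", seen_threads) for i in range(5))
    )
    elapsed = time.perf_counter() - start
    return names, seen_threads, elapsed


names, seen_threads, elapsed = asyncio.run(main())
```

The total time should be close to a single 0.1 s wait rather than five of them, while `seen_threads` holds only one thread id: the waits overlap without any multi-thread or multi-core execution.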

If it were an anti-pattern in Flask, it would be very strange to have async docs. What they state is also aligned with my understanding of async.

Async is not inherently faster than sync code. Async is beneficial when performing concurrent IO-bound tasks, but will probably not improve CPU-bound tasks. Traditional Flask views will still be appropriate for most use cases, but Flask’s async support enables writing and using code that wasn’t possible natively before.

With this, I believe SearXNG actually falls in this category - for most requests, the whole job is to make outbound IO requests to third-party search engines. However, there is a disclaimer in their docs about async code being less performant...

Flask’s async support is less performant than async-first frameworks due to the way it is implemented. If you have a mainly async codebase it would make sense to consider Quart. Quart is a reimplementation of Flask based on the ASGI standard instead of WSGI. This allows it to handle many concurrent requests, long running requests, and websockets without requiring multiple worker processes or threads.

What would be interesting is to see how we can propagate async/await through requests. I would also be interested in seeing if Quart might be worth migrating to if it solves the async perf problem of Flask. The docs say migration is super simple:

It should be possible to migrate to Quart from Flask by a find and replace of flask to quart and then adding async and await keywords. See the docs for more help.

With the above, some things I think may be interesting to investigate

Propagate async/await through the flask[async] extension, and see if we can hook into this style of async code. This would involve updates to network to leverage asyncio async/await syntax. With this, is there a reason we may not want to consider trying this? I know Dalf has said there are a lot of failed experiences, I'm just wondering what failed for the previous attempts and why they failed.
See if Quart is a viable alternative to swap out Flask to get the most bang-for-buck for async style, and how much work it would be

0 replies

dalf · 2024-06-23T08:50:24Z

dalf
Jun 23, 2024
Author

I'm just wondering what failed for the previous attempts and why they failed.

Disclaimer:

I’ve tried to use async multiple times, but my tests lack thorough review. It was a few years ago, and the situation may have changed with improvements in newer Python versions.
I aim to summarize my tests here, but I will likely forget some details and might add an edit section to this message (or just add another comment).
Comparing the performance between two code versions is challenging. The tests require the same configuration, the same bare-metal hardware, and various workload types with automated tests. I haven’t done this as meticulously as needed to make a definitive decision in most of the cases.

My main concern about full async (web framework, HTTP client, engines) : if one engine is slow, everything else slows down. I ended up with some weird things like this to improve the response time. With a single thread to handle async code (by design in Python), we have to be very careful everywhere (also related to my comment on httpx below).
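
One way this can happen, sketched with hypothetical names: a coroutine that does blocking work never yields to the event loop, so an unrelated fast coroutine on the same loop has to wait for it. Here `time.sleep` stands in for blocking or CPU-heavy engine code.

```python
import asyncio
import time


async def slow_engine() -> None:
    # Hypothetical engine doing blocking work inside a coroutine:
    # time.sleep() never yields to the event loop.
    time.sleep(0.2)


async def fast_engine(start: float) -> float:
    # On its own, this would finish after roughly 0.01 s.
    await asyncio.sleep(0.01)
    return time.perf_counter() - start


async def main() -> float:
    start = time.perf_counter()
    _, fast_elapsed = await asyncio.gather(slow_engine(), fast_engine(start))
    return fast_elapsed


fast_elapsed = asyncio.run(main())
```

Although `fast_engine` only needs about 0.01 s, it cannot complete until the blocking call in `slow_engine` has returned, so its measured time ends up close to 0.2 s.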

A few years ago I thought about this architecture:

async for the web framework and HTTP clients
sync code for the engines running in a pool of workers
async and sync code communicate using queues.

... this does not fit with the concept of an HTTP server for Python apps that takes control of the app (like uwsgi and all the others).
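
For illustration only, the queue-based architecture described above could look roughly like the sketch below. All names are hypothetical, the "engine" is a one-line stand-in, and nothing here reflects an actual prototype: the async side puts jobs on a `queue.Queue`, sync worker threads process them, and results travel back through asyncio futures.

```python
import asyncio
import queue
import threading

jobs: queue.Queue = queue.Queue()


def engine_worker() -> None:
    # Sync worker: runs blocking engine code outside the event loop.
    while True:
        item = jobs.get()
        if item is None:  # sentinel: shut the worker down
            break
        query, loop, future = item
        result = f"results for {query}"  # stand-in for sync engine code
        # Hand the result back to the event loop in a thread-safe way.
        loop.call_soon_threadsafe(future.set_result, result)


async def search(query: str) -> str:
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    jobs.put((query, loop, future))
    return await future


async def main():
    return await asyncio.gather(search("a"), search("b"), search("c"))


workers = [threading.Thread(target=engine_worker, daemon=True) for _ in range(2)]
for w in workers:
    w.start()
results = asyncio.run(main())
for _ in workers:
    jobs.put(None)  # one sentinel per worker
for w in workers:
    w.join()
```

Note that the script itself owns the threads and the event loop here, which is exactly the part that clashes with a server such as uwsgi taking control of the app.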

Web framework

Quart prototype

searx/searx#1724

Comments to take with a pinch of salt, from what I remember:

async jinja2 rendering was slower. Note: two years after this POC, a related issue was fixed: "Async is slow due to isawaitable use", pallets/jinja#1514
Quart response times were slower than the Flask version. I don't remember clearly whether I tested a huge workload.

I know this is not quantitative...

Starlette:

There is no proper support for babel: Kludex/starlette#279

Client

httpx

httpx is a good example of well-written and carefully tested code; however, the async code has a lot of awaits for each chunk of data, especially with HTTP/2. According to my tests, it is fast enough for a few kilobytes, like a search engine answer. I remember a test on @mrpaulblack 's instance with HTTP/2 for the image proxy: the CPU usage increased by a lot --> things went back to normal when HTTP/2 was disabled for the image proxy (this is the default setting now).

4 years ago, I tried to benchmark httpx : https://github.com/dalf/pyhttp-benchmark/blob/master/results/output.md
The test is old, it would be good to test again.

Without success, I've tried various things to improve the speed:

compile httpx with Cython
partially rely on nghttp2 : https://github.com/dalf/hpack_nghttp2/
Most probably some other attempts I forgot

I've seen that homeassistant uses zlib-ng, perhaps this can help.

pycurl

searx/searx#1725

pycurl is a binding to the system libcurl, and there are a lot of different versions of libcurl. I stopped the POC when I encountered a segfault using HTTP/2 on Ubuntu 18.04.

That was six years ago; perhaps things have changed since then.

aiohttp

#254

There is no support for HTTP/2, which really helps to avoid CAPTCHAs for some engines (I don't remember which ones, @unixfox has better knowledge of this topic).

Actually we could use this for the image proxy.

HTTP server

Related, more or less off topic: perhaps a good replacement for uwsgi is granian, which supports WSGI or ASGI apps.

3 replies

glanham-jr Jun 23, 2024

With a single thread to handle async code (by design in Python), we have to be very careful everywhere (also related to my comment on httpx below).

A single thread for all async requests? That definitely makes sense as to why this is not as straightforward. I'll look into the other PRs and experiments more, but in this case the logic of carefully selecting what is async seems correct then.

dalf Jun 23, 2024
Author

A single thread for all async requests?

Inside one worker, Python seems to be designed to use one single thread for all async code most probably because of the GIL.

We can start multiple loops, but that's not the intended usage of many libraries, AFAIU.

carefully selecting what is async seems correct then

Yes. From my point of view, engine code should remain sync (I might be wrong).

glanham-jr Jul 9, 2024

Hey Dalf, so I was thinking more about this recently. It seems like the hard choice is deciding what is currently the best approach. We currently are able to base it on knowledge on how Python works for async + some other stress tests, but it seems difficult to see how all of that comes together for SearXNG.

I know you have a draft PR out for reworking the network. I was thinking, what if we sequenced the following updates?

Implement the separation of concerns and HTTP Abstractions first. Specifically, I'm referring to the interface class ABCHTTPClient.
Also have an interface for ABCHTTPClientAsync, and implement alternative versions as with the ABCHTTPClient. If we don't know what async version to implement, we could technically implement a QuartsHTTPClientAsync, PyCurlHTTPClientAsync, etc
Allow a configuration option at the application level to switch between the various HTTPClient implementations
Finally, if not already done, have a way to gather data on latency on network requests. CPU and other data can be done outside of the application.
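
The sequence above could be sketched as follows. This is only an illustration under assumptions: the `request` method signature, the fake client classes, the `TimedClient` wrapper and the `http_client` settings key are all invented here, and the real `ABCHTTPClient` in the draft PR may look different.

```python
import time
from abc import ABC, abstractmethod


class ABCHTTPClient(ABC):
    """Hypothetical shape of the interface; the one in the draft PR may differ."""

    @abstractmethod
    def request(self, method: str, url: str) -> str: ...


class FakeClientA(ABCHTTPClient):
    # Stand-in for one concrete implementation.
    def request(self, method: str, url: str) -> str:
        return f"A:{method} {url}"


class FakeClientB(ABCHTTPClient):
    # Stand-in for an alternative implementation.
    def request(self, method: str, url: str) -> str:
        return f"B:{method} {url}"


class TimedClient(ABCHTTPClient):
    # Wraps any implementation and records per-request latency in seconds.
    def __init__(self, inner: ABCHTTPClient):
        self.inner = inner
        self.latencies: list[float] = []

    def request(self, method: str, url: str) -> str:
        start = time.perf_counter()
        try:
            return self.inner.request(method, url)
        finally:
            self.latencies.append(time.perf_counter() - start)


CLIENTS = {"a": FakeClientA, "b": FakeClientB}


def make_client(settings: dict) -> TimedClient:
    # "http_client" is an invented settings key, used only for this sketch.
    return TimedClient(CLIENTS[settings["http_client"]]())


client = make_client({"http_client": "b"})
body = client.request("GET", "https://example.org/")
```

Because the timing lives in a wrapper, the same latency data would be collected for every implementation, which is what a comparison between them needs.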

The above may be wishful thinking and more complicated without more research, but it would allow SearXNG maintainers to test and see how well SearXNG performs with a given implementation. With this, I think this approach could solve the difficulty you experienced: a lack of data comparing sync and async. But this would also require help from those running SearXNG to let us know how it works, along with stress tests.

This isn't to say I think we should do sync/async (by default, I think sync will likely be the better default choice for now), but rather I'd be interested to see if it's possible to explore a way to figure out a good implementation with some data to back it up.
