httpx impersonate #4801
I would use httpx_curl_cffi globally:
Pull Request Overview
This PR introduces an "impersonate" network parameter to support httpx-curl-cffi for improved client transport configuration, which is utilized in various parts of the network and engine modules.
- Added "impersonate" option to network settings in settings.yml.
- Propagated the "impersonate" parameter through the Network and client APIs, updating client creation and error handling.
- Updated the Qwant engine to include additional HTTP headers for requests.
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| searx/settings.yml | Adds network.impersonate configuration (set to "chrome") |
| searx/network/network.py | Updates Network initialization and client key construction to include impersonate |
| searx/network/client.py | Integrates impersonate in new_client and mount selection logic with proper error handling |
| searx/engines/qwant.py | Adds custom HTTP headers for the Qwant engine requests |
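To illustrate why the client key has to include impersonate, here is a minimal sketch of a client cache keyed on its options. It is not the code from searx/network/network.py: `get_client`, `_clients`, and the dict standing in for a real HTTP client are hypothetical names used only for illustration.

```python
from typing import Dict, Optional, Tuple, Union

# Hypothetical cache: one client per distinct combination of options.
ClientKey = Tuple[Union[bool, str], int, Optional[str]]
_clients: Dict[ClientKey, dict] = {}


def get_client(
    verify: Union[bool, str] = True,
    max_redirects: int = 30,
    impersonate: Optional[str] = None,
) -> dict:
    """Return a cached client for this exact set of options."""
    # If impersonate were left out of the key, a request asking for
    # impersonate="chrome" could silently reuse a plain client.
    key = (verify, max_redirects, impersonate)
    if key not in _clients:
        # A dict stands in for the real client object in this sketch.
        _clients[key] = {
            "verify": verify,
            "max_redirects": max_redirects,
            "impersonate": impersonate,
        }
    return _clients[key]
```

Two calls with the same options return the same object, while changing only impersonate yields a separate client.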
The package is not available on ARMv7 or Python 3.9. [EDIT] It should compile on ARMv7: lexiforest/curl_cffi#304
Yes, but only if it's available. I'm talking about customizations like these: https://github.com/searxng/searxng/pull/4801/files#diff-e95e3f454925bc2344d3cf538950b11a1255935b47a609c9589e6ac33d74802fR1713-R1714
Notes:
(I'm saying we should or should make it global)
Looking at the benchmark, httpx-async is one of the slowest:
Should we be worried that a single, commercially backed company is driving the project and could one day switch off the updates, requiring SearXNG to migrate to a pure httpx client?
Just a heads up if you're using e.g. impersonate=chrome: curl_cffi sets the user agent and the impersonated browser's headers for you. You can override them but it's something to keep in mind. You can use something like httpbin's "anything" endpoint to see what's being sent:
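As a rough sketch of that kind of check, the snippet below starts a local stand-in for httpbin's "anything" endpoint and prints back whatever headers the client sent. It uses only the Python standard library so it runs offline; `EchoHandler` and `fetch_echo` are hypothetical names, and the client here is urllib, not curl_cffi.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Local stand-in for httpbin's /anything: echoes the request as JSON."""

    def do_GET(self):
        body = json.dumps({
            "method": self.command,
            "path": self.path,
            "headers": dict(self.headers.items()),
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep the example quiet.
        pass


def fetch_echo(headers=None):
    """Send one GET to the local echo server and return the decoded JSON."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/anything"
        request = urllib.request.Request(url, headers=headers or {})
        # An empty ProxyHandler bypasses any proxy set in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request, timeout=5) as response:
            return json.load(response)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(json.dumps(fetch_echo({"User-Agent": "example-agent/1.0"}), indent=2))
```

To inspect what an impersonated client sends, the same idea should apply by pointing a curl_cffi request (for example `curl_cffi.requests.get(url, impersonate="chrome")`) at such an endpoint and comparing the echoed headers with the ones set explicitly.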
The project is open source and can be forked, and it's not some super obscure project either, so I am not too worried.
What does this PR do?
Use httpx-curl-cffi to add a network parameter: `impersonate`.
See https://github.com/lexiforest/curl_cffi/blob/8b4ee6d4db0982a3d93191b698c56936dfdfdcf0/curl_cffi/requests/impersonate.py#L9-L77 for the possible values.
Why is this change important?
See #3929
How to test this PR locally?
- `enable_http2: true` works
- `enable_http2: false` works
- `verify: false` works
- `verify: "/path/to/ca.pem"` works
- `local_addresses` works
Author's checklist
Based on #4674
curl-cffi requires Python 3.10 or above and ARM64 or AMD64. Without curl-cffi, the engines using the impersonate parameter crash with an explanation.
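A minimal sketch of the kind of availability check that could produce such an explanation is shown below. It is not the PR's implementation: `check_impersonate_support` and its error messages are hypothetical, and the only facts taken from this thread are the httpx_curl_cffi module name and the Python 3.10 requirement.

```python
import importlib.util
import sys
from typing import Optional


def check_impersonate_support(
    impersonate: Optional[str],
    module_name: str = "httpx_curl_cffi",
) -> None:
    """Raise a descriptive error if impersonate is requested but unusable."""
    if impersonate is None:
        # Nothing to check: the plain httpx transport is enough.
        return
    if sys.version_info < (3, 10):
        raise RuntimeError(
            f"impersonate={impersonate!r} requires Python 3.10 or above"
        )
    # find_spec returns None when the top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        raise RuntimeError(
            f"impersonate={impersonate!r} requires the {module_name} package, "
            "which is not installed on this platform"
        )
```

Calling this once when a network is configured would surface the problem early with a readable message instead of an import error deep inside a request.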
Related issues