[experimental] google: use the mobile UI #160
One minor change request: stop searching for number_of_results when use_mobile_ui is enabled, because the div with the ID
searxng/searx/engines/google.py Lines 323 to 324 in 95e634a
?
Oops, never mind.
For me this patch seems to work. I can't validate whether it also works when Google blocks me, because Google does not block me :-) Does anyone have an idea how I can force Google to block me?

Below I will leave some comments from my reverse engineering ... (similar to #159 (comment))

The content type is plain text:

The response text is idiosyncratic ... starting with

Further on one finds fragments of HTML, one JS

What puzzles me: the response is not (valid) XML, but it seems that lxml's parser (
searxng/searx/engines/google.py Lines 299 to 303 in 0e3a87b

I tested with

I also randomly checked paging and different languages (which means different domains

Overall I would say that this solution seems to work and that the lxml parser can handle this mangled plain-text response. I vote to give it a try in a production environment ... let's merge.
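The observation that a lenient HTML parser copes with this kind of mixed response can be illustrated with a small sketch. It deliberately uses the standard library's `html.parser` as a stand-in for lxml so that it has no dependencies; the sample payload, `LinkCollector` and `extract_links` are invented for illustration and are not taken from the engine or from a real Google response.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags and ignore everything else."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Invented sample (assumption): a non-HTML prefix, followed by HTML
# fragments with a script element in between.
SAMPLE_RESPONSE = (
    "1f;[not,really,markup]\n"
    '<div class="g"><a href="https://example.org/one">One</a></div>'
    "<script>var x = 1;</script>"
    '<div class="g"><a href="https://example.org/two">Two</a></div>'
)


def extract_links(text):
    """Feed the whole blob to the parser; non-markup text is treated as data."""
    parser = LinkCollector()
    parser.feed(text)
    parser.close()
    return parser.links
```

Calling `extract_links(SAMPLE_RESPONSE)` returns the two `href` values in document order; the leading non-markup text and the script body are simply passed over as character data. Whether lxml picks the same fragments from the real response is a separate question that this sketch does not answer.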
Disabled by default; it has to be enabled in settings.yml. Related to #159.
A simple stress tool like hey will get you blocked after just 5 minutes. Here is the command that I used:
According to a friend, it seems like this is protobuf data serialized to JavaScript, more about that here: marin-m/pbtk#15 (comment)
We could extract the correct HTML code, that's what I wanted to do at first, but lxml parsed the content without any issues, so I gave up the idea.
It seems like I'm getting more Google answers, maybe it is not parsing the correct one... I don't know.
By the way, it's already being tested on https://searx.be! But before merging it, do you think we should reduce the number of parameters given in the
Note: if settings.yml doesn't include
use_mobile_ui seems to be set to true in the default settings.yml though: https://github.com/searxng/searxng/blob/master/searx/settings.yml#L586
Here is what I tried and what works for me:

    additional_parameters = {}
    if use_mobile_ui:
        additional_parameters = {
            'async': 'use_ac:true,_fmt:pc'
        }
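To make the effect of the reduced parameter set concrete, here is a minimal sketch of how such a dictionary could be merged into the request URL. The function name `build_query_url`, its signature, the default subdomain and the `q`/`start` parameters are assumptions for illustration; the real engine builds its URL with more parameters than shown here.

```python
from urllib.parse import urlencode


def build_query_url(query, offset=0, use_mobile_ui=False, subdomain="www.google.com"):
    """Hypothetical helper: append the mobile-UI parameter only when enabled."""
    additional_parameters = {}
    if use_mobile_ui:
        # The single parameter reported to be sufficient in this thread.
        additional_parameters = {"async": "use_ac:true,_fmt:pc"}

    # Illustrative subset of query parameters (assumption).
    params = {"q": query, "start": offset, **additional_parameters}
    return "https://" + subdomain + "/search?" + urlencode(params)
```

With `use_mobile_ui=True` the URL gains `async=use_ac%3Atrue%2C_fmt%3Apc` (the colon and comma are percent-encoded by `urlencode`); with the default `False` the URL is unchanged.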
Should we also document this enhancement in settings.yml?
Once we have consolidated the development, I can add some docstrings, which could be shown here: https://searxng.github.io/searxng/src/index.html
Oops, I see this PR is merged ... I will open one more PR which reduces the parameters and adds the documentation.
Reverse engineering shows that not all of the parameters used by google's mobile UI (aka "more results" button) are needed [1]. [1] searxng#160 (comment) Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
What does this PR do?
see #159