[refactor] typification of SearXNG (initial) / result items (part 1) #4183

return42 · 2025-01-14T11:06:54Z

Typification of SearXNG

This patch introduces the typing of the results. The why and how are described
in the documentation; please build the documentation ..

$ make docs.clean docs.live

and read the following articles in the "Developer documentation":

result types --> http://0.0.0.0:8000/dev/result_types/index.html

The result types are available from the searx.result_types module. The
following have been implemented so far:

base result type: searx.result_types.Result
--> http://0.0.0.0:8000/dev/result_types/base_result.html
answer results
--> http://0.0.0.0:8000/dev/result_types/answer.html

including the type for translations (inspired by #3925). For all other
types (which still need to be set up in subsequent PRs), template documentation
has been created for the transition period.

Doc of the fields used in Templates

The template documentation is the basis for the typing and is the first complete
documentation of the results (needed for engine development). It is the
"working paper" (the plan) with which further typifications can be implemented
in subsequent PRs.

Document the result fields of the HTML templates from the simple theme #357

Answer Templates

With the new (sub)types for Answer, the templates for the answers have also
been revised. Translation results are now displayed with collapsible entries
(inspired by #3925).

!en-de dog

Plugins & Answerers

The implementation for Plugin and Answer has been revised, see
documentation:

Plugin: http://0.0.0.0:8000/dev/plugins/index.html
Answerer: http://0.0.0.0:8000/dev/answerers/index.html

PluginStorage and AnswerStorage manage those items (in follow-up PRs,
ArticleStorage, InfoStorage and .. will be implemented).

Autocomplete

The autocompletion had a bug: the results from Answer were not shown. To test,
activate autocompletion and try search terms for which we have answerers:

statistics: type min 1 2 3 .. in the completion list you should find an
entry like [de] min(1, 2, 3) = 1
random: type random uuid .. in the completion list, the first item is a
random UUID
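
As a rough illustration of the entry expected for `min 1 2 3`, here is a minimal sketch of such an answer function. It is not the SearXNG statistics answerer: the name `min_answer` and the parsing rules are assumptions made for this example, and the `[de]` prefix shown in the completion list is left out.

```python
from typing import Optional


def min_answer(query: str) -> Optional[str]:
    """Return e.g. 'min(1, 2, 3) = 1' for the query 'min 1 2 3', else None."""
    parts = query.split()
    # Only handle queries of the form "min <int> <int> ...".
    if len(parts) < 2 or parts[0] != "min":
        return None
    try:
        numbers = [int(p) for p in parts[1:]]
    except ValueError:
        # At least one argument is not an integer: no answer.
        return None
    listing = ", ".join(str(n) for n in numbers)
    return f"min({listing}) = {min(numbers)}"


print(min_answer("min 1 2 3"))
```

Queries that do not match, such as `random uuid`, return `None` here and would be left to other answerers.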

Extended Types

SearXNG extends e.g. the request and response types of flask and httpx; a module
has been set up for these type extensions:

Extended Types
--> http://0.0.0.0:8000/dev/extended_types.html

Unit-Tests

The unit tests have been completely revised. In the previous implementation,
the runtime (global variables such as searx.settings) was not initialized
before each test, so the runtime environment in which a test ran was always
determined by the tests that ran before it. This is also why we sometimes saw
non-deterministic errors in the tests in the past:

make test.unit fails with KeyError: ('engine', 'dummy engine', 'search', 'count', 'sent') #2988 is one example for the Runtime
issues, with non-deterministic behavior ..
[l10n] update translations from Weblate #3650
[fix] tear down TEST_ENGINES after TestBang is proceeded #3654
[mod] remove py 3.6 leftovers #3642 (comment)
Fix tineye engine url, datetime parsing, and minor refactor #3746 (comment)
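
The order dependence described above can be reproduced with a few lines of `unittest`. The sketch below is not SearXNG's test code: `SETTINGS`, `DEFAULTS` and `init_runtime()` are hypothetical stand-ins for global runtime state such as searx.settings. Re-initializing in `setUp` gives every test the same starting point, whatever ran before it.

```python
import copy
import unittest

# Hypothetical stand-ins for global runtime state (not SearXNG's real objects).
DEFAULTS = {"engines": ["dummy engine"], "debug": False}
SETTINGS: dict = {}


def init_runtime() -> None:
    """Reset the global SETTINGS to a fresh copy of DEFAULTS."""
    SETTINGS.clear()
    SETTINGS.update(copy.deepcopy(DEFAULTS))


class RuntimeTestCase(unittest.TestCase):
    def setUp(self) -> None:
        # Runs before every test method, so no test inherits leftovers.
        init_runtime()

    def test_a_removes_engine(self) -> None:
        SETTINGS["engines"].remove("dummy engine")
        self.assertEqual(SETTINGS["engines"], [])

    def test_b_expects_default_engine(self) -> None:
        # Would fail after test_a if setUp did not reset the runtime.
        self.assertIn("dummy engine", SETTINGS["engines"])


def run_suite() -> unittest.TestResult:
    """Load and run the two tests above, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RuntimeTestCase)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Without the call in `setUp`, the outcome of `test_b_expects_default_engine` would depend on whether `test_a_removes_engine` ran first.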

Why msgspec.Struct

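We have already discussed typing based on e.g. TypedDict or dataclass in the past:

https://github.com/searxng/searxng/pull/1562/files
https://gist.github.com/dalf/972eb05e7a9bee161487132a7de244d2
https://github.com/searxng/searxng/pull/1412/files
#1356

In my opinion, TypedDict is unsuitable because the objects are still dictionaries
and not instances of classes / the dataclass are classes but ...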
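
The point that TypedDict objects "are still dictionaries" can be checked with the standard library alone. A minimal sketch; `AnswerDict` and `AnswerClass` are hypothetical names used only for this comparison:

```python
from dataclasses import dataclass
from typing import Optional, TypedDict


class AnswerDict(TypedDict):
    answer: str
    url: Optional[str]


@dataclass
class AnswerClass:
    answer: str
    url: Optional[str] = None


as_dict: AnswerDict = {"answer": "lorem ipsum", "url": None}
as_obj = AnswerClass(answer="lorem ipsum")

# A TypedDict value is a plain dict at runtime: the annotation only
# matters to static type checkers.
print(type(as_dict) is dict)

# A dataclass instance is an instance of its own class.
print(isinstance(as_obj, AnswerClass))
```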

msgspec.Struct combines the advantages of typing and runtime behaviour, and
also offers (fast) serialization of the objects, including type checks.

Currently not possible, but conceivable with msgspec: outsourcing the engines
into separate processes. What possibilities this opens up in the future is left
to the imagination!

Internally, we have already agreed that it is desirable to decouple the
development of the engines from the development of the SearXNG core. The
serialization of the Result objects is a prerequisite for this.

HINT: The threads listed above were the template for this PR, even though the
implementation here is based on msgspec. They should also be an inspiration for
the following PRs of typification, as the models and implementations can provide
a good direction.

Why just one commit?

I tried to create several (thematically separated) commits, but gave up at some
point ... there are too many things to tackle at once. The comprehensibility of
the commits would not be improved by a thematic separation. On the contrary, we
would have to make multiple changes in the same places, and the goal of a change
would be only vaguely recognizable in the fog of the commits.

Closes:

Bnyro · 2025-01-16T19:40:20Z

(Side note: after this PR we could probably revisit #3256 as well, to get rid of some hardcoded HTML in the result templates of the weather engine.)

return42 · 2025-01-22T15:58:47Z

I pushed this PR to my instance, where you can test it online: https://darmarit.org/searx

Since this PR does not change the search functionality, everything should work as usual.

The exception is the translation results, which now have collapsible items and links to the translators they originate from:

en-de The quick red fox jumped over the lazy dog

The translation results (inspired by #3925) have been included in this PR to demonstrate what we can do with typification / how it works ..

Bnyro

I've done the steps you described and some other tests and didn't find any bugs, so it seems good from that perspective.

I've been going through the code and trying to understand the typification process, but I'm a bit worried about the current syntax for returning results. Let's take the following base example

results = []
...
Translations(results=results, translations=[foo], url=foo_url)
return results

I have to admit I don't 100% understand how this works from looking at the code:

we're creating a Translations class here, but its result is actually unused
instead it does some magic and changes the results list
where is the url field that we set here defined? I couldn't find its definition anywhere in the classes Translations inherits from
Since this modifies the original results without returning anything, wouldn't that be more suitable for a function?
I'd personally love something like return Translations(translations=[item], url=foo_url) and avoid having to create an empty results array first

Apart from that confusion, that PR looks really awesome, the split of the documentation into different parts for answers, infobox, suggestions, ... is a nice step and I really like the translation answerer appearance change as well as the ability to set custom answerer templates, which is much better than my previous attempt 👍

So if you could please clarify the above point a bit, I'm sure that's good to merge. I'm confident that this is close to bug-free, although we can't really tell until it's in production everywhere. Waiting any longer will probably cause more and more conflicts, so we should get this merged as soon as possible imo.
It's an important step in the right direction, thanks for all your work on that Markus!

return42 · 2025-01-26T09:47:33Z

Let's take the following base example

results = []
...
Translations(results=results, translations=[foo], url=foo_url)
return results

where is the url field that we set here defined? I couldn't find its definition anywhere in the classes Translations inherits from

The class searx.result_types.Result is the base class of all result types, here is the inheritance:

class Result(msgspec.Struct, ..):
    """Base class of all result types :ref:`result types`."""

    url: str | None = None
    """A link related to this *result*"""

    results: list = []   # in the future it will be of type EngineResults (read below)
    """Result list of an :origin:`engine <searx/engines>` .. to which the result item should be added."""

class BaseAnswer(Result, ..):
   ...
class Translations(BaseAnswer, ..):
   ...

I have to admit I don't 100% understand how this works from looking at the code:
* we're creating a `Translations` class here, but its result is actually unused

* instead it does some magic and changes the `results` list

Yeah that looks a bit strange, the alternative notation would be:

results = []
...
x = Translations(results=results, translations=[foo], url=foo_url)
results.append(x)
return results

but we are still at the beginning ... that's why it looks a bit strange at the moment ..
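
How a constructor call can add the new object to a list that is passed in can be sketched with a `__post_init__` hook. The classes below are stand-ins built on the standard library's `dataclass`, not the msgspec-based classes of this PR, and the hook is an assumption about the mechanism rather than a copy of the actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Result:
    """Hypothetical stand-in, not the PR's msgspec-based class."""

    url: Optional[str] = None
    # The list this item should be added to; None means "do not add".
    results: Optional[list] = field(default=None, repr=False, compare=False)

    def __post_init__(self) -> None:
        # Side effect of construction: register the new item in the list.
        if self.results is not None:
            self.results.append(self)


@dataclass
class Translations(Result):
    translations: List[str] = field(default_factory=list)


results: list = []
# The return value is not used; the constructor has already appended it.
Translations(results=results, translations=["Hund"], url="https://example.org")
print(len(results))
```

Because the append happens inside the constructor, the call site shows no visible change to `results`, which is the readability concern raised in the review.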

Since this modifies the original results without returning anything, wouldn't that be more suitable for a function?

I'd personally love something like return Translations(translations=[item], url=foo_url) and avoid having to create an empty results array first

In the future we will have a class for the result list, which will then offer a factory method for the typed results ... I haven't thought it through to the end, but let's assume we have a class like this:

from searx.result_types import Result, Answer, Translation, ...

class EngineResults(list):

    Answer = Answer
    Translation = Translation
    ... = ...

    def new_result(self, res: Result):
        res.results = self
        self.append(res)

And in the engine we have something like ..

from searx.xyz import EngineResults

def response(resp) -> EngineResults:
    results = EngineResults()
    results.new_result(result.Translations(translations=[foo], url=foo_url))
    return results

The above is just a 2min prototype to demonstrate where I want to go: (in the future) the engine developer

should have a factory method,
should not need to import all the result types (they are already available from EngineResults.Answer, EngineResults.Translation, EngineResults.xyz),
should not care much about a result list and how the result items are managed in it.

So if you could please clarify the above point a bit,

I hope the above helps .. I'm open to suggestions, but for now I would like to have an optional Result.results field .. I believe that we will be able to benefit from this in the future if there is context information (an association) between the individual result item and the list to which it is added.

Bnyro · 2025-01-26T11:48:57Z

Thanks for the clarification, that explains how this worked.

results = []
...
result = Translations(translations=[foo], url=foo_url)
results.append(result)
return results

To be honest, I very much prefer this syntax: it makes the code more self-descriptive and clearer to everyone who is implementing an engine.

It's much cleaner this way, as calling Translations(...) doesn't have any side effects and you can directly see what is happening to results because you modified it yourself.

This probably shouldn't be much to change; as you said, it's already possible, and only the documentation and the engines using it would need to be modified (let me know if you want me to handle this if you don't have time for it).

from searx.xyz import EngineResults

def response(resp) -> EngineResults:
    results = EngineResults()
    results.new_result(result.Translations(translations=[foo], url=foo_url))
    return results

That'd indeed be a nice syntax. This is also much closer to the way described above using an array of results, and thus seems very intuitive for engine developers.

Bnyro · 2025-01-26T11:51:42Z

    results: list = []  # https://jcristharif.com/msgspec/structs.html#default-values
    """Result list of an :origin:`engine <searx/engines>` response or a
    :origin:`answerer <searx/answerers>` to which the answer should be added.

    This field is only present for the sake of simplicity.  Typically, the
    response function of an engine has a result list that is returned at the
    end. By specifying the result list in the constructor of the result, this
    result is then immediately added to the list (this parameter does not have
    another function).

    .. code:: python

       def response(resp):
           results = []
           ...
           Answer(results=results, answer=answer, url=url)
           ...
           return results

    """

I see that you actually documented this as well, I just missed that part because I didn't figure out that Translations inherits from the Result base class, sorry for not seeing that earlier.


In [1] and [2] we discussed the need for a Result.results property and how we
can avoid unclear code. This patch implements a class for the result lists of
engines::

    searx.result_types.EngineResults

A simple example for the usage in engine development::

    from searx.result_types import EngineResults
    ...
    def response(resp) -> EngineResults:
        res = EngineResults()
        ...
        res.add( res.types.Answer(answer="lorem ipsum ..", url="https://example.org") )
        ...
        return res

[1] searxng#4183 (review)
[2] searxng#4183 (comment)

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>

return42 · 2025-01-27T16:29:30Z

That'd indeed be a nice syntax. This is also much closer to the way described above using an array of results, and thus seems very intuitive for engine developers.

@Bnyro I pushed a new commit on top .. in this patch class searx.result_types.EngineResults is implemented and used in the engines:

    from searx.result_types import EngineResults
    ...
    def response(resp) -> EngineResults:
        res = EngineResults()
        ...
        res.add( res.types.Answer(answer="lorem ipsum ..", url="https://example.org") )
        ...
        return res

BTW the property Result.results is no longer needed / I dropped this field.
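
For readers without the source tree at hand, the shape of this API can be sketched in a few lines. This is a hypothetical stand-in, not the implementation in searx.result_types: `Answer` is a plain dataclass here, and exposing the types through a `SimpleNamespace` is an assumption made for the example.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Optional


@dataclass
class Answer:
    """Hypothetical stand-in for the typed answer result."""

    answer: str
    url: Optional[str] = None


class EngineResults(list):
    """Hypothetical sketch of a result list that exposes the result types."""

    # Result types are reachable from the list itself, so engine code
    # needs only one import.
    types = SimpleNamespace(Answer=Answer)

    def add(self, result) -> None:
        """Append a typed result item to this list."""
        self.append(result)


def response(resp) -> EngineResults:
    res = EngineResults()
    res.add(res.types.Answer(answer="lorem ipsum ..", url="https://example.org"))
    return res
```

Constructing a result has no side effect in this form; the only place the list changes is the explicit `res.add(...)` call.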

Bnyro · 2025-01-27T17:11:42Z


Awesome, thanks for your work!

Before searxng#4183, a builtin plugin was *default_on* when it was listed in the
"enabled_plugins" settings; this patch restores the previous behavior.

Not part of this patch, but worth mentioning in the context of searxng#4263: in
the long term, we will abolish the "enabled_plugins:" setting and combine all
options for the plugins in the "plugins:" setting, as is already planned in PR
searxng#4282.

Closes: searxng#4263
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>


return42 mentioned this pull request Jan 14, 2025

[refactor] translation engines: common interface #3925

Closed

return42 force-pushed the translation-answer-template branch from bc1cf66 to e4e7d73 Compare January 14, 2025 11:18

This was referenced Jan 14, 2025

Implement typed models for searx/answerers #3316

Closed

[enh] add pyproject.toml listing build requirements #4141

Draft

Bnyro self-requested a review January 16, 2025 19:12

return42 force-pushed the translation-answer-template branch from e4e7d73 to 8cddc4e Compare January 22, 2025 14:39

return42 mentioned this pull request Jan 23, 2025

[feat] engines: add tavily (AI powered) #4221

Open

Bnyro mentioned this pull request Jan 24, 2025

Add lara translate engine #4224

Open

Bnyro reviewed Jan 25, 2025

Bnyro and others added 2 commits January 27, 2025 14:30

[refactor] translation engines: common interface

f33ac10

return42 force-pushed the translation-answer-template branch from 8cddc4e to 455d258 Compare January 27, 2025 13:31

Bnyro approved these changes Jan 27, 2025

return42 merged commit 36a1ef1 into searxng:master Jan 28, 2025
9 checks passed

return42 deleted the translation-answer-template branch January 28, 2025 06:07

This was referenced Jan 28, 2025

Common interface for translation engines #3576

Closed

Document the result fields of the HTML templates from the simple theme #357

Closed

GenericMale added a commit to GenericMale/searxng that referenced this pull request Jan 29, 2025

[fix] rerank plugin: adapt to searxng#4183

ab83de9

return42 mentioned this pull request Jan 30, 2025

Hostnames plugin defaults to off even though it is enabled in the settings.yml #4263

Closed

Bnyro mentioned this pull request Feb 2, 2025

Remove hard coded HTML from engine result list. #1352

Open

5 tasks

theTyster mentioned this pull request Feb 4, 2025

Hostnames plugin no longer has an opt-out state #4280

Closed

return42 mentioned this pull request Feb 28, 2025

[fix] add backward compatibility for the "enabled_plugins:" setting #4391

Merged

return42 mentioned this pull request Mar 5, 2025

[refactor] typification of SearXNG (MainResult) / result items (part 2) #4424

Merged

return42 mentioned this pull request Apr 11, 2025

[fix] issues when launching a local development server #4603

Merged

return42 mentioned this pull request Apr 24, 2025

CI often fails for no reason #3983

Closed

return42 mentioned this pull request Jul 8, 2025

[fix] calculator plugin: subrocess is not closed on timeout #4983

Merged
