这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@return42
Copy link
Member

@return42 return42 commented Jul 31, 2025

I tried various type checkers, focusing primarily on the features of the LSP. basedyright, or rather, the LSP basedpyright-langserver, convinced me.

For integration into your IDE (editor) use the basedpyright-langserver (LSP) installed by pipx basedpyright and read:

The $REPO_ROOT/pyrightconfig.json uses the virtualenv found in $REPO_ROOT/local/py3 and can be created by a make pyenv


  • pyrightconfig.json :
    for the paths searx, searxng_extra and tests, individual rules were defined (for example, in test fewer / different rules are needed than in the searx package

  • searx/engines/__builtins__.pyi :
    The builtin types that are added to the global namespace of a module by the intended monkey patching of the engine modules / replaces the previous filtering of the stdout using grep.

  • test.pyright_modified (utils/lib_sxng_test.sh) :
    static type check of local modified files not yet commited

  • make test :
    prerequisite test.pyright has been replaced by test.pyright_modified

  • searx/engines/__init__.py, searx/enginelib/__init__.py :
    First, minimal typifications that were considered necessary.


I think we need to put more effort into typing, but we shouldn't try to type everything either; that just creates unnecessary work and is frustrating. Python is and remains a dynamically typed language, but types can improve quality and, with the LSP (IDE), positively impact productivity. In this sense, we should also align our goals with regard to typing.

categories: list[str]
disabled: bool
display_error_messages: bool
enable_http: bool
Copy link
Member

@Bnyro Bnyro Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also possible to override values from the outgoing: yaml settings as mentioned in

# overwrite values from section 'outgoing:'
enable_http2: false
retries: 1
max_connections: 100
max_keepalive_connections: 10
keepalive_expiry: 5.0
using_tor_proxy: false
proxies:
http:
- http://proxy1:8080
- http://proxy2:8080
https:
- http://proxy1:8080
- http://proxy2:8080
- socks5://user:password@proxy3:1080
- socks5h://user:password@proxy4:1080
, e.g. enable_http2. I'm not sure if it makes sense to add them here?

Copy link
Member Author

@return42 return42 Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it makes sense to add them here?

🤔 are these options really related to the engine .. for me these options are related to the outgoing: settings, since it overwrites those settings .. before we cover these types in the engine config, we should first port the type definitions in settings_defaults.py to msgspec.Struct.

When we have a msgspec.Struct for ...

'outgoing': {
'useragent_suffix': SettingsValue(str, ''),
'request_timeout': SettingsValue(numbers.Real, 3.0),
'enable_http2': SettingsValue(bool, True),
'verify': SettingsValue((bool, str), True),
'max_request_timeout': SettingsValue((None, numbers.Real), None),
'pool_connections': SettingsValue(int, 100),
'pool_maxsize': SettingsValue(int, 10),
'keepalive_expiry': SettingsValue(numbers.Real, 5.0),
# default maximum redirect
# from https://github.com/psf/requests/blob/8c211a96cdbe9fe320d63d9e1ae15c5c07e179f8/requests/models.py#L55
'max_redirects': SettingsValue(int, 30),
'retries': SettingsValue(int, 0),
'proxies': SettingsValue((None, str, dict), None),
'source_ips': SettingsValue((None, str, list), None),
# Tor configuration
'using_tor_proxy': SettingsValue(bool, False),
'extra_proxy_timeout': SettingsValue(int, 0),
'networks': {},
},

we can rethink how we add this type to the engine builtins .. but I hope we will implement a Engine class before we add more and more typing for all these monkey patched stuff into the builtins.


BTW we should not mix all these names of config options into a flat namespace / on the long term we should replace it by outgoing section in the engine config. This will simplify the coding a lot and prevents name collisions.

- name: example
  engine: example
  shortcut: demo
  outgoing:
    enable_http2: false
      proxies:
        http:
         - http://proxy1:8080
         - http://proxy2:8080
       https:
         - socks5://user:password@proxy3:1080
         - socks5h://user:password@proxy4:1080

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW we should not mix all these names of config options into a flat namespace / on the long term we should replace it by outgoing section in the engine config. This will simplify the coding a lot and prevents name collisions.

Totally agree on that, I found it very confusing when I used it once to change enable_http2.

@inetol
Copy link
Member

inetol commented Aug 8, 2025

I tried various type checkers, focusing primarily on the features of the LSP. basedyright, or rather, the LSP basedpyright-langserver, convinced me.

Have we considered Ruff? I came across this now while looking into uv, at first glance, looks like it replaces every existing linters and formatters, like Biome in JS stack does.

@return42
Copy link
Member Author

return42 commented Aug 8, 2025

Have we considered Ruff? I came across this now while looking into uv, at first glance, looks like it replaces every existing linters and formatters

I had a look at ruff in some other projects I had / hatch for example uses ruff by default.

However, I was not convinced by the quality... compared to the trio consisting of pylint, basedpyright & black.

I'm aware that many people prefer Ruff because it's generally considered the super-fast, all-purpose tool. Personally, I prefer basedpyright's LSP (I haven't encountered any speed issues yet), it's easy to integrate into the IDE, and (as far as I know) has more options.

I'd also like to keep pylint in our toolchain, as I've had good experiences with it as well. Please keep in mind that we only have a very small amount of static types. The vast majority of the code is and probably always will be dynamically typed.

For formatting, we have a third tool (black). I'm aware that it certainly seems easier to have one tool for all three tasks (the all-purpose tool). But true to the Linux principle of "do one thing and do it well," I think it's justified to have three tools in use here.

timeout: int
tokens: list[str]
using_tor_proxy: bool

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EngineTraits type at module name traits is missed ..

trait_map = EngineTraitsMap.from_data()
trait_map.set_traits(engine)

engine.traits = traits

@return42
Copy link
Member Author

return42 commented Aug 8, 2025

Have we considered Ruff?

While implementing #5101 I looked once more into Ruff & LSP .. they had an rewrite of the ruff-lsp last year:

With the new ruff server (LSP) integrated in my Emacs I realized that Ruff only does linting and formatting right now. Ruff has no implementation for a Static Type Checker and the LSP can therefore only implement a fraction of the functionality one would expect. Even finding a (function) definition is not possible.


TL;DR;

basedpyright is a Static Type Checker for Python and the LSP implements (all) function one would expect, here by example we have a Literal with a catalog of names (weather.WeatherConditionType):

grafik

A literal not from the catalog is detected as a type issue ..

grafik

The LSP from Ruff is not able to follow type hints .. no type issue is reported ("clear" isn't a valid value for this Literal) and no completion catalog:

grafik

All issues Ruff LSP has found are trivials, like unknown names (e.g. traits):

grafik

The announcement of ASTRAL last year was still...

grafik

.. but they are still far from a minimal feature set.

ASTRAL repeatedly emphasizes (with a pinch of arrogance) how fast their tools are... which should surely be a given if these tools just implement a fraction of the features.

"Faster" cannot be equated with "more accurate."

@inetol
Copy link
Member

inetol commented Aug 8, 2025

That's weird. I tried ruff following standard installation procedure and it doesn't provide completion for dictionary values: (idk what is this "for")

sh20250808211512

...But it does resolve replace the type with all the literals on the warning msg (messy but working?):

sh20250808212135

...I tried setting up a simple constant with the bare type and STC works there:

sh20250808211217

Probably an LSP server bug (ruff-lsp isn't working either), I haven't found any issues related on their repo.

@Bnyro
Copy link
Member

Bnyro commented Aug 9, 2025

Just to add my two cents:

I can confirm your experiences with ruff so far: It's definitely fine for formatting code, but it's certainly not a fully featured LSP that could replace Pylsp or Pyright as of now.

We could consider replacing black with ruff for formatting, but I think that there's no real benefit in switching to ruff. It currently runs within a second (depending on the hardware), see screenshot below

Screenshot from 2025-08-09 10-55-45

@return42
Copy link
Member Author

return42 commented Aug 9, 2025

but it's certainly not a fully featured LSP that could replace Pylsp or Pyright as of now.

I recommend to use basedpyright (not pyright) and I install all the needed tools into my HOME by pipx:

$ pipx install basedpyright ruff uv hatch 'python-lsp-server[all]'

Don't be confused by the PyLSP names:

@inetol
Copy link
Member

inetol commented Aug 9, 2025

We could consider replacing black with ruff for formatting, but I think that there's no real benefit in switching to ruff. It currently runs within a second (depending on the hardware), see screenshot below

From what's listed in https://docs.astral.sh/ruff/formatter/#style-guide and comparing with Black https://docs.astral.sh/ruff/formatter/black/, I noticed that we have a little more magic involved in comments, we could try this in another PR and see if the changes are acceptable.

@return42
Copy link
Member Author

FWIW I also tested the language server from https://zubanls.com/ (alpha) .. When it comes to speed, the emphasis is also on “speed” (I'm tired of hearing it) ... Zuban compares itself to mypy in terms of speed ... but unfortunately, it doesn't come close to having the same range of functions.

I feel like I've tried every Python LSP on the planet :-) ... Most are reimplementations of linterns, but they can't be compared to pylint either. Others integrate pylint, but still don't have strong type checking.

In the IDE, pyright and its successor basedpyright work best. They show the programmer most of the problems and offer good support for autocompletion.

In the build chain, pylint provides us with additional checks that a static type checker does not have in its scope, and we have black for code formatting... I think we are in a very good position and should not change our Python toolchain any further.


If there are no objections, I would like to merge the PR.

// There will still be a lot of fine-tuning possible in the configuration in pyrightconfig.json, but we should first gather some experience, which will then also be possible in the IDE with the merge of this PR.

@return42
Copy link
Member Author

return42 commented Aug 11, 2025

A little OT, but for the sake of completeness, here is an excerpt from my scrapbook of what I have installed for my SearXNG development tasks:

;; mise-en-place: https://mise.jdx.dev/installing-mise.html
;;
;;   $ mise settings set python.compile false
;;   $ mise use -g usage go node python shellcheck shfmt
;;
;;   Configure npm to install global executables to ~/.local/bin and the
;;   libraries to ~/.local/lib/node_modules/
;;
;;   $ npm config set prefix ~/.local
;;
;;   $ mise use -g jq # https://jqlang.org/
;;   $ mise use -g rust  # rustup command --> https://www.rust-lang.org/tools/install
;;   $ mise use -g github:kristoff-it/superhtml  # https://github.com/kristoff-it/superhtml
;;   ...
;;   $ mise self-update
;;   $ mise upgrade --bump  # Upgrades to latest version, bumping the version in mise.toml
;;   ...
;;   $ pipx upgrade-all
;;   ...
;;   $ npm update -g
;;   ...
;;   $ rustup update
;;
;; Python development:
;;
;;   $ pipx install \
;;       uv hatch black \
;;       basedpyright ruff  'python-lsp-server[all]' pylsp-mypy jedi-language-server zuban
;;
;; JavaScript development:
;;
;;   Typescript LSP alternative: vtsls --> https://github.com/yioneko/vtsls
;;
;;   $ npm install -g typescript-language-server typescript
;;   $ npm install -g @vtsls/language-server
;;   $ npm install -g eslint eslint-plugin-json
;;
;; Rust Development
;;
;;   $ rustup component add rust-analyzer
;;
;; Bash scripting:
;;
;;   $ mise use -g shellcheck shfmt
;;   $ npm install -g bash-language-server
;;
;; Language Servers:
;;
;;   $ npm install -g yaml-language-server
;;   $ npm install -g vscode-langservers-extracted  # html css json eslint

@inetol
Copy link
Member

inetol commented Aug 11, 2025

Shouldn't the root package.json be removed, as it was only used for pyright?

@return42
Copy link
Member Author

FYI: last force push was a just a rebase on branch master to solve a conflict.

pyrightconfig.json :

  for the paths searx, searxng_extra and tests, individual rules were
  defined (for example, in test fewer / different rules are needed than in the
  searx package

searx/engines/__builtins__.pyi :

  The builtin types that are added to the global namespace of a module by the
  intended monkey patching of the engine modules / replaces the previous
  filtering of the stdout using grep.

test.pyright_modified (utils/lib_sxng_test.sh) :

  static type check of local modified files not yet commited

make test :

  prerequisite 'test.pyright' has been replaced by 'test.pyright_modified'

searx/engines/__init__.py, searx/enginelib/__init__.py :

  First, minimal typifications that were considered necessary.
@return42
Copy link
Member Author

return42 commented Aug 19, 2025

Shouldn't the root package.json be removed, as it was only used for pyright?

I removed the pyright dependency diff but not the entire file: The package.json is used to trace (dependantbot) our developer tools (NodeJS), even if we currently do not need tools from the NodeJS eco system (at least not on this level of the project), we should leave the file there so that we can add any tools we may need later .. similar to how we do it with the developer tools from the Go ecosystem:

@return42 return42 requested a review from inetol August 19, 2025 09:49
@return42 return42 merged commit 25647c2 into searxng:master Aug 19, 2025
7 checks passed
@return42 return42 deleted the basedpyright branch August 19, 2025 10:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants