Adding a cache to speed up SearXNG #3386
Replies: 4 comments 6 replies
-
Such a "simple" caching method will break SearXNG's functionality, because parameters of a query such as the language, the active engines, and more are not taken into account. If we think about the variation of query parameters and the freshness of the results, we come to the conclusion that a simple caching method is not practicable; a few reasons for this have already been given here in the thread by me and others. I don't want to stifle the discussion in principle, but it should take place at a higher level of abstraction and take all relevant aspects into consideration. The solution proposed here should not find any imitators, which is why I unfortunately have to close this discussion now.
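To illustrate the point about query parameters: a cache keyed on the raw query string alone would return the same entry for requests that differ in language, engine selection, or safe-search level. A minimal sketch of a composite cache key (field names here are illustrative assumptions, not SearXNG internals):

```python
# Hypothetical sketch: build a cache key from all result-affecting
# parameters, not just the query string. Field names are assumptions.
import hashlib
import json

def cache_key(query: str, lang: str, engines: list, safesearch: int) -> str:
    # Serialize deterministically so equivalent requests hash identically.
    payload = json.dumps(
        {"q": query, "lang": lang, "engines": sorted(engines), "safe": safesearch},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With this, the same query in different languages yields different keys, while engine order does not matter.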
-
My implementation was super simple: name the cache file after the search term and check whether the file exists; if so, display it, otherwise perform the search as normal. I recently updated the code. LLMs like ChatGPT are utilizing search engines now, and caching could speed this up substantially since the LLM would not have to go out to the web; it would have to be a distributed database, or alternatively the nearest fast proxy instance. It could be opt-in; if privacy is paramount, the user can opt out. If you want to experiment with caching results locally, the updated code is here (no affiliation with searx or SearXNG, just for testing): https://www.imtcoin.com/kb.php?page=Caching+SearXNG&redirect=no
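The approach described above can be sketched in a few lines. This is a minimal illustration, not the linked code: the cache directory and function names are assumptions, and the query is hashed so arbitrary search terms become safe file names.

```python
# Hedged sketch of "name the cache file after the search term, serve it
# if it exists, otherwise search and store". Not the author's code.
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path("/usr/local/searxng/cache")  # assumed location

def cache_path(query: str) -> Path:
    # Hash the query so any search term maps to a safe file name.
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return CACHE_DIR / digest

def cached_search(query: str, do_search) -> dict:
    path = cache_path(query)
    if path.exists():
        # Cache hit: return the stored results without searching.
        return json.loads(path.read_text(encoding="utf-8"))
    results = do_search(query)  # cache miss: perform a normal search
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(results), encoding="utf-8")
    return results
```

Note this sketch inherits the problem the maintainer describes: it keys only on the query string, ignoring language and engine parameters.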
-
For reference, the fork from the /e/ foundation implements a cache: https://gitlab.e.foundation/e/infra/spot. However, I can't find it in the code. Public instance: https://spot.murena.io/
-
This seems like a reasonable feature to add, and it could be beneficial to public server operators.

I don't see a TTL in this: we will need to cache results for some period of time and purge them from storage afterwards, both to limit storage growth and to avoid stale results. This should also be disabled by default in the settings, at least for the initial implementation. We may want to ask specific public-instance maintainers whether this feature is useful for them by checking whether it reduces the number of outbound IO requests. I also think we will need metrics that track how many requests were served from the cache versus IO-bound, so we can measure how useful it is.

Further, I definitely think it makes sense to implement a generic cache API layer that can connect to different backends (file storage, Redis-like, SQLite, etc.) for generic key-value storage. This would let the feature work with whatever backend is available given the maintainer's server constraints (e.g. lots of file storage but low memory, or lots of memory to spare).

With this, would you mind if I tried to continue this idea further? I'd also like to hear an opinion from @return42 on this, as they have a much better understanding of the current state of SearXNG's internals than I do. I searched the SearXNG docs and the default settings.yml and didn't see anything for a feature like this.
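The generic cache API layer with TTL described above could look something like this. This is a hedged sketch under stated assumptions: the `CacheBackend` interface and `MemoryBackend` class are invented names for illustration, not existing SearXNG APIs.

```python
# Sketch of a pluggable key-value cache layer with per-entry TTL.
# Interface and class names are assumptions, not SearXNG internals.
import time
from abc import ABC, abstractmethod
from typing import Optional

class CacheBackend(ABC):
    """Minimal contract a backend (file, Redis-like, SQLite) could implement."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the cached value, or None if missing or expired."""

    @abstractmethod
    def set(self, key: str, value: bytes, ttl: int) -> None:
        """Store a value that expires after ttl seconds."""

class MemoryBackend(CacheBackend):
    """In-memory reference backend; stale entries are purged lazily on read."""

    def __init__(self):
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: purge and report a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (time.monotonic() + ttl, value)
```

A file-storage or Redis-backed class would implement the same two methods, so the search path only ever talks to `CacheBackend` and the operator picks the backend in configuration.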
-
Update from the maintainer: please take note of #3386 (comment)
I added a cache; here is how I did it. First, create the cache shard directories:
/usr/local/searxng/searxng-src/searx/cache/0
...
/usr/local/searxng/searxng-src/searx/cache/9
/usr/local/searxng/searxng-src/searx/cache/a
...
/usr/local/searxng/searxng-src/searx/cache/z
Edit the file __init__.py in /usr/local/searxng/searxng-src/searx/search/__init__.py, in class Search:
```
class Search:
```
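The code block above appears truncated, so here is a hedged sketch of the approach it describes: cache files sharded into the per-character directories (cache/0 through cache/z) and a cache check in the search entry point. This is not the author's original patch; everything except the `Search` class name is an assumption for illustration.

```python
# Illustrative sketch only: shard cached results into cache/<char>/
# directories and short-circuit the search on a cache hit.
import hashlib
import json
from pathlib import Path

CACHE_ROOT = Path("/usr/local/searxng/searxng-src/searx/cache")  # assumed

def shard_path(query: str) -> Path:
    # Pick the shard directory from the query's first character
    # (matching the 0-9/a-z layout above), with "0" as a fallback.
    first = query[0].lower() if query and query[0].isalnum() else "0"
    safe_name = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return CACHE_ROOT / first / safe_name

class Search:
    def __init__(self, search_query):
        self.search_query = search_query

    def search(self):
        path = shard_path(self.search_query)
        if path.exists():
            # Cache hit: serve stored results without querying engines.
            return json.loads(path.read_text(encoding="utf-8"))
        results = self._real_search()
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(results), encoding="utf-8")
        return results

    def _real_search(self):
        # Placeholder for SearXNG's actual engine dispatch.
        return {"query": self.search_query, "results": []}
```

As noted in the maintainer's closing comment, keying on the query string alone ignores language and engine settings, so treat this strictly as an experiment.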
Simple as that. If you have any questions, improvements, or suggestions, let me know how it goes.