
Conversation

@timothycarambat
Member

@timothycarambat timothycarambat commented Dec 6, 2023

  • Add support for the 2,000+ GGUF models that can be used as the LLM of choice, enabling AnythingLLM to run 100% privately.
  • Support synchronous chat and streaming responses from models (see the sketch after this list).
  • Add documentation to support this feature.
  • Hide the LLM selection when on the hosted domain.
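For context, a minimal sketch of what synchronous and streaming chat can look like with node-llama-cpp (the v2-era API this PR builds on); the model path and prompts are placeholders, and AnythingLLM's actual wiring may differ:

```js
import { LlamaModel, LlamaContext, LlamaChatSession } from "node-llama-cpp";

// Placeholder model path; any local GGUF file works here.
const model = new LlamaModel({ modelPath: "./models/llama-2-7b-chat.Q4_K_M.gguf" });
const context = new LlamaContext({ model });
const session = new LlamaChatSession({ context });

// Sync-style chat: resolve the full response at once.
const answer = await session.prompt("What is AnythingLLM?");
console.log(answer);

// Streaming: decode and emit tokens as they are generated.
await session.prompt("Summarize that in one sentence.", {
  onToken(chunk) {
    process.stdout.write(context.decode(chunk));
  },
});
```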

@timothycarambat timothycarambat changed the title from "[Feature] Ship AnythingLLM with built-in LLM for chatting" to "[Feature] AnythingLLM use locally hosted Llama.cpp and GGUF files for inferencing" Dec 6, 2023
@Mintplex-Labs Mintplex-Labs deleted a comment from review-agent-prime bot Dec 6, 2023
@timothycarambat timothycarambat added enhancement New feature or request Integration Request Request for support of a new LLM, Embedder, or Vector database labels Dec 6, 2023
@Amejonah1200

Is GPU Acceleration possible for this feature?

@timothycarambat
Member Author

timothycarambat commented Dec 7, 2023

@Amejonah1200

Is GPU Acceleration possible for this feature?

  • node-llama-cpp using CUDA is supported
  • Apple Metal is also supported

Once the node-llama-cpp binaries are built, the Llama client will use GPU acceleration. For Docker instances, however, this is currently CPU-only, so some additional work will need to be done to bind GPUs to the container before they can be used.
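For illustration, a hedged sketch of enabling GPU offload when loading a GGUF model with node-llama-cpp; the model path is a placeholder, and the `gpuLayers` value should be tuned to available VRAM:

```js
import { LlamaModel } from "node-llama-cpp";

// Assumes the node-llama-cpp binaries were built with CUDA or Metal support.
const model = new LlamaModel({
  modelPath: "./models/llama-2-7b-chat.Q4_K_M.gguf", // placeholder path
  gpuLayers: 32, // layers offloaded to the GPU; 0 keeps inference on the CPU
});
```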

@Amejonah1200

Amejonah1200 commented Dec 7, 2023

@timothycarambat

For Docker instances, however, this is currently CPU-only, so some additional work will need to be done to bind GPUs to the container before they can be used.

It is possible to bind the GPU:
https://docs.docker.com/compose/gpu-support/

As I use WSL, I would need to install the toolkit too; I will see how this plays out later on. Thanks.
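For reference, the linked page describes reserving GPUs via the `deploy.resources.reservations.devices` key in Compose; a minimal sketch (the service and image names are illustrative, and the NVIDIA Container Toolkit must be installed on the host):

```yaml
services:
  anything-llm:
    image: mintplexlabs/anythingllm # assumed image name, adjust as needed
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1            # number of GPUs to reserve
              capabilities: [gpu]
```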

@timothycarambat timothycarambat merged commit 655ebd9 into master Dec 7, 2023
@timothycarambat timothycarambat deleted the built-in-llm branch December 7, 2023 22:48
@timothycarambat timothycarambat restored the built-in-llm branch March 14, 2024 20:08
@timothycarambat timothycarambat deleted the built-in-llm branch April 16, 2024 22:43
cabwds pushed a commit to cabwds/anything-llm that referenced this pull request Jul 3, 2025
AnythingLLM use locally hosted Llama.cpp and GGUF files for inferencing (Mintplex-Labs#413)

* Implement use of native embedder (all-MiniLM-L6-v2)
stop showing Prisma queries during dev

* Add native embedder as an available embedder selection

* wrap model loader in try/catch

* print progress on download

* add built-in LLM support (experimental)

* Update to progress output for embedder

* move embedder selection options to component

* safety checks for modelfile

* update ref

* Hide selection when on hosted subdomain

* update documentation
hide localLlama when on hosted

* safety checks for storage of models

* update dockerfile to pre-build Llama.cpp bindings

* update lockfile

* add langchain doc comment

* remove extraneous --no-metal option

* Show data handling for private LLM

* persist model in memory for N+1 chats

* update import
update dev comment on token model size

* update primary README

* chore: more readme updates and remove screenshots - too much to maintain, just use the app!

* remove screenshot link
