
Conversation

@timothycarambat
Member

@timothycarambat timothycarambat commented Dec 6, 2023

  • Add support for the 2,000+ GGUF models that can be used as the LLM of choice, enabling AnythingLLM to run 100% privately.
  • Support synchronous chat and streaming responses from models (see the sketch after this list).
  • Add documentation to support this feature.
  • Hide the LLM selection when on the hosted domain.
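For context, a minimal sketch of what synchronous and streaming chat can look like with node-llama-cpp (the v2-era API this PR builds on); the model path and prompts are placeholders, and AnythingLLM's actual wiring may differ:

```js
import { LlamaModel, LlamaContext, LlamaChatSession } from "node-llama-cpp";

// Placeholder model path; any local GGUF file works here.
const model = new LlamaModel({ modelPath: "./models/llama-2-7b-chat.Q4_K_M.gguf" });
const context = new LlamaContext({ model });
const session = new LlamaChatSession({ context });

// Sync-style chat: resolve the full response at once.
const answer = await session.prompt("What is AnythingLLM?");
console.log(answer);

// Streaming: decode and emit tokens as they are generated.
await session.prompt("Summarize that in one sentence.", {
  onToken(chunk) {
    process.stdout.write(context.decode(chunk));
  },
});
```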

@timothycarambat timothycarambat changed the title from "[Feature] Ship AnythingLLM with built-in LLM for chatting" to "[Feature] AnythingLLM use locally hosted Llama.cpp and GGUF files for inferencing" Dec 6, 2023
@Mintplex-Labs Mintplex-Labs deleted a comment from review-agent-prime bot Dec 6, 2023
@timothycarambat timothycarambat added enhancement New feature or request Integration Request Request for support of a new LLM, Embedder, or Vector database labels Dec 6, 2023
@Amejonah1200

Is GPU Acceleration possible for this feature?

@timothycarambat
Member Author

timothycarambat commented Dec 7, 2023

@Amejonah1200

Is GPU Acceleration possible for this feature?

  • node-llama-cpp using CUDA is supported
  • Apple Metal is also supported

Once the node-llama-cpp binaries are built, the Llama client will use GPU acceleration. For Docker instances, however, this is currently CPU-only, so some additional work will need to be done to bind GPUs to the container before they can be used.
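For illustration, a hedged sketch of enabling GPU offload when loading a GGUF model with node-llama-cpp; the model path is a placeholder, and the `gpuLayers` value should be tuned to available VRAM:

```js
import { LlamaModel } from "node-llama-cpp";

// Assumes the node-llama-cpp binaries were built with CUDA or Metal support.
const model = new LlamaModel({
  modelPath: "./models/llama-2-7b-chat.Q4_K_M.gguf", // placeholder path
  gpuLayers: 32, // layers offloaded to the GPU; 0 keeps inference on the CPU
});
```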

@Amejonah1200

Amejonah1200 commented Dec 7, 2023

@timothycarambat

For Docker instances, however, this is currently CPU-only, so some additional work will need to be done to bind GPUs to the container before they can be used.

It is possible to bind the GPU:
https://docs.docker.com/compose/gpu-support/

As I use WSL, I would need to install the toolkit too; I will see how this plays out later on. Thanks.
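For reference, the linked page describes reserving GPUs via the `deploy.resources.reservations.devices` key in Compose; a minimal sketch (the service and image names are illustrative, and the NVIDIA Container Toolkit must be installed on the host):

```yaml
services:
  anything-llm:
    image: mintplexlabs/anythingllm # assumed image name, adjust as needed
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1            # number of GPUs to reserve
              capabilities: [gpu]
```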

@timothycarambat timothycarambat merged commit 655ebd9 into master Dec 7, 2023
@timothycarambat timothycarambat deleted the built-in-llm branch December 7, 2023 22:48
@timothycarambat timothycarambat restored the built-in-llm branch March 14, 2024 20:08
@timothycarambat timothycarambat deleted the built-in-llm branch April 16, 2024 22:43
cabwds pushed a commit to cabwds/anything-llm that referenced this pull request Jul 3, 2025
AnythingLLM use locally hosted Llama.cpp and GGUF files for inferencing (Mintplex-Labs#413)

* Implement use of native embedder (all-MiniLM-L6-v2)
stop showing Prisma queries during dev

* Add native embedder as an available embedder selection

* wrap model loader in try/catch

* print progress on download

* add built-in LLM support (experimental)

* Update to progress output for embedder

* move embedder selection options to component

* safety checks for modelfile

* update ref

* Hide selection when on hosted subdomain

* update documentation
hide localLlama when on hosted

* safety checks for storage of models

* update dockerfile to pre-build Llama.cpp bindings

* update lockfile

* add langchain doc comment

* remove extraneous --no-metal option

* Show data handling for private LLM

* persist model in memory for N+1 chats

* update import
update dev comment on token model size

* update primary README

* chore: more readme updates and remove screenshots - too much to maintain, just use the app!

* remove screenshot link
