
Tags: gty111/gLLM

v0.0.3

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Bump up to version 0.0.3 (#81)

v0.0.2

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Support TP 🎉 (#72)

* Initial support for TP

* Use random initialization

* Fix PP forward

* Downgrade to torch 2.6.0

* Fix env setting for MAX_JOBS

* Downgrade to torch 2.5.1

* Fix TP group init

* Fix annotation

* Make llama compatible for TP

* Make chatglm compatible for TP

* Make Qwen3 compatible for TP

* Remove weight_loader in fused_moe

* Make fused_moe compatible for TP; Abstract weight load function

* Make qwen_moe compatible for TP

* Make mixtral compatible for TP

* Update readme

* Abstract module attention; Clean up code for TP attention; Clean up code for model weights loading for glm

* Add MoE tuning config for A100 PCIe 40GB

* Refactor scheduler.py and AllocatorID

* Refactor IDAllocator

* Refactor worker scheduler

* Update readme

* Make embed_tokens and lm_head compatible for TP

* Fix multi-node zmq_comm

* Bump version to 0.1.0

v0.0.1

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Add pyproject.toml (#62)
