Tags: gty111/gLLM
Support TP 🎉 (#72)

* Initial support for TP (see the column-parallel linear sketch after this list)
* Use random initialization
* Fix PP forward
* Downgrade to torch 2.6.0
* Fix env setting for MAX_JOBS
* Downgrade to torch 2.5.1
* Fix TP group init
* Fix annotation
* Make llama compatible for TP
* Make chatglm compatible for TP
* Make Qwen3 compatible for TP
* Remove weight_loader in fused_moe
* Make fused_moe compatible for TP; abstract weight load function
* Make qwen_moe compatible for TP
* Make mixtral compatible for TP
* Update readme
* Abstract module attention; clean up code for TP attention and for glm model weight loading
* Add MoE tuning config for A100 PCIe 40GB
* Refactor scheduler.py and AllocatorID
* Refactor IDAllocator (sketched after this list)
* Refactor worker scheduler
* Update readme
* Make embed_tokens and lm_head compatible for TP
* Fix multi-node zmq_comm (see the ZeroMQ sketch after this list)
* Bump version to 0.1.0
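The per-model "compatible for TP" commits all apply the same underlying idea: shard each large weight matrix across the tensor-parallel ranks and recombine the partial activations with a collective. Below is a minimal sketch of a column-parallel linear layer in that style, built on `torch.distributed`; the class name and `tp_size` plumbing are illustrative assumptions, not gLLM's actual API (the random initialization mirrors the "Use random initialization" commit):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.distributed as dist


class ColumnParallelLinear(nn.Module):
    """Linear layer whose output dimension is sharded across TP ranks."""

    def __init__(self, in_features: int, out_features: int, tp_size: int):
        super().__init__()
        assert out_features % tp_size == 0, "out_features must be divisible by tp_size"
        self.tp_size = tp_size
        # Each rank stores only its (out_features / tp_size)-row shard of W.
        self.weight = nn.Parameter(torch.empty(out_features // tp_size, in_features))
        nn.init.normal_(self.weight, std=0.02)  # random init stands in for checkpoint loading

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        local_out = F.linear(x, self.weight)  # this rank's slice of the output
        if self.tp_size == 1 or not dist.is_initialized():
            return local_out
        # Recombine the per-rank slices into the full output tensor.
        shards = [torch.empty_like(local_out) for _ in range(self.tp_size)]
        dist.all_gather(shards, local_out)
        return torch.cat(shards, dim=-1)
```

With `tp_size=1` this degenerates to a plain linear layer, which keeps single-GPU runs and tests working unchanged.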
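The IDAllocator refactor concerns handing out unique integer IDs (for requests, sequences, and the like) and recycling them once freed. A minimal free-list sketch of such an allocator follows; the class and method names are assumptions about the interface, not gLLM's code:

```python
class IDAllocator:
    """Hands out unique integer IDs, reusing freed IDs before growing the range."""

    def __init__(self) -> None:
        self._next_id = 0           # smallest never-issued ID
        self._free: list[int] = []  # recycled IDs, reused LIFO

    def allocate(self) -> int:
        if self._free:
            return self._free.pop()
        new_id = self._next_id
        self._next_id += 1
        return new_id

    def free(self, id_: int) -> None:
        self._free.append(id_)


alloc = IDAllocator()
a, b = alloc.allocate(), alloc.allocate()  # 0, 1
alloc.free(a)
assert alloc.allocate() == a               # freed IDs are reused first
```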
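`zmq_comm` appears to be the ZeroMQ channel gLLM uses between processes; a typical multi-node failure mode is binding the socket to localhost, which remote nodes cannot reach. A minimal sketch of a cross-node PUSH/PULL link with pyzmq; the function names, socket pattern, port, and addresses are assumptions for illustration:

```python
import zmq


def make_sender(bind_addr: str = "tcp://0.0.0.0:5555") -> zmq.Socket:
    # Binding to 0.0.0.0 rather than 127.0.0.1 is what makes the socket
    # reachable from other nodes.
    sock = zmq.Context.instance().socket(zmq.PUSH)
    sock.bind(bind_addr)
    return sock


def make_receiver(connect_addr: str) -> zmq.Socket:
    # connect_addr must use the sender node's routable IP,
    # e.g. "tcp://10.0.0.1:5555".
    sock = zmq.Context.instance().socket(zmq.PULL)
    sock.connect(connect_addr)
    return sock


# sender.send_pyobj(obj) / receiver.recv_pyobj() then carry Python objects.
```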