Repositories
- fla-org/flash-linear-attention (Public)
  🚀 Efficient implementations of state-of-the-art linear attention models
- fla-org/distillation-fla (Public, forked from OpenSparseLLMs/Linearization)
  Distillation pipeline from pretrained Transformers to customized FLA models
- fla-org/native-sparse-attention (Public)
  🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
- fla-org/flash-hybrid-attention (Public)
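As context for what these repositories optimize, here is a minimal NumPy sketch of the causal linear-attention recurrence (running state updated once per token, so cost grows linearly in sequence length). This is an illustration only, not FLA's API: the real kernels are written in Triton and add feature maps, gating, and chunked parallelism; all function names below are hypothetical.

```python
import numpy as np

def linear_attention(q, k, v):
    """Causal linear attention via a running state S (O(n) in sequence length).

    With an identity feature map this equals masked attention without softmax:
    o_t = sum_{s<=t} (q_t . k_s) v_s.
    """
    n, d = q.shape
    d_v = v.shape[1]
    S = np.zeros((d, d_v))          # running key-value state
    out = np.empty((n, d_v))
    for t in range(n):
        S += np.outer(k[t], v[t])   # rank-1 state update for token t
        out[t] = q[t] @ S           # query the accumulated state
    return out

def quadratic_reference(q, k, v):
    """O(n^2) reference: explicit causal mask, no softmax."""
    scores = q @ k.T
    mask = np.tril(np.ones((len(q), len(q))))
    return (scores * mask) @ v

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 4)) for _ in range(3))
assert np.allclose(linear_attention(q, k, v), quadratic_reference(q, k, v))
```

The point of the recurrence is that the state `S` has fixed size, so generation needs no growing key-value cache; the optimized implementations in these repositories exploit this same structure on GPU.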