Stars
Google Cloud TPU Utilization Bar for Training Models
🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱
Customizable implementation of the self-instruct paper.
An Experiment on Dynamic NTK Scaling RoPE
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Make your own story. User-friendly software for LLM roleplaying