Stars
Run the latest LLMs and VLMs across GPU, NPU, and CPU, with PC (Python/C++) & mobile (Android & iOS) support; get up and running quickly with OpenAI gpt-oss, Granite 4, Qwen3-VL, Gemma 3n, and more.
Qwen3-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Instant, controllable, local pre-trained AI models in Rust
A safe Rust FFI binding for the NVIDIA® Tools Extension SDK (NVTX).
Tile primitives for speedy kernels
A pure-Rust LLM inference engine (supporting any LLM-based MLLM, such as Spark-TTS), powered by the Candle framework.
Democratizing large model inference and training on any device.
The profiler that gives a unified view of your entire stack, from PyTorch down to the GPU.
A static site generator for data apps, dashboards, reports, and more. Observable Framework combines JavaScript on the front end for interactive graphics with any language on the back end for data analysis.
An extensible framework for linking databases and interactive views.
Open-source observability for your GenAI or LLM application, based on OpenTelemetry
Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.