Roadmap

### Function calling
- [ ] Integrate XGrammar's structural tag (https://github.com/mlc-ai/xgrammar/pull/162) to enable reliable tool use with small models in WebLLM
- [ ] Add an end-to-end, MCP-like example using the structural tag and tool use described above
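
To make the tool-use goal concrete, here is a minimal sketch of the application side: extracting tagged tool calls from model output and dispatching them to handlers. The `<tool_call>` tag, the `{"name", "arguments"}` JSON shape, and the names `extractToolCalls` and `runToolCalls` are all assumptions made for illustration; XGrammar's structural tag format and whatever API WebLLM ends up exposing may look different.

```typescript
// Hypothetical sketch: the tag format and JSON shape are assumptions,
// not XGrammar's or WebLLM's actual format.
type ToolHandler = (args: Record<string, unknown>) => unknown;

interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Find every <tool_call>...</tool_call> span and parse its JSON body.
// Malformed bodies are skipped; grammar-constrained decoding is meant to
// make that case rare, but unconstrained small models can produce it.
function extractToolCalls(text: string): ToolCall[] {
  const calls: ToolCall[] = [];
  const pattern = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    try {
      const parsed = JSON.parse(match[1]);
      if (
        parsed !== null &&
        typeof parsed === "object" &&
        typeof parsed.name === "string" &&
        typeof parsed.arguments === "object" &&
        parsed.arguments !== null
      ) {
        calls.push({ name: parsed.name, arguments: parsed.arguments });
      }
    } catch {
      // Not valid JSON: ignore this span.
    }
  }
  return calls;
}

// Dispatch each parsed call to a registered handler.
function runToolCalls(
  text: string,
  tools: Map<string, ToolHandler>,
): Array<{ name: string; result: unknown }> {
  return extractToolCalls(text).map((call) => {
    const handler = tools.get(call.name);
    return {
      name: call.name,
      result: handler ? handler(call.arguments) : `unknown tool: ${call.name}`,
    };
  });
}
```

A `Map` is used for the registry so that a tool name such as `toString` cannot accidentally resolve to an inherited object property.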

### Models
- [ ] Support Phi-4 (ongoing)
- [ ] Support Gemma3
- [ ] Support Gemma3n

### Modalities
- [ ] Make image input reliable (initial support exists; performance and correctness need further investigation)
- [ ] Add other modality input (e.g. audio)

### Performance
- [ ] Profile current performance, identify bottlenecks, and address any that are found
- [ ] Move some existing CPU workloads (e.g. sampling) to the GPU where that improves performance (ongoing: https://github.com/mlc-ai/web-llm/pull/697)
- [ ] Subgroup operation support: https://github.com/mlc-ai/web-llm/issues/553
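
For readers unfamiliar with the sampling step mentioned above, the following is a generic CPU-side sketch of temperature plus top-p (nucleus) sampling over a logits vector. It is not WebLLM's implementation; the function names and the injectable `rng` parameter are assumptions made so the example is self-contained and deterministic under test. Running this on the CPU means the full logits vector has to be available on the host for every generated token, which is one reason such a step is a candidate for the GPU.

```typescript
// Illustrative sketch only: a generic sampler, not WebLLM's actual code.

// Temperature-scaled softmax, shifted by the max logit for numerical stability.
function softmax(logits: number[], temperature: number): number[] {
  const maxLogit = Math.max(...logits);
  const exps = logits.map((l) => Math.exp((l - maxLogit) / temperature));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Top-p sampling: keep the smallest set of highest-probability tokens whose
// cumulative mass reaches topP, then draw from that set in proportion to
// each token's probability.
function sampleTopP(
  logits: number[],
  temperature: number,
  topP: number,
  rng: () => number = Math.random,
): number {
  const probs = softmax(logits, temperature);
  // Token indices sorted by descending probability.
  const order = probs.map((_, i) => i).sort((a, b) => probs[b] - probs[a]);

  const kept: number[] = [];
  let mass = 0;
  for (const i of order) {
    kept.push(i);
    mass += probs[i];
    if (mass >= topP) break;
  }

  // Scaling the draw by the kept mass renormalizes implicitly.
  let r = rng() * mass;
  for (const i of kept) {
    r -= probs[i];
    if (r <= 0) return i;
  }
  return kept[kept.length - 1]; // Guard against floating-point leftovers.
}
```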

### Others
- [ ] Better WASM conversion experience (e.g. hosting a Hugging Face Space so users do not need to set up an environment)
- [ ] Resolve the WebGPU-only correctness issue with Qwen2.5 1.5B (very obvious with DeepSeek-R1-Distill)
- [ ] Parse thinking/non-thinking tokens when returning a completion response
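
As a rough illustration of the last item, the sketch below splits a raw completion into its reasoning and its final answer. It assumes the model wraps its reasoning in `<think>` and `</think>` delimiters, as DeepSeek-R1-style models do; the `ParsedCompletion` shape and the `splitThinking` name are hypothetical and do not describe WebLLM's response format.

```typescript
// Hypothetical sketch: the field names and default delimiters are assumptions.
interface ParsedCompletion {
  thinking: string | null; // null when no thinking block is present
  content: string;         // the text outside the thinking block
}

function splitThinking(
  raw: string,
  open = "<think>",
  close = "</think>",
): ParsedCompletion {
  const start = raw.indexOf(open);
  if (start === -1) {
    return { thinking: null, content: raw.trim() };
  }
  const bodyStart = start + open.length;
  const end = raw.indexOf(close, bodyStart);
  if (end === -1) {
    // Unclosed block (e.g. generation was cut off): treat the remainder
    // as thinking rather than leaking it into the answer.
    return {
      thinking: raw.slice(bodyStart).trim(),
      content: raw.slice(0, start).trim(),
    };
  }
  return {
    thinking: raw.slice(bodyStart, end).trim(),
    content: (raw.slice(0, start) + raw.slice(end + close.length)).trim(),
  };
}
```

This handles only the first thinking block in a complete string; a streaming response would need an incremental version that tracks whether the opening delimiter has been seen.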

Roadmap #707
