
T5Gemma

A collection of encoder-decoder models that offer a strong trade-off between quality and inference efficiency.

T5Gemma adapts pretrained decoder-only Gemma 2 models into an encoder-decoder architecture. These models are trained with either PrefixLM for strong generative performance or UL2 for high-quality contextual representations.
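The PrefixLM objective mentioned above differs from ordinary causal language modeling in how attention is masked: input (prefix) positions attend to the whole prefix bidirectionally, while target positions attend causally. As an illustration only (not T5Gemma's actual implementation), a minimal sketch of such a mask:

```python
def prefix_lm_mask(prefix_len, total_len):
    # mask[i][j] == 1 means query position i may attend to key position j.
    # Prefix columns (j < prefix_len) are visible to everyone (bidirectional
    # over the input); all other positions are restricted causally (j <= i).
    return [
        [1 if (j < prefix_len or j <= i) else 0 for j in range(total_len)]
        for i in range(total_len)
    ]
```

For a sequence of length 4 with a 2-token prefix, position 0 can attend to position 1 (bidirectional within the prefix), but no prefix position can attend into the target span.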

Capabilities

Enhanced reasoning

A dedicated encoder significantly boosts performance on tasks requiring deep context comprehension, such as math reasoning (GSM8K).

Flexible architecture

Model adaptation techniques allow for flexible configurations, including "unbalanced" models where the encoder and decoder have different sizes.

High efficiency

Superior quality-to-efficiency ratio without extensive compute requirements.


Models

Gemma 2 sizes

Checkpoints based on the official Gemma 2 2B and 9B models, as well as the "unbalanced" 9B-2B checkpoint.

T5 sizes

Small, Base, Large, and XL sizes following the T5 configuration, plus an additional model sized between T5 Large and T5 XL.