
Conversation

aflueckiger (Contributor) commented Oct 23, 2025

It looks like trainable_token_indices has been broken since #2605 for the lm_head when the weights are not tied. The lm_head is an instance of Linear rather than Embedding and thus has no embedding_dim attribute.
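To illustrate the mismatch with plain PyTorch (a minimal sketch, not PEFT code; the layer sizes are made up):

import torch.nn as nn

# An untied causal LM pairs an input embedding with a separate output projection:
embed_tokens = nn.Embedding(num_embeddings=32000, embedding_dim=4096)
lm_head = nn.Linear(in_features=4096, out_features=32000, bias=False)

print(embed_tokens.embedding_dim)         # 4096
print(hasattr(lm_head, "embedding_dim"))  # False
lm_head.embedding_dim                     # raises AttributeError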

The embedding dimension is not even needed further down except when training with DeepSpeed or re-initializing the weights.

Asking @BenjaminBossan for a review due to the changes in #2605. I couldn't find any existing reports of this issue. Let me know if you would like to fix it differently.

Minimal script for reproduction:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model_name = "utter-project/EuroLLM-9B-Instruct"

trainable_tokens_indices = [5, 6, 7, 8, 9]

# Target both the LM head and the embedding layer with the same indices.
trainable_tokens = {"lm_head": trainable_tokens_indices, "embed_tokens": trainable_tokens_indices}

# Before this PR, only targeting the embedding layer worked:
# trainable_tokens = {"embed_tokens": trainable_tokens_indices}
# trainable_tokens = trainable_tokens_indices

lora_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.05,
    r=16,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    modules_to_save=None,
    trainable_token_indices=trainable_tokens,
)

model = AutoModelForCausalLM.from_pretrained(model_name)
# Without the fix, this raises:
# AttributeError: 'Linear' object has no attribute 'embedding_dim'
get_peft_model(model, lora_config)

BenjaminBossan (Member) left a comment

Thanks for providing this fix.

For the purpose of testing, could you please give a small example where this is broken? The model ID and PEFT config should be enough.

> The embedding dimension is not even needed further down except when training with DeepSpeed or re-initializing the weights.

It's also used when init_weights=False. I guess we could move it into the corresponding branches to avoid retrieving this attribute when it's not needed, but I think it's better to make the code robust enough that it does not break.
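For illustration, here is a minimal sketch of one robust way to retrieve the dimension (a hypothetical helper, not necessarily the code merged in this PR). It relies on the fact that nn.Embedding stores its weight as (num_embeddings, embedding_dim) and nn.Linear as (out_features, in_features), so the last weight axis is the embedding dimension in both cases:

import torch.nn as nn

def _get_embedding_dim(module: nn.Module) -> int:
    # Hypothetical helper, for illustration only.
    if isinstance(module, nn.Embedding):
        return module.embedding_dim  # exposed directly
    if isinstance(module, nn.Linear):
        return module.in_features    # hidden size of an untied lm_head
    # Fallback: for both layer types above, the weight's last axis
    # is the embedding dimension.
    return module.weight.shape[-1]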

aflueckiger (Contributor, Author) commented

Thanks for the quick response. I added more information to reproduce the error to the PR description and replaced the try-except block. Let me know about the preferred place for handling the attribute.

BenjaminBossan (Member) commented

Thanks for providing the reproducer. I can confirm that it raises the error, and it helped me narrow down why the existing tests didn't catch this. We have a gap where we don't test specifying both embed_tokens and lm_head. To fill the gap, could you please add a unit test? Here is a suggestion:

    def test_trainable_token_indices_targets_head_and_embedding(self):
        # targeting embedding and LM head explicitly, see #2863
        model_id = "hf-internal-testing/tiny-random-OPTForCausalLM"
        with hub_online_once(model_id):
            model = AutoModelForCausalLM.from_pretrained(model_id)
            config = LoraConfig(trainable_token_indices={"lm_head": [0], "embed_tokens": [0]})
            get_peft_model(model, config)  # does not raise

A good spot for this test would be here:

aflueckiger (Contributor, Author) commented

Thanks for pointing out the right spot for the test. It's added now.

HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs remain available for 30 days after the last update.

BenjaminBossan (Member) left a comment

Thanks for fixing this issue, LGTM.

@BenjaminBossan BenjaminBossan merged commit fff52ab into huggingface:main Oct 23, 2025
12 of 13 checks passed
