
Conversation

@BenjaminBossan (Member) commented Oct 31, 2025

Resolves #2881.

Use model_config["head_dim"], if it exists, instead of deriving the head dimension as peft_config.token_dim // peft_config.num_attention_heads.
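
A minimal sketch of the selection logic described above, assuming model_config is the plain dict derived from the model's transformers config; the names follow the PR description, not the exact diff:

```python
# Sketch only: prefer an explicit head_dim from the model config (relevant for
# models such as Qwen3, where head_dim is set explicitly and differs from
# hidden_size // num_attention_heads), falling back to the old derivation.
head_dim = model_config.get("head_dim")
if head_dim is None:
    head_dim = peft_config.token_dim // peft_config.num_attention_heads
```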

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Doesn't happen in a from-source install of transformers ...
)

if peft_config.is_prompt_learning:
    peft_config = _prepare_prompt_learning_config(peft_config, model_config)
@BenjaminBossan (Member Author) commented:

While debugging this issue, I noticed that _prepare_prompt_learning_config is being called twice, here and in peft_model.py:

peft_config = _prepare_prompt_learning_config(peft_config, dict_config)

This is not a huge deal, since the results should be the same and the function is quick. Still, let's remove the redundant call here.
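
For context, the duplicate call is harmless because the helper only fills in prompt-learning fields that are still unset, so a second call with an already-populated config is a no-op. A simplified sketch of that fill-if-unset pattern (illustrative, not the actual PEFT source, which also handles alternative config key names):

```python
def _prepare_prompt_learning_config(peft_config, model_config):
    # Illustrative only: each field is filled from the model config at most once.
    if peft_config.num_layers is None:
        peft_config.num_layers = model_config["num_hidden_layers"]
    if peft_config.token_dim is None:
        peft_config.token_dim = model_config["hidden_size"]
    if peft_config.num_attention_heads is None:
        peft_config.num_attention_heads = model_config["num_attention_heads"]
    # A second call with an already-populated config changes nothing, which is
    # why removing the redundant call is purely a cleanup.
    return peft_config
```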

@BenjaminBossan (Member Author) commented:

@githubnemo I wanted to ask Cyril or Raushan if the fix is reasonable, but they're both OoO right now.

@BenjaminBossan merged commit e82e72a into huggingface:main on Nov 5, 2025
13 checks passed
@BenjaminBossan deleted the fix-prefix-tuning-qwen3 branch on November 5, 2025 at 15:26

Development

Successfully merging this pull request may close these issues.

Bug in prefix tuning of Qwen3-0.6B: group-query attention fix in #1901 still cause error
