Fix trainable_token_indices for lm_head
#2863
Conversation
BenjaminBossan
left a comment
Thanks for providing this fix.
For the purpose of testing, could you please give a small example where this is broken? The model ID and PEFT config should be enough.
> The embedding dimension is not even needed further down except when training with `deepspeed` or re-initializing the weights.

It's also used when `init_weights=False`. I guess we could move it into the corresponding branches to avoid retrieving this argument when it's not needed, but I think it's better to make the code robust enough that it does not break.
Thanks for the quick response. I added more information to reproduce the error in the PR description and replaced the try-except block. Let me know about the preferred position for handling the attribute.
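As an illustration of the attribute handling under discussion, here is a minimal sketch; the helper name and the fallback to `Linear.in_features` are assumptions for illustration, not the actual diff in this PR:

```python
import torch.nn as nn


def _get_embedding_dim(module: nn.Module) -> int:
    # Hypothetical helper: nn.Embedding exposes `embedding_dim`, while an
    # untied LM head is an nn.Linear whose weight has shape
    # (vocab_size, hidden_size), so the hidden size is `in_features` there.
    if isinstance(module, nn.Embedding):
        return module.embedding_dim
    if isinstance(module, nn.Linear):
        return module.in_features
    raise TypeError(f"Cannot infer an embedding dimension from {type(module).__name__}")
```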
Thanks for providing the reproducer. I can confirm that it raises the error and it helped me narrow down why the existing tests didn't catch this. We have a gap where we don't test specifying the `lm_head` explicitly in `trainable_token_indices`:

```python
def test_trainable_token_indices_targets_head_and_embedding(self):
    # targeting embedding and LM head explicitly, see #2863
    model_id = "hf-internal-testing/tiny-random-OPTForCausalLM"
    with hub_online_once(model_id):
        model = AutoModelForCausalLM.from_pretrained(model_id)
    config = LoraConfig(trainable_token_indices={"lm_head": [0], "embed_tokens": [0]})
    get_peft_model(model, config)  # does not raise
```

A good spot for this test would be here: peft/tests/test_initialization.py, line 1147 at a18ba67.
Thanks for providing the right spot for the test. It is added now.
BenjaminBossan
left a comment
Thanks for fixing this issue, LGTM.
PR description

It looks like `trainable_token_indices` has been broken since #2605 for the `lm_head` in case the weights are not tied. The `lm_head` is an instance of `Linear` rather than `Embedding` and thus doesn't have an `embedding_dim` attribute. The embedding dimension is not even needed further down except when training with `deepspeed` or re-initializing the weights.

Asking @BenjaminBossan for a review due to the changes in #2605. I couldn't find any reports concerning that issue. Let me know if you would like to fix it differently.

Minimal config for reproduction:
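A minimal sketch of such a reproduction, assuming the tiny OPT checkpoint and the `trainable_token_indices` config used in the test above:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Sketch of the reproducer, mirroring the test above: target the LM head
# (and the embedding) explicitly via trainable_token_indices. Before the fix,
# wrapping the model raised an AttributeError because the untied lm_head is
# an nn.Linear and has no `embedding_dim`.
model_id = "hf-internal-testing/tiny-random-OPTForCausalLM"
model = AutoModelForCausalLM.from_pretrained(model_id)
config = LoraConfig(trainable_token_indices={"lm_head": [0], "embed_tokens": [0]})
peft_model = get_peft_model(model, config)
```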