Fix Inconsistent Missing Keys Warning for Adapter Weights in PEFT #2084
Conversation
I think this logic should cover most of the cases. I haven't added prompt learning techniques here as they don't have a concrete prefix. Please let me know if I am missing any edge cases.
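For illustration, the "concrete prefix" refers to the method-specific substring that adapter parameter names carry. The mapping and helper below are a rough sketch; the names `ADAPTER_PREFIXES` and `is_adapter_key` are hypothetical, not part of the PEFT API:

```python
# Rough sketch, not the actual PEFT constants: typical parameter-name prefixes
# for a few adapter methods. Prompt learning methods (prompt tuning, prefix
# tuning, p-tuning) store learned embeddings instead of per-layer weights, so
# they have no comparable prefix to key the check on.
ADAPTER_PREFIXES = {
    "LORA": "lora_",   # e.g. "...attn.c_attn.lora_A.default.weight"
    "IA3": "ia3_",     # e.g. "...ia3_l.default"
    "LOHA": "hada_",   # LoHa weights are named hada_w1_a, hada_w1_b, ...
}

def is_adapter_key(key: str) -> bool:
    """Return True if a state-dict key looks like an adapter weight (hypothetical helper)."""
    return any(prefix in key for prefix in ADAPTER_PREFIXES.values())
```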
BenjaminBossan left a comment:
Thanks a lot for this PR. I made a few suggestions, please check.
On top of those, I would also like to see a test being added. LMK if you feel up to that task and if you need support with it.
Yes @BenjaminBossan, let me know which test cases should be handled. I will add those.
I checked our existing tests and I think it would actually be fine if we extend one of our existing tests and check that the … (Lines 520 to 536 in f4cf170). There, we just load two adapters and do no further checks (the test is just that there is no error). We could assign the …

Apart from that, I just noticed that after your last pushes, when I go to https://github.com/huggingface/peft/pull/2084/files, for some reason I see 27 changed files (even though it only says 1 file on this page). Not sure what's going on. Maybe it's a GitHub bug, but it makes it super hard to review the PR. I would suggest waiting a day to see if it goes away by itself. If not, maybe you need to rebase on the latest main or, if that doesn't help, create a fresh PR. Again, let's wait a day and only try those options if there is still the same error then.
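As a hedged sketch of what such an extension could look like (the test name, the tiny model id, and the exact setup are placeholders, not the actual test at lines 520 to 536):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

def test_load_second_adapter_reports_no_missing_keys(tmp_path):
    # Hypothetical test, loosely modeled on the existing "load two adapters" test.
    base = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-gpt2")
    model = get_peft_model(base, LoraConfig())
    model.save_pretrained(str(tmp_path))

    base = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-gpt2")
    model = get_peft_model(base, LoraConfig())
    load_result = model.load_adapter(str(tmp_path), adapter_name="other")

    # With the filtering from this PR, base-model keys no longer show up as
    # missing, so a successful load should report no missing keys at all.
    assert load_result.missing_keys == []
```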
Hmm, I still see 27 changed files :-/ Could you please try rebasing? Otherwise, we may need a new PR with just the relevant changes.
Force-pushed from bc5237c to 55c1c58.
Rebased the branch correctly and added test cases. Please review the PR.
BenjaminBossan left a comment:
Thanks for the updates. The rebase appears to have fixed the issue with the GH diff, nice. I have a suggestion to simplify the code a bit. Moreover, could you please run make style on the PR?
Done with changes.
As you can see, the CI is failing and the reason is that the test does not pass when checking VB-LoRA. The reason for that is that the vector bank in VB-LoRA is shared among all layers, so there isn't an individual one for each layer. I don't think there is an easy way to account for that in the code changes you made. My suggestion is therefore to simply skip the test for VB-LoRA, adding a comment explaining why. Could you please adjust the test accordingly? To check locally that the error is fixed, just run …
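A minimal sketch of the skip, assuming the test is parametrized over config classes (the helper name and the `config_cls` argument are assumptions about the test's structure):

```python
import pytest

from peft import VBLoRAConfig

def maybe_skip_vblora(config_cls):
    # Hypothetical helper: VB-LoRA keeps a single vector bank that is shared
    # across all layers, so some of its keys legitimately show up as "missing"
    # on a per-layer basis and the missing-keys check would fail spuriously.
    if issubclass(config_cls, VBLoRAConfig):
        pytest.skip("VB-LoRA shares its vector bank across layers; missing keys are expected")
```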
Done. Skipped the VB-LoRA test case and added an appropriate comment.
BenjaminBossan left a comment:
Great, thanks for this PR, this should hopefully reduce confusion about missing keys in the future. Nice work.
Yup, thanks @BenjaminBossan for guiding me. I would be happy to help with any other issues or feature requests 🤗
Sounds good, thanks! You can watch the PEFT issues for potential contributions, especially when there is the …
After merging #2084, we now clean up the missing_keys when loading a PEFT adapter to remove all but the relevant keys (the fact that base model keys are missing is expected when loading a PEFT adapter). Since the presence of missing_keys now really means that something might have gone wrong during loading, we can now warn the user if they call PeftModel.from_pretrained. Note that load_adapter still does not warn, as here we return the load_result and users can already check, but for from_pretrained, they don't have that possibility.
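A hedged sketch of the warning described above (the function and its placement are illustrative, not the actual PEFT source):

```python
import warnings

def warn_if_adapter_keys_missing(load_result, adapter_name="default"):
    # Hypothetical helper: after the cleanup from #2084, anything left in
    # missing_keys is adapter-related, so from_pretrained can surface it as a
    # warning instead of silently dropping it (load_adapter keeps returning the
    # load_result so callers can inspect it themselves).
    if load_result.missing_keys:
        warnings.warn(
            f"Found missing adapter keys while loading the checkpoint for adapter "
            f"'{adapter_name}': {load_result.missing_keys}"
        )
```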
… (huggingface#2084) When loading a PEFT adapter, a lot of missing keys are reported, because the base model weights are not loaded. However, this is totally fine. Therefore, those missing keys can be safely ignored. When using from_pretrained, the missing keys won't be returned to the user, thus there is no room for confusion. But when using load_adapter, the missing keys (and unexpected keys) are returned and can cause confusion. With this PR, the missing keys are filtered to remove keys that are unrelated to the adapter. A small gap is VB-LoRA, which reports missing keys because the vector bank parameters are actually only loaded once and then shared.
Refer to #1932 for more context.
The main logic is that each adapter method has a specific prefix for its keys; if such a prefix is present in any of the missing keys, we raise a warning, else the adapter is considered to have loaded successfully.
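To make that concrete, here is a minimal sketch of the filtering step (assuming a per-method prefix such as "lora_", as illustrated earlier; this is not the exact implementation merged in the PR):

```python
def filter_missing_keys(missing_keys, prefix, adapter_name="default"):
    # A key is treated as adapter-related if it contains both the method prefix
    # (e.g. "lora_") and the adapter name; everything else belongs to the base
    # model and is expected to be absent from an adapter-only checkpoint.
    return [k for k in missing_keys if prefix in k and adapter_name in k]
```

If the filtered list is non-empty, a warning is raised; if it is empty, the adapter is considered to have loaded successfully.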