FIX: Wrong coupling between requires_grad and the active adapter #2765
Description
At the moment, we strongly couple the active adapter with
requires_grad=True. Concretely, when we call model.set_adapter(name), we
automatically assume that this adapter should not only be made active but
that its requires_grad should also be set to True.
For the purpose of training PEFT models, this is fair. However, when
loading PEFT models for inference, this is not desired. Generally, for
inference, we don't need requires_grad=True, yet as things stand it is enabled.
Generally, this is not a severe bug: in the inference code, we don't
perform any updates, so we don't inadvertently update a weight just because
it wrongly has requires_grad=True. However, it can lead to worse runtime
performance and memory overhead when PyTorch records gradients for those
parameters (which it shouldn't if the model is called under
torch.inference_mode, but some users may forget to use it). Therefore,
this bug is still worth fixing.
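To illustrate the performance point, here is a minimal sketch (plain PyTorch, not PEFT-specific): outside of torch.inference_mode, autograd tracks any parameter with requires_grad=True, while inside it nothing is recorded.
import torch

linear = torch.nn.Linear(8, 8)           # stands in for an adapter layer
linear.weight.requires_grad_(True)       # a weight that was wrongly left trainable
x = torch.randn(2, 8)

out = linear(x)
print(out.requires_grad)                 # True -> autograd builds a graph for this pass

with torch.inference_mode():
    out = linear(x)
    print(out.requires_grad)             # False -> no graph is recorded here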
Example
A very basic example where PEFT currently fails:
model = PeftModel.from_pretrained(...)
assert not any(p.requires_grad for p in model.parameters()) # works
So far, so good, the first adapter does not have requires_grad.
model.load_adapter(...)
assert not any(p.requires_grad for p in model.parameters()) # fails
The load_adapter call inadvertently sets requires_grad=True for the
weights of the _first_ adapter. This happens because, when the second
adapter is loaded, we call set_adapter with the first adapter to ensure
that it remains the active adapter. However, due to the coupling of the
active adapter and requires_grad, this results in setting
requires_grad=True for the first adapter.
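For a more self-contained view, the same sequence as a sketch; the checkpoint and adapter names ("base-model", "lora-adapter-A", "lora-adapter-B") are placeholders, not real repos:
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("base-model")      # placeholder checkpoint
model = PeftModel.from_pretrained(base, "lora-adapter-A")      # first adapter, loaded for inference
assert not any(p.requires_grad for p in model.parameters())    # holds

model.load_adapter("lora-adapter-B", adapter_name="other")     # second adapter
# load_adapter re-activates the first adapter via set_adapter, which (before
# this PR) also flipped its weights back to requires_grad=True:
assert not any(p.requires_grad for p in model.parameters())    # failed prior to this fix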
The PR relaxes this coupling by allowing set_adapter to be called with an
additional argument, inference_mode. If set to True, requires_grad will
not be enabled, even if the adapter is activated.
The example above would also fail for modules_to_save and trainable
tokens, not only for the LoRA/LoHa/... weights.
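As a rough usage sketch of the new argument (exactly which objects expose it is not spelled out here, so treat this as illustrative):
model.set_adapter("default", inference_mode=True)   # activate the adapter without enabling requires_grad
model.set_adapter("default")                        # default behaviour is unchanged: activate and enable grads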
Still open bugs
The proposed solution is unfortunately not perfect. Right now, we do
pass inference_mode based on the PEFT config of the adapter being added,
which helps with the original issue described above. However, even this
is not absolutely correct, because inference_mode of the second adapter
does not necessarily have the same value as inference_mode of the first
adapter. To illustrate how this can go wrong, I added an xfailing test:
test_loading_model_requires_grad_set_correctly_switch_inference_mode
I believe that this use case is rarer than the one described at the
beginning, so IMO it is okay to accept this bug for now, since we fix a more
common one. However, LMK if you disagree.
Related to this, I noticed that many tests in
test_custom_models.TestRequiresGrad had code like this:
config0 = FooConfig(...)
peft_model = get_peft_model(MLP(), config0)
config1 = FooConfig(..., inference_mode=True) # <==
peft_model.add_adapter("adapter1", config1)
This now fails because of the reason just given. I removed
inference_mode=True here and the tests pass again.
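For illustration, the adjusted pattern looks roughly like this (FooConfig and MLP are the same placeholders as in the snippet above):
config0 = FooConfig(...)
peft_model = get_peft_model(MLP(), config0)
config1 = FooConfig(...)                     # no inference_mode=True anymore
peft_model.add_adapter("adapter1", config1)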
Note that the only reason why inference_mode=True was passed here is
because AdaLoRA cannot load 2 adapters in training mode and thus
requires this. Later PEFT methods without this restriction blindly
copied the AdaLoRA test. For those PEFT methods, I removed
inference_mode=True.
However, this also means that the AdaLoRA tests now fail. I thus marked
them as xfail.
To properly fix this bug, I think we would have to refactor the code to
isolate set_adapter (i.e. determining the active adapter) and setting
requires_grad into separate code paths, as they're orthogonal. Moreover,
these attributes are being set all over the place, which makes it hard
to reason about where these attributes are being changed. This should be
streamlined.
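Purely as a hypothetical sketch of what such a separation could look like (set_requires_grad is an invented name, not an existing PEFT API):
model.set_adapter("default")                 # only decides which adapter is active
model.set_requires_grad("default", True)     # hypothetical, separate knob for gradient bookkeeping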
Making these changes while not breaking any existing code is not
trivial (or maybe even impossible). Therefore, I went the easier way for
the time being with this PR. Maybe a bigger refactor could be envisioned
for a version 1.0 release of PEFT.
Related changes
While working on this, I noticed that LNTuning was completely buggy when
calling set_adapter. This is now fixed.
Moreover, since I had to touch update_layer everywhere, I ensured that
all of them accept kwargs for consistency.
Note to maintainers:
- If/When this PR is merged, existing PRs that add new PEFT methods have
to be updated to reflect the changes.
- The changes touch set_adapter and update_layer in each PEFT method's
layer.py and model.py (except for prompt learning), with some diff noise
due to updated type annotations and docstrings. For the review, focus on
the changes in other.py, peft_model.py, tuners_utils.py, and
test_custom_models.py.
- Yes, it is about time we update the abstractions so that these types
of changes become easier in the future (not having to update each PEFT
method individually).
githubnemo left a comment:
Minor nit, otherwise LGTM. Thanks for taking care of this!
tests/test_custom_models.py (Outdated)
extra_kwargs = {}
if config_cls == IA3Config:
    extra_kwargs["feedforward_modules"] = []
# targeting the different modules with modules_to_save:
# targeting the different modules with modules_to_save:
See #2759