Support Negative Weights When Merging LoRA Adapters #2796 #2811
Conversation
valteu left a comment

I took a look at the proposed changes and the logic update aligns with my solution. However, I'm unsure about the effectiveness of some test cases and don't find the docstring update necessary (reasons provided in the comments below).

Also, I think the PR description should be updated; I don't see the value in the LoraHub code example, as this has nothing to do with LoraHub imo.

Otherwise LGTM!
src/peft/tuners/lora/model.py
Outdated
```
List of weights for each adapter. Weights can be positive or negative, allowing for both addition
and subtraction of adapter effects. Negative weights enable task arithmetic and adapter negation as
described in the LoRA Hub and Task Arithmetic papers.
```
I would cut this; e.g. for `_svd_generalized_task_arithmetic_weighted_adapter` in SVD-based merging, this was already available, and I don't think the hint to LoraHub or Task Arithmetic is necessary here.
```diff
- List of weights for each adapter. Weights can be positive or negative, allowing for both addition
- and subtraction of adapter effects. Negative weights enable task arithmetic and adapter negation as
- described in the LoRA Hub and Task Arithmetic papers.
+ List of weights for each adapter.
```
Okay, fixed in c190977.
tests/test_negative_weights.py
Outdated
```python
def test_very_small_negative_weight(self, simple_model, lora_config):
    """Test very small negative weights."""
    model = get_peft_model(simple_model, lora_config, adapter_name="adapter1")

    model.add_weighted_adapter(
        adapters=["adapter1"],
        weights=[-1e-6],
        adapter_name="merged_tiny_negative",
        combination_type="linear",
    )

    assert "merged_tiny_negative" in model.peft_config

def test_very_large_negative_weight(self, simple_model, lora_config):
    """Test very large negative weights."""
    model = get_peft_model(simple_model, lora_config, adapter_name="adapter1")

    model.add_weighted_adapter(
        adapters=["adapter1"],
        weights=[-100.0],
        adapter_name="merged_large_negative",
        combination_type="linear",
    )

    assert "merged_large_negative" in model.peft_config
```
If I'm not missing something, I don't see a benefit in this. Nothing regarding weight magnitude was updated, and we are not even testing for mathematical correctness, just that no error is raised.
The suggested change removes both tests:

```diff
- def test_very_small_negative_weight(self, simple_model, lora_config):
-     """Test very small negative weights."""
-     model = get_peft_model(simple_model, lora_config, adapter_name="adapter1")
-     model.add_weighted_adapter(
-         adapters=["adapter1"],
-         weights=[-1e-6],
-         adapter_name="merged_tiny_negative",
-         combination_type="linear",
-     )
-     assert "merged_tiny_negative" in model.peft_config
- def test_very_large_negative_weight(self, simple_model, lora_config):
-     """Test very large negative weights."""
-     model = get_peft_model(simple_model, lora_config, adapter_name="adapter1")
-     model.add_weighted_adapter(
-         adapters=["adapter1"],
-         weights=[-100.0],
-         adapter_name="merged_large_negative",
-         combination_type="linear",
-     )
-     assert "merged_large_negative" in model.peft_config
```
Resolved in c190977.
BenjaminBossan left a comment
Thanks @sambhavnoobcoder for working on this PR and @valteu for your review.
My review focuses on the testing. Right now, completely new tests are being added but I'd rather see the existing tests being extended. As those tests can be difficult to scrutinize, I tried to give some guidance on how to achieve that. Please let me know if anything still remains unclear.
tests/test_negative_weights.py
Outdated
Thanks for also including unit tests. However, we already have a couple of unit tests for add_weighted_adapter, and I would prefer to extend those instead of having this set of completely separate tests.
I think the best way to do that is to check this test function:
Lines 1822 to 1853 in b0954e0
```python
def _test_weighted_combination_of_adapters(self, model_id, config_cls, config_kwargs):
    if issubclass(config_cls, AdaLoraConfig):
        # AdaLora does not support adding more than 1 adapter
        return pytest.skip(f"Test not applicable for {config_cls}")
    if model_id.endswith("qwen2"):
        # Qwen2 fails with weighted adapter combinations using SVD
        return pytest.skip(f"Test does not work with model {model_id}")
    if "gemma" in model_id.lower():
        return pytest.skip("Combining Gemma adapters with SVD is currently failing")

    adapter_list = ["adapter1", "adapter_2", "adapter_3"]
    weight_list = [0.5, 1.5, 1.5]

    # Initialize the config
    config = config_cls(
        base_model_name_or_path=model_id,
        **config_kwargs,
    )

    if not isinstance(config, (LoraConfig, IA3Config)):
        # This test is only applicable for Lora and IA3 configs
        return pytest.skip(f"Test not applicable for {config}")

    with hub_online_once(model_id):
        model = self.transformers_class.from_pretrained(model_id)
        model = get_peft_model(model, config, adapter_list[0])

    if isinstance(config, LoraConfig):
        self._test_weighted_combination_of_adapters_lora(model, config, adapter_list, weight_list)
    elif isinstance(config, IA3Config):
        self._test_weighted_combination_of_adapters_ia3(model, config, adapter_list, weight_list)
    else:
        pytest.skip(f"Test not applicable for {config}")
```
It is already parameterized over PEFT methods (right now, only LoRA and IA³), different model architectures, and merging methods. Right now, it is called with a list of weights `weight_list = [0.5, 1.5, 1.5]`. The easiest extension would be to define another set of weights which includes negative weights and then run the tests with those as well, e.g.:
```diff
- if isinstance(config, LoraConfig):
-     self._test_weighted_combination_of_adapters_lora(model, config, adapter_list, weight_list)
- elif isinstance(config, IA3Config):
-     self._test_weighted_combination_of_adapters_ia3(model, config, adapter_list, weight_list)
+ if isinstance(config, LoraConfig):
+     self._test_weighted_combination_of_adapters_lora(model, config, adapter_list, weight_list)
+     self._test_weighted_combination_of_adapters_lora(model, config, adapter_list, negative_weight_list)
+ elif isinstance(config, IA3Config):
+     self._test_weighted_combination_of_adapters_ia3(model, config, adapter_list, weight_list)
+     self._test_weighted_combination_of_adapters_ia3(model, config, adapter_list, negative_weight_list)
```

Since those tests are generic in nature, I believe it would be useful to keep the `test_negative_weight_negates_adapter` test you added. Instead of putting it into a separate file, I'd rather add it to the custom tests after this test:
peft/tests/test_custom_models.py, line 2908 in b0954e0:

```python
def test_add_weighted_adapter_cat_with_rank_pattern(self, config_cls):
```
Then you can re-use one of the models defined there, e.g. MLP, instead of defining a new SimpleModel.
Similarly, I think we can transfer `test_subtraction_with_negative_weights`, which looks useful to me. I wonder, however, if we can make that test even more strict. If we have two identical LoRA adapters and then merge them together with weights `[1.0, -1.0]`, should we not expect them to cancel out exactly? Then we can check that the new merged adapter has weights of approximately 0.
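A sketch of what that stricter cancellation check could look like (hypothetical test code, assuming the `MLP` fixture from `test_custom_models.py` and peft's state-dict helpers; meant to live inside the existing test class):

```python
import torch
from peft import (
    LoraConfig,
    get_peft_model,
    get_peft_model_state_dict,
    set_peft_model_state_dict,
)

def test_add_weighted_adapter_identical_adapters_cancel_out(self):
    # hypothetical: MLP is the small fixture model defined in test_custom_models.py
    model = MLP()
    config1 = LoraConfig(target_modules=["lin0"], init_lora_weights=False)
    config2 = LoraConfig(target_modules=["lin0"], init_lora_weights=False)
    model = get_peft_model(model, config1, adapter_name="adapter1")
    model.add_adapter("adapter2", config2)
    # copy adapter1's weights into adapter2 so both adapters are identical
    state_dict = get_peft_model_state_dict(model, adapter_name="adapter1")
    set_peft_model_state_dict(model, state_dict, adapter_name="adapter2")
    model.add_weighted_adapter(
        ["adapter1", "adapter2"],
        weights=[1.0, -1.0],
        adapter_name="merged",
        combination_type="linear",
    )
    # identical adapters merged with opposite weights should cancel out exactly
    for module in model.modules():
        if hasattr(module, "lora_A") and "merged" in module.lora_A:
            delta = module.get_delta_weight("merged")
            assert torch.allclose(delta, torch.zeros_like(delta), atol=1e-6)
```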
Got it: the goal was to extend the generic tests instead of creating new files. I've refactored accordingly, gone through the existing tests, and only added the cases that I felt weren't already covered, directly within the generic test suite. This should now be resolved in 6a1c92d.
Please let me know if I’ve missed anything or if further changes are needed.
Just added another small test for negative weights with differential scaling in bea394f. I think this covers all the edge cases for this now.
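A sketch of what such a differential-scaling case might look like (hypothetical code, reusing the `MLP` fixture; two adapters with different `lora_alpha` and hence different scaling factors):

```python
def test_add_weighted_adapter_negative_weight_with_different_scaling(self):
    # adapters share the same rank (required for "linear") but differ in scaling
    model = MLP()
    config1 = LoraConfig(target_modules=["lin0"], r=8, lora_alpha=8, init_lora_weights=False)
    config2 = LoraConfig(target_modules=["lin0"], r=8, lora_alpha=16, init_lora_weights=False)
    model = get_peft_model(model, config1, adapter_name="adapter1")
    model.add_adapter("adapter2", config2)
    model.add_weighted_adapter(
        ["adapter1", "adapter2"],
        weights=[0.5, -0.5],
        adapter_name="merged_diff_scaling",
        combination_type="linear",
    )
    model.set_adapter("merged_diff_scaling")
```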
Thank you for your reviews @BenjaminBossan and @valteu. I have made the requested changes here. Kindly look into this and let me know if this PR requires any more changes.
BenjaminBossan left a comment
Thanks a lot for updating the tests. I have a few smaller comments, but overall this looks good.
tests/test_custom_models.py
Outdated
| ["default", "other"], weights=[1.0, 1.0], adapter_name="merged", combination_type="cat" | ||
| ) | ||
|
|
||
| def test_negative_weight_negates_adapter(self): |
For clarity, let's extend the test name like this:

```diff
- def test_negative_weight_negates_adapter(self):
+ def test_add_weighted_adapter_negative_weight_negates_adapter(self):
```
Same idea for the other tests.
Resolved in 41409af.
tests/test_custom_models.py
Outdated
```python
# Verify the merged adapter was created successfully
assert "merged_diff_scaling" in model.peft_config
```
Can be removed
Resolved in 41409af.
tests/testing_common.py
Outdated
```python
adapter_list = ["adapter1", "adapter_2", "adapter_3"]
weight_list = [0.5, 1.5, 1.5]
negative_weight_list = [0.5, -0.8, 1.2]
```
```diff
- negative_weight_list = [0.5, -0.8, 1.2]
+ negative_weight_list = [-0.5, -0.8, -1.2]
```
Let's do all negative numbers, just for good measure :)
Resolved in 41409af.
@BenjaminBossan I went through the suggestions you made and implemented them in 41409af. Do tell me if any other changes are needed; I will implement those as well.
@sambhavnoobcoder Could you please run
@BenjaminBossan ran the
@sambhavnoobcoder We have a bunch of failing tests because the unit test does not consider that weights can be negative (lines 1777 to 1778 in b0954e0). This should be easily fixed like so:

```diff
  weighted_original_delta_weights = target.get_delta_weight(adapter_list[0]) * weight_list[0]
+ sign = 1 if weight_list[0] > 0 else -1
+ weighted_original_delta_weights = sign * weighted_original_delta_weights
```

Could you please update the test?
@BenjaminBossan Added that in 8f9a625. Can you check now?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks @sambhavnoobcoder and @valteu for enhancing `add_weighted_adapter`. The PR LGTM; the failing tests are unrelated.

@valteu Please let me know how I should credit your co-authorship.
Hi, thanks @sambhavnoobcoder and @BenjaminBossan for the great work!
Great, done @valteu. Greetings from Berlin.
Thanks, greetings from Potsdam :)
Thank you both @BenjaminBossan, @valteu, and greetings from Bengaluru as well :)
Summary
This PR implements support for negative weights in LoRA adapter merging, resolving issue #2796. This feature enables task arithmetic and adapter subtraction capabilities as demonstrated in the LoRA Hub and Task Arithmetic papers.
Problem Description
Current Issue
When attempting to merge LoRA adapters with negative weights using `add_weighted_adapter()`, the operation fails with a `ValueError: math domain error`. This error occurs because the current implementation computes `math.sqrt(weight * target.scaling[adapter])`, which is undefined for negative values.
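As a quick illustration of the failure mode (plain Python, not PEFT-specific code):

```python
import math

# a negative merge weight times a positive scaling factor is negative,
# and math.sqrt rejects negative inputs
math.sqrt(-0.5 * 2.0)  # raises ValueError: math domain error
```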
Why Negative Weights Matter

Several influential papers, such as LoRA Hub and Task Arithmetic, demonstrate the utility of negative weights in adapter merging. Without negative weight support, users cannot perform task arithmetic, optimize adapter weights LoRA Hub-style, or negate adapters.
Solution Approach
Mathematical Foundation
The implementation uses the following approach:
For each adapter `i` with weight `w_i` and scaling factor `s_i`:

- `r_i = sqrt(|w_i * s_i|)` (magnitude using absolute value)
- `sign_i = sign(w_i * s_i)` (preserve the sign)
- Apply `sign_i * r_i` to both A and B matrices

When merging:
- `A_merged = Σ (sign_i * r_i * A_i)`
- `B_merged = Σ (sign_i * r_i * B_i)`

This approach:
- `(sign * sqrt(|w*s|) * A) @ (sign * sqrt(|w*s|) * B) = w*s*(A@B)`
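As a sanity check of the per-matrix sign handling, here is a small self-contained sketch (plain PyTorch with assumed toy shapes, not PEFT internals) showing that two identical adapters merged with weights `[1.0, -1.0]` cancel exactly, because the signed coefficients are summed within each factor matrix before the product is formed:

```python
import math

import torch

torch.manual_seed(0)
s = 2.0                      # shared scaling factor (assumed value)
A = torch.randn(4, 16)       # lora_A weight: (r, in_features)
B = torch.randn(16, 4)       # lora_B weight: (out_features, r)

def coeff(w: float, s: float) -> float:
    # sign-preserving square root, as described above
    ws = w * s
    sign = 1 if ws >= 0 else -1
    return sign * math.sqrt(abs(ws))

c1, c2 = coeff(1.0, s), coeff(-1.0, s)   # +sqrt(2) and -sqrt(2)
A_merged = c1 * A + c2 * A               # sums to zero
B_merged = c1 * B + c2 * B               # sums to zero
delta = B_merged @ A_merged
assert torch.allclose(delta, torch.zeros_like(delta))
```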
Implementation Details

Code Changes
Modified File: `src/peft/tuners/lora/model.py`

Change 1: Implementation (lines 747-750 in `_generalized_task_arithmetic_weighted_adapter`)

The fix replaces the single line that was causing the error with three lines:

```python
weight_with_scaling = weight * target.scaling[adapter]
sign = 1 if weight_with_scaling >= 0 else -1
valid_weights.append(sign * math.sqrt(abs(weight_with_scaling)))
```
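To make the computation concrete, a quick worked example with assumed values:

```python
import math

weight, scaling = -0.8, 2.0                        # assumed example values
weight_with_scaling = weight * scaling             # -1.6
sign = 1 if weight_with_scaling >= 0 else -1       # -1
coeff = sign * math.sqrt(abs(weight_with_scaling))
print(coeff)  # approx. -1.2649; the old math.sqrt(-1.6) would raise ValueError
```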
This change affects the following combination types:

- `linear`
- `ties`
- `dare_linear`
- `dare_ties`
- `magnitude_prune`

Change 2: Documentation (lines 542-545)

Updated the docstring for `add_weighted_adapter` to document that weights can now be negative, with references to the LoRA Hub and Task Arithmetic papers.

Methods Verified (No Changes Needed)
SVD-based methods (`_svd_generalized_task_arithmetic_weighted_adapter`):
- Use `weight * target.scaling[adapter]` directly, without a square root, so negative weights already work there

Cat combination:
Strict Adherence to Proposed Solution
This implementation follows @valteu's approved solution exactly with zero deviations. We did not introduce any alternative approaches or optimizations beyond what was discussed and validated in the issue thread.
Testing Strategy
Comprehensive Test Suite
Created `tests/test_negative_weights.py` with 21 comprehensive tests covering:

1. Basic Negative Weight Functionality (4 tests)
2. All Combination Types (11 tests)
- `linear`, `ties`, `dare_linear`, `dare_ties`, `magnitude_prune`
- `svd`, `ties_svd`, `dare_linear_svd`, `dare_ties_svd`, `magnitude_prune_svd`

3. Mathematical Correctness (2 tests)
4. Backward Compatibility (1 test)
- `sqrt(0.6*s)*A + sqrt(0.4*s)*B` matches exactly

5. Edge Cases (3 tests)
Verification Strategy

Mathematical Verification:

Backward Compatibility Verification:
- For positive weights, `1 * sqrt(w*s) = sqrt(w*s)` (unchanged)

Integration Testing:
Quick Demo
Here's a minimal example demonstrating the fix:
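A minimal sketch of what such a demo could look like (hypothetical toy model and adapter names, not the PR's original snippet):

```python
import torch.nn as nn
from peft import LoraConfig, get_peft_model

class TinyMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin0 = nn.Linear(16, 16)

    def forward(self, x):
        return self.lin0(x)

config = LoraConfig(target_modules=["lin0"], r=4, init_lora_weights=False)
model = get_peft_model(TinyMLP(), config, adapter_name="adapter_a")
model.add_adapter("adapter_b", config)

# merging with a negative weight previously raised `ValueError: math domain error`
model.add_weighted_adapter(
    adapters=["adapter_a", "adapter_b"],
    weights=[1.0, -0.5],
    adapter_name="merged",
    combination_type="linear",
)
model.set_adapter("merged")
```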
Before this PR: the above code raises `ValueError: math domain error`.

After this PR: successfully creates a merged adapter with negative weights.
Use Cases Enabled
This implementation enables several important use cases; a short usage sketch follows the list:
1. Task Arithmetic
2. LoRA Hub Weight Optimization
3. Adapter Negation
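In code, these use cases reduce to weighted merges such as the following sketch (hypothetical adapter names, assuming a PEFT model with these adapters already loaded):

```python
# 1. task arithmetic: add two skills and subtract a third
model.add_weighted_adapter(
    adapters=["task_a", "task_b", "task_c"],
    weights=[1.0, 1.0, -1.0],
    adapter_name="task_arithmetic_result",
    combination_type="linear",
)

# 2. LoRA Hub-style optimization searches over such weight vectors,
#    which may now include negative entries

# 3. adapter negation: subtract a single adapter's effect
model.add_weighted_adapter(
    adapters=["unwanted_behavior"],
    weights=[-1.0],
    adapter_name="negated",
    combination_type="linear",
)
```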
Test Results
All 21 tests pass successfully:
Screenshots
Test Suite Results
Quick Demo Verification
Breaking Changes
None. This PR is fully backward compatible:
Related Issues
Closes #2796
Credits
Checklist