这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@sambhavnoobcoder
Copy link
Contributor

Summary

This PR implements support for negative weights in LoRA adapter merging, resolving issue #2796. This feature enables task arithmetic and adapter subtraction capabilities as demonstrated in the LoRA Hub and Task Arithmetic papers.


Problem Description

Current Issue

When attempting to merge LoRA adapters with negative weights using add_weighted_adapter(), the operation fails with a ValueError: math domain error. This error occurs because the current implementation computes math.sqrt(weight * target.scaling[adapter]), which is undefined for negative values.

Why Negative Weights Matter

Several influential papers demonstrate the utility of negative weights in adapter merging:

  1. LoRA Hub Paper: Uses negative weights in their implementation for combining multiple task-specific adapters
  2. Task Arithmetic Paper: Demonstrates using negative coefficients to subtract unwanted task knowledge from models

Without negative weight support, users cannot:

  • Perform task subtraction (removing unwanted behaviors)
  • Implement LoRA Hub's weight optimization strategies
  • Apply task arithmetic techniques to their models

Solution Approach

Mathematical Foundation

The implementation uses the following approach:

For each adapter i with weight w_i and scaling factor s_i:

  • Compute r_i = sqrt(|w_i * s_i|) (magnitude using absolute value)
  • Compute sign_i = sign(w_i * s_i) (preserve the sign)
  • Apply sign_i * r_i to both A and B matrices

When merging:

  • A_merged = Σ (sign_i * r_i * A_i)
  • B_merged = Σ (sign_i * r_i * B_i)

This approach:

  • Maintains the mathematical property: (sign * sqrt(|w*s|) * A) @ (sign * sqrt(|w*s|) * B) = w*s*(A@B)
  • Works correctly for multi-adapter merging scenarios
  • Is fully backward compatible with positive weights

Implementation Details

Code Changes

Modified File: src/peft/tuners/lora/model.py

Change 1: Implementation (lines 747-750 in _generalized_task_arithmetic_weighted_adapter)

The fix replaces the single line that was causing the error with three lines:

  1. Compute weight_with_scaling = weight * target.scaling[adapter]
  2. Determine the sign: sign = 1 if weight_with_scaling >= 0 else -1
  3. Apply to weight calculation: valid_weights.append(sign * math.sqrt(abs(weight_with_scaling)))

This change affects the following combination types:

  • linear
  • ties
  • dare_linear
  • dare_ties
  • magnitude_prune

Change 2: Documentation (lines 542-545)

Updated the docstring for add_weighted_adapter to document that weights can now be negative, with references to the LoRA Hub and Task Arithmetic papers.

Methods Verified (No Changes Needed)

SVD-based methods (_svd_generalized_task_arithmetic_weighted_adapter):

  • Already compatible with negative weights
  • Uses weight * target.scaling[adapter] directly without square root
  • Tested to ensure correct behavior with negative weights

Cat combination:

  • Already compatible with negative weights
  • Directly multiplies weights without square root operation
  • Tested to ensure correct behavior with negative weights

Strict Adherence to Proposed Solution

This implementation follows @valteu's approved solution exactly with zero deviations. We did not introduce any alternative approaches or optimizations beyond what was discussed and validated in the issue thread.


Testing Strategy

Comprehensive Test Suite

Created tests/test_negative_weights.py with 21 comprehensive tests covering:

1. Basic Negative Weight Functionality (4 tests)

  • Single negative weight with linear combination
  • Multiple adapters with mixed positive/negative weights
  • All negative weights
  • Zero and negative weight combinations

2. All Combination Types (11 tests)

  • Non-SVD methods: linear, ties, dare_linear, dare_ties, magnitude_prune
  • SVD methods: svd, ties_svd, dare_linear_svd, dare_ties_svd, magnitude_prune_svd
  • Cat combination with negative weights

3. Mathematical Correctness (2 tests)

  • Verification that weight=-1.0 properly negates an adapter (both A and B matrices negated)
  • Verification that weights [1.0, -1.0] produce correct difference between adapters
  • Validation that merged matrices match expected formulas with scaling factors

4. Backward Compatibility (1 test)

  • Verification that positive weights produce identical results to previous implementation
  • Mathematical validation: sqrt(0.6*s)*A + sqrt(0.4*s)*B matches exactly

5. Edge Cases (3 tests)

  • Very small negative weights (1e-6)
  • Very large negative weights (100.0)
  • Negative weights with different scaling factors across adapters

Verification Strategy

Mathematical Verification:

  • Manually computed expected values for test cases
  • Verified matrix operations match theoretical formulas
  • Confirmed sign is applied to both A and B matrices

Backward Compatibility Verification:

  • For positive weights, verified behavior is identical to pre-fix implementation
  • When weight ≥ 0, sign = 1, producing: 1 * sqrt(w*s) = sqrt(w*s) (unchanged)

Integration Testing:

  • Tested with simple models to ensure end-to-end functionality
  • Verified adapter merging works across all supported combination types
  • Confirmed no breaking changes to existing functionality

Quick Demo

Here's a minimal example demonstrating the fix:

import torch
from torch import nn
from peft import LoraConfig, get_peft_model

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 5)
    def forward(self, x):
        return self.linear(x)

torch.manual_seed(42)
model = SimpleModel()
config = LoraConfig(r=8, lora_alpha=16, target_modules=['linear'], lora_dropout=0.0)
model = get_peft_model(model, config, adapter_name='adapter1')
model.add_adapter('adapter2', config)

# THIS WOULD HAVE FAILED BEFORE - NOW WORKS!
model.add_weighted_adapter(['adapter1', 'adapter2'], [0.7, -0.3], 'merged', combination_type='linear')
print('✅ SUCCESS: Negative weights work!')
print(f'Merged adapter created: {list(model.peft_config.keys())}')

Before this PR: The above code raises ValueError: math domain error
After this PR: Successfully creates merged adapter with negative weights


Use Cases Enabled

This implementation enables several important use cases:

1. Task Arithmetic

# Subtract unwanted task knowledge
model.add_weighted_adapter(
    adapters=["helpful_assistant", "sycophantic_behavior"],
    weights=[1.0, -0.5],  # Keep helpful, remove sycophantic
    adapter_name="improved_assistant",
    combination_type="linear"
)

2. LoRA Hub Weight Optimization

# Use negative weights in optimization (as in LoRA Hub paper)
model.add_weighted_adapter(
    adapters=["task_a", "task_b", "task_c"],
    weights=[2.0, 0.3, -0.7],  # Optimized weights including negative
    adapter_name="multi_task_optimized",
    combination_type="ties",
    density=0.5
)

3. Adapter Negation

# Create inverse of an adapter effect
model.add_weighted_adapter(
    adapters=["formal_style"],
    weights=[-1.0],  # Negate to get informal style
    adapter_name="informal_style",
    combination_type="linear"
)

Test Results

All 21 tests pass successfully:

tests/test_negative_weights.py::TestNegativeWeightsLinear - 4 tests PASSED
tests/test_negative_weights.py::TestNegativeWeightsAllCombinations - 11 tests PASSED
tests/test_negative_weights.py::TestNegativeWeightsMathematicalCorrectness - 2 tests PASSED
tests/test_negative_weights.py::TestBackwardCompatibility - 1 test PASSED
tests/test_negative_weights.py::TestEdgeCases - 3 tests PASSED

Screenshots

Test Suite Results

image

Quick Demo Verification

image

Breaking Changes

None. This PR is fully backward compatible:

  • Positive weights behave identically to the previous implementation
  • All existing functionality is preserved
  • Only adds support for previously unsupported negative weight values

Related Issues

Closes #2796


Credits


Checklist

  • Implementation follows the approved solution from issue discussion
  • Comprehensive test suite added (21 tests)
  • All tests passing
  • Backward compatibility verified
  • Documentation updated
  • Mathematical correctness validated
  • All combination types tested
  • Edge cases covered
  • No breaking changes

Copy link
Contributor

@valteu valteu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look into the proposed changes and the logic update aligns with my solution. I'm however unsure about the effectiveness of some test cases and don't find the docstring update necessary (reasons provided in the comments added).
Also I think the PR comment should be updated, I don't see a value in the LoraHub code example, this has nothing todo with LoraHub imo.
Otherwise Lgtm!

Comment on lines 543 to 545
List of weights for each adapter. Weights can be positive or negative, allowing for both addition
and subtraction of adapter effects. Negative weights enable task arithmetic and adapter negation as
described in the LoRA Hub and Task Arithmetic papers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would cut this, e.g. for _svd_generalized_task_arithmetic_weighted_adapter in SVD-based merging, this was already available and I don't think the hint to LoraHub or Task Arithemetic neccessary here

Suggested change
List of weights for each adapter. Weights can be positive or negative, allowing for both addition
and subtraction of adapter effects. Negative weights enable task arithmetic and adapter negation as
described in the LoRA Hub and Task Arithmetic papers.
List of weights for each adapter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay , fixed in c190977

Comment on lines 319 to 344
def test_very_small_negative_weight(self, simple_model, lora_config):
"""Test very small negative weights."""
model = get_peft_model(simple_model, lora_config, adapter_name="adapter1")

model.add_weighted_adapter(
adapters=["adapter1"],
weights=[-1e-6],
adapter_name="merged_tiny_negative",
combination_type="linear",
)

assert "merged_tiny_negative" in model.peft_config

def test_very_large_negative_weight(self, simple_model, lora_config):
"""Test very large negative weights."""
model = get_peft_model(simple_model, lora_config, adapter_name="adapter1")

model.add_weighted_adapter(
adapters=["adapter1"],
weights=[-100.0],
adapter_name="merged_large_negative",
combination_type="linear",
)

assert "merged_large_negative" in model.peft_config

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not missing sth, I don't see a benefit in this. Nothing regarding weight magnitude was updated and we are not even testing for any mathematical correctness, just to get no error.

Suggested change
def test_very_small_negative_weight(self, simple_model, lora_config):
"""Test very small negative weights."""
model = get_peft_model(simple_model, lora_config, adapter_name="adapter1")
model.add_weighted_adapter(
adapters=["adapter1"],
weights=[-1e-6],
adapter_name="merged_tiny_negative",
combination_type="linear",
)
assert "merged_tiny_negative" in model.peft_config
def test_very_large_negative_weight(self, simple_model, lora_config):
"""Test very large negative weights."""
model = get_peft_model(simple_model, lora_config, adapter_name="adapter1")
model.add_weighted_adapter(
adapters=["adapter1"],
weights=[-100.0],
adapter_name="merged_large_negative",
combination_type="linear",
)
assert "merged_large_negative" in model.peft_config

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in c190977

Copy link
Member

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sambhavnoobcoder for working on this PR and @valteu for your review.

My review focuses on the testing. Right now, completely new tests are being added but I'd rather see the existing tests being extended. As those tests can be difficult to scrutinize, I tried to give some guidance on how to achieve that. Please let me know if anything still remains unclear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for also including unit tests. However, we already have a couple of unit tests for add_weighted_adapter, and I would prefer to extend those instead of having this set of completely separate tests.

I think the best way to do that is to check this test function:

peft/tests/testing_common.py

Lines 1822 to 1853 in b0954e0

def _test_weighted_combination_of_adapters(self, model_id, config_cls, config_kwargs):
if issubclass(config_cls, AdaLoraConfig):
# AdaLora does not support adding more than 1 adapter
return pytest.skip(f"Test not applicable for {config_cls}")
if model_id.endswith("qwen2"):
# Qwen2 fails with weighted adapter combinations using SVD
return pytest.skip(f"Test does not work with model {model_id}")
if "gemma" in model_id.lower():
return pytest.skip("Combining Gemma adapters with SVD is currently failing")
adapter_list = ["adapter1", "adapter_2", "adapter_3"]
weight_list = [0.5, 1.5, 1.5]
# Initialize the config
config = config_cls(
base_model_name_or_path=model_id,
**config_kwargs,
)
if not isinstance(config, (LoraConfig, IA3Config)):
# This test is only applicable for Lora and IA3 configs
return pytest.skip(f"Test not applicable for {config}")
with hub_online_once(model_id):
model = self.transformers_class.from_pretrained(model_id)
model = get_peft_model(model, config, adapter_list[0])
if isinstance(config, LoraConfig):
self._test_weighted_combination_of_adapters_lora(model, config, adapter_list, weight_list)
elif isinstance(config, IA3Config):
self._test_weighted_combination_of_adapters_ia3(model, config, adapter_list, weight_list)
else:
pytest.skip(f"Test not applicable for {config}")

It is already parameterized over PEFT methods (right now, only LoRA and IA³), different model architectures, and merging methods. Right now, it is called with a list of weights weight_list = [0.5, 1.5, 1.5]. The easiest extension would be to define another set of weights which includes negative weights and then run the tests with those as well. So e.g.:

-            if isinstance(config, LoraConfig):
-                self._test_weighted_combination_of_adapters_lora(model, config, adapter_list, weight_list)
-            elif isinstance(config, IA3Config):
-                self._test_weighted_combination_of_adapters_ia3(model, config, adapter_list, weight_list)
+            if isinstance(config, LoraConfig):
+                self._test_weighted_combination_of_adapters_lora(model, config, adapter_list, weight_list)
+                self._test_weighted_combination_of_adapters_lora(model, config, adapter_list, negative_weight_list)
+            elif isinstance(config, IA3Config):
+                self._test_weighted_combination_of_adapters_ia3(model, config, adapter_list, weight_list)
+                self._test_weighted_combination_of_adapters_ia3(model, config, adapter_list, negative_weight_list)

Since those tests are generic in nature, I believe it would be useful to keep the test_negative_weight_negates_adapter test you added. Instead of putting it into a separate file, I'd rather add it to the custom tests after this test:

def test_add_weighted_adapter_cat_with_rank_pattern(self, config_cls):

Then you can re-use one of the models defined there, e.g. MLP, instead of defining a new SimpleModel.

Similarly, I think we can transfer the test_subtraction_with_negative_weights, which looks useful to me. I wonder, however, if we can make that test even more strict. If we have two identical LoRA adapters and then merge them together with weight [1.0, -1.0], should we not expect that they cancel out exactly? Then we can check that the new merged adapter has weights of approximately 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it — the goal was to extend the generic tests instead of creating new files. I’ve refactored accordingly, gone through the existing tests, and only added the cases that I felt weren’t already covered, directly within the generic test suite. This should now be resolved in 6a1c92d .
Please let me know if I’ve missed anything or if further changes are needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just added another small test for testing negative weights with differential scaling in bea394f . i think this covers all the edge cases for this now .

@sambhavnoobcoder
Copy link
Contributor Author

Thank you for your reviews @BenjaminBossan and @valteu . i have made the requested changes here . kindly look into this and let me know if this PR requires any more changes .

Copy link
Member

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for updating the tests. I have a few smaller comments, but overall this looks good.

["default", "other"], weights=[1.0, 1.0], adapter_name="merged", combination_type="cat"
)

def test_negative_weight_negates_adapter(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, let's extend the test name like this:

Suggested change
def test_negative_weight_negates_adapter(self):
def test_add_weighted_adapter_negative_weight_negates_adapter(self):

Same idea for the other tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in 41409af

Comment on lines 2909 to 2910
# Verify the merged adapter was created successfully
assert "merged_diff_scaling" in model.peft_config
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in 41409af


adapter_list = ["adapter1", "adapter_2", "adapter_3"]
weight_list = [0.5, 1.5, 1.5]
negative_weight_list = [0.5, -0.8, 1.2]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
negative_weight_list = [0.5, -0.8, 1.2]
negative_weight_list = [-0.5, -0.8, -1.2]

Let's do all negative numbers, just for good measure :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in 41409af

@sambhavnoobcoder
Copy link
Contributor Author

@BenjaminBossan i went through the suggestions you made , implemented them in 41409af . do tell me if any other changes are needed , i will implement those as well .

@BenjaminBossan
Copy link
Member

@sambhavnoobcoder Could you please run make style to satisfy the linter? Ignore the error about the doc build failure.

@sambhavnoobcoder
Copy link
Contributor Author

@BenjaminBossan ran the make style , style changes in 2 files , committed in c9f3230 . do check it out and tell me if any other changes are needed .

@BenjaminBossan
Copy link
Member

@sambhavnoobcoder We have a bunch of failing tests because the unit test does not consider that weights can be negative:

peft/tests/testing_common.py

Lines 1777 to 1778 in b0954e0

weighted_original_delta_weights = target.get_delta_weight(adapter_list[0]) * weight_list[0]
assert torch.allclose(new_delta_weight, weighted_original_delta_weights, atol=1e-4, rtol=1e-4)

This should be easily fixed like so:

                         weighted_original_delta_weights = target.get_delta_weight(adapter_list[0]) * weight_list[0]
+                        sign = 1 if weight_list[0] > 0 else -1
+                        weighted_original_delta_weights = sign * weighted_original_delta_weights

Could you please update the test?

@sambhavnoobcoder
Copy link
Contributor Author

@BenjaminBossan added that in 8f9a625 . can you check now ?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Member

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sambhavnoobcoder and @valteu for enhancing add_weighted_adapter. The PR LGTM, failing tests are unrelated.

@valteu Pls let me know how I should credit your co-authorship.

@valteu
Copy link
Contributor

valteu commented Oct 9, 2025

Hi, thanks @sambhavnoobcoder and @BenjaminBossan for the great work!
Would it be possible to add the following line to the commit to credit my co-authorship?

Co-authored-by: Valentin Teutschbein <valentin.teutschbein@student.hpi.uni-potsdam.de>

@BenjaminBossan BenjaminBossan merged commit f8aca0a into huggingface:main Oct 9, 2025
5 of 13 checks passed
@BenjaminBossan
Copy link
Member

Great, done @valteu. Greetings from Berlin.

@valteu
Copy link
Contributor

valteu commented Oct 9, 2025

Thanks, greetings from Potsdam :)

@sambhavnoobcoder
Copy link
Contributor Author

Thank you both @BenjaminBossan , @valteu and greetings from Bengaluru as well :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support negative weights when merging LoRA adapters

4 participants