
Conversation

@grewalsk
Contributor

What does this PR do?
This PR enhances the LoKr and LoHa adapter implementations within PEFT by adding proper support for:

  1. 1x1 Convolutions (nn.Conv2d with kernel_size=(1,1))
  2. nn.Conv1d layers (specifically including kernel_size=1).

This allows LoKr/LoHa adapters to be correctly applied to a wider range of modern architectures that heavily utilize these layer types (e.g., ResNet bottlenecks, MobileNet pointwise convolutions, various Transformer blocks). The implementation aims for optimized handling, inspired by LoRA's 1x1 optimization, while maintaining consistency with existing LyCORIS patterns in PEFT. Parts of the implementation logic, particularly for parameter factorization and layer adaptation, were adapted from the KohakuBlueleaf/LyCORIS library (e.g., lycoris/modules/loha.py), consistent with existing acknowledgements within the PEFT codebase.
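
For context on why the 1x1 case admits direct matrix handling: a 1x1 convolution applies the same channel-mixing matrix at every spatial position, so its weight can be flattened to a plain (out_channels, in_channels) matrix. A minimal sketch of this equivalence (illustrative PyTorch only, not PEFT code):

import torch
import torch.nn as nn

# A 1x1 Conv2d is a per-position linear map over channels, so its weight
# can be handled as a 2D matrix for low-rank / Kronecker factorization.
conv = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=1, bias=False)
lin = nn.Linear(64, 128, bias=False)
with torch.no_grad():
    lin.weight.copy_(conv.weight.squeeze(-1).squeeze(-1))  # (128, 64)

x = torch.randn(2, 64, 8, 8)
out_conv = conv(x)
out_lin = lin(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
assert torch.allclose(out_conv, out_lin, atol=1e-5)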

This includes:

  • New Conv1d adapter layer classes for both LoKr and LoHa, mirroring Conv2d.
  • Updated layers_mapping in LoKrModel and LoHaModel to recognize Conv1d.
  • Enhanced create_adapter_parameters methods in LoKr/LoHa to correctly initialize parameters based on Conv1d weight shapes.
  • Refactored update_layer methods in LoKr/LoHa to:
    • Detect Conv1d layers.
    • Implement specific logic for 1x1 Conv2d and kernel_size=1 Conv1d, notably disabling use_effective_conv2d where appropriate for direct matrix handling.
    • Ensure correct shape calculations for factorization (a shape sketch follows this list).
  • Added detection flags (is_1x1_conv2d, is_1_conv1d) in get_delta_weight methods as hooks for potential future computation optimizations (without altering current paths).
  • Maintained backward compatibility; changes are additive and do not affect existing functionality for other layer types or kernel sizes.
  • Followed established PEFT/LyCORIS coding patterns for consistency.
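
As a rough illustration of the shape handling described above, here is a minimal sketch with a naive factorization helper; the names and logic are illustrative, not the PEFT internals:

import torch
import torch.nn as nn

def factorize(n: int) -> tuple[int, int]:
    # naive helper (assumption): split n into a * b == n with a <= b
    a = int(n ** 0.5)
    while n % a:
        a -= 1
    return a, n // a

# Treat a kernel_size=1 Conv1d weight as a plain matrix for Kronecker factorization.
conv = nn.Conv1d(in_channels=64, out_channels=128, kernel_size=1)
out_c, in_c, k = conv.weight.shape        # (128, 64, 1)
out_a, out_b = factorize(out_c)           # e.g. 128 -> (8, 16)
in_a, in_b = factorize(in_c * k)          # e.g. 64 * 1 -> (8, 8)

w1 = torch.randn(out_a, in_a)
w2 = torch.randn(out_b, in_b)
delta = torch.kron(w1, w2).reshape(out_c, in_c, k)  # back to the Conv1d weight shape
assert delta.shape == conv.weight.shape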

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guidelines?
  • Did you write any new necessary tests?
  • Did you run the tests?
  • Did you run the pre-commit hooks?
  • Did you check the documentation builds correctly?

Fixes #1935

Member

@BenjaminBossan BenjaminBossan left a comment

Thanks a lot for this great PR, it is much appreciated.

I saw that some small changes are still required; please check the comments. Also, to accept Conv1d, it needs to be added here:

if isinstance(target_base_layer, torch.nn.Conv2d):
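
For instance, the dispatch could gain a Conv1d branch along these lines (a sketch of the shape of the change, not the exact code in lycoris_utils.py):

if isinstance(target_base_layer, torch.nn.Conv2d):
    ...  # existing Conv2d handling
elif isinstance(target_base_layer, torch.nn.Conv1d):
    ...  # new: route Conv1d targets to the Conv1d adapter layer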

Then, it's important to add unit tests to ensure that 1x1 conv and Conv1d work as expected. For this, let's add the following test cases:

    (
        "Conv1D LoHa",
        "Conv1d",
        LoHaConfig,
        {"target_modules": ["conv1d"]},
    ),
    ("Conv2d 1x1 LoHa", "Conv2d1x1", LoHaConfig, {"target_modules": ["conv2d"]}),

...

    (
        "Conv1D LoKr",
        "Conv1d",
        LoKrConfig,
        {"target_modules": ["conv1d"]},
    ),
    ("Conv2d 1x1 LoKr", "Conv2d1x1", LoKrConfig, {"target_modules": ["conv2d"]}),

here and here.

We also need to define the ModelConv2D1x1, e.g. like this:

class ModelConv2D1x1(nn.Module):
    def __init__(self):
        super().__init__()
        # 1x1 (pointwise) convolution over 5 input channels
        self.conv2d = nn.Conv2d(5, 10, 1)
        self.relu = nn.ReLU()
        self.flat = nn.Flatten()
        # 10 channels * 3 * 3 spatial positions after the 1x1 conv
        self.lin0 = nn.Linear(90, 2)
        self.sm = nn.LogSoftmax(dim=-1)
        self.dtype = torch.float

    def forward(self, X):
        X = X.to(self.dtype)
        X = X.reshape(-1, 5, 3, 3)
        X = self.conv2d(X)
        X = self.relu(X)
        X = self.flat(X)
        X = self.lin0(X)
        X = self.sm(X)
        return X

Make sure to add this snippet here:

        if model_id == "Conv2d1x1":
            return ModelConv2D1x1().to(torch_dtype)

@BenjaminBossan
Member

gentle ping @grewalsk

Kabir Grewal added 3 commits May 10, 2025 20:19

  support to LyCORIS adapters

  - Update lycoris_utils.py to support Conv1d layers
  - Add ModelConv2D1x1 and ModelConv1D test classes
  - Add test cases for Conv1d and Conv2d 1x1 with LoHa and LoKr
  - Update imports in loha/model.py and lokr/model.py
@grewalsk
Contributor Author

Hi @BenjaminBossan, could you take a quick look?

I removed the conflicting ModelConv2DGroups class and cleared the merge markers locally, but it still shows up in the PR diff.

Collaborator

@githubnemo githubnemo left a comment

Thanks for the update; I'll take over the review for now :)

I have some questions/remarks, but nothing major. The merge conflict markers are not visible to me, so that seems resolved.

…onsistency

Clarify existing optimal handling for 1x1 Conv2d and Conv1d layers;
confirmed use_effective_conv2d is already correctly disabled in the relevant files,
addressing reviewer feedback on potential dead code.

Standardize test case naming in test_custom_models.py for consistency:
- Updated 'LoHa' to 'LOHA' and 'LoKr' to 'LOKR' (all caps).
- Aligned Conv1D/Conv1d naming with PyTorch conventions for clarity.
@grewalsk
Contributor Author

Does this address your concerns, or should I add something?

Collaborator

@githubnemo githubnemo left a comment

Thank you so much for the quick update. I think this looks good to go :)

Let's see what the CI has to say, but if there are no further issues, we can proceed with merging this.

@githubnemo githubnemo dismissed BenjaminBossan’s stale review May 28, 2025 16:17

Took over review

@githubnemo
Collaborator

@grewalsk could you run make style from the top of the repository and push the changes?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@githubnemo
Collaborator

githubnemo commented May 28, 2025

It seems that we have two ModelConv1D classes in test_custom_models. Could you fix that? :)
I think there was already a class called ModelConv1D with kernel size 2.

Renamed the second ModelConv1D class (kernel_size=1) to ModelConv1DKernel1 and updated the MockTransformerWrapper mapping to avoid class name conflicts for the 1x1 support.
@grewalsk
Contributor Author

Is this good? @githubnemo

Collaborator

@githubnemo githubnemo left a comment

You checked in a merge conflict. If you could fix that and run make style once again, I think we're good to go.

.gitignore Outdated
Comment on lines 143 to 149
<<<<<<< Updated upstream
# method_comparison logs
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/
=======
**/.claude/settings.local.json
>>>>>>> Stashed changes
Collaborator

Suggested change
<<<<<<< Updated upstream
# method_comparison logs
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/
=======
**/.claude/settings.local.json
>>>>>>> Stashed changes
# method_comparison logs
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/

@grewalsk
Contributor Author

grewalsk commented Jun 3, 2025

I think it's good now. @githubnemo

@githubnemo
Collaborator

@grewalsk It seems that there are some failing tests regarding LoKr/LoHa. Could you investigate?

You can reproduce those runs locally by running pytest -k "loha or lokr" tests/ (a single -k expression; passing -k twice would keep only the last filter).

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@githubnemo
Collaborator

not stale

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@githubnemo
Collaborator

not stale

@githubnemo
Collaborator

Hey @grewalsk, do you still plan on implementing this further?

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Member

@BenjaminBossan BenjaminBossan left a comment

Thanks for picking up this PR. In general, it looks to be in a good state and my comments are mostly just minor things.

I did, however, discover that some of the new functionality for Conv1d is not covered by tests, and when I tried to add coverage, I saw that multiple tests are failing. Please check my comment below for details. If these failures are not something we wish to investigate now, I would also be fine with raising an error for the time being ("Using LoHa/LoKr with Conv1d and ... is currently not supported").

Moreover, let's update the docs for use_effective_conv2d in the respective configs to reflect its new meaning in the context of Conv1d. In a sense, the name no longer fits, but I'd be fine with leaving it for the sake of backwards compatibility.
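
A possible wording for that field, assuming the usual dataclass field pattern used in PEFT configs (a sketch of the documentation update, not the merged text):

from dataclasses import field

use_effective_conv2d: bool = field(  # inside LoKrConfig / LoHaConfig (sketch)
    default=False,
    metadata={
        "help": (
            "Use parameter-effective decomposition for convolutions with kernel size > 1. "
            "For 1x1 Conv2d and kernel_size=1 Conv1d layers, the weight is treated as a "
            "plain matrix and this flag has no effect."
        )
    },
)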

.gitignore Outdated
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/

**/.claude/settings.local.json
Member

Let's remove this.

Collaborator

It's unrelated but probably not too uncommon anymore, so I'd vote to keep it and add a # Coding agents comment instead. WDYT?

Member

Wouldn't this open the floodgates? I'd rather recommend that folks use a global .gitignore for their specific stuff.

Collaborator

I've removed it for now. Let's see if we need to comment on this often in the future, then we can add it again.

Comment on lines 145 to 146
else:
use_effective_conv2d = use_effective_conv2d
Member

No-op, can be removed. In fact, this snippet does the same as the one it replaces, right?
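
In other words, the branch can simply be dropped. A minimal sketch of the intended control flow (assumed structure of the surrounding update_layer logic, not the actual code):

# The 1x1 / kernel_size=1 cases force direct matrix handling; otherwise the
# passed-in value is kept, so no explicit else branch is needed.
if isinstance(base_layer, nn.Conv2d) and base_layer.kernel_size == (1, 1):
    use_effective_conv2d = False
elif isinstance(base_layer, nn.Conv1d) and base_layer.kernel_size == (1,):
    use_effective_conv2d = False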

Comment on lines 161 to 162
else:
use_effective_conv2d = use_effective_conv2d
Member

Same, can be removed.

Comment on lines 221 to 222
else:
use_effective_conv2d = use_effective_conv2d
Member

same

Comment on lines 239 to 240
else:
use_effective_conv2d = use_effective_conv2d
Member

same

Comment on lines 92 to 98
elif use_effective_conv2d: # Even for Conv1d, use the effective parameter for kernel dimension
self.lokr_t2[adapter_name] = nn.Parameter(torch.empty(r, r, shape[2]))
self.lokr_w2_a[adapter_name] = nn.Parameter(torch.empty(r, shape[0][1])) # b, 1-mode
self.lokr_w2_b[adapter_name] = nn.Parameter(torch.empty(r, shape[1][1])) # d, 2-mode
else:
self.lokr_w2_a[adapter_name] = nn.Parameter(torch.empty(shape[0][1], r))
self.lokr_w2_b[adapter_name] = nn.Parameter(torch.empty(r, shape[1][1] * shape[2]))
Member

These lines are not covered by tests. I think we need to extend the test cases a bit:

diff --git a/tests/test_custom_models.py b/tests/test_custom_models.py
index 505840e1..62e33011 100644
--- a/tests/test_custom_models.py
+++ b/tests/test_custom_models.py
@@ -291,7 +291,9 @@ TEST_CASES = [
     ("Conv2d 2 LOKR", "Conv2d", LoKrConfig, {"target_modules": ["conv2d", "lin0"]}),
     ("Conv2d 3 LOKR", "Conv2d", LoKrConfig, {"target_modules": ["conv2d"], "use_effective_conv2d": True}),
     ("Conv2d 4 LOKR", "Conv2d", LoKrConfig, {"target_modules": ["conv2d", "lin0"], "use_effective_conv2d": True}),
-    ("Conv1d LOKR", "Conv1d", LoKrConfig, {"target_modules": ["conv1d"]}),
+    ("Conv1d 1 LOKR", "Conv1d", LoKrConfig, {"target_modules": ["conv1d"]}),
+    ("Conv1d 2 LOKR", "Conv1dBigger", LoKrConfig, {"target_modules": ["conv1d"], "r": 2, "use_effective_conv2d": False}),
+    ("Conv1d 3 LOKR", "Conv1dBigger", LoKrConfig, {"target_modules": ["conv1d"], "r": 2, "use_effective_conv2d": True}),
     ("Conv2d 1x1 LOKR", "Conv2d1x1", LoKrConfig, {"target_modules": ["conv2d"]}),
     (
         "Conv2d 5 LOKR",
@@ -1193,6 +1195,29 @@ class ModelConv1D(nn.Module):
         return X
 
 
+class ModelConv1dBigger(nn.Module):
+    # TODO
+    def __init__(self):
+        super().__init__()
+        self.conv1d = nn.Conv1d(64, 16, 2)
+        self.relu = nn.ReLU()
+        self.flat = nn.Flatten()
+        self.lin0 = nn.Linear(144, 2)
+        self.sm = nn.LogSoftmax(dim=-1)
+        self.dtype = torch.float
+
+    def forward(self, X):
+        X = X.to(self.dtype)
+        X = X.reshape(-1, 1, 10)
+        X = torch.concat([X] * 64, dim=1)
+        X = self.conv1d(X)
+        X = self.relu(X)
+        X = self.flat(X)
+        X = self.lin0(X)
+        X = self.sm(X)
+        return X
+
+
 class ModelConv2D(nn.Module):
     def __init__(self, bias=True):
         super().__init__()
@@ -1426,6 +1451,9 @@ class MockTransformerWrapper:
         if model_id == "Conv1d":
             return ModelConv1D().to(torch_dtype)
 
+        if model_id == "Conv1dBigger":
+            return ModelConv1dBigger().to(torch_dtype)
+
         if model_id == "Conv2d":
             return ModelConv2D().to(torch_dtype)
 

However, with these changes, there are errors in some tests that involve merging. I'm not sure how trivial it is to fix those errors.

The same argument applies to LoHa, where the analogous lines are not covered and tests would need to be extended.
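
For reference, the failing merge behaviour can be probed with a generic check along these lines (assumed PEFT API usage; a sketch, not one of the repository's tests; ModelConv1dBigger refers to the class from the diff above):

import torch
from peft import LoKrConfig, get_peft_model

base = ModelConv1dBigger()
config = LoKrConfig(target_modules=["conv1d"], r=2, use_effective_conv2d=True)
model = get_peft_model(base, config).eval()

x = torch.randn(4, 10)
with torch.no_grad():
    out_adapter = model(x)
    model.merge_adapter()   # fold the LoKr delta into conv1d.weight
    out_merged = model(x)

# Merged and unmerged forward passes should agree.
assert torch.allclose(out_adapter, out_merged, atol=1e-5)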

Member

@BenjaminBossan BenjaminBossan left a comment

Nice fix, just adding a dimension instead of adding a new special case for merging. Thanks again for picking up the PR, LGTM.

@githubnemo githubnemo merged commit e62aee4 into huggingface:main Aug 27, 2025
14 checks passed
Successfully merging this pull request may close these issues:

[Call for contributions] help us improve LoKr, LoHa, and other LyCORIS