feat(lokr, loha): add 1x1 Conv2d and Conv1d support #2515
Conversation
BenjaminBossan
left a comment
Thanks a lot for this great PR, it is much appreciated.
I saw some small changes that are still required; please check the comments. Also, to accept Conv1d, it needs to be added here:
peft/src/peft/tuners/lycoris_utils.py
Line 263 in d3ff133
if isinstance(target_base_layer, torch.nn.Conv2d):
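For illustration, the kind of extension meant here could look roughly as follows (a sketch only, using a hypothetical helper; the real dispatch code in lycoris_utils.py is structured differently, but the point is that Conv1d must be accepted alongside Conv2d):

import torch

# Hypothetical sketch, not the actual lycoris_utils.py code: accept Conv1d when
# choosing the adapter layer class for a target module.
def pick_adapter_cls(target_base_layer: torch.nn.Module, layers_mapping: dict) -> type:
    if isinstance(target_base_layer, torch.nn.Conv2d):
        return layers_mapping[torch.nn.Conv2d]
    elif isinstance(target_base_layer, torch.nn.Conv1d):  # newly accepted layer type
        return layers_mapping[torch.nn.Conv1d]
    raise ValueError(f"Unsupported layer type: {type(target_base_layer)}")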
Then, it's important to add unit tests to ensure that 1x1 conv and Conv1d work as expected. For this, let's add the following test cases:
(
    "Conv1D LoHa",
    "Conv1d",
    LoHaConfig,
    {"target_modules": ["conv1d"]},
),
("Conv2d 1x1 LoHa", "Conv2d1x1", LoHaConfig, {"target_modules": ["conv2d"]}),
...
(
    "Conv1D LoKr",
    "Conv1d",
    LoKrConfig,
    {"target_modules": ["conv1d"]},
),
("Conv2d 1x1 LoKr", "Conv2d1x1", LoKrConfig, {"target_modules": ["conv2d"]}),

We also need to define the ModelConv2d1x1, e.g. like this:
class ModelConv2D1x1(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv2d = nn.Conv2d(5, 10, 1)
        self.relu = nn.ReLU()
        self.flat = nn.Flatten()
        self.lin0 = nn.Linear(90, 2)
        self.sm = nn.LogSoftmax(dim=-1)
        self.dtype = torch.float

    def forward(self, X):
        X = X.to(self.dtype)
        X = X.reshape(-1, 5, 3, 3)
        X = self.conv2d(X)
        X = self.relu(X)
        X = self.flat(X)
        X = self.lin0(X)
        X = self.sm(X)
        return X

Ensure to add this snippet here:

if model_id == "Conv2d1x1":
    return ModelConv2D1x1().to(torch_dtype)
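The "Conv1d" model id used in the test cases above also needs a companion test model. A minimal sketch (the layer sizes here are illustrative and may differ from the actual test file):

class ModelConv1D(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1d = nn.Conv1d(1, 2, 3)
        self.relu = nn.ReLU()
        self.flat = nn.Flatten()
        self.lin0 = nn.Linear(16, 2)
        self.sm = nn.LogSoftmax(dim=-1)
        self.dtype = torch.float

    def forward(self, X):
        X = X.to(self.dtype)
        X = X.reshape(-1, 1, 10)  # (batch, channels, length)
        X = self.conv1d(X)        # -> (batch, 2, 8)
        X = self.relu(X)
        X = self.flat(X)          # -> (batch, 16)
        X = self.lin0(X)
        X = self.sm(X)
        return X

together with a corresponding branch in the model factory, e.g. if model_id == "Conv1d": return ModelConv1D().to(torch_dtype).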
gentle ping @grewalsk
…support to LyCORIS adapters:
- Update lycoris_utils.py to support Conv1d layers
- Add ModelConv2D1x1 and ModelConv1D test classes
- Add test cases for Conv1d and Conv2d 1x1 with LoHa and LoKr
- Update imports in loha/model.py and lokr/model.py
Hi @BenjaminBossan, could you take a quick look? I removed the conflicting ModelConv2DGroups class and cleared the merge markers locally, but it still shows up in the PR diff.
githubnemo
left a comment
Thanks for the update, I'll take over the review for now :)
I've had some questions/remarks but nothing major. The merge conflict markers are not visible to me, so that seems resolved.
…onsistency

Clarify existing optimal handling for 1x1 Conv2d and Conv1d layers; confirmed it is already correctly disabled in the relevant files, addressing reviewer feedback on potential dead code. Standardize test case naming for consistency:
- Updated 'LoHa' to 'LOHA' and 'LoKr' to 'LOKR' (all caps).
- Aligned Conv1D/Conv1d naming with PyTorch conventions for clarity.
does this address your concerns or should I add something?
Thank you so much for the quick update. I think this looks good to go :)
Let's see what the CI has to say but if there are no further issues, we can proceed merging this.
@grewalsk could you run

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
It seems that we have two ModelConv1D classes.

Renamed the second ModelConv1D class (kernel_size=1) to ModelConv1DKernel1 and updated the MockTransformerWrapper mapping to avoid class name conflicts for the 1x1 support.

Is this good? @githubnemo
githubnemo
left a comment
You checked in a merge conflict; if you could fix that and run make style once again, I think we're good to go.
.gitignore
Outdated
<<<<<<< Updated upstream
# method_comparison logs
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/
=======
**/.claude/settings.local.json
>>>>>>> Stashed changes
Suggested change: replace

<<<<<<< Updated upstream
# method_comparison logs
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/
=======
**/.claude/settings.local.json
>>>>>>> Stashed changes

with

# method_comparison logs
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/
I think it's good now. @githubnemo

@grewalsk It seems that there are some failing tests regarding LoKr/LoHa. Could you investigate? You can reproduce those runs locally by running
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

not stale

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

not stale

Hey @grewalsk, do you still plan on implementing this further?

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
BenjaminBossan
left a comment
Thanks for picking up this PR. In general, it looks to be in a good state and my comments are mostly just minor things.
I did, however, discover that some of the new functionality for Conv1d is not covered by tests, and when I tried to add coverage, I saw that multiple tests are failing. Please check my comment below for details. If these failures are not something we wish to investigate now, I would also be fine with raising an error for the time being ("Using LoHa/LoKr with Conv1d and ... is currently not supported").
Moreover, let's update the docs for use_effective_conv2d in the respective configs to reflect its new meaning in the context of Conv1d. In a sense, the name no longer fits but I'd be fine with leaving it for the sake of backwards compatibility.
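If we go the error route, a guard along these lines would do (a hypothetical sketch, not code from this PR; the condition is a placeholder, since the exact unsupported combination is left open above):

import torch.nn as nn

def check_conv1d_support(base_layer: nn.Module, unsupported_combination: bool) -> None:
    # Hypothetical guard: reject the Conv1d configuration that is not yet
    # supported instead of silently producing wrong results.
    if isinstance(base_layer, nn.Conv1d) and unsupported_combination:
        raise ValueError("Using LoHa/LoKr with Conv1d and ... is currently not supported")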
.gitignore
Outdated
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/

**/.claude/settings.local.json
Let's remove this.
It's unrelated but probably not too uncommon anymore, so I'd vote to keep it instead + adding a # Coding agents comment. WDYT?
Wouldn't this open the floodgates? I'd rather recommend folks to use a global .gitignore for their specific stuff.
I've removed it for now. Let's see if we need to comment on this often in the future, then we can add it again.
src/peft/tuners/loha/layer.py
Outdated
else:
    use_effective_conv2d = use_effective_conv2d
No-op, can be removed. In fact, this snippet does the same as the one it replaces, right?
src/peft/tuners/loha/layer.py
Outdated
else:
    use_effective_conv2d = use_effective_conv2d
Same, can be removed.
src/peft/tuners/lokr/layer.py
Outdated
else:
    use_effective_conv2d = use_effective_conv2d
same
src/peft/tuners/lokr/layer.py
Outdated
else:
    use_effective_conv2d = use_effective_conv2d
same
elif use_effective_conv2d:  # Even for Conv1d, use the effective parameter for kernel dimension
    self.lokr_t2[adapter_name] = nn.Parameter(torch.empty(r, r, shape[2]))
    self.lokr_w2_a[adapter_name] = nn.Parameter(torch.empty(r, shape[0][1]))  # b, 1-mode
    self.lokr_w2_b[adapter_name] = nn.Parameter(torch.empty(r, shape[1][1]))  # d, 2-mode
else:
    self.lokr_w2_a[adapter_name] = nn.Parameter(torch.empty(shape[0][1], r))
    self.lokr_w2_b[adapter_name] = nn.Parameter(torch.empty(r, shape[1][1] * shape[2]))
These lines are not covered by tests. I think we need to extend the test cases a bit:
diff --git a/tests/test_custom_models.py b/tests/test_custom_models.py
index 505840e1..62e33011 100644
--- a/tests/test_custom_models.py
+++ b/tests/test_custom_models.py
@@ -291,7 +291,9 @@ TEST_CASES = [
("Conv2d 2 LOKR", "Conv2d", LoKrConfig, {"target_modules": ["conv2d", "lin0"]}),
("Conv2d 3 LOKR", "Conv2d", LoKrConfig, {"target_modules": ["conv2d"], "use_effective_conv2d": True}),
("Conv2d 4 LOKR", "Conv2d", LoKrConfig, {"target_modules": ["conv2d", "lin0"], "use_effective_conv2d": True}),
- ("Conv1d LOKR", "Conv1d", LoKrConfig, {"target_modules": ["conv1d"]}),
+ ("Conv1d 1 LOKR", "Conv1d", LoKrConfig, {"target_modules": ["conv1d"]}),
+ ("Conv1d 2 LOKR", "Conv1dBigger", LoKrConfig, {"target_modules": ["conv1d"], "r": 2, "use_effective_conv2d": False}),
+ ("Conv1d 3 LOKR", "Conv1dBigger", LoKrConfig, {"target_modules": ["conv1d"], "r": 2, "use_effective_conv2d": True}),
("Conv2d 1x1 LOKR", "Conv2d1x1", LoKrConfig, {"target_modules": ["conv2d"]}),
(
"Conv2d 5 LOKR",
@@ -1193,6 +1195,29 @@ class ModelConv1D(nn.Module):
return X
+class ModelConv1dBigger(nn.Module):
+ # TODO
+ def __init__(self):
+ super().__init__()
+ self.conv1d = nn.Conv1d(64, 16, 2)
+ self.relu = nn.ReLU()
+ self.flat = nn.Flatten()
+ self.lin0 = nn.Linear(144, 2)
+ self.sm = nn.LogSoftmax(dim=-1)
+ self.dtype = torch.float
+
+ def forward(self, X):
+ X = X.to(self.dtype)
+ X = X.reshape(-1, 1, 10)
+ X = torch.concat([X] * 64, dim=1)
+ X = self.conv1d(X)
+ X = self.relu(X)
+ X = self.flat(X)
+ X = self.lin0(X)
+ X = self.sm(X)
+ return X
+
+
class ModelConv2D(nn.Module):
def __init__(self, bias=True):
super().__init__()
@@ -1426,6 +1451,9 @@ class MockTransformerWrapper:
if model_id == "Conv1d":
return ModelConv1D().to(torch_dtype)
+ if model_id == "Conv1dBigger":
+ return ModelConv1dBigger().to(torch_dtype)
+
if model_id == "Conv2d":
return ModelConv2D().to(torch_dtype)
However, with these changes, there are errors in some tests that involve merging. I'm not sure how trivial it is to fix those errors.
The same argument applies to LoHa, where the analogous lines are not covered and tests would need to be extended.
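For LoHa, the analogous test cases could look like this (a sketch following the LoKr diff above; the case names and hyperparameters are illustrative):

    ("Conv1d 1 LOHA", "Conv1d", LoHaConfig, {"target_modules": ["conv1d"]}),
    ("Conv1d 2 LOHA", "Conv1dBigger", LoHaConfig, {"target_modules": ["conv1d"], "r": 2, "use_effective_conv2d": False}),
    ("Conv1d 3 LOHA", "Conv1dBigger", LoHaConfig, {"target_modules": ["conv1d"], "r": 2, "use_effective_conv2d": True}),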
BenjaminBossan
left a comment
Nice fix, just adding a dimension instead of adding a new special case for merging. Thanks again for picking up the PR, LGTM.
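To illustrate the idea (not the PR's actual code): a Conv1d delta weight of shape (out_channels, in_channels, kernel_size) can be viewed as a Conv2d weight with a trailing singleton dimension, so the existing Conv2d merge path applies unchanged.

import torch

delta_conv1d = torch.randn(16, 64, 2)         # hypothetical Conv1d delta weight (out, in, k)
delta_as_conv2d = delta_conv1d.unsqueeze(-1)  # (out, in, k, 1): reuse the Conv2d merge logic
merged_back = delta_as_conv2d.squeeze(-1)     # back to the Conv1d shape afterwards
assert merged_back.shape == delta_conv1d.shape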
What does this PR do?
This PR enhances the LoKr and LoHa adapter implementations within PEFT by adding proper support for:
- nn.Conv2d with kernel_size=(1,1)
- nn.Conv1d layers (specifically including kernel_size=1).

This allows LoKr/LoHa adapters to be correctly applied to a wider range of modern architectures that heavily utilize these layer types (e.g., ResNet bottlenecks, MobileNet pointwise convolutions, various Transformer blocks). The implementation aims for optimized handling, inspired by LoRA's 1x1 optimization, while maintaining consistency with existing LyCORIS patterns in PEFT. Parts of the implementation logic, particularly for parameter factorization and layer adaptation, were adapted from the KohakuBlueleaf/LyCORIS library (e.g., lycoris/modules/loha.py), consistent with existing acknowledgements within the PEFT codebase.

This includes:
- Conv1d adapter layer classes for both LoKr and LoHa, mirroring Conv2d.
- layers_mapping updates in LoKrModel and LoHaModel to recognize Conv1d.
- create_adapter_parameters methods in LoKr/LoHa updated to correctly initialize parameters based on Conv1d weight shapes.
- update_layer methods in LoKr/LoHa updated to:
  - support Conv1d layers.
  - handle 1x1 Conv2d and kernel_size=1 Conv1d, notably disabling use_effective_conv2d where appropriate for direct matrix handling.
- Flags (is_1x1_conv2d, is_1_conv1d) in get_delta_weight methods as hooks for potential future computation optimizations (without altering current paths).

Before submitting
Fixes #1935
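For illustration, applying one of these adapters to a model containing such layers could look roughly like this (a sketch; the toy module and its names are made up):

import torch.nn as nn
from peft import LoKrConfig, get_peft_model

class TinyModel(nn.Module):
    # toy model with a pointwise (1x1) Conv2d and a Conv1d layer
    def __init__(self):
        super().__init__()
        self.pointwise = nn.Conv2d(8, 16, kernel_size=1)
        self.temporal = nn.Conv1d(16, 16, kernel_size=3)

    def forward(self, x2d, x1d):
        return self.pointwise(x2d), self.temporal(x1d)

config = LoKrConfig(r=4, target_modules=["pointwise", "temporal"])
peft_model = get_peft_model(TinyModel(), config)
peft_model.print_trainable_parameters()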