method comparison docs #2509
Conversation
BenjaminBossan left a comment:
Thanks @ved1beta for splitting up the original PR; this makes it much easier to review the changes.
First of all, I want to thank you for the effort you put into this. I've added a lot of comments, but overall I think this could add great value for PEFT users. Note that I have added some comments only to one section, but they would apply equally to the other sections.
At least for some passages, I got the impression that they were written with the help of LLMs. This is not a problem by itself, but please make sure to double-check the correctness of what was written.
Moreover, building the docs currently fails because the referenced markdown files are not found. You can run the doc build locally as described here to ensure that there are no errors.
> ## Choosing the Right Method
>
> When selecting a PEFT method, consider the following factors:
I'm a bit wary of adding a section like this. To me, these conclusions sound too strong compared to the evidence that we have. If you derived these conclusions from some papers (or even meta-reviews), that's of course different, but in that case let's add the references.
Otherwise, I would suggest strictly sticking to the evidence we have. I'm not sure if you ran the PEFT method comparison suite yourself; otherwise, I think I can share the results I got from running it locally, even though those are still preliminary.
Just as an example, unless we add a new task to the method comparison suite that is specifically for classification, I would not document that Bone is especially good for classification.
We can remove this section, let me know what to do.
Let's remove this section if we don't have evidence corroborating the claims.
> # Bone (Bottleneck Orthogonal Network)
>
> ## Overview
> Bone is a parameter-efficient fine-tuning method that uses orthogonal transformations in bottleneck layers. It's particularly effective for small to medium-sized models and offers excellent memory efficiency.
For the descriptions, let's refer to the existing documentation (in this case here) and avoid duplication as much as possible.
removing this
> ### Not Recommended For
> - Tasks requiring extensive model modifications
> - Very small models (< 100M parameters)
> - Non-AdamW optimizers
"Not Recommended For" non-AdamW is not putting it quite correctly, in fact it is only implemented for AdamW, so other optimizers are not supported.
> # Define LoRA-FA configuration
> config = LoraConfig(
>     r=16,  # higher rank recommended for LoRA-FA
Suggested change:
- r=16,  # higher rank recommended for LoRA-FA
+ r=16,  # higher rank recommended for LoRA-FA compared to LoRA
>     target_modules=["q_proj", "v_proj"],
>     lora_dropout=0.05,
>     bias="none",
>     use_fast_adapter=True,  # Enable LoRA-FA
This option does not exist; how did you come up with this?
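For reference, LoRA-FA is not toggled through a `LoraConfig` flag in PEFT; it is provided as a dedicated optimizer. The following is a minimal sketch, assuming the `create_lorafa_optimizer` helper under `peft.optimizers` and a placeholder base model (`facebook/opt-125m`); please verify the exact import path, signature, and hyperparameters against the PEFT version you have installed.

```python
# Hedged sketch: LoRA-FA keeps the LoRA A matrices frozen and trains only B,
# which is handled by a dedicated (AdamW-based) optimizer rather than a config flag.
# The helper name, import path, and hyperparameters below are assumptions to verify.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from peft.optimizers import create_lorafa_optimizer  # assumed import path

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
)
model = get_peft_model(base_model, config)

# The LoRA-FA behavior comes from the optimizer, so pass it explicitly to your
# training loop (or to the Trainer via its `optimizers` argument).
optimizer = create_lorafa_optimizer(model=model, r=16, lora_alpha=32, lr=7e-5)
```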
I'll address all the points you've raised. Thanks again for your valuable feedback!
Changes include: added the relevant numbers in the tables and removed wrong and overly bold claims. I have addressed most of the requested changes; I still need to fix the docs build. I am resolving the conversations that I think are resolved (I will also need to fix the duplication), on it.
BenjaminBossan left a comment:
I did not do a complete in-depth review, but have added some comments. I think it is important that we:
- avoid duplicating information that can already be found elsewhere
- provide transparency on how the numbers and conclusions were derived
I assume that many of the numbers you present stem from the PR that adds the scripts. Therefore, I think we should prioritize that PR. Just as an example, if we find an error there that changes the results, we would need to update the numbers in this PR.
I will need some help with these; what code examples should I add to the files?
As mentioned, let's link to existing examples, e.g.:
> |--------|------------------|----------------|----------------------|
> | LoRA | High (0.96-1.90%) | Fast | 0.96-1.90% of parameters |
> | LoRA-FA | Very High (0.24-0.47%) | Fast | 0.24-0.47% of parameters |
> | Bone | Medium (15.30-30.39%) | Fast | 15.30-30.39% of parameters |
I assume that these numbers come from your script? Then it's important to wait for that PR to be finished first and to explain how the numbers were derived. There is also an issue with duplicated information: "memory efficiency" and "parameter efficiency" show the same values. Finally, 15-30% for Bone seems exceedingly high; did you mean 0.15-0.30%?
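As a side note on making such a column transparent: the percentage of trainable parameters can be read directly from the PEFT model itself. A minimal sketch, using a placeholder base model and LoRA settings (not the configuration behind the table above):

```python
# Hedged sketch: derive a "parameter efficiency" figure from the model itself.
# The base model and LoRA hyperparameters are placeholders, not the ones used for the table.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
model = get_peft_model(base_model, LoraConfig(r=16, target_modules=["q_proj", "v_proj"]))

# Number of trainable parameters vs. total parameters of the wrapped model
trainable, total = model.get_nb_trainable_parameters()
print(f"trainable: {trainable} / {total} ({100 * trainable / total:.2f}%)")

# PEFT also ships a convenience helper that prints the same information:
model.print_trainable_parameters()
```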
> ## Choosing the Right Method
>
> When selecting a PEFT method, consider the following factors:
Let's remove this section if we don't have evidence corroborating the claims.
> ## Implementation
>
> ### Basic Usage
This is still open.
>     bottleneck_size=32,
>     bottleneck_alpha=2.0,
>     bottleneck_dropout=0.1,
Again, these arguments and their values appear to be hallucinated.
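For comparison, the Bone implementation in PEFT is configured through `BoneConfig`. The sketch below assumes its main hyperparameter is `r` (the block size) together with `target_modules`, and it uses a placeholder base model; double-check the argument names against the `BoneConfig` reference before relying on them.

```python
# Hedged sketch of a Bone setup; the bottleneck_* arguments quoted above do not exist.
# The argument names here are assumptions based on the PEFT Bone docs; verify before use.
from transformers import AutoModelForCausalLM
from peft import BoneConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

config = BoneConfig(
    r=64,  # block size used by Bone, not a LoRA-style rank
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```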
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
not stale
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
docs: Add method comparison documentation
This PR adds method comparison docs for PEFT methods, split from the original PR #2504. The changes include:
- method_comparison.md, which serves as an entry point

This is the first part of the original PR, focusing on documentation improvements. Future PRs will address the method comparison suite enhancements and testing components.
Also, please let me know how to address the dummy values, thanks :)