
Conversation

@ved1beta ved1beta commented Apr 23, 2025

docs: Add method comparison documentation

This PR adds method comparison docs for PEFT methods, split from the original PR #2504. The changes include:

  1. Added a new section in developer guides for method comparison
  2. Created a main method comparison guide (method_comparison.md) that serves as an entry point
  3. Added detailed documentation for three methods:
    • LoRA (Low-Rank Adaptation)
    • LoRA-FA (LoRA with Fast Adaptation)
    • Bone (Bottleneck Orthogonal Network)
  4. Updated the table of contents to include the new documentation section

This is the first part of the original PR, focusing on documentation improvements. Future PRs will address the method comparison suite enhancements and testing components.

Also, please let me know how to address the dummy values, thanks :)

@BenjaminBossan BenjaminBossan left a comment


Thanks @ved1beta for splitting up the original PR, this makes it much easier to review the changes.

First of all, I want to thank you for the effort you put into this. I've added a lot of comments but I think overall, this could add great value to PEFT users. Note that I have added some comments only to one section but they would apply equally to other sections.

At least for some passages, I got the impression that they were written with the help of LLMs. This is not a problem by itself, but please ensure to double check the correctness of what was written.

Moreover, building the docs currently fails because the referenced markdown files are not found. You can run the doc build locally as described here to ensure that there are no errors.


## Choosing the Right Method

When selecting a PEFT method, consider the following factors:
Member


I'm a bit wary of adding a section like this. To me, these conclusions sound too strong compared to the evidence that we have. If you derived these conclusions from some papers (or even meta-reviews), that's of course different, but in that case let's add the references.

Otherwise, I would suggest strictly sticking to the evidence we have. I'm not sure if you ran the PEFT method comparison suite yourself; otherwise, I think I can share the results I got from running it locally, even though those are still preliminary.

Just as an example, unless we add a new task to the method comparison suite that is specifically for classification, I would not document that Bone is especially good for classification.

Contributor Author


We can remove this section, let me know what to do.

Member


Let's remove this section if we don't have evidence corroborating the claims.

# Bone (Bottleneck Orthogonal Network)

## Overview
Bone is a parameter-efficient fine-tuning method that uses orthogonal transformations in bottleneck layers. It's particularly effective for small to medium-sized models and offers excellent memory efficiency.
Member


For the descriptions, let's refer to the existing documentation (in this case here) and avoid duplication as much as possible.

Contributor Author


removing this

### Not Recommended For
- Tasks requiring extensive model modifications
- Very small models (< 100M parameters)
- Non-AdamW optimizers
Member


"Not Recommended For" non-AdamW is not putting it quite correctly, in fact it is only implemented for AdamW, so other optimizers are not supported.


# Define LoRA-FA configuration
config = LoraConfig(
r=16, # higher rank recommended for LoRA-FA
Member


Suggested change
r=16, # higher rank recommended for LoRA-FA
r=16, # higher rank recommended for LoRA-FA compared to LoRA

target_modules=["q_proj", "v_proj"],
lora_dropout=0.05,
bias="none",
use_fast_adapter=True, # Enable LoRA-FA
Member


This option does not exist, how did you come to this?
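
For reference, here is a minimal sketch of how a LoRA-FA setup could be wired up with PEFT, assuming the `create_lorafa_optimizer` helper from `peft.optimizers`; the model name, hyperparameters, and exact signature below are assumptions and should be verified against the current PEFT docs. The point is that the LoraConfig itself carries no LoRA-FA flag; the frozen-A behaviour comes from the optimizer.

```python
# Sketch only: there is no `use_fast_adapter` config option; LoRA-FA is applied
# through an optimizer helper (import path and signature assumed, please verify).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from peft.optimizers import create_lorafa_optimizer  # assumed helper

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # example model

config = LoraConfig(
    r=16,  # higher rank recommended for LoRA-FA compared to LoRA
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
)
model = get_peft_model(model, config)

# The LoRA-FA behaviour (frozen A matrices) lives in the optimizer, not the config.
optimizer = create_lorafa_optimizer(model=model, r=16, lora_alpha=32, lr=2e-4)
```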

@ved1beta
Contributor Author

I'll address all the points you've raised:
The documentation gaps you've identified are correct, and I'll work on filling those gaps systematically. I agree there are too many LLM hallucinations; I will run the scripts, get the relevant data and numbers, and submit an updated version addressing these changes shortly.

Thanks again for your valuable feedback!

@ved1beta
Contributor Author

ved1beta commented Apr 24, 2025

Changes include: added relevant numbers in the tables, removed wrong and bold claims. I have addressed most of the changes asked for; I still need to fix the docs build. I am resolving the conversations that I think are resolved (I will also need to fix the duplication), on it.
I will need some help with these: what code examples to add in the files.

@BenjaminBossan BenjaminBossan left a comment


I did not do a complete in-depth review, but have added some comments. I think it is important that we:

  • avoid duplicating information that can already be found elsewhere
  • provide transparency on how the numbers and conclusions were derived

I assume that many of the numbers you present stem from the PR that adds the scripts. Therefore, I think we should prioritize that PR. Just as an example, if we find an error there that changes the results, we would need to update the numbers in this PR.

> I will need some help with these: what code examples to add in the files.

As mentioned, let's link to existing examples, e.g.:

| Method | Memory Efficiency | Training Speed | Parameter Efficiency |
|--------|------------------|----------------|----------------------|
| LoRA | High (0.96-1.90%) | Fast | 0.96-1.90% of parameters |
| LoRA-FA | Very High (0.24-0.47%) | Fast | 0.24-0.47% of parameters |
| Bone | Medium (15.30-30.39%) | Fast | 15.30-30.39% of parameters |
Member


I assume that these numbers come from your script? Then it's important to wait for that PR to be finished first and explain how the numbers were derived. There is also an issue with duplicated information: "memory efficiency" and "parameter efficiency" show the same values. Finally, 15-30% for Bone seems exceedingly high; did you mean 0.15-0.30%?
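
To make the derivation transparent, the percentages can also be sanity-checked directly on the wrapped model. A minimal sketch, with the base model and LoRA settings as placeholders rather than the ones used for the table:

```python
# Sanity-check the "% of parameters" figures by counting trainable parameters.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # placeholder model
model = get_peft_model(base, LoraConfig(r=16, target_modules=["q_proj", "v_proj"]))

# Built-in helper: prints trainable params, total params, and the trainable %.
model.print_trainable_parameters()

# Or compute the percentage explicitly:
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {100 * trainable / total:.4f}% of all parameters")
```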


## Choosing the Right Method

When selecting a PEFT method, consider the following factors:
Member


Let's remove this section if we don't have evidence corroborating the claims.


## Implementation

### Basic Usage
Member


This is still open.

Comment on lines +162 to +164
bottleneck_size=32,
bottleneck_alpha=2.0,
bottleneck_dropout=0.1,
Member


Again, these arguments appear to be hallucinated.
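
For reference, a minimal sketch of how Bone is configured in PEFT, assuming a `BoneConfig` with a block size `r` and the usual `target_modules` (field names and values below should be verified against the BoneConfig reference docs); the bottleneck_* arguments do not appear there.

```python
# Sketch only: BoneConfig takes a block size `r` (which typically must divide the
# target weight dimensions) plus `target_modules`; bottleneck_* arguments do not exist.
from transformers import AutoModelForCausalLM
from peft import BoneConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # example model

config = BoneConfig(
    r=64,  # block size; assumed illustrative value
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```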

@ved1beta ved1beta mentioned this pull request Apr 30, 2025
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@githubnemo
Collaborator

not stale

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@github-actions github-actions bot closed this Jun 25, 2025
@githubnemo githubnemo reopened this Jun 26, 2025
@github-actions github-actions bot closed this Jul 4, 2025
cyyever pushed a commit to cyyever/peft that referenced this pull request Sep 4, 2025
* Update community_tutorials.md

* Update community_tutorials.md