
0.18.0: RoAd, ALoRA, Arrow, WaveFT, DeLoRA, OSF, and more


Released by @BenjaminBossan on 13 Nov (commit 77daa8d)

Highlights



New Methods

RoAd

@ppetrushkov added RoAd: 2D Rotary Adaptation to PEFT in #2678. RoAd learns 2D rotation matrices that are applied using only element-wise multiplication, which promises very fast inference with adapters in the unmerged state.

Remarkably, besides LoRA, RoAd is the only PEFT method that supports mixed adapter batches. This means that when you have loaded a model with multiple RoAd adapters, you can use all of them for different samples in the same batch, which is much more efficient than switching adapters between batches:

model = PeftModel.from_pretrained(base_model, <path-to-road-adapter-A>, adapter_name="adapter-A")
model.load_adapter(<path-to-road-adapter-B>, adapter_name="adapter-B")  # load a second RoAd adapter

inputs = ...  # input with 3 samples
# apply adapter A to sample 0, adapter B to sample 1, and use the base model for sample 2:
adapter_names = ["adapter-A", "adapter-B", "__base__"]
output_mixed = model(**inputs, adapter_names=adapter_names)
gen_mixed = model.generate(**inputs, adapter_names=adapter_names)

ALoRA

Activated LoRA is a technique added by @kgreenewald in #2609 for causal language models. It allows LoRA adapters to be selectively activated depending on a specific invocation token sequence in the input. This has the major benefit that most of the KV cache can be re-used during inference when the adapter is only needed to generate part of the response, after which the base model takes over again.
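
To give an idea of how this looks in practice, here is a minimal sketch. It assumes that aLoRA is configured through LoraConfig via an alora_invocation_tokens argument holding the token IDs of the invocation sequence; the model name and invocation string are placeholders, so check the PEFT docs for the exact API.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-3.2-1B-Instruct"  # placeholder
base_model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# token IDs of the sequence that activates the adapter (assumed mechanism)
invocation_tokens = tokenizer.encode("[/INST]", add_special_tokens=False)

config = LoraConfig(
    r=32,
    target_modules=["q_proj", "k_proj", "v_proj"],
    alora_invocation_tokens=invocation_tokens,  # assumed parameter name
)
model = get_peft_model(base_model, config)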

Arrow & GenKnowSub

@TheTahaaa contributed not only support for Arrow, a dynamic routing algorithm between multiple loaded LoRAs, in #2644, but also GenKnowSub, a technique built on top of Arrow in which the 'library' of LoRAs available to Arrow is first modified by subtracting general-knowledge adapters (e.g., adapters trained on subsets of Wikipedia) to enhance task-specific performance.
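
To illustrate the subtraction step of GenKnowSub, here is a conceptual sketch that operates directly on saved LoRA state dicts. This is not the API PEFT exposes for Arrow; all paths are placeholders, and it assumes that all adapters share the same keys, rank, and base model.

import torch
from safetensors.torch import load_file, save_file

# one task-specific LoRA and several general-knowledge LoRAs (placeholder paths)
task_sd = load_file("task_adapter/adapter_model.safetensors")
general_sds = [
    load_file("wiki_en_adapter/adapter_model.safetensors"),
    load_file("wiki_de_adapter/adapter_model.safetensors"),
]

# subtract the averaged general-knowledge weights from the task adapter, key by key
purified = {}
for key, weight in task_sd.items():
    general_mean = torch.stack([sd[key] for sd in general_sds]).mean(dim=0)
    purified[key] = weight - general_mean

save_file(purified, "task_adapter_purified/adapter_model.safetensors")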

WaveFT

Thanks to @Bilican, Wavelet Fine-Tuning (WaveFT) was added to PEFT in #2560. This method trains sparse updates in the wavelet domain of residual matrices, which makes it especially parameter-efficient. It is particularly interesting for image generation, as it promises to generate diverse outputs while preserving subject fidelity.
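
A minimal usage sketch, assuming a WaveFTConfig analogous to the other method configs; the class name, the n_frequency parameter (the number of trainable wavelet coefficients per layer), and the model are assumptions or placeholders, so check the PEFT docs for the actual API.

from transformers import AutoModelForCausalLM
from peft import WaveFTConfig, get_peft_model  # class name assumed

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder
# n_frequency (assumed name) controls how many wavelet-domain coefficients are trained per layer
config = WaveFTConfig(n_frequency=2592, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()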

DeLoRA

Decoupled Low-rank Adaptation (DeLoRA) was added by @mwbini in #2780. This new PEFT method is similar to DoRA insofar as it decouples the angle and magnitude of the learned adapter weights. However, DeLoRA implements this in a way that promises to better prevent divergence. Moreover, it constrains the deviation of the learned weight by imposing an upper limit on its norm, which can be adjusted via the delora_lambda parameter.
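
A minimal sketch of how this might be configured; delora_lambda is the norm cap mentioned above, while the DeloraConfig class name, the model, and the remaining arguments are assumptions or placeholders.

from transformers import AutoModelForCausalLM
from peft import DeloraConfig, get_peft_model  # class name assumed

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder
# delora_lambda caps the norm of the deviation from the base weights
config = DeloraConfig(r=16, delora_lambda=15, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()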

OSF

Orthogonal Subspace Fine-tuning (OSF) was added by @NikhilNayak-debug in #2685. By freezing the high-rank subspace of the targeted weight matrices and projecting gradient updates onto a low-rank subspace, OSF achieves good performance on continual learning tasks. While it is somewhat memory intensive for standard fine-tuning, it is definitely worth checking out on tasks where performance degradation on previously learned tasks is a concern.
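
As a rough sketch of how OSF could be applied in a continual learning setup (the OSFConfig class name and its defaults are assumptions, and the model is a placeholder):

from transformers import AutoModelForCausalLM
from peft import OSFConfig, get_peft_model  # class name assumed

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder
# each targeted weight is split into a frozen high-rank subspace and a trainable
# low-rank subspace into which gradient updates are projected
config = OSFConfig(target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, config)
# train on task 1, then task 2, ...; the frozen subspace protects earlier tasks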

Enhancements

Text generation benchmark

In #2525, @ved1beta added a text generation benchmark to PEFT. This is a framework for measuring and comparing text generation metrics, such as runtime and memory usage, across different PEFT methods. Right now, the benchmark still lacks experimental settings and a visualization analogous to what we have in the MetaMathQA benchmark. If this is something that interests you, we encourage you to let us know or, even better, contribute to this benchmark.

Reliable interface for integrations

PEFT has integrations with other libraries like Transformers and Diffusers. To facilitate these integrations, PEFT now provides a stable interface of functions that should be used where applicable. For example, the set_adapter function can be used to switch between PEFT adapters on a model, even if the model is not a PeftModel instance. We commit to keeping these functions backwards compatible, so it is safe for other libraries to build on top of them.
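
As a hypothetical example of what this enables (set_adapter is the function named above; the peft.functional namespace, the exact signature, and all model names and paths are assumptions or placeholders):

from transformers import AutoModelForCausalLM
from peft.functional import set_adapter  # namespace assumed

# a plain transformers model (not a PeftModel) with two LoRA adapters loaded
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder
model.load_adapter("path/to/adapter-A", adapter_name="adapter-A")
model.load_adapter("path/to/adapter-B", adapter_name="adapter-B")

# switch the active adapter via PEFT's stable functional interface
set_adapter(model, adapter_name="adapter-B")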

Handling of weight tying

Some Transformers models can have tied weights. This is especially prevalent when it comes to the embedding and the LM head. Currently, the way that this is handled in PEFT is not obvious. We thus drafted an issue to illustrate the intended behavior in #2864. This shows what our goal is, although not everything is implemented yet.

In #2803, @romitjain added the ensure_weight_tying argument to LoraConfig. This argument, if set to True, enforces weight tying for the modules targeted with modules_to_save. Thus, if the embedding and LM head are tied, they will share weights, which is important to allow, for instance, weight merging. Therefore, for most users, we recommend enabling this setting if they want to fully fine-tune the embedding and LM head. For backwards compatibility, however, the setting is off by default.

Note that in accordance with #2864, the functionality of ensure_weight_tying=True will be expanded to also include trainable tokens (#2870) and LoRA (tbd.) in the future.
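
For example, to fully fine-tune a tied embedding and LM head alongside LoRA while keeping them tied (the module names below are the typical ones for causal LMs and may differ for your model):

from peft import LoraConfig

config = LoraConfig(
    target_modules=["q_proj", "v_proj"],
    modules_to_save=["embed_tokens", "lm_head"],  # typical names, model-dependent
    ensure_weight_tying=True,  # off by default for backwards compatibility
)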

Support Conv1d and 1x1 Conv2d layers in LoHa and LoKr

@grewalsk extended LoHa and LoKr to support nn.Conv1d layers, as well as nn.Conv2d with 1x1 kernels, in #2515.
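
A small sketch with a toy model (layer names and shapes are made up for illustration; the same works for LoKr via LoKrConfig):

import torch.nn as nn
from peft import LoHaConfig, get_peft_model

class ToyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.temporal = nn.Conv1d(16, 32, kernel_size=3, padding=1)  # Conv1d now supported
        self.pointwise = nn.Conv2d(32, 64, kernel_size=1)            # 1x1 Conv2d now supported

    def forward(self, x):
        h = self.temporal(x)   # (batch, 32, length)
        h = h.unsqueeze(-1)    # add a spatial dim for the 1x1 Conv2d
        return self.pointwise(h)

config = LoHaConfig(r=4, target_modules=["temporal", "pointwise"])
model = get_peft_model(ToyNet(), config)
model.print_trainable_parameters()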

New prompt tuning initialization

Thanks to @macmacmacmac, we now have a new initialization option for prompt tuning: random discrete initialization (#2815). This option should generally work better than the default random initialization, as corroborated by our PEFT method comparison suite. Give it a try if you use prompt tuning.
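
A sketch of how this might be selected; the exact name of the new PromptTuningInit option is an assumption, and the model is a placeholder:

from transformers import AutoModelForCausalLM
from peft import PromptTuningConfig, PromptTuningInit, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder
config = PromptTuningConfig(
    task_type="CAUSAL_LM",
    num_virtual_tokens=20,
    # initialize virtual tokens from randomly sampled vocabulary embeddings (option name assumed)
    prompt_tuning_init=PromptTuningInit.RANDOM_DISCRETE,
)
model = get_peft_model(base_model, config)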

Combining LoRA adapters with negative weights

If you use multiple LoRA adapters, you can merge them into a single adapter using model.add_weighted_adapter. However, so far, this only worked with positive weights per adapter. Thanks to @sambhavnoobcoder and @valteu, it is now possible to pass negative weights too.
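
For example, given a model with two LoRA adapters already loaded (the adapter names here are placeholders):

# merge two LoRA adapters into a new one; negative weights subtract an adapter's contribution
model.add_weighted_adapter(
    adapters=["style-A", "style-B"],
    weights=[1.0, -0.5],
    adapter_name="merged",
    combination_type="linear",
)
model.set_adapter("merged")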

Changes

Transformers compatibility

At the time of writing, the Transformers v5 release is imminent. This Transformers version will be incompatible with PEFT < 0.18.0. If you plan to use Transformers v5 with PEFT, please upgrade PEFT to 0.18.0+.

Python version

This PEFT version no longer supports Python 3.9, which has reached its end of life. Please use Python 3.10+.

Updates to OFT

The OFT method has been updated in #2805 to make it slightly faster and to stabilize its numerics. This means, however, that existing checkpoints may give slightly different results after upgrading to PEFT 0.18.0. Therefore, if you use OFT, we recommend retraining the adapter.

All Changes

New Contributors

Full Changelog: v0.17.1...v0.18.0