
Conversation

@BenjaminBossan
Member

Resolves failing CI with transformers source install.

The key_cache attribute on DynamicCache is deprecated and will be removed in the 4.56.0 transformers release. The deprecation message mentions that the layers attribute should be used instead, which is what this PR does.

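For reference, a minimal sketch of the attribute change described above (assuming a transformers version where DynamicCache already exposes layers; older releases only have key_cache/value_cache):

    from transformers import DynamicCache

    cache = DynamicCache()

    # deprecated, removed in the 4.56.0 transformers release:
    # num_layers = len(cache.key_cache)

    # replacement used by this PR:
    num_layers = len(cache.layers)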
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

past_key_values.cross_attention_cache = DynamicCache()
past_key_values.is_updated = {
-    layer_idx: False for layer_idx in range(len(past_key_values.cross_attention_cache.key_cache))
+    layer_idx: False for layer_idx in range(len(past_key_values.cross_attention_cache.layers))
}
Collaborator
Won't we need to check both attributes for a certain time? AFAICS the old DynamicCache instance doesn't have a layers attribute.
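A version-tolerant check could look like the sketch below (hypothetical helper, for illustration only; the discussion that follows ends up avoiding the question entirely by iterating over is_updated instead):

    from transformers import DynamicCache

    def num_cached_layers(cache: DynamicCache) -> int:
        # Prefer the new `layers` attribute, fall back to the deprecated
        # `key_cache` list on older transformers releases.
        if hasattr(cache, "layers"):
            return len(cache.layers)
        return len(cache.key_cache)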

Member Author

Yes, true, but while looking at this, I found another issue. Namely, we first initialize past_key_values.cross_attention_cache = DynamicCache() and then iterate over its key_cache / layers, but of course those are going to be empty. I confirmed this when running the tests. So, technically, this line is a no-op.

Now I wonder: can it be removed or should we save the is_updated from before we override cross_attention_cache with the empty DynamicCache()? When checking the attribute before, it is not empty but can have values like {0: True, 1: True}.

@zucchini-nlp I think something was messed up in #2096 to end up in this situation.

Member

We had a major refactor of the cache recently, and the new format probably isn't 100% backward compatible. Can you remind me what the desired behavior is for prefix tuning on encoder-decoder models? Do we append virtual tokens to the decoder inputs only, or to the encoder ids as well?

Member Author

@BenjaminBossan BenjaminBossan Aug 19, 2025

I just checked the prefix tuning paper, the idea is to have separate prefix tokens (or rather, embeddings) for both encoder and decoder.
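Roughly, the paper-level idea looks like this (purely illustrative shapes and names, not PEFT's actual implementation):

    import torch

    # Illustrative only: one learned (key, value) prefix per layer, with
    # separate prefixes for the encoder and the decoder.
    num_layers, num_heads, num_virtual_tokens, head_dim = 12, 12, 20, 64

    def make_prefix(batch_size):
        shape = (batch_size, num_heads, num_virtual_tokens, head_dim)
        return [(torch.randn(shape), torch.randn(shape)) for _ in range(num_layers)]

    encoder_prefix = make_prefix(batch_size=2)  # prepended in encoder self-attention
    decoder_prefix = make_prefix(batch_size=2)  # prepended in decoder self-attention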

Member

Ahh, I see. Overwriting cross_attention_cache actually won't change the values of is_updated AFAIK, and thus we still have to set them to False in order to save the cross-attention cache.

Maybe we can just iterate over cache.is_updated and change it in place, so we don't have to infer the number of layers from cache.layers, wdyt?

Member Author

Ah, good suggestion. Just so I'm sure I understand correctly, currently we have:

                past_key_values = EncoderDecoderCache.from_legacy_cache(past_key_values)
                past_key_values.cross_attention_cache = DynamicCache()
                past_key_values.is_updated = {
                    layer_idx: False for layer_idx in range(len(past_key_values.cross_attention_cache.layers))
                }

which makes no sense because past_key_values.is_updated will always end up being an empty dict. Instead, we should do this, right?

                past_key_values = EncoderDecoderCache.from_legacy_cache(past_key_values)
                past_key_values.cross_attention_cache = DynamicCache()
                for key in past_key_values.is_updated.keys():
                    past_key_values.is_updated[key] = False

And this should also work with older transformers versions.

Member

Yep, an empty dict would throw errors, so we need an explicit False.

@BenjaminBossan
Member Author

@githubnemo Ready for another review.

Collaborator

@githubnemo githubnemo left a comment

LGTM otherwise

@BenjaminBossan
Member Author

@githubnemo Could you please re-review?

@BenjaminBossan BenjaminBossan merged commit 2d9b22f into huggingface:main Aug 26, 2025
26 of 27 checks passed
@BenjaminBossan BenjaminBossan deleted the fix-dynamiccache-key-cache-deprecation branch August 26, 2025 08:37
BenjaminBossan added a commit to BenjaminBossan/peft that referenced this pull request Aug 27, 2025
In huggingface#2737, we fixed some code that relied on the deprecated attribute, but some of it was missed, as it only runs on the nightly CI with multiple GPUs. This PR fixes that.

Note that the original transformers code that this solution was based on
no longer exists, as transformers now initializes the cache lazily, so
pre-allocating the keys and values to the correct device is not
necessary. But since prefix tuning inserts "virtual" keys/values, we
still have to ensure the correct device in PEFT.

I have tested the failing tests locally and they pass.
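A minimal sketch of the device handling mentioned here (illustrative names, not PEFT's actual code): the virtual key/value states inserted by prefix tuning need to live on the same device as the corresponding model layer, e.g. in a multi-GPU setup.

    import torch

    def move_virtual_kv_to_device(virtual_kv, device):
        # Illustrative helper: place each layer's virtual (key, value) pair
        # on the given device before merging it into the cache.
        return [(k.to(device), v.to(device)) for k, v in virtual_kv]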
BenjaminBossan added a commit that referenced this pull request Sep 4, 2025