System Info
- `transformers` version: 4.46.0.dev0
- Platform: Linux-5.4.0-148-generic-x86_64-with-glibc2.31
- Python version: 3.9.19
- Huggingface_hub version: 0.24.0
- Safetensors version: 0.4.3
- Accelerate version: 0.33.0
- Accelerate config: not found
- PyTorch version (GPU?): 2.4.1+cu121 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?:
- Using GPU in script?:
- GPU type: NVIDIA RTX A6000
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder
- My own task or dataset (give details below)
Reproduction
I ran into the problem with the following code:
```python
import peft
from transformers import IdeficsForVisionText2Text, AutoProcessor
import sys
import torch

sys.path.insert(0, "..")
import config  # my own file

device = torch.device("cuda:3")
model = IdeficsForVisionText2Text.from_pretrained(
    config.idefics_9b_path, torch_dtype=torch.float16
).to(device)
processor = AutoProcessor.from_pretrained(
    config.idefics_9b_path, torch_dtype=torch.float16
)
model = peft.get_peft_model(
    model,
    peft.PrefixTuningConfig(
        peft_type="PREFIX_TUNING",
        task_type="CAUSAL_LM",
        num_virtual_tokens=2,
        token_dim=4096,
        num_transformer_submodules=1,
        num_attention_heads=32,
        num_layers=32,
        encoder_hidden_size=768,
    ),
    mixed=False,
)
inputs = processor(["hello"]).to(device)
model.eval()
model.generate(**inputs)
```

When I add `print(past_key_values)` on the transformers side, I get an empty `DynamicCache()`, which means the virtual tokens were not injected into the forward pass.
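To narrow this down, here is a hedged diagnostic sketch (not part of the original repro): PEFT's `PeftModel.get_prompt()` is the hook that builds the prefix key/value cache for prompt-learning adapters, so inspecting its output separates "the adapter never produces virtual tokens" from "generate() drops them when assembling its `DynamicCache`". The exact return type varies across PEFT versions, so the shape comments below are assumptions.

```python
# Diagnostic sketch: ask the PEFT wrapper for the prefix cache directly,
# bypassing generate(). For PREFIX_TUNING, get_prompt() builds the
# past key/values that should cover num_virtual_tokens positions.
with torch.no_grad():
    prefix_kv = model.get_prompt(batch_size=1)

print(type(prefix_kv))
# In older PEFT releases this is a per-layer tuple of (key, value) tensors
# whose sequence dimension equals num_virtual_tokens (2 here); newer
# releases may return a Cache object instead (version-dependent, assumed).
# If this output looks right, the adapter itself works and the bug is in
# how generate() merges these entries into the cache passed to forward.
```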
Expected behavior
The forward pass should receive a cache that already contains `num_virtual_tokens` entries (the prefix virtual tokens), not an empty `DynamicCache()`.
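Concretely, a minimal sketch of the check I expect to pass, assuming transformers' `Cache.get_seq_length()` accessor (`past_key_values` here is the cache printed inside the model's forward):

```python
# Expected: by the time forward runs, the cache already holds the prefix.
num_virtual_tokens = 2  # matches the PrefixTuningConfig above
assert past_key_values.get_seq_length() >= num_virtual_tokens
```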