System Info
- `transformers` version: 4.46.0.dev0
- Platform: Linux-5.4.0-148-generic-x86_64-with-glibc2.31
- Python version: 3.9.19
- Huggingface_hub version: 0.24.0
- Safetensors version: 0.4.3
- Accelerate version: 0.33.0
- Accelerate config: not found
- PyTorch version (GPU?): 2.4.1+cu121 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?:
- Using GPU in script?:
- GPU type: NVIDIA RTX A6000
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder
- My own task or dataset (give details below)
Reproduction
I ran into the problem with the following code:
```python
import peft
from transformers import IdeficsForVisionText2Text, AutoProcessor
import sys
import torch

sys.path.insert(0, "..")
import config  # my own file

device = torch.device("cuda:3")
model = IdeficsForVisionText2Text.from_pretrained(
    config.idefics_9b_path, torch_dtype=torch.float16
).to(device)
processor = AutoProcessor.from_pretrained(
    config.idefics_9b_path, torch_dtype=torch.float16
)
model = peft.get_peft_model(
    model,
    peft.PrefixTuningConfig(
        peft_type="PREFIX_TUNING",
        task_type="CAUSAL_LM",
        num_virtual_tokens=2,
        token_dim=4096,
        num_transformer_submodules=1,
        num_attention_heads=32,
        num_layers=32,
        encoder_hidden_size=768,
    ),
    mixed=False,
)
inputs = processor(["hello"]).to(device)
model.eval()
model.generate(**inputs)
```

When I add `print(past_key_values)` on the transformers side, I get an empty `DynamicCache()`, which means the virtual tokens were not injected into the forward pass.
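To narrow this down, here is a hedged diagnostic sketch (not part of the original repro): PEFT's `PeftModel.get_prompt()` is the hook that builds the prefix key/value cache for prompt-learning adapters, so inspecting its output separates "the adapter never produces virtual tokens" from "generate() drops them when assembling its `DynamicCache`". The exact return type varies across PEFT versions, so the shape comments below are assumptions.

```python
# Diagnostic sketch: ask the PEFT wrapper for the prefix cache directly,
# bypassing generate(). For PREFIX_TUNING, get_prompt() builds the
# past key/values that should cover num_virtual_tokens positions.
with torch.no_grad():
    prefix_kv = model.get_prompt(batch_size=1)

print(type(prefix_kv))
# In older PEFT releases this is a per-layer tuple of (key, value) tensors
# whose sequence dimension equals num_virtual_tokens (2 here); newer
# releases may return a Cache object instead (version-dependent, assumed).
# If this output looks right, the adapter itself works and the bug is in
# how generate() merges these entries into the cache passed to forward.
```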
Expected behavior
The forward pass should receive a cache that already contains `num_virtual_tokens` entries (the prefix virtual tokens), not an empty `DynamicCache()`.
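Concretely, a minimal sketch of the check I expect to pass, assuming transformers' `Cache.get_seq_length()` accessor (`past_key_values` here is the cache printed inside the model's forward):

```python
# Expected: by the time forward runs, the cache already holds the prefix.
num_virtual_tokens = 2  # matches the PrefixTuningConfig above
assert past_key_values.get_seq_length() >= num_virtual_tokens
```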