
Conversation

@BenjaminBossan
Member

Resolves #2888

This is a test for a hypothetical exploit that would enable trust_remote_code (and thus RCE) when a user loads a malicious prompt tuning model. The issue is that PEFT would simply pass on the tokenizer_kwargs defined in the prompt tuning config unsanitized, which means that if the tokenizer is also malicious, the malicious code would be executed.
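The fix can be sketched roughly as follows (a minimal illustration, not the actual PEFT implementation; function and key names here are hypothetical): the tokenizer_kwargs from the adapter config are checked before being forwarded to the tokenizer loader, and any attempt to smuggle in trust_remote_code is rejected.

```python
# Hypothetical sketch of the sanitization fix. The real PEFT code
# paths differ; this only illustrates the idea of rejecting
# attacker-controlled kwargs before they reach the tokenizer loader.

def sanitize_tokenizer_kwargs(tokenizer_kwargs):
    """Reject kwargs that could enable remote code execution."""
    if "trust_remote_code" in tokenizer_kwargs:
        raise ValueError(
            "tokenizer_kwargs must not contain 'trust_remote_code'; "
            "honoring it from an adapter config would allow arbitrary "
            "code execution when the tokenizer is loaded."
        )
    return tokenizer_kwargs

# Benign kwargs pass through unchanged.
safe = sanitize_tokenizer_kwargs({"use_fast": True})

# Kwargs from a malicious prompt tuning config are rejected.
try:
    sanitize_tokenizer_kwargs({"trust_remote_code": True})
    blocked = False
except ValueError:
    blocked = True
```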

For this exploit to work, the user cannot load the model via PeftModel.from_pretrained as normal, because the tokenizer is only loaded in training mode. Although the attacker could set inference_mode=True in the adapter_config.json, that would still not work, because prompt tuning methods cannot be loaded in inference mode. Therefore, the only way for the exploit to work would be for the user to load the model manually.
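For illustration, a malicious adapter_config.json might look roughly like the dictionary below (hypothetical contents; the exact config fields and repo name are assumptions, not taken from a real attack): the attacker embeds trust_remote_code=True in tokenizer_kwargs and points tokenizer_name_or_path at a repo shipping malicious tokenizer code, relying on the kwargs being forwarded verbatim.

```python
import json

# Hypothetical contents of a malicious adapter_config.json for a
# prompt tuning adapter. The attacker smuggles trust_remote_code=True
# into tokenizer_kwargs, hoping PEFT forwards it to the tokenizer
# loader. All values below are illustrative.
malicious_config = {
    "peft_type": "PROMPT_TUNING",
    "inference_mode": False,  # tokenizer is only loaded in training mode
    "tokenizer_name_or_path": "attacker/malicious-tokenizer",
    "tokenizer_kwargs": {"trust_remote_code": True},
}

# This is what would be written to adapter_config.json on the Hub.
serialized = json.dumps(malicious_config, indent=2)
```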

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

- moved the new test to another file to group similar tests
- fixed an old test that was failing due to the change
Collaborator

@githubnemo githubnemo left a comment


LGTM!

@BenjaminBossan BenjaminBossan merged commit ff00848 into huggingface:main Nov 5, 2025
24 of 25 checks passed
@BenjaminBossan BenjaminBossan deleted the fix-exploit-trust-remote-code-prompt-tuning branch November 5, 2025 15:25


Development

Successfully merging this pull request may close these issues.

Potential remote code execution via untrusted tokenizer_kwargs in PromptEmbedding
