
Conversation

@BenjaminBossan
Member

Resolves #2603

Trainable tokens error out when using DeepSpeed ZeRO-3 (Z3) because the embedding weights are not available on all ranks. This fix handles the problem efficiently: it gathers the weights on a single rank, initializes them there, and then broadcasts only the affected slice to the other ranks.
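The gather-initialize-broadcast-slice idea can be illustrated with a minimal single-process sketch. This is a simulation, not the actual PEFT/DeepSpeed code: under ZeRO-3 the full embedding only materializes inside a parameter gather (e.g. `deepspeed.zero.GatheredParameters`), and the real broadcast goes through `torch.distributed`. All names here (`gather_full_weight`, `per_rank_slices`, the zero-initialization) are illustrative assumptions.

```python
import random

# Simulated setup: a small vocabulary and a few "ranks".
WORLD_SIZE = 4
VOCAB, DIM = 10, 3
token_indices = [7, 9]  # rows that become trainable tokens

def gather_full_weight(seed=0):
    """Stand-in for gathering the sharded embedding on rank 0.

    In real ZeRO-3 this full matrix only exists inside the gather
    context on the rank that performs the initialization.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(VOCAB)]

# Rank 0: gather the full weight, then initialize only the affected rows.
full_weight = gather_full_weight()
init_slice = {i: [0.0] * DIM for i in token_indices}  # illustrative zero-init

# "Broadcast" only the initialized slice: every rank receives just these
# rows instead of the full VOCAB x DIM matrix.
per_rank_slices = [dict(init_slice) for _ in range(WORLD_SIZE)]

# The communication volume is |token_indices| x DIM, not VOCAB x DIM.
slice_numel = len(token_indices) * DIM
full_numel = VOCAB * DIM
```

The point of broadcasting only the slice is that the payload scales with the number of trainable tokens rather than with the full vocabulary, which is what makes the approach efficient for large embedding matrices.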

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan BenjaminBossan marked this pull request as draft June 24, 2025 08:41
Collaborator

@githubnemo githubnemo left a comment


Thanks for taking this on. The changes so far look reasonable, but I see that there are still some open points in #2603.

@BenjaminBossan
Member Author

@githubnemo The new solution works for me locally (at first I had repro issues because of an incorrect accelerate config) and the user also confirmed that their original issue was solved. Please review again.

@BenjaminBossan BenjaminBossan marked this pull request as ready for review June 25, 2025 09:23
@BenjaminBossan BenjaminBossan merged commit 5af0cbe into huggingface:main Jun 26, 2025
25 of 40 checks passed
@BenjaminBossan BenjaminBossan deleted the fix-trainable-tokens-deepspeed-compatibility branch June 26, 2025 14:48
efraimdahl pushed a commit to efraimdahl/peft that referenced this pull request Jul 12, 2025


Development

Successfully merging this pull request may close these issues.

lora with trainable_token_indices do NOT support zero3?

3 participants