[Feature] Common kwargs for generate across vLLM and Transformers #3107
Summary: Standardized Generation Parameters for vLLM and Transformers Wrappers
Overview
This PR introduces standardized generation parameters across vLLM and Transformers wrappers, enabling cross-backend compatibility while maintaining full backward compatibility.
Key Changes
1. Standardized Parameter Names
max_new_tokens, num_return_sequences, temperature, top_p, top_k, repetition_penalty, do_sample, num_beams, length_penalty, early_stopping, stop_sequences, skip_special_tokens, logprobs
2. Legacy Parameter Support
max_tokens → max_new_tokens
n → num_return_sequences
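A minimal sketch of how the legacy aliases might be folded into the standardized names. The function name `standardize_generate_kwargs` and the `LEGACY_ALIASES` table are hypothetical and are not taken from the PR's code. The sketch assumes that when both spellings are passed, the legacy value wins, matching the `max_tokens=20` + `max_new_tokens=10` case in the description, and that the result is stored under the standardized name.

```python
# Hypothetical alias table: the PR description names only these two legacy parameters.
LEGACY_ALIASES = {
    "max_tokens": "max_new_tokens",
    "n": "num_return_sequences",
}


def standardize_generate_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs that uses only the standardized names.

    Assumption: if both a legacy name and its standardized name are given,
    the legacy value takes precedence.
    """
    # Start from every key that is not a legacy alias.
    out = {k: v for k, v in kwargs.items() if k not in LEGACY_ALIASES}
    # Fold legacy keys in afterwards so that they overwrite on conflict.
    for legacy, standard in LEGACY_ALIASES.items():
        if legacy in kwargs:
            out[standard] = kwargs[legacy]
    return out
```

Writing the legacy keys last keeps the precedence rule in one place, so existing callers that pass `max_tokens` or `n` see unchanged behavior.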
3. Parameter Conflict Resolution
max_tokens=20 + max_new_tokens=10 → max_tokens=20 wins
4. Backend-Specific Mappings
vLLM (SamplingParams format):
max_new_tokens → max_tokens
num_return_sequences → n
num_beams → best_of
do_sample=False → temperature=0.0 (greedy decoding)
Transformers:
logprobs → output_scores
length_penalty, early_stopping
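The vLLM-side mappings listed above can be sketched as a plain dictionary translation. The function name `to_vllm_sampling_kwargs` is hypothetical, and the sketch returns a dict instead of constructing a `SamplingParams` object so that it runs without vLLM installed. Only the renames named in the description are handled; passing other keys through unchanged is an assumption, and the actual wrapper may filter or translate them differently.

```python
# Renames stated in the PR description for the vLLM backend.
VLLM_RENAMES = {
    "max_new_tokens": "max_tokens",
    "num_return_sequences": "n",
    "num_beams": "best_of",
}


def to_vllm_sampling_kwargs(std_kwargs: dict) -> dict:
    """Translate standardized generation kwargs into vLLM-style names.

    Assumption: keys without a listed rename are passed through as-is.
    """
    out = {}
    for key, value in std_kwargs.items():
        if key == "do_sample":
            # Handled below: do_sample is expressed through temperature.
            continue
        out[VLLM_RENAMES.get(key, key)] = value
    # do_sample=False is expressed as greedy decoding via temperature=0.0.
    if std_kwargs.get("do_sample") is False:
        out["temperature"] = 0.0
    return out
```

For example, `to_vllm_sampling_kwargs({"max_new_tokens": 10, "do_sample": False})` yields `{"max_tokens": 10, "temperature": 0.0}`.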
5. Comprehensive Documentation
6. Test Coverage
Files Modified
Core Implementation
torchrl/modules/llm/policies/common.py: Added standardization logic and common parameter definitions
torchrl/modules/llm/policies/vllm_wrapper.py: Integrated standardization with vLLM-specific mappings
torchrl/modules/llm/policies/transformers_wrapper.py: Integrated standardization with Transformers-specific mappings
Documentation
docs/standardized_generation_parameters.md: Comprehensive guide with examples and best practices
Tests
test/llm/test_wrapper.py: Added test cases for all new functionality
Benefits
Usage Examples
Breaking Changes
None - This is a purely additive change that maintains full backward compatibility.
Testing
All existing tests pass, and new tests verify the standardization functionality works correctly across both backends.