[Feature] Common kwargs for generate across vLLM and Transformers #3107

vmoens · 2025-08-01T10:59:08Z

Summary: Standardized Generation Parameters for vLLM and Transformers Wrappers

Overview

This PR introduces standardized generation parameters across vLLM and Transformers wrappers, enabling cross-backend compatibility while maintaining full backward compatibility.

Key Changes

1. Standardized Parameter Names

Introduced common parameter names that work across both vLLM and Transformers wrappers
Core parameters: max_new_tokens, num_return_sequences, temperature, top_p, top_k, repetition_penalty, do_sample, num_beams, length_penalty, early_stopping, stop_sequences, skip_special_tokens, logprobs
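
As an illustration only, the standardized names listed above can be written down as a typed dictionary. The class name `StandardGenerateKwargs`, the helper `unknown_keys`, and every value type below are assumptions made for this sketch; the PR summary gives only the parameter names.

```python
from typing import List, TypedDict


class StandardGenerateKwargs(TypedDict, total=False):
    """Hypothetical typed view of the standardized names (types are assumed)."""

    max_new_tokens: int
    num_return_sequences: int
    temperature: float
    top_p: float
    top_k: int
    repetition_penalty: float
    do_sample: bool
    num_beams: int
    length_penalty: float
    early_stopping: bool
    stop_sequences: List[str]
    skip_special_tokens: bool
    logprobs: bool


# The set of standardized names, derived from the annotations above.
STANDARD_NAMES = frozenset(StandardGenerateKwargs.__annotations__)


def unknown_keys(kwargs: dict) -> set:
    """Return the keys of `kwargs` that are not standardized names."""
    return set(kwargs) - STANDARD_NAMES
```

A helper like `unknown_keys` would flag legacy names such as `max_tokens`, which is one way to spot code that has not yet moved to the standardized names.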

2. Legacy Parameter Support

Automatic conversion of legacy parameter names:
- max_tokens → max_new_tokens
- n → num_return_sequences
Ensures existing code continues to work without modification

3. Parameter Conflict Resolution

Silent override behavior: When both legacy and standardized names are provided, legacy names prevail
Example: max_tokens=20 + max_new_tokens=10 → max_tokens=20 wins
This ensures backward compatibility with existing code
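
The legacy conversion and the legacy-wins rule described above can be sketched in a few lines. This is a minimal illustration, not the code in `common.py`: the names `LEGACY_TO_STANDARD` and `standardize_generate_kwargs` are hypothetical, and only the two aliases named in this summary are included.

```python
# Hypothetical alias table: legacy name -> standardized name (from the summary).
LEGACY_TO_STANDARD = {
    "max_tokens": "max_new_tokens",
    "n": "num_return_sequences",
}


def standardize_generate_kwargs(kwargs: dict) -> dict:
    """Return a copy of `kwargs` with legacy names renamed.

    If both a legacy name and its standardized counterpart are present,
    the legacy value silently overrides the standardized one.
    """
    out = dict(kwargs)  # never mutate the caller's dictionary
    for legacy, standard in LEGACY_TO_STANDARD.items():
        if legacy in out:
            # Remove the legacy key and overwrite any standardized value.
            out[standard] = out.pop(legacy)
    return out


print(standardize_generate_kwargs({"max_tokens": 20, "max_new_tokens": 10}))
# {'max_new_tokens': 20}
```

Because the override is silent, a caller who passes both names gets no warning that `max_new_tokens=10` was discarded.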

4. Backend-Specific Mappings

vLLM: Maps standardized names to vLLM's SamplingParams format
- max_new_tokens → max_tokens
- num_return_sequences → n
- num_beams → best_of
- do_sample=False → temperature=0.0 (greedy decoding)
Transformers: Maps to Hugging Face's generation arguments
- logprobs → output_scores
- Filters unsupported parameters like length_penalty, early_stopping
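
The mappings listed above could look roughly like the following sketch. The function names `to_vllm_kwargs` and `to_transformers_kwargs` are hypothetical, and two behaviors are assumptions not stated in this summary: the `do_sample` key is dropped from the vLLM output, and parameters without a listed mapping pass through unchanged.

```python
# Renames taken from the summary: standardized name -> vLLM name.
VLLM_RENAMES = {
    "max_new_tokens": "max_tokens",
    "num_return_sequences": "n",
    "num_beams": "best_of",
}

# Parameters the summary says are filtered out for Transformers.
TRANSFORMERS_UNSUPPORTED = {"length_penalty", "early_stopping"}


def to_vllm_kwargs(std: dict) -> dict:
    """Translate standardized kwargs into vLLM-style sampling names."""
    # Assumption: `do_sample` itself is not forwarded to vLLM.
    out = {VLLM_RENAMES.get(k, k): v for k, v in std.items() if k != "do_sample"}
    if std.get("do_sample") is False:
        out["temperature"] = 0.0  # greedy decoding
    return out


def to_transformers_kwargs(std: dict) -> dict:
    """Translate standardized kwargs into Transformers-style names."""
    out = {}
    for key, value in std.items():
        if key in TRANSFORMERS_UNSUPPORTED:
            continue  # filtered, as described in the summary
        out["output_scores" if key == "logprobs" else key] = value
    return out


shared = {"max_new_tokens": 50, "num_return_sequences": 2, "do_sample": False}
print(to_vllm_kwargs(shared))
# {'max_tokens': 50, 'n': 2, 'temperature': 0.0}
```

With this shape, a single standardized dictionary feeds both backends, and each backend-specific function owns its own renames and filters.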

5. Comprehensive Documentation

Added detailed docstrings to all three classes explaining:
- Standardized parameter list
- Legacy parameter support
- Parameter conflict resolution behavior
- Backend-specific mappings
Created comprehensive documentation file with examples and best practices

6. Test Coverage

Added comprehensive test suite covering:
- Standardized parameter usage
- Legacy parameter conversion
- Parameter conflict resolution
- Cross-backend compatibility

Files Modified

Core Implementation

torchrl/modules/llm/policies/common.py: Added standardization logic and common parameter definitions
torchrl/modules/llm/policies/vllm_wrapper.py: Integrated standardization with vLLM-specific mappings
torchrl/modules/llm/policies/transformers_wrapper.py: Integrated standardization with Transformers-specific mappings

Documentation

docs/standardized_generation_parameters.md: Comprehensive guide with examples and best practices

Tests

test/llm/test_wrapper.py: Added test cases for all new functionality

Benefits

Cross-Backend Compatibility: Same parameter names work with both vLLM and Transformers
Backward Compatibility: Existing code continues to work without changes
Clear Documentation: Users understand exactly how parameters are handled
Future-Proof: Easy to add new standardized parameters
Consistent API: Unified interface across different LLM backends

Usage Examples

# ✅ Standardized parameters (recommended)
generate_kwargs = {
    "max_new_tokens": 50,
    "temperature": 0.7,
    "do_sample": True,
}

# ✅ Legacy parameters (backward compatible)
generate_kwargs = {
    "max_tokens": 50,
    "n": 1,
    "temperature": 0.7,
}

# ⚠️ Mixed usage (legacy wins)
generate_kwargs = {
    "max_tokens": 20,      # This wins
    "max_new_tokens": 10,  # This is ignored
}

Breaking Changes

None - This is a purely additive change that maintains full backward compatibility.

Testing

All existing tests pass, and new tests verify the standardization functionality works correctly across both backends.

pytorch-bot · 2025-08-01T10:59:12Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3107

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 2 Cancelled Jobs, 19 Pending, 2 Unrelated Failures

As of commit 8a3cb80 with merge base 7752561:

NEW FAILURES - The following jobs have failed:

Continuous Benchmark (PR) / GPU Pytest benchmark (gh)
Process completed with exit code 2.
Continuous Benchmark / CPU Pytest benchmark (gh)
Process completed with exit code 2.
Habitat Tests on Linux / tests (3.9, 12.8) / linux-job (gh)
RuntimeError: Command docker exec -t a826c426cf3471068301d5c677df4e579db4fa659dea42ae81d3b02e5e253276 /exec failed with exit code 1

CANCELLED JOBS - The following jobs were cancelled. Please retry:

Continuous Benchmark (PR) / CPU Pytest benchmark (gh)
##[error]The operation was canceled.
Continuous Benchmark / GPU Pytest benchmark (gh)
##[error]The operation was canceled.

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Libs Tests on Linux / unittests-gym (3.9, 12.8) / linux-job (gh) (trunk failure)
AttributeError: module 'torch.utils._pytree' has no attribute 'register_pytree_node'
Unit-tests on Linux / tests-olddeps (3.9, 11.6) / linux-job (gh) (trunk failure)
AttributeError: module 'torch.utils._pytree' has no attribute 'register_pytree_node'

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the CLA Signed label Aug 1, 2025

vmoens added the enhancement and llm/api labels Aug 1, 2025

vmoens force-pushed the more-func-wrappers branch from 19264cb to 7dd4dd6 August 1, 2025 11:13

[Feature] Common kwargs for generate across vLLM and Transformers

ea22405

vmoens force-pushed the more-func-wrappers branch from 7dd4dd6 to ea22405 August 1, 2025 11:14

amend

8a3cb80

vmoens merged commit 978424e into main Aug 1, 2025
80 of 93 checks passed

vmoens deleted the more-func-wrappers branch August 1, 2025 13:01
