Refactor AWS Bedrock Provider for Multi-modal Support & Correct Token Limits #3714
Pull Request Type

- Refactor
What is in this change?
This PR significantly refactors the AWSBedrockLLM provider (`server/utils/AiProviders/bedrock/index.js`) to:

- Add multi-modal (image) input support.
- Correct how the maximum output token limit is derived and applied.
Additional Information
The previous version of the Bedrock provider had two major issues:
1. **No multi-modal support:** The `#generateContent` function responsible for handling message attachments was unimplemented or commented out. This prevented users from sending images to multi-modal Bedrock models (such as Claude 3).
2. **Incorrect `maxTokens` usage:** The `promptWindowLimit()` function (which reads `AWS_BEDROCK_LLM_MODEL_TOKEN_LIMIT`) was used to set the `inferenceConfig.maxTokens` parameter in API calls. That value represents the total context window, which is often much larger than the maximum number of tokens a model can generate in a single output. This led to `ValidationException` errors whenever the configured context window exceeded the model's output limit (e.g., sending 131k as `maxTokens` when the model's output limit was 8192), as illustrated below.
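For illustration, the failure mode looks roughly like this (a reconstruction, not the actual diff; the 131k/8192 figures are the example values from above):

```js
// Reconstruction of the old behavior, not the actual code from the PR.
const contextWindow = Number(
  process.env.AWS_BEDROCK_LLM_MODEL_TOKEN_LIMIT ?? 131072
);

// Bug: the full context window was passed as the generation cap. If the
// model can only emit 8192 output tokens, Bedrock rejects the request
// with a ValidationException.
const inferenceConfig = { maxTokens: contextWindow };
```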
Detailed Changes:
Multi-modal Input Implementation (`#generateContent` & Helpers)

- Implemented `#generateContent`: converts message attachments into image content blocks sent alongside the text prompt.
- Added `getImageFormatFromMime` helper: maps an attachment's MIME type to a Bedrock-supported image format identifier.
- Added base64 data URI stripping: removes the `data:<mime>;base64,` prefix from attachment payloads before decoding.
- Added `base64ToUint8Array` helper: decodes the raw base64 payload into the byte array the Bedrock API expects.
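A minimal sketch of the attachment handling described above, assuming the Converse API's `{ image: { format, source: { bytes } } }` content-block shape; the helper names match the list, but the bodies and the attachment fields (`mime`, `contentString`) are assumptions, not the actual diff:

```js
const SUPPORTED_BEDROCK_IMAGE_FORMATS = ["png", "jpeg", "gif", "webp"];

// Map a MIME type (e.g. "image/jpeg") to a Bedrock image format id.
function getImageFormatFromMime(mime = "") {
  const format = mime.split("/")[1]?.toLowerCase();
  return SUPPORTED_BEDROCK_IMAGE_FORMATS.includes(format) ? format : null;
}

// Decode raw base64 (data-URI prefix already stripped) into bytes.
function base64ToUint8Array(base64) {
  return Uint8Array.from(Buffer.from(base64, "base64"));
}

// Build Converse-style content blocks from a user prompt + attachments.
function generateContent({ userPrompt, attachments = [] }) {
  const content = [{ text: userPrompt }];
  for (const attachment of attachments) {
    const format = getImageFormatFromMime(attachment.mime);
    if (!format) continue; // skip attachments Bedrock cannot accept
    // Strip the "data:image/png;base64," style prefix before decoding.
    const base64 = attachment.contentString.split("base64,").pop();
    content.push({
      image: { format, source: { bytes: base64ToUint8Array(base64) } },
    });
  }
  return content;
}
```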
Output Token Limit Correction (`getMaxOutputTokens` & API Calls)

- Introduced `DEFAULT_MAX_OUTPUT_TOKENS`: a safe fallback generation cap used when no explicit output limit is configured.
- Added `getMaxOutputTokens` method: resolves the per-request output token cap independently of the context window.
- Updated API calls (`getChatCompletion`, `streamGetChatCompletion`): `inferenceConfig.maxTokens` is now set from `getMaxOutputTokens()` instead of `promptWindowLimit()`.
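A sketch of how the corrected limit might be resolved; the fallback value and the `AWS_BEDROCK_LLM_MAX_OUTPUT_TOKENS` env var name are hypothetical:

```js
const DEFAULT_MAX_OUTPUT_TOKENS = 4096; // illustrative fallback value

// Resolve the generation cap independently of the context window that
// promptWindowLimit() reads. The env var name here is hypothetical.
function getMaxOutputTokens() {
  const limit = Number(process.env.AWS_BEDROCK_LLM_MAX_OUTPUT_TOKENS);
  return Number.isFinite(limit) && limit > 0
    ? limit
    : DEFAULT_MAX_OUTPUT_TOKENS;
}

// getChatCompletion / streamGetChatCompletion then pass the output cap,
// not the context window, to the API:
const inferenceConfig = { maxTokens: getMaxOutputTokens() };
```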
Prompt Construction (`constructPrompt`)
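A sketch of what prompt assembly for the Converse API could look like; Bedrock takes the system prompt separately from the message list, and the argument names and return shape here follow common provider conventions rather than the actual diff:

```js
// Assemble system prompt + history + latest user turn. The return shape
// ({ system, messages }) is an assumption for illustration.
function constructPrompt({
  systemPrompt = "",
  chatHistory = [],
  userPrompt = "",
  attachments = [],
}) {
  return {
    system: [{ text: systemPrompt }],
    messages: [
      ...chatHistory, // prior turns: [{ role, content: [{ text }] }]
      {
        role: "user",
        // Attachments would be converted to image blocks here via the
        // generateContent sketch above; shown as text-only for brevity.
        content: [{ text: userPrompt }],
      },
    ],
  };
}
```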
API Call Structure
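Assuming the provider talks to Bedrock's Converse API via `@aws-sdk/client-bedrock-runtime` (the `inferenceConfig.maxTokens` field above is Converse-specific), the call structure looks roughly like this:

```js
import {
  BedrockRuntimeClient,
  ConverseCommand,
  ConverseStreamCommand,
} from "@aws-sdk/client-bedrock-runtime";

const client = new BedrockRuntimeClient({ region: process.env.AWS_REGION });

// Non-streaming completion; ConverseStreamCommand accepts the same input
// shape for the streaming path (streamGetChatCompletion).
async function getChatCompletion({ modelId, system, messages, maxTokens }) {
  const response = await client.send(
    new ConverseCommand({
      modelId,
      system, // [{ text: "..." }]
      messages, // [{ role, content: [{ text } | { image }] }]
      inferenceConfig: { maxTokens },
    })
  );
  return response.output?.message?.content?.[0]?.text ?? null;
}
```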
Code Quality and Refinements
- Introduced constants for better maintainability (`SUPPORTED_BEDROCK_IMAGE_FORMATS`, `DEFAULT_MAX_OUTPUT_TOKENS`, `DEFAULT_CONTEXT_WINDOW_TOKENS`).
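For reference, collected in one place (the names are the PR's; the values here are placeholders, not taken from the diff):

```js
const SUPPORTED_BEDROCK_IMAGE_FORMATS = ["png", "jpeg", "gif", "webp"];
const DEFAULT_MAX_OUTPUT_TOKENS = 4096; // placeholder fallback generation cap
const DEFAULT_CONTEXT_WINDOW_TOKENS = 8192; // placeholder fallback context window
```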
Developer Validations
- [x] I ran `yarn lint` from the root of the repo & committed changes