Refactor AWS Bedrock Provider for Multi-modal Support & Correct Token Limits #3714
Pull Request Type

- Refactor
What is in this change?
This PR significantly refactors the AWSBedrockLLM provider (`server/utils/AiProviders/bedrock/index.js`) to:

- Add multi-modal (image) input support.
- Correct how the maximum output token limit is derived and applied.
Additional Information
The previous version of the Bedrock provider had two major issues:
1. **No multi-modal support:** The `#generateContent` function responsible for handling message attachments was unimplemented or commented out. This prevented users from sending images to multi-modal Bedrock models (such as Claude 3).
2. **Incorrect `maxTokens` usage:** The `promptWindowLimit()` function (which reads `AWS_BEDROCK_LLM_MODEL_TOKEN_LIMIT`) was used to set the `inferenceConfig.maxTokens` parameter in API calls. That value represents the total context window, which is often much larger than the maximum number of tokens a model can generate in a single output. This led to `ValidationException` errors whenever the configured context window exceeded the model's output limit (e.g., sending 131k as `maxTokens` when the model's output limit was 8192), as illustrated below.
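For illustration, the failure mode looks roughly like this (a reconstruction, not the actual diff; the 131k/8192 figures are the example values from above):

```js
// Reconstruction of the old behavior, not the actual code from the PR.
const contextWindow = Number(
  process.env.AWS_BEDROCK_LLM_MODEL_TOKEN_LIMIT ?? 131072
);

// Bug: the full context window was passed as the generation cap. If the
// model can only emit 8192 output tokens, Bedrock rejects the request
// with a ValidationException.
const inferenceConfig = { maxTokens: contextWindow };
```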
Detailed Changes:
Multi-modal Input Implementation (`#generateContent` & Helpers)

- Implemented `#generateContent`: converts message attachments into image content blocks sent alongside the text prompt.
- Added `getImageFormatFromMime` helper: maps an attachment's MIME type to a Bedrock-supported image format identifier.
- Added base64 data URI stripping: removes the `data:<mime>;base64,` prefix from attachment payloads before decoding.
- Added `base64ToUint8Array` helper: decodes the raw base64 payload into the byte array the Bedrock API expects.
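A minimal sketch of the attachment handling described above, assuming the Converse API's `{ image: { format, source: { bytes } } }` content-block shape; the helper names match the list, but the bodies and the attachment fields (`mime`, `contentString`) are assumptions, not the actual diff:

```js
const SUPPORTED_BEDROCK_IMAGE_FORMATS = ["png", "jpeg", "gif", "webp"];

// Map a MIME type (e.g. "image/jpeg") to a Bedrock image format id.
function getImageFormatFromMime(mime = "") {
  const format = mime.split("/")[1]?.toLowerCase();
  return SUPPORTED_BEDROCK_IMAGE_FORMATS.includes(format) ? format : null;
}

// Decode raw base64 (data-URI prefix already stripped) into bytes.
function base64ToUint8Array(base64) {
  return Uint8Array.from(Buffer.from(base64, "base64"));
}

// Build Converse-style content blocks from a user prompt + attachments.
function generateContent({ userPrompt, attachments = [] }) {
  const content = [{ text: userPrompt }];
  for (const attachment of attachments) {
    const format = getImageFormatFromMime(attachment.mime);
    if (!format) continue; // skip attachments Bedrock cannot accept
    // Strip the "data:image/png;base64," style prefix before decoding.
    const base64 = attachment.contentString.split("base64,").pop();
    content.push({
      image: { format, source: { bytes: base64ToUint8Array(base64) } },
    });
  }
  return content;
}
```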
Output Token Limit Correction (`getMaxOutputTokens` & API Calls)

- Introduced `DEFAULT_MAX_OUTPUT_TOKENS`: a safe fallback generation cap used when no explicit output limit is configured.
- Added `getMaxOutputTokens` method: resolves the per-request output token cap independently of the context window.
- Updated API calls (`getChatCompletion`, `streamGetChatCompletion`): `inferenceConfig.maxTokens` is now set from `getMaxOutputTokens()` instead of `promptWindowLimit()`.
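A sketch of how the corrected limit might be resolved; the fallback value and the `AWS_BEDROCK_LLM_MAX_OUTPUT_TOKENS` env var name are hypothetical:

```js
const DEFAULT_MAX_OUTPUT_TOKENS = 4096; // illustrative fallback value

// Resolve the generation cap independently of the context window that
// promptWindowLimit() reads. The env var name here is hypothetical.
function getMaxOutputTokens() {
  const limit = Number(process.env.AWS_BEDROCK_LLM_MAX_OUTPUT_TOKENS);
  return Number.isFinite(limit) && limit > 0
    ? limit
    : DEFAULT_MAX_OUTPUT_TOKENS;
}

// getChatCompletion / streamGetChatCompletion then pass the output cap,
// not the context window, to the API:
const inferenceConfig = { maxTokens: getMaxOutputTokens() };
```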
Prompt Construction (`constructPrompt`)
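A sketch of what prompt assembly for the Converse API could look like; Bedrock takes the system prompt separately from the message list, and the argument names and return shape here follow common provider conventions rather than the actual diff:

```js
// Assemble system prompt + history + latest user turn. The return shape
// ({ system, messages }) is an assumption for illustration.
function constructPrompt({
  systemPrompt = "",
  chatHistory = [],
  userPrompt = "",
  attachments = [],
}) {
  return {
    system: [{ text: systemPrompt }],
    messages: [
      ...chatHistory, // prior turns: [{ role, content: [{ text }] }]
      {
        role: "user",
        // Attachments would be converted to image blocks here via the
        // generateContent sketch above; shown as text-only for brevity.
        content: [{ text: userPrompt }],
      },
    ],
  };
}
```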
API Call Structure
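Assuming the provider talks to Bedrock's Converse API via `@aws-sdk/client-bedrock-runtime` (the `inferenceConfig.maxTokens` field above is Converse-specific), the call structure looks roughly like this:

```js
import {
  BedrockRuntimeClient,
  ConverseCommand,
  ConverseStreamCommand,
} from "@aws-sdk/client-bedrock-runtime";

const client = new BedrockRuntimeClient({ region: process.env.AWS_REGION });

// Non-streaming completion; ConverseStreamCommand accepts the same input
// shape for the streaming path (streamGetChatCompletion).
async function getChatCompletion({ modelId, system, messages, maxTokens }) {
  const response = await client.send(
    new ConverseCommand({
      modelId,
      system, // [{ text: "..." }]
      messages, // [{ role, content: [{ text } | { image }] }]
      inferenceConfig: { maxTokens },
    })
  );
  return response.output?.message?.content?.[0]?.text ?? null;
}
```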
Code Quality and Refinements
- Introduced constants for better maintainability (`SUPPORTED_BEDROCK_IMAGE_FORMATS`, `DEFAULT_MAX_OUTPUT_TOKENS`, `DEFAULT_CONTEXT_WINDOW_TOKENS`).
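For reference, collected in one place (the names are the PR's; the values here are placeholders, not taken from the diff):

```js
const SUPPORTED_BEDROCK_IMAGE_FORMATS = ["png", "jpeg", "gif", "webp"];
const DEFAULT_MAX_OUTPUT_TOKENS = 4096; // placeholder fallback generation cap
const DEFAULT_CONTEXT_WINDOW_TOKENS = 8192; // placeholder fallback context window
```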
Developer Validations
- [x] I ran `yarn lint` from the root of the repo & committed changes