generate command #79
- Implement tests for Float32Ptr to validate pointer creation for float32 values.
- Create tests for ExtractJSON to ensure correct extraction of JSON from various input formats.
- Add tests for cleanJavaScriptStringConcat to verify string concatenation handling in JavaScript context.
- Introduce tests for StringSliceContains to check for string presence in slices.
- Implement tests for MergeStringMaps to validate merging behavior of multiple string maps, including overwrites and handling of nil/empty maps.
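The behaviors these tests target can be illustrated with a minimal sketch. The signatures below are assumptions inferred from the helper names in the commit message, not the PR's actual code; in particular, the "later map wins" overwrite order in MergeStringMaps is a guess at the intended semantics.

```go
package main

import "fmt"

// Float32Ptr returns a pointer to a copy of v (assumed signature).
func Float32Ptr(v float32) *float32 {
	return &v
}

// StringSliceContains reports whether s contains target (assumed signature).
func StringSliceContains(s []string, target string) bool {
	for _, item := range s {
		if item == target {
			return true
		}
	}
	return false
}

// MergeStringMaps merges maps left to right, so later maps overwrite
// earlier keys. Ranging over a nil map is a no-op in Go, so nil and
// empty inputs are skipped naturally. The result is never nil.
func MergeStringMaps(maps ...map[string]string) map[string]string {
	merged := make(map[string]string)
	for _, m := range maps {
		for k, v := range m {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	fmt.Println(*Float32Ptr(0.5))
	fmt.Println(StringSliceContains([]string{"a", "b"}, "b"))
	merged := MergeStringMaps(map[string]string{"k": "old"}, nil, map[string]string{"k": "new"})
	fmt.Println(merged["k"])
}
```

Returning a fresh map rather than mutating the first argument keeps callers' inputs untouched, which makes the overwrite and nil cases easy to assert in table-driven tests.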
…ove unused ChatMessage type
…Pex context conversion
… tests in export_test.go
- Changed modelParams from pointer to value in toGitHubModelsPrompt function for better clarity and safety.
- Updated the assignment of ModelParameters to use the value directly instead of dereferencing a pointer.
- Introduced a new test suite in export_test.go to cover various scenarios for GitHub models evaluation generation, including edge cases and expected outputs.
- Ensured that the tests validate the correct creation of files and their contents based on the provided context and options.
- Added NewPromptPex function to create a new PromptPex instance.
- Implemented Run method to execute the PromptPex pipeline with context management.
- Created context from prompt files or loaded existing context from JSON.
- Developed pipeline steps including intent generation, input specification, output rules, and tests.
- Added functionality for generating groundtruth outputs and evaluating test results.
- Implemented test expansion and rating features for improved test coverage.
- Introduced error handling and logging throughout the pipeline execution.
- Implemented TestCreateContext to validate various prompt YAML configurations and their expected context outputs.
- Added TestCreateContextRunIDUniqueness to ensure unique RunIDs are generated for multiple context creations.
- Created TestCreateContextWithNonExistentFile to handle cases where the prompt file does not exist.
- Developed TestCreateContextPromptValidation to check for valid and invalid prompt formats.
- Introduced TestGithubModelsEvalsGenerate to test the generation of GitHub Models eval files with various scenarios.
- Added TestToGitHubModelsPrompt to validate the conversion of prompts to GitHub Models format.
- Implemented TestExtractTemplateVariables and TestExtractVariablesFromText to ensure correct extraction of template variables.
- Created TestGetMapKeys and TestGetTestScenario to validate utility functions related to maps and test scenarios.
…tPex configuration
… summary generation
… improved summary reporting
…se and restore its implementation; remove obsolete promptpex.go and summary_test.go files
…covering various scenarios and error handling
…entiment analysis test prompt
…neFlags function and update flag parsing to use consistent naming
… in generate_test.go
…ck responses for sentiment analysis stages
…odology for test generation
…derMessagesToString for message formatting
…ription for clarity; remove unused test functions
…d; clean up pipeline comments for clarity
…erations field; update related tests for consistency
…s; update related tests and documentation for consistency
…pdate related parsing and test logic for consistency
…values; update related tests for consistency and remove unused test_types.go file
…s to values; update ApplyEffortConfiguration and tests for consistency
…rs for improved structure
…e GetDefaultOptions and pipeline logic for usage
…s, parsing, and tests
…nced options and customization instructions
…nd update Test Generation section; add mermaid diagram for clarity
…larity and consistency
…related tests for consistency
…alid and invalid effort inputs
…tures
- Introduced constants for evaluator rules compliance in constants.go.
- Implemented GenerateRulesEvaluator function in evaluators.go for evaluating compliance with output rules.
- Updated GetDefaultOptions to include evaluation model in options.go.
- Modified pipeline to insert output rule evaluator into the prompt context.
- Refactored render functions to use new color constants.
- Added Eval field to PromptPexOptions in types.go for configuration.
…improved clarity and functionality; enhance test generation process with new rules and options
…ling in generateGroundtruth function and remove obsolete prompt_hash_test file
Pull Request Overview
This pull request implements the `generate` command to automatically generate test cases for prompts using the PromptPex methodology. The implementation adds comprehensive test generation capabilities that analyze prompts and create diverse test scenarios to evaluate prompt behavior across different edge cases.
- Adds new `generate` command with full PromptPex pipeline for automated test generation
- Implements HTTP request logging functionality for debugging API interactions
- Extends prompt file structure to support generated test data and evaluations
Reviewed Changes
Copilot reviewed 37 out of 38 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| pkg/prompt/prompt.go | Added SaveToFile method and TestDataItem type, updated YAML tags for omitempty |
| internal/azuremodels/client.go | Added HTTP logging context utilities for request debugging |
| internal/azuremodels/azure_client.go | Implemented HTTP request logging to specified log files |
| examples/test_generate.yml | Example generated prompt file with 40+ test cases and evaluator configuration |
| examples/custom_instructions_example.md | Documentation for custom instruction flags usage |
| cmd/run/run.go | Minor variable extraction refactor |
| cmd/root_test.go | Added test assertion for generate command in help output |
| cmd/root.go | Registered new generate command |
| cmd/generate/* | Complete generate command implementation with pipeline, parsing, utilities, and tests |
| README.md | Added comprehensive documentation for generate command and PromptPex methodology |
| Makefile | Added ci-lint, build, and clean targets |
defer sp.Stop()

resp, err := h.client.GetChatCompletionStream(ctx, req, h.org)
if err != nil {
The defer statement for sp.Stop() is placed inside a loop and will be executed when the function returns, not when the loop iteration ends. This could lead to multiple spinners running simultaneously. Consider calling sp.Stop() explicitly before continuing to the next iteration or restructuring the code.
if err != nil {
resp, err := h.client.GetChatCompletionStream(ctx, req, h.org)
if err != nil {
	sp.Stop() // Ensure spinner is stopped before handling errors
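The timing difference this comment describes can be reproduced in isolation. The sketch below uses a hypothetical `spinner` type that only records events (it is not the spinner library used in the PR) and compares deferring `Stop` inside a loop with calling it explicitly.

```go
package main

import "fmt"

// spinner is a hypothetical stand-in that records when it is stopped.
type spinner struct {
	id  int
	log *[]string
}

func (s spinner) Stop() { *s.log = append(*s.log, fmt.Sprintf("stop %d", s.id)) }

// deferredInLoop defers Stop inside the loop. The deferred calls run only
// when the enclosing function returns, in last-in-first-out order.
func deferredInLoop(n int) []string {
	var log []string
	func() {
		for i := 0; i < n; i++ {
			log = append(log, fmt.Sprintf("start %d", i))
			sp := spinner{id: i, log: &log}
			defer sp.Stop()
		}
	}()
	return log
}

// explicitInLoop stops each spinner before the next iteration starts.
func explicitInLoop(n int) []string {
	var log []string
	for i := 0; i < n; i++ {
		log = append(log, fmt.Sprintf("start %d", i))
		sp := spinner{id: i, log: &log}
		sp.Stop()
	}
	return log
}

func main() {
	fmt.Println(deferredInLoop(2)) // [start 0 start 1 stop 1 stop 0]
	fmt.Println(explicitInLoop(2)) // [start 0 stop 0 start 1 stop 1]
}
```

In the deferred version both spinners are live at the same time before either stops, which is the overlap the review comment warns about.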
}
reader := resp.Reader
//nolint:gocritic,revive // TODO
defer reader.Close()
Similar to the spinner issue, the defer statement for reader.Close() is inside a loop and may not behave as expected. Consider explicit resource management.
defer reader.Close()
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Implement PromptPex strategy to generate tests for prompts automatically.
🚀 Automated Prompt Test Generation: PromptPex Integration, Robust CLI, and Enhanced Utilities
This PR introduces advanced automated test generation for prompt files using the PromptPex methodology, empowering users to systematically validate and harden prompt engineering workflows.
Highlights:
🧪 PromptPex Test Generation Pipeline:
Implements a new `generate` CLI command that orchestrates intent analysis, input specification, rule extraction, scenario generation, and test case creation for prompts, enabling automated, stepwise test generation.
🛠️ Extensive CLI Enhancements:
🧰 Utility Functions & Helpers:
🏗️ Improved Reliability & Error Handling:
… `.gitignore` and CI tooling for better artifact management and workflow reliability.
🧑‍🔬 Comprehensive Testing:
📄 Documentation & Examples:
… `generate` command, PromptPex integration, and advanced usage.
🔍 Debugging & Transparency:
These changes deliver a research-backed framework for automated prompt validation: they streamline prompt engineering, improve reliability, and make the CLI experience more transparent and user-friendly.