This project aims to develop an intelligent agent that automatically attempts to reproduce the analysis and results of a scientific manuscript from its corresponding dataset.
Reproducibility is a cornerstone of scientific integrity. However, many published manuscripts lack the methodological detail or accessible code needed to reproduce their reported findings, which makes it hard to validate or build upon existing research.
Given a scientific manuscript (PDF or text) and its corresponding dataset(s), the agent will:
- Extract relevant information (methodology, results, dependencies) from the manuscript.
- Process the provided dataset(s) according to the described analysis steps.
- Generate the Python code required to perform the analysis.
- Create an isolated environment (e.g., a Docker container) with the necessary software dependencies.
- Execute the generated code within the environment to produce results (tables, figures, statistical values).
- Compare the generated results with the reported results in the manuscript.
- Assign a confidence score to the reproducibility attempt.
- Identify and document potential points of failure or ambiguity.
- Generate a detailed report summarizing the attempt.
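
The comparison and scoring steps above could be sketched as follows. This is a minimal illustration, not the project's actual design: the names `Comparison`, `compare_results`, and `confidence_score`, the 5% relative tolerance, and the "fraction of matched values" scoring rule are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Comparison:
    """One reported value paired with its reproduced counterpart (hypothetical type)."""
    name: str
    reported: float
    reproduced: float
    matches: bool


def compare_results(reported, reproduced, rel_tol=0.05):
    """Compare each reported value with the reproduced one.

    A value that the generated code failed to produce is recorded
    as NaN and counted as a mismatch.
    """
    comparisons = []
    for name, expected in reported.items():
        actual = reproduced.get(name, math.nan)
        # math.isclose returns False whenever either argument is NaN.
        ok = math.isclose(expected, actual, rel_tol=rel_tol)
        comparisons.append(Comparison(name, expected, actual, ok))
    return comparisons


def confidence_score(comparisons):
    """Assumed scoring rule: fraction of reported values matched within tolerance."""
    if not comparisons:
        return 0.0
    return sum(c.matches for c in comparisons) / len(comparisons)


# Illustrative, made-up values; not taken from any real manuscript.
reported = {"mean_age": 42.0, "p_value": 0.031, "r_squared": 0.58}
reproduced = {"mean_age": 42.3, "p_value": 0.030}  # r_squared was not produced

comparisons = compare_results(reported, reproduced)
score = confidence_score(comparisons)
unmatched = [c.name for c in comparisons if not c.matches]
```

Here `unmatched` collects the names of values that could not be reproduced, which would feed the "points of failure or ambiguity" section of the report. A real agent would likely weight results by importance and handle figures and tables separately. The flat fraction only shows the shape of the step.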