Code Optimization using Reinforcement Learning (CORL)

18668 Data Science for Software Engineering, Team 7

Setup

You can manage your Python environment using a virtual environment:

python3 -m venv .venv

You can activate the environment and install the requirements like so:

source .venv/bin/activate
pip3 install -r requirements.txt

Install the pre-commit hooks used for Python formatting:

pre-commit install

To exit the virtual environment, simply run deactivate.

Dataset

Code samples (dataset/) and scores for the initial fine-tuning are sourced from the PIE dataset. You can download the dataset directly here.

Public test cases (dataset/) are taken from the PIE dataset. You can download them directly here.

Hidden test cases (dataset/) are taken from the PIE dataset. You can download them directly here. The columns of the improvement_pairs_additional_metadata.csv dataset are in the following format:

user_id, problem_id, language, submission_id_v0, submission_id_v1, cpu_time_v0, cpu_time_v1, memory_v0, memory_v1, status_v0, status_v1, improvement_frac, code_v0, code_v1
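
As a quick sanity check, the columns can be inspected with pandas. The snippet below is a minimal sketch, not part of the repository; the CSV path is an assumption and may differ depending on where you place the download.

# Minimal sketch: inspect the improvement-pairs CSV (the path is an assumption).
import pandas as pd

df = pd.read_csv("dataset/improvement_pairs_additional_metadata.csv")

# Columns documented above: user_id, problem_id, language, submission_id_v0,
# submission_id_v1, cpu_time_v0, cpu_time_v1, memory_v0, memory_v1,
# status_v0, status_v1, improvement_frac, code_v0, code_v1
print(df.columns.tolist())

# Example: average fractional improvement per language.
print(df.groupby("language")["improvement_frac"].mean())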

Subsets of the data processed for initial fine-tuning (with the optimization instruction) can be found here.

The following scripts are available for data processing and analysis:

  • process_csv.py - Converts the CSV dataset to a JSON file for instruction tuning.
  • count_tokens.py - Analyzes the distribution of input/output sizes (in tokens) of an instruction-tuning dataset (JSON).
  • filter_dataset - Creates a subset of an instruction-tuning dataset containing only samples whose inputs/outputs are below the provided threshold.
  • train_test_split.py - Splits a JSON dataset into a train and test set, ensuring that all instances of a particular problem_id end up in either the training set or the test set, never both (see the sketch below).

You can read more about how to use these scripts in their respective file headers, or by running the script with the --help flag.
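
For illustration, the problem_id-grouped split that train_test_split.py performs can be sketched as follows. This is a simplified sketch rather than the actual script: the file names, split ratio, and the assumption that each JSON sample retains a problem_id field are placeholders; consult the script's --help output for its real interface.

# Sketch of a problem_id-grouped train/test split (not the actual script).
import json
import random

with open("dataset.json") as f:          # placeholder input file
    samples = json.load(f)

# Split by problem_id so that no problem appears in both sets.
problem_ids = sorted({s["problem_id"] for s in samples})
random.seed(0)
random.shuffle(problem_ids)

cutoff = int(0.9 * len(problem_ids))     # 90/10 split as an example
train_ids = set(problem_ids[:cutoff])

train = [s for s in samples if s["problem_id"] in train_ids]
test = [s for s in samples if s["problem_id"] not in train_ids]

with open("train.json", "w") as f:
    json.dump(train, f)
with open("test.json", "w") as f:
    json.dump(test, f)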

Initial Fine-Tuning

The initial fine-tuning with the instruction dataset can be performed with tune.py:

python3 tune.py

You may also use command-line flags to override the default configuration. For a complete list of options, check out the file header in tune.py or run:

python3 tune.py -h

The model checkpoint is saved to models/dataset_tuned_checkpoint.
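
Once tuning finishes, the checkpoint can be loaded for inference. The snippet below is a minimal sketch that assumes the checkpoint is saved in Hugging Face format with a CodeT5+-style seq2seq model; the prompt wording and example.py input file are illustrative assumptions, not the repository's actual optimization instruction.

# Minimal sketch: load the tuned checkpoint for inference.
# Assumes a Hugging Face-format checkpoint and a seq2seq (CodeT5+-style) base model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

ckpt = "models/dataset_tuned_checkpoint"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForSeq2SeqLM.from_pretrained(ckpt)

# Hypothetical prompt; the actual optimization instruction may differ.
prompt = "Optimize the following program:\n" + open("example.py").read()
inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))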

References

PPOCoder

@article{shojaee2023ppocoder,
  title={Execution-based code generation using deep reinforcement learning},
  author={Shojaee, Parshin and Jain, Aneesh and Tipirneni, Sindhu and Reddy, Chandan K},
  journal={arXiv preprint arXiv:2301.13816},
  year={2023}
}

PIE-Perf Dataset

@misc{shypula2023learning,
  title={Learning Performance-Improving Code Edits},
  author={Alexander Shypula and Aman Madaan and Yimeng Zeng and Uri Alon and Jacob Gardner and Milad Hashemi and Graham Neubig and Parthasarathy Ranganathan and Osbert Bastani and Amir Yazdanbakhsh},
  year={2023},
  eprint={2302.07867},
  archivePrefix={arXiv},
  primaryClass={cs.SE}
}

CodeT5+

@article{wang2023codet5plus,
  title={CodeT5+: Open Code Large Language Models for Code Understanding and Generation},
  author={Wang, Yue and Le, Hung and Gotmare, Akhilesh Deepak and Bui, Nghi D.Q. and Li, Junnan and Hoi, Steven C. H.},
  journal={arXiv preprint},
  year={2023}
}
