A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes

Requirements
- Torch, Cutorch (http://torch.ch/docs/getting-started.html)
- Python packages unidiff, pygments: pip install unidiff pygments

Setup environment
1. Clone this repository: cd ~ && git clone https://github.com/epochx/commitgen-dev.git
2. Create the data path: mkdir -p ~/data/preprocessing
3. Export the environment variable: export WORK_DIR=~/data (without a trailing slash!)
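
The setup steps above can be sanity-checked before running anything else. The following is a minimal sketch, not part of the repository: the function name check_work_dir is hypothetical, and it only assumes what the steps state, namely that WORK_DIR is set, has no trailing slash, and contains a preprocessing directory.

```python
import os


def check_work_dir(environ=None):
    """Return a list of problems with the WORK_DIR setup (empty if none found).

    Hypothetical helper: it only checks the three conditions described in
    the setup steps and knows nothing about the repository's own scripts.
    """
    environ = os.environ if environ is None else environ
    work_dir = environ.get("WORK_DIR")
    if not work_dir:
        return ["WORK_DIR is not set"]

    problems = []
    if work_dir.endswith("/"):
        problems.append("WORK_DIR has a trailing slash")

    # The pre-processing step writes its pickle file into this directory.
    prep_dir = os.path.join(os.path.expanduser(work_dir), "preprocessing")
    if not os.path.isdir(prep_dir):
        problems.append("missing directory: " + prep_dir)
    return problems


if __name__ == "__main__":
    for problem in check_work_dir():
        print(problem)
```

Run it in the same shell where WORK_DIR was exported; no output means the three conditions hold.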

Download our paper data
1. Get the raw commit data used in our paper from https://osf.io/67kyc/?view_only=ad588fe5d1a14dd795553fb4951b5bf9 (click on "OSF Storage" and then on "Download as zip"). Unzip the file where convenient.
2. Unzip the desired dataset zip and move the resulting folder to ~/data.

Pre-process data
1. Parse and filter commits and messages: cd ~/commitgen && python ./preprocess.py FOLDER_NAME --language LANGUAGE, where FOLDER_NAME is the name of the folder from the previous step. Add the --atomic flag to keep only atomic commits. This generates a pre-processed version of the dataset as a pickle file in ~/data/preprocessing. Run python ./preprocess.py --help for details on additional pre-processing parameters.
2. Generate training data: cd ~/commitgen && ./buildData.sh PICKLE_FILE_NAME LANGUAGE (PICKLE_FILE_NAME without the .pickle extension).
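
This README does not define what an atomic commit is. The sketch below assumes it means a commit whose diff touches exactly one file; that definition is an assumption, and the code is not the repository's implementation (preprocess.py relies on the unidiff package, while this uses only the standard library and hypothetical function names).

```python
def changed_files(diff_text):
    """Return the paths named in the 'diff --git' headers of a git diff."""
    files = []
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            # Header format: diff --git a/<path> b/<path>
            files.append(line.split(" b/", 1)[-1])
    return files


def is_atomic(diff_text):
    """Assumed definition: a commit is atomic if it changes exactly one file."""
    return len(changed_files(diff_text)) == 1


# Illustrative diffs, made up for this example.
ONE_FILE = """diff --git a/src/app.py b/src/app.py
--- a/src/app.py
+++ b/src/app.py
@@ -1 +1 @@
-x = 1
+x = 2
"""

TWO_FILES = ONE_FILE + """diff --git a/README.md b/README.md
--- a/README.md
+++ b/README.md
@@ -1 +1 @@
-old
+new
"""

if __name__ == "__main__":
    print(is_atomic(ONE_FILE), is_atomic(TWO_FILES))
```

Under this assumed definition, the first diff would be kept by an atomic filter and the second would be dropped.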

Train the model
1. Run the model: cd ~/commitgen && ./run.sh PICKLE_FILE_NAME LANGUAGE (PICKLE_FILE_NAME without the .pickle extension).
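
The pre-processing and training steps can be chained from a single script. The following is a rough sketch under stated assumptions: pipeline_commands and run_pipeline are hypothetical names, the scripts are assumed to live in ~/commitgen as in the steps above, and only the arguments documented in this README are used. It also strips a trailing .pickle, since both shell scripts expect the name without the extension.

```python
import os
import subprocess


def pipeline_commands(folder_name, language, pickle_file_name, atomic=False):
    """Build the argument lists for the three documented steps, in order."""
    name = os.path.basename(pickle_file_name)
    if name.endswith(".pickle"):
        # buildData.sh and run.sh take the name without the extension.
        name = name[: -len(".pickle")]

    preprocess = ["python", "./preprocess.py", folder_name, "--language", language]
    if atomic:
        preprocess.append("--atomic")

    return [
        preprocess,
        ["./buildData.sh", name, language],
        ["./run.sh", name, language],
    ]


def run_pipeline(folder_name, language, pickle_file_name, atomic=False,
                 repo_dir="~/commitgen"):
    """Run each step inside the repository folder, stopping at the first failure."""
    cwd = os.path.expanduser(repo_dir)
    for command in pipeline_commands(folder_name, language, pickle_file_name, atomic):
        subprocess.run(command, cwd=cwd, check=True)
```

Calling run_pipeline("FOLDER_NAME", "LANGUAGE", "PICKLE_FILE_NAME") with real values would execute the same three commands as the manual steps. The name of the pickle file written by preprocess.py is not stated here, so it is passed in explicitly rather than guessed.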

You can also download additional GitHub project data with our crawler. Run cd ~/commitgen && python crawl_commits.py --help for details on how to use it.
