This repository contains the scripts, inputs, and results generated as part of training the Sage line of OpenFF force fields.
Warning: The structure and contents of this repository are currently in flux and incomplete while the first releases of Sage are being trained. We do not guarantee the scientific correctness of anything found within, nor do we yet recommend using any force field parameters found here.
This repository is structured into four main directories:
- data-set-curation: contains the script used to curate the training and test data sets.
- inputs-and-results: contains the main input files required to reproduce the performed optimizations and benchmarks.
- schema: contains schemas which define most parts of the project, including definitions of which optimizations and benchmarks were performed.
- scripts: contains the script used to generate the input schemas / files, and scripts which perform ancillary data analysis.
The experimental data sets used in this project were curated from the NIST ThermoML archive. The citations for the individual measurements can be found in DATA_CITATIONS.bib.
The QM data sets used in this project were generated by and are stored in the MolSSI QCArchive repository.
The exact inputs and outputs used for each new version of Sage (including the conda environment used to generate them) are included as tagged releases of this repository.
For those looking to reproduce the study, the required dependencies may be obtained directly using conda:
conda env create --name openff-sage --file environment.yaml
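Before re-running anything, it can help to confirm that the executables invoked later in this README are actually on the PATH. The following is a minimal sketch: `check_tools` is a hypothetical helper that is not part of this repository, and it is an assumption that environment.yaml installs both `nonbonded` and `ForceBalance`.

```shell
# Hypothetical helper (not part of this repository): report any of the
# given executables that cannot be found on PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing executable: $tool" >&2
            missing=1
        fi
    done
    # Returns 0 only if every requested executable was found.
    return "$missing"
}

# Example, after activating the environment with `conda activate openff-sage`:
#   check_tools nonbonded ForceBalance
```

A non-zero return status suggests that the environment was not activated or that its creation did not complete.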
In most cases the VdW optimizations can be re-run using the following commands:
cd inputs-and-results/optimizations/vdw-vx/
nonbonded optimization run
nonbonded optimization analyze
while the QM optimizations may be re-run with, for example:
cd inputs-and-results/optimizations/vdw-v1-td-opt-vib-v1/
ForceBalance optimize.in
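Optimizations can run for a long time, so it may be convenient to keep their console output. The sketch below is one possible way to do that: `run_in_dir` and the `run.log` file name are hypothetical and not part of this repository. It runs a command inside a given directory and stores the combined standard output and error there.

```shell
# Hypothetical wrapper (not part of this repository): run a command inside
# a given directory and save its combined output to run.log in that directory.
run_in_dir() {
    dir=$1
    shift
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$dir" && "$@" > run.log 2>&1 )
}

# Example:
#   run_in_dir inputs-and-results/optimizations/vdw-v1-td-opt-vib-v1 ForceBalance optimize.in
```

The wrapper returns the exit status of the wrapped command, so a failed run can be detected by the calling script.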
A more complete set of instructions for performing QM fits with ForceBalance can be found here
A comprehensive set of instructions for re-running the benchmarks can be found in the inputs-and-results/benchmarks directory.