GitHub - MJ10/clamp-gen-data: Code to load data and oracle for active learning on peptides

This is code to use the a subset of data from DBAASP suited for active learning for generating peptides. This is a minimal working version that has been extracted from an internal repository. Original commits are lost but the credit goes to Jie Fu, Tianyu Zhang and Moksh Jain.

We use git-lfs to track the checkpoints and data.

Installing

pip install -r requirements.txt
pip install -e .

Dataset Split

To get training data for our methods:

from clamp_common_eval.defaults import get_default_data_splits
data = get_default_data_splits(setting='Cluster')
data = get_default_data_splits(setting='Target') # or get_default_data_splits(setting='Title')
train_data = data.sample(dataset = "D1", neg_ratio = 2)     # Get D1 and Neg(1 : 2)
train_data = data.sample(dataset = "D1-177", neg_ratio = 1) # Get C. Albican and 177 Neg
train_data = data.sample(dataset = "D2", neg_ratio = 1)     # Get D2 and Neg(1 : 1)

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
clamp_common_eval		clamp_common_eval
data		data
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
__init__.py		__init__.py
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Installing

Dataset Split

About

Releases 1

Packages

Languages

MJ10/clamp-gen-data

Folders and files

Latest commit

History

Repository files navigation

Installing

Dataset Split

About

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages