Skip to content

oskoa/paccmann

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Build Status

PaccMann

paccmann is a package for drug sensitivity prediction and is the core component of the repo.

The package provides a toolbox of learning models for IC50 prediction using drug's chemical properties and tissue-specific cell lines gene expression.

Installation

Setup of the virtual environment

We strongly recommend to work inside a virtual environment (venv).

Create the environment:

python3 -m venv venv

Activate it:

source venv/bin/activate

Module installation

The module can be installed either in editable mode:

pip3 install -e .

Or as a normal package:

pip3 install .

NOTE: For multi-GPU training we use horovod. Its installation requires a working MPI installation. If you don't plan to use paccmann for distributed training on GPU you might remove horovod from requirements.txt before installing the package (you then will need to comment out the horovod imports in paccmann/bin/training_paccmann and paccmann/paccmann/models/core.py. In case you don't have a working MPI installation see here for details on setting up horovod.

Models training

Models can be trained using the script bin/training_paccmann that is installed together with the module. Check the examples for a quick start. For more details see the help of the training command by typing training_paccmann -h:

usage: training_paccmann [-h] [-save_checkpoints_steps 300]
                         [-eval_throttle_secs 60] [-model_suffix]
                         [-train_steps 10000] [-batch_size 64]
                         [-learning_rate 0.001] [-dropout 0.5]
                         [-buffer_size 20000] [-number_of_threads 1]
                         [-prefetch_buffer_size 6]
                         train_filepath eval_filepath model_path
                         model_specification_fn_name params_filepath
                         feature_names

Run training of a `paccmann` model.

positional arguments:
  train_filepath        Path to train data.
  eval_filepath         Path to eval data.
  model_path            Path where the model is stored.
  model_specification_fn_name
                        Model specification function. Pick one of the
                        following: ['dnn', 'rnn', 'scnn', 'sa', 'ca', 'mca'].
  params_filepath       Path to model params. Dictionary with parameters
                        defining the model.
  feature_names         Comma separated feature names. Select from the
                        following: ['smiles_character_tokens',
                        'smiles_atom_tokens', 'fingerprints_256',
                        'fingerprints_512', 'targets_10', 'targets_20',
                        'targets_50', 'selected_genes_10',
                        'selected_genes_20', 'cnv_min', 'cnv_max', 'disrupt',
                        'zigosity', 'ic50', 'ic50_labels'].

optional arguments:
  -h, --help            show this help message and exit
  -save_checkpoints_steps 300, --save-checkpoints-steps 300
                        Steps before saving a checkpoint.
  -eval_throttle_secs 60, --eval-throttle-secs 60
                        Throttle seconds between evaluations.
  -model_suffix , --model-suffix 
                        Suffix for the trained moedel.
  -train_steps 10000, --train-steps 10000
                        Number of training steps.
  -batch_size 64, --batch-size 64
                        Batch size.
  -learning_rate 0.001, --learning-rate 0.001
                        Learning rate.
  -dropout 0.5, --dropout 0.5
                        Dropout to be applied to set and dense layers.
  -buffer_size 20000, --buffer-size 20000
                        Buffer size for data shuffling.
  -number_of_threads 1, --number-of-threads 1
                        Number of threads to be used in data processing.
  -prefetch_buffer_size 6, --prefetch-buffer-size 6
                        Prefetch buffer size to allow pipelining.

About

PaccMann training models

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%