Molecule Signature

Signature-based enumeration of molecules from Morgan fingerprints.

Installation

From conda package

Installation using conda is the easiest way to get started. First, install Conda and then install the package from the conda-forge channel.

  1. Install Conda: Download the installer for your operating system from the Conda Installation page and follow the instructions there. On Windows, download the installer and run it. On macOS and Linux, run the downloaded installer script from a terminal, for example on Linux:

    bash ~/Downloads/Miniconda3-latest-Linux-x86_64.sh

    Follow the prompts on the installer to complete the installation.

  2. Install signature from conda-forge:

    conda install -c conda-forge signature

From source code

One can also install the tool from the source code. This method is useful for development purposes.

  1. Create the conda environment with the dependencies:

    conda env create -f environment.yml

  2. Activate the environment and install the package in editable mode:

    conda activate sig
    pip install -e .  # From the root of the repository

  3. Add the development dependencies:

    conda activate sig
    conda env update -n sig -f environment-dev.yml

Usage

Build a signature from SMILES

Here is a simple example showing how to build a signature from a SMILES string. For more examples, refer to the signature-basics notebook.

from rdkit import Chem
from molsig.Signature import MoleculeSignature

mol = Chem.MolFromSmiles("CCO")
mol_sig = MoleculeSignature(mol)
mol_sig.to_list()
# [
#  '80-1410 ## [C;H3;h3;D1;X4]-[C;H2;h2;D2;X4:1]-[O;H1;h1;D1;X2]',
#  '807-222 ## [C;H3;h3;D1;X4]-[C;H2;h2;D2;X4]-[O;H1;h1;D1;X2:1]',
#  '1057-294 ## [O;H1;h1;D1;X2]-[C;H2;h2;D2;X4]-[C;H3;h3;D1;X4:1]'
# ]
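
To make the structure of that output easier to read, here is a minimal sketch that splits each string at the ` ## ` separator. It uses plain Python and does not call the library. It assumes (this is not confirmed by the example above) that the integers before the separator are Morgan bit identifiers and that the text after it is the atom-level pattern. `split_atom_signature` is a hypothetical helper, not part of `molsig`.

```python
# Signature strings copied verbatim from the example output above.
atom_signatures = [
    "80-1410 ## [C;H3;h3;D1;X4]-[C;H2;h2;D2;X4:1]-[O;H1;h1;D1;X2]",
    "807-222 ## [C;H3;h3;D1;X4]-[C;H2;h2;D2;X4]-[O;H1;h1;D1;X2:1]",
    "1057-294 ## [O;H1;h1;D1;X2]-[C;H2;h2;D2;X4]-[C;H3;h3;D1;X4:1]",
]


def split_atom_signature(text):
    """Hypothetical helper: split one string into (integer prefix, pattern).

    Assumption: the hyphen-separated integers before ' ## ' are Morgan bits.
    """
    prefix, _, pattern = text.partition(" ## ")
    bits = tuple(int(b) for b in prefix.split("-"))
    return bits, pattern


parsed = [split_atom_signature(s) for s in atom_signatures]
for bits, pattern in parsed:
    print(bits, pattern)
```

Each entry carries one atom map label (`:1`), which appears to mark the atom that the entry describes; ethanol has three heavy atoms and the list has three entries.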

Build an alphabet

An alphabet uses signatures to build a collection of mappings from Morgan bits to atom signatures.

See the creating-alphabet-basics notebook.
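
As a rough illustration of the idea of a bit-to-atom-signature mapping, the sketch below collects such a mapping from the ethanol strings shown earlier. It is a toy in plain Python, not the library's alphabet implementation, and the notebook remains the reference. `build_toy_alphabet` is a hypothetical name, and the code assumes that the integers before ` ## ` are Morgan bits.

```python
from collections import defaultdict


def build_toy_alphabet(signature_lists):
    """Hypothetical helper: map each integer prefix to the patterns seen with it.

    Assumption: the hyphen-separated integers before ' ## ' are Morgan bits.
    """
    alphabet = defaultdict(set)
    for signatures in signature_lists:
        for text in signatures:
            prefix, _, pattern = text.partition(" ## ")
            for bit in prefix.split("-"):
                alphabet[int(bit)].add(pattern)
    return dict(alphabet)


# Atom signature strings for ethanol ("CCO"), copied from the usage example.
ethanol = [
    "80-1410 ## [C;H3;h3;D1;X4]-[C;H2;h2;D2;X4:1]-[O;H1;h1;D1;X2]",
    "807-222 ## [C;H3;h3;D1;X4]-[C;H2;h2;D2;X4]-[O;H1;h1;D1;X2:1]",
    "1057-294 ## [O;H1;h1;D1;X2]-[C;H2;h2;D2;X4]-[C;H3;h3;D1;X4:1]",
]

toy_alphabet = build_toy_alphabet([ethanol])
for bit in sorted(toy_alphabet):
    print(bit, sorted(toy_alphabet[bit]))
```

With more molecules in `signature_lists`, a single bit could collect several distinct patterns, which is why the values are sets.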

Enumerate molecules from an ECFP fingerprint

See the enumeration-basics notebook.

Citation

If you use this software, please cite it as below.

Meyer, P., Duigou, T., Gricourt, G., & Faulon, J.-L. Reverse Engineering Molecules from Fingerprints through Deterministic Enumeration and Generative Models. In preparation.

License

This project is licensed under the MIT License. See the LICENSE file for details.