diff --git a/docs/databases_klifs.rst b/docs/databases_klifs.rst index 7a48f9e6..1b262760 100644 --- a/docs/databases_klifs.rst +++ b/docs/databases_klifs.rst @@ -10,25 +10,29 @@ This module offers a simple API to interact with data from KLIFS remotely and lo What is KLIFS and who created it? --------------------------------- -"KLIFS, intially developed at the Vrije Universiteit Amsterdam, is a database that dissects experimental structures of catalytic kinase domains and the way kinase inhibitors interact with them. The KLIFS structural alignment enables the comparison of all structures and ligands to each other. Moreover, the KLIFS residue numbering scheme capturing the catalytic cleft of 85 residues allows for the comparison of the interaction patterns of kinase-inhibitors to each other to, for example, identify crucial interactions determining kinase-inhibitor selectivity." +"KLIFS is a kinase database that dissects experimental structures of catalytic kinase domains and the way kinase inhibitors interact with them. The KLIFS structural alignment enables the comparison of all structures and ligands to each other. Moreover, the KLIFS residue numbering scheme capturing the catalytic cleft with 85 residues enables the comparison of the interaction patterns of kinase-inhibitors, for example, to identify crucial interactions determining kinase-inhibitor selectivity." - KLIFS database: https://klifs.net - KLIFS online service: https://klifs.net/swagger -- KLIFS citation: - - - Description of the KLIFS website, the web services, and/or data/annotations from the KLIFS database: `Nucleic Acids Res. (2016), 44, 6, D365–D371 `_ - - Description of the initial KLIFS dataset, the binding mode classification, or the residue nomenclature: `J. Med. Chem. (2014), 57, 2, 249-277 `_ - +- KLIFS citation: `Nucleic Acids Res. (2021), 49, D1, D562–D569 `_ What does ``opencadd.databases.klifs`` offer? 
--------------------------------------------- +This module allows you to access KLIFS data such as information about kinases, structures, ligands, interaction fingerprints, and bioactivities. +On the one hand, you can query the KLIFS webserver directly. + +On the other hand, you can query your local KLIFS download. +We provide identical APIs for the remote and local queries and streamline all output into standardized ``pandas`` DataFrames for easy and quick downstream manipulation. + Work with KLIFS data from KLIFS server (remotely) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The ``opencadd.databases.klifs.remote`` submodule offers you to access KLIFS data from the KLIFS server. -This module uses the official KLIFS API: https://klifs.net/swagger. +Our API relies on the REST API and OpenAPI (Swagger) specification at https://dev.klifs.net/swagger_v2/ to dynamically generate a Python client with ``bravado``. + +Example for ``opencadd``'s API to access remote data: .. code-block:: python @@ -46,7 +50,8 @@ This module uses the official KLIFS API: https://klifs.net/swagger. Work with KLIFS data from disc (locally) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The ``opencadd.databases.klifs.local`` submodule offers you to access KLIFS data from the KLIFS server. In order to make use of the module's functionality, you need a KLIFS download folder ``KLIFS_download`` with the following structure (files downloaded from `KLIFS `_): +The ``opencadd.databases.klifs.local`` submodule lets you access KLIFS data from your KLIFS download. +To use the module's functionality, you need a KLIFS download folder ``KLIFS_download`` with the following structure (files downloaded from `KLIFS `_): .. code-block:: console @@ -64,6 +69,8 @@ The ``opencadd.databases.klifs.local`` submodule offers you to access KLIFS data │ └── ... └── ... +Example for ``opencadd``'s API to access local data: + .. 
code-block:: python from opencadd.databases.klifs import setup_local @@ -88,48 +95,61 @@ The module's structure looks like this, trying to use the same API for both modu opencadd/ └── databases/ └── klifs/ - ├── api.py # Defines the API for local and remote sessions. - ├── core.py # Defines the parent classes used in the local and remote modules. - ├── local.py # Defines the API for local queries. - ├── remote.py # Defines the API for remote queries. - ├── schema.py # Defines the schema for class method return values. - └── utils.py # Defines utility functions. + ├── api.py # Defines the main API for local and remote sessions. + ├── session.py # Defines a KLIFS session. + ├── core.py # Defines the parent classes used in the local and remote modules. + ├── local.py # Defines the API for local queries. + ├── remote.py # Defines the API for remote queries. + ├── schema.py # Defines the schema for class method return values. + ├── fields.py # Defines the different KLIFS data fields and their names/dtypes in ``opencadd``. + ├── utils.py # Defines utility functions. + └── exceptions.py # Defines exceptions. This structure mirrors the KLIFS Swagger API structure in the following way to access different kinds of information both remotely and locally: - ``kinases`` - Get information about kinases (groups, families, names). - - In KLIFS swagger API called ``Information``. + - In KLIFS swagger API called ``Information``: https://dev.klifs.net/swagger_v2/#/Information - ``ligands`` - Get ligand information. - - In KLIFS swagger API called ``Ligands``. + - In KLIFS swagger API called ``Ligands``: https://dev.klifs.net/swagger_v2/#/Ligands - ``structures`` - Get structure information. - - In KLIFS swagger API called ``Structures``. + - In KLIFS swagger API called ``Structures``: https://dev.klifs.net/swagger_v2/#/Structures - ``bioactivities`` - Get bioactivity information. - - In KLIFS swagger API part of ``Ligands``. 
+ - In KLIFS swagger API part of ``Ligands``: https://dev.klifs.net/swagger_v2/#/Ligands - ``interactions`` - Get interaction information. - - In KLIFS swagger API called ``Interactions``. + - In KLIFS swagger API called ``Interactions``: https://dev.klifs.net/swagger_v2/#/Interactions - ``pocket`` - Get interaction information. - - In KLIFS swagger API part of ``Interactions``. + - In KLIFS swagger API part of ``Interactions``: https://dev.klifs.net/swagger_v2/#/Interactions - ``coordinates`` - Get structural data (structure coordinates). - - In KLIFS swagger API part of ``Structures``. + - In KLIFS swagger API part of ``Structures``: https://dev.klifs.net/swagger_v2/#/Structures + +- ``conformations`` + + - Get information on structure conformations. + - In KLIFS swagger API part of ``Structures``: https://dev.klifs.net/swagger_v2/#/Structures/get_structure_conformation + +- ``modified_residues`` + + - Get information on residue modifications in structures. + - In KLIFS swagger API part of ``Structures``: https://dev.klifs.net/swagger_v2/#/Structures/get_structure_modified_residues diff --git a/docs/tutorials/databases_klifs.ipynb b/docs/tutorials/databases_klifs.ipynb index 32e03672..2195d4d1 100644 --- a/docs/tutorials/databases_klifs.ipynb +++ b/docs/tutorials/databases_klifs.ipynb @@ -15,13 +15,16 @@ "\n", "In the following, the API will be demonstrated in parallel for local and remote access for the following sources of information (classes in `local` and `remote` modules):\n", "\n", - "- `Kinases`: Details on kinases in KLIFS\n", - "- `Ligands`: Details on ligands in KLIFS\n", - "- `Structures` Details on structures in KLIFS (extracted from the PDB)\n", - "- `Bioactivities` (accessible only remotely): Details on bioactivities in KLIFS (extracted from ChEMBL)\n", - "- `Interactions`: Details on kinase-ligand interactions in KLIFS\n", - "- `Pockets`: Details on kinase pockets in KLIFS\n", - "- `Coordinates`: Coordinates for complexes, ligands, pockets, 
proteins, and water in KLIFS" + "- `kinases`: Details on kinases\n", + "- `ligands`: Details on ligands\n", + "- `drugs`: Details on kinase inhibitors\n", + "- `bioactivities` (accessible only remotely): Details on bioactivities (extracted from ChEMBL)\n", + "- `structures`: Details on structures (extracted from the PDB)\n", + "- `interactions`: Details on kinase-ligand interactions\n", + "- `conformations`: Details on structures' conformation parameters such as DFG, aC-helix, p-loop\n", + "- `pockets`: Details on a structure's pocket residues\n", + "- `modified_residues`: Details on a structure's modified residues\n", + "- `coordinates`: Coordinates for complexes, ligands, pockets, proteins, and water" ] }, { @@ -72,7 +75,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "From the `api` module, import the session setup functions `setup_remote()` and `setup_local()`." + "Import the session setup functions `setup_remote()` and `setup_local()`." ] }, { @@ -544,14 +547,16 @@ "source": [ "Different sources of information are accessible via (instances of classes introduced earlier, which are initialized upon session setup):\n", "\n", - "- `local.kinases` or `remote.kinases`\n", - "- `local.ligands` or `remote.ligands`\n", - "- `local.structures` or `remote.structures`\n", - "- `local.bioactivities` or `remote.bioactivities`\n", - "- `local.interactions` or `remote.interactions`\n", - "- `local.pockets` or `remote.pockets`\n", "- `remote.drugs` (not available locally)\n", - "- `local.coordinates` or `remote.coordinates` (check more details in the \"Coordinates\" section of this notebook)" + "- `remote.bioactivities` or `local.bioactivities`\n", + "- `remote.structures` or `local.structures`\n", + "- `remote.interactions` or `local.interactions`\n", + "- `remote.conformations` (not available locally)\n", + "- `remote.pockets` or `local.pockets`\n", + "- 
`remote.modified_residues` (not available locally)\n", + "- `remote.coordinates` or `local.coordinates` (check more details in the \"Coordinates\" section of this notebook)" ] }, { @@ -2735,7 +2740,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "351ad69a83a542dda28ee667a897ce5f", + "model_id": "5ade4d5350574fd69b35c62145628f56", "version_major": 2, "version_minor": 0 }, @@ -2846,7 +2851,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "c33596d342ec4cb8b3156e4b9e2cec68", + "model_id": "1446d662470b40f78f5463889f45b08d", "version_major": 2, "version_minor": 0 }, @@ -2995,7 +3000,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "92be490e8a3747a7b5d0c0431e0a5194", + "model_id": "03739bd3ea184a3ebad8436f6d3d952b", "version_major": 2, "version_minor": 0 }, @@ -3227,7 +3232,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "232a13db996740e3b869f604f2eb9447", + "model_id": "6fe5dda3f07645958dbc8f9c6adb3e4c", "version_major": 2, "version_minor": 0 }, @@ -4060,7 +4065,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "896e99651c2646d7bc635480ac2d0d1d", + "model_id": "81c3a56b52e8464b9e0952b94520d06f", "version_major": 2, "version_minor": 0 }, @@ -4220,7 +4225,7 @@ } ], "source": [ - "# Takes a couple of minutes for all ligands, thus use only top 50 as show case here\n", + "# Takes a couple of minutes for all ligands, thus use only top x as show case here\n", "bioactivities_all = remote.bioactivities.all_bioactivities(_top_n=10)\n", "bioactivities_all" ] @@ -4256,7 +4261,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "99eab2720c1543e0a486ac6e8ec25f1a", + "model_id": "6cc68c6ae8ef4411975bf906438aa5a6", "version_major": 2, "version_minor": 0 }, @@ -4279,7 +4284,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "4764dcbef11549d6bec5bc4a3a8c8da1", + "model_id": 
"b753f29f32fc4ee98152800ee4ac69be", "version_major": 2, "version_minor": 0 }, @@ -4460,7 +4465,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "dc76d14a444f446982d58014506e8684", + "model_id": "1cd68f0bff914a16a6297f0be2004471", "version_major": 2, "version_minor": 0 }, @@ -4519,7 +4524,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "d1d764de05fd4ec48891e1b6b613c797", + "model_id": "3c5a16eace8b47b3aa35e70c5a6b6a27", "version_major": 2, "version_minor": 0 }, @@ -4705,7 +4710,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "5ba34257783b4f1eacc359e75d3e1bbf", + "model_id": "5a575ef2341e432cab1a7277296b5fee", "version_major": 2, "version_minor": 0 }, @@ -10547,22 +10552,11 @@ "source": [ "## Structure conformations\n", "\n", - "KLIFS also provides a lot of conformation parameters for each structure:\n", - "\n", - "> The Structure conformation endpoint returns a comprehensive set of conformational annotations from KLIFS for the user-specified structure ID(s). This ranges from aC-helix annotations to DFG rotational analyses.\n", - "\n", - "https://dev.klifs.net/swagger_v2/#/Structures/get_structure_conformation\n", + "Explore access to information about structure conformations.\n", "\n", "__Note__: Only available remotely (not locally)." ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "__Remote__" - ] - }, { "cell_type": "markdown", "metadata": {}, @@ -11205,11 +11199,7 @@ "source": [ "## Residue modifications (structure's modified residues)\n", "\n", - "KLIFS also provides all modified residues for each structure if available:\n", - "\n", - "> The Structure modified residues endpoint returns a list of all residues that have undergone phosphorylation or sulfation for a specific structure. 
When the residues are within the KLIFS binding site, also the KLIFS numbering for that residue is provided.\n", - "\n", - "https://dev.klifs.net/swagger_v2/#/Structures/get_structure_modified_residues\n", + "Explore access to information about residue modifications in structures.\n", "\n", "__Note__: Only available remotely (not locally)." ] @@ -11221,13 +11211,6 @@ "### Residue modifications from structure KLIFS ID" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "__Remote__" - ] - }, { "cell_type": "markdown", "metadata": {}, @@ -11381,11 +11364,11 @@ "| Entity | _mol2_ (default) | _pdb_ |\n", "|-----------------------|:----------------:| :-------------:|\n", "| __complex (default)__ | local / remote | local / remote |\n", - "| __protein__ | local / remote | - |\n", + "| __protein__ | local / remote | NaN |\n", "| __pocket__ | local / remote | local |\n", "| __ligand__ | local / remote | local |\n", - "| __water__ | local | - |\n", - "| __ions__ | local | - |\n", + "| __water__ | local | NaN |\n", + "| __ions__ | local | NaN |\n", "\n", "__Note__: The KLIFS _mol2_ files contain (implicit and explicit) hydrogens, whereas the _pdb_ files do not. Therefore, the _pdb_ files contain less atoms as compared to the _mol2_ files." 
] @@ -11780,9 +11763,7 @@ "output_type": "stream", "text": [ "Complex (mol2): Number of atoms: 3604\n", - "Protein (mol2): Number of atoms: 3552\n", - "Pocket (mol2): Number of atoms: 1156\n", - "Pocket (pdb): Number of atoms: 1156\n" + "Protein (mol2): Number of atoms: 3552\n" ] }, { @@ -11796,7 +11777,8 @@ "name": "stdout", "output_type": "stream", "text": [ - "Ligand (mol2): Number of atoms: 49\n" + "Pocket (mol2): Number of atoms: 1156\n", + "Pocket (pdb): Number of atoms: 1156\n" ] }, { @@ -11810,6 +11792,7 @@ "name": "stdout", "output_type": "stream", "text": [ + "Ligand (mol2): Number of atoms: 49\n", "Ligand (pdb): Number of atoms: 31\n", "Water (mol2): Number of atoms: 3\n" ] @@ -11859,7 +11842,7 @@ "data": { "image/png": "\n", "text/plain": [ - "" + "" ] }, "execution_count": 113, @@ -11880,7 +11863,7 @@ "data": { "image/png": "\n", "text/plain": [ - "" + "" ] }, "execution_count": 114, @@ -11935,7 +11918,7 @@ "data": { "image/png": "\n", "text/plain": [ - "" + "" ] }, "execution_count": 116, @@ -11956,7 +11939,7 @@ "data": { "image/png": "\n", "text/plain": [ - "" + "" ] }, "execution_count": 117, @@ -11984,7 +11967,7 @@ "data": { "image/png": "\n", "text/plain": [ - "" + "" ] }, "execution_count": 118,