`PDBManager` - Bug fixes, adding necessary changes to export only first PDB model, and merging-in latest updates from `master` (#311)

* add PDB manager #270
* add download method
* add clustering utilities
* `PDBManager` - Bug fixes, adding necessary changes to export only first PDB model, and merging-in latest updates from `master` (#309)
* Fix graph sequence (atomistic graphs in `initialise_graph_with_metadata` had duplicated residues) (#268)
* Fix param name typo in function docstring
* fix: atomistic graph only has sequence residues for CA atom in `initialise_graph_with_metadata`
* fix: avoid changing dataframe when extracting rows
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add: test sequence feature in graphs
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix graph sequence feature (#268)
* fix matplotlib deprecation
* fix test bug
* change build to ubuntu-latest
* remove unnecessary selection

---------

Co-authored-by: Cam <73625486+cimranm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Arian Jamasb <arjamasb@gmail.com>

* Add dataset splits functionality and add new documentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Resolve merge conflicts with remote
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Remove unused test
* Address lingering SonarCloud concerns
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add deposition date parsing
* remove pdb.py
* add chain extraction util
* add chain writing method
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* After fixing merge conflicts, add more filters and add time-based splits
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more
information, see https://pre-commit.ci
* Fix up SonarCloud concerns
* Improve verbiage surrounding PDB resolutions
* Simplify code and improve variable names
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Track names of splits in df_splits
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix column naming during merging of DataFrame splits
* add additional properties
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* refactor clustering to allow file caching and overwriting
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add description to assert statements
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Add extra documentation around clustering function, and address small formatting issues
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add method to write selection to CSV
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* improve from_fasta documentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Enable code reuse for length filters
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Minor documentation changes to FASTA write-out function
* Add ability to perform most API calls for a subset of splits
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update .gitignore
* Fix missing download call, and add more documentation to download functions
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix small bug when merging different splits together
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix bug in length filtering functions, fix print bugs in utils, and add ability to write-out PDB files after selecting a subset of chains to include in them
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix string formatting
* Update PDB write-out logic and documentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Add PDB download workaround for PDBs that can no longer be downloaded
* Make exception more specific
* Add TQDM for data split exporting
* Add improved error message for non standard node funcs #274 (#275)
* Add improved error message for non standard node funcs #274
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* clean up unused files and move docs from root (#276)
* clean up unused files and move docs from root
* remove setup.cfg
* prelim path support #269 (#277)
* prelim path support #269
* fix import error
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update changelog

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Switch to miniconda for build (#278)
* switch to miniconda for build
* update docker build
* switch to checkout v3
* Improve altloc handling (#263)
* Fix bug in `add_k_nn_edges`. `kneighbors_graph(X=dist_mat, ...)` is wrong since `X` may not be a distance matrix. This leads to wrong results which may be similar to correct ones.
* Extend `add_k_nn_edges`.
* Add types to docstring
* Update changelog
* Add `kind_name` argument
* Test `filter_distmat`
* Set default value of `long_interaction_threshold` to 0
* Fix filtering bug in `add_k_nn_edges`
* Test `add_k_nn_edges`
* Refactor with `add_edge`
* Fix bug for empty `edges_to_excl`
* Improve `convert_nx_to_pyg`
* Fix bug in `plot_pyg_data`
* Test `convert_nx_to_pyg` on multimers
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update `CHANGELOG.md`
* Fix version in `CHANGELOG.md`
* Handle corner cases
* Handle NaNs in coordinates
* Add PyG install to CI
* typo in CI config
* bump torch versions in CI
* make pyg-related tests conditional on pyg installation
* Try fixing graph attributes
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix typo and extend amino acid 3to1, 1to3 mappings
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Adapt imports of amino acid codes
* add semicolon to version
* remove wildcard version number for pyyaml
* fix typo
* fix additional typos
* Extend aggregation to vectors
* Implement `aggregate_feature_over_residues`
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Add docstring and aggregation type
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* import literal from typing extensions
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Add missing `median` in exception message
* Fix `nullcontext`
* fix dataset test
* fix division by zero errors in edge colouring
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update changelog
* Separate and improve `remove_alt_locs`. Removal of alt_locs is separated from removal of insertions.
Additionally, alt_locs with higher occupancies are now kept
* Test `remove_alt_locs`
* Rename test
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Set `insertions=True` by default
* Make `alt_locs` configurable (TODO `include` case)
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* use typing_extensions literal for 3.7 compatibility
* use typing extensions literal for 3.7 compatibility
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* improve hbond donor/acceptor assignment robustness
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* replace trailing ":" in insertions
* fix test and hbond granularity inference
* Add altloc identifier to node ID
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix tests
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix test
* fix test
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* actually fix test
* update changelog
* Fix typo

---------

Co-authored-by: Arian Jamasb <arjamasb@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Df processing #216 (#222)
* docstrings and df processing funcs #216
* docstrings
* add test
* lint test
* fix test
* fix typo in test
* Update changelog
* fix typo in test
* fix broken test
* fix broken test
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add hetatm removal to test
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* use atomic granularity
* fix
syntax error
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix bugs in test
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix test
* typo

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Minor patch `convert_nx_to_pyg` #280 (#281)
* nx_to_pyg bug fix #280
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update changelog

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Arian Jamasb <arjamasb@gmail.com>

* changes for 1.6.0 (#279)
* changes for 1.6.0
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Enable PDBManager root to be set to an arbitrary location
* add initial tests
* update changelog
* add tutorial notebook
* Allow all chains in a complex to be exported together
* add module-level import
* Remove old, unused PDBManager prototype file
* add parsing & checks for unavailable PDB structures
* fix download checker
* actually fix download checker
* add availability filter
* FoldComp ML Datasets (#284)
* add foldcomp dataset util
* clean up
* add import warnings
* add foldcomp dataset extra dependencies
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* exclude foldcomp from notebook tests.
download too big :(
* update changelog
* add lightning datamodule wrapper
* add transform functionality
* docs: add new module to API reference
* update notebook
* fix: fix paths issue on setup
* add foldcomp dataset tutorial to docs
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add stage param to setup

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Default to export model 1's chains only in PDBManager, and clean-up notebook and utilities
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add tutorial nblink
* add tutorial to datasets sections
* mv pdb data to ml API
* rm pyg dataset import
* rm unused code
* fix annotation
* add MMTF download format
* refactor dependency utils
* refactor graphein.utils.utils.import_message
* refactor graphein.protein.utils.is_tool
* update .gitignore
* ignore cif too
* ignore cif too
* ignore foldcomp files
* catch straggling erroneous imports
* ignore mol2
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update folding utils
* add max batch option
* add foldcomp utils
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add notebook updates [WIP]
* move manager class into graphein.ml
* remove datasets init
* fix import util refactor I didn't catch
* add PDBmanager to __init__
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix oligomeric filtering
* update notebook
* fix dataset init
* fix protein.coord renaming in tensor module
* add try/except to pyg-related datasets
* add try/except to pyg-related datasets
* add mmseqs to CI build
* rollback dssp install to conda
* ignore pdb manager notebook in minimal tests
* fix code smell
* fix metrics
* shorten line lengths
* add minimum scipy version
* remove
python 3.7 from CI
* Add Torch 2.0.0 to CI
* add note about multiple split strategies
* add torch cluster install to CI
* update dockerfile to torch 2.0
* switch docker pytorch 1.13 for VMD python version conflict
* switch out torchtyping for jaxtyping
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update tensor shape syntax for jaxtyping
* remove torch-dependent tests from minimal install testing
* update test ignores
* install dssp from apt, rather than conda in docker
* update typing extensions version
* Update citation (#287)
* update citation
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Support MMTF & rename pdb_path to path throughout (#293)
* rename pdb_path to path throughout
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* install from biopandas bleeding edge
* fix bleeding edge biopandas install
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update to bleeding edge biopandas
* [pre-commit.ci] pre-commit autoupdate (#294)
* [pre-commit.ci] pre-commit autoupdate updates: - [github.com/psf/black: 23.1.0 → 23.3.0](psf/black@23.1.0...23.3.0)
* pin pandas to <2.0.0
* Bump AF2 version

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Arian Jamasb <arjamasb@gmail.com>

* update path in notebooks
* Add missing import #296 (#297)
* update changelog

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Prep for 1.7.0 release (#292)
* update version string
* update readme
* update doc version
* update changelog
* Add autopublish workflow (#298)
* Add autopublish workflow
* [pre-commit.ci] auto fixes from pre-commit.com hooks for
more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* update version for 1.7.0
* update workflow version
* remove rogue print statement (#302)
* Consistent conversion to undirected graphs (#301)
* Fix `convert_nx_to_pyg` to return undirected graph
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix symmetrization of edges of different kinds
* Clean
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix case when `edge_index` is not desired
* Test directed/undirected conversion consistency
* Update contributors
* Update CHANGELOG.md

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Add graphein install to tutorial notebook #306
* Tensor fixes (#307)
* add PSW to nonstandard residues
* improve insertion and non-standard residue handling
* refactor chain selection
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* remove unused verbosity arg
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix chain selection in tests
* fix chain selection in tutorial notebook
* fix notebook chain selection
* fix chain selection typehint
* Update changelog

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Add NLW as a nonstandard residue
* Export only first model of each downloaded PDB file, and typecast model_id column to str to avoid to_pdb() errors
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Track split names for edge cases in dataset splitting
* Add fix for scenario where downloaded PDB files do not contain ATOMs for an entry's listed chains
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more
information, see https://pre-commit.ci

---------

Co-authored-by: Cam <73625486+kamurani@users.noreply.github.com>
Co-authored-by: Cam <73625486+cimranm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Arian Jamasb <arjamasb@gmail.com>
Co-authored-by: Anton Bushuiev <67932762+anton-bushuiev@users.noreply.github.com>
Co-authored-by: Ryan Greenhalgh <35999546+rg314@users.noreply.github.com>

* Add structure format parameter to allow mmtf manipulation
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update changelog

---------

Co-authored-by: Alex Morehead <acmwhb@missouri.edu>
Co-authored-by: Cam <73625486+kamurani@users.noreply.github.com>
Co-authored-by: Cam <73625486+cimranm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Anton Bushuiev <67932762+anton-bushuiev@users.noreply.github.com>
Co-authored-by: Ryan Greenhalgh <35999546+rg314@users.noreply.github.com>