Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lammps molecule number #1073

Merged
merged 24 commits into from
Feb 3, 2023
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
8ad8bbd
Allow kwargs for to_parmed to be passed through Compound.save()
jaclark5 Dec 11, 2022
3c114ad
infer_hierarchy takes the names of the first compounds without box de…
jaclark5 Dec 11, 2022
4f40f1b
Merge branch 'main' into lammps_molecule_number
daico007 Dec 12, 2022
b3efabd
Merge branch 'main' into lammps_molecule_number
daico007 Dec 13, 2022
0fa2abc
Merge branch 'main' into lammps_molecule_number
daico007 Dec 13, 2022
1427a76
Merge branch 'main' into lammps_molecule_number
jaclark5 Dec 15, 2022
2a1af5b
Update pull_hierarchical_attr to be an exposed function, pull_residue…
jaclark5 Dec 15, 2022
5c3ea3e
Merge branch 'main' into lammps_molecule_number
jaclark5 Jan 9, 2023
03a3ad9
Merge branch 'main' into lammps_molecule_number
jaclark5 Jan 18, 2023
5d150c7
Use object ids in pull_residues
jaclark5 Jan 27, 2023
07d0899
Merge branch 'lammps_molecule_number' of github.com:jaclark5/mbuild i…
jaclark5 Jan 27, 2023
e3cb97f
Merge branch 'main' into lammps_molecule_number
jaclark5 Jan 30, 2023
3b8fcbb
code coverage
jaclark5 Jan 30, 2023
5391eff
default to infer compounds
jaclark5 Jan 30, 2023
629dfcd
Merge branch 'lammps_molecule_number' of github.com:jaclark5/mbuild i…
jaclark5 Jan 30, 2023
adadb5d
Revise docs and tests
jaclark5 Jan 31, 2023
a75aa9c
Fixed a bug with residues being named from parents with an empty stri…
CalCraven Feb 1, 2023
70fb932
Merge branch 'main' of https://github.com/mosdef-hub/mbuild into lamm…
CalCraven Feb 1, 2023
7b9a41c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 1, 2023
1c91a67
Fixes for pre-commit config in doc string of pull_residues in convers…
CalCraven Feb 1, 2023
cf2979e
Merge branch 'lammps_molecule_number' of https://github.com/jaclark5/…
CalCraven Feb 1, 2023
41cf89e
Tests for coarse grained residue matching using new pull_residues fun…
CalCraven Feb 2, 2023
cafc60f
Merge branch 'main' into lammps_molecule_number
jaclark5 Feb 2, 2023
5b3b49a
Bug fix catalog_bondgraph_type
jaclark5 Feb 3, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,9 @@ test-output.xml
# C extensions
*.so

# Tmp Files
*.swp

# Packages
*.egg
*.egg-info
Expand Down
15 changes: 13 additions & 2 deletions mbuild/compound.py
Original file line number Diff line number Diff line change
Expand Up @@ -2579,6 +2579,7 @@ def save(
residues=None,
combining_rule="lorentz",
foyer_kwargs=None,
parmed_kwargs=None,
**kwargs,
):
"""Save the Compound to a file.
Expand Down Expand Up @@ -2623,6 +2624,14 @@ def save(
`write_mcf`, or `parmed.Structure.save`.
See `parmed structure documentation
<https://parmed.github.io/ParmEd/html/structobj/parmed.structure.Structure.html#parmed.structure.Structure.save>`_
parmed_kwargs : dict, optional, default=None
Keyword arguments to provide to :meth:`mbuild.Compound.to_parmed`
**kwargs
Depending on the file extension these will be passed to either
`write_gsd`, `write_hoomdxml`, `write_lammpsdata`, `write_mcf`, or
`parmed.Structure.save`.
See https://parmed.github.io/ParmEd/html/structobj/parmed.structure.
Structure.html#parmed.structure.Structure.save

Other Parameters
----------------
Expand Down Expand Up @@ -2677,6 +2686,7 @@ def save(
residues,
combining_rule,
foyer_kwargs,
parmed_kwargs,
**kwargs,
)

Expand Down Expand Up @@ -2941,8 +2951,9 @@ def to_parmed(
checking against Compound.name.
show_ports : boolean, optional, default=False
Include all port atoms when converting to a `Structure`.
infer_residues : bool, optional, default=False
Attempt to assign residues based on names of children.
infer_residues : bool, optional, default=True
Attempt to assign residues based on names of children for the level
of compound hierarchy without a box definition.
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved

Returns
-------
Expand Down
51 changes: 45 additions & 6 deletions mbuild/conversion.py
Original file line number Diff line number Diff line change
Expand Up @@ -935,6 +935,7 @@ def save(
residues=None,
combining_rule="lorentz",
foyer_kwargs=None,
parmed_kwargs=None,
**kwargs,
):
"""Save the Compound to a file.
Expand Down Expand Up @@ -975,6 +976,8 @@ def save(
geometric combining rules respectively.
foyer_kwargs : dict, optional, default=None
Keyword arguments to provide to `foyer.Forcefield.apply`.
parmed_kwargs : dict, optional, default=None
Keyword arguments to provide to :meth:`mbuild.Compound.to_parmed`
**kwargs
Depending on the file extension these will be passed to either
`write_gsd`, `write_hoomdxml`, `write_lammpsdata`, `write_mcf`, or
Expand Down Expand Up @@ -1047,8 +1050,13 @@ def save(
if os.path.exists(filename) and not overwrite:
raise IOError("{0} exists; not overwriting".format(filename))

if not parmed_kwargs:
parmed_kwargs = {}
structure = compound.to_parmed(
box=box, residues=residues, show_ports=show_ports
box=box,
residues=residues,
show_ports=show_ports,
**parmed_kwargs,
)
# Apply a force field with foyer if specified
if forcefield_name or forcefield_files:
Expand Down Expand Up @@ -1115,7 +1123,7 @@ def to_parmed(
title="",
residues=None,
show_ports=False,
infer_residues=False,
infer_residues=True,
):
"""Create a Parmed Structure from a Compound.

Expand All @@ -1135,8 +1143,9 @@ def to_parmed(
against Compound.name.
show_ports : boolean, optional, default=False
Include all port atoms when converting to a `Structure`.
infer_residues : bool, optional, default=False
Attempt to assign residues based on names of children.
infer_residues : bool, optional, default=True
Attempt to assign residues based on names of children for the level
of compound hierarchy without a box definition.
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved

Returns
-------
Expand All @@ -1147,14 +1156,44 @@ def to_parmed(
--------
parmed.structure.Structure : Details on the ParmEd Structure object
"""

def _pull_hierarchical_attrs(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the recursive approach that you are doing with this method! My only concern is that I am not sure _box == None is a good indicator of to refer a residue, since many of us often convert a compound without a box to parmed and assign the box information later (during the writing out step). One method I think might be useful is Compound.is_independent(), which basically indicate if a Compound has any external bond or not (i.e., if it's a stand-alone molecule), but it will require minor alteration of this method and might make not as versatile as you have it now (i.e., when you traveling down the hierarchy and bump into a Compound that is not independent, the residue should be its parent but not itself).

Also, I think I have tried to solve similar issue before (infer relevant residues/molecule information from the hierarchy) and came up with several different approaches:

  1. Bottom up approach, determine the lowest level Compound that form a full molecule
def _infer_molecules(compound):
    """Bottom up approach to infer residues"""
    connected_subgraph = compound.bond_graph.connected_components()
    residues = IndexedSet()
    for molecule in connected_subgraph:
        if len(molecule) == 1:
            ancestors = [molecule[0]]
        else:
            ancestors = IndexedSet(molecule[0].ancestors())
            for particle in molecule[1:]:
                # This works because particle.ancestors traversed, and hence
                # the lower level will be in the front.
                # The intersection will be left at the end,
                # ancestor of the first particle is used as reference.
                # Hence, this called will return the lowest-level Compound]
                # that is a molecule
                ancestors = ancestors.intersection(
                    IndexedSet(particle.ancestors())
                )
        residues.add(ancestors[0].name)
    return list(residues)
  1. Top down approach, always take the 2nd level Compound (if they are independent) as residue
def _infer_groups(compound):
    """Top down, second level if any is available"""
    residues = IndexedSet()
    if all(not child.children for child in compound.children):
        residues.add(compound.name)
    else:
        residues.update([child.name for child in compound.children])
    return residues

I believe the 1st approach should work in most cases, but the 2nd may be useful if you want better control of the grouping (e.g., a container make up of water, methanol and ethanol and you want to group methanol and ethanol under an alcohol group/Compound and water by itself).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the failing test, it's the test_resname_parmed in test_compound.py, specifically, when we try convert a compound (with not top level box) to parmed, so the infer_hierarchy step stops short at determining the lower level residues. A quick fix for this will be to add a line to define a box for that compound/system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I actually made many changes to this branch already too but haven't gotten back to it either. I'll wrap up my changes and push them soon. I think it would be easier to brain storm at that point.

compound,
target_attr="name",
hierarchical_attr="children",
target_attr_conditions={"_box": None},
):
"""List of attributes from children in an obscure hierarchy."""
value_list = []
if any(
[
getattr(compound, x) == y
for x, y in target_attr_conditions.items()
]
):
value_list.append(getattr(compound, target_attr))
else:
for child in getattr(compound, "children"):
value_list.extend(
_pull_hierarchical_attrs(
child,
target_attr=target_attr,
hierarchical_attr=hierarchical_attr,
target_attr_conditions=target_attr_conditions,
)
)

return value_list

structure = pmd.Structure()
structure.title = title if title else compound.name
atom_mapping = {} # For creating bonds below
guessed_elements = set()

# Attempt to grab residue names based on names of children
# Attempt to grab residue names based on names of children for the first
# level of hierarchy without a box definition
if not residues and infer_residues:
residues = list(set([child.name for child in compound.children]))
residues = _pull_hierarchical_attrs(compound)

if isinstance(residues, str):
residues = [residues]
Expand Down