Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reorder bond, angle, and dihedral in write_lammpsdata #1071

Merged
merged 10 commits into from
Jan 30, 2023

Conversation

jaclark5
Copy link
Contributor

@jaclark5 jaclark5 commented Dec 10, 2022

PR Summary:

Resolves #1070

I haven't fixed the tests to accept these changes because I expect there to be comments.

PR Checklist


  • Includes appropriate unit test(s)
  • [NA] Appropriate docstring(s) are added/updated
  • Code is (approximately) PEP8 compliant
  • Issue(s) raised/addressed?

@daico007
Copy link
Member

I think I am good with this change! I didn't realize the order was not enforced in the lammps writer.

@codecov
Copy link

codecov bot commented Dec 20, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.54%. Comparing base (9d618c0) to head (c7a0099).
Report is 132 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1071      +/-   ##
==========================================
+ Coverage   90.49%   90.54%   +0.04%     
==========================================
  Files          61       61              
  Lines        6157     6186      +29     
==========================================
+ Hits         5572     5601      +29     
  Misses        585      585              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jaclark5
Copy link
Contributor Author

jaclark5 commented Dec 20, 2022

The reported failures are due to a merger of the main branch... why doesn't this repository have a develop branch?

@daico007
Copy link
Member

I can look into the issue with the test (seems like it's stemming from something outside the scope of this PR). Regarding missing of the develop branch, we just don't feel the need to have one (at least up until now). For every new feature added, we usually try to make sure it's (kinda) ready for production (passed all the CIs test and don't interfere with other existing features, and hence, can be directly merged to main) but I am open to start having a develop branch if you think would make contributing to it easier (i.e., not having to worry about issue outside the scope of the PR), that way we can have things move faster and not having to worry about having a not-so-clean main. Do you have any opinion @CalCraven @chrisiacovella @bc118 ?

@CalCraven
Copy link
Contributor

A develop branch is definitely a nice feature to have. The way I look at it, essentially the dev version of this package is similar to a dev branch, and in this case then the most recent minor release available on Conda is the main branch that has been fully tested and stabilized. However, there are definitely some added benefits to having a more three tiered system with the Conda release, a main branch, and then a dev branch, as noted by @jaclark5. Something to consider adding (and something we have added to GMSO where the majority of the MoSDeF additions have focused in the past year).

Copy link
Contributor

@CalCraven CalCraven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. I still have some concerns with the time it takes for this sorting mechanism, and I'm wondering if there might be something a little more efficient. In the worst case scenario, which looks to be the charmm dihedrals and impropers, we scale by 10N^4. So that can quickly get out of hand, with all of those nested for loops. I think it's a rare case, where you have like 100's of atomtypes but not something entirely impossible.

mbuild/formats/lammpsdata.py Outdated Show resolved Hide resolved
mbuild/formats/lammpsdata.py Outdated Show resolved Hide resolved
mbuild/formats/lammpsdata.py Outdated Show resolved Hide resolved
mbuild/formats/lammpsdata.py Outdated Show resolved Hide resolved
mbuild/formats/lammpsdata.py Outdated Show resolved Hide resolved
mbuild/formats/lammpsdata.py Outdated Show resolved Hide resolved
mbuild/formats/lammpsdata.py Outdated Show resolved Hide resolved
mbuild/formats/lammpsdata.py Outdated Show resolved Hide resolved
mbuild/formats/lammpsdata.py Outdated Show resolved Hide resolved
mbuild/formats/lammpsdata.py Outdated Show resolved Hide resolved
@CalCraven
Copy link
Contributor

CalCraven commented Dec 22, 2022

I think a way we could improve the sorting would be to replace a line like:

atom_types = list(set(atom.type for atom in structure.atoms))
atom_types.sort(key=natural_sort)
lx = len(atom_types)
angle_sets = [x[-1] for _, x in unique_angle_types.items()]
ordered_angle_sets = [
    (atom_types[x], atom_types[y])
    for x in range(lx)
    for y in range(x, lx)
    if (atom_types[x], atom_types[y]) in angle_sets
]

which scales at O(N^3)

with a keyed sort which scales on O(NlogN) as far as I'm aware.

atom_types = list(set(atom.type for atom in structure.atoms))
atom_types.sort(key=natural_sort)
angle_sets = [
    (x[2], x[-1][0], x[-1][1]) for _, x in unique_angle_types.items()
]
angle_sets.sort(key=lambda x: atom_types.index(x[0])) #improved sorting function
ordered_angle_sets = angle_sets

@jaclark5
Copy link
Contributor Author

jaclark5 commented Jan 8, 2023

Sure the sorting function is more efficient if there's a single quantity to sort by, but unless I'm reading it wrong, your proposed solution isn't sorting the second or third atom_types in the angle and so is insufficient. I can use the sorting function with something like this:

magnitude = np.ceil(np.log10(len(atom_types)))
angle_sets.sort(
    key=lambda x: atom_types.index(x[0]) * 10**(2*mangnitude) 
        + atom_types.index(x[1]) * 10**mangnitude 
        + atom_types.index(x[2]))
)

@CalCraven
Copy link
Contributor

Sure the sorting function is more efficient if there's a single quantity to sort by, but unless I'm reading it wrong, your proposed solution isn't sorting the second or third atom_types in the angle and so is insufficient. I can use the sorting function with something like this:

That is correct, I think your solution here completes it!

We have discussed this further among the devs, and I tested out some 100 unique atomtypes 100,000 atom systems, and the writing isn't the holdup as compared to just atomtyping the system, so I think once the tests are passing here, we will be good to merge this. I'll also ping the original issue #1070 once this feature is implemented on the GMSO side of things as well.

@jaclark5
Copy link
Contributor Author

What is going on with the docs?

@daico007
Copy link
Member

Oh, I think the docs sometimes just doesn't build properly for PR. I will try to review this PR today. @CalCraven seems like most of your comments have been addressed too.

Copy link
Member

@daico007 daico007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good! Tried out the sorting locally and I think those are quite neat.

Copy link
Contributor

@CalCraven CalCraven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for noticing this feature.

@jaclark5
Copy link
Contributor Author

I'm not seeing the output for LGTM, is that because the docs failed? It seems that there's an unknown attribute "spin"?

@CalCraven
Copy link
Contributor

CalCraven commented Jan 30, 2023

I believe the LGTM checks as of 22-12-16 have been discontinued and are implemented now in CodeQL. So I'm not too worried about that. I'm seeing an issue in read the docs for mbuild.coordinate_transform import translate_to as well as mbuild.coordinate_transform import spin and mbuild.coordinate_transform import rotate. These functions aren't touched in this PR, so it shouldn't be an issue here. It looks like maybe changes implemented in this PR affected them, and weren't caught due to other issues with the docs, so I think we can merge this as is, and address the docs issues in another PR @daico007.

@daico007 daico007 merged commit 1816bcd into mosdef-hub:main Jan 30, 2023
@jaclark5 jaclark5 deleted the issue_lammpsdataformat branch January 30, 2023 15:19
chrisiacovella pushed a commit to chrisiacovella/mbuild that referenced this pull request Feb 2, 2023
* Reorder bond, angle, and dihedral in write_lammpsdata

* Update ordering charm angles/dihedrals

* Bug fix tests

* Update bond/angle/dihedral sorting algorithm

---------

Co-authored-by: Co Quach <43968221+daico007@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Order bonds, angles, and dihedrals consistently for lammpsdata files
3 participants