Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update VASP input sets for atomate2 compatibility #3835

Merged
merged 22 commits into from
May 23, 2024

Conversation

esoteric-ephemera
Copy link
Contributor

@esoteric-ephemera esoteric-ephemera commented May 16, 2024

This PR continues #3484 by updating the pymatgen.io.vasp.DictSet class to include features from the former atomate2 VaspInputGenerator class.
This allows transitioning atomate2 to exclusively use pymatgen input sets in materialsproject/atomate2#854.

These added features (such as auto_lreal and auto_ispin) are initialized to values for which they won't trigger in existing PMG classes, and are thus non-breaking.

Some of the logic in DictSet had to be reordered for consistency with its implementation in atomate2, but these changes appear to be non-breaking. Additionally, some checks (like ensuring that KSPACING and KPOINTS are not both set by the user) have been corrected. In the previous example, pymatgen would only check that user_kpoints_settings was not None, however user_kpoints_settings = user_kpoints_settings or {} occurs immediately before this check, thus potentially erroneously flagging calculations with no user_kpoints_settings.

The only potentially breaking change is to the initialization of MAGMOMs. I've made any MAGMOMs set in user_incar_settings the first choice (as in atomate2) before pymatgen searches for MAGMOMs in the input set's config, etc. The subsequent MAGMOM search order is the same.

Per atomate2 issue #724, one of the added features to DictSet is auto_metal_kpoints / reciprocal_density_metal tag in KPOINTS. This lets the user set a higher kpoint density for a metal while still using KPOINTS, rather than KSPACING.

There was also a minor bug in the MatPESStaticSet that would warn the user that the POTCAR functional was set incorrectly - that has been corrected.

Tagging @utf, @Andrew-S-Rosen, @JaGeo, @mkhorton, and @rkingsbury as they have been involved in developing these sets and/or atomate2.

@Andrew-S-Rosen
Copy link
Member

All makes sense to me!

@JaGeo
Copy link
Member

JaGeo commented May 17, 2024

@janosh, I will try to review soon.

def test_dict_set_alias():
assert isinstance(DictSet(), VaspInputSet)
with pytest.warns(
FutureWarning, match="DictSet is deprecated, and will be removed on 2025-12-31\n; use VaspInputSet"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DanielYang59 looks like your recent changes to @deprecated in monty put the new line in front of the semicolon. looks weird to start a line with a semicolon

- 2025-12-31\n;
+ 2025-12-31;\n

Copy link
Contributor

@DanielYang59 DanielYang59 May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this and apologize for this overlook.

https://github.com/materialsvirtuallab/monty/blob/7568d6abcfc8dec14ba3022523656d5f9217835e/monty/dev.py#L94-L109

It looks like the semicolon was original there, and the deprecation message we inserted was too long so a line break was inserted. It would be a bit tricky to properly format this without breaking, I would think about this later

        msg = f"{old.__name__} is deprecated"


        if deadline is not None:
            msg += f", and will be removed on {_deadline:%Y-%m-%d}\n"


        if replacement is not None:
            if isinstance(replacement, property):
                r = replacement.fget
            elif isinstance(replacement, (classmethod, staticmethod)):
                r = replacement.__func__
            else:
                r = replacement
            msg += f"; use {r.__name__} in {r.__module__} instead."


        if message:
            msg += "\n" + message

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other change in monty to make if you can add it to the same PR for the message change: this decorator is just for functions. Would be great if there's a way to decorate classes too

VaspInputGenerator = VaspInputSet


@deprecated(replacement=VaspInputSet, deadline=(2025, 12, 31))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

open to suggestions on the DictSet deprecation deadline here. it's currently set to 19.5 months

@shyuep
Copy link
Member

shyuep commented May 17, 2024

@janosh As I told you before, pls consult before making this kind of cosmetic stuff. Otherwise, you are going to be doing a lot of redundant work. Given that most people just jump to the relevant class in the IDE, why would this split be needed?

@janosh
Copy link
Member

janosh commented May 17, 2024

Given that most people just jump to the relevant class in the IDE, why would this split be needed?

as i wrote in the commit, huge files are harder to navigate (even with IDEs). if that weren't the case, there'd be no need for packages anywhere else in pymatgen. e.g. i could ask why not have all of pymatgen.core in a single file? do you see an issue, given this is not breaking?

@shyuep
Copy link
Member

shyuep commented May 17, 2024

@janosh The issue is you not consulting before making this change.

  1. The sets are a logical set of functionality. It is ridiculous to claim that just because sets are in one file then the whole of pymatgen should be in one big file.
  2. The sets.py was by no means that large. Certainly no larger than something like numpy.linalg.
  3. Even if we took a mind to split it up, the way you did it is not the best way of organization. For example, there is no reason why the submodules, which is an implementation detail, need to be public. That is but one issue I have with the organization. Too many for me to want to spend the effort to type it out here.
  4. Finally, even if this is a good idea and should be done, do it in ANOTHER PR. You are polluting a simple PR here with a massive cosmetic change that is unrelated.

I repeat again - it is a question of consultation. Learn to do it. For now, revert your commits in this PR. We settle this PR before dealing with the separate question of roganization.

Comment on lines 335 to 399
def write_input(
self,
output_dir: str,
make_dir_if_not_present: bool = True,
include_cif: bool = False,
potcar_spec: bool = False,
zip_output: bool = False,
) -> None:
"""
Writes a set of VASP input to a directory.

Args:
output_dir (str): Directory to output the VASP input files
make_dir_if_not_present (bool): Set to True if you want the
directory (and the whole path) to be created if it is not
present.
include_cif (bool): Whether to write a CIF file in the output
directory for easier opening by VESTA.
potcar_spec (bool): Instead of writing the POTCAR, write a "POTCAR.spec".
This is intended to help sharing an input set with people who might
not have a license to specific Potcar files. Given a "POTCAR.spec",
the specific POTCAR file can be re-generated using pymatgen with the
"generate_potcar" function in the pymatgen CLI.
zip_output (bool): If True, output will be zipped into a file with the
same name as the InputSet (e.g., MPStaticSet.zip)
"""
if potcar_spec:
vasp_input = None
if make_dir_if_not_present:
os.makedirs(output_dir, exist_ok=True)

with zopen(f"{output_dir}/POTCAR.spec", mode="wt") as file:
file.write("\n".join(self.potcar_symbols))

for key in ["INCAR", "POSCAR", "KPOINTS"]:
if (val := getattr(self, key.lower())) is not None:
with zopen(os.path.join(output_dir, key), mode="wt") as file:
file.write(str(val))
else:
vasp_input = self.get_input_set()
vasp_input.write_input(output_dir, make_dir_if_not_present=make_dir_if_not_present)

cif_name = ""
if include_cif and vasp_input is not None:
struct = vasp_input["POSCAR"].structure
cif_name = f"{output_dir}/{struct.formula.replace(' ', '')}.cif"
struct.to(filename=cif_name)

if zip_output:
filename = f"{type(self).__name__}.zip"
with ZipFile(os.path.join(output_dir, filename), mode="w") as zip_file:
for file in ["INCAR", "POSCAR", "KPOINTS", "POTCAR", "POTCAR.spec", cif_name]:
try:
zip_file.write(os.path.join(output_dir, file), arcname=file)
except FileNotFoundError:
pass

try:
os.remove(os.path.join(output_dir, file))
except (FileNotFoundError, PermissionError, IsADirectoryError):
pass

for key, val in self.files_to_transfer.items():
with zopen(val, "rb") as fin, zopen(str(Path(output_dir) / key), "wb") as fout:
shutil.copyfileobj(fin, fout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, write_input should be attached to the InputSet rather than the generator (although I see that we're conflating the two here for backward compatibility reasons). That's OK, but is there a way to

  1. make the call signature more compatible with the base class (e.g., directory instead of output_dir) and

  2. Actually use the underlying write_input method of self.get_input_set instead of relying on a custom implementation here?

One of my main objectives in implementing the abc was to achieve a (more) consistent file IO call signature across different InputSet.

That's my only suggestion - I'll defer to @janosh and others on the thread if it's feasible to implement or not.

Copy link
Contributor Author

@esoteric-ephemera esoteric-ephemera May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @rkingsbury. I've moved the features of VaspInputSet.write_input to VaspInput.write_input (VaspInput is the underlying class), and made VaspInputSet.write_input an interface to the underlying method

Re the output_dir --> directory stylistic change: seems like the VASP classes use output_dir rather than directory. While it's better to have a cohesive style across PMG, I'd like to avoid breaking changes with this PR. I can add a directory kwarg that aliases to output_dir and specify a deprecation period for output_dir if that's preferable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep output_dir. "Directory" is unclear. While the context would suggest an output, there are instances where there is both an input_dir and an output_dir. It also goes too close to the dir special function in python.

@rkingsbury
Copy link
Contributor

rkingsbury commented May 20, 2024

Thanks for putting this together @esoteric-ephemera! A big step forward for consistency of InputSets. I left one comment on VaspInputSet; otherwise I defer to others here.

@shyuep
Copy link
Member

shyuep commented May 20, 2024

@janosh I asked for the splitting of sets to be removed from this PR. Do this first. Otherwise this is just holding up the entire PR. The PR is not to be merged with the broken up sets.

@JaGeo
Copy link
Member

JaGeo commented May 20, 2024

@janosh it would be very helpful to see the exact changes in this PR to all sets. With the split, it is really hard to trace the changes.


# newest potcars are preferred
# Choose PBE_54 unless the user specifies a different potcar_functional
user_potcar_functional: UserPotcarFunctional = "PBE_54"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice to myself that newer ones should also be allowed in the future.

@shyuep
Copy link
Member

shyuep commented May 20, 2024

@esoteric-ephemera I am tired of waiting. If you don't hear back from @janosh by tomorrow morning, I would appreciate if you can revert the split and/or resubmit a PR with the exact changes to the input sets. I will then merge your changes first. I apologize on behalf of the maintainers for any inconvenience and will take action to make sure this never happens again.

esoteric-ephemera and others added 8 commits May 20, 2024 17:16
Bumps [rexml](https://github.com/ruby/rexml) from 3.2.5 to 3.2.8.
- [Release notes](https://github.com/ruby/rexml/releases)
- [Changelog](https://github.com/ruby/rexml/blob/master/NEWS.md)
- [Commits](ruby/rexml@v3.2.5...v3.2.8)

---
updated-dependencies:
- dependency-name: rexml
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Shyue Ping Ong <shyuep@users.noreply.github.com>
Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.16.2 to 1.16.5.
- [Release notes](https://github.com/sparklemotion/nokogiri/releases)
- [Changelog](https://github.com/sparklemotion/nokogiri/blob/main/CHANGELOG.md)
- [Commits](sparklemotion/nokogiri@v1.16.2...v1.16.5)

---
updated-dependencies:
- dependency-name: nokogiri
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Shyue Ping Ong <shyuep@users.noreply.github.com>
@janosh
Copy link
Member

janosh commented May 20, 2024

@shyuep no need for a separate PR. lot of meetings today, hence the delay. i still think the split of sets.py is a sensible change and should be done in a subsequent PR. i'll go ahead and save my commits into a separate branch and then rebase this branch to remove them here, then reapply @esoteric-ephemera's subsequent changes to sets.py. will require a force push. any opposition to that please voice it in the next 10 mins

@shyuep shyuep merged commit 74e692b into materialsproject:master May 23, 2024
23 checks passed
@janosh janosh added enhancement A new feature or improvement to an existing one io Input/output functionality vasp Vienna Ab initio Simulation Package ecosystem Concerning the larger pymatgen ecosystem breaking Breaking change labels May 31, 2024
@janosh janosh changed the title Update VASP sets to transition atomate2 to use pymatgen input sets exclusively Update VASP input sets for atomate2 compatibility Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change ecosystem Concerning the larger pymatgen ecosystem enhancement A new feature or improvement to an existing one io Input/output functionality vasp Vienna Ab initio Simulation Package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants