Add capability for Vasprun to read KPOINTS_OPT data #3509

bfield1 · 2023-12-10T23:04:22Z

Summary

This pull request resolves Issue #3455 , implementing parsing functionality for band structures generated using KPOINTS_OPT.
KPOINTS_OPT is a new VASP feature which provides an alternative method for specifying k-points for hybrid band structures.
The KPOINTS_OPT-related data is now read in by the Vasprun class, and Vasprun.get_band_structure defaults to reading the KPOINTS_OPT-based eigenvalues if present.

Unfinished tasks

Implement KPOINTS_OPT functionality for density of states. As the DOS is accessed using a property rather than a function, it is unclear what the best interface would be. The current pull request reads and saves the data from vasprun.xml, but it doesn't have a convenient getter. Regardless, the other features of this pull request stand without this feature.

Checklist

Google format doc strings added. Check with ruff.
Type annotations included. Check with mypy.
Tests added for new features/fixes.
If applicable, new classes/functions/modules have duecredit @due.dcite decorators to reference relevant papers by DOI (example)

Allow reading in KPOINTS_OPT data from vasprun.xml. Also get band structure from KPOINTS_OPT data, if requested. (May change that to the default behaviour.) I've tested it separately, but still need to test integration into the pymatgen code, once I get a working development build of pymatgen.

projected_magnetism_kpoints_opt -> projected_magnetisation_kpoints_opt

From issue materialsproject#3455 , dataset supplied by Jeff-oakly materialsproject#3455 (comment) Although, for some reason, even though LORBIT = 11, vasprun.xml doesn't have projected eigenvalues for kpoints_opt. Wierd.

It was with VASP 6.4 rather than 6.3, if that makes a difference. And more CPUs so it snuck in more bands, which slightly changed some energy.

if KPOINTS_OPT data is present.

Prior name was not descriptive of the tests being done.

…tgen into vasprun-kpoints-opt

in both Vasprun and BSVasprun

improve test err message on bad vbm/cbm kpoint labels

janosh

Nice work! All good except for the large new file tests/files/kpoints_opt/vasprun.xml.gz. Can you check if we can use it in place of one of the existing vasprun.xml files?

tests/files/kpoints_opt/vasprun.xml.gz

janosh · 2024-01-10T10:30:42Z

pymatgen/io/vasp/outputs.py

@esoteric-ephemera If you're not too busy, could you take a look and let me know if the changes in this file all make sense to you?

Yes sorry I will soon!

These overall look pretty reasonable to me. @bfield1, are the changes to the structure of the XML parsing needed? It seems like you've added explicit triggers for when XML blocks are read.

Also, if a user specifies a KPOINTS_OPT file, it might make more sense to put the various *_opt quantities without the suffix?

For many use cases, where KPOINTS_OPT isn't used, there would be 5 new top-level attrs that default to None, which is OK. Maybe it makes more sense to have underscored attrs that default to None and store the quantities computed with KPOINTS/KSPACING when KPOINTS_OPT is used?

For example, when no KPOINTS_OPT file is used, or LKPOINTS_OPT = False:

self.eigenvalues_kpoints = < eigenvalues using KPOINTS / KSPACING > self._eigenvalues_kpoints = None

and when KPOINTS_OPT is used:

self.eigenvalues_kpoints = < eigenvalues using KPOINTS_OPT > self._eigenvalues_kpoints = < eigenvalues using KPOINTS / KSPACING >

Just to keep things more organized and hopefully reflect what a user expects to get from a Vasprun.

@bfield1, are the changes to the structure of the XML parsing needed? It seems like you've added explicit triggers for when XML blocks are read.

Yes, we do need those changes. The prior 'end' condition for reading XML elements doesn't tell you which element it is nested within until after you have left that element. The 'start' condition is used to identify when we are entering certain blocks (but we have to wait until 'end' to read the data).
When using KPOINTS_OPT, some of the blocks can only be distinguished from their regular KPOINTS counterparts by knowing which block they are nested within. As such, I had to change the parser to read both start and end tags, and add a flag to track when we have entered (or left) certain blocks.

Regarding how to structure the data:

My choice of convention was mainly based on how the data is structured within the vasprun.xml file itself, as well as the calculation outputs. KPOINTS and KPOINTS_OPT are separate files, as are PROCAR and PROCAR_OPT. The vasprun.xml files stores the conventional KPOINTS data in the standard location and the KPOINTS_OPT data under special flags. The KPOINTS_OPT calculation is an extra non-self-consistent calculation run on top of the regular self-consistent calculation.
I also didn't want to change what the current variables in Vasprun were pointing to without proper consultation.

That said, I am not opposed to making the KPOINTS_OPT data the default when it is present. You probably have a better idea of what the average user "expects" from a Vasprun than I do. I only recently started using KPOINTS_OPT too, so different people may have different conventions.

One use-case of KPOINTS_OPT I was considering was doing a regular grid with normal KPOINTS (which you have to do) and a band structure with KPOINTS_OPT. In principle, one could get DOS data from the regular grid and band structure data from the KPOINTS_OPT data. But maybe this is a niche use case?

I would guess that typically someone cares most about the KPOINTS_OPT data when they specify it, so making that the prime data then using extra flags to get the underlying data probably makes sense. But also, I would expect the Vasprun class to reflect the data structure of the vasprun.xml file, so listing KPOINTS_OPT separately also makes sense. I suppose it depends which we value more.

For what it's worth, I counted 10 new top-level attributes rather than 5. tdos_opt, idos_opt, pdos_opt, efermi_opt, eigenvalues_kpoints_opt, projected_eigenvalues_kpoints_opt, projected_magnetisation_kpoints_opt, kpoints_opt, actual_kpoints_opt, and actual_kpoints_weights_opt, which are all the KPOINTS_OPT equivalents of their non-opt counterparts and are listed in separate fields in vasprun.xml. I can appreciate that having lots of usually-empty attributes floating around might be a bit messy, though. If we wanted to tidy up, maybe we could store these attributes inside a dictionary so we only have one new top-level attribute?

Let me know your thoughts.

When using KPOINTS_OPT, some of the blocks can only be distinguished from their regular KPOINTS counterparts by knowing which block they are nested within. As such, I had to change the parser to read both start and end tags, and add a flag to track when we have entered (or left) certain blocks.

this is very helpful information to understand why the code is as it is. could you insert this as a comment along with a date?

For what it's worth, I counted 10 new top-level attributes rather than 5. [...] I can appreciate that having lots of usually-empty attributes floating around might be a bit messy, though. If we wanted to tidy up, maybe we could store these attributes inside a dictionary so we only have one new top-level attribute?

would be interested to hear @utf opinion on this and also if it affects atomate2 in some way?

Thanks @bfield1! This is super helpful context and I appreciate the work you've put into this.

Re:

In principle, one could get DOS data from the regular grid and band structure data from the KPOINTS_OPT data. But maybe this is a niche use case?

This is actually the default behavior of the Materials Project's bandstructure workflows, which means we should probably save some compute effort and revise these after this PR is merged. As they stand currently, two separate self-consistent calculations are performed to get the DOS from a uniform k-mesh, and the bandstructure from a line-mode k-mesh

If we wanted to tidy up, maybe we could store these attributes inside a dictionary so we only have one new top-level attribute?

I might lean towards this approach, or perhaps defining a small container class (e.g., electronic_properties) that contains these 10 attrs (electronic_properties.tdos, electronic_properties.idos,...). The container class would be a top-level attr of Vasprun. Hope that makes sense

this is very helpful information to understand why the code is as it is. could you insert this as a comment along with a date?

Done.

This is actually the default behavior of the Materials Project's bandstructure workflows, which means we should probably save some compute effort and revise these after this PR is merged.

You'd probably want to double-check that VASP prints its output to vasprun.xml as expected when using KPOINTS_OPT, as it's a new feature so may have some bugs (the compilation of VASP I'm using doesn't write the projected eigenvalues to vasprun.xml properly, for instance). But if VASP behaves you might be able to save some effort, yes.

I might lean towards this approach, or perhaps defining a small container class (e.g., electronic_properties) that contains these 10 attrs (electronic_properties.tdos, electronic_properties.idos,...). The container class would be a top-level attr of Vasprun. Hope that makes sense

I have now implemented such a solution. A simple container class (_ElectronicPropertiesContainer) (which I made MSONable because it seemed straightforward) which is used in the attribute Vasprun.kpoints_opt_properties. It holds the electronic properties with their normal names (without the suffixes). The names might be a little wordy, but it's probably better to be more verbose than less.

In principle, one could use such a container for the regular electronic properties too, but that's probably an unnecessary extra layer with no obvious benefits. Unless, maybe, we have setters/getters to the top-level attributes which intelligently decide whether to access the kpoints_opt or regular data, but that's getting ahead of ourselves. People can request such a feature if demand arises.

Thanks for these changes @bfield1! I'm happy with the current state of this PR. Re the "non-opt" electronic properties: just to avoid breaking changes, I prefer the way you've done things by not storing them in an _ElectronicPropertiesContainer.

The option to transition to a container class for these is there, but not necessary now.

tweak var names

Signed-off-by: Bernard Field <51108483+bfield1@users.noreply.github.com>

materialsproject#3509

Now they're referenced as attributes of that container (kpoints_opt_properties) rather than being top-level quantities in Vasprun.

Signed-off-by: Bernard Field <51108483+bfield1@users.noreply.github.com>

janosh · 2024-02-01T06:46:47Z

pymatgen/io/vasp/outputs.py

@@ -140,6 +140,50 @@ def _vasprun_float(flt: float | str) -> float:
        raise exc


+class _ElectronicPropertiesContainer(MSONable):


i'm confused on what the consensus re _ElectronicPropertiesContainer is in this PR. to me it looks like boilerplate. not sure what the benefits are. if i understand @esoteric-ephemera's last comment, he's also (no longer?) in favor of keeping it?

No I think this is good to keep things organized. The alternative is to have 10 new attrs on Vasprun.*_opt that are None whenever a KPOINTS_OPT file isn't used. That's fine but (in my view) messy for most use cases.

Either solution (removing the container class or having the *_opt attrs directly on Vasprun) works tho

@janosh As @esoteric-ephemera said, this container is useful for storing these extra variables.

The earlier remark about not storing things in the container was regarding the pre-existing electronic properties. I had briefly floated the idea of putting all the electronic properties into these containers (and using getters to choose them intelligently), but decided against it to avoid breaking changes and unnecessary complexity. But the container is still good for the KPOINTS_OPT attributes.

… dataclass which have 1st class MSONable support

janosh

i'm a bit concerned here that the tests pass even if you remove all attributes from

@dataclass
class KpointOptProps:
    """Simple container class to store KPOINTS_OPT data in a separate namespace. Used by Vasprun."""

    tdos: Dos | None = None
    idos: Dos | None = None
    pdos: list | None = None
    efermi: float | None = None
    eigenvalues: dict | None = None
    projected_eigenvalues: dict | None = None
    projected_magnetisation: np.ndarray | None = None
    kpoints: Kpoints | None = None
    actual_kpoints: list | None = None
    actual_kpoints_weights: list | None = None
    dos_has_errors: bool | None = None

i.e.

@dataclass
class KpointOptProps:
    """Simple container class to store KPOINTS_OPT data in a separate namespace. Used by Vasprun."""

works fine too. the attributes help with IDE auto-complete and type checking of course but maybe @bfield1 can think about more stringent tests for this?

either way, merging because of the very long wait time. sorry about that.

bfield1 · 2024-02-23T18:55:06Z

i'm a bit concerned here that the tests pass even if you remove all attributes from

@dataclass
class KpointOptProps:
    """Simple container class to store KPOINTS_OPT data in a separate namespace. Used by Vasprun."""

    tdos: Dos | None = None
    idos: Dos | None = None
    pdos: list | None = None
    efermi: float | None = None
    eigenvalues: dict | None = None
    projected_eigenvalues: dict | None = None
    projected_magnetisation: np.ndarray | None = None
    kpoints: Kpoints | None = None
    actual_kpoints: list | None = None
    actual_kpoints_weights: list | None = None
    dos_has_errors: bool | None = None

i.e.

@dataclass
class KpointOptProps:
    """Simple container class to store KPOINTS_OPT data in a separate namespace. Used by Vasprun."""

works fine too. the attributes help with IDE auto-complete and type checking of course but maybe @bfield1 can think about more stringent tests for this?

either way, merging because of the very long wait time. sorry about that.

My rationale was that some vasprun.xml files would not have all that data in them (e.g. if your calculation doesn't have magnetism, it wouldn't have a projected_magnetisation field, I don't think. Or if your calculation didn't use LORBIT, it won't have any projected_* fields). I felt it was safer for the end user if the blank fields returned None rather than throwing an AttributeError.

But I can indeed consider some way to test this behaviour and ensure it acts as expected. I'll open another pull request once I get around to it.

Anyway, thanks for working with me on this.

bfield1 and others added 15 commits December 7, 2023 18:50

Update docstrings, fix names in Vasprun

909b98b

projected_magnetism_kpoints_opt -> projected_magnetisation_kpoints_opt

Test case for Vasprun kpoints_opt

d3c136b

From issue materialsproject#3455 , dataset supplied by Jeff-oakly materialsproject#3455 (comment) Although, for some reason, even though LORBIT = 11, vasprun.xml doesn't have projected eigenvalues for kpoints_opt. Wierd.

Bandstructure test for kpoints_opt vasprun.xml

33aeb14

Ran my own calculation, so have projected data

f750277

It was with VASP 6.4 rather than 6.3, if that makes a difference. And more CPUs so it snuck in more bands, which slightly changed some energy.

Unittest for projected kpoints_opt

50c833b

Change use_kpoints_opt to be default behaviour

1bda574

if KPOINTS_OPT data is present.

Merge branch 'master' into vasprun-kpoints-opt

4c89b40

Update borg test to account for new test documents

4ad198a

Rename new test file directory to kpoints_opt

4ac674c

Prior name was not descriptive of the tests being done.

pre-commit auto-fixes

400776c

Code linting from ruff (whitespace, line length)

6575ba4

Merge branch 'vasprun-kpoints-opt' of https://github.com/bfield1/pyma…

1b59aaa

…tgen into vasprun-kpoints-opt

Support for KPOINTS_OPT in BSVasprun

58d2979

as_dict now reads KPOINTS_OPT parameters

3372988

in both Vasprun and BSVasprun

bfield1 marked this pull request as ready for review December 11, 2023 01:19

janosh added enhancement A new feature or improvement to an existing one io Input/output functionality vasp Vienna Ab initio Simulation Package labels Dec 11, 2023

bfield1 changed the title ~~Vasprun kpoints opt~~ Add capability for Vasprun to read KPOINTS_OPT data Dec 11, 2023

janosh force-pushed the master branch 2 times, most recently from 3c23114 to 36e289c Compare December 19, 2023 02:10

bfield1 and others added 2 commits December 20, 2023 11:38

Merge branch 'master' into vasprun-kpoints-opt

6c9da5e

tweak Vasprun Attributes: doc str

73256ad

improve test err message on bad vbm/cbm kpoint labels

janosh suggested changes Dec 20, 2023

View reviewed changes

tests/files/kpoints_opt/vasprun.xml.gz Outdated Show resolved Hide resolved

bfield1 added 2 commits January 9, 2024 16:37

Merge branch 'master' into vasprun-kpoints-opt

4497f08

Replace vasprun.xml for kpoints_opt with smaller file

a691a45

janosh reviewed Jan 10, 2024

View reviewed changes

refactor get_band_structure

e12e73b

tweak var names

janosh force-pushed the master branch from d7c651a to 9751f96 Compare January 10, 2024 10:42

bfield1 added 2 commits January 16, 2024 13:47

Merge remote-tracking branch 'upstream/master' into vasprun-kpoints-opt

071690d

Merge branch 'master' into vasprun-kpoints-opt

27f9cd1

Signed-off-by: Bernard Field <51108483+bfield1@users.noreply.github.com>

janosh force-pushed the master branch from 84d3a4c to 59a76b5 Compare January 23, 2024 10:20

bfield1 and others added 7 commits January 24, 2024 13:44

Merge branch 'master' into vasprun-kpoints-opt

681a08a

Signed-off-by: Bernard Field <51108483+bfield1@users.noreply.github.com>

Merge branch 'master' into vasprun-kpoints-opt

6295e29

Merge branch 'master' into vasprun-kpoints-opt

56e854c

Signed-off-by: Bernard Field <51108483+bfield1@users.noreply.github.com>

Explanatory comment on parser restructure

a2d4df7

materialsproject#3509

Stick all KPOINTS_OPT properties in a container

cf51ec3

Now they're referenced as attributes of that container (kpoints_opt_properties) rather than being top-level quantities in Vasprun.

Merge branch 'master' into vasprun-kpoints-opt

2dc1d70

Signed-off-by: Bernard Field <51108483+bfield1@users.noreply.github.com>

Merge branch 'master' into vasprun-kpoints-opt

931c85d

bfield1 requested a review from janosh January 31, 2024 23:08

janosh suggested changes Feb 1, 2024

View reviewed changes

janosh force-pushed the master branch 4 times, most recently from d725325 to dca98be Compare February 2, 2024 11:47

bfield1 added 2 commits February 6, 2024 10:42

Merge branch 'master' into vasprun-kpoints-opt

583e259

Merge branch 'master' into vasprun-kpoints-opt

ecbd080

bfield1 requested review from shyuep and mkhorton as code owners February 12, 2024 21:06

janosh added 5 commits February 23, 2024 12:08

rename _ElectronicPropertiesContainer to KpointOptProps and turn into…

e40d40c

… dataclass which have 1st class MSONable support

rename Vasprun.kpoints_opt_properties to kpoint_opt_props

712da14

Merge branch 'master' into pr/bfield1/3509

0d090dd

fix typo kpoint_opt_props -> kpoints_opt_props

58622ed

fix mypy

6c5d445

janosh approved these changes Feb 23, 2024

View reviewed changes

janosh merged commit 53141c1 into materialsproject:master Feb 23, 2024
22 checks passed

bfield1 mentioned this pull request Feb 26, 2024

[Feature Request]: Vasprun.xml needs to be updated to be compatible vasp > 6.3 #3455

Open

esoteric-ephemera mentioned this pull request Apr 29, 2024

Add support for vaspout.h5, improvements to potcar handling #3680

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add capability for Vasprun to read KPOINTS_OPT data #3509

Add capability for Vasprun to read KPOINTS_OPT data #3509

bfield1 commented Dec 10, 2023 •

edited

Loading

janosh left a comment

janosh Jan 10, 2024

esoteric-ephemera Jan 11, 2024

esoteric-ephemera Jan 25, 2024

bfield1 Jan 25, 2024

janosh Jan 26, 2024

janosh Jan 26, 2024

esoteric-ephemera Jan 26, 2024

bfield1 Jan 26, 2024

esoteric-ephemera Jan 29, 2024

janosh Feb 1, 2024

esoteric-ephemera Feb 1, 2024

bfield1 Feb 2, 2024

janosh left a comment

bfield1 commented Feb 23, 2024

		@@ -140,6 +140,50 @@ def _vasprun_float(flt: float \| str) -> float:
		raise exc


		class _ElectronicPropertiesContainer(MSONable):

Add capability for Vasprun to read KPOINTS_OPT data #3509

Add capability for Vasprun to read KPOINTS_OPT data #3509

Conversation

bfield1 commented Dec 10, 2023 • edited Loading

Summary

Unfinished tasks

Checklist

janosh left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Regarding how to structure the data:

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

janosh left a comment

Choose a reason for hiding this comment

bfield1 commented Feb 23, 2024

bfield1 commented Dec 10, 2023 •

edited

Loading