Quasi-RRHO Thermochemistry Analysis Module #2028

arepstein · 2021-01-07T00:19:44Z

Summary

Added quasirrho.py to the analysis subpackage to calculate the Quasi-RRHO free energy from a Gaussian or QChem frequency calculation.

Calculates Grimme's Quasi-RRHO free energy
Option for also correcting for solvent concentration

Additional dependencies introduced (if any)

*None

TODO (if any)

The rotational symmetry number is required as an input. To calculate it on the fly, I will consider updating PointGroupAnalyzer.

Before a pull request can be merged, the following items must be checked:

[x] Code is in the standard Python style. Run pycodestyle and flake8 on your local machine.
[x] Docstrings have been added in the Google docstring format. Run pydocstyle on your code.
Type annotations are highly encouraged. Run mypy to type check your code.
[x] Tests have been added for any new functionality or bug fixes.
[x] All existing tests pass.

Note that the CI system will run all the above checks. But it will be much more
efficient if you already fix most errors prior to submitting the PR. It is
highly recommended that you use the pre-commit hook provided in the pymatgen
repository. Simply cp pre-commit .git/hooks and a check will be run prior to
allowing commits.

shyuep · 2021-01-07T04:06:16Z

pymatgen/analysis/tests/test_quasirrho.py

+from pymatgen.io.gaussian import GaussianOutput
+from pymatgen.io.qchem.outputs import QCOutput
+
+test_dir = os.path.join(os.path.dirname(__file__), "..", "..", "..",


Instead of using a module-level test_dir, please inherit from PymatgenTest for unit tests and use self.test_dir. This will ensure correct behavior. Thanks.

mkhorton · 2021-01-26T21:27:51Z

@arepstein is this PR ready for review to be merged?

arepstein · 2021-01-26T21:59:20Z

I believe so. The only pending update in my mind is automated detection of the rotational symmetry number, but I think it's okay as an input parameter for now.

mkhorton · 2021-01-27T02:01:03Z

pymatgen/analysis/quasirrho.py

+from pymatgen.io.qchem.outputs import QCOutput
+
+# Define useful constants
+kb = 1.380662E-23  # J/K


Are these already defined in pymatgen.core.units?

Looks like yes! I was not aware of pymatgen units -- will fix and commit the changes

Many of the units I'm using are slightly different and don't lend themselves well to the subclasses in pymatgen.core.units, for example converting amu·Å^2 to kg·m^2 for moments of inertia. Would you still recommend using the Units subclasses where I can?

I would recommend supplementing the existing units if you don't have the ones you want.

To be specific, I'm talking about the unit constants in pymatgen.core.units, rather than suggesting you use the classes in that file.

The reason for this recommendation is simply that if we expand our unit functionality later (we've explored a few options here), it's a lot easier for us to spot where units are being used in pymatgen if unit constants are imported from the same place.
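As a rough illustration of the recommendation above, the sketch below builds the amu·Å^2 to kg·m^2 factor from base constants that live in one place, instead of hard-coding the product where it is used. The constant names and the hard-coded CODATA 2018 values are assumptions made so the snippet runs on its own; in pymatgen they would be imported from a shared module such as pymatgen.core.units.

```python
# Base constants kept in a single place (CODATA 2018 values, hard-coded here
# only so that the sketch is self-contained).
AMU_TO_KG = 1.66053906660e-27  # kg per atomic mass unit
ANG_TO_M = 1.0e-10  # m per angstrom

# Derived conversion factor for moments of inertia: amu*Angstrom^2 -> kg*m^2
AMU_ANG2_TO_KG_M2 = AMU_TO_KG * ANG_TO_M**2


def moment_to_si(moment_amu_ang2: float) -> float:
    """Convert a moment of inertia from amu*Angstrom^2 to kg*m^2."""
    return moment_amu_ang2 * AMU_ANG2_TO_KG_M2
```

Deriving the factor this way keeps every unit constant traceable to one import location, which is the property the reviewer asks for.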

Tried to use more built-in units. Added functionality for checking if linear and adjusting rotational entropy accordingly. Added testing for linear molecule
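For readers unfamiliar with why linear molecules need separate handling, here is a minimal sketch of the textbook rigid-rotor rotational entropy with a linearity check. It is not the PR's code: the function name, the `linear_tol` cutoff, and the hard-coded constants are assumptions for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant [J/K]
H = 6.62607015e-34  # Planck constant [J*s]
R = 8.314462618  # molar gas constant [J/(mol*K)]


def rotational_entropy(moments, sigma_r, temp, linear_tol=1e-50):
    """Rigid-rotor rotational entropy in J/(mol*K).

    moments: the three principal moments of inertia [kg*m^2]
    sigma_r: rotational symmetry number
    temp: temperature [K]
    linear_tol: moments below this value count as zero (hypothetical cutoff)
    """
    nonzero = [m for m in moments if m > linear_tol]
    if len(nonzero) < 2:
        # A single atom has no rotational degrees of freedom
        return 0.0
    # q_i = 8 pi^2 I_i k_B T / h^2 for each rotational degree of freedom
    q = [8 * math.pi**2 * m * K_B * temp / H**2 for m in nonzero]
    if len(nonzero) == 2:
        # Linear molecule: two rotational degrees of freedom
        return R * (math.log(q[0] / sigma_r) + 1.0)
    # Nonlinear molecule: three rotational degrees of freedom
    q_rot = math.sqrt(math.pi * q[0] * q[1] * q[2]) / sigma_r
    return R * (math.log(q_rot) + 1.5)
```

A linear molecule has one vanishing principal moment, so plugging all three moments into the nonlinear formula would take the logarithm of zero; branching on the number of non-zero moments avoids that.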

htz1992213 · 2021-07-07T00:32:47Z

Hey @arepstein, I learned about this PR in today's subgroup meeting. I am just wondering how the functionality here compares to that of the GoodVibes package https://github.com/bobbypaton/GoodVibes?

arepstein · 2021-07-08T13:11:07Z

Thanks for pointing this out. I didn't know about GoodVibes when I opened this PR. This PR should be identical to the Grimme Quasi-RRHO approximation for the entropy that's implemented in GoodVibes, but it does not include any of the other methods that GoodVibes implements. One big difference I see in terms of integration with pymatgen infrastructure is that this PR can calculate Quasi-RRHO entropies from Q-Chem output files, Gaussian output files, or manual input parameters, which is useful for Atomate integration. It would be a good idea to check that the results agree with GoodVibes.

Tried to make avg_mom_inertia an internal function

coveralls · 2022-03-04T17:57:22Z

Coverage decreased (-0.7%) to 83.428% when pulling d08a1ca on arepstein:readytoPR into 3376f27 on materialsproject:master.

Moved get_avg_mom_inertia outside the class

rkingsbury

Thanks for putting this together @arepstein ! I used it recently and I think it's a very useful addition to the molecular DFT infrastructure in pymatgen. I made a few comments; if you can pull in the latest pymatgen and address these hopefully we can get @janosh or another maintainer to review soon; I know this one has been open for a while.

rkingsbury · 2023-03-05T20:46:10Z

pymatgen/analysis/quasirrho.py

+# Define useful conversion factors
+kcal2hartree = 0.0015936  # kcal/mol to hartree/mol
+


Is there a way to define this conversion factor using scipy.constants? If not, just make sure you carry enough decimal places to not lose precision due to the very different magnitudes of Ha and kcal.
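To make the precision concern concrete, the sketch below derives the factor from physical constants and compares it with the rounded literal. The constant names are illustrative and the CODATA 2018 values are hard-coded so the snippet runs without scipy; scipy.constants provides the same quantities as `calorie`, `Avogadro` and `value("Hartree energy")`.

```python
# CODATA 2018 values, hard-coded so the sketch runs without scipy
CALORIE_J = 4.184  # J per thermochemical calorie (exact by definition)
HARTREE_J = 4.3597447222071e-18  # J per hartree
AVOGADRO = 6.02214076e23  # 1/mol (exact)

# kcal/mol -> hartree per molecule, built from the constants above
KCAL_PER_MOL_TO_HARTREE = 1000 * CALORIE_J / HARTREE_J / AVOGADRO

# Relative error introduced by the five-significant-figure literal
ROUNDED = 0.0015936
REL_ERROR = abs(ROUNDED - KCAL_PER_MOL_TO_HARTREE) / KCAL_PER_MOL_TO_HARTREE
```

The rounded literal is off by roughly one part in a million, which is small for a thermal correction but avoidable at no cost by deriving the factor.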

rkingsbury · 2023-03-05T20:46:50Z

pymatgen/analysis/quasirrho.py

+class QuasiRRHO:
+    """
+    Class to calculate thermochemistry using Grimme's Quasi-RRHO approximation.
+    All outputs are in atomic units.


Can you be explicit about what atomic units are? For thickheaded engineers like myself, I have to go and google this :)

rkingsbury · 2023-03-05T20:48:01Z

pymatgen/analysis/quasirrho.py

+
+class QuasiRRHO:
+    """
+    Class to calculate thermochemistry using Grimme's Quasi-RRHO approximation.


Could you add the citation to the paper in this docstring so that it's visible in, e.g., Jupyter notebooks and documentation pages?

rkingsbury · 2023-03-05T20:49:46Z

pymatgen/analysis/quasirrho.py

+        :param output: Requires input of a Gaussian output file,
+                        QChem output file, or dictionary of necessary inputs:
+                        {"mol": Molecule, "mult": spin multiplicity (int),
+                        "frequencies": list of vibrational frequencies [a.u.],
+                        "elec_energy": electronic energy [a.u.]}
+        :param sigma_r (int): Rotational symmetry number
+        :param temp (float): Temperature [K]
+        :param press (float): Pressure [Pa]
+        :param conc (float): Solvent concentration [M]
+        :param v0 (float): Cutoff frequency for Quasi-RRHO method [cm^-1]


If possible, please convert to a google style docstring

Since all of the outputs are stored as class attributes, it would also be especially nice to have an Attributes section that describes each output, with units.

Alternatively, you could define Python @property methods for each of the relevant outputs (e.g., g_corrected, h_corrected), which would let you give each one an explicit docstring. Each property would not do anything other than contain the docstring, e.g.

    @property
    def g_corrected(self):
        """Corrected free energy in Ha."""
        return self._g_corrected

so the only benefit to this structure would be visibility of the outputs to users
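A minimal, self-contained sketch of the property pattern suggested here follows. The class name `ThermoResult` is hypothetical and not part of the PR. The values are stored on private attributes because a property that returns an attribute with its own name would call itself recursively.

```python
class ThermoResult:
    """Toy container illustrating documented read-only outputs."""

    def __init__(self, g_corrected: float, h_corrected: float) -> None:
        # Values live on private attributes so the properties below
        # do not call themselves recursively.
        self._g_corrected = g_corrected
        self._h_corrected = h_corrected

    @property
    def g_corrected(self) -> float:
        """Corrected free energy in Ha."""
        return self._g_corrected

    @property
    def h_corrected(self) -> float:
        """Corrected enthalpy in Ha."""
        return self._h_corrected
```

With this layout, `help(ThermoResult)` and documentation generators list each output with its unit, and the attributes become read-only as a side effect.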

rkingsbury · 2023-03-05T20:55:36Z

See also some small edits I made when I used the code, in a PR against your fork: arepstein#1

rkingsbury · 2023-03-05T20:56:22Z

I will also note for other reviewers that although the GoodVibes package contains similar functionality, it only accepts Gaussian output files, and hence is not useful with our high-throughput Q-Chem infrastructure. So this PR is valuable because it gives us those capabilities.

janosh · 2023-05-08T19:55:30Z

@arepstein Is this still WIP?

arepstein · 2023-08-01T21:56:44Z

Hey @rkingsbury and @janosh, I've incorporated Ryan's edits and recommendations. While making these edits it became clear that a class might not be the best choice for QuasiRRHO. As it stands now, it might be better as a simple function. Ideally, I think it could be nice to have a Thermochemistry class that is a Molecule plus some extra information like frequencies. We could then have different treatments of thermochemistry, like GoodVibes implements, but in a way that works well with our ecosystem.

That being said, hopefully QuasiRRHO is still useful for now and can be a starting point for later changes if desired.

rkingsbury

Thanks @arepstein and nice work on this. Should be a good addition to pymatgen.

rkingsbury · 2023-08-02T16:32:25Z

pymatgen/analysis/quasirrho.py

+    from pymatgen.io.qchem.outputs import QCOutput
+
+# Define useful constants
+kb = kb_ev * const.eV  # Pymatgen kb [J/k]


[J/k] -> [J/K]

rkingsbury · 2023-08-02T16:34:45Z

pymatgen/analysis/quasirrho.py

+kcal2hartree = 1000 * const.calorie / const.value("Hartree energy") / const.Avogadro
+
+
+def get_avg_mom_inertia(mol):


Not required for this PR, but I wonder if this should be made a @property of the Molecule class? Any strong opinions one way or another?
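The body of get_avg_mom_inertia is not shown in this thread. As a sketch of what such a helper could compute, assuming it returns the mean of the three principal moments of inertia, the function below uses the fact that the principal moments sum to the trace of the inertia tensor. The function name and signature are hypothetical.

```python
AMU_ANG2_TO_KG_M2 = 1.66053906660e-47  # amu*Angstrom^2 -> kg*m^2


def avg_moment_of_inertia(masses, coords):
    """Average of the three principal moments of inertia in kg*m^2.

    masses: atomic masses [amu]
    coords: list of (x, y, z) positions [Angstrom]
    """
    total_mass = sum(masses)
    # Center of mass, so that moments are taken about it
    com = [
        sum(m * c[i] for m, c in zip(masses, coords)) / total_mass
        for i in range(3)
    ]
    # The principal moments sum to the trace of the inertia tensor,
    # trace = 2 * sum_i m_i * |r_i|^2, so no diagonalization is needed.
    trace = 2.0 * sum(
        m * sum((c[i] - com[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    )
    return trace / 3.0 * AMU_ANG2_TO_KG_M2
```

A helper that also needs the individual principal moments (for example to detect linear molecules) would have to diagonalize the tensor instead of taking this shortcut.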

rkingsbury · 2023-08-02T16:36:45Z

pymatgen/analysis/quasirrho.py

+
+class QuasiRRHO:
+    """
+    Class to calculate thermochemistry using Grimme's Quasi-RRHO approximation.


Thanks for adding to the docstring! Since you opened this PR, we have also added a nice system for citations in pymatgen based on duecredit. Would you mind adding a @due.dcite decorator to this class? (example)

rkingsbury · 2023-08-02T16:39:36Z

pymatgen/analysis/quasirrho.py

+        ev *= R
+        etot = (et + er + ev) * kcal2hartree / 1000
+        self.h_corrected = etot + R * self.temp * kcal2hartree / 1000
+        molarity_corr = R_ha * self.temp * np.log(R_volume * self.temp * self.conc)


It looks like the molarity correction is RT log (RTC). Shouldn't it just be RT log(C)? I may be ignorant, but please double check / confirm.

Also, this formulation implicitly assumes a standard state of 1 M. It might be good to state that in the docstring in case someone is using a calc with a different standard state

I was following Wheeler's script for the concentration correction. I believe it comes from a thermodynamic cycle for cluster continuum models that leads to an extra RT log(R_volume T). It's equations 7 and 8 in the Ho and Coote pKa paper.
Thanks for the point about the 1M assumption. I will add that.

Ah, thanks for clarifying. So looking at that section of the paper:

I'm pretty sure the extra -RT log [R~T] term is there to adjust the reference states (the paragraph above and the different units of R and R~ are the clues). The solvation energy assumes 1 mol/L in both gas and solution, so you need a correction for that difference. The numerical value of R~T is about 24.47 atm / (mol/L) at 298.15 K, which is the same value one might use to make a standard state correction.

So I think it should be RT ln C, and I'm glad we have the standard state in the docstring to hopefully call attention to the need to be careful interpreting the number. Does that make sense?

Your explanation of the source of the extra term makes sense to me, but I don't understand why we shouldn't include the standard state correction when providing a concentration-corrected free energy in solvent. If RTln(concentration) uses mol/L, shouldn't we also correct for the 1 atm standard state used in the DFT code?

That's a fair point. To me what makes it tricky is that you only need to do the standard state correction when you are writing a reaction energy between species with different reference states, but this module just computes a single correction for a single species. In other words, the reason for the correction is to make sure all species use the same reference state. We don't necessarily know that users of this class are going to take this free energy and combine it with a gas phase energy.

The other small issue is that I think not every DFT code necessarily uses the same reference state. But I think both Q-Chem and Gaussian use 1 atm for gas phase calcs, and as long as we document what we're doing I'm not too worried about that.

Personally I think it will be more transparent and less confusing to users to just do RT ln C and let them adjust as needed if they have to change reference states. But again, as long as we document what's going on it could be OK to build it in.

If you want to build it in, I suggest writing it as ( RT log (C) - RT log (1/24.47) ) and adding a comment line to make explicitly clear that you're dividing C by the molarity of an ideal gas at 1 atm and 298.15 K.
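To put numbers on the two terms being debated, the sketch below evaluates RT ln(C) and the 1 atm to 1 mol/L standard state term separately. The function names are illustrative and are not taken from the PR; the ideal-gas assumption and the hard-coded gas constants are stated in the comments.

```python
import math

R_J = 8.314462618  # gas constant [J/(mol*K)]
R_L_ATM = 0.082057366  # gas constant [L*atm/(mol*K)]
J_PER_KCAL = 4184.0


def concentration_term(temp: float, conc: float) -> float:
    """RT ln(C) in kcal/mol, with C in mol/L relative to a 1 M standard state."""
    return R_J * temp * math.log(conc) / J_PER_KCAL


def standard_state_term(temp: float, press_atm: float = 1.0) -> float:
    """RT ln(R~T / P) in kcal/mol: ideal gas at press_atm -> 1 mol/L."""
    molar_volume = R_L_ATM * temp / press_atm  # L/mol of an ideal gas
    return R_J * temp * math.log(molar_volume) / J_PER_KCAL
```

At 298.15 K the standard state term is about 1.89 kcal/mol and the concentration term vanishes at 1 M; their sum equals RT ln(R~T C), the combined form used in the code under review.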

Do others like @samblau or @espottesmith or @materialsproject/second-foundation have thoughts?

I definitely agree that we should be more specific in the documentation, and I'll put that in. Maybe with a reference to describe it in detail if I can find a really good one. I also think of the free energy in implicit solvent as already containing an assumed change in free energy from vacuum to solution, and would thus include the standard state correction when using implicit solvent. Is that correct?

I would also appreciate input from everyone you tagged!

I think another option would be to just entirely remove the concentration correction for now. It is somewhat just a legacy from converting a script that was useful to me into Pymatgen.

I definitely agree that we should be more specific in the documentation, and I'll put that in. Maybe with a reference to describe it in detail if I can find a really good one. I also think of the free energy in implicit solvent as already containing an assumed change in free energy from vacuum to solution, and would thus include the standard state correction when using implicit solvent. Is that correct?

I just think of the free energy of an isolated molecule as "DFT energy (~ 0K enthalpy) plus T-dependent enthalpy and entropy".

My understanding is that implicit solvent models adjust the free energy for interactions between the solute and the medium but do not account for changes in standard state (vac -> solution). Unless we're talking about formation free energy or a reaction energy, the free energy of an isolated molecule begs the question "relative to what"? And you don't have to introduce a standard/reference state until you answer that question. I know some solvent models output a solvation (reaction) energy, but I believe that implicitly assumes the same reference state for both vacuum and solvent. There's nothing inherently wrong with that but it's often not what the end user needs for comparison to experiment.

I think another option would be to just entirely remove the concentration correction for now. It is somewhat just a legacy from converting a script that was useful to me into Pymatgen.

Yeah, the more I think about it, that might be the cleanest thing to do here. This module is for calculating QRRHO thermochemistry, which (unless I'm mistaken) does not specifically have anything to do with an implicit solvent model and does not make any assumptions about the concentration; it's just a way of dampening low-frequency contributions to rot and vib entropy. So in the same way that standard thermochemistry output in Gaussian/Q-Chem doesn't make any adjustments for concentration, maybe this shouldn't either.
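For context on what the low-frequency treatment does, here is a minimal sketch of the per-mode entropy interpolation from Grimme's Quasi-RRHO approach: a weight that switches smoothly from a free-rotor entropy to the harmonic-oscillator entropy around the cutoff v0. This is a sketch under stated assumptions, not the PR's implementation. The function names, the exponent of 4, the 100 cm^-1 default cutoff, and the default average moment of inertia `b_av` are assumptions taken from the commonly quoted form of the method.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant [J/K]
H = 6.62607015e-34  # Planck constant [J*s]
C_CM = 2.99792458e10  # speed of light [cm/s]
R = 8.314462618  # gas constant [J/(mol*K)]


def damping(freq_cm: float, v0: float = 100.0, alpha: int = 4) -> float:
    """Weight of the harmonic term: 1 / (1 + (v0 / freq)^alpha)."""
    return 1.0 / (1.0 + (v0 / freq_cm) ** alpha)


def s_harmonic(freq_cm: float, temp: float) -> float:
    """Harmonic-oscillator vibrational entropy [J/(mol*K)]."""
    x = H * C_CM * freq_cm / (K_B * temp)
    return R * (x / math.expm1(x) - math.log1p(-math.exp(-x)))


def s_free_rotor(freq_cm: float, temp: float, b_av: float = 1.0e-44) -> float:
    """Free-rotor entropy [J/(mol*K)] for a mode of the given wavenumber."""
    mu = H / (8 * math.pi**2 * C_CM * freq_cm)  # effective moment of inertia
    mu_eff = mu * b_av / (mu + b_av)  # limited by the average moment b_av
    return R * (0.5 + 0.5 * math.log(8 * math.pi**3 * mu_eff * K_B * temp / H**2))


def s_quasi_rrho(freq_cm: float, temp: float, v0: float = 100.0) -> float:
    """Interpolate between free-rotor and harmonic entropy for one mode."""
    w = damping(freq_cm, v0)
    return w * s_harmonic(freq_cm, temp) + (1.0 - w) * s_free_rotor(freq_cm, temp)
```

For a 10 cm^-1 mode at 298.15 K the interpolated entropy is noticeably smaller than the harmonic value, which diverges as the frequency goes to zero. Nothing in this interpolation depends on concentration or on a solvent model, which is the point made in the comment above.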

I'm still not entirely convinced, mainly because implicit solvent models are made to capture solvation free energies, but I definitely understand what you're saying. I'd love to have this conversation offline more too, given how fundamentally important it is to get these corrections correct, but I agree that that can be beyond the scope of this PR. I'll remove it for now and maybe if Pymatgen wants to start recommending free energy treatments we can implement more general thermochemistry!

Happy to discuss offline!

fixed typo in units added duecredit decorator

update 1M in docs

fixing bugs

rkingsbury · 2023-08-02T18:32:19Z

pymatgen/analysis/quasirrho.py

@@ -83,7 +84,7 @@ class QuasiRRHO:
    Attributes:
        temp (float): Temperature [K]
        press (float): Pressure [Pa]
-        conc (float): Solvent concentration [M]
+        conc (float): Solvent concentration. Assumes 1M unless specified [M]


I would add "Concentration correction is referenced to a 1 M standard state."

Removed solvent concentration corrections from the code so the class only deals with QuasiRRHO corrections

codecov-commenter · 2023-08-11T19:37:27Z

Codecov Report

No coverage uploaded for pull request base (master@57d8a2f).
Patch has no changes to coverable lines.

Additional details and impacted files

@@            Coverage Diff            @@
##             master    #2028   +/-   ##
=========================================
  Coverage          ?   74.06%           
=========================================
  Files             ?      230           
  Lines             ?    69403           
  Branches          ?    16161           
=========================================
  Hits              ?    51403           
  Misses            ?    14957           
  Partials          ?     3043

rkingsbury · 2023-08-21T13:49:38Z

I think this is ready to merge, right @arepstein ? @janosh can you remove the 'stale' label?

arepstein · 2023-08-21T16:29:45Z

Yes, I believe ready to merge!

janosh

There's been a recent change to the pymatgen test folder structure. The file test_files/molecules/co2.log now needs to be in tests/files/.... It would also be good to gzip this file.

janosh

Thanks for this PR @arepstein! 👍

arepstein · 2023-08-22T14:58:04Z

Thanks @janosh and @rkingsbury for all the help, edits, and revisions!

arepstein changed the title ~~Readyto pr~~ Quasi-RRHO Thermochemistry Analysis Module Jan 7, 2021

shyuep reviewed Jan 7, 2021

mkhorton reviewed Jan 27, 2021

Alex Epstein added 3 commits June 10, 2021 16:15

Merge remote-tracking branch 'materialsproject/master' into readytoPR

1762e15

Linear molecules + units

dda14db

Tried to use more built-in units. Added functionality for checking if linear and adjusting rotational entropy accordingly. Added testing for linear molecule

Merge remote-tracking branch 'materialsproject/master' into readytoPR

0c1eed3

Alex Epstein and others added 5 commits August 4, 2021 13:57

Merge remote-tracking branch 'materialsproject/master' into readytoPR

5217e93

Merge remote-tracking branch 'materialsproject/master' into readytoPR

2ff2dd1

[pre-commit.ci] auto fixes from pre-commit.com hooks

9bf5f12

for more information, see https://pre-commit.ci

pylint debugging

ed81ba2

Tried to make avg_mom_inertia an internal function

Merge remote-tracking branch 'origin/readytoPR' into readytoPR

83d6c41

Alex Epstein and others added 7 commits March 4, 2022 16:01

Pylint fixing

ba4edbe

Moved get_avg_mom_inertia outside the class

[pre-commit.ci] auto fixes from pre-commit.com hooks

793897a

for more information, see https://pre-commit.ci

black

023ba48

Merge remote-tracking branch 'origin/readytoPR' into readytoPR

76503f1

[pre-commit.ci] auto fixes from pre-commit.com hooks

80f2cc2

for more information, see https://pre-commit.ci

black 22.1.0

8e6e005

Merge remote-tracking branch 'origin/readytoPR' into readytoPR

d08a1ca

janosh added the stale Abandoned or conflicting PRs and outdated issues label Oct 23, 2022

rkingsbury added 2 commits March 5, 2023 12:39

QuasiRRHO: edits

e8a6a81

QuasiRRHO: update tests

dcb07ef

rkingsbury reviewed Mar 5, 2023

janosh force-pushed the master branch from 6349dc1 to da318a2 Compare August 1, 2023 23:26

rkingsbury approved these changes Aug 2, 2023

Alex Epstein and others added 4 commits August 2, 2023 10:13

Typos and doi

e6df7d9

fixed typo in units added duecredit decorator

docs

9420e6f

update 1M in docs

duecredit

54ed557

fixing bugs

pre-commit auto-fixes

0122859

rkingsbury reviewed Aug 2, 2023

arepstein and others added 2 commits August 11, 2023 12:08

Remove Concentration Correction

b6fa300

Removed solvent concentration corrections from the code so the class only deals with QuasiRRHO corrections

pre-commit auto-fixes

3e2a0cd

janosh added enhancement A new feature or improvement to an existing one molecules Molecule stuff analysis Concerning pymatgen.analysis and removed stale Abandoned or conflicting PRs and outdated issues labels Aug 21, 2023

janosh suggested changes Aug 21, 2023

Alex Epstein and others added 9 commits August 21, 2023 11:36

Merge remote-tracking branch 'materialsproject/master' into readytoPR

5963413

Changes to testing

e4cf61c

pre-commit auto-fixes

f9ede66

Merge remote-tracking branch 'materialsproject/master' into readytoPR

e7cf9bf

Merge remote-tracking branch 'origin/readytoPR' into readytoPR

06ad8e4

add type hints and snake_case QuasiRRHO method names

9d2bcc9

rename single-letter module-scoped vars

9232e5d

fix tests following method renaming

02ac2c7

add test_extreme_temperature_and_pressure() and test_get_avg_mom_inertia()

f7d3fa7

janosh approved these changes Aug 21, 2023

janosh merged commit 0d5eed0 into materialsproject:master Aug 21, 2023
