Minor improvements for generation of dataset of reaction profiles. #359

javialra97 · 2024-10-18T09:53:30Z

Reading output.csv as a data frame (with pandas) can be problematic because of its first line. Removing that line gives a proper data frame with well-structured columns, and the same information is already printed in the methods.txt file.
For some purposes a final single point is not necessary, so it is now controlled by a boolean (set to True by default). This is loosely related to the pull request "add set_basis_set method" #330.
The convergence criterion for optimizations with Gaussian should be set to Tight.
Fixed the bug "Extract coordinates failed" #358.

Checklist

The changes include an associated explanation of how/why
Tests pass
Documentation has been updated
Changelog has been updated

t-young31 · 2024-10-18T18:42:13Z

Thanks for the PR @javialra97

Reading the output.csv as a data frame (with pandas) can be problematic because of the first line, removing the line you will have a proper data frame with well-structured columns. Also, that information is printed in the methods.txt file.

This would be a breaking change, and pandas supports skipping rows natively

import pandas as pd
pd.read_csv("<filename>", skiprows=1)

so I'm not sure about that.
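
For readers who are not using pandas, the same skip-the-first-line idea can be sketched with the standard library. The file contents and column names below are made up for illustration and are not taken from a real output.csv; only the "one extra line before the table" layout is assumed, and read_table is a hypothetical helper.

```python
import csv
import io

# Hypothetical file contents: one free-text line, then a regular CSV table.
raw = (
    "Method summary line (not part of the table)\n"
    "Species,E_opt,E_sp\n"
    "reactant,-1.0,-1.1\n"
    "product,-2.0,-2.1\n"
)


def read_table(handle, n_skip=1):
    """Return the table rows as dicts, ignoring the first n_skip lines."""
    for _ in range(n_skip):
        next(handle)  # discard a leading non-tabular line
    return list(csv.DictReader(handle))


rows = read_table(io.StringIO(raw))
print(rows[0]["Species"], rows[0]["E_sp"])
```

Because the skipping happens at read time, the file on disk keeps its first line, so nothing changes for existing users.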

For specific purposes, it is not necessary a final single point, add it as a boolean

Interesting. I wouldn't be that averse to this functionality, but can't think of a scenario where this would be useful. What did you have in mind?

The convergence criterion for the optimization with Gaussian should be set to Tight.

On this I don't agree, I'm afraid. I've not come across a situation where tight optimisations are worthwhile. Perhaps if you want super accurate frequencies and have a load of compute to burn, but that's just my view.

Implemented bug #358.

Thanks! 😄

javialra97 · 2024-10-19T15:24:32Z

Hi @t-young31

Interesting. I wouldn't be that averse to this functionality, but can't think of a scenario where this would be useful. What did you have in mind?

I think the simplest case is when you have the resources to compute the full profile with a TZ/QZ basis set; in those cases a final single point is unnecessary. I have used it for benchmarking purposes: I want the geometries (with all the autodE checks) along the full profile so that I can compute single points with N different functionals.

On this I don't agree, I'm afraid. I've not come across a situation where tight optimisations are worthwhile. Perhaps if you want super accurate frequencies and have a load of compute to burn, but that's just my view.

Well, I would prefer it to use the tight criterion, but I couldn't configure it with ade.Config. I will give it another try.

Thanks! 😄

Thanks to you for autodE!

t-young31 · 2024-10-28T19:24:07Z

Sorry for the slow reply – work took over last week

I think the simplest case is when you have the resources to compute the full profile at a TZ/QZ basis set, in those cases a final single point is unnecessary.

Cool – sounds good 👍🏼

Well, I would prefer it to use the tight criterion, and I couldn't configure it with ade.Config, I will give it another try.

Hopefully the second try was successful! I think the following should work:

import autode as ade

ade.Config.G09.keywords.opt.remove('Opt')
ade.Config.G09.keywords.opt.append('Opt=Tight')
# likewise for ade.Config.G09.keywords.opt_ts
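
If the default keywords differ between versions, a guarded swap avoids an error when 'Opt' is not present. This is only a sketch on a plain Python list standing in for ade.Config.G09.keywords.opt; swap_keyword is a hypothetical helper, not part of autodE, and the example keyword list is an assumption.

```python
def swap_keyword(keywords, old, new):
    """Replace `old` by `new` in a list of keywords, in place.

    Safe to call when `old` is missing or `new` is already present.
    """
    if old in keywords:
        keywords.remove(old)
    if new not in keywords:
        keywords.append(new)
    return keywords


# Assumed stand-in for the Gaussian optimisation keywords
opt_keywords = ["PBE1PBE/Def2SVP", "Opt"]
swap_keyword(opt_keywords, "Opt", "Opt=Tight")
print(opt_keywords)  # ['PBE1PBE/Def2SVP', 'Opt=Tight']
```

Calling the helper a second time leaves the list unchanged, which is convenient when a script is re-run in the same session.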

To get this merged, would you mind:

Reverting the changes to _ConfigClass and Reaction.print_output
Merging origin/v1.4.5 into your branch
Updating the contributors list and the changelog

Thanks 😄

javialra97 · 2024-10-29T10:47:35Z

Hi,

I made the requested changes. I hope everything is fine now.

Thanks!

t-young31 · 2024-10-29T21:03:07Z

Awesome – thanks @javialra97 😄

It looks like there's a conflict on your branch, but it looks like an easy one to resolve.

javialra97 · 2024-10-30T08:45:24Z

Hi @t-young31,

I hope that is solved now 🙂

t-young31 · 2024-11-04T21:49:48Z

Looks like some Gaussian frequency calculation tests are failing due to allowing the Standard orientation coordinates to be extracted. I've not had time to dig into it, but I'm guessing it's a limitation of the frequency calculation from the Hessian, making it unstable when the principal axis lies along one of the Cartesian axes (I guess that's 'standard orientation'?). A quick fix might be to do

            if "Input orientation" in line or ("Standard orientation" in line and len(coords) == 0):

but we should fix this properly so that frequency calculation is okay irrespective of the orientation
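
To make the effect of that condition concrete, here is a minimal, self-contained sketch of a coordinate parser that uses it. It is not the autodE implementation: extract_coordinates and orientation_block are hypothetical helpers, the output is a simplified imitation of Gaussian's coordinate table, and the x-axis input geometry is assumed.

```python
def orientation_block(title, xyz_rows):
    """Build a simplified Gaussian-style coordinate table (illustrative only)."""
    rule = " " + "-" * 69
    lines = [
        f"{title}:",
        rule,
        " Center     Atomic      Atomic             Coordinates (Angstroms)",
        " Number     Number       Type             X           Y           Z",
        rule,
    ]
    for n, (x, y, z) in enumerate(xyz_rows, start=1):
        # Atomic number is hard-coded to 1 (hydrogen) for this toy example
        lines.append(f"{n:7d}{1:11d}{0:12d}    {x:12.6f}{y:12.6f}{z:12.6f}")
    lines.append(rule)
    return lines


def extract_coordinates(lines):
    """Prefer 'Input orientation'; use 'Standard orientation' only as a fallback."""
    coords = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if "Input orientation" in line or (
            "Standard orientation" in line and len(coords) == 0
        ):
            coords = []
            i += 5  # skip the title line and the four table-header lines
            while i < len(lines) and "-----" not in lines[i]:
                fields = lines[i].split()
                coords.append(tuple(float(v) for v in fields[3:6]))
                i += 1
        i += 1
    return coords


input_block = orientation_block(
    "Input orientation", [(0.3804, 0.0, 0.0), (-0.3804, 0.0, 0.0)]
)
standard_block = orientation_block(
    "Standard orientation", [(0.0, 0.0, 0.3804), (0.0, 0.0, -0.3804)]
)

both = extract_coordinates(input_block + standard_block)
only_standard = extract_coordinates(standard_block)
```

In this sketch, an output containing both tables yields the input-orientation coordinates, while an output with only a Standard orientation table falls back to those. One limitation worth noting: if an output printed several Standard orientation tables and no Input orientation table, this condition would keep the first geometry rather than the last.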

javialra97 · 2024-11-05T11:30:42Z

Hi @t-young31,

In the Standard orientation, Gaussian reorients the molecule so that the principal axis is the z-axis and the principal plane of symmetry lies in the yz-plane. In the first test with the H2 molecule, h2.hessian.atoms will look like Atoms(n_atoms=2, [Atom(H, 0.0000, 0.0000, 0.3804), Atom(H, 0.0000, 0.0000, -0.3804)]), i.e. along the z-axis and not along the x-axis as defined in the input.

As you said, the problem is in the Hessian: _tr_vecs() returns different vectors, and consequently _proj_mass_weighted has completely different values. Honestly, I can't help you any further at this point. I would suggest raising a warning. Do the thermal corrections depend on that, given that the extracted frequencies are fine?
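
One way to confirm that the two orientations describe the same molecule, differing only by a rigid rotation or translation, is to compare interatomic distances, which are invariant to both. A minimal sketch: the x-axis input coordinates are assumed from the description above, and pair_distances is a hypothetical helper.

```python
import math
from itertools import combinations


def pair_distances(coords):
    """Sorted list of all interatomic distances for a list of (x, y, z) tuples."""
    return sorted(math.dist(a, b) for a, b in combinations(coords, 2))


# Assumed input geometry: H2 placed along the x-axis
h2_input = [(0.3804, 0.0, 0.0), (-0.3804, 0.0, 0.0)]
# Standard orientation as quoted above: H2 along the z-axis
h2_standard = [(0.0, 0.0, 0.3804), (0.0, 0.0, -0.3804)]

same_geometry = all(
    math.isclose(d_in, d_std, abs_tol=1e-6)
    for d_in, d_std in zip(pair_distances(h2_input), pair_distances(h2_standard))
)
print(same_geometry)
```

Matching distances only show that the geometry itself is unchanged; they say nothing about whether the Hessian is expressed in the same frame as the extracted coordinates.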

t-young31 · 2024-11-05T11:46:30Z

Honestly, I can't help you any further at this point

Sorry you hit this 😞. Have created #362 to track outside of this PR

Do the thermal corrections depend on that, because the extracted frequencies are fine?

I'm not sure the frequencies are fine.. https://github.com/duartegroup/autodE/actions/runs/11589659415/job/32265741467#step:6:205

codecov · 2024-11-05T11:55:22Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.43%. Comparing base (8284a7c) to head (52aa45a).
Report is 1 commits behind head on v1.4.5.

Additional details and impacted files

@@           Coverage Diff           @@
##           v1.4.5     #359   +/-   ##
=======================================
  Coverage   97.43%   97.43%           
=======================================
  Files         204      204           
  Lines       23758    23759    +1     
=======================================
+ Hits        23148    23150    +2     
+ Misses        610      609    -1

Flag	Coverage Δ
unittests	`97.43% <100.00%> (+<0.01%)`	⬆️

javialra97 · 2024-11-05T12:32:38Z

I'm not sure the frequencies are fine.. https://github.com/duartegroup/autodE/actions/runs/11589659415/job/32265741467#step:6:205

Yes, that's true. I only checked the ones for H2, but the acetylene frequencies are wrong 😔

Maybe instead of taking the geometries from the standard orientation, we can take them directly from the input instructions, after charge and multiplicity, what do you think?

Is the error now also due to taking a different geometry?

t-young31 · 2024-11-05T13:37:15Z

Maybe instead of taking the geometries from the standard orientation, we can take them directly from the input instructions, after charge and multiplicity, what do you think?

We could, but then it'd still potentially be broken for Opt+Freq calculations that don't print the input orientation coordinates. I'll find a way to fix it properly, or at least try!

The error now is also because of taking a different geometry?

I don't think so. @shoubhikraj have you got any ideas? it's Windows+TRM so your world. Python 3.8 looks like it's just end of life, so shall we just drop support?
@javialra97 very happy to merge irrespective of that failing test, if you're happy 😄

javialra97 · 2024-11-05T14:07:17Z

We could, but then it'd still potentially be broken for Opt+Freq calculations that don't print the input orientation coordinates. I'll find a way to fix it properly, or at least try!

It seems that Gaussian is a little messy with its outputs. I checked my own outputs and the one you have in the tests, and for the Opt+Freq of the TS it does print the input orientation coordinates.

@javialra97 very happy to merge irrespective of that failing test, if you're happy 😄

Great! Glad to help a little bit.

shoubhikraj · 2024-11-05T22:15:26Z

@t-young31 Honestly I am not sure where the error is coming from... The hessian and gradient are loaded from stored matrices, and I have previously checked that it works. It does succeed with python 3.13, so maybe some issue with numpy or scipy versions. If it comes up later, then I might check in detail. I am fine with removing python 3.8 support, maybe in the next version - it was just made eol.

javialra97 added 4 commits October 17, 2024 18:39

tight criteria

675b82e

csv output

1f4c6e0

single point refinement bool

a68ae84

bug duartegroup#358

5dbd19b

t-young31 changed the base branch from master to v1.4.5 October 28, 2024 19:24

javialra97 and others added 5 commits October 29, 2024 09:37

Revert "csv output"

1821fd3

This reverts commit 1f4c6e0.

Revert "tight criteria"

e402660

This reverts commit 675b82e.

template

a192480

update changelog and readme

314ead0

parentheses

3be4931

Merge branch 'v1.4.5' into dataset_rxn_profiles

73ec55f

provisional patch

52aa45a

t-young31 mentioned this pull request Nov 5, 2024

Frequency calculations depend on molecule orientation #362

Open

t-young31 approved these changes Nov 5, 2024

t-young31 merged commit 140c60b into duartegroup:v1.4.5 Nov 5, 2024
13 of 14 checks passed

javialra97 deleted the dataset_rxn_profiles branch November 13, 2024 11:12
