Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor improvements for generation of dataset of reaction profiles. #359

Merged
merged 11 commits into from
Nov 5, 2024

Conversation

javialra97
Copy link

@javialra97 javialra97 commented Oct 18, 2024

  • Reading the output.csv as a data frame (with pandas) can be problematic because of the first line, removing the line you will have a proper data frame with well-structured columns. Also, that information is printed in the methods.txt file.
  • For specific purposes, it is not necessary a final single point, add it as a boolean (set to True by default). Sort of related to pull request add set_basis_set method #330.
  • The convergence criterion for the optimization with Gaussian should be set to Tight.
  • Implemented bug Extract coordinates failed #358.

Checklist

  • The changes include an associated explanation of how/why
  • Test pass
  • Documentation has been updated
  • Changelog has been updated

@t-young31
Copy link
Member

Thanks for the PR @javialra97

Reading the output.csv as a data frame (with pandas) can be problematic because of the first line, removing the line you will have a proper data frame with well-structured columns. Also, that information is printed in the methods.txt file.

This would be a breaking change, and pandas supports skipping rows natively

import pandas as pd
pd.read_csv("<filename>", skiprows=1)

so I'm not sure about that.

For specific purposes, it is not necessary a final single point, add it as a boolean

Interesting. I wouldn't be that adverse to this functionality, but can't think of a scenario where this would be useful. What did you have in mind?

The convergence criterion for the optimization with Gaussian should be set to Tight.

On this I don't agree I'm afraid. I've not come across a situation where tight optimisations are worthwhile. Maybe if you're wanting super accurate frequencies and you've got a load of compute to burn then maybe, in my view.

Implemented bug #358.

Thanks! 😄

@javialra97
Copy link
Author

Hi @t-young31

Interesting. I wouldn't be that adverse to this functionality, but can't think of a scenario where this would be useful. What did you have in mind?

I think the simplest case is when you have the resources to compute the full profile at a TZ/QZ basis set, in those cases a final single point is unnecessary. I have used it for benchmarking purposes, I want the geometries (with all the autodE checks) along the full profile to compute single points with N different functionals.

On this I don't agree I'm afraid. I've not come across a situation where tight optimisations are worthwhile. Maybe if you're wanting super accurate frequencies and you've got a load of compute to burn then maybe, in my view.

Well, I would prefer it to use the tight criterion, and I couldn't configure it with ade.Config, I will give it another try.

Thanks! 😄

Thanks to you for autodE!

@t-young31
Copy link
Member

Sorry for the slow reply – work took over last week

I think the simplest case is when you have the resources to compute the full profile at a TZ/QZ basis set, in those cases a final single point is unnecessary.

Cool – sounds good 👍🏼

Well, I would prefer it to use the tight criterion, and I couldn't configure it with ade.Config, I will give it another try.

Hopefully the second try was successful! The following I think should work

import autode as ade

ade.Config.G09.keywords.opt.remove('Opt')
ade.Config.G09.keywords.opt.append('Opt=Tight')
# likewise for ade.Config.G09.keywords.opt_ts

To get this merged would you mind

Thanks 😄

@t-young31 t-young31 changed the base branch from master to v1.4.5 October 28, 2024 19:24
@javialra97
Copy link
Author

Hi,

I made the pertinent changes. Hope everything is fine

Thanks!

@t-young31
Copy link
Member

Awesome – thanks @javialra97 😄

It looks like there's a conflict on your branch, but looks like an easy one to resolve

@javialra97
Copy link
Author

Hi @t-young31,

I hope that is solved now 🙂

@t-young31
Copy link
Member

Looks like some Guassian frequency calculation tests are failing due to allowing the Standard orientation coordinates to be extracted. I've not had time to dig into it but I'm guessing it's a limitation of the frequency calculation from the Hessian, making it unstable when the principal axis lies along one of the cartesian axes (I guess that's 'standard orientation'?). A quick fix might be to do

            if "Input orientation" in line or ("Standard orientation" in line and len(coords) == 0):

but we should fix this properly so that frequency calculation is okay irrespective of the orientation

@javialra97
Copy link
Author

javialra97 commented Nov 5, 2024

Hi @t-young31,

In the Standard orientation, Gaussian reorders the molecule so the principal axis will be the z-axis and the principal plane of symmetry is located on the yz-plane. In the first test with the H2 molecule, the h2.hessian.atoms will look like as Atoms(n_atoms=2, [Atom(H, 0.0000, 0.0000, 0.3804), Atom(H, 0.0000, 0.0000, -0.3804)]) and not in the x-axis as is defined in the input.

As you said, the problem is in the hessian, the output of _tr_vecs() returns different vectors and consequently the _proj_mass_weighted has completely different values. Honestly, I can't help you any further at this point. I would suggest to rise a warning. Do the thermal corrections depend on that, because the extracted frequencies are fine?

@t-young31
Copy link
Member

Honestly, I can't help you any further at this point

Sorry you hit this 😞. Have created #362 to track outside of this PR

Do the thermal corrections depend on that, because the extracted frequencies are fine?

I'm not sure the frequencies are fine.. https://github.com/duartegroup/autodE/actions/runs/11589659415/job/32265741467#step:6:205

Copy link

codecov bot commented Nov 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.43%. Comparing base (8284a7c) to head (52aa45a).
Report is 1 commits behind head on v1.4.5.

Additional details and impacted files
@@           Coverage Diff           @@
##           v1.4.5     #359   +/-   ##
=======================================
  Coverage   97.43%   97.43%           
=======================================
  Files         204      204           
  Lines       23758    23759    +1     
=======================================
+ Hits        23148    23150    +2     
+ Misses        610      609    -1     
Flag Coverage Δ
unittests 97.43% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@javialra97
Copy link
Author

javialra97 commented Nov 5, 2024

I'm not sure the frequencies are fine.. https://github.com/duartegroup/autodE/actions/runs/11589659415/job/32265741467#step:6:205

Yes, is true, I only checked the one for H2 but the acetylene frequencies are wrong 😔

Maybe instead of taking the geometries from the standard orientation, we can take them directly from the input instructions, after charge and multiplicity, what do you think?

The error now is also because of taking a different geometry?

@t-young31
Copy link
Member

Maybe instead of taking the geometries from the standard orientation, we can take them directly from the input instructions, after charge and multiplicity, what do you think?

We could, but then it'd still potentially be broken for Opt+Freq calculations that don't print the input orientation coordinates. I'll find a way to fix it properly, or at least try!

The error now is also because of taking a different geometry?

I don't think so. @shoubhikraj have you got any ideas? it's Windows+TRM so your world. Python 3.8 looks like it's just end of life, so shall we just drop support?
@javialra97 very happy to merge irrespective of that failing test, if you're happy 😄

@javialra97
Copy link
Author

javialra97 commented Nov 5, 2024

We could, but then it'd still potentially be broken for Opt+Freq calculations that don't print the input orientation coordinates. I'll find a way to fix it properly, or at least try!

It seems that Gaussian is a little messy with the outputs. But I was checking my outputs and the one that you have in test and for the Opt+Freq of the TS it prints it

@javialra97 very happy to merge irrespective of that failing test, if you're happy 😄

Great! Glad to help a little bit.

@t-young31 t-young31 merged commit 140c60b into duartegroup:v1.4.5 Nov 5, 2024
13 of 14 checks passed
@shoubhikraj
Copy link
Collaborator

@t-young31 Honestly I am not sure where the error is coming from... The hessian and gradient are loaded from stored matrices, and I have previously checked that it works. It does succeed with python 3.13, so maybe some issue with numpy or scipy versions. If it comes up later, then I might check in detail. I am fine with removing python 3.8 support, maybe in the next version - it was just made eol.

@javialra97 javialra97 deleted the dataset_rxn_profiles branch November 13, 2024 11:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants