Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Molecular Basis #175

Closed
wants to merge 14 commits into from
Closed

Molecular Basis #175

wants to merge 14 commits into from

Conversation

Ali-Tehrani
Copy link
Contributor

@Ali-Tehrani Ali-Tehrani commented Aug 5, 2020

This pull request is regarding issue (#146).

Added

  • Conversion to decontracted molecular basis (c97cad3)
  • Ability to generally convert a vector between two basis sets, given that one knows the overlap matrices (6f3f76f)
  • Method in MolecularBasis to convert all primitive shells in a contracted shell to a specific kind (either Cartesian or Pure) (345f190).
    • Ran into the problem of iodata assuming all coefficients in a primitive shell are the same. However, conversion to a different basis varys the coefficients across a primitive shell. This is fixed by averaging them.
  • General function to convert a kind of a single primitive shell to Cartesian or pure (345f190) .
  • Update the code document with definitions, and equations (d956cfa, 8eb2ea7).

Copy link
Member

@tovrstra tovrstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ali-Tehrani Thanks for this pull request and I'd like to apologize in advance for the extensive comments below. It may be a bit overwhelming, but this is common for larger PRs. I hope you don't mind and we're really happy that you made this effort, regardless of all the comments.

Most comments below are small and should be easy to fix.

My main concern is that convert_vector_basis is impractical and too general for IOData. (Previously, I thought this approach would be fine, but I'm no longer convinced it is.) The transformations we essentially need are not projections and can be constructed in a much faster and (most importantly) numerically stable fashion, without requiring any matrix (pseudo) inverse. (We only need decontraction and conversion from pure to Cartesian, which do not cause any loss of information.)

(I did not review the unit tests yet.)

def ncon(self) -> int: # noqa: D401
"""Number of contractions. This is usually 1; e.g., it would be 2 for an SP shell."""
def ncon(self) -> int: # noqa: D401
"""Number of contractions with distinct angular momentum numbers.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be updated to correctly reflect all possible scenarios. Some basis sets also use generalized contractions in which the same angular momentum is occurring several times, e.g. ANO basis sets or the ones used in CP2K.

iodata/basis.py Outdated
def nprim(self) -> int: # noqa: D401
"""Number of primitives, also known as the contraction length."""
def nprim(self) -> int: # noqa: D401
"""Number of primitive contracted shells, also known as the contraction length."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The formulation is confusing because primitives are not contracted and this seems to contradict it. I guess you wanted to clarify this. Would the following work?

Suggested change
"""Number of primitive contracted shells, also known as the contraction length."""
"""Number of primitives in the contraction, also known as the contraction length."""

or

Suggested change
"""Number of primitive contracted shells, also known as the contraction length."""
"""Number of primitives in the contracted shells, also known as the contraction length."""

I slightly prefer the former for its brevety.

iodata/basis.py Outdated
@@ -181,6 +227,29 @@ class MolecularBasis(NamedTuple):
primitive_normalization
The normalization convention of primitives, which can be 'L2' (orbitals) or 'L1'
(densities) normalized.
Methods
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: indentation and blank line

Suggested change
Methods
Methods

iodata/basis.py Outdated
Comment on lines 147 to 148
- Primitive contracted shell with exponent :math:`\alpha` is the set of all contractions
in a contracted shell that have the exponent :math:`\alpha`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ali-Tehrani @PaulWAyers I'm a bit confused by the terminology. A primitive is not contracted, so I'm not sure what the term "primitive contracted shell" is referring to. It appears in several places. Could you clarify this? I tried to suggest some corrections in other comments to fix this.

iodata/basis.py Outdated
Comment on lines 268 to 269
Segmented Molecular basis is a Molecular basis where all contracted shell in it share
a specific total angular momentum number.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit confusing. Maybe it is a matter of terminology and conventions? Normally in a segmented basis one just does not attempt to describe how a series of exponents is shared by multiple contractions. Every contraction is listed separately, also when they have the same angular momentum and/or exponents.

iodata/basis.py Outdated
)
)
# pylint: disable=no-member
return self._replace(shells=shells)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here, it would be useful to return a transformation matrix, which can be used to update MO coefficients, similar to the comment above. (This matrix would be the place where I'd expect the transformation matrices be used.)

@@ -159,3 +161,64 @@ def gob_cart_normalization(alpha: np.ndarray, n: np.ndarray) -> np.ndarray:
vfac2 = np.vectorize(factorialk)
return np.sqrt((4 * alpha)**sum(n) * (2 * alpha / np.pi)**1.5
/ np.prod(vfac2(2 * n - 1, 2)))


def convert_vector_basis(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that this is a general way of transforming MO coefficients from one basis to another, but it is overkill (and not always numerically stable) for the type of basis set modifications in this PR. Moreover, it assumes that the type of overlap matrix needed (between basis1 and basis2) can be computed, which is not the case (yet) with IOData alone.

coeffs
The array containing the coefficients of the normalized primitives in each contraction;
shape = (nprim, ncon).
These coefficients assume that the primitives are L2 (orbitals) or L1
(densities) normalized, but contractions are not necessarily normalized.
(This depends on the code which generated the contractions.)

Notes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot overlap between this section and the documentation in https://github.com/theochem/iodata/blob/master/doc/basis.rst
Could you refer to this page instead of adding this Notes section? It is impractical to maintain the same documentation in different places. A link can be created as follows:

Basis set conventions and terminology are documented in :ref:`basis_conventions`.

If something needs to be updated in basis.rst, feel free to improve it.

def ncon(self) -> int: # noqa: D401
"""Number of contractions with distinct angular momentum numbers.

This is usually 1; e.g., it would be 2 for an SP shell.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be helpful to refer to the term "segmented" in case this is 1.

convert_kind
Return molecular basis object whose kinds are the same type.

Notes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above. I'd suggest to refer to the basis set documentation page instead of adding notes here. (It might be useful to add a glossary to that document.)

@tovrstra
Copy link
Member

Something else we need to keep in mind is that this PR overlaps heavily with an earlier one, which hasn't been merged yet: #157 . I'd like to merge them in chronological order to keep it manageable.

@PaulWAyers
Copy link
Member

@tovrstra I'm sort of interested in maintaining both Cartesian-to-pure and pure-to-Cartesian, as well as contracted-to-primitive and primitive-to-contracted. The reason has mostly to do with the ability to take a *.wfn or *.wfx that is actually, in fact, from a contracted spherical calculations and reexpress it in the more robust form. I wasn't sure how to do this without the projection, which is admittedly a more general trick that could be used for converting between different basis sets (e.g., 3-21G and cc-pVQZ) also.

I'm not sure that the pseudoinverse would be ill-conditioned. Do you have an example? I feel like in cases where things are explicit everything should be perfect (to the numerical precision of the underlying overlaps, which admittedly is not as high as the explicit coefficient matrix, but is presumably still more digits than are normally reported in basis-set specifications. Certainly this method is slower but I felt like the important feature here was that this existed, not that it was blazing fast. I couldn't think of many cases where one would do a basis-set-conversion and then the subsequent post-processing would be even faster, but admittedly now we do have an O(n^3) (in the pseudoinverse), although if this were ever too slow we could use sparse-matrix-utilities in scipy.sparse to speed it up.

@tovrstra
Copy link
Member

@PaulWAyers Regarding the wfn and wfx, is the idea to be able to reconstruct the orbitals in more amenable basis?

@PaulWAyers
Copy link
Member

Exactly. It would be nice if we could read in the *.wfn and *.wfx and, to the extent possible, reconstruct the true computational basis. That would speed subsequent calculations appreciably.

I agree that this is a lot of code to put in IOData. I'm not sure where else to put it, though; it doesn't really belong in any module, as the main utility of basis-set-changing is when post-processing one calculation as a pre-processing step for a subsequent calculation. It might be that it belongs in HORTON3 proper, though I'm not sure how the workflow would be: probably IOData to gbasis to IOData. Given that the feature of gbasis that is being used is already (redundantly, for the sake of modularity) in IOData, putting this basis-set-conversion functionality in IOData seems semi-sensible (but, I agree, not perfectly so).

Added the ability to convert the coefficients of a vector inside
one basis to another basis of possible different size.
The method "convert_kind" inside class MolecularBasis is added
that converts all kind of a shell to either Cartesian or Pure.
Note that the coefficients are averaged, since iodata assumes
coefficients are the same throughtout a Primitive Shell.

The function "convert_primitive_kind" allows the ability
to convert one primitive shell to another.
Added
- COntracted Shell instead of Shell.
- Primitive here means Primitive Shell.
- Notes section detailing the definition and formulas.
Added
- Emphasis on contracted shell.
- Method section detailing the methods
- Notes including the definitions of different types of molecular
basis.
Based on Toon's suggestion.
Toon's suggestion to refer this to the doc instead.
@tovrstra
Copy link
Member

tovrstra commented Apr 3, 2021

I'm closing this PR because several aspects need more discussion before the implementation and because too many different issues and features are being addressed in one PR, making it hard to fix all problems before merging. I've opened several new issues related to this PR, making references to the code here when relevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants