-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Molecular Basis #175
Molecular Basis #175
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ali-Tehrani Thanks for this pull request and I'd like to apologize in advance for the extensive comments below. It may be a bit overwhelming, but this is common for larger PRs. I hope you don't mind and we're really happy that you made this effort, regardless of all the comments.
Most comments below are small and should be easy to fix.
My main concern is that convert_vector_basis
is impractical and too general for IOData. (Previously, I thought this approach would be fine, but I'm no longer convinced it is.) The transformations we essentially need are not projections and can be constructed in a much faster and (most importantly) numerically stable fashion, without requiring any matrix (pseudo) inverse. (We only need decontraction and conversion from pure to Cartesian, which do not cause any loss of information.)
(I did not review the unit tests yet.)
def ncon(self) -> int: # noqa: D401 | ||
"""Number of contractions. This is usually 1; e.g., it would be 2 for an SP shell.""" | ||
def ncon(self) -> int: # noqa: D401 | ||
"""Number of contractions with distinct angular momentum numbers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be updated to correctly reflect all possible scenarios. Some basis sets also use generalized contractions in which the same angular momentum is occurring several times, e.g. ANO basis sets or the ones used in CP2K.
iodata/basis.py
Outdated
def nprim(self) -> int: # noqa: D401 | ||
"""Number of primitives, also known as the contraction length.""" | ||
def nprim(self) -> int: # noqa: D401 | ||
"""Number of primitive contracted shells, also known as the contraction length.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The formulation is confusing because primitives are not contracted and this seems to contradict it. I guess you wanted to clarify this. Would the following work?
"""Number of primitive contracted shells, also known as the contraction length.""" | |
"""Number of primitives in the contraction, also known as the contraction length.""" |
or
"""Number of primitive contracted shells, also known as the contraction length.""" | |
"""Number of primitives in the contracted shells, also known as the contraction length.""" |
I slightly prefer the former for its brevety.
iodata/basis.py
Outdated
@@ -181,6 +227,29 @@ class MolecularBasis(NamedTuple): | |||
primitive_normalization | |||
The normalization convention of primitives, which can be 'L2' (orbitals) or 'L1' | |||
(densities) normalized. | |||
Methods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: indentation and blank line
Methods | |
Methods |
iodata/basis.py
Outdated
- Primitive contracted shell with exponent :math:`\alpha` is the set of all contractions | ||
in a contracted shell that have the exponent :math:`\alpha`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ali-Tehrani @PaulWAyers I'm a bit confused by the terminology. A primitive is not contracted, so I'm not sure what the term "primitive contracted shell" is referring to. It appears in several places. Could you clarify this? I tried to suggest some corrections in other comments to fix this.
iodata/basis.py
Outdated
Segmented Molecular basis is a Molecular basis where all contracted shell in it share | ||
a specific total angular momentum number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit confusing. Maybe it is a matter of terminology and conventions? Normally in a segmented basis one just does not attempt to describe how a series of exponents is shared by multiple contractions. Every contraction is listed separately, also when they have the same angular momentum and/or exponents.
iodata/basis.py
Outdated
) | ||
) | ||
# pylint: disable=no-member | ||
return self._replace(shells=shells) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here, it would be useful to return a transformation matrix, which can be used to update MO coefficients, similar to the comment above. (This matrix would be the place where I'd expect the transformation matrices be used.)
@@ -159,3 +161,64 @@ def gob_cart_normalization(alpha: np.ndarray, n: np.ndarray) -> np.ndarray: | |||
vfac2 = np.vectorize(factorialk) | |||
return np.sqrt((4 * alpha)**sum(n) * (2 * alpha / np.pi)**1.5 | |||
/ np.prod(vfac2(2 * n - 1, 2))) | |||
|
|||
|
|||
def convert_vector_basis( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that this is a general way of transforming MO coefficients from one basis to another, but it is overkill (and not always numerically stable) for the type of basis set modifications in this PR. Moreover, it assumes that the type of overlap matrix needed (between basis1 and basis2) can be computed, which is not the case (yet) with IOData alone.
coeffs | ||
The array containing the coefficients of the normalized primitives in each contraction; | ||
shape = (nprim, ncon). | ||
These coefficients assume that the primitives are L2 (orbitals) or L1 | ||
(densities) normalized, but contractions are not necessarily normalized. | ||
(This depends on the code which generated the contractions.) | ||
|
||
Notes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a lot overlap between this section and the documentation in https://github.com/theochem/iodata/blob/master/doc/basis.rst
Could you refer to this page instead of adding this Notes section? It is impractical to maintain the same documentation in different places. A link can be created as follows:
Basis set conventions and terminology are documented in :ref:`basis_conventions`.
If something needs to be updated in basis.rst
, feel free to improve it.
def ncon(self) -> int: # noqa: D401 | ||
"""Number of contractions with distinct angular momentum numbers. | ||
|
||
This is usually 1; e.g., it would be 2 for an SP shell. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be helpful to refer to the term "segmented" in case this is 1.
convert_kind | ||
Return molecular basis object whose kinds are the same type. | ||
|
||
Notes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above. I'd suggest to refer to the basis set documentation page instead of adding notes here. (It might be useful to add a glossary to that document.)
Something else we need to keep in mind is that this PR overlaps heavily with an earlier one, which hasn't been merged yet: #157 . I'd like to merge them in chronological order to keep it manageable. |
@tovrstra I'm sort of interested in maintaining both Cartesian-to-pure and pure-to-Cartesian, as well as contracted-to-primitive and primitive-to-contracted. The reason has mostly to do with the ability to take a *.wfn or *.wfx that is actually, in fact, from a contracted spherical calculations and reexpress it in the more robust form. I wasn't sure how to do this without the projection, which is admittedly a more general trick that could be used for converting between different basis sets (e.g., 3-21G and cc-pVQZ) also. I'm not sure that the pseudoinverse would be ill-conditioned. Do you have an example? I feel like in cases where things are explicit everything should be perfect (to the numerical precision of the underlying overlaps, which admittedly is not as high as the explicit coefficient matrix, but is presumably still more digits than are normally reported in basis-set specifications. Certainly this method is slower but I felt like the important feature here was that this existed, not that it was blazing fast. I couldn't think of many cases where one would do a basis-set-conversion and then the subsequent post-processing would be even faster, but admittedly now we do have an O(n^3) (in the pseudoinverse), although if this were ever too slow we could use sparse-matrix-utilities in scipy.sparse to speed it up. |
@PaulWAyers Regarding the wfn and wfx, is the idea to be able to reconstruct the orbitals in more amenable basis? |
Exactly. It would be nice if we could read in the *.wfn and *.wfx and, to the extent possible, reconstruct the true computational basis. That would speed subsequent calculations appreciably. I agree that this is a lot of code to put in |
Added the ability to convert the coefficients of a vector inside one basis to another basis of possible different size.
The method "convert_kind" inside class MolecularBasis is added that converts all kind of a shell to either Cartesian or Pure. Note that the coefficients are averaged, since iodata assumes coefficients are the same throughtout a Primitive Shell. The function "convert_primitive_kind" allows the ability to convert one primitive shell to another.
Added - COntracted Shell instead of Shell. - Primitive here means Primitive Shell. - Notes section detailing the definition and formulas.
Added - Emphasis on contracted shell. - Method section detailing the methods - Notes including the definitions of different types of molecular basis.
Based on Toon's suggestion.
Toon's suggestion to refer this to the doc instead.
I'm closing this PR because several aspects need more discussion before the implementation and because too many different issues and features are being addressed in one PR, making it hard to fix all problems before merging. I've opened several new issues related to this PR, making references to the code here when relevant. |
This pull request is regarding issue (#146).
Added
MolecularBasis
to convert all primitive shells in a contracted shell to a specific kind (either Cartesian or Pure) (345f190).