Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit ChemicalSpecies #115

Open
cortner opened this issue Sep 3, 2024 · 4 comments
Open

Revisit ChemicalSpecies #115

cortner opened this issue Sep 3, 2024 · 4 comments

Comments

@cortner
Copy link
Member

cortner commented Sep 3, 2024

(cf discussion in #114)

The ChemicalSpecies implementation has some problems, partly due to my lack of understanding of chemistry. Basically, the issue is that we now require the number of neutrons to be specified. But the correct interpretation when this is not done is that one uses the average number of neutrons (taken over all isotopes). I believe there are two relatively straightforward solutions to this issue:

(1) Allow n_neutrons = -1 as this default with unspecified n_neutrons

(2) Introduce a new type e.g. Element that only specifies the atomic number and nothing else.

I would be grateful for others' thoughts.

CC @rkurchin @mfherbst @jgreener64 @tjjarvinen

@tjjarvinen
Copy link
Collaborator

tjjarvinen commented Sep 4, 2024

I propose a small extension to the current design. It would be a non-breaking one.

If following can be made happen, then everything that is needed for chemistry would be implemented.

cs1 = ChemicalSpecies(6)  # Carbon atom
cs2 = ChemicalSpecies(:C)  # equal to above
cs3 = ChemicalSpecies(6; n_nucleons=13)
cs4 = ChemicalSpecies(6; n_nucleons=12)
cs5 = ChemicalSpecies(6; atom_name=:Cah)
cs6 = ChemicalSpecies(6; n_nucleons=13, atom_name=:Cah)
cs7 = ChemicalSpecies(6; n_nucleons=12, atom_name=:Cah)

ChemicalSpecies(:Cah) # error

cs1 == cs2 # true
cs1 == cs3 # true
cs1 == cs4 # true
cs1 == cs5 # false
cs3 == cs4 # false
cs3 == cs5 # true
cs3 == cs6 # true
cs4 == cs5 # true
cs4 == cs6 # true
cs4 == cs7 # false
cs5 == cs6 # true
cs5 == cs7 # true
cs6 == cs7 # false

atomic_symbol( cs ) # return :C for all above cases 

atomic_name( cs1 ) # return :C
atomic_name( cs2 ) # return :C
atomic_name( cs3 ) # return :C
atomic_name( cs4 ) # return :C
atomic_name( cs5 ) # return :Cah
atomic_name( cs6 ) # return :Cah
atomic_name( cs7 ) # return :Cah

Reasoning

PDB file format defines that there is atom name (columns 13-16) and element (columns 77-78) identified for each atom. Element is based on proton number, but atom name can be whatever, with a restriction that it is max 4 ASCII characters.

It is a common practice e.g., to name different carbon atoms based on their environment or equivalence under symmetry. You could e.g., have carbon atoms Cah and Cbh that are named based on different symmetry, or e.g. Csp2 and Csp3 carbons based on hybridization.

To get to this style of an usage, we would only need to use info as an atom name. info is UInt32, so it can be reinterpret as SVector{4, UInt8} that represents 4 ASCII characters.

Implementation

Implement atomic_name function that defaults to atomic_symbol. But for ChemicalElement it would give info, when it differs from zero(UInt32)

atomic_name(at) = atomic_symbol(at)

function atomic_name(cs::ChemicalSpecies)
   if cs.info == 0
       return atomic_symbol(cs)
   else
       # filter first empty space characters
       as_characters = Char.( reinterpret(SVector{4, UInt8}, cs.info) )
       tmp = String( filter( x -> ! isspace(x), as_characters ) )
       return Symbol( tmp )
   end
end

Comparison logic can be implemented with

function isequal(cs1::ChemicalSpecies, cs2::ChemicalSpecies)
    if cs1.atomic_number != cs2.atom_number
        return false
    elseif (cs1.nneut >= 0 && cs2.nneut >= 0) && cs1.nneut != cs2.nneut
        return false
    elseif (cs1.info != 0 && cs2 != 0) && cs1.info != cs2.info
        return false
    else
        return true
    end
end

Comments

This implementation should work for almost all of the cases. For those few cases that it does not, it might be a good idea to implement a generic Element structure that has fields atomic_number::Uint16 and atomic_symbol::Symbol with isequal implemented as

function isequal(sp1::AbstractSpecies, sp2::AbstractSpecies)
      if atomic_number(sp1) == atomic_number(sp2) && atomic_name(sp1) == atomic_name(sp2)
          return true
      else
          return false
      end
end

and naturally both Element and ChemicalSpecies would be subtypes of AbstractSpecies.

This kind of an implementation is also needed to make ExtXYZ to AB v0.4 compatible. Currently ExtXYZ has its own AbstractSystem implementation. But it has several issues on its own and with this we could just use FlexibleSystem directly with ExtXYZ and make the implementation a lot more straight forward.

@jgreener64
Copy link
Collaborator

I'm less familiar how this is done in quantum simulations but in molecular mechanics each particle typically has an atom name, which maps to an element, and a total mass, which is the average over all isotopes. As long as there are functions to return these I don't have strong opinions on the implementation under the hood.

We should consider whether we need to deal with neutrons at all though. There could be a mass kwarg to ChemicalSpecies that defaults to the average isotopic mass, then ChemicalSpecies would have to have all properties in common to be equal.

@cortner
Copy link
Member Author

cortner commented Sep 4, 2024

Teemu's suggestions sound good. Some variation of those should be ok. In general, I am nervous about introducing a second species type. But I'm equally nervous about the complexity of ChemicalSpecies. I suggest to not rush this.

@rkurchin
Copy link
Collaborator

rkurchin commented Sep 5, 2024

In DFT at least, we don't generally care about neutrons at all, so I have less of a direct horse in this race. That being said, I'm fine with either of the suggestions made above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants