Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StructureData kind names are not preserved when getting pymatgen structure #1028

Closed
sphuber opened this issue Dec 22, 2017 · 6 comments
Closed

Comments

@sphuber
Copy link
Contributor

sphuber commented Dec 22, 2017

Courtesy of @ChRicca
If a StructureData specifies sites with kinds that have suffixes to the elemental type, for example Ni1 and Ni2, when getting the ASE or pymatgen structure, the kind names are not preserved. If possible and supported by these other libraries, the original kind names should be kept, such that no information is lossed when toggling between data structures.

@giovannipizzi
Copy link
Member

This is not accurate.
When possible, kind names are preserved by AiiDA.

E.g. when running this

StructureData = DataFactory('structure')
struct = StructureData(cell=[[4,0,0],[0,4,0],[0,0,4]])
struct.append_atom(position=(0,0,0), symbols='Ni', name='Ni1')
struct.append_atom(position=(2,2,2), symbols='Ni', name='Ni2')
asestruct = struct.get_ase()
struct2 = StructureData(ase=asestruct)

the output of struct.sites is

[<Site: kind name 'Ni1' @ 0.0,0.0,0.0>, <Site: kind name 'Ni2' @ 2.0,2.0,2.0>]

and the output of struct2.sites is

 [<Site: kind name 'Ni1' @ 0.0,0.0,0.0>, <Site: kind name 'Ni2' @ 2.0,2.0,2.0>]

so the kind names are preserved, when assigned in this way (and the masses are the same, and the kind name is the element name followed by an integer, and probably a few more requirements...).
The way this is achieved is by using the Atom.tags integer of ASE to assign. The reason this cannot work with any random choice of kind names is due to the way ASE stores information about kinds (at least for what I understand), see also below.

I think the only proper way to do it is, when exporting, to use the new functions that me and @sphuber implemented here https://github.com/aiidateam/aiida_core/blob/develop/aiida/tools/data/structure/__init__.py
spglib_tuple_to_structure and structure_to_spglib_tuple. (or wrappers around this)
In practice, the function assigns a different integer to each kind (giving it the Z number when possible, otherwise a very high number) and keeps track in additional variables of the mapping integer<->kind. I think the additional variables are needed and need to be used by people wanting to use external libraries and then go back to AiiDA.

The issue is that it is not obvious how to properly deal with this for each external library. E.g. AFAIK in ASE the only additional thing one can use is an integer (the 'tag'). AiiDA tries to use it so that at least in simple cases one can do a round trip and obtain back the same kinds, but unfortunately this cannot work in all cases because of the different way things are stored (in AiiDA, there are kinds with a label and, e.g., a mass; in ASE there are atoms, each of them has a tag and an mass; and there is no easy way that always works to map one on the other).

If there is a specific example that we can try, where the conversion does not work as expected, and also the code that shows what instead it would be the expected behaviour (or how to obtain it with a specific external library), please post it here for discussion and to then include it in AiiDA.

This is also very related to the discussion to #853

@giovannipizzi
Copy link
Member

I would suggest that either we get a specific example that does not work from Chiara, or we close the issue for the time being.

@ChRicca
Copy link

ChRicca commented Jan 10, 2018

The problem with ASE arises only if you do not use numbers but letters to distinguish the sites (Nia instead of Ni1).
Maybe not many people use letters, but sometimes it is necessary for me since I have several kinds and QE allows you to write only 3 characters when you specify the ATOMIC_SPECIES: so I either use letters (Mna instead of Mn10) or I have to use only one letter to indicate the element and two digits for the number (M10 instead of Mn10). Either using Mna or M10 in the conversion form AiiDA-ASE-AiiDA I lose the kinds that I had defined, as you can see in these examples:

StructureData = DataFactory('structure')
struct = StructureData(cell=[[4,0,0],[0,4,0],[0,0,4]])
struct.append_atom(position=(0,0,0), symbols='Ni', name='Nia')
struct.append_atom(position=(2,2,2), symbols='Ni', name='Nib')
asestruct = struct.get_ase()
struct2 = StructureData(ase=asestruct)
print struct2.sites

The output is:
[<Site: kind name 'Ni1' @ 0.0,0.0,0.0>, <Site: kind name 'Ni2' @ 2.0,2.0,2.0>]

StructureData = DataFactory('structure')
struct = StructureData(cell=[[4,0,0],[0,4,0],[0,0,4]])
struct.append_atom(position=(0,0,0), symbols='Ni', name='A1')
struct.append_atom(position=(2,2,2), symbols='Ni', name='A2')
asestruct = struct.get_ase()
struct2 = StructureData(ase=asestruct)
print struct2.sites

The output is:
[<Site: kind name 'Ni1' @ 0.0,0.0,0.0>, <Site: kind name 'Ni2' @ 2.0,2.0,2.0>]

The problem is always there with pymatgen:

StructureData = DataFactory('structure')
struct = StructureData(cell=[[4,0,0],[0,4,0],[0,0,4]])
struct.append_atom(position=(0,0,0), symbols='Ni', name='Ni1')
struct.append_atom(position=(2,2,2), symbols='Ni', name='Ni2')
asestruct = struct.get_pymatgen()
struct2 = StructureData(pymatgen=asestruct)
print struct2.sites

Output:
<Site: kind name 'Ni' @ 0.0,0.0,0.0>, <Site: kind name 'Ni' @ 2.0,2.0,2.0>]

@giovannipizzi
Copy link
Member

Indeed, the issue is there (and should be fixed) for pymatgen.
For ASE, while I understand what you want to do and why, I don't know how the name info could be saved in ASE. If you have more experience than me and have an idea, please let me know.

@giovannipizzi giovannipizzi changed the title StructureData kind names are not preserved when getting ASE or pymatgen structure StructureData kind names are not preserved when getting pymatgen structure Jan 15, 2018
@nmounet
Copy link

nmounet commented Mar 14, 2018

For pymatgen, I have a fix (PR to come soon).
Also, more generally, we seem to miss a number of "roundtrip" tests (via ASE & pymatgen, in particular).

@nmounet
Copy link

nmounet commented Mar 15, 2018

Issue is fixed by #1285

@nmounet nmounet closed this as completed Mar 15, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants