Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defect generation updates #133

Merged
merged 23 commits into from
Aug 13, 2023

Conversation

kavanase
Copy link
Contributor

Hi @jmmshn!
These are some updates to the defect generation code:

  • Adds an optional equivalent_sites parameter to Defect objects. This is automatically computed by the defect generators anyway and is useful info (particularly for interstitials), so these have been updated to return Defects with this parameter set accordingly.
  • Optional target_frac_coords parameter to get_supercell_structure which allows the user to specify a 'target position' for the defect site in the generated supercell, so that it generates the defect supercell with the defect site at the closest possible/equivalent position to this site in the supercell.
  • Added the return_site optional parameter to get_supercell_structure, to also return the site of the defect in the generated supercell (useful info to have for a lot of our workflows!)
  • Use the oxi_state of the substituting specie in the host structure, if present, when guessing the oxi state of substitutional defects – i.e. antisite defects in this case.
  • Automatically bump max_cell_range to 2 in TopographyAnalyzer if num_atoms < 5, for accurate/complete Voronoi tessellation (small unit cells anyway so negligible cost!). – I noticed that for some small unit cells it was either missing out on Voronoi sites and/or getting the multiplicities wrong because of max_cell_range being too low for these cases. Bumping it up to 2 fixes it, and should have negligible additional cost because these are only for small/high-symmetry unit cells anyway.

…rough to `InterstitialGenerator` superclass (bugfix)
…_coords` parameter to `get_supercell_structure` to allow choice of generated defect site in supercell
…num_atoms < 5`, for accurate/complete Voronoi tessellation (small unit cells anyway so negligible cost!)
…nt (when guessing oxi state of substitution defect) – i.e. antisite defects!
# Conflicts:
#	pymatgen/analysis/defects/generators.py
@codecov
Copy link

codecov bot commented Jul 14, 2023

Codecov Report

Patch coverage: 80.00% and project coverage change: -0.16% ⚠️

Comparison is base (f1c9f72) 91.29% compared to head (2ee22ee) 91.13%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #133      +/-   ##
==========================================
- Coverage   91.29%   91.13%   -0.16%     
==========================================
  Files          11       11              
  Lines        1952     1985      +33     
  Branches      364      377      +13     
==========================================
+ Hits         1782     1809      +27     
- Misses         91       94       +3     
- Partials       79       82       +3     
Files Changed Coverage Δ
pymatgen/analysis/defects/utils.py 91.70% <0.00%> (-0.47%) ⬇️
pymatgen/analysis/defects/core.py 89.02% <79.48%> (+0.20%) ⬆️
pymatgen/analysis/defects/generators.py 96.13% <89.47%> (-1.00%) ⬇️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

…percell_structure` to make compatible with supertype `Defect`
@jmmshn
Copy link
Collaborator

jmmshn commented Jul 24, 2023

@kavanase Thanks for the PR!
Sorry for the late reponse.
Two quick comments

  • For target_frac_coords I think it'll be more general to just do the shift to the user provided coordinate and not the nearest site. The rounding to the nearest site can happen in a utility function outside.
  • for the oxidataion state for substitional defects I noticed something smiliar and put a fix slightly different fix in. I think that's causing the conflict on the repo right now. I think your fix is more thorough so we can go with that for now.

If you update the code I'll also dig through my misbehaving antisite case and add that to the tests.
Thanks again!

@kavanase
Copy link
Contributor Author

Hi @jmmshn, no worries! 😃
My turn to apologise for the delay, couple deadlines this week!

For target_frac_coords I think it'll be more general to just do the shift to the user provided coordinate and not the nearest site. The rounding to the nearest site can happen in a utility function outside.

For this, I don't know if I'm totally understanding what you mean. If shifting to exactly the user-provided fractional coordinates (in the case where this doesn't already exactly match one of the equivalent defect sites in the supercell), then this would mean either shifting the supercell origin or moving the defect position right?
For the former, this means we'd have a mismatch between different generated defect supercells (despite the same input sc_mat, min_atoms, max_atoms and min_length) which would cause some big issues with comparing defect/bulk supercells for charge corrections and identifying defect sites. For the latter (moving the defect), this wouldn't be possible for vacancies and also would likely lead to unexpected behaviour with antisites/interstitials (essentially changing what defect is being generated). This is why I set it up to return the closest equivalent defect site to the input frac coords. Maybe I'm missing something here though!

About the substitution defect oxidation state issue, cool!

@jmmshn
Copy link
Collaborator

jmmshn commented Jul 31, 2023

I think I'm just not super clear about the exact downstream problem this is addressing and just assumed it was mostly for visualization.

For this, I don't know if I'm totally understanding what you mean. If shifting to exactly the user-provided fractional coordinates (in the case where this doesn't already exactly match one of the equivalent defect sites in the supercell), then this would mean either shifting the supercell origin or moving the defect position right?

So I am assuming that shifting this was mostly targeted at visualizing the defect since the physical system is ultimately the same. As such, I assumed that all calculations will be done with the original cell and that a final arbitrary shift of all the sites in the defect cell can be performed at the very end so it won't affect any results.
I"m a bit scared of using equivalent_sites since symmetry analysis can be brittle and it's not technically even implemented for all the classes and it would be a bit weirder still for complexes.

If we want to have an arbitrary shift of the sites for visualization then it seems to me that we can just use the provided site for each defect in the UC and shift all the sites in the SC by the difference between the target and the SC position of the site. That method will also transfer well to complexes since we can just use a center_of_mass or something to make sure the complex position is uniquely defined then shift the whole SC.

@kavanase
Copy link
Contributor Author

kavanase commented Aug 5, 2023

Sorry busy week! 😅

Yes it's mostly for visualisation purposes in the long run, but I think it makes sense here at the generation stage, because then the defect can be initialised near the target site before the relaxation calculations.
Then when calculations have completed the user can directly open and visualise the relaxed defect structure as is, instead of the alternative having to load it into python, apply some post-calculation shifting functions, output to a file and then open for visualisation. Another advantage is that the calculation structure and visualised structure are the same, so avoids confusion when manually inspecting calculations/structures, comparing coordinates etc.
So if happening at the generation stage, then the whole supercell shouldn't be modified but just the chosen defect position.

About equivalent_sites, I get the concerns about using it. For symmetry analysis being brittle as you say, I'd still think it'd be (more) useful to have this info though, as it's additionally providing which sites the spglib analysis has grouped together as equivalent, and instead of generating the defect at a random one of these sites and not giving info on the alternative sites, the user can optionally see what other sites were considered equivalent by the symmetry analysis for this defect. So giving more info to go off in case symmetry analysis isn't working as desired.
Also, it uses the same spglib code/info (sym_struct.equivalent_sites) as the multiplicity attributes, so just optionally giving more info on what the other equivalent sites are which that attribute refers to.
I agree for complexes implementing this would be difficult, so I've made it throw a NotImplementedError if target_frac_coords is attempted with them, in the same way multiplicity is currently treated with them.

You might disagree with some of this though!

@jmmshn
Copy link
Collaborator

jmmshn commented Aug 7, 2023

Yes it's mostly for visualisation purposes in the long run, but I think it makes sense here at the generation stage, because then the defect can be initialised near the target site before the relaxation calculations.
Then when calculations have completed the user can directly open and visualise the relaxed defect structure as is, instead of the alternative having to load it into python, apply some post-calculation shifting functions, output to a file and then open for visualisation. Another advantage is that the calculation structure and visualised structure are the same, so avoids confusion when manually inspecting calculations/structures, comparing coordinates etc.
So if happening at the generation stage, then the whole supercell shouldn't be modified but just the chosen defect position.

OK agreed, this all makes sense now.

@jmmshn
Copy link
Collaborator

jmmshn commented Aug 9, 2023

Can we add two tests?

  1. For the translation, I'd like to see it work for a substitution and insertion defect.
  2. For the oxidation I'd like to see it work for the following test case:
from pymatgen.ext.matproj import MPRester
from pymatgen.analysis.defects.generators import AntiSiteGenerator
with MPRester() as mpr:
    structure = mpr.get_structure_by_material_id("mp-804")
gen = AntiSiteGenerator()
n_ga, g_na = list(gen.get_defects(structure))

n_ga.get_charge_states() 

currently gives [-1,0,1] since both N and Ga have 3.0 valence.

@kavanase
Copy link
Contributor Author

Hi Jimmy, sure!
Added both of these now 😃

@jmmshn jmmshn merged commit e509671 into materialsproject:main Aug 13, 2023
6 of 8 checks passed
@jmmshn
Copy link
Collaborator

jmmshn commented Aug 13, 2023

Hi @kavanase
Awesome, Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants