-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added optical spectra from refractiveindex.info as composite features #1
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
I did some comments with suggestions. We can look through it together if you want when you come back.
Also, I did not say here but we may need to modify some things so that the amount of data is not too large, when we request to merge to the main matminer. We will see what they say about that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the work!
I added some comments and questions.
Some general remarks:
- if the stored files are too large it could be an option to compress them? Also switching to some other format other than csv may be an option, but I am not sure if this will introduce unwanted dependencies in matminer
- in case you did not do already, before opening the PR to the trunk it would be good to set up the pre-commit, so you will get the linting erros checked beforehand.
…d the optical database, better use of files for restart
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates. I have added some more comments.
Summary
The idea is to add the optical features based on data from https://www.refractiveindex.info
I first extracted the (relevant*) data, then created 2 sets of 3 tables (refractive index, extinction coefficient, reflectivity).
The 2 sets correspond to 1) the pure elements, as averaged over available data from different sources (if any),
and 2), the pseudo-inverse values for each element (corresponding to the least-square fit), based on ~200 data points.
The second solution (based on the pseudo-inverse) leads to better results when compared to exact ones from the database
and is chosen as the default.
*Relevant in the sense that I included only solid-state non-organic materials, and only took the data if the spectrum from 380 to 780 nm was covered.
I also include the transport properties in the same way (pure elements or based on their pseudo-inverses). These come from the database of transport properties computed by Ricci et al (2017) that can be retrieved from the Materials Project.
TODO