Added Examples Folder and MSE Example #158
Changed bandstructure to bandstructure_col and dos to dos_col
automatminer/automl/adaptors.py
@@ -118,9 +118,9 @@ def fit(self, df, target, **fit_kwargs):
self._features = df.drop(columns=target).columns.tolist()
self._ml_data = {"X": X, "y": y}
self.fitted_target = target
self.logger.info("TPOT fitting started.")
Why are these logger changes being made here? They shouldn't be part of this PR.
automatminer/examples/mse_example.py
@@ -0,0 +1,45 @@
import unittest
Looks good. It's also cool that it's in a test, but I'm afraid it will just confuse people who come to use it (i.e., "woah, it is weird they put this test here, I wonder where the example is"). I don't think we will be running this test frequently anyway, especially because you are using the default config rather than the debug config, which is much faster.
Instead of being in a test, could you write out the file in a simple script (or even better, notebook), with perhaps a few more comments? Not everyone is particularly familiar with pandas/automatminer/matminer/machine learning, and there are a few areas that won't make sense to someone unfamiliar with this stack.
For example, when we are renaming the formula column, just add a comment saying "The preset automatminer uses pre-defined column names 'composition' and 'structure' to find the composition and structure columns. You can change these by editing your config"
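A minimal sketch of what such a commented rename might look like. It assumes pandas is installed and uses a tiny hand-made DataFrame in place of the real dataset; the rows and values are placeholders for illustration, not actual elastic_tensor_2015 entries.

```python
import pandas as pd

# Placeholder data standing in for the real dataset (values are made up
# for illustration only; they are not taken from elastic_tensor_2015).
df = pd.DataFrame(
    {
        "formula": ["Fe2O3", "SiO2"],
        "K_VRH": [100.0, 50.0],
    }
)

# The preset automatminer config uses the pre-defined column names
# "composition" and "structure" to find the composition and structure
# columns. Rename "formula" so it is picked up as the composition column.
# (These names can be changed by editing the config.)
df = df.rename(columns={"formula": "composition"})

print(df.columns.tolist())
```

Only the rename itself is shown here; building and running the MatPipe is left to the example script or notebook.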
@ADA110 thanks for taking care of that issue. Is this ready to merge (or about ready)?
@ardunn If you think everything is good with the iPython file I added, then yes.
The example uses the elastic_tensor_2015 dataset and a default config to create a MatPipe. This MatPipe is used to benchmark the target property K_VRH.
The unit tests confirm that the output of the benchmark is not empty. They also ensure that, based on this specific example, the mean squared error is between 0 and 500.
For debugging purposes, you can use the debug config instead. In that case, widen the accepted range for the mean squared error to 0-1000 rather than 0-500.
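The range check described above can be sketched in plain Python. This is an illustration, not the test from the PR: the helper names `mean_squared_error` and `mse_within_bounds` and the sample values are hypothetical, and the real example gets its predictions from the MatPipe benchmark.

```python
def mean_squared_error(y_true, y_pred):
    """Return the mean of the squared differences between two sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mse_within_bounds(y_true, y_pred, upper=500.0):
    """Check that the MSE lies in [0, upper].

    upper=500.0 mirrors the default-config range mentioned above;
    pass upper=1000.0 for the looser debug-config range.
    """
    return 0.0 <= mean_squared_error(y_true, y_pred) <= upper


# Hypothetical target values and predictions (not real benchmark output).
y_true = [100.0, 150.0, 200.0]
y_pred = [110.0, 140.0, 205.0]

mse = mean_squared_error(y_true, y_pred)
print(mse, mse_within_bounds(y_true, y_pred))
```

Making the upper bound a parameter lets the same check serve both the default and the debug config.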