Added Examples Folder and MSE Example #158

Merged: 21 commits merged into hackingmaterials:master on Jan 17, 2019

Conversation

@ADA110 (Contributor) commented on Jan 10, 2019

The example uses the elastic_tensor_2015 dataset and a default config to create a MatPipe. This MatPipe is used to benchmark the target property K_VRH.

The unit tests confirm that the output of the benchmark is not empty. They also ensure that, based on this specific example, the mean squared error is between 0 and 500.

For debugging purposes, you can use the debug config instead; in that case, widen the accepted mean squared error range to 0-1000 rather than 0-500.
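
A minimal sketch of what the example and its unit test could look like. The matminer `load_dataset` call and sklearn's `mean_squared_error` are real; the automatminer import paths, the `get_preset_config` helper, the `benchmark` signature, the `"K_VRH predicted"` column name, and the choice of input columns are assumptions that may differ in the version this PR targets:

```python
# Sketch only: the automatminer import paths, preset helper, benchmark
# signature, and predicted-column name are assumptions and may differ
# from the version this PR targets.
import unittest

from matminer.datasets import load_dataset
from sklearn.metrics import mean_squared_error

from automatminer.pipeline import MatPipe            # assumed import path
from automatminer.presets import get_preset_config   # assumed helper


class TestMSEExample(unittest.TestCase):
    def test_benchmark_mse(self):
        df = load_dataset("elastic_tensor_2015")
        # Keep only the inputs and the target (column choice assumed).
        df = df[["structure", "K_VRH"]]

        # Use get_preset_config("debug") for a much faster run, and widen
        # the accepted MSE range to 0-1000 as noted above.
        pipe = MatPipe(**get_preset_config("default"))
        test_df = pipe.benchmark(df, "K_VRH")         # assumed signature

        # The benchmark output should not be empty.
        self.assertFalse(test_df.empty)

        # Predicted-column name is assumed here.
        mse = mean_squared_error(test_df["K_VRH"],
                                 test_df["K_VRH predicted"])
        self.assertTrue(0 <= mse <= 500)


if __name__ == "__main__":
    unittest.main()
```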

@@ -118,9 +118,9 @@ def fit(self, df, target, **fit_kwargs):
self._features = df.drop(columns=target).columns.tolist()
self._ml_data = {"X": X, "y": y}
self.fitted_target = target
self.logger.info("TPOT fitting started.")
Contributor:
Why are these logger changes happening here? We shouldn't do this in this PR.

@@ -0,0 +1,45 @@
import unittest
Contributor:

Looks good. It's also neat that it's in a test, but I'm afraid it will just confuse people who come to use it (i.e., "whoa, it's weird that they put this test here, I wonder where the example is"). I don't think we will be running this test frequently anyway (especially because you are using the default config, not the debug config, which is much faster).

Contributor:

Instead of being in a test, could you write the example out as a simple script (or, even better, a notebook), with perhaps a few more comments? Not everyone is particularly familiar with pandas/automatminer/matminer/machine learning, and there are a few areas that won't make sense to someone unfamiliar with this stack.

For example, when we are renaming the formula column, just add a comment saying "The preset automatminer uses pre-defined column names 'composition' and 'structure' to find the composition and structure columns. You can change these by editing your config"
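
A short sketch of the kind of commented script being suggested here. The matminer `load_dataset` call and pandas `rename` are real; the raw dataset's "formula" column name is an assumption, and the "composition"/"structure" names are taken from the comment above:

```python
# Sketch of a commented example script; the dataset's "formula" column name
# is an assumption, and the "composition"/"structure" names come from the
# reviewer's comment above.
from matminer.datasets import load_dataset

# Load the elastic tensor dataset used in the example.
df = load_dataset("elastic_tensor_2015")

# The preset automatminer config uses the pre-defined column names
# "composition" and "structure" to find those columns; the raw dataset
# calls its composition column "formula", so rename it. You can change
# the expected names by editing your config.
df = df.rename(columns={"formula": "composition"})

# Keep only the inputs and the target property K_VRH
# (the Voigt-Reuss-Hill bulk modulus).
df = df[["composition", "structure", "K_VRH"]]
```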

@ardunn (Contributor) commented on Jan 17, 2019

@ADA110 thanks for taking care of that issue. Is this ready (or about ready) to merge?

@ADA110 (Contributor, Author) commented on Jan 17, 2019

@ardunn If you think everything is good with the IPython file I added, then yes.

@ardunn merged commit 8db0695 into hackingmaterials:master on Jan 17, 2019
@ardunn mentioned this pull request on Jan 17, 2019
@ADA110 deleted the example_mse branch on January 19, 2019