metaselection needs error handling, or screening beforehand #139

Closed
ardunn opened this issue Dec 5, 2018 · 0 comments
ardunn commented Dec 5, 2018

Sometimes a bad composition object gets passed to metaselector, and it fails. For example, try running metaselector on the mp_all dataset. It gives this error:

Traceback (most recent call last):
  File "/global/scratch/ardunn/python/lib/python3.7/site-packages/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/global/scratch/ardunn/codes/hmprivate/hmprivate/automatminer/benchmarking/tasks.py", line 72, in run_task
    predicted_test_df = pipe.benchmark(df, target, test_spec=0.2)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/utils/package_tools.py", line 76, in wrapper
    result = func(*args, **kwargs)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/pipeline.py", line 229, in benchmark
    df = self.autofeaturizer.fit_transform(df, target)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/utils/package_tools.py", line 76, in wrapper
    result = func(*args, **kwargs)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/core.py", line 321, in fit_transform
    return self.fit(df, target).transform(df, target, tidy_column=False)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/utils/package_tools.py", line 76, in wrapper
    result = func(*args, **kwargs)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/core.py", line 259, in fit
    self._customize_featurizers(df)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/core.py", line 364, in _customize_featurizers
    auto_exclude = self.metaselector.auto_excludes(df)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/core.py", line 172, in auto_excludes
    self.dataset_mfs = dataset_metafeatures(df, **mfs_kwargs)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/core.py", line 45, in dataset_metafeatures
    if mfs_func is not None else {})
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/core.py", line 63, in _composition_metafeatures
    mfs[mf] = mf_class.calc(df[composition_col])
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/metafeatures.py", line 171, in calc
    stats = composition_stats(X)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/metafeatures.py", line 137, in composition_stats
    return _composition_stats(tuple(X.values))
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/metafeatures.py", line 154, in _composition_stats
    stats = composition_statistics(X)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/utils.py", line 23, in composition_statistics
    stats[idx] = _composition_summary(composition)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/utils.py", line 81, in _composition_summary
    c = Composition(composition)
  File "/global/scratch/ardunn/codes/pymatgen/pymatgen/core/composition.py", line 134, in __init__
    elmap = dict(*args, **kwargs)
TypeError: 'float' object is not iterable

This is basically just saying it tried to do something like Composition(1.24), i.e., the original df had a bad value in the composition column. So we need to figure out a good way to do error handling here, or to screen the df beforehand, to make sure metaselection is robust.
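One possible shape for the "screening beforehand" option is sketched below. This is only an illustration under assumptions, not the fix that went into automatminer: `screen_compositions` and `toy_parse` are hypothetical names, and `toy_parse` is a stand-in for the `Composition` constructor so the snippet runs without pymatgen. It reproduces the failure from the traceback because `dict(1.24)` raises the same `TypeError`. The choice to catch only `TypeError` and `ValueError` is also an assumption.

```python
def screen_compositions(values, parse):
    """Split values by whether parse(value) succeeds.

    Returns (good, bad), where good is a list of (index, parsed) pairs
    and bad is a list of (index, value, error message) tuples.
    """
    good, bad = [], []
    for idx, value in enumerate(values):
        try:
            good.append((idx, parse(value)))
        except (TypeError, ValueError) as exc:
            # Record the failure instead of letting one bad row abort the run.
            bad.append((idx, value, f"{type(exc).__name__}: {exc}"))
    return good, bad


def toy_parse(value):
    # Hypothetical stand-in for the Composition constructor (assumption).
    if isinstance(value, str):
        if not value:
            raise ValueError("empty formula")
        return {"formula": value}
    # Like the last frame of the traceback: dict(1.24) raises TypeError.
    return dict(value)


column = ["Fe2O3", {"Li": 1, "F": 1}, 1.24, float("nan")]
good, bad = screen_compositions(column, toy_parse)
for idx, value, message in bad:
    print(f"row {idx}: dropping {value!r} ({message})")
```

In a real pipeline the same helper could presumably be called with `Composition` as `parse` on `df[composition_col].values`, and the indices in `bad` used to drop or report rows before metafeatures are computed.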
