Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Group By: KeyError with newly added Variables #5802

Closed
mw25 opened this issue Jan 24, 2022 · 1 comment · Fixed by #5823
Closed

Group By: KeyError with newly added Variables #5802

mw25 opened this issue Jan 24, 2022 · 1 comment · Fixed by #5823
Assignees
Labels
bug A bug confirmed by the core team snack This will take an hour or two

Comments

@mw25
Copy link
Contributor

mw25 commented Jan 24, 2022

What's wrong?
If first a subset of a table is given to group by (here: only sepal length with Select Columns), and then the remaining variables are added, the aggregations column in the aggregate table is empty. When clicking on the row, a KeyError exception occurs.

image

----------------------------- KeyError Exception ------------------------------
Traceback (most recent call last):
  File "D:\dev\Python\orange3\orange3\Orange\widgets\data\owgroupby.py", line 385, in __rows_selected
    active_aggregations = [self.aggregations[attr] for attr in selected_attrs]
  File "D:\dev\Python\orange3\orange3\Orange\widgets\data\owgroupby.py", line 385, in <listcomp>
    active_aggregations = [self.aggregations[attr] for attr in selected_attrs]
KeyError: ContinuousVariable(name='sepal width', number_of_decimals=1)
-------------------------------------------------------------------------------

How can we reproduce the problem?

  • see Screenshot
  • insert this test in TestOWGropBy in Orange/widgets/data/tests/test_owgroupby.py
    def test_reproduce_bug(self):
        partial_domain = Domain([ContinuousVariable("a")])
        partial_table = Table.from_table(partial_domain, self.data)

        self.send_signal(self.widget.Inputs.data, partial_table)
        assert len(self.widget.aggregations) == 1
        self.assert_aggregations_equal(["Mean"])

        self.send_signal(self.widget.Inputs.data, self.data)
        assert len(self.widget.aggregations) == 1  # should be 5
        self.assert_aggregations_equal(['Mean', '', '', '', ''])  # should be defaults, not ''

        # select a new variable - should not raise
        self.select_table_rows(self.widget.agg_table_view, [1])

Analysis and related problem

A ContextSetting is created with the partial table.
With the full table, this context is recognized and the default values of the new variables in self.aggregations (owgroupby) are completely overwritten with the ContextSetting of the partial table.
In self.agg_table_view are all variables, but in self.aggregations only one. KeyError.

If a ContextSetting is a dict, it completely replaces the current setting:

https://github.com/biolab/orange-widget-base/blob/51d4f35e2fe807d617aac0547532d220d8942679/orangewidget/settings.py#L201

Wouldn't it be better if:

  • Keys existing in the current setting and in the ContextSetting are updated by the ContextSetting.
  • Keys existing in the current settings that are not in the ContextSetting should be kept.

What's your environment?

  • Operating system: Windows 10
  • Orange version: 3.32.0.dev
  • How you installed Orange: clone github
@mw25 mw25 added the bug report Bug is reported by user, not yet confirmed by the core team label Jan 24, 2022
@janezd janezd self-assigned this Jan 28, 2022
@janezd janezd added bug A bug confirmed by the core team and removed bug report Bug is reported by user, not yet confirmed by the core team labels Feb 3, 2022
@janezd
Copy link
Contributor

janezd commented Feb 3, 2022

Thanks for this excellent report. How I wish all reports were so detailed, and also included some debugging and even a test! :)

You are right, of course.

A note to whomever who'd work on this: this shouldn't be too difficult. Copy the idea from widgets that already do it properly, for instance, Discretize and Edit Domain. The latter is better, because it stores changes immediately, not through storeSpecificSettings.

@janezd janezd added the snack This will take an hour or two label Feb 3, 2022
@janezd janezd removed their assignment Feb 3, 2022
@VesnaT VesnaT self-assigned this Feb 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A bug confirmed by the core team snack This will take an hour or two
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants