optimize compatibility checks in merge.unique_variable #3311

Closed
dcherian opened this issue Sep 16, 2019 · 0 comments · Fixed by #3453

Comments

@dcherian

Currently, merge.unique_variable does this:

    if equals is None:
        out = out.compute()
        for var in variables[1:]:
            equals = getattr(out, compat)(var)
            if not equals:
                break

out (= variables[0]) is always computed, even though that may not be necessary.

One solution would be to loop through the variables once, checking attrs, shapes, and _data. If these cheap checks pass, we then execute the code above.
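
To make the proposal concrete, here is a minimal sketch of a cheap pre-check pass that runs before anything is computed. It is not xarray's code: `LazyVar`, `cheap_equiv`, and `all_equal` are hypothetical names, and `LazyVar` is only a stand-in that counts how often the expensive path runs. The sketch also goes one step beyond the text above by treating identical `_data` objects as proof of equality, which is an assumption on my part.

```python
class LazyVar:
    """Hypothetical stand-in for a lazily evaluated variable (not xarray's Variable)."""

    def __init__(self, data, shape, attrs=None):
        self._data = data        # backing object; comparing it by identity is free
        self.shape = tuple(shape)
        self.attrs = dict(attrs or {})
        self.n_computes = 0      # counts how often the expensive path runs

    def compute(self):
        self.n_computes += 1
        return list(self._data)


def cheap_equiv(a, b):
    """Return True or False when decidable without computing, else None."""
    if a.shape != b.shape or a.attrs != b.attrs:
        return False             # metadata already differs, no need to compute
    if a._data is b._data:
        return True              # same backing object (assumed to imply equality)
    return None                  # undecided: values must be compared


def all_equal(variables):
    """Compare every variable against the first, computing only if required."""
    out = variables[0]
    pending = []
    for var in variables[1:]:
        verdict = cheap_equiv(out, var)
        if verdict is False:
            return False
        if verdict is None:
            pending.append(var)
    if not pending:
        return True              # everything was settled by the cheap checks
    expected = out.compute()     # the first variable is computed only now
    return all(var.compute() == expected for var in pending)
```

With this shape, two variables that share the same backing object are accepted without any `compute()` call, and a shape mismatch is rejected equally cheaply. Whether attrs should take part in the check presumably depends on the `compat` mode in use.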

dcherian added a commit to dcherian/xarray that referenced this issue Oct 27, 2019
dcherian added a commit to dcherian/xarray that referenced this issue Oct 28, 2019
Dask arrays with the same graph have the same name. We can use this to quickly
compare dask-backed variables without computing.

Fixes pydata#3068 and pydata#3311
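
The idea in this commit message can be illustrated with a toy model, assuming only that a lazy array's name is derived deterministically from the recipe that builds it. This is not dask's implementation: `token`, `ToyLazyArray`, and `same_without_computing` are hypothetical names made up for the sketch.

```python
import hashlib


def token(*parts):
    """Deterministic short hash of the recipe for a node (toy tokenizer)."""
    joined = "|".join(repr(p) for p in parts)
    return hashlib.sha1(joined.encode()).hexdigest()[:12]


class ToyLazyArray:
    """Toy lazy array whose name encodes how it was built (not dask's class)."""

    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs
        # Inputs that are lazy arrays contribute their name, so the name
        # of a node reflects the whole graph below it.
        names = [i.name if isinstance(i, ToyLazyArray) else i for i in inputs]
        self.name = f"{op}-{token(op, *names)}"

    def add(self, value):
        return ToyLazyArray("add", self, value)


def same_without_computing(a, b):
    """True if the names match; None means 'unknown, would need to compute'."""
    return True if a.name == b.name else None


base = ToyLazyArray("ones", (4, 4))
x = base.add(1)
y = base.add(1)   # distinct object, same recipe, hence same name
z = base.add(2)   # different recipe, hence different name
```

Here `same_without_computing(x, y)` is `True` even though `x` and `y` are separate objects, while `same_without_computing(x, z)` is `None` rather than `False`: differing names only mean the graphs differ, not that the computed values must.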
dcherian added a commit to dcherian/xarray that referenced this issue Oct 30, 2019
commit 08f7f74
Merge: 53c0f4e 278d2e6
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:36:58 2019 -0600

    Merge remote-tracking branch 'upstream/master' into fix/dask-computes

    * upstream/master:
      upgrade black verison to 19.10b0 (pydata#3456)
      Remove outdated code related to compatibility with netcdftime (pydata#3450)

commit 53c0f4e
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:25:27 2019 -0600

    Add identity check to lazy_array_equiv

commit 5e742e4
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:22:15 2019 -0600

    update whats new

commit ee0d422
Merge: e99148e 74ca69a
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:18:38 2019 -0600

    Merge remote-tracking branch 'upstream/master' into fix/dask-computes

    * upstream/master:
      Remove deprecated behavior from dataset.drop docstring (pydata#3451)
      jupyterlab dark theme (pydata#3443)
      Drop groups associated with nans in group variable (pydata#3406)
      Allow ellipsis (...) in transpose (pydata#3421)
      Another groupby.reduce bugfix. (pydata#3403)
      add icomoon license (pydata#3448)

commit e99148e
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:17:58 2019 -0600

    add concat test

commit 4a66e7c
Author: dcherian <deepak@cherian.net>
Date:   Mon Oct 28 10:19:32 2019 -0600

    review suggestions.

commit 8739ddd
Author: dcherian <deepak@cherian.net>
Date:   Mon Oct 28 08:32:15 2019 -0600

    better docstring

commit e84cc97
Author: dcherian <deepak@cherian.net>
Date:   Sun Oct 27 20:22:13 2019 -0600

    Optimize dask array equality checks.

    Dask arrays with the same graph have the same name. We can use this to quickly
    compare dask-backed variables without computing.

    Fixes pydata#3068 and pydata#3311
dcherian added a commit to dcherian/xarray that referenced this issue Nov 2, 2019
commit 0711eb0
Author: dcherian <deepak@cherian.net>
Date:   Thu Oct 31 21:18:58 2019 -0600

    bugfix.

commit 4ee2963
Author: Deepak Cherian <dcherian@users.noreply.github.com>
Date:   Thu Oct 31 11:27:05 2019 -0600

    pep8

commit 6e4c11f
Merge: 08f7f74 53c5199
Author: Deepak Cherian <dcherian@users.noreply.github.com>
Date:   Thu Oct 31 11:25:12 2019 -0600

    Merge branch 'master' into fix/dask-computes

dcherian added a commit that referenced this issue Nov 5, 2019
* Optimize dask array equality checks.

Dask arrays with the same graph have the same name. We can use this to quickly
compare dask-backed variables without computing.

Fixes #3068 and #3311

* better docstring

* review suggestions.

* add concat test

* update whats new

* Add identity check to lazy_array_equiv

* pep8

* bugfix.