Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sparse option to reindex and unstack #3542
sparse option to reindex and unstack #3542
Changes from 6 commits
4a6237a
e7b470d
1df4a3c
6a66831
3a369a1
6fe30e2
179cc1f
5d8ab27
13ad683
ac41ef8
92ce6cd
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a little surprised this is necessary. Does
sparse
not support__array_function__
fornp.where
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, yes. sparse looks not working with
np.result_type
andastype(copy=False)
.I'll add a TODO here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you have the latest version of sparse installed?
when I test this on my machine, it works:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. You are right.
I was running with sparse 0.7.0. With 0.8.0, it is running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, this is a private method.
Probably we can expose it to the public and add the same method to DataArray and Dataset as well in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully sparse will raise an error if you try to convert a dask array into a sparse array! If not, we should do that ourselves.
Long term, the best solution would be to convert a dask array from dense chunks to sparse chunks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of not hard-coding supported sparse formats, but I wonder if we could be a little more careful here if AttributeError is raised. We should probably catch and re-raise Attribute error with a more informative message if this fails.
Otherwise, I expect we might see bug reports from confused users, e.g., when
sparse_format='csr'
raises a confusing message.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also private, as is
_as_sparse
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make these public in
DataArray
andDataset
. See discussion here: #3245 (comment)Can be left for a future PR though :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @dcherian.
I would like to expose them to public, but what is the best name of these functions?
#3245
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently,
assert_equal
cannot be used as we need to explicitly densify the array for the comparison.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we generally use
assert_array_equal
for numpy arrays (but I can't immediately recall the difference...)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these actually end up doing the exact same checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
values
property does not work for the sparse-backed Variable, resulting in the failure ofassert_array_equal
.I'll add TODO comment for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we test whether
actual.variable
is actually sparse?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍