Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement backend='auto' and make it the default #81

Merged
merged 2 commits into from
Feb 20, 2019

Conversation

jcmgray
Copy link
Collaborator

@jcmgray jcmgray commented Feb 16, 2019

Description

This implements backend='auto' for contractions and makes it the default. This works currently by inferring the module from the class of the first array supplied. A slight detail is that sometimes the module of an array does not supply tensordot etc. (maybe it has subclassed just numpy.ndarray for example) and in this case 'numpy' is still used.

Hopefully this kind of thing will eventually be made redundant by __array_function__ (#46) but for the moment it makes it a bit more convenient to work with other array libraries.

Status

  • Ready to go

@codecov-io
Copy link

codecov-io commented Feb 16, 2019

Codecov Report

Merging #81 into master will decrease coverage by 0.34%.
The diff coverage is 80%.

Copy link
Owner

@dgasmith dgasmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of this, but worried that this could potentially break code. We seem to be in quite a few projects at the moment and could change behavior.

@fritzo Thoughts here?


# some arrays will be defined in modules that don't implement tensordot
# etc. so instead default to numpy
if not backends.has_tensordot(backend):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if someone has subclassed a NumPy array or similar or uses an unknown backend? We should fall back to NumPy here as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think the behavior you describe is what happens here. For example, in my library quimb I have a subclass of ndarray so the initial inferred backend is quimb. However this check for quimb.tensordot (which is cached) means 'numpy' is used still since unlike dask, tensorflow etc. quimb does not provide tensordot.

@jcmgray
Copy link
Collaborator Author

jcmgray commented Feb 17, 2019

I like the idea of this, but worried that this could potentially break code. We seem to be in quite a few projects at the moment and could change behavior.

Yeah I definitely appreciate this might be a significant change and would be good to get anybody's feedback! I think for almost all use cases there should be no change in behaviour however.

The only really 'breaking' edge case I can think of, is if someone is using non-numpy arrays from a library that does provide tensordot but also is also for some reason relying by default on numpy.tensordot being the backend function. In this case, the libraries tensordot will now be used instead.

@fritzo
Copy link
Contributor

fritzo commented Feb 17, 2019

IIUC this would not break any Pyro code since we always specify backend, and actually we usually specify a custom backend like pyro.ops.einsum.torch_log.

@jcmgray
Copy link
Collaborator Author

jcmgray commented Feb 17, 2019

IIUC this would not break any Pyro code since we always specify backend, and actually we usually specify a custom backend like pyro.ops.einsum.torch_log.

Yes no change for explicit backend - which I imagine is all uses in other projects since this has until now been essentially required.


The overall logic is really that if you supply a mylib array, and mylib implements tensor functions, you almost definitely (by default) want to use that backend and so supplying backend='mylib' feels redundant. Of course, if you really wanted to use numpy tensor functions you can explicitly set backend='numpy' but since this is the much more niche case it feels fine to be more verbose.

@dgasmith
Copy link
Owner

I think this represents a fundamental shift where we are no longer focused on a NumPy backend, but are agnostic. I would consider this a major change (3.x) as we change our philosophy statement (something that could be argued that we have done anyways). This would make me a bit more comfortable as we are potentially breaking code, thoughts?

@jcmgray
Copy link
Collaborator Author

jcmgray commented Feb 18, 2019

Yes that seems like a reasonable approach to me!

Copy link
Owner

@dgasmith dgasmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dgasmith dgasmith merged commit ff86c2c into dgasmith:master Feb 20, 2019
@dgasmith dgasmith mentioned this pull request Feb 20, 2019
5 tasks
@dgasmith dgasmith added this to the v3.0 milestone Jul 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants