Pint compatibility issues #38

Open
dopplershift opened this issue May 12, 2015 · 15 comments
@dopplershift
Member

Somehow we need to make the pint.Quantity instances be numpy subclasses. This would:

  • Fix the problems with various numpy functions
  • Fix matplotlib's unit support

In the best case, this gets contributed upstream to pint.

@dopplershift dopplershift added the Area: Infrastructure Pertains to project infrastructure (e.g. CI, linting) label May 12, 2015
@shoyer

shoyer commented Jun 16, 2015

I don't think this is a good idea. Subclassing numpy.ndarray is quite error prone and not really recommended anymore (pandas, for example, recently got rid of the last of its ndarray subclasses). It also rules out attaching units to other interesting array types (like those from dask). NumPy provides plenty of hooks for array-like objects to override, so I don't think there's any functionality that actually needs subclassing.
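
Editor's sketch of the "hooks instead of subclassing" idea: the wrapper below is not an ndarray subclass, yet NumPy ufuncs still route through it. It relies on `__array_ufunc__`, a hook added in NumPy 1.13, which postdates this comment; the `Tagged` class and its string unit tags are hypothetical and are not how pint is implemented.

```python
import numpy as np


class Tagged:
    """Hypothetical unit-tagged wrapper (illustration only, not pint's code)."""

    def __init__(self, values, unit):
        self.values = np.asarray(values, dtype=float)
        self.unit = unit

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # NumPy calls this hook instead of coercing the wrapper to an ndarray.
        # Only plain calls to add/subtract are handled in this sketch.
        if method != "__call__" or ufunc not in (np.add, np.subtract):
            return NotImplemented
        if not all(isinstance(x, Tagged) for x in inputs):
            return NotImplemented
        if len({x.unit for x in inputs}) != 1:
            raise ValueError("unit mismatch")
        result = ufunc(*(x.values for x in inputs), **kwargs)
        return Tagged(result, inputs[0].unit)


a = Tagged([1.0, 2.0], "m")
b = Tagged([3.0, 4.0], "m")
total = np.add(a, b)  # dispatched to Tagged.__array_ufunc__
print(type(total).__name__, total.values, total.unit)
```

Because the wrapper only holds a reference to its data, the same pattern could in principle wrap other array types as well, which is the point made about dask above.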

@dopplershift
Member Author

I thought that there were plenty of hooks too. Then I tried to use a pint Quantity with matplotlib's unit support--because matplotlib uses asarray() everywhere, the units get dropped quite readily.

The balance of things might be in favor of non-subclass, but I really want to use matplotlib's unit support. Thoughts?

@shoyer

shoyer commented Jun 16, 2015

If matplotlib uses np.asarray in places where you need units, you're already screwed -- asarray only passes through base numpy.ndarray objects. asanyarray will pass through subclasses, but that's rarely used.
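
Editor's sketch of the difference described here, using a hypothetical `UnitArray` subclass with a plain string `unit` attribute (an assumption for illustration; it is not pint's Quantity):

```python
import numpy as np


class UnitArray(np.ndarray):
    """Hypothetical ndarray subclass that carries a unit string."""

    def __new__(cls, values, unit=None):
        obj = np.asarray(values, dtype=float).view(cls)
        obj.unit = unit
        return obj

    def __array_finalize__(self, obj):
        # Propagate the unit when NumPy creates views or slices of this type.
        self.unit = getattr(obj, "unit", None)


temps = UnitArray([280.0, 285.5], unit="kelvin")

stripped = np.asarray(temps)   # converted to a base ndarray, unit is gone
kept = np.asanyarray(temps)    # subclass passes through untouched

print(type(stripped).__name__, hasattr(stripped, "unit"))
print(type(kept).__name__, kept.unit)
```

So any library code path that calls `np.asarray` on its inputs drops the unit regardless of whether the quantity type is a subclass or a wrapper.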

I know that matplotlib has a units API that in principle allows for custom formatting, but I haven't actually used it myself.

@dopplershift
Member Author

Right--I'm aware of all these issues (I contributed patches to numpy to replace some asarray calls with asanyarray to make things work better with the quantities package I was using at the time.)

It seems matplotlib's unit support works great with things that aren't arrays (or at least don't easily degrade to arrays), but broke with these things. Though, you're right, if asarray is the problem, a subclass doesn't help. I guess maybe patching matplotlib....

Ryan

@shoyer

shoyer commented Jun 16, 2015

I guess maybe patching matplotlib....

This seems like the more likely solution, unfortunately.

@dopplershift
Member Author

Isn't the first time, won't be the last.

Thanks for the useful discussion making clear that sub-classing doesn't solve the problem.

@dopplershift
Member Author

Just capturing here:
Subclassing ndarray is really frowned upon now by the numpy guys, given what is involved with the current code. Also, dask is really cool, so we want to enable wrapping other things. The right answer is to make matplotlib behave better.

@dopplershift dopplershift changed the title from "Units need to be numpy subclass" to "Pint compatibility issues" Jul 10, 2015
@rsignell-usgs
Contributor

Would cf_units instead of pint help here?
cf_units probably didn't exist when metpy development started...

@dopplershift
Member Author

It existed but does not meet my requirements. I want array-like objects that I can do unit-correct math with. All I see with cf-units is the ability to take units and resolve them properly--meaning you still have to treat units out of band. What I want (and what pint, quantities, etc. give) is the ability to just do math with arrays and have the units work themselves out automatically.

@ocefpaf

ocefpaf commented Oct 1, 2015

@dopplershift I guess that is my fault for not addressing SciTools/cf-units#3.

Here is a quick example on how to use cf_units with numpy.arrays.

http://nbviewer.ipython.org/gist/ocefpaf/b5dc173da0d445ee344e

Note that all these libraries have their strengths and weaknesses. IMHO astropy has the best units support, but I am not sure if they provide a standalone module for units. cf_units, on the other hand, has all the niche, idiosyncratic features we need and is a standalone module. I end up using both.

@dopplershift
Member Author

The lack of standalone is exactly why I'm not using astropy. Other than the cf-compatibility, the API/behavior of cf-units is just not something I'm wild about (object arrays, really?). I understand why cf-units gets brought up, but I don't see anyone saying why pint is a problem. It hits exactly the behavior I'm looking for, and I don't see anyone bringing up problems.

@ocefpaf

ocefpaf commented Oct 1, 2015

The lack of standalone is exactly why I'm not using astropy

Yeah. That is a bummer.

Other than the cf-compatibility, the API/behavior of cf-units is just not something I'm wild about (object arrays, really?).

Some see that as an advantage. But your mileage may vary.

but I don't see anyone saying why pint is a problem.

I did have issues with pow and mod in the past. Maybe they fixed that, but I am not sure because I left it for astropy and cf_units.

@shoyer

shoyer commented Oct 1, 2015

Object arrays are pretty much a deal breaker in terms of performance. This is why we need to be able to write a units dtype...
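
Editor's sketch of why object arrays cost performance: the same values stored as `float64` sit in one contiguous buffer, while the `object` version holds references to individual Python floats and dispatches arithmetic per element. The array size and repeat count are arbitrary choices for illustration, and the timings are machine dependent, so no figures are quoted.

```python
import timeit

import numpy as np

native = np.linspace(0.0, 1.0, 100_000)  # float64, one contiguous buffer
boxed = native.astype(object)            # references to Python float objects

print(native.dtype, boxed.dtype)
print(type(boxed[0]).__name__)

# Same arithmetic; the object array calls Python's float addition per element.
t_native = timeit.timeit(lambda: native + native, number=20)
t_boxed = timeit.timeit(lambda: boxed + boxed, number=20)
print(f"float64: {t_native:.4f}s  object: {t_boxed:.4f}s")
```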

@ocefpaf

ocefpaf commented Oct 1, 2015

I knew someone was going to bring up the dtype 😉

I fully agree! Until then... I am OK with object arrays and the performance hit. I rarely need to propagate units when computing things that need high performance.

@dopplershift
Member Author

Yeah, dtype is really what we need. I think we just have different opinions on what trade-offs to make.

@jrleeman jrleeman added this to the Spring 2017 milestone Feb 14, 2017
@jrleeman jrleeman removed this from the Spring 2017 milestone Feb 24, 2017
@dopplershift dopplershift added this to the Spring 2018 milestone Nov 15, 2017
@jrleeman jrleeman removed this from the 0.8 milestone Apr 11, 2018
5 participants