Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document reason for strongly typed Axis? #564

Closed
JP-Ellis opened this issue Nov 28, 2018 · 6 comments · Fixed by #956
Closed

Document reason for strongly typed Axis? #564

JP-Ellis opened this issue Nov 28, 2018 · 6 comments · Fixed by #956

Comments

@JP-Ellis
Copy link
Contributor

While using ndarray, I was wondering why certain methods use the Axis type instead of just usize, especially given that Axis is just a simple wrapper around usize. I tried looking into the documentation, and could not find any explanation; and searching through past issues/pull requests, I found just #91, #96 and #97 (though none of them given a clear reason why this is the case).

Would it be worth documenting this choice?

And is there a reason why From<usize>/Into<Axis> isn't implemented for Axis/usize?

@bluss
Copy link
Member

bluss commented Nov 28, 2018

Doesn't axis doc have a short rationale?

@JP-Ellis
Copy link
Contributor Author

All the documentation states is:

All array axis arguments use this type to make the code easier to write correctly and easier to understand.

Looking through the functions which use the Axis type, I don't really see why it is all that necessary.

  • A number of functions which have the name axis in them, making the Axis type overly verbose in my opinion.
  • Some function, although don't have the name axis in them, seem 'obvious enough' to me to not need the Axis type:
  • The only functions which I can see perhaps needing the Axis type are those with multiple arguments, though having said that none of them have overly complicated arguments so I don't really see the need for the Axis type:

So ultimately, I still don't see why there's a need for Axis over just using usize, other than for the very last group of functions listed above.

@bluss
Copy link
Member

bluss commented Nov 28, 2018

Some good defenders of the newtype are IMO

Argument Conversion Traits and Axis

From and Into are IMO best understood as argument conversion traits. Their primary purpose is converting something directly after having been received as a function parameter.

The biggest drawback with implementing these traits is when it leads to code that looks like this:

let first_rows = data.slice_axis(0.into(), Slice::from(0..3));
  1. 0.into() is completely anonymous and the type is implicit.
  2. It's a hazard for breaking changes, since it makes API changes in slice_axis that are nominally type compatible vulnerable to breaking changes that just come from failure to infer the correct type to use, since 0.into() can mean many things. For example if slice_axis becomes generic in the first parameter.

There are two ways to avoid this 0.into() situation:

a. Don't implement From/Into.
b. Implement From/Into for Axis and make sure that every place that takes axis parameters, already is generic and accepts <T> where T: Into<Axis>.
c. Just use usize and avoid conversion problems.

I think, we spend our complexity budget elsewhere and don't have time for frivolous generic parameters, ndarray is complicated enough as it is. User reports of course differ in both directions (both "It would be cool if it was easier to use" and "It would be cool if this another thing was also encoded type level information").

I also stand by that the code is easier to understand if you see that the parameter is written as Axis(0).

sum_axis

This method was originally called sum, and with .sum(Axis(0)) as the original design was, we can see that it was a bit better than it is now. We decided at some point to let .sum() be the whole thing and .sum_axis(axis) be for a specific axis, and we ended up with this redundancy, I can totally see that.

As an alternate solution, any other ideas what we can do about the .sum() vs .sum_axis(axis) convention?

@jturner314
Copy link
Member

Fwiw, I'd like to replace .sum_axis(axis) with .sum_axes(axes) in the future (and do the same for most/all of the other _axis methods), so we'll need to think about how to handle the axes parameter.

I do think it makes sense to keep a separate .sum() method because the return type is different (A instead of Array<A, _>).

An alternative to having separate .sum() and .sum_axes(axes) methods is to make the return type be an associated type of the axes parameter, treating () as a special case, but I don't like that idea.

@JP-Ellis
Copy link
Contributor Author

Thanks for the replies!

I still think that Axis as a type of a bit too verbose, but I also understand the desire for being a little more verbose for the sake of clarity.

I also really like the way .sum_axes() approaches this.

@max-sixty
Copy link
Contributor

FWIW in xarray we follow a similar approach to Numpy:

  • Don't have separate methods for *_axis
  • Specify axes / dimensions on all reductions, either as a list, single item, or None. Where no axes / dimensions are given, all are reduced. Given there aren't optional args in rust, I think the corollary would be an Axis enum which could be any of those three options
  • Unlike numpy, have all operations return an array, even those that reduce to a scalar. That reduces the uncertainty about what type a method returns. IIUC in rust you could allow a coercion to the element type with Into

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants