Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Groups from api discovery returned in non-deterministic order #886

Closed
clux opened this issue Apr 18, 2022 · 7 comments · Fixed by #887
Closed

Groups from api discovery returned in non-deterministic order #886

clux opened this issue Apr 18, 2022 · 7 comments · Fixed by #887
Assignees
Labels
bug Something isn't working client http issues with the client

Comments

@clux
Copy link
Member

clux commented Apr 18, 2022

Current and expected behavior

when iterating over groups:

let discovery = Discovery::new(client).run().await?;
for group in discovery.groups() {

we get entries returned in different orders depending on what the internal HashMap feels like.

This is problematic when used in kubectl like inference that tries to "pick the first matching kind".

Possible solution

Switch HashMap for BTreeMap.
Won't fix the inherent problems in not having something more solid to sort on, but it at least provides a consistent interface.

@clux clux added bug Something isn't working client http issues with the client labels Apr 18, 2022
clux added a commit that referenced this issue Apr 18, 2022
Fixes #886

Signed-off-by: clux <sszynrae@gmail.com>
@clux clux self-assigned this Apr 18, 2022
@nightkr
Copy link
Member

nightkr commented Apr 18, 2022

I'm inclined to think that this seems more like a client implementation issue...

@clux
Copy link
Member Author

clux commented Apr 18, 2022

It's true. I was hoping it was OK anyway, because it doesn't really cost us anything here to use BTreeMap on something so small.

(I'm having a fun time trying to replicate the weird disambiguation logic that kubectl has, and have found that the plural of NodeMetrics is "nodes" and the plural of PodMetrics is "pods" so am kind of forced to implement some sort of real api group preference system if the groups are non-deterministic)

@nightkr
Copy link
Member

nightkr commented Apr 18, 2022

I'm not super against the BTreeMap itself, as much as going and committing to any one sorting as the one blessed way. Especially when we don't really have any deeper principle going on than "this makes it slightly easier to mimic kubectl".

And especially when it seems like the kubectl use case would be better served by a linear scan than bothering to sort the whole list.

@clux
Copy link
Member Author

clux commented Apr 18, 2022

we don't really have any deeper principle

There is another reason to do something like this; it's really hard to build a preference system of api groups because all you have are names. If you knew that builtin types had higher precedence than custom ones then that might be enough for many use cases. btreemap just so happened to push the core group early, but there might be better ways.

I did a little more digging to see if there was more solid information.

found some mostly useless information

  • kubectl api-resources gets its data from the /apis endpoint (via -v=9), but this is sorted in a way that did not make any sense to me
  • kubectl api-resources presents the data in a way different from the data input (sorted by the group's apiVersion quasi-alphabetically (core is special-cased to be first))
  • kubectl api-resources uses client-side sorting by either kind or name (interestingly, they don't support setting the other deterministic (default) sort (by apiversion).
  • all supported sort orders differ from what the /apis endpoint returns
  • the discovery client does not mention sorting/order/alphabetical at all, and there's only a single naked sort in its impl

so went for experimentation instead and found:

kubectl get will pick the first matching kind in the earliest alphabetical group name => it is consistent with default sort of k api-resources

This is helpful in that if we sort our apigroup vector by the group name, we have aligned..
Given that the underlying structure is a HashMap and the only linear interface is given in Discovery::groups, are you OK with sorting that before returning?

@nightkr
Copy link
Member

nightkr commented Apr 21, 2022

Currently Discovery::groups() is a cheap direct hashmap Iterator, returning it sorted with our current implementation would requires us to both buffer that into a Vec and sort that, both of which are relatively expensive operations.

Personally I'd be happier with either (in descending order of preference):

  1. Exposing a cmp_group_preference function that can be passed slice::sort_by
  2. Adding a new Discovery::groups_by_preference method that does the sorting
  3. Changing Discovery::groups to be sorted, adding a new Discovery::unsorted_groups method

Personally I prefer option 1 because it:

  • Makes the slow option (sorting) possible but keeps some slight friction
  • Is also usable for linear scans with Iterator::min_by, which is what uses like kubectl get "should" be using
  • Doesn't require us to decide on the final data structure that the data will land in (as opposed to either exposing the Vec directly, or "burning" it and turning it into a vec::IntoIter immediately)

clux added a commit that referenced this issue Apr 24, 2022
Fixes #886

Signed-off-by: clux <sszynrae@gmail.com>
@clux
Copy link
Member Author

clux commented Apr 24, 2022

I have tried both 1 and 2 in the PR as it stands, and will highlight a few things in there.

I like the idea of not being tied to a data structure, but the Iterator::min_by idea is hard to actually use. By default, min_by gives you the CORE group every time unless you filter by a group name (which we don't have - if you did you wouldn't be using this interface you'd be using oneshot::group).

@nightkr
Copy link
Member

nightkr commented Apr 24, 2022

I'll have to look closer at the kubectl PR again to comment on that.. Hoping to get to it tomorrow.

@clux clux closed this as completed in #887 May 23, 2022
clux added a commit that referenced this issue May 23, 2022
* Make Discovery::groups have deterministic order

Fixes #886

Signed-off-by: clux <sszynrae@gmail.com>

* sort Discovery::groups in kubectl order

Signed-off-by: clux <sszynrae@gmail.com>

* revert back to hashmap (no need for btree anymore)

Signed-off-by: clux <sszynrae@gmail.com>

* use peek rather than creating group values twice

Signed-off-by: clux <sszynrae@gmail.com>

* document

Signed-off-by: clux <sszynrae@gmail.com>

* update with suggestions

Signed-off-by: clux <sszynrae@gmail.com>

* update test

Signed-off-by: clux <sszynrae@gmail.com>

* remove ordering fn since it's simple

Signed-off-by: clux <sszynrae@gmail.com>

* can avoid making name public since we have a getter

Signed-off-by: clux <sszynrae@gmail.com>

* ..but then need to update the test

Signed-off-by: clux <sszynrae@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working client http issues with the client
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants