Add collection version to plugin pages and collection plugin index #200
Generated output LGTM from the docs perspective!
afc8b7d to 0942fc7
I like the output. I have to think about whether the structure of the code makes sense. Since it spans so much of the code (getting the data, saving the data, outputting the data), I have to pull a lot of information into the forefront of my brain to be able to think about it :-)
b210876 to 7fbf44c
7fbf44c to 0307b28
antsibull/docs_parsing/__init__.py
        return 'AnsibleCollectionInfo({0}, {1})'.format(repr(self.path), repr(self.version))


class AnsibleCollectionDocs:
So, after looking and thinking and looking and thinking some more, I don't like this particular data structure. It's aggregating information that's really about different things. The plugins variable holds information about individual plugins, organized by the collection each one belongs to, while the collection_info data structure holds information about each collection.
It feels like this should be returned as a tuple containing both pieces of information and then they should be passed around separately.
Alternately, the information could be integrated together:
AnsibleCollectionDocs = {
    'community.general': {
        'path': str,
        'version': str,
        'plugins': { $plugin_type: { $plugin_name: [...] } }
    }
}
However, since those come from distinct sources, it still feels like that integration should be done as a second step. The first step, in the parsing code, acquires the information. The second step (somewhere in the calling code) transforms the raw data into a data structure that's easily consumed by the later steps.
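A minimal sketch of that two-step split, for illustration only. The names `CollectionInfo`, `parse_docs`, and `integrate`, the sample data, and the assumed shape of the plugins mapping (plugin type, then fully qualified plugin name) are all hypothetical and not taken from the antsibull code:

```python
from typing import Any, Dict, NamedTuple, Tuple


class CollectionInfo(NamedTuple):
    """Hypothetical per-collection record (path and version only)."""
    path: str
    version: str


# Assumed shape: plugin_type -> fully qualified plugin name -> docs
PluginsT = Dict[str, Dict[str, Any]]
CollectionsT = Dict[str, CollectionInfo]


def parse_docs() -> Tuple[PluginsT, CollectionsT]:
    """Step 1: acquire the raw data and return the two pieces separately."""
    plugins: PluginsT = {'module': {'community.general.foo': {'doc': '...'}}}
    collections: CollectionsT = {
        'community.general': CollectionInfo('/tmp/community/general', '1.0.0'),
    }
    return plugins, collections


def integrate(plugins: PluginsT, collections: CollectionsT) -> Dict[str, Dict[str, Any]]:
    """Step 2 (calling code): merge both pieces into one per-collection structure."""
    merged: Dict[str, Dict[str, Any]] = {
        name: {'path': info.path, 'version': info.version, 'plugins': {}}
        for name, info in collections.items()
    }
    for plugin_type, by_name in plugins.items():
        for fqcn, docs in by_name.items():
            # The first two dotted components name the collection.
            collection = '.'.join(fqcn.split('.')[:2])
            if collection in merged:
                merged[collection]['plugins'].setdefault(plugin_type, {})[fqcn] = docs
    return merged


plugins, collections = parse_docs()
docs_by_collection = integrate(plugins, collections)
```

The parsing function stays ignorant of how later steps want the data arranged, and the merge can change without touching the parser.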
It is essentially a NamedTuple. I've converted it into a 'proper' NamedTuple in 0996b65. I don't think it is easier to read though.
Yeah.... what I mean is that the two things are really items which are unassociated with each other, so the API should look like:
def get_disparate_information(url):
    return len(url), contents_stored_at(url)

length, contents = get_disparate_information('https://google.com/')
do_something_else(length)
yet_another_else(contents)
Packing them together, at least as arbitrary concatenation, should be temporary (so that we can return two values from where the parsing happened) but the API shouldn't encourage people to store and access the values as a single entity.
Is 892998d ok?
antsibull/cli/doc_commands/stable.py
@@ -221,7 +222,7 @@ def get_plugin_contents(plugin_info: t.Mapping[str, t.Mapping[str, t.Any]],


 def get_collection_contents(plugin_content: t.Mapping[str, t.Mapping[str, t.Mapping[str, str]]],
-                            ) -> t.DefaultDict[str, t.Dict[str, t.Mapping[str, str]]]:
+                            ) -> CollectionInfoT:
The reason not to do this would be that the types for return values should be concrete: We know we're returning a DefaultDict of Dicts because we're creating those inside of this function. So anything making use of our return values would know they can depend on those behaviours being present. Typing it as generic Mapping of Mapping means that code using our return value can't depend on those behaviours.
Input types, otoh, should specify the least common denominator that they need. So in terms of taking input, a Mapping is appropriate unless the function wanted to do something that depended on the DefaultDict's behaviour.
I kind of feel like we should keep return types and input types separate for that reason (when they're different). Does it make sense to define a return spec alongside the input spec so that it's easier to write this sort of thing and see that they're compatible types? (like RetCollectionInfoT as a counterpart to CollectionInfoT?) Or does that defeat the purpose of keeping the input type as lowest common denominator?
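To make the "generic input, concrete return" position concrete, here is a small sketch. The function `group_by_collection` and its sample data are hypothetical and only illustrate the typing pattern, not the real `get_collection_contents`:

```python
import typing as t
from collections import defaultdict


def group_by_collection(
    plugin_content: t.Mapping[str, t.Mapping[str, str]],
) -> t.DefaultDict[str, t.Dict[str, str]]:
    """Accept any read-only mapping, but promise a concrete defaultdict of dicts."""
    grouped: t.DefaultDict[str, t.Dict[str, str]] = defaultdict(dict)
    for _plugin_type, plugins in plugin_content.items():
        for fqcn, short_description in plugins.items():
            namespace, collection, _name = fqcn.split('.', 2)
            grouped[f'{namespace}.{collection}'][fqcn] = short_description
    return grouped


result = group_by_collection({'module': {'community.general.foo': 'Does foo'}})
# Because the return type is concrete, a caller may rely on the defaultdict
# creating the missing 'ansible.posix' entry on first access.
result['ansible.posix']['ansible.posix.bar'] = 'Does bar'
```

With a `Mapping` return annotation, a type checker would reject the last line, since `Mapping` values are not writable through that interface.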
Usually, something like "this implementation uses a defaultdict" is an implementation detail, and callers should not depend on it, because that makes it really hard to change the implementation without going through all callers to make sure they don't depend on it. (If a generic Mapping is documented, they can of course still use it like a defaultdict, but then they're clearly faulty.)
Using CollectionInfoT here says to API users that "this function returns what can be used in other APIs as CollectionInfoT", as opposed to "we return this concrete mapping type, which might be similar to one of the input types of other functions, but you have to figure that out by yourself". Also, defining RetCollectionInfoT differently from CollectionInfoT makes it impossible to ever provide a function returning it that does not use a similar implementation (based on defaultdict).
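A short sketch of the shared-alias argument. The definition of `CollectionInfoT` below is an assumption based on the input annotation in the diff, and the three functions are hypothetical:

```python
import typing as t
from collections import defaultdict

# Assumed definition, modelled on the Mapping-of-Mapping input type in the diff.
CollectionInfoT = t.Mapping[str, t.Mapping[str, t.Mapping[str, str]]]


def contents_via_defaultdict() -> CollectionInfoT:
    """One implementation happens to build the result with a defaultdict."""
    result: t.DefaultDict[str, t.Dict[str, t.Dict[str, str]]] = defaultdict(dict)
    result['community.general']['module'] = {'foo': 'Does foo'}
    return result


def contents_via_plain_dict() -> CollectionInfoT:
    """Another implementation returns plain dicts; the same alias still fits."""
    return {'community.general': {'module': {'foo': 'Does foo'}}}


def count_plugins(info: CollectionInfoT) -> int:
    """A consumer that only needs the read-only Mapping interface."""
    return sum(len(plugins) for by_type in info.values() for plugins in by_type.values())
```

Both producers can feed `count_plugins` without the consumer knowing which container type is behind the alias, which is what makes swapping the implementation safe.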
Changed in 54ed7ce.
Interesting..... So I don't think of static typing info as part of the API but I can definitely see your point that if it is, then we should be using it to define capabilities to expect rather than the specific type. I guess there's two separate use cases (static typing to catch errors in the current implementation vs static typing to define the expected API) and in this specific case they're a little bit at odds.
There are cases where defaultdict and dict behaviour would be different even though you are only using functions that are common to both, so I think it can be beneficial to know concretely what we're returning. To be fair, I don't think static typing will prevent it (at this stage in type checkers' development, at least), so type checking might not be the place to specify this and enforce it.... An example would be the following code:
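from collections import defaultdict

def test() -> defaultdict:
    return defaultdict(int)

d = test()
try:
    # A defaultdict(int) creates the missing key and prints 0 here;
    # a plain dict would raise KeyError and take the except branch.
    print(d['x'])
except KeyError:
    print("d['x'] is unknown")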
+1 for the intent and the output.
Merged, thanks!
Contains #197. Will rebase once that's merged. Fixes #149, fixes #150.