-
Notifications
You must be signed in to change notification settings - Fork 276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Destroy unnecessary MetaFile classes in formats.py #836
Conversation
8c0338a
to
1b994fd
Compare
Kills classes MetaFile, RootFile, TargetsFile, MirrorsFile, SnapshotFile, and TimestampFile. They each had an unused from_ method and a used make_ method. They were all additional, unnecessary representations of the same metadata, and it is very important that metadata formats be defined once in the reference implementation, in the schemas that are already used more broadly, in foramts.py. Replaces the classes, their methods, and some associated variables with a single short function called build_dict_conforming_to_schema that takes keyword arguments and builds a dictionary, then checks to make sure that the result conforms to the given schema. This commit shifts repository_lib from use of the old classes to the new function. In later commits, we should use this function more broadly, since it can be of use in all schema construction. There are several TODOs added to the code, mostly for post-#660 tasks. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Add a tuf-level variable in tuf/__init__.py indicating the version of the TUF specification that the code in this repository is intended and expected to conform to. This will be used when writing metadata. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
It was excluded from the Timestamp schema definition in error. In the process of making metadata writing use the Timestamp schema strictly, this bug was discovered. Metadata previously written included specification version, but the schema check did not. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
It pertains to now-deleted metadata classes. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Testing will now use (and test) build_dict_conforming_to_schema. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
1b994fd
to
e7c8229
Compare
Rebased after #844 |
See sister PR and review comments here in Uptane's TUF fork. |
tuf.formats.make_role_metadata concerned itself with exclusivity checks for paths and path_hash_prefixes, but no code actually used it for relevant data. It's yet another custom metadata writer replaced by build_dict_conforming_to_schema. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Make a copy of the provided fields so that the caller's provided values do not change when the returned values are changed. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
in tuf.formats.build_dict_conforming_to_schema Populate _type with the expected value for the given schema, and populate spec_version with tuf.SPECIFICATION_VERSION. Do this only when the values are not provided, and support overriding them. Also adds testing for the above and takes advantage of the above in repository_lib's _generate metadata functions. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
bf84ad5
to
7ecf522
Compare
since make_role_metadata is being replaced by build_dict_conforming_to_schema Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
I think I've addressed your comments on the sister (upTUF) PR, @lukpueh. I've also expanded this PR just a little bit. Should be a quick re-review, I hope. :) |
Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
tuf/formats.py
Outdated
@@ -24,8 +24,7 @@ | |||
module should be read and understood before tackling this module. | |||
|
|||
'formats.py' can be broken down into three sections. (1) Schemas and object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/three/two/
tuf/formats.py
Outdated
tuf.formats.TARGETS_SCHEMA.check_match(result) | ||
else: | ||
# If the field has not been provided and is required, check to see if | ||
# the field is one of the one of the fields we automatically fill. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/one of the one of the/one of the/
tuf/formats.py
Outdated
# | ||
for schema_element in schema._required: #pylint: disable=protected-access | ||
key = schema_element[0] | ||
element_type = schema_element[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why check above if the schema has a check_match
method and here just assume that it has a _required
property, which itself and whose elements are subscriptable?
I recommend to get rid of above hasattr
-check and instead use the commented out isinstance
-check but with a schema.Object
which should be guaranteed to have a _required
-list
and, as a child of schema.Schema
, also a check_match
-method.
Also, consider my comment in awwad#23, about how this function only makes sense for Object
-type schemas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw. you can unpack the _required
-tuples in the loop definition, which imho looks nicer:
for key, element_type in schema._required:
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Re. the additional typing requirements: I forgot to edit the type check after also requiring
_required
. You're right: I should switch to a type check now. -
I'll add a comment noting that this is intended for Object-type schemas (which are for the most part the ones for which it's of interest).
-
On the 2-tuples: quite right, of course -- dunno why I didn't unpack that way. Fixing.
elif (key == 'spec_version' and | ||
element_type == SPECIFICATION_VERSION_SCHEMA): | ||
# If not provided, use the specification version in tuf/__init__.py | ||
dictionary[key] = tuf.SPECIFICATION_VERSION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, I suggested this, but now that I look at it I think it's pretty specific, at least for the function name build_dict_conforming_to_schema
. Do you think there is a better name? If not, disregard this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function is still intended to be a general purpose (Object-type) schema builder, so I would rather remove this convenience functionality than change that scope. There may be a better name, but I don't want to narrow its purpose to Role metadata construction, if that's what you mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine as it is.
tuf.formats.TARGETS_SCHEMA, | ||
version=version, | ||
expires=expiration_date, | ||
targets=filedict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I couldn't sell the kwargs
alternative to you? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatcha mean? The extras bit (e.g. delegations) in the calls below this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function uses kwargs and returns deepcopy(kwargs) (after some modifications).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gonna merge and use this in some other work, now that the rest has been addressed, but if I missed something, please LMK. I'll keep the extras-style optional arguments construction in mind if there are other places calls like this start to need to happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, I was referring to passing something like an extra_kwargs
that may or may not contain the delegations
(sorry for being unclear).
@@ -356,6 +348,7 @@ | |||
TIMESTAMP_SCHEMA = SCHEMA.Object( | |||
object_name = 'TIMESTAMP_SCHEMA', | |||
_type = SCHEMA.String('timestamp'), | |||
spec_version = SPECIFICATION_VERSION_SCHEMA, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this require new tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This in itself, no. The spec version was simply missing in the schema for Timestamp. It was already being written in metadata despite the fact that it was missing in the schema, because this schema was not really used to build the metadata before. Now the schema is used to build metadata.
I want to add more validation that uses these schemas in updater, and to add some additional testing for formats.py related to the schemas more generally, but that's not really related to this change, and it'll have to wait.
Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Edits prompted by Lukas's review. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
- Expand on the arguments/kwargs - Note that the function is particular to schema.Object objects. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
May the Python gods forgive me for having three different names with the same names and different casing on one line.... Perhaps I should address the import module as.... Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Thanks for the review, @lukpueh! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my comments. I think you want to do an isinstance
-check on SCHEMA.Object
in build_dict_conforming_to_schema
. Otherwise it looks good.
tuf/formats.py
Outdated
filedict = targets_metadata.get('targets') | ||
delegations = targets_metadata.get('delegations') | ||
# Check the schema argument type (must provide check_match and _required). | ||
if not isinstance(schema, SCHEMA.Schema): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Schema/Object/
tuf/formats.py
Outdated
# Check the schema argument type (must provide check_match and _required). | ||
if not isinstance(schema, SCHEMA.Schema): | ||
raise ValueError( | ||
'The first argument must be a schema.Schema object, but is not. ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Schema/Object/
elif (key == 'spec_version' and | ||
element_type == SPECIFICATION_VERSION_SCHEMA): | ||
# If not provided, use the specification version in tuf/__init__.py | ||
dictionary[key] = tuf.SPECIFICATION_VERSION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine as it is.
tuf.formats.TARGETS_SCHEMA, | ||
version=version, | ||
expires=expiration_date, | ||
targets=filedict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, I was referring to passing something like an extra_kwargs
that may or may not contain the delegations
(sorry for being unclear).
Raise an error if it's not a schema.Object instance (not just if it's not a schema.Schema instance). Also adds a test for this. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
This PR is related to two efforts:
The primary effect is to replace six classes and their associated methods and some related external functions with one, simpler, general purpose function in formats.py,
build_dict_conforming_to_schema
. These functions are involved in the production of metadata.This PR also:
Details
Kills classes MetaFile, RootFile, TargetsFile, MirrorsFile, SnapshotFile, and TimestampFile. They each had an unused from_ method and a used make_ method. They were all additional, unnecessary representations of the same metadata, and it is very important that metadata formats be defined once in the reference implementation, in the schemas that are already used more broadly, in formats.py.
Replaces the classes, their methods, some related external functions, and some associated variables with a single short function called build_dict_conforming_to_schema that takes keyword arguments and builds a dictionary, then checks to make sure that the result conforms to the given schema.
This commit shifts repository_lib from use of the old classes to the new function.
In later commits, we should use this function more broadly, since it can be of use in all schema construction.
There are several TODOs added to the code, mostly for post-#660 tasks.