-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add simple TUF role metadata model #1112
Conversation
3bac769
to
bf805b0
Compare
tests/test_api.py
Outdated
|
||
# Copyright 2020, New York University and the TUF contributors | ||
# SPDX-License-Identifier: MIT OR Apache-2.0 | ||
""" Unit tests for api/metdata.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
""" Unit tests for api/metdata.py | |
""" Unit tests for api/metadata.py |
be19a11
to
ae23d32
Compare
Add metadata module with container classes for TUF role metadata, including methods to read/serialize/write from and to JSON, perform TUF-compliant metadata updates, and create and verify signatures. The 'Metadata' class provides a container for inner TUF metadata objects (Root, Timestamp, Snapshot, Targets) (i.e. OOP composition) The 'Signed' class provides a base class to aggregate common attributes (i.e. version, expires, spec_version) of the inner metadata classes. (i.e. OOP inheritance). The name of the class also aligns with the 'signed' field of the outer metadata container. Based on prior observations in TUF's sister project in-toto, this architecture seems to well represent the metadata model as it is defined in the specification (see in-toto/in-toto#98 and in-toto/in-toto#142 for related discussions). This commits also adds tests. **TODO: See doc header TODO list** **Additional design considerations** (also in regards to prior sketches of this module) - Aims at simplicity, brevity and recognizability of the wireline metadata format. - All attributes that correspond to fields in TUF JSON metadata are public. There doesn't seem to be a good reason to protect them with leading underscores and use setters/getters instead, it just adds more code, and impedes recognizability of the wireline metadata format. - Although, it might be convenient to have short-cuts on the Metadata class that point to methods and attributes that are common to all subclasses of the contained Signed class (e.g. Metadata.version instead of Metadata.signed.version, etc.), this also conflicts with goal of recognizability of the wireline metadata. Thus we won't add such short-cuts for now. See: theupdateframework#1060 (comment) - Signing keys and a 'consistent_snapshot' boolean are not on the targets metadata class. They are a better fit for management code. See: theupdateframework#1060 (comment), and theupdateframework#660. - Does not use sslib schema checks (see TODO notes about validation in doc header) - Does not use existing tuf utils, such as make_metadata_fileinfo, build_dict_conforming_to_schema, if it is easy and more explicit to just re-implement the desired behavior on the metadata classes. - All datetime's are treated as UTC. Since timezone info is not captured in the wireline metadata format it should not be captured in the internal representation either. - Does not use 3rd-party dateutil package, in order to minimize dependency footprint, which is especially important for update clients which often have to vendor their dependencies. However, compatibility between the more advanced dateutil.relativedelta (e.g handles leap years automatically) and timedelta is tested. - Uses PEP8 indentation (4 space) and Google-style doc string instead of sslab-style. See secure-systems-lab/code-style-guidelines#20 - Does not support Python =< 3.5 Co-authored-by: Trishank Karthik Kuppusamy <trishank.kuppusamy@datadoghq.com> Co-authored-by: Joshua Lock <jlock@vmware.com> Co-authored-by: Teodora Sechkova <tsechkova@vmware.com> Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
e530af3
to
6732705
Compare
I will be leaving some thoughts here while looking at the way the client side (Updater) may use the api/metadata.
|
Great idea! What about the following interface? # Classmethod on all classes
def from_dict(json_dict):
"""
Initialize object from Python dictionary, and call from_dict for any
inner complex attribute that is represented by a class.
In the case of Metadata.from_dict(), it must dispatch to the corresponding
from_dict of the base class of Signed, identified by
json_dict['signed']['_type'].
NOTE: This is basically a mix of the constructor and load_from_json and
the inverse of `as_dict` in the current version of metadata.py. """
# Classmethods on Metadata (and optionally on Signed)
def from_json(json_object):
"""Deserialize a json string and pass to from_dict(). """
def from_json_file(filename):
"""Read file from storage and pass to from_json(). """ |
a71b03e
to
520bc20
Compare
Hey @sechkova, I added an @woodruffw, @joshuagl, @trishankatdatadog, I'd say this is ready for review aka. throwing rocks at it. It still misses some features but what's there is tested, documented and should serve our purposes for PEP 458. I think we can work off the todo list from |
tuf/api/metadata.py
Outdated
version: The metadata version number. | ||
spec_version: The TUF specification version number (semver) the | ||
metadata format adheres to. | ||
expires: The metadata expiration date in 'YYYY-MM-DDTHH:MM:SSZ' format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now a datetime
object.
tuf/api/metadata.py
Outdated
and assign attributes later on. | ||
- Fail on bad json metadata in write_to_json method, but with option to | ||
disable check as there might be a justified reason to write WIP | ||
metadata to json. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw. what do others think about the ValidationMixin methods we use in in-toto? (also see Layout._validate_*
methods for example usage)
I think it's a commendable way of doing this (credits to @SantiagoTorres)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That ValidationMixin
looks nice! I'd be happy to see it used here in tuf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job, @lukpueh! 🎉
I don't have strong opinions about this PR: it looks largely like the Metadata API I originally proposed, except for changes like removing KeyRing and Threshold so on (we can argue about that later). As long as @woodruffw and @joshuagl also approve, I'm 👍🏽
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice streamlining of the metadata API PR. Thanks @lukpueh! I have a few minor questions and comments inline. I'm also curious to understand why version
and spec_version
have different types? version: int
and spec_version: str
, is there any reason spec_version
can't be an int?
tuf/api/metadata.py
Outdated
|
||
FIXME: Is this behavior expected? An alternative approach would be | ||
to raise an exception if no signature is found for the keyid, | ||
and/or if more than one sigantures are found for the keyid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and/or if more than one sigantures are found for the keyid. | |
and/or if more than one signatures are found for the keyid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the alternative approach of raising an exception if no signature is found, or more than one signature is found, for the keyid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I do too. Any advice on the exception type? At some point we should discuss exception taxonomy guidelines. But we shouldn't block on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't seem to have an exception in tuf.exceptions
or securesystemslib.exceptions
that would fit, we could either throw a generic exception or define a new one. Either way, it will probably need some cleanup once guidelines are defined.
cls.temporary_directory = tempfile.mkdtemp(dir=os.getcwd()) | ||
|
||
test_repo_data = os.path.join( | ||
os.path.dirname(os.path.realpath(__file__)), 'repository_data') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8 space indentation on a continuation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. Good question. I always thought PEP 8 required double-indentation for hanging indent, but it is only a suggestion, and also only to distinguish it from a subsequent nested block.
IMO it's a good idea to always differentiate line continuation indentation from functional indentation, even if there is no subsequent nested block. But I'm fine either way.
FYI, the google style guide does not say anything about double indent but has some examples that could be interpreted that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it's a good idea to always differentiate line continuation indentation from functional indentation, even if there is no subsequent nested block. But I'm fine either way.
That seems like a reasonable coding style for us to adopt. We should document it and teach the linter about it 😃
tuf/api/metadata.py
Outdated
and assign attributes later on. | ||
- Fail on bad json metadata in write_to_json method, but with option to | ||
disable check as there might be a justified reason to write WIP | ||
metadata to json. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That ValidationMixin
looks nice! I'd be happy to see it used here in tuf
tuf/api/metadata.py
Outdated
# Convert 'expires' TUF metadata string to a datetime object, which is | ||
# what the constructor expects and what we store. The inverse operation | ||
# is implemented in 'to_dict'. | ||
signed_dict['expires'] = iso8601.parse_date( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need an iso8601 dependency if we're only targeting Python 3.5+, can we just use datetime.fromisoformat()
here?
tests/test_api.py
Outdated
@@ -3,7 +3,12 @@ | |||
# Copyright 2020, New York University and the TUF contributors | |||
# SPDX-License-Identifier: MIT OR Apache-2.0 | |||
""" Unit tests for api/metdata.py | |||
|
|||
Skipped on Python < 3.6. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume the goal would be to revert this patch once all Python 2.7 supporting code has been removed?
Yes, |
e06e278
to
0a9fc93
Compare
Thanks for the reviews, @joshuagl and @trishankatdatadog!. I addressed the minor comments in 0a9fc93, which I'm happy to squash with previous commits before merging. The more serious change of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @lukpueh, this looks solid. I'm very happy to see this long-discussed model implemented.
One minor typo in a comment, but otherwise this looks good to rebase and merge.
@@ -19,6 +17,7 @@ | |||
from datetime import datetime, timedelta | |||
from dateutil.relativedelta import relativedelta | |||
|
|||
# TODO: Remove case handling when fully dropping support for versions >= 3.6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# TODO: Remove case handling when fully dropping support for versions >= 3.6 | |
# TODO: Remove case handling when fully dropping support for versions < 3.6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 thanks for catching this!
tuf/api/metadata.py
Outdated
|
||
import iso8601 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
signed_dict['expires'] = datetime.strptime( | ||
signed_dict['expires'], | ||
"%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks for adding this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have three uses of iso8601.parse_date
in the current codebase, so this might be a good candidate for a helper function in future.
The new metadata module uses constructs that are only available on Python >= 3.6 (typing, f-format strings, etc.). Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Add generic read from json class method that returns a Metadata object with a signed field that contains the appropriate Signed subclass, based on the signed._type field of the read metadata. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Add simple as_json Metadata method and use it instead of repository lib's internal _get_written_metadata function in write_to_json. This commit further adds code documentation and the possibility to write compact json by excluding whitespace to write_to_json, and also removes a call to the sign method from write_to_json. The commit also adds tests. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Add simple methods to create or verify signatures of the canonical_signed property of a Metadata object. See corresponding docstrings for behavior and design considerations. The commit also adds tests and updates the test setup to load some test keys into memory. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Remove metadata factory on Signed class, for the sake of API simplicity/non-ambiguity, i.e. it's enough to have one way of loading any Metadata, that is: Metadata.read_from_json Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Consistenly rename de/serialization interface methods, using a 'from_' and 'to_' prefix. read_from_json -> from_json_file write_to_json -> to_json_file as_json -> to_json as_dict -> to_dict signed_bytes -> to_canonical_bytes The latter is also changed from a property to a method for consistency with the other serialization methods. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
This commit better separates the Metadata class model from the Metadata wireline format, by tailoring the constructors towards class-based parameters and adding an additional factory classmethod that creates Metadata objects based on the wireline json/dictionary metadata representation. (pythonic way of constructor overloading). This 'from_dict' factory method recurses into the 'from_dict' methods of each contained complex field/attribute that is also represented by a class. Currently 'signed' is the only such attribute. This commit further: - Changes optional constructor keyword arguments to mandatory positional arguments: Reduces code and simplifies usage by restricting it. For now, users are unlikely to call constructor directly anyway, but the 'from_dict' factory (or its 'from_json_file' wrapper) instead. - Removes Signed.__expiration (datetime) vs. Signed.expires (datestring) dichotomy: Keeping only one representation of the same attribute in memory makes the interface simpler and less ambiguous. We choose the datetime object, because it is more convenient to modify. Transformation from and to the string format required by the tuf wireline format is performed in the corresponding metadata de/serialization methods, i.e. ('to_dict' and 'from_dict'). Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Add convenience wrapper that takes a json string and passes it to from_dict to create a Metadata object. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Change Metadata.verify(key) behavior to raise an exception if none or multiple signatures for the passed key are found on the Metadata object. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
See: Add root metadata class to new TUF metadata model theupdateframework#1137 Add classes for complex metadata fields theupdateframework#1139 Add input validation to simple metadata api theupdateframework#1140 Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Use builtin datetime instead of external iso6801 for simple datetime string parsing. Also see theupdateframework#1065 Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
0a9fc93
to
f106435
Compare
Thanks again for the reviews! I just cleaned up the git history, fixed a minor inconsistency and removed the todo list after created GitHub issues for each item. |
This PR copies over the new
api/metadata.py
module (+ tests) added in #1060, which has shifted focus to implementing a Hashicorp Vault interface.Co-authorship by @trishankatdatadog, @joshuagl, @sechkova and me plus note-worthy commit message details are preserved in a single squashed commit.
Missing TODOs items for this PR include at least.
See doc header of
api/metadata.py
for more details