Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move DSSE Implementation to securesystemslib #487

Merged
merged 22 commits into from
Mar 3, 2023

Conversation

PradyumnaKrishna
Copy link
Contributor

Fixes: #370

Description of the changes being introduced by the pull request:

This pull requests aims to add DSSE Envelope to securesystemslib. Moves DSSE implementation from in-toto/securesystemslib to here.

DSSE Envelope currently have three APIs

  • pae() -> to generate pre auth encoding.
  • sign() -> sign DSSE pre auth encoding.
  • verify() -> verify DSSE pre auth encoding with n Keys.
    sign and verify methods of DSSE currently utilizes Key and Singer provided by securesystemslib.

Serialization module commits are remaining, will be added to this PR soon.

Please verify and check that the pull request fulfils the following requirements:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

DSSE Envelope is a specification for signing arbitrary data.

Envelope contains from_dict and to_dict methods to convert the object to
json/dict and vice versa.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
helper functions related to base64 is now moved to util.py

Signed-off-by: Pradyumna Krishna <git@onpy.in>
Run linters like black, isort, pylint and fix all errors.
Update some doc strings.
Change attribute "payload_type" to snake case.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
Pre-Auth-Encoding method returns a byte sequence required to sign the
payload and create DSSE signatures.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
DSSE envelope tests are added which included test cases like
* test for from and to dict methods of envelope
* test to check envelope equality
* and test to check Pre-Auth-Encoding of envelope

Signed-off-by: Pradyumna Krishna <git@onpy.in>
Minor changes includes
* Improve test case of from_dict.
* Convert PreAuthEncoding to property.
* Use of TypeError instead of FormatError.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
Signed-off-by: Pradyumna Krishna <git@onpy.in>
It has been decided to use a GPGSigner such that it generates
Signature objects, not GPGSignature objects. Hence, case handling
has been removed and signature matches are added.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
* Added missing types in Envelope constructor.
* Remove property tag from pre-auth-encoding.
* Modify doc string of DSSE Envelope.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
sign and verify methods are DSSE protocols used to create and verify
DSSE signatures. sign uses a Signer to sign the payload and create
DSSE a signature and verify method uses a Key List to verify all
DSSE signatures with the payload present in the envelope.

Veirfy function updated with latest Key specification. Now
try-catch the key.verify_signature instead of checking condition.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
Tests to ensure integrity of creation and verification of DSSE signatures.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super cool, @PradyumnaKrishna! I only gave it a high-level review just now and diffed it to downstream master, which I had already reviewed at some point. You did a very nice job fixing conflicts with all the upstream signer developments!

Please add serialization parts and we can invite other maintainers to take a look.

I think it is feasible to review all in one PR after all. The advantage of a single PR is that we can more easily test in-toto (in-toto/in-toto#503) and python-tuf (I'd like to try using this there as well) against that PR, and update if needed.

securesystemslib/metadata.py Outdated Show resolved Hide resolved
payload = b64dec(data["payload"])
payload_type = data["payloadType"]

formats.SIGNATURES_SCHEMA.check_match(data["signatures"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not use schema/formats in any new code. The from_dict parser hierarchy should be format validation enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

securesystemslib/metadata.py Outdated Show resolved Hide resolved
securesystemslib/metadata.py Outdated Show resolved Hide resolved

@classmethod
def from_dict(cls, data: dict) -> "Envelope":
"""Creates a Signature object from its JSON/dict representation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy-pasta?


@classmethod
def from_dict(cls, data: dict) -> "Envelope":
"""Creates a Signature object from its JSON/dict representation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Creates a Signature object from its JSON/dict representation.
"""Creates a DSSE Envelope from its JSON/dict representation.


Note:
Mandating keyid in signatures and matching them with keyid of Key
in order to consider them for verification, is not a DSSE spec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
in order to consider them for verification, is not a DSSE spec
in order to consider them for verification, is not DSSE spec

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

Metadata classes and other similar classes requires a method to
serialize and deserialize their objects into json, bytes, and
other forms. Hence, to provide a common interface to serialize
and deserialize is a better approach.

GenericSerializer and GenericDeserializer are introduced having
method serialize and deserialize respectively. These methods will
serialize class instance to raw bytes and vice versa.

SerializationError and DeserializationError will be raised on error
occurrences.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
Serialization and Desrialization to and from utf-8 encoded json
bytes.

These JSON method requires an object or class type with to_dict
or from_dict methods respectively. Which will result in reuse of
this serialization technique between multiple different types of
Metadata or other serializable entities.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
Envelope is now made a serializable entity to support operations
like: from_bytes, to_bytes, from_file and to_file.

Payload deserialization method is added to convert payload into a
required form, currently from json to a required class_type.
Maybe its not best idea to reuse BaseDeserializer here because
payload may not always result as an instance of a class.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
Tests are added for serialization considering all possible cases.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
JSONSerializable class provide abstract method `from_dict` and
`to_dict`, required by JSONDeserializer and JSONSerializer to
serialize object to json bytes and vice versa.
Inheriting the JSONSerializable class, these two method need to
be defined by the subclass in order to use JSON serialization.

Serializable is renamed to SerializationMixin, to remove confusion
about naming and keep it serparate from JSONSerializable.

Option for default serializer and deserializer are added to
SerializatioMixin. The method default_serializer will be defined
to provide a default serializer for serialization, if a serializer
is not provided (similar for default_deserializer).

By using this method, Metadata or other container can define a
default option in case of multiple serializer (or deserialzer)
other than JSON one, which makes it simpler. Otherwise, have to
redefine all methods of Mixin for each Serialization method.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
TODO is added to use typing.Protocol after dropping python 3.7

Method return type SerializationMixin doesn't gives type hints
for Envelope, Metablock and will not provide hints to other
wrappers too. Any type is much more flexible in this case.

Update doc-string in deserialize_payload

Documentation for class_type argument corrected in Envelope's
deserialize_payload method.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
BaseDeserializer requires a class type to deserialize the raw_data
into a class. This creates a problem with Paylaod Deserialization,
where we have to deserialize the same payload multiple time.

BaseDeserializer, JSONDeserializer returns json and that get
converted into the class object after that. Payload Deserialization
will be app specific and will be implemented at project level.

Documentation fixes in serialization.py

Removing references to `Serializable` from doc string of
SerializationMixin and mention of default serializer/deserializer
from docs.

Changing method name of default_serializer to _default_serializer
i.e. protected element of class. Also, avoid getting generated in
sphinx docs.

Add reference for classes mentioned in doc-string

Updated doc-string to provide hints, where look for the mentioned
class. For example ``JSONSerializer`` is not renamed to
``securesystemslib.serialization.JSONSerializer``.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
Tests was failing due to changes in serialization and metadata
module. Some methods of DSSE Envelope was renamed which was
failing the test cases.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
@lukpueh
Copy link
Member

lukpueh commented Dec 19, 2022

Really cool, @PradyumnaKrishna! I think we can work with this. I'm trying to integrate it with python-tuf right now. I want to see if the idea works there as well, before we move forward with this.

Do you feel like updating in-toto/in-toto#503 in the meanwhile?

@PradyumnaKrishna
Copy link
Contributor Author

Do you feel like updating in-toto/in-toto#503 in the meanwhile?

Just two minor issues, first GPGSigner and GPGKey support is not present in this PR. Second, the naming of sign and verify methods.

@lukpueh
Copy link
Member

lukpueh commented Dec 22, 2022

FYI: I drafted a PR similar to in-toto/in-toto#503 in theupdateframework/python-tuf#2246, which adds DSSE to python-tuf.

It works pretty well as is, I'm just a little bit worried about the added complexity. But I don't think we can do anything about it here in securesystemslib. Let's see what python-tuf maintainers say.

@lukpueh
Copy link
Member

lukpueh commented Dec 22, 2022

Just two minor issues, first GPGSigner and GPGKey support is not present in this PR....

Maybe we can land in-toto/in-toto#503 without DSSE/GPG support?

[...] Second, the naming of sign and verify methods.

Maybe we don't even have to change the names here in secureystemslib. We can also just alias the functions with the names we need in the in-toto envelope subclass. E.g. in python-tuf I created a sign alias, which uses the same name but has an additional argument. And for verify I use a tuf-specific verify_delegate.

@adityasaky
Copy link
Member

Maybe we can land in-toto/in-toto#503 without DSSE/GPG support?

I'm okay with merging without GPG support when using DSSE and picking that up right after. I think there's loads in here and in 503 that can tie into some other efforts that don't require GPG right away.

@PradyumnaKrishna
Copy link
Contributor Author

Maybe we can land in-toto/in-toto#503 without DSSE/GPG support?

I am wondering how I will do that. Currently, we have to pass a Signer (GPGSigner) to Metablock or Envelope to create a signature. I will have to put more effort in removing GPGSigner and GPGKey from that PR. Maybe you have a better idea?



# TODO: Use typing.Protocol post python 3.7
# See https://github.com/in-toto/securesystemslib/issues/10.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I just closed this issue. Let's just stick to abc for the time being and remove the comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

@lukpueh
Copy link
Member

lukpueh commented Jan 10, 2023

I am wondering how I will do that. Currently, we have to pass a Signer (GPGSigner) to Metablock or Envelope to create a signature. I will have to put more effort in removing GPGSigner and GPGKey from that PR. Maybe you have a better idea?

Hm. You're right, this will subvert some of our elegant AnyMetadata abstraction. I need to think some more about it.

@lukpueh
Copy link
Member

lukpueh commented Jan 17, 2023

I gave this some thought and came to the following conclusion:

Let's implement something like backwards-compatible _LegacyGPGKey and _LegacyGPGSigner in in-toto and use it for traditional metadata (as described in #471) AND, for the time being, just not support gpg signature in DSSE metadata in in-toto.

This allows us to move forward with in-toto/in-toto#503, without:

  • mixing old and new gpg signing API (i.e. securesystemslib.gpg.functions vs. securesystemslib.signer)
  • adding old signatures to DSSE, which does not have backwards-compatibility constraints yet, OR mixing old and new (spec-compliant) gpg signature formats (see #450 and #448)
  • blocking on upstream gpg work (i.e. #488)

Copy link
Collaborator

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, it's taken a while since it's a hard concept to really properly understand (after being so familiar with the TUF metadata style)... but I've now left some specific comments in code and there are a few high level discussion items below as well.

Is securesystemslib the right place?

For the record I think it probably is... but the question may be worth discussing: securesystemslib is already a bit of mishmash collection of code and adding yet another fairly large system in there does not make the "idea of securesystemslib" any clearer. Contributors are easier to find when the code base is understandable and has a purpose...

Is making payload an object attribute correct?

Payload is obviously available at deserialize time so it's easy to just store it in an attribute... but I'm imagining some code that modifies the payload in multiple steps and I wonder if it's good that the code can't really use the Envelope to store the payload during those steps -- or if it does, it then ends up serializing and deserializing the data constantly.

The alternative I'm imagining is the payload potentially being stored in a deserialized format and only serialized when the serialization is needed.... However, the current design would likely work ok with the python-tuf repository module so I'm leaning towards yes, it's fine to enforce that payload is stored as serialized. User is expected to keep track of the deserialized payload object until they are ready to put it back in the envelope

Does threshold-verify really belong in this layer?

I did leave some rather nitpicky review comments in the verify() method already (because this code is so critical)... but as a higher-level question: If envelope as a concept does not even know what a threshold is, why does it contain a threshold-verify method? I think it maybe should not: Thresholds are something both python-tuf and intoto care about but that does not automatically mean Envelope should handle thresholds.

I suppose the alternative could be a simple verify_signature(self, key: Key)

Naming

The unfortunate naming clashes with python-tuf are not making it easier to understand the concept. Maybe they can't be avoided but we could try? Like starting with module names, how about not having a securesystemslib.metadata module? dsse maybe?

securesystemslib/metadata.py Outdated Show resolved Hide resolved
):
self.payload = payload
self.payload_type = payload_type
self.signatures = signatures
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the decision to internally store signatures as a dict in Metadata was a good idea. Is there a reason it's a list here (other than simpler de/serialize)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we kept it a list for simplicity, and because that's what the DSSE spec describes. I'm fine with using a dict internally, if there's a clear usage advantage. Though I don't know if the same requirements apply as for TUF Metadata, since I expect Envelope objects to usually have a much shorter lifetime (also see theupdateframework/python-tuf#2246 (review)).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any opinions about this, @PradyumnaKrishna?

@@ -457,3 +459,40 @@ def digests_are_equal(digest1: str, digest2: str) -> bool:
are_equal = False

return are_equal


def b64enc(data: bytes) -> str:
Copy link
Collaborator

@jku jku Jan 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With regards to expanding existing modules like utils... Should we bite the bullet and make a _utils.py or _internal/utils.py so that we can:

  • add new internal helpers there without expanding the assumed API surface
  • start moving older code from utils.py to the internal version if we think there are methods that were never supposed to be public?

This is not that specific to this PR so feel free to disregard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, especially since we already have base64 utility functions in the public (?) formats module. Let's start a _utils.py or _internal/utils.py just with the b64enc and b64dec from this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

securesystemslib/metadata.py Outdated Show resolved Hide resolved
securesystemslib/metadata.py Outdated Show resolved Hide resolved

Raises:
ValueError: If "threshold" is not valid.
SignatureVerificationError: If the enclosed signatures do not pass
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SignatureVerificationError or VerificationError?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

compliant (Issue #416).

Returns:
accepted_keys: A dict of unique public keys.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't object to it being a dict if that's useful for other purposes but looking at the code it looks more like a set

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

tests/test_metadata.py Outdated Show resolved Hide resolved
@lukpueh
Copy link
Member

lukpueh commented Jan 31, 2023

Many thanks for the review, @jku! Your insights are very much appreciated. I left some replies inline and will comment on your high-level observations here below:

Is securesystemslib the right place?

For the record I think it probably is... but the question may be worth discussing: securesystemslib is already a bit of mishmash collection of code and adding yet another fairly large system in there does not make the "idea of securesystemslib" any clearer. Contributors are easier to find when the code base is understandable and has a purpose...

The "idea of securesystemslib" is actually fairly clear: provide code that is useful for tuf and in-toto python implementations. As such, it seems the right place for Envelope (and even Metadata -> #272). IMO, the bigger problem of securesystemslib is the lack of internal module organization and clear API. With securesystemslib.signer we are going in the right direction. Maybe we can provide a separate subpackage for metadata related API as well?

Is making payload an object attribute correct?

Payload is obviously available at deserialize time so it's easy to just store it in an attribute... but I'm imagining some code that modifies the payload in multiple steps and I wonder if it's good that the code can't really use the Envelope to store the payload during those steps -- or if it does, it then ends up serializing and deserializing the data constantly.

The alternative I'm imagining is the payload potentially being stored in a deserialized format and only serialized when the serialization is needed.... However, the current design would likely work ok with the python-tuf repository module so I'm leaning towards yes, it's fine to enforce that payload is stored as serialized. User is expected to keep track of the deserialized payload object until they are ready to put it back in the envelope

We also went back and forth about this, especially when designing a common abstraction for Envelope and Metadata. But the main difference, which is also the no 1. argument for DSSE, is that serialization there happens in two stages, to allow signature verification before deserializing the payload. This also results in a different usage pattern, where Envelope is really just a metadata wrapper to be discarded after verification, and created right before signature creation respectively.

Does threshold-verify really belong in this layer?

I did leave some rather nitpicky review comments in the verify() method already (because this code is so critical)... but as a higher-level question: If envelope as a concept does not even know what a threshold is, why does it contain a threshold-verify method? I think it maybe should not: Thresholds are something both python-tuf and intoto care about but that does not automatically mean Envelope should handle thresholds.

I suppose the alternative could be a simple verify_signature(self, key: Key)

As mentioned inline, we just implemented the DSSE spec here. I think this is in scope, and could be used as base layer for a verify_delegate method (see tuf.api.metadata.Envelop.verify_delegate of theupdateframework/python-tuf#2246).

Naming

The unfortunate naming clashes with python-tuf are not making it easier to understand the concept. Maybe they can't be avoided but we could try? Like starting with module names, how about not having a securesystemslib.metadata module? dsse maybe?

Agreed. Do you think it still makes sense to have a securesystemslib.metadata subpackage? ... for dsse, serialization, #272, and maybe even Signature, Key, SSlibKey, ...?

@jku
Copy link
Collaborator

jku commented Jan 31, 2023

Does threshold-verify really belong in this layer?

...
As mentioned inline, we just implemented the DSSE spec here.

oh, I completely missed that. If this was a spec review I would strongly argue this has no place in the envelope format spec. Since this is not spec review, I guess it has to be part of the implementation.

Although I will point out that verify() does not implement the last steps of the specified verify process anyway (probably because they also make no sense in the envelope):

  • Reject if PAYLOAD_TYPE is not a supported type.
  • Parse SERIALIZED_BODY according to PAYLOAD_TYPE. Reject if the parsing fails.

so arguably the threshold verification could be left to higher level as well 😬

@jku
Copy link
Collaborator

jku commented Jan 31, 2023

This also results in a different usage pattern, where Envelope is really just a metadata wrapper to be discarded after verification, and created right before signature creation respectively.

Oh this would be useful in documentation: it makes sense but isn't obvious at least not from TUF metadata background.

Also, this will make custom fields in the Envelope quite painful... but I think that might be a fair price to pay. I don't see much use for (long lived) custom fields anyway

Comment on lines 161 to 162
Returns:
accepted_keys: A dict of unique public keys.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I resolved an earlier comment related to this... but I've decided I'm not happy about it: we should not return "accepted keys" that is not actually a complete list but just some of the valid keys.

This return value is also not part of the protocol spec.

To be clear: I don't object to the method returning a set of keys, but I do object to it returning a vaguely defined set of keys: returning None would be better than that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's return None then for the time being so that we provide a spec compliant implementation. Also, AFAICS, we don't currently need the return value in in-toto.

@jku
Copy link
Collaborator

jku commented Feb 1, 2023

filed secure-systems-lab/dsse#55 for spec discussion

lukpueh added a commit to lukpueh/tuf that referenced this pull request Feb 3, 2023
securesystemslib now provides a generic de/serialization module,
modelled after `tuf.api.serialization`:
secure-systems-lab/securesystemslib#487

This module provides abstract `BaseSerializer` and
`BaseDeserializer` classes, which define serialize and deserialize
methods respectively. They are similar to TUF's
`MetadataSerializer` and `MetadataDeserializer`, but without
restricting argument or return value type to `Metadata`.

The module further provides `JSONSerializer` and `JSONDeserializer`
classes, which implement serialize and deserialize methods
respectively. They are similar to TUF's existing `JSONSerializer`
and `JSONDeserializer`, but without `Metadata`-specific code. That
is, `JSONSerializer` takes any object that implements `to_dict`
(i.e. `JSONSerializable`), and `JSONDeserializer` just deserializes
json bytes to dict, and leaves object specific deserialization
(e.g. `Metadata.from_dict`) to a subclass implementation.

This patch uses securesystemslib's base de/serializer classes in
metadata method signatures (type-aliased for backwards
compatibility) and factors out the json-specific de/serialization
to securesystemslib.

**IMPORTANT NOTE:** This is a POC to demonstrate how the
`securesystemslib.serialization` can be used in general.  It may
not actually be that useful for python-tuf, given that python-tuf
has a working serialization subpackage and below caveat.  It can,
however, be re-used in a couple of other classes, which don't have
a proper de/serialization implementation yet.

Generic DSSE
- secureystemslib.Envelope.from_file
- secureystemslib.Envelope.get_payload
DSSE for in-toto payloads
- in_toto.Envelope.from_file
- in_toto.Envelope.get_payload
in-toto Metadata pendant
- in_toto.Metablock.from_file
in-toto Metablock and DSSE abstraction
- in_toto.AnyMetadata.from_file
- in_toto.AnyMetadata.get_payload

**CAVEAT:**
Using securesystemslib's `Base[Des|S]erializer` instead of
`Metadata[Des|S]erializer` (and `SignedSerializer`) weakens the
interface, because it is more generic about the argument/return
value type. Stricter typing and also raising the right TUF
de/serialization errors's needs to be implemented by a tuf-specific
de/serializer, see e.g. `tuf.api.serialization.json`.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
lukpueh added a commit to lukpueh/tuf that referenced this pull request Feb 3, 2023
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
lukpueh added a commit to lukpueh/tuf that referenced this pull request Feb 3, 2023
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
@jku
Copy link
Collaborator

jku commented Feb 6, 2023

I had a look at theupdateframework/python-tuf#2292 and that does make it clear what the purpose of the serialization mixins and abstractions are. I'm not sure if it's worth all of the engineering complexity to share the serialization code between Metadata and Envelope but it looks correct to me and I don't have a problem with it.

Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we can finally get this merged. @jku, green-lighted the serialization-related code and what's left for DSSE is commented inline.

Regarding naming, I take back my suggestion to create a metadata subpackage. It's unlikely that we will create a common traditional metadata class for tuf and in-toto (#272), which such a package would have been a good fit for. Because tuf and in-toto already have working metadata implementations, which they actually want to move away from in favor of DSSE.

So I suggest the following renames for this PR:

  • util.py -> _internal/utils.py
  • metadata.py -> dsse.py
  • test_metadata.py -> test_dsse.py

@PradyumnaKrishna, do you care for finalizing the PR?

):
self.payload = payload
self.payload_type = payload_type
self.signatures = signatures
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any opinions about this, @PradyumnaKrishna?

payload = b64dec(data["payload"])
payload_type = data["payloadType"]

formats.SIGNATURES_SCHEMA.check_match(data["signatures"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

securesystemslib/metadata.py Outdated Show resolved Hide resolved
securesystemslib/metadata.py Outdated Show resolved Hide resolved

Raises:
ValueError: If "threshold" is not valid.
SignatureVerificationError: If the enclosed signatures do not pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

compliant (Issue #416).

Returns:
accepted_keys: A dict of unique public keys.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

Comment on lines 161 to 162
Returns:
accepted_keys: A dict of unique public keys.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's return None then for the time being so that we provide a spec compliant implementation. Also, AFAICS, we don't currently need the return value in in-toto.



# TODO: Use typing.Protocol post python 3.7
# See https://github.com/in-toto/securesystemslib/issues/10.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping



# TODO: Use typing.Protocol post python 3.7
# See https://github.com/in-toto/securesystemslib/issues/10.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

@@ -457,3 +459,40 @@ def digests_are_equal(digest1: str, digest2: str) -> bool:
are_equal = False

return are_equal


def b64enc(data: bytes) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

@PradyumnaKrishna
Copy link
Contributor Author

@PradyumnaKrishna, do you care for finalizing the PR?

I will work on in-toto/in-toto#503 first, and then I will update this PR according to the in-toto integration.

@PradyumnaKrishna
Copy link
Contributor Author

PradyumnaKrishna commented Feb 13, 2023

@lukpueh, are we going to use create_sig and verify_sigs instead of sign and verify for signing and verification method names of DSSE?

@lukpueh
Copy link
Member

lukpueh commented Feb 15, 2023

@lukpueh, are we going to use create_sig and verify_sigs instead of sign and verify for signing and verification method names of DSSE?

I suggest we keep using sign and verify here. We need to subclass Envelope in in-toto anyway. We can just create Metablock-compatible aliases for sign and verify there. WDYT?

@lukpueh
Copy link
Member

lukpueh commented Feb 16, 2023

I'm not sure if it's worth all of the engineering complexity to share the serialization code between Metadata and Envelope but it looks correct to me and I don't have a problem with it.

I fear @jku has a point here. Reasons to add the serialization infrastructure (base- and json-de/serializers plus common methods to read and write from and to files), was to re-use it for in-toto, python-tuf and a generic securesystemslib dsse envelope.

BUT given that we are:

  • unlikely to use it in python-tuf (details in tuf#2292), and
  • don't have a use case for a standalone securesystemslib Envelope, and
  • are generally trying to reduce the engineering footprint in this library,

... I suggest we don't include the de/serialization code with this PR after all, but figure out de/serialization for in-toto independently (in-toto/in-toto#503).

This means Envelope loses its _default_deserializer, _default_serializer and get_payload methods. But it can keep from_dict and to_dict. This is consistent with other metadata-related classes in securesystemslib.signer, and may be helpful for any custom de/serialization solution.

* Moves DSSE Envelope from metadata.py to dsse.py.
* Moves base64 functions from util.py to _internal/utils.py
* Minor documentation fixes

Signed-off-by: Pradyumna Krishna <git@onpy.in>
Serialization api is removed from securesystemslib with this
commit because of its complexity and don't having a use case
for a standalone securesystemslib Envelope.

This commit removes `_default_deserializer`, `_default_serializer`
and `get_payload` methods of Envelope with securesystemslib but
keeps `from_dict` and `to_dict` methods to be consistent with
other metadata related classes, e.g. signer.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
compliant (Issue #416).

Returns:
accepted_keys: A dict of unique public keys.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: Sphinx renders accepted_keys as return type, which is incorrect. Besides, the name is an implementation detail of this method, so no need to mention it in the docstring.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, @jku made a good point about the return value being too vaguely defined, and not part of the dsse spec ( see #487 (comment))

I suggest we either return None -- I checked in-toto, we don't use it there -- or we add a tiny bit more detail in the documentation to explains that these are only the first threshold of keys that verified signatures.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just add the docs here and move on. What do you think about...

Suggested change
accepted_keys: A dict of unique public keys.
A dict of the threshold of unique public keys that verified a signature.

?

Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's wrap this up, it looks like this works well enough for our in-toto use case (in-toto/in-toto#503).

Before we merge and release, would you mind accepting my inline comments?

return signature

def verify(self, keys: List[Key], threshold: int) -> Dict[str, Key]:
"""Verify the payload with the provided Keys.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at how we use this function in in-toto -- verify(keys=[key], threshold=1), and building in-toto specific threshold check verification around it -- @jku's argument in secure-systems-lab/dsse#55 seems more convincing now.

That said, we can work with what we have and I'd really like to move forward. Let's just add comment that makes it very clear that this API isn't final yet. E.g...

Suggested change
"""Verify the payload with the provided Keys.
"""Verify the payload with the provided Keys.
NOTE: This API is experimental and might change (see secure-systems-lab/dsse#55)

compliant (Issue #416).

Returns:
accepted_keys: A dict of unique public keys.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compliant (Issue #416).

Returns:
accepted_keys: A dict of unique public keys.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just add the docs here and move on. What do you think about...

Suggested change
accepted_keys: A dict of unique public keys.
A dict of the threshold of unique public keys that verified a signature.

?

@PradyumnaKrishna PradyumnaKrishna marked this pull request as ready for review March 3, 2023 05:22
Updates documentation of verify method in Envelope, add a little
more detail over returns section.
Fix linting issue.

Signed-off-by: Pradyumna Krishna <git@onpy.in>
@lukpueh
Copy link
Member

lukpueh commented Mar 3, 2023

Thanks for the update, @PradyumnaKrishna! CI failure is unrelated (see #520). Merging. 🎉

@lukpueh lukpueh merged commit 3a08f0f into secure-systems-lab:main Mar 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for DSSE
4 participants