String representation of Sigstore identities #7

znewman01 · 2023-07-12T11:15:41Z

There are a number of places where users must ask "does this signature come from X?" where X is an "identity." This is actually non-trivial to get right: you can't just ask for user@example.com because what if I made my username user@example.com for some random OIDC provider that Fulcio happens to trust (like justtrustme.dev)? See sigstore/cosign#1947
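To make the failure mode concrete, here is a minimal sketch in Python. The `CertIdentity` type and both matching functions are hypothetical, not part of any Sigstore client, and the certificate is reduced to just an issuer and a subject. It shows that a subject-only check accepts a certificate from any issuer Fulcio trusts, while checking the (issuer, subject) pair does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CertIdentity:
    """Hypothetical, simplified view of a signing certificate."""
    issuer: str   # OIDC issuer that vouched for the subject
    subject: str  # e.g. an email address


def matches_subject_only(cert: CertIdentity, subject: str) -> bool:
    # Unsafe: ignores which issuer vouched for the subject.
    return cert.subject == subject


def matches_issuer_and_subject(cert: CertIdentity, issuer: str, subject: str) -> bool:
    # Safer: the subject only counts when it comes from the expected issuer.
    return cert.issuer == issuer and cert.subject == subject


genuine = CertIdentity("https://accounts.google.com", "user@example.com")
# Same username, registered with some other provider that Fulcio happens to trust.
impostor = CertIdentity("https://justtrustme.dev", "user@example.com")

print(matches_subject_only(impostor, "user@example.com"))
print(matches_issuer_and_subject(impostor, "https://accounts.google.com", "user@example.com"))
```

The first check accepts the impostor and the second rejects it, which is why the issuer has to be part of whatever string we settle on.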

So we've settled on a UX in Cosign that's kind of a pain: you have to have a magic combination of flags (--certificate-oidc-issuer, --certificate-identity) and this gets even worse when you start considering e.g. workload identities via GitHub Actions (sigstore/cosign#2691). There are a number of issues related to this UX.

A number of folks have remarked something like "wouldn't it be nice if I could just pass in a string to represent the identity and Cosign figured the rest out?" While each project (policy-controller, Cosign, sigstore-python, any other CLI or otherwise user-facing implementations) could figure this out on their own, it seems useful to have a consistent notion of identity across the Sigstore ecosystem, and sig-clients seems like a good place to coordinate.

(Note that this issue is mostly about some kind of user-facing string, not a string for embedding into the Fulcio certificate in place of the multiple extensions that get used.)

Some general considerations:

Do we even want to encourage typing identities into the CLI?
- In the long term, you should always be getting your identities via some meta-policy like TUF.
- But in the short-term, there is a need to provide identities yourself.
- Maybe we could have users provide this input in the form of a file, which may be machine-readable but not necessarily human-friendly. This avoids risks like typosquatting.
  - This is somewhat user-hostile but maybe the security tradeoffs are worth it.
Backwards compatibility: you don't want to make old identities, which used to embed hard-coded strings, suddenly more liberal than they were.
- Probably easy to avoid: just use a different, mutually exclusive flag.
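As a rough illustration of the "different, mutually exclusive flag" idea, here is a sketch using Python's `argparse`. The program name `verify-sketch` and the `--identity-string` flag are hypothetical; only `--certificate-identity` and `--certificate-oidc-issuer` correspond to flags mentioned above, and this is not how Cosign itself parses arguments. Because the legacy path involves two flags, the exclusivity check is done by hand rather than with `add_mutually_exclusive_group`.

```python
import argparse


def parse(argv):
    parser = argparse.ArgumentParser(prog="verify-sketch")
    # Legacy flags: their matching semantics stay exactly as they are today.
    parser.add_argument("--certificate-identity")
    parser.add_argument("--certificate-oidc-issuer")
    # Hypothetical new flag carrying the single identity string.
    parser.add_argument("--identity-string")
    args = parser.parse_args(argv)

    legacy_used = (
        args.certificate_identity is not None
        or args.certificate_oidc_issuer is not None
    )
    new_used = args.identity_string is not None

    if legacy_used and new_used:
        # Refuse to mix the two styles so old inputs are never reinterpreted.
        parser.error("--identity-string cannot be combined with the legacy flags")
    if not legacy_used and not new_used:
        parser.error("an identity is required")
    return args
```

Old invocations keep going through the old code path untouched, so they cannot become more liberal by accident.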

Some options that have been discussed:

Distinguished Names in X.500

RFC 1779 provides a notion of a string representation of distinguished names. This uses ASCII strings (though it is capable of representing arbitrary ASN.1 BER-encoded data via an "escaped" notation): Foo=lol, Bar=baz.

Pros:

It works.
No need to invent something new.

Cons:

No support for more advanced matching (wildcards, regular expressions).
A little gross once special characters get involved.
Hard for clients to look at such a string and figure out if you're asking for something sensible or something totally insecure (e.g., omitting the issuer).

Invent something ourselves

Pros:

Meets our needs: easy to define a set of common patterns that don't have footguns (e.g. user("user@example.com", "accounts.google.com") wouldn't let you omit the issuer)
We can make it quite ergonomic.

Cons:

Nonstandard.
We may get something wrong.
Hard to enable flexibility (e.g., using regular expressions or wildcards). This could also be considered a pro 🙂
Requires development work for every new "archetype" of identity (e.g., BYO PKI).
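A minimal sketch of what such an archetype could look like, in Python. The `Identity` type and the `user` function are hypothetical and only mirror the `user("user@example.com", "accounts.google.com")` example above. The point is that the issuer is a required argument, so there is no way to write down an identity without one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    """Hypothetical result type: always carries both fields."""
    issuer: str
    subject: str


def user(email: str, issuer: str) -> Identity:
    """Archetype for a human user identified by email at a given issuer."""
    if not issuer:
        raise ValueError("an issuer is required")
    if "@" not in email:
        raise ValueError("expected an email address")
    return Identity(issuer=issuer, subject=email)


alice = user("user@example.com", "accounts.google.com")

try:
    user("user@example.com")  # issuer omitted
except TypeError as err:
    print("rejected:", err)
```

Every additional archetype (workload identities, BYO PKI, and so on) would need its own function like this one, which is the development cost listed in the cons.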

Logic programming / expression language

Squinting, you might realize that the "identities" we're talking about aren't so much fixed identities as a predicate over the certificate. That is, sometimes I want to match all of some number of X.509 extensions; sometimes I just want a few. Maybe I want to express things like "signed by Alice or Bob."

There are existing languages for expressing predicates. These include full-blown programming languages (a terrible idea in this case!) and more-restricted languages, like logic programming languages (Rego, CUE) and expression/filtering languages (jq, CEL). Cosign already supports Rego and CUE for matching predicates over attestations. Could we have users provide expressions for identity matching?

Pros:

As flexible as we need.
Could be used internally for validation. Prevents bad verification and false positives due to client implementation bugs.
We could ship a Sigstore "standard library" for these languages to embed common patterns.
Matches the way we already handle attestations.

Cons:

Requires shipping an engine for these languages with each client.
Complexity: way more effort to implement than something hardcoded.
- Though arguably this cuts through an existing Gordian knot of verification somewhat, and overall simplifies/unifies things.
Yet another concept for users to learn. Maybe while learning they'll make security-critical mistakes.
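To illustrate the "predicate over the certificate" view, here is a small sketch using plain Python combinators. This is a stand-in for illustration only: it is not Rego, CUE, jq, or CEL, and the certificate is assumed to be a flat mapping with hypothetical `issuer` and `subject` keys rather than real X.509 extensions. All function names are made up for this example.

```python
from typing import Callable, Mapping

Cert = Mapping[str, str]            # hypothetical flattened certificate
Predicate = Callable[[Cert], bool]


def field_equals(name: str, expected: str) -> Predicate:
    """Match when the certificate has `name` set to exactly `expected`."""
    return lambda cert: cert.get(name) == expected


def all_of(*preds: Predicate) -> Predicate:
    """Explicit AND. An empty AND is rejected because it would match anything."""
    if not preds:
        raise ValueError("all_of needs at least one predicate")
    return lambda cert: all(p(cert) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Explicit OR."""
    if not preds:
        raise ValueError("any_of needs at least one predicate")
    return lambda cert: any(p(cert) for p in preds)


def user(email: str, issuer: str) -> Predicate:
    """'Standard library' style helper: issuer and subject must both match."""
    return all_of(field_equals("issuer", issuer), field_equals("subject", email))


ISSUER = "https://accounts.google.com"
alice_or_bob = any_of(
    user("alice@example.com", ISSUER),
    user("bob@example.com", ISSUER),
)

print(alice_or_bob({"issuer": ISSUER, "subject": "bob@example.com"}))
print(alice_or_bob({"issuer": "https://justtrustme.dev", "subject": "bob@example.com"}))
```

Here the AND/OR structure is written out explicitly, so "signed by Alice or Bob" is unambiguous. A real expression language would give the same structure in a form users can put in a file, at the cost of shipping an engine.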

Do nothing

It’s somewhat tough to express these in CLI flags, but maybe we just have the wrong flags? You could still do CLI flags to express common patterns. Maybe you need something like mutually exclusive groups with more specific requirements. Or, people are getting by with the current flags (though they are frequently complaining, as the issues mentioned above illustrate).

Pros:

Easy.
Familiar. Works.

Cons:

No flexibility.
Validating that a user's query is sensible is quite hard to check, and needs to be repeated across ecosystems.
This is really easy to shoot yourself in the foot with. What’s AND-ed, what’s OR-ed?

kommendorkapten · 2023-07-17T12:17:50Z

My gut feeling would be that Logic programming / expression language and Do nothing are viable options.

The Do nothing option could be made easier for a CLI program as mentioned in other tickets by utilizing different subcommands.

The provided options could then be encoded into verification options https://github.com/sigstore/protobuf-specs/blob/59b0801bae9e856b9f7b85ddc8873e24a663ccfb/protos/sigstore_verification.proto#L29
For this to work we would need to formally define which options are AND-ed and which are OR-ed. With the subcommands we may be able to make this simpler to use?

I would consider requiring Do nothing for all clients and enforcing it via the conformance tests, then making Logic / expression optional and considering it for sigstore-go so that cosign and policy-controller can use it. They would of course share the engine (it's implemented in sigstore-go), and therefore also any syntax or file format in which the policy can be expressed.

If we get a shared implementation in the future (via FFI) we can of course bring more complex verifications to all clients that are using that (it may not be viable for all clients as discussed before).

znewman01 added the enhancement New feature or request label Jul 12, 2023

di mentioned this issue Jul 13, 2023

Incorrect information in https://www.python.org/download/sigstore/ sigstore/sigstore-python#600

Closed
