ADR: NanoTDF KAS resource locator path and key identifier #900

Open
pflynn-virtru opened this issue May 31, 2024 · 14 comments
Labels
adr Architecture Decision Records pertaining to OpenTDF

Comments

@pflynn-virtru
Member

pflynn-virtru commented May 31, 2024

NanoTDF KAS resource locator path and key identifier

Context

Problem

  1. KAS resource locator usage varies
  2. No identifier for the KAS key in a NanoTDF

The NanoTDF specification requires enhancements to support a key identifier and multiple ways to access a KAS.

This section contains a Resource Locator type that allows describing access to a resource. In the case of the KAS, the Resource Locator defines how to access a KAS.

The Resource Locator is a way for the nanotdf to represent references to external resources in as succinct a format as possible.

See https://github.com/opentdf/spec/tree/main/schema/nanotdf#3312-kas

Example body with protocol values:

https://secure.virtru.com/api/kas
http://kas.example.com:9000
\\securehost\kas

How to access a KAS

  1. Parse KAS resource locator field from NanoTDF header
  2. Create a URL using the protocol enum and the body. The body usually contains the domain and, if needed, a partial path to the KAS service
  3. Append /v2/rewrap or /kas/v2/rewrap or /rewrap to URL from step 2 (varies by SDK)
  4. Perform a RESTful or gRPC call using the TDF protocol (varies by SDK)
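
Steps 1 to 3 above can be sketched roughly as follows. This is an illustration only, not code from any SDK: the binary layout (one protocol byte, one body-length byte, then the body) is my reading of the linked spec section, the function names are hypothetical, and `/v2/rewrap` is just one of the suffix variants listed in step 3.

```python
from typing import Tuple

# Protocol enum values as listed in the NanoTDF spec (http / https only here).
PROTOCOLS = {0x00: "http", 0x01: "https"}


def parse_resource_locator(data: bytes) -> Tuple[int, str]:
    """Split a locator into (protocol, body).

    Assumed layout: 1 protocol byte, 1 body-length byte, then the body.
    """
    if len(data) < 2:
        raise ValueError("resource locator too short")
    protocol, length = data[0], data[1]
    body = data[2:2 + length]
    if len(body) != length:
        raise ValueError("truncated resource locator body")
    return protocol, body.decode("utf-8")


def build_rewrap_url(protocol: int, body: str, suffix: str = "/v2/rewrap") -> str:
    """Join the scheme, the body, and an SDK-specific rewrap suffix."""
    scheme = PROTOCOLS.get(protocol)
    if scheme is None:
        raise ValueError(f"unsupported protocol value: {protocol:#04x}")
    return f"{scheme}://{body.rstrip('/')}{suffix}"


body = b"kas.example.com:9000"
locator = bytes([0x00, len(body)]) + body
protocol, text = parse_resource_locator(locator)
print(build_rewrap_url(protocol, text))  # http://kas.example.com:9000/v2/rewrap
```

Because the suffix is a parameter rather than part of the locator, two SDKs reading the same NanoTDF can end up calling different endpoints, which is the ambiguity this ADR is trying to remove.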

Which KAS key

As we introduce multiple KAS keys and perform key rotations, we need a key identifier (kid), recorded when the NanoTDF is created, so that a rewrap operation can use the same key.

Policy Key Access

This section allows for an ephemeral key other than the Payload key to encrypt the policy.

See https://github.com/opentdf/spec/tree/main/schema/nanotdf#342323-optional-policy-key-access

Goal

  1. Clarify specification in regards to "How to access a KAS":
  • Define how to build the URL to perform a rewrap request including kid
  • Define what parts of the URL are defined where
  • Define where to find what RPC methods are available
  • Define a strategy on how to find KAS if relocated in the future
  2. Update specification in regards to "Which KAS key":
  • Choose kid or public key
  • Clarify keys (policy and payload key?)
  • Define where to locate kid on decryption
  • Define how to determine kid on encryption
  • Define what format it takes
  • Define how to use it when accessing KAS

Related

Included here because a possible version change to NanoTDF specification could influence this decision.

Decision

⚖️ Add or Use Key Identifier section

See #1199

0x03	Embedded Policy (Encrypted w/Policy Key Access)
0x04	Embedded Policy (Encrypted w/KAS Key Access)

See https://github.com/opentdf/spec/tree/main/schema/nanotdf#342323-optional-policy-key-access

Rationale: Only KAS needs to know the public key or kid. It has no impact on how to access KAS.

Changes:

  • Add new section to Header - Key identifier 32B - 133B
  • Recommended name "Payload Key Access" section
  • 🟨 Similar to Policy Key Access
  • 🟥 NanoTDF version update

⚖️ Add Key Identifier to Policy

See #1197
Rationale: KAS presents multiple keys available to a client. The client determines which key based on the use case policy.

Changes:

  • Add KAS public key field
  • Or Add kid field with compute and format specification
  • Optional Policy Key Access section is required

Example attribute

  • urn:opentdf:kas:ec:secp256r1:43:51:43:a1:b5:fc:8b:b7:0a:3a:a9:b1:0f:66:73:a8
  • urn:ietf:params:oauth:jwk-thumbprint:sha-256:NzbLsXh8uDCcd-6MNwXF4W_7noWXFZAfHkxZsRGC9Xs
  • 🟩 Policy section is relatively large
  • 🟩 Policy section is processed by KAS
  • 🟩 Policy section has binding
  • 🟨 NanoTDF specification clarification
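
For the second example attribute, the thumbprint could be computed along these lines. This is a sketch assuming the kid is an RFC 7638 JWK thumbprint expressed in the RFC 9278 URN form; the function name is hypothetical and the coordinates below are placeholders, not a real KAS key.

```python
import base64
import hashlib
import json


def jwk_thumbprint_urn(jwk: dict) -> str:
    """Return the SHA-256 JWK thumbprint of an EC public key as a URN."""
    # RFC 7638: keep only the required members, sorted, with no whitespace.
    required = {name: jwk[name] for name in ("crv", "kty", "x", "y")}
    canonical = json.dumps(required, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    # base64url without padding: 32 bytes become 43 characters.
    thumbprint = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "urn:ietf:params:oauth:jwk-thumbprint:sha-256:" + thumbprint


# Placeholder coordinates for illustration only.
example_jwk = {
    "kty": "EC",
    "crv": "P-256",
    "x": "placeholder-x-coordinate",
    "y": "placeholder-y-coordinate",
    "kid": "ignored-by-the-thumbprint",
}
print(jwk_thumbprint_urn(example_jwk))
```

The resulting thumbprint is always 43 characters, which fits comfortably in the policy section but is not small by NanoTDF standards.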

⚖️ Add Key Identifier to KAS Resource Locator

See Specification opentdf/spec#40
See Implementation #1222

Rationale: KAS has multiple keys available to a client. The client determines which key based on the use case policy.

Changes:

  • Add a recommended way to add a kid to a policy, with a default implementation in KAS and the SDK
  • 🟨 Resource Locator is 257B
  • 🟨 NanoTDF specification clarification
  • 🟥 KAS must parse a freeform URL fragment created by SDK
  • 🟥 No binding, attack vector for public key thumbprint

⚖️ Resource Locator format: URN

Rationale: Introducing a URN type that includes both domain and identifier enhances the ability to uniquely and efficiently reference resources.

Changes:

  • A new URN type will be added to the resource locator format. The URN will include the domain and a unique identifier, following the pattern: urn:opentdf:kas:<domain>:<identifier>.
  • example virtru.com:01edksqtx9cfzzt1y9sm57h3yq
  • <identifier> in ULID 16 bytes
  • Protocol enum for urn
Value   Protocol
0x00    http
0x01    https
0x02    unreserved
0x03    urn:opentdf:kas
0xff    Shared Resource Directory
  • 🟩 Concise and compact
  • 🟨 SDK computes and formats
  • 🟨 NanoTDF and ZTDF specification change
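
A rough sketch of how an SDK might parse the proposed URN body, assuming the `<domain>:<identifier>` pattern above and a standard 26-character Crockford base32 ULID that decodes to 16 bytes. The function names are hypothetical.

```python
from typing import Tuple

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
PREFIX = "urn:opentdf:kas:"


def decode_ulid(text: str) -> bytes:
    """Decode a 26-character ULID into its 16-byte binary form."""
    if len(text) != 26:
        raise ValueError("a ULID is 26 characters long")
    value = 0
    for ch in text.upper():
        value = value * 32 + CROCKFORD.index(ch)  # ValueError on bad characters
    return value.to_bytes(16, "big")  # OverflowError if it exceeds 128 bits


def parse_kas_urn(urn: str) -> Tuple[str, bytes]:
    """Split urn:opentdf:kas:<domain>:<identifier> into (domain, identifier bytes)."""
    if not urn.startswith(PREFIX):
        raise ValueError("not a KAS URN")
    domain, sep, identifier = urn[len(PREFIX):].rpartition(":")
    if not sep or not domain:
        raise ValueError("expected <domain>:<identifier>")
    return domain, decode_ulid(identifier)


domain, ident = parse_kas_urn("urn:opentdf:kas:virtru.com:01edksqtx9cfzzt1y9sm57h3yq")
print(domain, len(ident))  # virtru.com 16
```

Storing the identifier as 16 raw bytes rather than 26 text characters is where the "concise and compact" benefit would come from, if the binary form is what ends up in the header.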

⚖️ Resource Locator format: URL query parameter

See #1190

virtru.com/api/kas?kid=435143a1b5fc8bb70a3aa9b10f6673a8

  • 🟩 Chain-of-trust through DNSSEC and Certificates
  • 🟨 Not a succinct format
  • 🟨 With or without :
  • 🟨 URL encode or not
  • 🟨 SDK computes and formats
  • 🟨 NanoTDF specification clarification
  • 🟨 URL resolution is responsibility of Ops, not Security
  • 🟨 Port is difficult to maintain over time
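
For comparison, the query-parameter option could be handled on the reading side as sketched below. The scheme is assumed to come from the protocol enum, and the function name is hypothetical.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlsplit


def split_kas_url(body: str, scheme: str = "https") -> Tuple[str, Optional[str]]:
    """Return (KAS base URL, kid) from a locator body such as host/path?kid=..."""
    parts = urlsplit(f"{scheme}://{body}")
    kids = parse_qs(parts.query).get("kid", [])
    if len(kids) > 1:
        raise ValueError("more than one kid parameter")
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return base, (kids[0] if kids else None)


print(split_kas_url("virtru.com/api/kas?kid=435143a1b5fc8bb70a3aa9b10f6673a8"))
# ('https://virtru.com/api/kas', '435143a1b5fc8bb70a3aa9b10f6673a8')
```

`parse_qs` percent-decodes values, so `kid=43%3A51` and `kid=43:51` both yield `43:51`. That is convenient for a reader, but it also shows why the "with or without :" and "URL encode or not" questions need a single answer in the spec: the same kid can have several encoded forms.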

Decision

TBD

References

@pflynn-virtru pflynn-virtru added the adr Architecture Decision Records pertaining to OpenTDF label May 31, 2024
@dmihalcik-virtru
Member

We should also support key splitting explicitly somehow

@patmantru
Contributor

Are we going to change the "L1L" magic number?

@pflynn-virtru pflynn-virtru changed the title WIP: ADR: nanotdf v2 WIP: ADR: nanotdf key identifier Jun 3, 2024
@pflynn-virtru
Member Author

@patmantru The "L1L" contains the version which will change if the specification is revised. A revision would include a binary format change, or perhaps even a reinterpretation of an existing field.

see https://github.com/opentdf/spec/tree/main/schema/nanotdf#3311-magic-number--version

@pflynn-virtru
Member Author

@dmihalcik-virtru key splitting will not be covered by this ADR, nor by any ADR in the foreseeable future

@pflynn-virtru pflynn-virtru changed the title WIP: ADR: nanotdf key identifier WIP: ADR: NanoTDF KAS resource locator - key identifier and path Jun 3, 2024
@biscoe916
Member

biscoe916 commented Jun 4, 2024

@patmantru As a fun aside, L1L also results in a BASE64 encoding that starts with "TDF..."
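
This is easy to check with the standard library. Three bytes map exactly onto four Base64 characters, so the prefix does not depend on what follows the magic number:

```python
import base64

magic = b"L1L"
encoded = base64.b64encode(magic).decode("ascii")
print(encoded)  # TDFM

# Any payload that starts with the magic number keeps the same 4-character prefix.
print(base64.b64encode(magic + b"\x00\x01\x02").decode("ascii")[:4])  # TDFM
```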

@pflynn-virtru pflynn-virtru changed the title WIP: ADR: NanoTDF KAS resource locator - key identifier and path WIP: ADR: NanoTDF KAS resource locator path and key identifier Jun 4, 2024
@pflynn-virtru pflynn-virtru changed the title WIP: ADR: NanoTDF KAS resource locator path and key identifier ADR: NanoTDF KAS resource locator path and key identifier Jun 4, 2024
github-merge-queue bot pushed a commit that referenced this issue Jun 12, 2024
- define new `cryptoProvider` config structs to support rotation with
different keys
- this means the algorithm must be present
- I'm using some heuristics to maintain backward compatibility of
standard crypto configs, but hsm configs must be updated
- Adds `kid` in response to kas_public_key
- Adds `kid` in key access objects produced by SDK
- Still no support in nanotdf, as we have not agreed on how/where to
store the `kid` value in the header. See
#900

New Config:

```yaml
server:
  cryptoProvider:
    type: standard
    standard:
      keys:
        - kid: r1
          alg: rsa:2048
          private: kas-private.pem
          cert: kas-cert.pem
        - kid: r0
          alg: rsa:2048
          private: kas-private-old.pem
          cert: kas-cert-old.pem
        - kid: e1
          alg: ec:secp256r1
          private: kas-ec-private.pem
          cert: kas-ec-cert.pem
services:
  kas:
    enabled: true
    keyring:
      - alg: rsa:2048
        kid: r1
      - alg: rsa:2048
        kid: r0
        legacy: true
      - alg: ec:secp256r1
        kid: e1
```

some notes:

- `kid` values should be unique, preferably for the lifetime of the kas
host domain name
- `kid` values should be short strings (I'd suggest maxing out at 44
characters)
- `private` and `cert` indicate the location of private key and a
certificate, if available
- For `hsm` keys, these should be label values
- For `standard` keys, these should be paths to PEM files relative to
the current working directory
- I've deprecated the `eccertid` for a new `keyring` parameter which
describes how KAS will interpret the key. So we have two sections:
`server.cryptoProvider` describes what keys are available, while
`services.kas.keyring` describes how KAS uses those keys.
- We don't have a 'rotate' script yet. To do this manually, update the
init-temp-keys script to use a new name/label and rerun and add the new
keys to the list, updating the `certid` fields to point to the new
values
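
To illustrate how the two sections relate, here is a rough sketch of a lookup by `kid`. It mirrors the config above as a plain dict; the `find_key` helper and its matching rules are assumptions for illustration, not the KAS implementation.

```python
# Plain-dict mirror of the YAML config shown above.
CONFIG = {
    "server": {"cryptoProvider": {"type": "standard", "standard": {"keys": [
        {"kid": "r1", "alg": "rsa:2048", "private": "kas-private.pem", "cert": "kas-cert.pem"},
        {"kid": "r0", "alg": "rsa:2048", "private": "kas-private-old.pem", "cert": "kas-cert-old.pem"},
        {"kid": "e1", "alg": "ec:secp256r1", "private": "kas-ec-private.pem", "cert": "kas-ec-cert.pem"},
    ]}}},
    "services": {"kas": {"enabled": True, "keyring": [
        {"alg": "rsa:2048", "kid": "r1"},
        {"alg": "rsa:2048", "kid": "r0", "legacy": True},
        {"alg": "ec:secp256r1", "kid": "e1"},
    ]}},
}


def find_key(config: dict, kid: str) -> dict:
    """Hypothetical helper: join a keyring entry with its key material by kid."""
    keyring = config["services"]["kas"]["keyring"]
    usage = next((entry for entry in keyring if entry["kid"] == kid), None)
    if usage is None:
        raise KeyError(f"kid {kid!r} is not in the KAS keyring")
    keys = config["server"]["cryptoProvider"]["standard"]["keys"]
    material = next(
        (key for key in keys if key["kid"] == kid and key["alg"] == usage["alg"]),
        None,
    )
    if material is None:
        raise KeyError(f"no key material for kid {kid!r} with alg {usage['alg']!r}")
    return {**material, "legacy": usage.get("legacy", False)}


print(find_key(CONFIG, "r0")["private"])  # kas-private-old.pem
```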


To come:

1. nanoTDF support
@strantalis
Member

strantalis commented Jun 12, 2024

@pflynn-virtru This might be another option. Have you considered leveraging the URI or URN pattern to fetch KAS information from the platform itself? This way, an administrator could control the host, port, and path that the KAS maps to.

For example, if we keep the KAS hostname, port, and path in a NanoTDF or ZTDF, an administrator could never really tear down those records without risking breaking existing TDFs. This approach also allows the platform to dictate which KASs are trusted.

Example: urn:kas:kas-1.virtru.com:kid:123

We should also consider using this format for ZTDF to address similar issues there.

@pflynn-virtru
Member Author

Two options exist:
⚖️ Resource Locator format: URN
⚖️ Resource Locator format: URL query parameter

I have added the issues with the URL option.
I have also added the ZTDF change that would be needed.

@jrschumacher
Member

Would one of these approaches be better suited for situations where OpenTDF is deployed in a multi-KAS environment or handle situations where maybe IT wants to migrate obscure endpoints from kas[1-100].example.com to more descriptive endpoints like mkt[1-5].kas.example.com, eng[1-5].kas.example.com, exec[1-5].kas.example.com, etc?

@jrschumacher
Member

One bit of experience I have is with Virtru's Secure Reader product. We offered the ability to tie policy to custom domains (CNAMEs). As the product aged, we learned that customers wanted the ability to change those, whether due to a company rebrand, an acquisition, or even an IT policy change that required domains to comply with a universal org policy.

Binding emails directly to the company CNAME meant that the company would have to hold that domain indefinitely or risk breaking access to old emails. I would encourage learning from this experience and making sure we reduce this risk.

For instance, managing one domain indefinitely is much less burdensome than N domains (or subdomains) per deployed KAS.

@strantalis
Member

> Would one of these approaches be better suited for situations where OpenTDF is deployed in a multi-KAS environment or handle situations where maybe IT wants to migrate obscure endpoints from kas[1-100].example.com to more descriptive endpoints like mkt[1-5].kas.example.com, eng[1-5].kas.example.com, exec[1-5].kas.example.com, etc?

In this case I think going with a uri or urn approach would be better suited with the core platform holding the necessary information to connect to kas.

Imagine if you needed to change the kas endpoint 5 times. I think a user would have to maintain a new cname record every time it's changed.

@strantalis
Member

@jrschumacher

> One bit of experience I have is with Virtru's Secure Reader product. We offered the ability to tie policy to custom domains (CNAMEs). As the product aged, we learned that customers wanted the ability to change those, whether due to a company rebrand, an acquisition, or even an IT policy change that required domains to comply with a universal org policy.
>
> Binding emails directly to the company CNAME meant that the company would have to hold that domain indefinitely or risk breaking access to old emails. I would encourage learning from this experience and making sure we reduce this risk.
>
> For instance, managing one domain indefinitely is much less burdensome than N domains (or subdomains) per deployed KAS.

This is my main concern. We’re shifting complexity to the infrastructure. For instance, if a port number other than 443 is used and then changed, does that mean those TDFs become inaccessible unless the infrastructure is constantly maintained? This could lead to significant challenges in ensuring continuous access to TDFs that were previously generated.

The current solution feels brittle the more we dig into it.

@pflynn-virtru
Member Author

After the Architecture meeting the following has been decided:

  • implement an experimental feature for ⚖️ Resource Locator format: URN

Rough implementation impact:

  • update .well-known to have needed info for the SDK to perform a KAS URN to rewrap endpoint resolution
  • update Java and JS SDK to use .well-known - might be done already
  • update the NanoTDF Resource Locator in the Java and JS SDK to support the URN bit
  • mark this feature as experimental that can be turned on
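
As a rough picture of the URN-to-endpoint resolution described in the first bullet: the layout of the .well-known data is not defined anywhere in this thread, so the `WELL_KNOWN` structure, the `example.com` endpoint, and the function name below are all made up for illustration.

```python
from typing import Tuple

PREFIX = "urn:opentdf:kas:"

# Hypothetical discovery data; the real .well-known layout is still to be defined.
WELL_KNOWN = {"kas": {"example.com": {"rewrap": "https://kas.example.com/v2/rewrap"}}}


def resolve_rewrap_endpoint(urn: str, well_known: dict) -> Tuple[str, str]:
    """Map urn:opentdf:kas:<domain>:<identifier> to (rewrap endpoint, identifier)."""
    if not urn.startswith(PREFIX):
        raise ValueError("not a KAS URN")
    domain, sep, identifier = urn[len(PREFIX):].rpartition(":")
    if not sep or not domain:
        raise ValueError("expected <domain>:<identifier>")
    try:
        endpoint = well_known["kas"][domain]["rewrap"]
    except KeyError:
        raise LookupError(f"no KAS registered for {domain!r}") from None
    return endpoint, identifier


print(resolve_rewrap_endpoint(
    "urn:opentdf:kas:example.com:01edksqtx9cfzzt1y9sm57h3yq", WELL_KNOWN))
```

The point of the indirection is that host, port, and path live in the platform rather than in the TDF, so an administrator can move a KAS by updating the discovery data instead of keeping old DNS records alive.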

@strantalis
Member

Before I forget, I want to note this down. KAS is really an interface to the policy keys. There should be no reason I can't load the keys into another KAS and decrypt my data, as long as my entitlements match the resource attributes. We don't tie a key to a KAS in any way.

pflynn-virtru added a commit that referenced this issue Jun 14, 2024
Temporarily disable the decrypt command in tdf-roundtrips test due to a bug reported in issue #900. This change also modifies the paths of the r2 key pair to match the kas-private.pem and kas-cert.pem paths, eliminating the previously separate r2 keys.
@damorris25
Member

@pflynn-virtru this is a really well written ADR and the back and forth discussion in an open forum like this between you, @strantalis and @jrschumacher is amazing.

I will pile on and just say I agree with the comments from both Ryan and Sean. We need to be more flexible to infrastructure changes, particularly with something as important as accessing keys.

pflynn-virtru added a commit to opentdf/java-sdk that referenced this issue Aug 7, 2024
This is prep for the incoming changes related to ADR below.

Updated the constructor to mask the first byte with 0xF, ensuring only
the first four bits are used for indexing the protocol. This prevents
potential out-of-bounds errors when retrieving values from the
NanoTDFType.Protocol enum.

Issue: opentdf/platform#1203
Specification: opentdf/spec#40
ADR: opentdf/platform#900
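
The masking idea can be sketched as follows (in Python rather than the SDK's Java, and with hypothetical names). Masking with 0x0F keeps only the low four bits of the first byte as the protocol index. The sketch also returns the high four bits, which the commit message does not describe, and adds an explicit bounds check, since the mask alone only limits the index to 0..15.

```python
from typing import Tuple

# Only the two protocols with known names; other values are rejected below.
PROTOCOL_NAMES = ("http", "https")


def split_protocol_byte(first_byte: int) -> Tuple[str, int]:
    """Return (protocol name, high nibble) for the first resource locator byte."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte")
    index = first_byte & 0x0F       # low four bits select the protocol
    high_nibble = first_byte >> 4   # meaning of these bits is not covered here
    if index >= len(PROTOCOL_NAMES):
        raise ValueError(f"unsupported protocol index: {index}")
    return PROTOCOL_NAMES[index], high_nibble


print(split_protocol_byte(0x11))  # ('https', 1)
```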
pflynn-virtru added a commit to opentdf/java-sdk that referenced this issue Aug 15, 2024
NanoTDF will now have the KAS KID set in the KAS ResourceLocator

Resolves #100
Specification: opentdf/spec#40
ADR: opentdf/platform#900
github-merge-queue bot pushed a commit that referenced this issue Aug 23, 2024
…#1222)

- Adds `identifier` field to Resource Locator
- Updates Protocol Enum with `identifier` size

Closes #1203 
Issue: #1203 
Specification: opentdf/spec#40
ADR: #900

---------

Co-authored-by: sujankota <sreddy@virtru.com>
Co-authored-by: David Mihalcik <dmihalcik@virtru.com>
Co-authored-by: Tyler Biscoe <biscoe@virtru.com>