Decide how to handle errors in trusted key value V2 responses #1284

Open
MattMenke2 opened this issue Sep 24, 2024 · 0 comments

MattMenke2 commented Sep 24, 2024

The new trusted signals KVv2 protocol is a good opportunity for us to reconsider how we handle errors in the responses we receive over the wire.

There are a number of classes of errors, which we may or may not want to handle differently. Note that while I'm trying to group errors into sets that I think should probably be handled similarly, other groupings are possible. A rough sketch of where each class sits in the response structure follows the list.

  1. Global errors (network errors, encryption errors, framing header errors, can't parse the outer CBOR, the entire structure of the response is wrong so we can't find any compression groups). In these cases, we have absolutely no usable data at all.
  2. Errors with specific compression groups in the outer CBOR (two compression groups with a duplicate ID, a group without an ID, a group that's not even a map, a group missing other necessary fields ("content" is the only one currently), "content" not a binary string).
  3. Errors that render an entire compression group's "content" unusable (decompression fails, can't parse it as CBOR, can't find the partitions array).
  4. Errors in the partitions array (a partition that's not a map, a partition without an ID, partitions with duplicate IDs, a partition without a "keyGroupOutputs" array).
  5. Errors within a partition's "keyGroupOutputs" CBOR (an entry that's not a map, a missing "tags" or "keyValues" entry, a "keyValues" entry that's not a map, a "keyValues" entry missing a "value" field, a "value" field that's not a string, a "tags" value that's not an array, a "tags" array with multiple values).
  6. "values" entry in "keyGroupOutputs" not JSON.
  7. Errors within the "values" JSON (e.g., priorityVector isn't a dictionary, a value within priorityVector has the wrong type, "updateIfOlderThanMs" isn't a number).
  8. Missing compression groups / partitions (I think receiving ones that weren't requested can just be ignored).
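
To make the layering concrete, here is a rough sketch of a decrypted KVv2 response, modeled as nested Python structures standing in for the CBOR and annotated with which error class applies at each layer. Field names like "content", "keyGroupOutputs", "tags", "keyValues", "value", and "priorityVector" come from the list above; the remaining names ("compressionGroups", "compressionGroupId", "id") and all concrete values are illustrative assumptions, not quotes from the protocol.

```python
# Illustrative only: a decrypted KVv2 response sketched as Python structures.
# Comments mark which error class from the list applies at each layer.
response = {                          # 1) network/encryption/framing/outer-CBOR errors
    "compressionGroups": [            # (assumed name) 2) missing/duplicate group IDs,
        {                             #    group not a map, missing "content",
            "compressionGroupId": 0,  #    "content" not a binary string
            "content": b"<compressed CBOR>",
            # Once "content" is decompressed and CBOR-decoded:
            #   3) a decompression or CBOR failure makes the whole group unusable
            # [
            #   {                     # 4) partition not a map, missing/duplicate ID,
            #     "id": 0,            #    missing "keyGroupOutputs"
            #     "keyGroupOutputs": [
            #       {                 # 5) entry not a map, bad "tags"/"keyValues"
            #         "tags": ["interestGroupNames"],
            #         "keyValues": {
            #           "groupA": {
            #             # 6) this string isn't valid JSON
            #             # 7) it parses, but fields have the wrong types
            #             "value": '{"priorityVector": {"signal1": 1}}'
            #           }
            #         }
            #       }
            #     ]
            #   }
            # ]
        }
    ]
}
```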

For 1)-5), the bad data had to come from the TEE code that wraps the bidder/seller response generation code, as opposed to the bidder/seller code itself, so we can probably just do whatever failure behavior is convenient (fail all fetches, fail only the affected fetches, etc.) and treat it the same as a network failure.

On the other hand, 6) and 7) deal with data provided by the buyer/seller script, rather than the shared top-level TEE code (not sure if there's better terminology here). Note that each value corresponds to a single key or to data about a single interest group. Discussions about what to do here are the reason for filing this issue. If we hard-fail on bad data, folks are more likely to notice bugs and fix their code, but that also means potentially breaking auctions whenever any bad data is received (B&A aborts bidding/scoring on a KV fetch failure; Chrome currently does not. Which of those is the right choice is yet another question that seems worth discussing, albeit on a different issue). Back to the issue at hand: hard-failing the entire response and ignoring errors in the JSON while keeping whatever we can get out of it each have their own advantages.
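
To illustrate the trade-off, here is a minimal sketch (not Chrome's actual code; the function names and exact field checks are assumptions) of the two options for a single value's JSON string: hard-failing on any malformed data versus keeping whatever fields parse cleanly.

```python
import json

def parse_value_strict(value_str):
    """Hard-fail: any malformed JSON or wrong-typed field rejects the value.

    Callers then decide how far the failure propagates (request, partition,
    compression group, or whole response)."""
    data = json.loads(value_str)  # raises on class-6 errors (not valid JSON)
    pv = data.get("priorityVector")
    if pv is not None and not isinstance(pv, dict):
        raise ValueError("priorityVector is not a dictionary")  # class 7
    if isinstance(pv, dict) and not all(
        isinstance(v, (int, float)) for v in pv.values()
    ):
        raise ValueError("priorityVector contains non-numeric values")  # class 7
    return data

def parse_value_lenient(value_str):
    """Ignore errors: extract whatever fields can be read, drop the rest.

    This mirrors the permissive KVv1-style behavior for class-7 errors."""
    try:
        data = json.loads(value_str)
    except ValueError:
        return {}  # nothing usable at all
    result = {}
    pv = data.get("priorityVector")
    if isinstance(pv, dict):
        # Keep only the numeric entries; silently drop wrong-typed ones.
        result["priorityVector"] = {
            k: v for k, v in pv.items() if isinstance(v, (int, float))
        }
    update_ms = data.get("updateIfOlderThanMs")
    if isinstance(update_ms, (int, float)):
        result["updateIfOlderThanMs"] = update_ms
    return result
```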

The current behavior in Chrome's not-yet-landed code is to treat 6) as a fatal error for the entire partition (or the entire response), and to ignore 7), extracting whatever data we can successfully read from the JSON string. That seems inconsistent. 7)'s permissive behavior is inherited from code shared with KVv1. If we hard fail, there's also the question of what exactly we fail: just the requests that try to use the bad data, all requests, all requests using the partition, or all requests using the compression group. Failing only the requests that use the bad value opens up options for lazily parsing values, but it also means any cached parsed JSON would need to record whether that particular chunk of JSON failed to parse, along with the error.
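
One way to frame that choice is as an explicit failure scope attached to each parse error. A hypothetical sketch (the names are illustrative, not Chrome's):

```python
from enum import Enum, auto

class FailureScope(Enum):
    """How far a bad value's parse failure could propagate (hypothetical)."""
    VALUE = auto()              # only requests that use this value fail
    PARTITION = auto()          # all requests sharing the partition fail
    COMPRESSION_GROUP = auto()  # all requests sharing the compression group fail
    RESPONSE = auto()           # every request served by this fetch fails

# With FailureScope.VALUE, values can be parsed lazily, but the cache of parsed
# JSON then has to remember per-value outcomes, e.g. a map from key to either
# the parsed dictionary or a recorded parse error.
```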

Failures of type 8) could be grouped with 1)-5), but it's probably best to just fail, individually, any requests with missing data. This is probably more an issue to discuss with the server team, and depends on how their code works (and whether the choice matters).
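
A minimal sketch of that per-request handling, assuming requested and received compression group / partition IDs can be compared directly (the function name is hypothetical):

```python
def split_missing_and_extra(requested_ids, received_ids):
    """Fail only the requests whose compression group / partition is missing;
    unrequested extras in the response are simply ignored (class 8)."""
    requested = set(requested_ids)
    received = set(received_ids)
    missing = requested - received  # fail just the requests that needed these
    ignored = received - requested  # present but never asked for; drop silently
    return missing, ignored
```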

@MattMenke2 MattMenke2 changed the title Spec out trusted key value V2 error handling Decide on how to handle errors in trusted key value V2 responses Sep 24, 2024
@MattMenke2 MattMenke2 changed the title Decide on how to handle errors in trusted key value V2 responses Decide how to handle errors in trusted key value V2 responses Sep 24, 2024