Redact potentially sensitive information from error objects #76

JoshMock · 2023-11-30T22:02:40Z

Exposes new Transport options to improve the JS client's security position on potentially sensitive data.

Right now, many of the Error classes defined in this library include a DiagnosticResult object full of request and connection metadata that is useful for debugging. This can include HTTP headers and URLs. Common sources of sensitive data are removed while serializing to JSON or logging to the console, so common use cases are already covered. However, since that data is still stored on these objects in memory, sensitive data may not be scrubbed when custom serialization methods are used.

To solve for these cases, Error objects now fully redact and/or remove potentially sensitive data at instantiation time rather than at serialization time, so they do not exist in memory at all and cannot be exposed by any means.

These redaction methods are all exposed as options on the Transport class:

redactConnection - Completely removes the Connection object from the DiagnostictResult (false by default)
redactDiagnostics - Recursively traverses the DiagnosticResult object, redacting any values whose keys match—case-insensitive—a pre-set list of sensitive key names: authorization, password, apiKey, and x-elastic-app-auth (true by default)
additionalRedactionKeys - A list of additional keys that can be redacted when redactDiagnostics is true; note that it is not possible to remove the pre-set keys listed above, only add new ones (defaults to [])

See elastic/elasticsearch-js#2026.

src/errors.ts

JoshMock · 2023-11-30T22:18:40Z

src/security.ts

+ */
+export function redactObject (obj: Record<string, any>, additionalKeys: string[] = []): Record<string, any> {
+  const toRedact = [...secretKeys, ...additionalKeys].map(key => key.toLowerCase())
+  return traverse(obj)


traverse loops recursively through every key/value in any deeply-nested object, and can easily account for circular references (see line 47). I wanted both of these things so that this function could be generic enough to be used elsewhere later.

It was proposed that we could use fast-redact to implement this redaction feature. However, it uses wildcards to match a key string at multiple levels. I benchmarked it against my solution and, when using it to match keys at 5+ levels of nesting—DiagnosticResult includes HTTP headers nested 5 levels deep—it was substantially slower than using traverse, and did not account for circular references.

This validation would have been ideal, but it would break this lib's contract with elasticsearch-js. Since we want to ensure the transport's major version will also work with the same major version of Elasticsearch, bumping this lib to 9.0.0 would have created confusion.

ezimuel

I think we should provide some documentation on this feature and how a user can customize the list of keyword to be filtered. Moreover, do we provide a way to switch off this features? I'm assuming we'll remove the sensitive information by default but if someone wants to have it we should provide (e.g. for debugging).

ezimuel · 2023-12-05T06:55:38Z

src/security.ts

+
+import traverse from 'traverse'
+
+const secretKeys = [


Can a user customize this list?

delvedor

I'm always a bit on the fence about adding new options, users may overlook the documentation and adds more code to maintain in general. If the goal here is only to hide the sensitive options, how about using a getter?
For example:

class Error {
  constructor (opts) {
    this.foo = opts.foo
    Object.defineProperty(this, 'bar', {
      get: () => opts.bar
    })
  }
}

const error = new Error({ foo: 1, bar: 2 })
console.log(error)
console.log(error.bar)
throw error

This will log the following:

Error { foo: 1 }
2

/Users/delvedor/Development/elastic/sdh-stats/error.js:13
throw error
^
Error { foo: 1 }

Node.js v18.12.1

In the Transport's case we could do:

export class ConnectionError extends ElasticsearchClientError {
  meta?: DiagnosticResult
  constructor (message: string, meta?: DiagnosticResult) {
    super(message)
    Error.captureStackTrace(this, ConnectionError)
    this.name = 'ConnectionError'
    this.message = message ?? 'Connection Error'
    Object.defineProperty(this, 'meta', {
      get: () => meta
    })
  }
}

The same in other error classes. It may not be conventional, but it does the trick without introducing new options.

JoshMock · 2023-12-05T16:32:27Z

I'm always a bit on the fence about adding new options, users may overlook the documentation and adds more code to maintain in general.

I agree that adding new options increases maintenance cost and should be avoided when possible, but it didn't seem reasonable in this case.

The goal of the options is to allow multiple redaction "strengths," and to be able to disable it when the old behavior is needed for e.g. local debugging. When discussing with @rudolf, @pgayvallet and @ezimuel, adding these options—with secure but non-breaking-change defaults—seemed like the most flexible approach: current users get better protection by default, users who want more protection can add it, and it can be disabled when necessary.

@delvedor Did you have an idea of how to achieve the same goals with getters? It wasn't clear by your example.

JoshMock · 2023-12-05T16:38:55Z

I think we should provide some documentation on this feature and how a user can customize the list of keyword to be filtered. Moreover, do we provide a way to switch off this features? I'm assuming we'll remove the sensitive information by default but if someone wants to have it we should provide (e.g. for debugging).

@ezimuel The three new options in the PR description are how all of these things can be customized. The optional list of redaction keys (additionalRedactionKeys) will only append new keys, so the defaults (authorization etc.) cannot be accidentally disabled by adding extra keys. To disable the default keys one would need to set redactDiagnostics to false. I'm willing to revisit that approach if it seems too strict.

As for docs, when I upgrade elasticsearch-js to use this new transport version, I'll add documentation there, as well as some notes in the changelog.

rudolf · 2023-12-06T15:20:52Z

Common sources of sensitive data are removed while serializing to JSON or logging to the console, so common use cases are already covered. However, since that data is still stored on these objects in memory, sensitive data may not be scrubbed when custom serialization methods are used.

Just to clarify, these existing protections only work on connection not on other sensitive parts of ElasticsearchClientError contained under meta: DiagnosticResult. So while JSON.stringify(connection) is safe, JSON.stringify(Error) is not (same for console.log).

I think both layers are usefulh bb

JoshMock · 2023-12-06T15:30:06Z

@rudolf I assume you closed on accident 😆

JoshMock · 2023-12-07T22:31:21Z

I chatted with @delvedor earlier today and we talked through his suggestion in detail. Rather than adding a redactDiagnostics option that has to be passed all the way down from the client, we can reduce the number of new options to 2 and just redact potentially sensitive data all the time.

Rather than replacing values with [redacted], the values are still there, just hidden from all serialization and iteration methods by wrapping it in a getter function set on the object using Object.defineProperty. By this method, the only way redacted values can be accessed is by directly hitting the property (e.g. console.log(myObject.authorization)). Someone who is doing that explicitly should be aware that this data is sensitive and proceed accordingly. All serialization methods run on myObject will hide the value:

util.inspect
JSON.stringify
accessing all values via Object.values, Object.keys and Object.entries
looping over all keys on an object with a for..in loop: for (const key in redactedObject) { ... }
throwing an object as an Error

See commit b486c79 for this change.

redactConnection and additionalRedactionKeys options are still exposed as described above.

Planning to merge this tomorrow, Dec 8.

rudolf · 2023-12-08T12:38:04Z

Rather than replacing values with [redacted], the values are still there, just hidden from all serialization and iteration methods by wrapping it in a getter function set on the object using Object.defineProperty. By this method, the only way redacted values can be accessed is by directly hitting the property (e.g. console.log(myObject.authorization)). Someone who is doing that explicitly should be aware that this data is sensitive and proceed accordingly. All serialization methods run on myObject will hide the value

I'm not sure how I feel about this...

I 100% agree that JSON.stringify(ErrorWithSecrets) should never serialise those secrets. And here it makes a lot of sense to me to define getters on the connection, and all the *.headers keys so that those headers never escape. Although it's a slightly different implementation we have this protection in place on connection objects already. If this is easier to adopt in a non-breaking way this could be great layer to have for other users.

BUT *rant incoming 😉 * I feel like Elasticsearch-js has no business throwing around secrets all over the place. "I asked you to call Elasticsearch for me and tell me what went wrong, the authorization headers you used to make the request are irrelevant!" If I e.g. compare mongodb's nodejs client errors just contain the code, message, labels and a connection id string https://github.com/mongodb/node-mongodb-native/blob/main/src/error.ts

I know there's some backwards compatibility challenges, but if I could be completely selfish I just want an error that contains the bare minimum for me to understand what went wrong. Anything more I would rather use the diagnostics event emitter to debug. Things to include would be (probably missing something):

{
  attempts: number;
  aborted: boolean;
  body: TResponse;
  statusCode: number;
  warnings?: string[]; // not sure what this is?
  type: string;
  reason: string;
}

But definitely I don't want the request/response headers or the connection they're not useful and they're dangerous so why add it to the error object in the first place. Removing these completely makes me feel much better than trying to use a scalpel to find just the things we know are secrets.

JoshMock · 2023-12-08T17:06:30Z

Thanks for the feedback, @rudolf. I agree full removal is ideal. A diagnostic looking more like your proposal is definitely a consideration we can take for the next major version. But since the DiagnosticResult has been attached to Error objects for at least a few minor releases now, it would be a breaking change.

I'm leaning toward reverting the last commit, and changing the redaction options to something like:

interface RedactionOptions {
	type: 'basic' | 'advanced' | 'off'
    additionalKeys?: string[]
}

off would leave everything alone, as is the case now
basic would be the default, and would do what redactDiagnostics: true does (replace values with [redacted]).
advanced would do what redactDiagnostics: true, redactConnection: true does, but also strips (almost) all optional values off of the diagnostic, to get as close as possible to what you proposed:

{
  body?: TResponse
  statusCode?: number
  warnings: string[] | null
  meta: {
    context: TContext
    name: string | symbol
    request: {
      params: {
        method: string
        path: string
      },
      options: {}
      id: any
    }
    connection: null
    attempts: number
    aborted: boolean
  }
}

By nesting the three options into their own interface, the top-level option would just be redaction: RedactionOptions, and effectively behave like one or two options instead of 3. Having type be a union of strings also allows us to add more choices later.

JoshMock · 2023-12-08T17:09:22Z

Another set of options for type could be:

off
hide: replace values with hidden getters
replace: replace values with [redacted]
remove: what advanced does above: strip out most/all optional values from the diagnostic

JoshMock · 2023-12-08T19:02:54Z

I settled for off, replace and remove, with replace as default. I decided to skip hide, which would use the Object.defineProperty method. I don't think it's necessary, but if we see a need for it later, we can add it because the structure of the redaction options are more future-proof now.

Thanks everyone for the feedback. This solution feels a lot more refined. 🙏

Otherwise it breaks compatibility with the client

JoshMock · 2023-12-08T19:27:58Z

I tested this branch against all minor release branches of elasticsearch-js back to 8.6. Glad I did. Quick fix in 5b1324d, and now all tests on those branches are passing.

Merging soon.

JoshMock · 2023-12-08T19:40:42Z

Not going to publish to npm until Monday, just to be safe.

Includes support for [new redaction features](elastic/elastic-transport-js#76).

JoshMock added 2 commits November 28, 2023 14:48

Redact sensitive values on error types that include request metadata

c843b45

Expose error redaction settings as user options

bac5ee9

JoshMock requested review from ezimuel, pquentin and picandocodigo November 30, 2023 22:02

JoshMock commented Nov 30, 2023

View reviewed changes

src/errors.ts Outdated Show resolved Hide resolved

JoshMock commented Nov 30, 2023

View reviewed changes

structuredClone polyfill

e2ecfbc

JoshMock force-pushed the redaction branch from b2dcd09 to e2ecfbc Compare November 30, 2023 22:31

JoshMock added 2 commits December 1, 2023 09:45

Bump version to 8.4.0

44eca7b

ezimuel reviewed Dec 5, 2023

View reviewed changes

src/security.ts

import traverse from 'traverse'

const secretKeys = [

Copy link

Contributor

ezimuel Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can a user customize this list?

delvedor reviewed Dec 5, 2023

View reviewed changes

rudolf closed this Dec 6, 2023

JoshMock reopened this Dec 6, 2023

Use Object.defineProperty to redact values

b486c79

Future-proof redaction options

7731b18

ErrorOptions has to be optional everywhere

5b1324d

Otherwise it breaks compatibility with the client

JoshMock merged commit 5ae8589 into main Dec 8, 2023
13 checks passed

JoshMock deleted the redaction branch December 8, 2023 19:39

This was referenced Dec 12, 2023

Bump transport to 8.4.0 elastic/elasticsearch-js#2095

Merged

Upgrade transport to 8.4.0 elastic/elasticsearch-serverless-js#40

Merged

JoshMock added a commit to elastic/elasticsearch-serverless-js that referenced this pull request Dec 14, 2023

feat(transport): Upgrade transport to 8.4.0 (#40)

9858727

Includes support for [new redaction features](elastic/elastic-transport-js#76).

legrego mentioned this pull request Jan 30, 2024

Upgrade elasticsearch-js client to >= 8.11.0-canary.1 elastic/kibana#175919

Closed

pgayvallet mentioned this pull request Feb 1, 2024

Redaction mechanism transform arrays properties into objects elastic/elasticsearch-js#2128

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Redact potentially sensitive information from error objects #76

Redact potentially sensitive information from error objects #76

JoshMock commented Nov 30, 2023

JoshMock Nov 30, 2023

ezimuel left a comment

ezimuel Dec 5, 2023

delvedor left a comment

JoshMock commented Dec 5, 2023

JoshMock commented Dec 5, 2023

rudolf commented Dec 6, 2023

JoshMock commented Dec 6, 2023

JoshMock commented Dec 7, 2023

rudolf commented Dec 8, 2023

JoshMock commented Dec 8, 2023

JoshMock commented Dec 8, 2023

JoshMock commented Dec 8, 2023

JoshMock commented Dec 8, 2023 •

edited

Loading

JoshMock commented Dec 8, 2023

Redact potentially sensitive information from error objects #76

Redact potentially sensitive information from error objects #76

Conversation

JoshMock commented Nov 30, 2023

JoshMock Nov 30, 2023

Choose a reason for hiding this comment

ezimuel left a comment

Choose a reason for hiding this comment

ezimuel Dec 5, 2023

Choose a reason for hiding this comment

delvedor left a comment

Choose a reason for hiding this comment

JoshMock commented Dec 5, 2023

JoshMock commented Dec 5, 2023

rudolf commented Dec 6, 2023

JoshMock commented Dec 6, 2023

JoshMock commented Dec 7, 2023

rudolf commented Dec 8, 2023

JoshMock commented Dec 8, 2023

JoshMock commented Dec 8, 2023

JoshMock commented Dec 8, 2023

JoshMock commented Dec 8, 2023 • edited Loading

JoshMock commented Dec 8, 2023

JoshMock commented Dec 8, 2023 •

edited

Loading