Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --entropy-sensitivity option for controlling entropy checks #272

Merged
merged 12 commits into from
Nov 15, 2021

Conversation

rbailey-godaddy
Copy link
Contributor

To help us get this pull request reviewed and merged quickly, please be sure to include the following items:

  • Tests (if applicable)
  • Documentation (if applicable)
  • Changelog entry
  • A full explanation here in the PR description of the work done

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Documentation content changes
  • Tests
  • Other

Backward Compatibility

Is this change backward compatible with the most recently released version? Does it introduce changes which might change the user experience in any way? Does it alter the API in any way?

  • Yes (backward compatible)
  • No (breaking changes)

Issue Linking

fixes #265

What's new?

This PR introduces a new --entropy-sensitivity option that replaces --b64-entropy-score and --hex-entropy-score. The older options are marked "deprecated" and generate warnings when used.

The default sensitivity is exactly the same as it was before, but it is presented differently and applied consistently to both hexadecimal and base64 strings. "Sensitivity" is expressed on a scale of 0 (totally random) to 100 (totally nonrandom) -- the default is 25 -- and intuitively higher sensitivity means more borderline-random strings will be reported as suspicious.

For purposes of minimizing this PR's footprint, the default is applied both in cli.py (real world) and in scanner.py (because existing unit tests do not specify most options).

@rbailey-godaddy rbailey-godaddy requested a review from a team as a code owner November 9, 2021 16:46
@rbailey-godaddy rbailey-godaddy changed the base branch from main to v3.x November 9, 2021 16:54
Copy link
Contributor

@sushantmimani sushantmimani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make more sense to add deprecation warnings to v2.x for this and any other option(s) that we want to remove in v3.0 & then just remove them in v3.0?

@rbailey-godaddy rbailey-godaddy changed the title Add --sensitivity option for controlling entropy checks Add --entropy-sensitivity option for controlling entropy checks Nov 10, 2021
@rbailey-godaddy
Copy link
Contributor Author

Does it make more sense to add deprecation warnings to v2.x for this and any other option(s) that we want to remove in v3.0 & then just remove them in v3.0?

I dunno. This feels a lot like obeying the letter of the law ("we warned you in 2.11 (?) that this would disappear in 3.0") while violating the spirit by jamming it in at chronologically the last second and effectively saying "you don't HAVE to get all the new features in 3.0, so you can take as long as you want to convert, and suffer in the meantime...".

The extra logic/code for the backward compatibility is pretty minor and cleanly separated, so I don't view it as a problem to leave it for now (marked as deprecated) and then tear it out in 3.future or even 4.0. Especially if it means we can avoid yet another 2.x update right on the heels of all the ones we already published.

Copy link
Contributor

@sushantmimani sushantmimani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🦅

Copy link
Contributor

@tarkatronic tarkatronic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the update to the command line options, please remember to update the --help output in the README.md!

CHANGELOG.md Outdated Show resolved Hide resolved
tartufo/scanner.py Outdated Show resolved Hide resolved
Fixes #265

Provide more comprehensible alternative for tuning entropy checking.
This is applied consistently across all target character sets, and stated
in a way that is slightly easier to understand ("higher means more
likely to flag a given string").

The older `--b64-entropy-score` and `--hex-entropy-score` options are
marked as deprecated but retained for backwards compatibility (and they
override `--sensitivity` if used together with it).
Use `--entropy-sensitivity` instead of `--sensitivity`
...in response to review comments
...and use them instead of private members
Do everything from scratch instead of storing explicitly. This is a PITA
because you can't combine `@property` and `@lru_cache()` and skipping
the caching would be a killer.
* Consolidate common code for entropy limit back into a single method,
  and rework properties related to it so they are cleaner.
* Invert sensitivity scale; adjust math and doc to match. It's still
  weird but aligns more closely to the underlying entropy metric.
Copy link
Contributor

@tarkatronic tarkatronic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@tarkatronic tarkatronic merged commit a0eba89 into v3.x Nov 15, 2021
@tarkatronic tarkatronic deleted the sensitivity branch November 15, 2021 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change --b64-entropy-score and --hex-entropy-score to --entropy-sensitivity
4 participants