Sensitive validator branch #1364

Merged
4 commits merged into smithy-lang:main on Nov 15, 2022

Conversation

rchache
Contributor

@rchache rchache commented Aug 18, 2022

Issue #, if available:

Description of changes:
Generated a default list of words and phrases that likely indicate that the data stored in a member is sensitive. The list is based on various existing sensitive-data detection services and on current AWS usage of the sensitive trait. It is configurable in the same way as the non-inclusive terms validator.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@rchache rchache requested a review from a team as a code owner August 18, 2022 18:20
@mtdowling
Member

I created a PR (#1461) that adds word-boundary-based term matching to ReservedWords. I built it as an abstraction so that this linter can use it as well. Let me know what you think.

mtdowling added a commit that referenced this pull request Oct 24, 2022
This commit introduces a new syntax for matching words with the
ReservedWords linter and is intended to be used with the upcoming
sensitive words linter defined in #1364.

In addition to supporting wildcard searches ("*" prefix, suffix,
and contains), we now support matching based on word boundaries.

This commit introduces the "terms" keyword for word boundary
searches and adds dedicated abstractions for word boundary and
wildcard matching.

For example, "access key id" will match "AccessKeyId",
"access_key_id", "accessKeyID", "access_key_id100", "AccessKeyIDValue".
It will also match when all the words are concatenated together:
"accesskeyid". However, it will not match "accesskey_id" because it
splits into only two words ("accesskey" and "id").
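
To make the matching rule in the commit message concrete, here is a minimal Python sketch of word-boundary term matching. It is not the Smithy implementation (which is Java); the names `split_words` and `matches_term` and the splitting regex are assumptions made for illustration. The sketch splits an identifier on case changes, underscores, and digit runs, then accepts either the term's words appearing consecutively or the fully concatenated term appearing as a single word.

```python
import re

# Hypothetical splitting rule: acronym runs, capitalized or lowercase words,
# and digit runs each count as one word; anything else is a separator.
_WORD = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def split_words(identifier):
    """Split an identifier into lowercase words at word boundaries."""
    return [w.lower() for w in _WORD.findall(identifier)]


def matches_term(term, identifier):
    """Return True if the space-separated term matches the identifier."""
    needle = term.lower().split()
    if not needle:
        return False
    words = split_words(identifier)
    # Case 1: all term words concatenated into a single word ("accesskeyid").
    if "".join(needle) in words:
        return True
    # Case 2: the term words appear consecutively in the identifier.
    n = len(needle)
    return any(words[i:i + n] == needle for i in range(len(words) - n + 1))
```

Under these assumptions, `matches_term("access key id", "accessKeyID")` is true, while `matches_term("access key id", "accesskey_id")` is false because the identifier splits into only "accesskey" and "id".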
@rchache rchache force-pushed the SensitiveValidatorBranch branch from d3e810b to ae30062 Compare October 27, 2022 01:13
@rchache rchache force-pushed the SensitiveValidatorBranch branch from ae30062 to d3d2bb5 Compare October 27, 2022 15:23
@rchache rchache requested a review from mtdowling November 10, 2022 23:06
@mtdowling mtdowling merged commit 6b7e154 into smithy-lang:main Nov 15, 2022