Add ability to lint based on word boundaries #1461

@mtdowling mtdowling commented Oct 22, 2022

Add ability to lint based on word boundaries

This commit introduces a new syntax for matching words with the
ReservedWords linter and is intended to be used with the upcoming
sensitive words linter defined in #1364.

In addition to supporting wildcard searches ("*" prefix, suffix,
and contains), we now support matching based on word boundaries.

This commit introduces the "terms" keyword for word boundary
searches and adds dedicated abstractions for word boundary and
wildcard matching.

For example, "access key id" will match "AccessKeyId",
"access_key_id", "accessKeyID", "access_key_id100", and "AccessKeyIDValue".
It will also match when all the words are concatenated together:
"accesskeyid". However, it will not match "accesskey_id" because that
name splits into only two words ("accesskey" and "id").
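
The matching rules described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class `WordBoundarySketch` and its methods `splitWords` and `matches` are hypothetical names, and the splitting rules (camelCase transitions, acronym runs, letter/digit transitions, and non-alphanumeric separators) are assumptions inferred from the examples in the description.

```java
import java.util.Locale;

// Hypothetical sketch of word-boundary matching; not the PR's implementation.
final class WordBoundarySketch {

    // Split an identifier into lowercase words separated by single spaces.
    // Assumed boundaries: separators, camelCase, acronym runs, letter/digit changes.
    static String splitWords(String identifier) {
        String s = identifier.replaceAll("[^A-Za-z0-9]+", " ");   // access_key -> access key
        s = s.replaceAll("(?<=[a-z0-9])(?=[A-Z])", " ");          // accessKey -> access Key
        s = s.replaceAll("(?<=[A-Z])(?=[A-Z][a-z])", " ");        // IDValue -> ID Value
        s = s.replaceAll("(?<=[A-Za-z])(?=[0-9])", " ");          // id100 -> id 100
        s = s.replaceAll("(?<=[0-9])(?=[a-z])", " ");             // 100id -> 100 id
        return s.trim().toLowerCase(Locale.ENGLISH);
    }

    // A term such as "access key id" matches when its words appear consecutively
    // on word boundaries, or when the concatenated form appears as a single word.
    static boolean matches(String term, String identifier) {
        String spaced = term.trim().toLowerCase(Locale.ENGLISH).replaceAll("\\s+", " ");
        String joined = spaced.replace(" ", "");
        // Pad with spaces so every word is delimited on both sides.
        String haystack = " " + splitWords(identifier) + " ";
        return haystack.contains(" " + spaced + " ")
                || haystack.contains(" " + joined + " ");
    }

    public static void main(String[] args) {
        String[] names = {
            "AccessKeyId", "access_key_id", "accessKeyID",
            "access_key_id100", "accesskeyid", "accesskey_id"
        };
        for (String name : names) {
            System.out.println(name + " -> " + matches("access key id", name));
        }
    }
}
```

Padding the split identifier with spaces keeps the check to a plain substring search while still requiring that a match starts and ends on a word boundary.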

@mtdowling mtdowling requested a review from a team as a code owner October 22, 2022 00:20
@mtdowling mtdowling force-pushed the reserved-words-matching branch 12 times, most recently from 11e52f3 to cacc888 on October 23, 2022 16:52
@mtdowling mtdowling force-pushed the reserved-words-matching branch from cacc888 to b25767a on October 24, 2022 19:41
@mtdowling mtdowling requested a review from sugmanue October 24, 2022 19:42
@mtdowling mtdowling merged commit c079c9b into main Oct 24, 2022
return result.toString();
}

private static void addLowerCaseStringToBuilder(StringBuilder result, String str, int start, int count) {
@sugmanue sugmanue Oct 24, 2022

Small nit: I think we can simplify the callers, and the function a bit, if we take the endIndex instead of the count.

@mtdowling mtdowling (Member, Author)

I was mimicking what java.lang.String's constructor does here:

public String(char value[], int offset, int count) {

It's internal to the class, so probably not worth changing IMO
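
For readers comparing the two conventions discussed in this thread, here is a rough sketch of what each signature shape could look like. The method bodies are not shown in this conversation, so everything below is an assumption: `LowerCaseAppendSketch`, `appendLowerByCount`, and `appendLowerByEnd` are hypothetical names, and the loop is only a guess at what a helper with this name might do.

```java
// Hypothetical comparison of count-based and end-index-based signatures.
final class LowerCaseAppendSketch {

    // Count-based, mirroring String(char[] value, int offset, int count):
    // appends str[start, start + count) in lower case.
    static void appendLowerByCount(StringBuilder result, String str, int start, int count) {
        for (int i = start; i < start + count; i++) {
            result.append(Character.toLowerCase(str.charAt(i)));
        }
    }

    // End-index-based, mirroring StringBuilder.append(CharSequence, int, int):
    // appends str[start, end) in lower case.
    static void appendLowerByEnd(StringBuilder result, String str, int start, int end) {
        for (int i = start; i < end; i++) {
            result.append(Character.toLowerCase(str.charAt(i)));
        }
    }

    public static void main(String[] args) {
        StringBuilder byCount = new StringBuilder();
        StringBuilder byEnd = new StringBuilder();
        // Both calls select the same three characters, "Key", from "AccessKeyId".
        appendLowerByCount(byCount, "AccessKeyId", 6, 3);
        appendLowerByEnd(byEnd, "AccessKeyId", 6, 9);
        System.out.println(byCount + " " + byEnd);
    }
}
```

With an end index, a caller that already tracks the current position can pass it directly; with a count, the caller computes `position - start` first. Either shape works for a private helper.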

@mtdowling mtdowling deleted the reserved-words-matching branch March 21, 2023 16:50