Add ability to lint based on word boundaries #1461

mtdowling · 2022-10-22T00:20:23Z

This commit introduces a new syntax for matching words with the
ReservedWords linter and is intended to be used with the upcoming
sensitive words linter defined in #1364.

In addition to supporting wildcard searches ("*" prefix, suffix,
and contains), we now support matching based on word boundaries.
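As an illustration of the three wildcard forms mentioned above, here is a minimal sketch. It is not the code from this PR: the class name `WildcardSketch`, the case-insensitive comparison, and the rejection of a bare "*" are all assumptions made for the example.

```java
import java.util.Locale;

// Hypothetical helper, not part of smithy-linters.
final class WildcardSketch {
    private WildcardSketch() {}

    // "*foo"  : word ends with "foo"   (wildcard as prefix)
    // "foo*"  : word starts with "foo" (wildcard as suffix)
    // "*foo*" : word contains "foo"
    // "foo"   : word equals "foo"
    // Comparison is case-insensitive (an assumption of this sketch).
    static boolean matches(String pattern, String word) {
        String p = pattern.toLowerCase(Locale.ROOT);
        String w = word.toLowerCase(Locale.ROOT);

        boolean leading = p.startsWith("*");
        int start = leading ? 1 : 0;
        boolean trailing = p.length() > start && p.endsWith("*");
        int end = trailing ? p.length() - 1 : p.length();

        String core = p.substring(start, end);
        if (core.isEmpty() || core.contains("*")) {
            throw new IllegalArgumentException("Unsupported pattern: " + pattern);
        }

        if (leading && trailing) {
            return w.contains(core);
        } else if (leading) {
            return w.endsWith(core);
        } else if (trailing) {
            return w.startsWith(core);
        }
        return w.equals(core);
    }

    public static void main(String[] args) {
        System.out.println(matches("*id", "AccessKeyId"));     // true
        System.out.println(matches("access*", "AccessKeyId")); // true
        System.out.println(matches("*key*", "AccessKeyId"));   // true
        System.out.println(matches("key", "AccessKeyId"));     // false
    }
}
```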

This commit introduces the "terms" keyword for word boundary
searches and adds dedicated abstractions for word boundary and
wildcard matching.

For example, "access key id" will match "AccessKeyId",
"access_key_id", "accessKeyID", "access_key_id100", and "AccessKeyIDValue".
It will also match when all the words are concatenated together:
"accesskeyid". However, it will not match "accesskey_id" because that
name splits into only two words ("accesskey" and "id").
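The matching rules in the example above can be sketched roughly as follows. This is an approximation written for illustration, not the `WordBoundaryMatcher` added by this PR: the class and method names are hypothetical, and the exact boundary rules (camelCase transitions, the end of an acronym run, letter/digit transitions, and non-alphanumeric separators) are assumptions chosen so that the sketch reproduces the examples listed in the description.

```java
import java.util.Locale;

// Hypothetical sketch, not the PR's WordBoundaryMatcher.
final class WordBoundarySketch {
    private WordBoundarySketch() {}

    // Lowercases the identifier and inserts a single space at each word boundary.
    static String splitWords(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                appendSpace(out);          // '_', '-', '.', etc. separate words
                continue;
            }
            if (i > 0 && isBoundary(text, i)) {
                appendSpace(out);
            }
            out.append(Character.toLowerCase(c));
        }
        return out.toString().trim();
    }

    private static boolean isBoundary(String text, int i) {
        char prev = text.charAt(i - 1);
        char c = text.charAt(i);
        if (!Character.isLetterOrDigit(prev)) {
            return false;                  // separator already produced a space
        }
        if (Character.isUpperCase(c) && !Character.isUpperCase(prev)) {
            return true;                   // "accessKey" -> "access" | "Key"
        }
        if (Character.isUpperCase(c) && Character.isUpperCase(prev)
                && i + 1 < text.length() && Character.isLowerCase(text.charAt(i + 1))) {
            return true;                   // "IDValue" -> "ID" | "Value"
        }
        return Character.isDigit(c) != Character.isDigit(prev); // "id100" -> "id" | "100"
    }

    private static void appendSpace(StringBuilder out) {
        if (out.length() > 0 && out.charAt(out.length() - 1) != ' ') {
            out.append(' ');
        }
    }

    // A term matches if its words appear consecutively as whole words,
    // or if the concatenation of its words appears as one whole word.
    static boolean matches(String term, String identifier) {
        String spaced = term.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
        if (spaced.isEmpty()) {
            return false;
        }
        String joined = spaced.replace(" ", "");
        String haystack = " " + splitWords(identifier) + " ";
        return haystack.contains(" " + spaced + " ")
                || haystack.contains(" " + joined + " ");
    }

    public static void main(String[] args) {
        String term = "access key id";
        System.out.println(matches(term, "AccessKeyIDValue")); // true
        System.out.println(matches(term, "access_key_id100")); // true
        System.out.println(matches(term, "accesskeyid"));      // true
        System.out.println(matches(term, "accesskey_id"));     // false
    }
}
```

Padding the normalized identifier with spaces keeps the whole-word check to a single `contains` call, at the cost of building one extra string per lookup.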

smithy-linters/src/main/java/software/amazon/smithy/linters/WordBoundaryMatcher.java


sugmanue · 2022-10-24T20:52:27Z

smithy-linters/src/main/java/software/amazon/smithy/linters/WordBoundaryMatcher.java

+        return result.toString();
+    }
+
+    private static void addLowerCaseStringToBuilder(StringBuilder result, String str, int start, int count) {


Small nit: I think we can simplify the callers, and the function a bit, if we take an endIndex instead of a count.

mtdowling · 2022-10-24

I was mimicking what java.lang.String's constructor does here:

public String(char value[], int offset, int count) {

It's internal to the class, so it's probably not worth changing, IMO.
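To make the trade-off in this thread concrete, here is a small sketch of the two parameter conventions side by side. The method bodies are guesses for illustration only (the diff shows just the signature), and the names `appendLowerByCount` and `appendLowerByEnd` are hypothetical.

```java
// Hypothetical illustration of the two conventions discussed above.
final class SliceConventionSketch {
    private SliceConventionSketch() {}

    // (start, count), mirroring String(char[] value, int offset, int count).
    static void appendLowerByCount(StringBuilder result, String str, int start, int count) {
        for (int i = start; i < start + count; i++) {
            result.append(Character.toLowerCase(str.charAt(i)));
        }
    }

    // (start, endIndex), with endIndex exclusive, mirroring String.substring(int, int).
    static void appendLowerByEnd(StringBuilder result, String str, int start, int endIndex) {
        for (int i = start; i < endIndex; i++) {
            result.append(Character.toLowerCase(str.charAt(i)));
        }
    }

    public static void main(String[] args) {
        String text = "accessKeyId";
        int wordStart = 6; // index of 'K'
        int wordEnd = 9;   // index just past 'y'

        StringBuilder byCount = new StringBuilder();
        // A caller tracking [start, end) must convert to a count first.
        appendLowerByCount(byCount, text, wordStart, wordEnd - wordStart);

        StringBuilder byEnd = new StringBuilder();
        appendLowerByEnd(byEnd, text, wordStart, wordEnd);

        System.out.println(byCount + " " + byEnd); // key key
    }
}
```

Both forms produce the same result; the difference is only whether the caller or the helper does the `end - start` arithmetic.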

mtdowling requested a review from a team as a code owner October 22, 2022 00:20

mtdowling mentioned this pull request Oct 22, 2022

Sensitive validator branch #1364

Merged

mtdowling force-pushed the reserved-words-matching branch 12 times, most recently from 11e52f3 to cacc888 Compare October 23, 2022 16:52

JordonPhillips approved these changes Oct 24, 2022


sugmanue reviewed Oct 24, 2022


mtdowling force-pushed the reserved-words-matching branch from cacc888 to b25767a Compare October 24, 2022 19:41

mtdowling requested a review from sugmanue October 24, 2022 19:42

mtdowling merged commit c079c9b into main Oct 24, 2022

sugmanue reviewed Oct 24, 2022


mtdowling deleted the reserved-words-matching branch March 21, 2023 16:50
