Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add secrets search and bump DotUtils.MsBuild.BinlogRedactor.SensitiveDataDetector version #829

Conversation

YuliiaKovalova
Copy link
Contributor

@YuliiaKovalova YuliiaKovalova commented Oct 30, 2024

Fixes

#823

Context

This change adds a support of a new search key word "$secret" that allows to detect all the suspicious tree entries, based on the capabilities of this library MSBuild.BinlogRedactor.

MSBuild.BinlogRedactor is already used for secrets reduction functionality, but it this context it's possible to check the presence of the secrets on fly
{4CD40D35-CD93-4195-B83B-30873E35C1D2}

$secret not(SensitiveDataKind) statement is supported here:
{E743D083-2878-4655-AACF-711D9A8E6610}

It's possible to use the new keyword on 2 tabs:
"Search Log"
{12CCFBFD-AC63-4A06-B1F7-86948D7B721A}

"Find in Files"
{ED9BC1B4-EBC7-4979-B8E2-097B9A01AB53}

Changes made

A new SecretsSearch class was implemented based on ISearchExtension interface.
To make this functionality available on "Find in Files" tab, the method was extended

private object FindInFiles(string searchText, int maxResults, CancellationToken cancellationToken)
. There was an idea to use factory for handing this case, but we agreed with to postpone this and get back on demand.
msbuild_logWithFalseSecrets.zip

@YuliiaKovalova YuliiaKovalova changed the title Add secrets search Add secrets search and bump DotUtils.MsBuild.BinlogRedactor.SensitiveDataDetector version Oct 31, 2024
@YuliiaKovalova YuliiaKovalova marked this pull request as ready for review October 31, 2024 12:48
Copy link
Collaborator

@JanKrivanek JanKrivanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this!

src/StructuredLogViewer/Controls/BuildControl.xaml.cs Outdated Show resolved Hide resolved
src/StructuredLogger.Utils/SecretsSearch.cs Show resolved Hide resolved
@JanKrivanek
Copy link
Collaborator

Can this be added to list of sample searches?

private static string[] searchExamples = new[]
{
"Copying file from ",
"Resolved file path is ",
"There was a conflict",
"Encountered conflict between",
"Building target completely ",
"is newer than output ",
"Property reassignment: $(",
"out-of-date",
"$task $time",
"$message CompilerServer failed",
"will be compiled because",
};

image

var haystack = file.Value;
var resultsInFile = haystack.Find(searchText);
if (resultsInFile.Count > 0)
if (!string.IsNullOrEmpty(searchText) && searchText.StartsWith("$secret"))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this condition will run for every file, and it doesn't change. How about we extract a bool above the foreach and just check it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved

@KirillOsenkov
Copy link
Owner

Very nice! I left a couple small comments but otherwise it's good to go!

@KirillOsenkov
Copy link
Owner

Oh, I forgot. When there are no secrets found, we should return something, otherwise you type $secret and nothing happens. For a regular search we return
image

@@ -1023,24 +1022,19 @@ private object FindInFiles(string searchText, int maxResults, CancellationToken
{
var results = new List<(string, IEnumerable<(int, string)>)>();

NodeQueryMatcher notQueryMatcher = new NodeQueryMatcher(searchText);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean nodeQueryMatcher?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed it in main

@@ -1023,24 +1022,19 @@ private object FindInFiles(string searchText, int maxResults, CancellationToken
{
var results = new List<(string, IEnumerable<(int, string)>)>();

NodeQueryMatcher notQueryMatcher = new NodeQueryMatcher(searchText);
bool isSecretsSearch = !string.IsNullOrEmpty(searchText) && searchText.StartsWith("$secret");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can also check for nodeQueryMatcher.TypeKeyword == "secret"

Copy link
Owner

@KirillOsenkov KirillOsenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thanks!

@KirillOsenkov KirillOsenkov merged commit 49aaad1 into KirillOsenkov:main Nov 1, 2024
1 check passed
@JanKrivanek
Copy link
Collaborator

Works like a charm!
Plus the new patterns are now available in the binlogtool redact as well:

image

@JanKrivanek
Copy link
Collaborator

@YuliiaKovalova - do you want to promete the functionality on the viewer help? Probably here?: https://github.com/KirillOsenkov/MSBuildLog/blob/6f87e14be0eb1a7cfe7c40e912b96b9ce0cf0301/index.html#L185-L191

@KirillOsenkov
Copy link
Owner

One todo would be to investigate adding parallelism to secret detection because right now it's really slow on real-life binlogs, even moderately sized. Takes over two minutes on a 5 MB binlog I use often. I think a judicious use of Task.Run() will really make things better here. You can maybe spawn one task per string, and one task per file, then just foreach the tasks and do task.Wait() on each of them. No need for async as only one thread will be blocked (the one where the computation happens)

@KirillOsenkov
Copy link
Owner

I filed #832

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants