Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

export results per family scan #304

Merged
merged 6 commits into from
May 24, 2023
Merged

export results per family scan #304

merged 6 commits into from
May 24, 2023

Conversation

fishkerez
Copy link
Contributor

Export scan results when family finish scanning, and not after all family scan is completed like today.
This will make the scan progress more informative and granular. Also mark in progress for each family scan

Type of Change

[* ] New Feature

@fishkerez fishkerez self-assigned this May 9, 2023
@fishkerez fishkerez requested a review from a team as a code owner May 9, 2023 14:44
@ghost
Copy link

ghost commented May 9, 2023

Architecturally this is quite different to how I expected us to implement this. The channels and go routine stuff that were added as wrappers on in the job manager where a temporary solution for abort until we can fix up the code in kubeclarity to support cancelable contexts better, I don't think its a good idea to use that for the result processing.

When discussing this previously I expected the presenter to be passed down to or be a part of the family logic. Personally I think it should be part of the family logic because each family can then understand how to present its own report format, and it would be configurable with different options per family e.g.

sbom:
  scanners: trivy, syft
  scanner_configs:
     trivy: ...
  output:
    type: file
    output_format: spdx_json
    path: ...

or

sbom:
  scanners: trivy, syft
  scanner_configs:
     trivy: ...
  output:
    type: vmclarity
    assetScanId: <uuid>

However this is a more significant change, and refactor requiring each family to be responsible for a set of "export"/"presenter" plugins.

A shortcut which achieves close to the same thing is to pass the presenter and state interfaces down to the families as an input, when we create the family e.g. it can be an input: "vulnerabilities.New(ctx, config.Vulnerabilties, presenter, statemanager)". Then in the Run function for the specific family, after its run all scan jobs and merged the results, it can call the correct function on the presenter with the report, and any errors before returning to the family manager flow.

This works well with the progress tracking stuff too, as "statemanager.MarkInProgress" for a specific family can be called as the first function call of the family Run function making it much more obvious whats happening from my perspective.

It also means we can get rid of a lot of the type casting in the presenter because the results will come straight from the specific family to the specific family export function without going through the FamilyResults object.

@fishkerez
Copy link
Contributor Author

@sambetts-cisco I do not recall a discussion regarding this. The current implementation was left as is, so the CLI is still responsible for the presenting/exporting as before. I only made it more granular.
TBH I don't think that the families should be responsible for presenting/exporting as this will require them to be aware of the backend, which is not their responsibility. The families responsibility is to scan and report back the results to whom ever called them.
In any case this is a big architectural change and I don't think it should happen as part of this PR, which meant not to refactor, but to fix current behaviour which was not satisfying

@ghost
Copy link

ghost commented May 10, 2023

I don't like the addition of more channels and monitoring in this PR, I would much prefer the families to accept an interface which can be called when starting a new family and exporting results, even if its just a two functions "MarkInProgress" and "ExportFamilyResult" like you've created. I don't think more go routines and channels is the right solution here. Like I said the go routine in the family manager was a temporary solution as it was.

func (m *Manager) Run(ctx context.Context, presenter FamilyPresenter) RunErrors {
        familyErrors := make(RunErrors)
        familyResults := results.New()

        for _, family := range m.families {
                presenter.MarkInProgress(family)
        
                result := make(chan familyResult)
                go func() {
                        ret, err := family.Run(familyResults)
                        result <- familyResult{
                                ret,
                                err,
                        }
                }()

                select {
                case <-ctx.Done():
                        go func() {
                                <-result
                                close(result)
                        }()
                        familyErrors[family.GetType()] = fmt.Errorf("failed to run family %v: aborted", family.GetType())
                case r := <-result:
                        log.Debugf("received result from family %q: %v", family, r)
                        if r.err != nil {
                                familyErrors[family.GetType()] = fmt.Errorf("failed to run family %v: %w", family.GetType(), r.err)
                        } else {
                                familyResults.SetResults(r.result)
                        }
                        close(result)
                }
                
                presenter.ExportFamilyResult(r.result, r.err)
        }

        return familyErrors
}

@akpsgit
Copy link
Contributor

akpsgit commented May 18, 2023

I don't like the addition of more channels and monitoring in this PR, I would much prefer the families to accept an interface which can be called when starting a new family and exporting results, even if its just a two functions "MarkInProgress" and "ExportFamilyResult" like you've created. I don't think more go routines and channels is the right solution here. Like I said the go routine in the family manager was a temporary solution as it was.

func (m *Manager) Run(ctx context.Context, presenter FamilyPresenter) RunErrors {
        familyErrors := make(RunErrors)
        familyResults := results.New()

        for _, family := range m.families {
                presenter.MarkInProgress(family)
        
                result := make(chan familyResult)
                go func() {
                        ret, err := family.Run(familyResults)
                        result <- familyResult{
                                ret,
                                err,
                        }
                }()

                select {
                case <-ctx.Done():
                        go func() {
                                <-result
                                close(result)
                        }()
                        familyErrors[family.GetType()] = fmt.Errorf("failed to run family %v: aborted", family.GetType())
                case r := <-result:
                        log.Debugf("received result from family %q: %v", family, r)
                        if r.err != nil {
                                familyErrors[family.GetType()] = fmt.Errorf("failed to run family %v: %w", family.GetType(), r.err)
                        } else {
                                familyResults.SetResults(r.result)
                        }
                        close(result)
                }
                
                presenter.ExportFamilyResult(r.result, r.err)
        }

        return familyErrors
}

Decided to move the exporting logic inside the family logic

shared/pkg/families/manager.go Outdated Show resolved Hide resolved
cli/cmd/root.go Outdated Show resolved Hide resolved
shared/pkg/families/manager.go Outdated Show resolved Hide resolved
cli/pkg/state/vmclarity.go Outdated Show resolved Hide resolved
cli/pkg/state/local.go Outdated Show resolved Hide resolved
@ghost
Copy link

ghost commented May 19, 2023

Couple of small comments, but overall the code looks great much cleaner

@fishkerez fishkerez added this pull request to the merge queue May 24, 2023
Merged via the queue into main with commit 8fb0422 May 24, 2023
@fishkerez fishkerez deleted the granular-results-export branch May 24, 2023 07:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants