fix(SpdxExpression): Fix the DNF algorithm #6721

MarcelBochtler · 2023-03-20T12:38:00Z

Whenever a SpdxExpression contains an AND operator, the DNF needs to be calculated again.
For instance the correct DNF for a AND (b OR c OR d) is (a AND b) OR (a AND c) OR (a AND d). Before, the calculated DNF for this expression was (a AND (b OR c)) OR (a AND d) which is not a valid DNF.

MarcelBochtler · 2023-03-20T12:39:20Z

I'm using this scanner result to check the performance impact of this change:
https://github.com/fviernau/orthw-demo/blob/main/scan-results/scan-result.json.xz

Results on my machine

Current main: 13 seconds
With fix: 1 min 14 seconds.

Therefore, the changes are currently not feasible :/

utils/spdx/src/main/kotlin/SpdxExpression.kt

codecov · 2023-03-20T12:53:04Z

Codecov Report

Patch and project coverage have no change.

Comparison is base (8a48a6d) 64.82% compared to head (58e0e6b) 64.82%.

Additional details and impacted files

@@            Coverage Diff            @@
##               main    #6721   +/-   ##
=========================================
  Coverage     64.82%   64.82%           
  Complexity     1938     1938           
=========================================
  Files           320      320           
  Lines         16024    16024           
  Branches       2281     2281           
=========================================
  Hits          10388    10388           
  Misses         4657     4657           
  Partials        979      979

Flag	Coverage Δ
funTest-non-docker	`49.53% <ø> (ø)`
test	`31.06% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

MarcelBochtler · 2023-03-21T08:18:03Z

Extracting the distributeAndOverOr function and only calling this recursively results in ~45 seconds evaluation time on my machine.
I doubt that this problem can be solved in a non-exponential time complexity (in the worst case).

mnonnenmacher · 2023-03-21T13:39:59Z

Extracting the distributeAndOverOr function and only calling this recursively results in ~45 seconds evaluation time on my machine. I doubt that this problem can be solved in a non-exponential time complexity (in the worst case).

I guess you are right because according to Wikipedia the problem is np-complete: https://en.wikipedia.org/wiki/Quine%E2%80%93McCluskey_algorithm#Complexity (Cline-McCluskey calculates the minimal DNF which is not strictly required for our use case, but calculating a non-minimal DNF is probably not much less complex)

I see basically two options how we can proceed:

Check if implementing another algorithm like Quine-McCluskey provides significantly better performance.
Check why the algorithm fix affects performance so much: Do we calculate the DNF too often? Could we maybe cache the result? Do we create too many objects during the calculation? Is there one specific very long expression that takes so long or does the problem also exist for many simple expressions?

Whenever a `SpdxExpression` contains an `AND` operator, the DNF needs to be calculated again. For instance, the correct DNF for `a AND (b OR c OR d)` is `(a AND b) OR (a AND c) OR (a AND d)`. Before, the calculated DNF for this expression was `(a AND (b OR c)) OR (a AND d)`, which is not a valid DNF. Signed-off-by: Marcel Bochtler <marcel.bochtler@bosch.com>

sschuberth · 2024-01-11T22:48:54Z

I wonder whether disjunctiveNormalForm() is still needed after all... in #8106 it seems we could simply remove it.

sschuberth · 2024-01-17T09:45:50Z

I'm using this scanner result to check the performance impact of this change:
https://github.com/fviernau/orthw-demo/blob/main/scan-results/scan-result.json.xz

@MarcelBochtler what operation exactly did you perform on the result for benchmarking?

MarcelBochtler · 2024-03-25T08:09:48Z

I'm using this scanner result to check the performance impact of this change:
https://github.com/fviernau/orthw-demo/blob/main/scan-results/scan-result.json.xz

@MarcelBochtler what operation exactly did you perform on the result for benchmarking?

I simply ran the evaluator using the linked scan-result.json:

ort evaluate -i scan-result.json -o ort/ --license-classifications-file license-classifications.yml --rules-file evaluator.rules.kts

I cannot share the license-classifications and the evaluator rules.
In the evaluator.rules.kts we do use expression.validChoices() on every expression that offers a choice. Other than that, I don't see anything that could be related to this issue.

I wonder whether disjunctiveNormalForm() is still needed after all... in #8106 it seems we could simplify remove it.

Yes looks like it. Should I close this PR?

sschuberth · 2024-03-25T20:04:06Z

Should I close this PR?

For the sake of cleaning things up a little, I'd say yes. I believe in the context of the current code base there now nothing anymore in this PR that we could take / reuse.

github-advanced-security bot found potential problems Mar 20, 2023

View reviewed changes

utils/spdx/src/main/kotlin/SpdxExpression.kt Fixed Show fixed Hide fixed

utils/spdx/src/main/kotlin/SpdxExpression.kt Fixed Show fixed Hide fixed

MarcelBochtler force-pushed the dnf-fix branch 3 times, most recently from a5834b5 to 677c4d0 Compare March 20, 2023 12:42

MarcelBochtler force-pushed the dnf-fix branch 2 times, most recently from 70d48cf to 7685fc2 Compare March 21, 2023 08:05

MarcelBochtler force-pushed the dnf-fix branch from 7685fc2 to 74b0759 Compare March 21, 2023 15:40

MarcelBochtler force-pushed the dnf-fix branch from 74b0759 to 58e0e6b Compare March 28, 2023 14:08

MarcelBochtler mentioned this pull request Apr 14, 2023

evaluator passes wrong detected license to rules.kts when license choice was made #6812

Closed

fviernau mentioned this pull request Jun 22, 2023

Does repository_license_choices has an effect on CycloneDXReporter sbom output? #7182

Closed

MarcelBochtler closed this Mar 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(SpdxExpression): Fix the DNF algorithm #6721

fix(SpdxExpression): Fix the DNF algorithm #6721

MarcelBochtler commented Mar 20, 2023

MarcelBochtler commented Mar 20, 2023 •

edited

Loading

codecov bot commented Mar 20, 2023 •

edited

Loading

MarcelBochtler commented Mar 21, 2023 •

edited

Loading

mnonnenmacher commented Mar 21, 2023 •

edited

Loading

sschuberth commented Jan 11, 2024 •

edited

Loading

sschuberth commented Jan 17, 2024

MarcelBochtler commented Mar 25, 2024

sschuberth commented Mar 25, 2024

fix(SpdxExpression): Fix the DNF algorithm #6721

fix(SpdxExpression): Fix the DNF algorithm #6721

Conversation

MarcelBochtler commented Mar 20, 2023

MarcelBochtler commented Mar 20, 2023 • edited Loading

Results on my machine

codecov bot commented Mar 20, 2023 • edited Loading

Codecov Report

MarcelBochtler commented Mar 21, 2023 • edited Loading

mnonnenmacher commented Mar 21, 2023 • edited Loading

sschuberth commented Jan 11, 2024 • edited Loading

sschuberth commented Jan 17, 2024

MarcelBochtler commented Mar 25, 2024

sschuberth commented Mar 25, 2024

MarcelBochtler commented Mar 20, 2023 •

edited

Loading

codecov bot commented Mar 20, 2023 •

edited

Loading

MarcelBochtler commented Mar 21, 2023 •

edited

Loading

mnonnenmacher commented Mar 21, 2023 •

edited

Loading

sschuberth commented Jan 11, 2024 •

edited

Loading