-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Evaluator: Allow making arbitrary file checks on the project's source tree #5679
Conversation
2eeae8c
to
6525ee2
Compare
Codecov Report
@@ Coverage Diff @@
## main #5679 +/- ##
=========================================
Coverage 65.54% 65.54%
Complexity 2212 2212
=========================================
Files 271 271
Lines 16600 16600
Branches 3473 3473
=========================================
Hits 10881 10881
Misses 4575 4575
Partials 1144 1144
Flags with carried forward coverage won't be shown. Click here to find out more. Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You use different prefixes for the example commits: "example.rules", "examples.rules", "example.rules.kts", please align.
3d921be
to
09d7f45
Compare
import org.ossreviewtoolkit.model.config.DownloaderConfiguration | ||
import org.ossreviewtoolkit.utils.ort.createOrtTempDir | ||
|
||
class SourceTree private constructor( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need this new class? Can't we simply use the existing WorkingTree
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I added this class will be more obvious when you look at what get's added in the following commits.
That's all IMO evaluator specific logic. Basically helper functions to use for implementing RuleMatchers and / or policy rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw the upcoming changes, but I'm still not convinced. Looks like the helper functions would as well operate on a WorkingTree
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw the upcoming changes, but I'm still not convinced.
let me just try on more time to convince you:
I believe even if these helper functions could all be implemented in WorkingTree encapsulation does make sense, because
- It is impossible that API change in the analyzer (working tree) breaks a policy rule implementation. So, the rules API is
independent from working tree. - The API can be designed to reflect exactly the requirements from the rules. This is exactly what is needed to arrive at easy to read rules. In particular from my experience it's hard to foresee the exact API needs when implementing new policy rules use cases, and my gut feeling is that this encapsulation will make more and more sense the more functions are being added.
- The functions can be implement based on the working tree, but it is not required. The encapsulation allows
changing the implementation. For example a file existence check could be changed to not work on the cloned source but
on the ScanResult, if that contained a full list of all files. - The helper functions are basically factored out logic from the rule matchers added to OrtResult. Exposing that logic is
important to expose because the logic inside rule matchers is not re-usable. For example the rule matcherhasFile()
contains logic to find the actual files. If you want to re-use the logic to find the files, the logic cannot be put intohasFile()
matcher but needs to be exposed somewhere. That somewhere is theSourceTree
class I've added. I don't see why that somewhere should beWorkingTree
.
Is any of the above points somewhat more convincing?
BTW.: I'm planning to add matchers for the commit history, and therefore create (not expose) a working tree instance inside the sourcetree class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyhow, as I plan to make a couple of further PRs on top, can we just keep it for now and make that decision when that work is done (guess we know more by then)? If it's really not needed then I'll refactor it away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just keep it for now and make that decision when that work is done
I'd actually prefer to get that sorted out now. IMO we had to much of "I need this urgently now so let's merge it"-style of changes recently, and ORT's code base is starting to suffer from suboptimal code design decisions (with only a single use-case in mind).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd actually prefer to get that sorted out now.
Ok, then let's sort it out asap.
IMO we had to much of "I need this urgently now so let's merge it"-style of changes recently, and ORT's code base is starting to suffer from suboptimal code design decisions (with only a single use-case in mind).
Maybe I wasn't clear enough. I said that I would do the refactoring in a later change if we by then consider it reasonable. So, the code base wouldn't suffer from it in the long run. Did you get that?
I prefer a rather iterative approach as I believe the proposed refactoring is a bit too early and can be done with gained knowledge a couple of days later in a following iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the code base wouldn't suffer from it in the long run. Did you get that?
I did get that. But such promises were made in the past, and then that planned refactoring never happened, or only after a very long time.
I prefer a rather iterative approach
Iterative is fine, but iterative should also mean that it's already going into the right direction, and not introducing stuff that gets removed later on (where foreseeable).
What I'd like to discuss first with you and at least also @mnonnenmacher is whether providing access to the source / working tree should really become a "first class citizen" of the evaluator as currently sketched. When I first read about #5621, I was hoping that would be mostly implemented by helper functions implemented in EPAM's rules.kts
itself, and not so much in ORT upstream.
09d7f45
to
7c95528
Compare
Move `getRepositoryPath()` to `OrtResultExtensions` to enable re-use in an upcoming change. Signed-off-by: Frank Viernau <frank_viernau@epam.com>
… tree Allow access to the project's source tree in order to enable doing arbitrary checks on files like the ones in repolinter [1]. Adding arbitrary file checks do make the rules API more powerful, as it allows highly customizable checks which can automate parts of the checks typically done prior to open sourcing a project. Not using a third-party tool makes sense as it is simpler to use, because it sticks to a single way (rules.kts) for writing the policy rules. Note that the implementation intentionally is limited to the project's source tree, e.g. it does not work for dependency source, because the doing such checks throught the dependency tree does not have obvious need and is not feasible anyway in terms of exection time. [1] https://github.com/todogroup/repolinter Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Some policy rules do not only require the result of hasFile(), but also the actual matching files if any. The same applies to hasDirectory(). So, expose the logic for finding the files and directories. Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Signed-off-by: Frank Viernau <frank_viernau@epam.com>
7c95528
to
47de9f5
Compare
I've created a new PR which is more minimal: #5754. |
The code of this PR has been refactored and is superseeded by: |
See individual commits.
Note: The project's source tree will be cloned in the evaluator only if the rules needed it / if the file checks are actually used.
This implements parts of #5621.