Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[report-converter] Sparse parser #3160

Merged
merged 3 commits into from
Mar 2, 2021

Conversation

jay24rajput
Copy link
Contributor

Sparse report converter tool for parsing Sparse output of kernel sources

@bulwahn
Copy link

bulwahn commented Jan 24, 2021

sparse creates multi-line error messages, but this report-converter here does not parse those correctly. Please rework that.

@jay24rajput
Copy link
Contributor Author

@bulwahn @csordasmarton So should we include that multi-line messages with the message of the original file? Because those lines to have a path,line and column number?

@bulwahn
Copy link

bulwahn commented Jan 24, 2021

Well, sparse has a very special syntax for its output... it would be best to just repair the sparse output to be more like clang-analyzer or clang-tidy, but for the time being, you need understand which lines belong to previous warning line and when a new warning line starts. You can have a look at Argert's implementation (he got that right).

@jay24rajput
Copy link
Contributor Author

jay24rajput commented Jan 31, 2021

@csordasmarton @bulwahn I think I have made the required changes to recognize the multi-line comments Kindly have a look and let me know if any modifications are required.

Copy link
Contributor

@csordasmarton csordasmarton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some tiny little comments. The biggest one is to update the test case with a multiline warning. Otherwise LGTM!

@csordasmarton csordasmarton added this to the release 6.16.0 milestone Feb 2, 2021
@csordasmarton csordasmarton added documentation 📖 Changes to documentation. test ☑️ Adding or refactoring tests tools 🛠️ Meta-tag for all the additional tools supplied with CodeChecker: plist2html, tu_collector, etc. labels Feb 2, 2021
@jay24rajput
Copy link
Contributor Author

@csordasmarton Apologies for changing the approved code of the Parser. I noticed a bug while re-writing the test cases and hence decided to fix it. Kindly have a look at the modified code

@bulwahn
Copy link

bulwahn commented Feb 8, 2021

It is "its converter", not "it's converter" in the commit message header of the third commit.

@bulwahn
Copy link

bulwahn commented Feb 8, 2021

Review is done; I will start testing...

@jay24rajput
Copy link
Contributor Author

@csordasmarton Kindly have a look 😄

@jay24rajput
Copy link
Contributor Author

jay24rajput commented Feb 19, 2021

@csordasmarton removing this and the above-mentioned regex was actually creating a separate sample.h.plist file and not appending it to sample.c. Nevertheless, I have modified the code to not include the files/sample.c: note: in included file: line in the output.

Kindly have a look 😄

r'(?P<message>[\S \t]+)\s*'
)

def parse_message(self, it, line):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to convert what the sample.out with the report-converter tool and then store the results to a running server. Now there is only one report with a lot of bug steps:
image
I don't think this is what you want.
In my previous comment (#3160 (comment)) my recommendation was the following:

  • Every line which looks like this will be a new report (Message): ./files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces) (the warning keyword identifies that we have a new report).
  • Every line which looks like this will be an event / bug step item of the previous Message object: ./files/sample.h:3:40: expected struct task_struct *p1. (So it doesn't contain the warning keyword like in the previous case).
  • Every line can be skipped which looks like this files/sample.c: note: in included file:.

So in case of the following output:

files/sample.c:4:1: warning: symbol 'machine_id' was not declared. Should it be static?
files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces)
files/sample.c: note: in included file:
files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)
files/sample.h:3:40:    expected struct task_struct *p1
files/sample.h:3:40:    got struct spinlock [noderef] __rcu *

I want to see 3 reports:

  1. files/sample.c:4:1: warning: symbol 'machine_id' was not declared. Should it be static?
  2. files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces)
  3. files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)
    files/sample.h:3:40: expected struct task_struct *p1
    files/sample.h:3:40: got struct spinlock [noderef] __rcu *

In the third case the report will have multiple bug steps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to convert what the sample.out with the report-converter tool and then store the results to a running server. Now there is only one report with a lot of bug steps:
image
I don't think this is what you want.

Well Actually I thought that is what we wanted, a single report for all the files and hence changed the code to include it in the single report. 😢

In my previous comment (#3160 (comment)) my recommendation was the following:

  • Every line which looks like this will be a new report (Message): ./files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces) (the warning keyword identifies that we have a new report).

This logic won't always stand true as some lines also start with error. Do you recommend modifying the regex as well now?

  • Every line which looks like this will be an event / bug step item of the previous Message object: ./files/sample.h:3:40: expected struct task_struct *p1. (So it doesn't contain the warning keyword like in the previous case).
  • Every line can be skipped which looks like this files/sample.c: note: in included file:.

So in case of the following output:

files/sample.c:4:1: warning: symbol 'machine_id' was not declared. Should it be static?
files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces)
files/sample.c: note: in included file:
files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)
files/sample.h:3:40:    expected struct task_struct *p1
files/sample.h:3:40:    got struct spinlock [noderef] __rcu *

I want to see 3 reports:

  1. files/sample.c:4:1: warning: symbol 'machine_id' was not declared. Should it be static?
  2. files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces)
  3. files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)
    files/sample.h:3:40: expected struct task_struct *p1
    files/sample.h:3:40: got struct spinlock [noderef] __rcu *

In the third case the report will have multiple bug steps.

Alright so in the tests, we will also have two sample.c.expected.plist and sample.h.expected.plist, this is after the assumption that second line in the sample.out file is removed. Correct me if I am wrong

@jay24rajput
Copy link
Contributor Author

@csordasmarton I think the current code solves the issues discussed above by us. Kindly have a look and let me know what you think

Copy link
Contributor

@csordasmarton csordasmarton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just 2 tiny comments, otherwise its LGTM!

Comment on lines +3 to +5
./files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)
./files/sample.h:3:40: expected struct task_struct *p1
./files/sample.h:3:40: got struct spinlock [noderef] __rcu *
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my previous review comments I asked what is the reason that one line starts with ./ and the others are not (#3160 (comment)).
Now with the regexes we will identify every line which stars with ./ as notes and the others as reports. Are you sure that every note line will start with ./?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes as far as I have checked, all the note lines start with ./

- Sparse output is parsed and stored into the format:
   file:line:column:message
- Parser is called from report converter CLI with type 'sparse'
- Sparse Parser is tested with sample.c, sample.h
  and sample.out files
- Sparse documentation in report converter explains the usage
  of sparse tool on kernel sources
- Sparse is accessible from the main readme file and supported
  code analyzer file
- Sparse updated in usage
@jay24rajput
Copy link
Contributor Author

@csordasmarton I have made the changes suggested by you! Kindly have a look😄

Copy link
Contributor

@csordasmarton csordasmarton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you for your hard work 😊

@csordasmarton csordasmarton merged commit ce64dfb into Ericsson:master Mar 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation 📖 Changes to documentation. test ☑️ Adding or refactoring tests tools 🛠️ Meta-tag for all the additional tools supplied with CodeChecker: plist2html, tu_collector, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants