[report-converter] Sparse parser #3160

Merged: 3 commits, Mar 2, 2021
1 change: 1 addition & 0 deletions docs/README.md
@@ -171,6 +171,7 @@ The following tools are supported:
| | [Coccinelle](/tools/report-converter/README.md#coccinelle) |
| | [Smatch](/tools/report-converter/README.md#smatch) |
| | [Kernel-Doc](/tools/report-converter/README.md#kernel-doc) |
| | [Sparse](/tools/report-converter/README.md#sparse) |
| **Java** | [SpotBugs](/tools/report-converter/README.md#spotbugs) |
| | [Facebook Infer](/tools/report-converter/README.md#fbinfer) |
| **Python** | [Pylint](/tools/report-converter/README.md#pylint) |
1 change: 1 addition & 0 deletions docs/supported_code_analyzers.md
@@ -18,6 +18,7 @@ CodeChecker result directory which can be stored to a CodeChecker server.
| | [Coccinelle](/tools/report-converter/README.md#coccinelle) | ✓ |
| | [Smatch](/tools/report-converter/README.md#smatch) | ✓ |
| | [Kernel-Doc](/tools/report-converter/README.md#kernel-doc) | ✓ |
| | [Sparse](/tools/report-converter/README.md#sparse) | ✓ |
| **Java** | [FindBugs](http://findbugs.sourceforge.net/) | ✗ |
| | [SpotBugs](/tools/report-converter/README.md#spotbugs) | ✓ |
| | [Facebook Infer](/tools/report-converter/README.md#fbinfer) | ✓ |
31 changes: 29 additions & 2 deletions tools/report-converter/README.md
@@ -24,6 +24,7 @@ a CodeChecker server.
* [Smatch](#smatch)
* [Kernel-Doc](#kernel-doc)
* [Sphinx](#sphinx)
* [Sparse](#sparse)
* [License](#license)

## Install guide
@@ -57,8 +58,8 @@ optional arguments:
-t TYPE, --type TYPE Specify the format of the code analyzer output.
Currently supported output types are: asan, clang-
tidy, coccinelle, cppcheck, eslint, fbinfer, golint,
kernel-doc, msan, pyflakes, pylint, smatch, spotbugs,
sphinx, tsan, tslint, ubsan.
kernel-doc, msan, pyflakes, pylint, smatch, sparse,
spotbugs, sphinx, tsan, tslint, ubsan.
--meta [META [META ...]]
Metadata information which will be stored alongside
the run when the created report directory will be
@@ -95,6 +96,7 @@ Supported analyzers:
pyflakes - Pyflakes, https://github.com/PyCQA/pyflakes
pylint - Pylint, https://www.pylint.org
smatch - smatch, https://repo.or.cz/w/smatch.git
sparse - sparse, https://git.kernel.org/pub/scm/devel/sparse/sparse.git
spotbugs - spotbugs, https://spotbugs.github.io
sphinx - sphinx, https://github.com/sphinx-doc/sphinx
tsan - ThreadSanitizer, https://clang.llvm.org/docs/ThreadSanitizer.html
@@ -501,6 +503,31 @@ report-converter -t sphinx -o ./codechecker_sphinx_reports ./sphinx.out
CodeChecker store ./codechecker_sphinx_reports -n sphinx
```

## [Sparse](https://git.kernel.org/pub/scm/devel/sparse/sparse.git)
[Sparse](https://git.kernel.org/pub/scm/devel/sparse/sparse.git) is a semantic checker
for C programs; it can be used to find a number of potential problems with kernel code.

The recommended way of running Sparse is to redirect the output to a file and
give this file to the report converter tool.

The following example shows you how to run Sparse on kernel sources
and store the results found by Sparse to the CodeChecker database.

```sh
# Change Directory to your project
cd path/to/linux/kernel/repository

# Run Sparse
make C=1 2>&1 | tee sparse.out

# Use 'report-converter' to create a CodeChecker report directory from the
# analyzer result of Sparse
report-converter -t sparse -o ./codechecker_sparse_reports ./sparse.out

# Store the Sparse reports with CodeChecker.
CodeChecker store ./codechecker_sparse_reports -n sparse
```

## License

The project is licensed under Apache License v2.0 with LLVM Exceptions.
@@ -60,6 +60,8 @@
KernelDocAnalyzerResult # noqa
from codechecker_report_converter.sphinx.analyzer_result import \
SphinxAnalyzerResult # noqa
from codechecker_report_converter.sparse.analyzer_result import \
SparseAnalyzerResult # noqa


LOG = logging.getLogger('ReportConverter')
@@ -98,7 +100,8 @@ class RawDescriptionDefaultHelpFormatter(
CoccinelleAnalyzerResult.TOOL_NAME: CoccinelleAnalyzerResult,
SmatchAnalyzerResult.TOOL_NAME: SmatchAnalyzerResult,
KernelDocAnalyzerResult.TOOL_NAME: KernelDocAnalyzerResult,
SphinxAnalyzerResult.TOOL_NAME: SphinxAnalyzerResult
SphinxAnalyzerResult.TOOL_NAME: SphinxAnalyzerResult,
SparseAnalyzerResult.TOOL_NAME: SparseAnalyzerResult
}

supported_metadata_keys = ["analyzer_command", "analyzer_version"]
@@ -0,0 +1,7 @@
# -------------------------------------------------------------------------
#
# Part of the CodeChecker project, under the Apache License v2.0 with
# LLVM Exceptions. See LICENSE for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
# -------------------------------------------------------------------------
@@ -0,0 +1,36 @@
# -------------------------------------------------------------------------
#
# Part of the CodeChecker project, under the Apache License v2.0 with
# LLVM Exceptions. See LICENSE for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
# -------------------------------------------------------------------------

from codechecker_report_converter.analyzer_result import AnalyzerResult

from .output_parser import SparseParser
from ..plist_converter import PlistConverter


class SparseAnalyzerResult(AnalyzerResult):
    """ Transform analyzer result of Sparse. """

    TOOL_NAME = 'sparse'
    NAME = 'Sparse'
    URL = 'https://git.kernel.org/pub/scm/devel/sparse/sparse.git'

    def parse(self, analyzer_result):
        """ Creates plist files from the given analyzer result to the given
        output directory.
        """
        parser = SparseParser(analyzer_result)

        content = self._get_analyzer_result_file_content(analyzer_result)
        if not content:
            return

        messages = parser.parse_messages(content)

        plist_converter = PlistConverter(self.TOOL_NAME)
        plist_converter.add_messages(messages)
        return plist_converter.get_plist_results()
@@ -0,0 +1,88 @@
# -------------------------------------------------------------------------
#
# Part of the CodeChecker project, under the Apache License v2.0 with
# LLVM Exceptions. See LICENSE for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
# -------------------------------------------------------------------------

import logging
import os
import re

from ..output_parser import BaseParser, Message, Event
LOG = logging.getLogger('ReportConverter')


class SparseParser(BaseParser):
    """
    Parser for Sparse Output
    """

    def __init__(self, analyzer_result):
        super(SparseParser, self).__init__()

        self.analyzer_result = analyzer_result

        self.message_line_re = re.compile(
            # File path followed by a ':'.
            r'^(?P<path>[\S ]+?):'
            # Line number followed by a ':'.
            r'(?P<line>\d+?):'
            # Column number followed by a ':'.
            r'(?P<column>\d+?):'
            # Message.
            r'(?P<message>[\S \t]+)\s*')

        self.note_line_re = re.compile(
            # File path followed by a ':'.
            r'^(?P<path>\.[\S ]+?):'
            # Line number followed by a ':'.
            r'(?P<line>\d+?):'
            # Column number followed by a ':'.
            r'(?P<column>\d+?):'
            # Message.
            r'(?P<message>[\S \t]+)\s*'
        )

    def parse_message(self, it, line):
Contributor

I tried to convert sample.out with the report-converter tool and then store the results to a running server. Now there is only one report with a lot of bug steps.
I don't think this is what you want.
In my previous comment (#3160 (comment)) my recommendation was the following:

  • Every line which looks like this will be a new report (Message): `./files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces)` (the warning keyword identifies that we have a new report).
  • Every line which looks like this will be an event / bug step item of the previous Message object: `./files/sample.h:3:40: expected struct task_struct *p1` (so it doesn't contain the warning keyword, unlike the previous case).
  • Every line which looks like this can be skipped: `files/sample.c: note: in included file:`.

So in case of the following output:

files/sample.c:4:1: warning: symbol 'machine_id' was not declared. Should it be static?
files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces)
files/sample.c: note: in included file:
files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)
files/sample.h:3:40:    expected struct task_struct *p1
files/sample.h:3:40:    got struct spinlock [noderef] __rcu *

I want to see 3 reports:

  1. files/sample.c:4:1: warning: symbol 'machine_id' was not declared. Should it be static?
  2. files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces)
  3. files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)
    files/sample.h:3:40: expected struct task_struct *p1
    files/sample.h:3:40: got struct spinlock [noderef] __rcu *

In the third case the report will have multiple bug steps.
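
The grouping rule recommended above can be sketched as follows. This is only an illustration, not the converter's code: `LINE_RE`, `Report`, and `group_reports` are hypothetical names, and the pattern assumes that file paths contain no `:` character.

```python
import re
from dataclasses import dataclass, field
from typing import Iterable, List

# Hypothetical pattern: "<path>:<line>:<column>: <text>".
# Assumes the path itself contains no ':'.
LINE_RE = re.compile(
    r'^(?P<path>[^:]+):(?P<line>\d+):(?P<column>\d+):\s*(?P<text>.*)$')


@dataclass
class Report:
    """One report; 'steps' holds the follow-up lines (bug steps)."""
    path: str
    line: int
    column: int
    message: str
    steps: List[str] = field(default_factory=list)


def group_reports(lines: Iterable[str]) -> List[Report]:
    reports: List[Report] = []
    for raw in lines:
        match = LINE_RE.match(raw.rstrip())
        if match is None:
            # Lines without line/column, such as
            # "files/sample.c: note: in included file:", are skipped.
            continue
        text = match.group('text')
        if text.startswith('warning:'):
            # The 'warning' keyword starts a new report.
            reports.append(Report(match.group('path'),
                                  int(match.group('line')),
                                  int(match.group('column')),
                                  text))
        elif reports:
            # Any other located line becomes a bug step of the last report.
            reports[-1].steps.append(text)
    return reports


SAMPLE = [
    "files/sample.c:4:1: warning: symbol 'machine_id' was not declared. Should it be static?",
    "files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces)",
    "files/sample.c: note: in included file:",
    "files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)",
    "files/sample.h:3:40:    expected struct task_struct *p1",
    "files/sample.h:3:40:    got struct spinlock [noderef] __rcu *",
]

for report in group_reports(SAMPLE):
    print(report.path, report.line, report.column, report.message, report.steps)
```

With the sample output quoted above, this yields three reports, and only the third one carries extra bug steps.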

Contributor Author

> I tried to convert sample.out with the report-converter tool and then store the results to a running server. Now there is only one report with a lot of bug steps.
> I don't think this is what you want.

Actually, I thought that was what we wanted: a single report for all the files. That is why I changed the code to put everything into a single report. 😢

> In my previous comment (#3160 (comment)) my recommendation was the following:
>
>   • Every line which looks like this will be a new report (Message): ./files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces) (the warning keyword identifies that we have a new report).

This logic won't always hold, because some lines use the error keyword instead of warning. Do you recommend modifying the regex as well?

>   • Every line which looks like this will be an event / bug step item of the previous Message object: ./files/sample.h:3:40: expected struct task_struct *p1. (So it doesn't contain the warning keyword like in the previous case).
>   • Every line can be skipped which looks like this files/sample.c: note: in included file:.
>
> So in case of the following output:
>
> files/sample.c:4:1: warning: symbol 'machine_id' was not declared. Should it be static?
> files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces)
> files/sample.c: note: in included file:
> files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)
> files/sample.h:3:40:    expected struct task_struct *p1
> files/sample.h:3:40:    got struct spinlock [noderef] __rcu *
>
> I want to see 3 reports:
>
>   1. files/sample.c:4:1: warning: symbol 'machine_id' was not declared. Should it be static?
>   2. files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces)
>   3. files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)
>     files/sample.h:3:40: expected struct task_struct *p1
>     files/sample.h:3:40: got struct spinlock [noderef] __rcu *
>
> In the third case the report will have multiple bug steps.

Alright, so the tests will also have two expected files, sample.c.expected.plist and sample.h.expected.plist, assuming the second line of sample.out is removed. Correct me if I am wrong.
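
One way to cover both keywords is to capture the severity in the pattern. The sketch below is a rough illustration under assumptions: `REPORT_RE` and `reports_per_file` are hypothetical names, the `error` line is an invented sample, and the per-file count only mirrors the idea of one expected plist per source file discussed in this thread.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# Hypothetical pattern: a report line carries a 'warning' or 'error' keyword.
# Assumes the file path contains no ':'.
REPORT_RE = re.compile(
    r'^(?P<path>[^:]+):(?P<line>\d+):(?P<column>\d+):\s*'
    r'(?P<severity>warning|error):\s*(?P<message>.*)$')


def reports_per_file(lines: Iterable[str]) -> Dict[str, int]:
    """Count report-starting lines per source file."""
    counts: Counter = Counter()
    for raw in lines:
        match = REPORT_RE.match(raw.rstrip())
        if match:
            counts[match.group('path')] += 1
    return dict(counts)


# sample.out as quoted in this thread, with its second line removed.
SAMPLE_OUT = [
    "files/sample.c:4:1: warning: symbol 'machine_id' was not declared. Should it be static?",
    "files/sample.c: note: in included file:",
    "files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)",
    "files/sample.h:3:40:    expected struct task_struct *p1",
    "files/sample.h:3:40:    got struct spinlock [noderef] __rcu *",
]

# Invented line, used only to show that 'error' is matched as well.
HYPOTHETICAL_ERROR = "files/sample.c:7:5: error: undefined identifier 'foo'"

print(reports_per_file(SAMPLE_OUT))
print(REPORT_RE.match(HYPOTHETICAL_ERROR).group('severity'))
```

Under these assumptions, each of the two files contributes one report, which matches the two expected plist files mentioned above.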

        """
        Actual Parsing function for the given line
        It is expected that each line contains a separate report
        """
        match = self.message_line_re.match(line)

        if (match is None):
            return None, next(it)

        checker_name = None

        file_path = os.path.normpath(
            os.path.join(os.path.dirname(self.analyzer_result),
                         match.group('path')))
        message = Message(
            file_path,
            int(match.group('line')),
            int(match.group('column')),
            match.group('message').strip(),
            checker_name)

        try:
            line = next(it)
            note_match = self.note_line_re.match(line)
            while note_match:
                file_path = os.path.normpath(
                    os.path.join(os.path.dirname(self.analyzer_result),
                                 note_match.group('path')))
                message.events.append(Event(file_path,
                                            int(note_match.group('line')),
                                            int(note_match
                                                .group('column')),
                                            note_match.group('message')
                                            .strip()))
                line = next(it)
                note_match = self.note_line_re.match(line)
            return message, line

        except StopIteration:
            return message, ''
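
The two patterns in this revision differ only in the leading `.` that `note_line_re` requires in the path. The standalone sketch below copies both patterns from the diff and runs them on a few lines; the lines with a `./` prefix are taken from the review comment, and whether the real sample.out uses that prefix throughout is an assumption.

```python
import re

# Both patterns are copied from the parser in this revision.
message_line_re = re.compile(
    r'^(?P<path>[\S ]+?):'
    r'(?P<line>\d+?):'
    r'(?P<column>\d+?):'
    r'(?P<message>[\S \t]+)\s*')

note_line_re = re.compile(
    r'^(?P<path>\.[\S ]+?):'
    r'(?P<line>\d+?):'
    r'(?P<column>\d+?):'
    r'(?P<message>[\S \t]+)\s*')

LINES = [
    "./files/sample.h:3:40: warning: incorrect type in argument 2 (different address spaces)",
    "./files/sample.h:3:40:    expected struct task_struct *p1",
    "files/sample.h:3:40: warning: incorrect type in argument 1 (different address spaces)",
    "files/sample.c: note: in included file:",
]

for text in LINES:
    # Report which of the two patterns accepts each line.
    print(bool(message_line_re.match(text)),
          bool(note_line_re.match(text)),
          text)
```

Any line whose path starts with `.` satisfies both patterns, so after the first message every following `./...` line, warnings included, is consumed by the `while note_match` loop as an event. That appears consistent with the single report with many bug steps described in the review.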
@@ -0,0 +1,4 @@
all:
# Due to difference in format on kernel and command below, and to
# test all the cases of output, the output is different in sample.out file
sparse -Wsparse-all files/sample.c > ./sample.out 2>&1 || exit 0;
@@ -0,0 +1,9 @@
#include <stdio.h>
#include "sample.h"

unsigned int machine_id;

void main()
{
    printf("Inside main function");
}
@@ -0,0 +1,10 @@
static inline struct task_struct *get_task_struct(struct task_struct *t1, struct task_struct *t2)
{
    refcount_inc(t1, t2);
    return t2;
}

void refcount_inc(struct task_struct *t1, struct task_struct *t2)
{
    return;
}
@@ -0,0 +1,69 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>diagnostics</key>
<array>
<dict>
<key>category</key>
<string>unknown</string>
<key>check_name</key>
<string>sparse</string>
<key>description</key>
<string>warning: symbol 'machine_id' was not declared. Should it be static?</string>
<key>issue_hash_content_of_line_in_context</key>
<string>474283465f76d26a290e0ce2e5dd3502</string>
<key>location</key>
<dict>
<key>col</key>
<integer>1</integer>
<key>file</key>
<integer>0</integer>
<key>line</key>
<integer>4</integer>
</dict>
<key>path</key>
<array>
<dict>
<key>depth</key>
<integer>0</integer>
<key>kind</key>
<string>event</string>
<key>location</key>
<dict>
<key>col</key>
<integer>1</integer>
<key>file</key>
<integer>0</integer>
<key>line</key>
<integer>4</integer>
</dict>
<key>message</key>
<string>warning: symbol 'machine_id' was not declared. Should it be static?</string>
</dict>
</array>
<key>type</key>
<string>sparse</string>
</dict>
</array>
<key>files</key>
<array>
<string>files/sample.c</string>
</array>
<key>metadata</key>
<dict>
<key>analyzer</key>
<dict>
<key>name</key>
<string>sparse</string>
</dict>
<key>generated_by</key>
<dict>
<key>name</key>
<string>report-converter</string>
<key>version</key>
<string>x.y.z</string>
</dict>
</dict>
</dict>
</plist>
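
An expected plist like the one above can be inspected with Python's standard `plistlib` module. The sketch below is a minimal example: it embeds a trimmed copy of the plist (only a few keys kept), and `summarize` is a hypothetical helper, not part of report-converter.

```python
import plistlib

# Trimmed copy of the expected plist above; only the keys used below are kept.
PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>diagnostics</key>
  <array>
    <dict>
      <key>check_name</key>
      <string>sparse</string>
      <key>description</key>
      <string>warning: symbol 'machine_id' was not declared. Should it be static?</string>
      <key>location</key>
      <dict>
        <key>col</key>
        <integer>1</integer>
        <key>file</key>
        <integer>0</integer>
        <key>line</key>
        <integer>4</integer>
      </dict>
    </dict>
  </array>
  <key>files</key>
  <array>
    <string>files/sample.c</string>
  </array>
</dict>
</plist>
"""


def summarize(data: bytes):
    """Return (file, line, column, description) for every diagnostic."""
    doc = plistlib.loads(data)
    files = doc['files']
    return [
        # 'location.file' is an index into the top-level 'files' array.
        (files[diag['location']['file']],
         diag['location']['line'],
         diag['location']['col'],
         diag['description'])
        for diag in doc['diagnostics']
    ]


for entry in summarize(PLIST):
    print(entry)
```

The `file` integer inside `location` is resolved through the top-level `files` array, which is why the sketch looks the path up by index.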