Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTML format bug report for local compare #1278

Merged
merged 1 commit into from
Jan 8, 2018

Conversation

dkrupp
Copy link
Member

@dkrupp dkrupp commented Jan 5, 2018

"CodeChecker cmd diff" command can generate HTML output
even if the new report set is a local report directory.

Fixes #1277

@whisperity whisperity added bugfix 🔨 CLI 💻 Related to the command-line interface, such as the cmd, store, etc. commands labels Jan 5, 2018
@whisperity
Copy link
Contributor

(@dkrupp Your commit environment uses a machine-local e-mail address and thus the commit isn't assigned to you properly.)

@@ -356,6 +357,50 @@ def get_report_data(client, reports, file_cache):
return {'files': file_sources,
'reports': report_data}

def reports2report_data(reports):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function looks very familiar yet this patch only introduces it as block code. Can't we import the library where this is originally in (optionally with a proper handling of import errors?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a totally new code. The function get_report_data() a little bit above is similarish, but that is converting from another report format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reports_to_report_data function name would be better.

# Not all report in this list may refer to the same files
# thus we need to create a single file list with
# all files from all reports.
for f in report.files:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use enumerate here, so you will not need a separate findex variable.

for index, f in enumerate(report.files):
...

The same applies for the other index variable below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the old fashion way. ;)

@@ -356,6 +357,50 @@ def get_report_data(client, reports, file_cache):
return {'files': file_sources,
'reports': report_data}

def reports2report_data(reports):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reports_to_report_data function name would be better.


events = []
pathElements = report.bug_path
index = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let this variable be equal with 1 (index = 1). This way step field of an event below can be just simply equal with this index.

'step': index})

@@ -365,17 +410,26 @@ def report_to_html(client, reports, output_dir):

file_report_map = defaultdict(list)
for report in reports:
file_report_map[report.fileId].append(report)
filePath = ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not use camel cased variable names in python. Use underscore to separate the part of the variable name:

file_path

filename = os.path.basename(checked_file)
h = int(hashlib.md5(file_path).hexdigest(), 16) % (10 ** 8)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we convert the hash to a number and limit the values to 10^8? This way the probability of hash collision is a much larger risk. Would not be better to use the full hash?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hash collision is minimal, as a file name is always the prefix. So this number is only to distinguish between files with the same name and different paths.
The full hash is 16 bytes long and in hex it is a 32 long character chain.
I think a 8 long decimal chain is enough to distinguish between files.
So an output HTML will look like:
alma.c_95479304.html

instead of
alma.c_ebbc3c26a34b609dc46f5c3378f96e08.html

See

h.digest_size
16L
h.hexdigest()
'ebbc3c26a34b609dc46f5c3378f96e08'
int(h.hexdigest(), 16) % (10 ** 8)
95479304L

Copy link
Member Author

@dkrupp dkrupp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the review comments

filename = os.path.basename(checked_file)
h = int(hashlib.md5(file_path).hexdigest(), 16) % (10 ** 8)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hash collision is minimal, as a file name is always the prefix. So this number is only to distinguish between files with the same name and different paths.
The full hash is 16 bytes long and in hex it is a 32 long character chain.
I think a 8 long decimal chain is enough to distinguish between files.
So an output HTML will look like:
alma.c_95479304.html

instead of
alma.c_ebbc3c26a34b609dc46f5c3378f96e08.html

See

h.digest_size
16L
h.hexdigest()
'ebbc3c26a34b609dc46f5c3378f96e08'
int(h.hexdigest(), 16) % (10 ** 8)
95479304L

# Not all report in this list may refer to the same files
# thus we need to create a single file list with
# all files from all reports.
for f in report.files:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the old fashion way. ;)

if isinstance(report, Report):
file_path = report.main['location']['file_name']
else:
file_path = report.checked_file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReportData instance has no attribute checked_file. Use checkedFile attribute:

file_path = report.checkedFile

Copy link
Contributor

@whisperity whisperity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit author information is still bogus.

@@ -441,6 +496,7 @@ def print_reports(client, reports, output_format):
get_line_from_file(report.main['location']['file_name'],
bug_line)
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know the type that comes from the server? Maybe it should be explicitly put as an elif, instead of an else.

"CodeChecker cmd diff" command can generate HTML output
even if the new report set is a local report directory.
@csordasmarton csordasmarton merged commit 136fabc into Ericsson:master Jan 8, 2018
@dkrupp dkrupp deleted the diff_fix branch February 17, 2018 17:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix 🔨 CLI 💻 Related to the command-line interface, such as the cmd, store, etc. commands
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants