-
Notifications
You must be signed in to change notification settings - Fork 188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: duplicate scancode files being reported #1045
Conversation
tern/extensions/scancode/executor.py
Outdated
if f.merge(checkfile): | ||
# file already exists and has now been updated | ||
updated = True | ||
break | ||
# file didn't previously exist in layer so add it now | ||
layer_obj.files.append(checkfile) | ||
if not updated: | ||
# file didn't previously exist in layer so add it now | ||
layer_obj.files.append(checkfile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check my logic here? I think this can be achieved this way:
for f in layer_obj.files <-- cycle through existing files
merged = f.merge(checkfile) <-- try to update the file
if not merged: <-- the file didn't get updated, means this is a unique file
layer_obj.files.append(checkfile)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we still need the break
and to update the file only when the for loop has been broken or completed. Otherwise, with your proposal, say we have three files in the layer [A, B, C] and two files from scancode that needs updating: [A, C]. We go through this for loop and compare A to A, merged = True so file gets updated but then we move to the next f in layer files and compare B from the layer files to the same A from scancode... merged = False and so we add the A file, even though it already exists and has been updated... and we are facing the same problem we have now. I think we could do:
merged = False <-- variable needs to exist outside of for loop so we can check it once the loop completes.
for f in layer_obj.files:
merged = f.merge(checkfile)
if merged:
break
if not merged: <-- if we are here it's because we broke out of the loop (in which case the file has been updated and merged=True or because we never found a match and merged=False and file needs to be updated.
layer_obj.files.append(checkfile)
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup! This makes sense.
When running scancode with an empty cache, Tern was reporting the same file twice due to a logic bug in the add_file_data() function. This fix adds a flag to ensure that a file is only added to the layer object file list if it does not already exist there. Resolves tern-tools#1000 Signed-off-by: Rose Judge <rjudge@vmware.com>
31fa6bd
to
b5b64e2
Compare
When running scancode with an empty cache, Tern was reporting the same
file twice due to a logic bug in the add_file_data() function. This fix
adds a flag to ensure that a file is only added to the layer object
file list if it does not already exist there.
Resolves #1000
Signed-off-by: Rose Judge rjudge@vmware.com