git diff adding a new empty file is not parsed correctly #19

Closed
wbolster opened this issue Dec 10, 2015 · 4 comments · Fixed by #86

Comments

@wbolster

a git diff adding a new empty file is not parsed correctly. the patch set contains 0 items in that case, while it should contain one.

sample:

diff --git a/empty b/empty
new file mode 100644
index 0000000..e69de29
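
To make the expected behaviour concrete, here is a minimal stdlib-only sketch (not the library's actual implementation) that counts one entry per `diff --git` header, so a header with no hunks still yields a file. The names `GIT_HEADER` and `list_git_files` are hypothetical, and the regex is naive about paths that contain spaces.

```python
import re

# Header line that git emits once per file in a diff.
# Assumption: paths contain no " b/" sequence; real parsers need more care.
GIT_HEADER = re.compile(r"^diff --git a/(?P<source>.+) b/(?P<target>.+)$")
NEW_FILE_PREFIX = "new file mode "


def list_git_files(diff_text):
    """Return one dict per `diff --git` header, even when no hunks follow."""
    files = []
    for line in diff_text.splitlines():
        match = GIT_HEADER.match(line)
        if match:
            # Register the file as soon as the header is seen, instead of
            # waiting for a `---`/`+++` pair or a hunk that may never come.
            files.append({
                "source": match.group("source"),
                "target": match.group("target"),
                "new_file_mode": None,
            })
        elif files and line.startswith(NEW_FILE_PREFIX):
            files[-1]["new_file_mode"] = line[len(NEW_FILE_PREFIX):]
    return files


SAMPLE = """\
diff --git a/empty b/empty
new file mode 100644
index 0000000..e69de29
"""

files = list_git_files(SAMPLE)
print(len(files), files[0]["target"], files[0]["new_file_mode"])
```

With the sample above, the sketch reports one file named `empty`, which is the count the patch set should contain.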
wbolster changed the title from "git diff adding a new empty file is not parsed" to "git diff adding a new empty file is not parsed correctly" on Dec 10, 2015
@matiasb
Owner

matiasb commented Dec 25, 2015

I see, you're right, will take a look.
Thanks for your report!

@archen

archen commented Sep 13, 2016

This is a problem for any changes to binary files. Capturing these changes in the metadata and being able to iterate over them as part of the PatchSet would be great.
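
As a rough illustration of the binary case, git prints a `Binary files ... differ` line instead of hunks, so a parser that only looks for hunks sees nothing. The sketch below is stdlib-only and hypothetical (`BINARY_LINE` and `find_binary_changes` are made-up names, and the index hashes in the sample are invented); it only shows how such a line could be captured as metadata.

```python
import re

# Line git prints in place of hunks when a changed file is binary.
BINARY_LINE = re.compile(
    r"^Binary files (?P<source>.+) and (?P<target>.+) differ$"
)


def find_binary_changes(diff_text):
    """Return (source, target) pairs for every binary-file marker line."""
    changes = []
    for line in diff_text.splitlines():
        match = BINARY_LINE.match(line)
        if match:
            changes.append((match.group("source"), match.group("target")))
    return changes


# Illustrative input; the hashes are placeholders, not from a real repo.
BINARY_SAMPLE = """\
diff --git a/logo.png b/logo.png
index 1234567..89abcde 100644
Binary files a/logo.png and b/logo.png differ
"""

binary_changes = find_binary_changes(BINARY_SAMPLE)
print(binary_changes)
```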

@wbolster
Author

fwiw a while ago i made a start on a diff/patch parser. i just published that w-i-p code here:

https://github.com/wbolster/diffparse

it tries to get metadata from svn and git diffs

@wbolster
Author

sample output from that parser (see __main__ module):

$ python -m diffparse some-git-commit.diff 
patch set: PatchSet(preamble=['commit c91d3dbeff576c3f0ea8aab5848d1b95af290544 (HEAD -> master, origin/master)', 'Author: Wouter Bolsterlee <wouter@bolsterl.ee>', 'Date:   2015-12-28 21:55:47 +0100', '', '    Ignore leading/trailing whitespace', ''], patched_files=[PatchedFile(source_file='diffparse/__main__.py', target_file='diffparse/__main__.py', source_timestamp=None, target_timestamp=None, git_header=GitHeader(source_file='diffparse/__main__.py', target_file='diffparse/__main__.py', old_mode=None, new_mode=None, deleted_file_mode=None, new_file_mode=None, copy_from=None, copy_to=None, rename_from=None, rename_to=None, similarity_index=None, dissimilarity_index=None, index_from_hash='f211439', index_to_hash='8209424', index_mode='100644'), svn_header=None), PatchedFile(source_file='diffparse/parser.py', target_file='diffparse/parser.py', source_timestamp=None, target_timestamp=None, git_header=GitHeader(source_file='diffparse/parser.py', target_file='diffparse/parser.py', old_mode=None, new_mode=None, deleted_file_mode=None, new_file_mode=None, copy_from=None, copy_to=None, rename_from=None, rename_to=None, similarity_index=None, dissimilarity_index=None, index_from_hash='007ebf6', index_to_hash='e899075', index_mode='100644'), svn_header=None)])
patched file: PatchedFile(source_file='diffparse/__main__.py', target_file='diffparse/__main__.py', source_timestamp=None, target_timestamp=None, git_header=GitHeader(source_file='diffparse/__main__.py', target_file='diffparse/__main__.py', old_mode=None, new_mode=None, deleted_file_mode=None, new_file_mode=None, copy_from=None, copy_to=None, rename_from=None, rename_to=None, similarity_index=None, dissimilarity_index=None, index_from_hash='f211439', index_to_hash='8209424', index_mode='100644'), svn_header=None)
hunk: Hunk(source_start=5, source_length=7, target_start=5, target_length=7, section='from . import parse_patch_sets')
line: Line(type=' ', value='', source_line=5, target_line=5)
line: Line(type=' ', value='def main():', source_line=6, target_line=6)
line: Line(type=' ', value='    with open(sys.argv[1]) as fp:', source_line=7, target_line=7)
line: Line(type='-', value='        for patch_set in parse_patch_sets(fp):', source_line=8, target_line=None)
line: Line(type='+', value='        for patch_set in parse_patch_sets(fp, allow_preamble=True):', source_line=None, target_line=8)
line: Line(type=' ', value="            print('patch set:', repr(patch_set))", source_line=9, target_line=9)
line: Line(type=' ', value='            for patched_file in patch_set.patched_files:', source_line=10, target_line=10)
line: Line(type=' ', value="                print('patched file:', repr(patched_file))", source_line=11, target_line=11)
patched file: PatchedFile(source_file='diffparse/parser.py', target_file='diffparse/parser.py', source_timestamp=None, target_timestamp=None, git_header=GitHeader(source_file='diffparse/parser.py', target_file='diffparse/parser.py', old_mode=None, new_mode=None, deleted_file_mode=None, new_file_mode=None, copy_from=None, copy_to=None, rename_from=None, rename_to=None, similarity_index=None, dissimilarity_index=None, index_from_hash='007ebf6', index_to_hash='e899075', index_mode='100644'), svn_header=None)
hunk: Hunk(source_start=408, source_length=6, target_start=408, target_length=14, section='class SubversionHeader(object):')
line: Line(type=' ', value='', source_line=408, target_line=408)
line: Line(type=' ', value='def parse_patch_sets(fp, allow_preamble=False):', source_line=409, target_line=409)
line: Line(type=' ', value='    it = PeekableFile(fp)', source_line=410, target_line=410)
line: Line(type='-', value='    while it.peek() is not None:', source_line=411, target_line=None)
line: Line(type='-', value='        patch_set = PatchSet._from_peekable(it, allow_preamble=allow_preamble)', source_line=412, target_line=None)
line: Line(type='-', value='        yield patch_set', source_line=413, target_line=None)
line: Line(type='+', value='    while True:', source_line=None, target_line=411)
line: Line(type='+', value='        next_line = it.peek()', source_line=None, target_line=412)
line: Line(type='+', value='        if next_line is None:', source_line=None, target_line=413)
line: Line(type='+', value='            break', source_line=None, target_line=414)
line: Line(type='+', value='        elif not next_line.strip():', source_line=None, target_line=415)
line: Line(type='+', value='            # Silently ignore leading and trailing white-space only lines.', source_line=None, target_line=416)
line: Line(type='+', value='            next(it)', source_line=None, target_line=417)
line: Line(type='+', value='        else:', source_line=None, target_line=418)
line: Line(type='+', value='            patch_set = PatchSet._from_peekable(', source_line=None, target_line=419)
line: Line(type='+', value='                it, allow_preamble=allow_preamble)', source_line=None, target_line=420)
line: Line(type='+', value='            yield patch_set', source_line=None, target_line=421)

Felixoid added a commit to Felixoid/python-unidiff that referenced this issue Dec 23, 2021
matiasb added a commit that referenced this issue Jan 18, 2022
Fix issue #19: empty new git file
webknjaz added a commit to sanitizers/chronographer-github-app that referenced this issue Apr 28, 2023
This is needed to fix a bug with it not noticing new empty files being
added. The fix first appears in v0.7.1[[1]] [[2]]

It was reported in the pip project[[3]].

[1]: matiasb/python-unidiff#19
[2]: matiasb/python-unidiff#86
[3]: pypa/pip#11969