Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with parsing file in-progress #26

Open
heatherspacek opened this issue Dec 15, 2020 · 6 comments · May be fixed by #46
Open

Issues with parsing file in-progress #26

heatherspacek opened this issue Dec 15, 2020 · 6 comments · May be fixed by #46

Comments

@heatherspacek
Copy link

Hi,

I'm trying apply parse() to a tail -f stream from a slippi file in the process of being written (i.e. a game in progress). My relevant python snippet reads as follows (Windows 10, Python 3.9, unix tail via cygwin):

p = subprocess.Popen(["cmd", "/c", "tail -c+1 -f %s" % fullpath], stdout=subprocess.PIPE)

parse(p.stdout, handlers)

It appears that the portion of "parse()" that measures how long the Raw element is, is misestimating because the file has just begun being written? I've attached a screenshot of the error thrown - the parser is looking for the metadata element (which, i believe, hasnt been written yet) and is looking very close to the beginning of the file (see the correspondence between the "bytes found" and the highlighted bytes in a hex editor).

Screenshot 2020-12-15 013054
Screenshot 2020-12-15 013127

Please also let me know if I've set it up wrong, or if there's a convenient workaround here!
Thanks in advance.

@hohav
Copy link
Owner

hohav commented Dec 15, 2020

Thanks for the report. Are you on latest master? And does this happen with all replays?

Parse error (3 @0x-1fca) is very weird. There should be a meaningful name in place of the 3, and 0x-1fca is a negative file position (it's formatted weirdly since the code assumes a positive value).

But if the parser is expecting to see the metadata block, that means it thinks it has reached the end of the raw block. That can happen either due to (1) reading the amount of bytes specified in the raw UBJSON element length (the four bytes after $U#l), or else (2) reading a Game End event.

For an in-progress replay, Slippi initially sets the raw length to zero (since its size is not known in advance). That disables py-slippi's check for condition (1), so I'd normally suspect a premature Game End event. But there's enough weirdness going on here that I can't say that with any confidence.

It's hard for me to test without a Windows box, but I'll investigate more when I have more time.

@yael333
Copy link

yael333 commented Jan 15, 2021

I'm not involved by any means in any Slippi development but I heard that they made the writing to a replay file asynchronous meaning real time data might come in a different order. They are now developing the broadcast mode which enables reading real time data more cleanly but didn't look into that yet.

Maybe you should look into that :3

@dawsonbooth
Copy link
Collaborator

dawsonbooth commented Jan 19, 2021

I thought it would help if I added that I get a similar error when parsing a file. Here's the traceback:

>>> from slippi import Game
>>> Game("D:\\OneDrive\\Gaming\\Melee\\Slippi\\Error\\Game_20200615T235839.slp")
Traceback (most recent call last):
  File "C:\Users\micro\AppData\Local\pypoetry\Cache\virtualenvs\treefrog-jCddXGjI-py3.7\lib\site-packages\slippi\parse.py", line 212, in _parse_try
    _parse(input, handlers)
  File "C:\Users\micro\AppData\Local\pypoetry\Cache\virtualenvs\treefrog-jCddXGjI-py3.7\lib\site-packages\slippi\parse.py", line 193, in _parse    
    expect_bytes(b'U\x08metadata', stream)
  File "C:\Users\micro\AppData\Local\pypoetry\Cache\virtualenvs\treefrog-jCddXGjI-py3.7\lib\site-packages\slippi\util.py", line 55, in expect_bytes
    raise Exception(f'expected {expected_bytes}, but got: {read_bytes}')
Exception: expected b'U\x08metadata', but got: b'6\x02\x00\x01\x002\x01\x86L\xc3'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\micro\AppData\Local\pypoetry\Cache\virtualenvs\treefrog-jCddXGjI-py3.7\lib\site-packages\slippi\game.py", line 36, in __init__
    ParseEvent.METADATA_RAW: lambda x: setattr(self, 'metadata_raw', x)})
  File "C:\Users\micro\AppData\Local\pypoetry\Cache\virtualenvs\treefrog-jCddXGjI-py3.7\lib\site-packages\slippi\parse.py", line 241, in parse
    _parse_open(pathlib.Path(input), handlers)
  File "C:\Users\micro\AppData\Local\pypoetry\Cache\virtualenvs\treefrog-jCddXGjI-py3.7\lib\site-packages\slippi\parse.py", line 231, in _parse_open
    _parse_try(f, handlers)
  File "C:\Users\micro\AppData\Local\pypoetry\Cache\virtualenvs\treefrog-jCddXGjI-py3.7\lib\site-packages\slippi\parse.py", line 226, in _parse_try
    raise e
slippi.parse.ParseError: Parse error (D:\OneDrive\Gaming\Melee\Slippi\Error\Game_20200615T235839.slp @0x27): [Errno None] None: 'D:\\OneDrive\\Gaming\\Melee\\Slippi\\Error\\Game_20200615T235839.slp'

It's worth noting that this is a pretty old game (June 2020), but other games around the same time work. In case it helps, here is a .zip with an example of either case.

@Samoe97
Copy link

Samoe97 commented Dec 23, 2022

Hi there, just writing to say I'm experiencing the same problem with in-progress replay files. Exact same exception and error code with Exception: expected b'U\x08metadata', but got: b'6\x02\x00\x01\x002\x01\x86L\xc3'

I'm not experienced enough to solve this myself, but could it be that the slp format has changed substantially enough to break the parser's logic? It works well on completed replay files, I don't have any issues there, but I haven't successfully been able to use the Event Parser to read any in-progress replays.

This is the code that gets me these results:

directory = '/tests/Game_20221223T125106.slp'

from slippi.parse import parse
from slippi.parse import ParseEvent
handlers = {ParseEvent.START: print}
with open(directory, 'rb') as f :
    parse(f.raw, handlers)

I've tested it now with multiple in-progress files and used every ParseEvent type and it hasn't worked.

@Kered13
Copy link

Kered13 commented Jan 24, 2023

I happened to noticed this bug in the code while referencing PySlippi for my own C++ implementation. It's actually quite simple. For in-progress replays, the length field of raw is set to 0. In the _parse_events() loop, total_size == 0 is used to determine if a replay is in-progress. However _parse() passes length - bytesRead to _parse_events(), where bytesRead is the size of the event payloads block, always positive. Therefore for an in-progress replay _parse_events() is receiving a negative value for the total_size parameter.

The error message is a result of _parse_events() returning immediately because it thinks there are no more events to read, therefore _parse() tries to start reading the metadata section, but reads the game event instead.

The simplest fix is to change parse.py:124 to use total_size < 0 instead of total_size == 0. Another possible fix is to pass in-progress status explicitly as a boolean parameter.

It would probably be good to have a replay file for unit tests with 0 raw length to test the behavior of streaming replays as well. I have attached such a replay file. It is identical to game.slp, except that the raw length field has been modified to 0. I believe it will fail parsing on PySlippi, though I have not tested it (I tested it on my C++ implementation).

streaming_game.zip

@NickCondron NickCondron linked a pull request Jan 25, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants