-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
replay a GitHub event locally at a bot instance #178
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks already quite nice. The PR does more than what #94 asked for. #94 just suggested to have a means to replay an event locally (where a bot instance is running). A couple of suggestions:
- remove the ability to send the
replay
command from GitHub (it will be impractical to use or worse lead to unintended consequences) - think of improving the efficiency, as of now
os.walk()
will traverse the whole directory of the bot instance and access 100k or even millions of entries- one could/should make use of the directory structure starting at
events_log
- one could also think of improving PyGHee such that it creates symlinks, eg,
event_id
-> directory that contains the header & body json- thus there need to be no lookup/traversal of the file system
- one could/should make use of the directory structure starting at
- there could be a new small script, say
replay_event.py
that imports thereplay_event
method- this probably requires to move the core of the method
replay_event
out of the EESSIBotEventHandler class to another module, saytools/replay_events.py
- thus we could have a tool that replays an event and if we later decide to use that from the bot (event handler) we can just reuse it
- this probably requires to move the core of the method
- a follow-up PR could improve the replay_event tool to limit the effects of replaying an event, for example, only replay it for certain architectures or job ids
- this might use code developed for support for sending commands to bot instances via PR comments #172
eessi_bot_event_handler.py
Outdated
if comment_txt.lower()[:6] == "replay": | ||
self.replay_event(comment_txt[7:]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to remove the ability to trigger the replay from GitHub. #94 only intended to replay an event locally, that is, where a bot instance is running. Someone who only has access to GitHub may also have difficulties to know/obtain the ID of an event.
eessi_bot_event_handler.py
Outdated
""" | ||
event = namedtuple('Request', ['headers', 'json', 'data']) | ||
|
||
for (dir, _subdirs, files) in os.walk("."): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This traverses the whole directory tree under .
. Benefit would be that even if the directory structure changes, it could still find the event. Disadvantage could be that it does not find the event because they are not stored under .
(it's just a relative reference to the file system) and it checks many files/directories.
Could this be made more efficient and explicitly use the directory that stores event information?
.
could be replaced with what app.cfg
defines for the directory that contains jobs OR the directory PyGHee uses to log events. For example, for the current NESSI bot instances on Fram, eX3 and AWS, there are 15159, 15161 and 15129 entries under events_log
, while under .
there are 49037, 53289, 284738 entries, respectively.
eessi_bot_event_handler.py
Outdated
if any([event_id in file for file in files]): | ||
with open(f"{dir}/{event_id}_headers.json", 'r') as jf: | ||
headers = json.load(jf) | ||
event.headers = CaseInsensitiveDict(headers) | ||
with open(f"{dir}/{event_id}_body.json", 'r') as jf: | ||
body = json.load(jf) | ||
event.json = body | ||
|
||
event_info = get_event_info(event) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very good. Clean and concise. No immediate requests for changes. Maybe need to be updated if the os.walk()
is replaced with something more efficient.
Closing this one. We may revisit it at a later time. |
Fixes #94
PR for being able to replay events stored in the file system.
Triggered by commenting "replay " on the pr (Your GitHub app must subscribe to the event "Issue comment")
There were already some support for replaying events (see line 200 in
eessi_bot_event_handler.py
), but this is outdated and would not work even if the specified file contained both the header and body (PyGHee logs events as separate files for body and header). Should I change the code here to utilise my newly implemented function?