-
-
Notifications
You must be signed in to change notification settings - Fork 704
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvements for the fsevents
module
#680
Conversation
Hello @CCP-Aporia, Thanks for the patch! Do you have some numbers before/after to show improvements? |
The pathological case which triggered this work includes recursively watching a directory with around 350.000 files in over 30.000 folders. Profiling revealed that the amount of system calls used to create a Benchmarking script: # coding=utf-8
import datetime
import logging
import os
import threading
import time
from watchdog.events import LoggingEventHandler
# from watchdog.observers.fsevents2 import FSEventsObserver2 as Observer
from watchdog.observers.fsevents import FSEventsObserver as Observer
# path to a folder which contains ~350.000 files in ~30.000 subfolders
PATH = '/tmp/testfolder'
def mod_func(path, stop_signal):
mod_file = os.path.join(path, 'filemod.test')
while not stop_signal.is_set():
with open(mod_file, 'w+') as fp:
now = datetime.datetime.utcnow()
fp.write("%s\n" % (now,))
logging.info('Wrote to %s at %s', mod_file, now)
time.sleep(5)
if __name__ == '__main__':
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(message)s', datefmt='%Y-%m-%d %H:%M:%S')
stop_signal = threading.Event()
modifier = threading.Thread(target=mod_func, args=(PATH, stop_signal))
observer = Observer()
handler = LoggingEventHandler()
observer.schedule(handler, PATH, recursive=True)
observer.start()
modifier.start()
try:
while True:
time.sleep(0.001)
except KeyboardInterrupt:
stop_signal.set()
observer.stop()
observer.join()
modifier.join() Running the above script with 2020-08-14 13:55:18 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 13:55:18.404196
2020-08-14 13:55:23 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 13:55:23.405711
2020-08-14 13:55:28 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 13:55:28.870788
2020-08-14 13:55:31 - Modified file: /tmp/testfolder/filemod.test
2020-08-14 13:55:33 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 13:55:33.876508
2020-08-14 13:55:38 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 13:55:38.880568
2020-08-14 13:55:43 - Modified file: /tmp/testfolder/filemod.test
2020-08-14 13:55:43 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 13:55:43.884855
2020-08-14 13:55:48 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 13:55:48.892458
2020-08-14 13:55:54 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 13:55:54.080669
2020-08-14 13:55:56 - Modified file: /tmp/testfolder/filemod.test
2020-08-14 13:55:59 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 13:55:59.093225
2020-08-14 13:56:04 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 13:56:04.101078
Snapshot saved to /Users/thomas/Library/Caches/JetBrains/PyCharm2020.2/snapshots/watchdog-benchmark-macos7.pstat With 2020-08-14 14:34:00 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 14:34:00.572773
2020-08-14 14:34:00 - Modified file: /tmp/testfolder/filemod.test
2020-08-14 14:34:05 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 14:34:05.574264
2020-08-14 14:34:05 - Modified file: /tmp/testfolder/filemod.test
2020-08-14 14:34:10 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 14:34:10.580673
2020-08-14 14:34:10 - Modified file: /tmp/testfolder/filemod.test
2020-08-14 14:34:15 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 14:34:15.588850
2020-08-14 14:34:15 - Modified file: /tmp/testfolder/filemod.test
Snapshot saved to /Users/thomas/Library/Caches/JetBrains/PyCharm2020.2/snapshots/watchdog-benchmark-macos8.pstat With 2020-08-14 14:38:49 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 14:38:49.821265
2020-08-14 14:38:49 - Modified file: /tmp/testfolder/filemod.test
2020-08-14 14:38:54 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 14:38:54.831356
2020-08-14 14:38:54 - Modified file: /tmp/testfolder/filemod.test
2020-08-14 14:38:59 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 14:38:59.839356
2020-08-14 14:38:59 - Modified file: /tmp/testfolder/filemod.test
2020-08-14 14:39:04 - Wrote to /tmp/testfolder/filemod.test at 2020-08-14 14:39:04.846402
2020-08-14 14:39:04 - Modified file: /tmp/testfolder/filemod.test
Snapshot saved to /Users/thomas/Library/Caches/JetBrains/PyCharm2020.2/snapshots/watchdog-benchmark-macos9.pstat |
@CCP-Aporia thank you for addressing this issue. We are currently also seeing these performance issues when using watchdog on OS X watching dirs of web projects (large node_modules dirs) with around 70.000 files. Investigations indicates that the DirectorySnapshot class is the bottleneck. @BoboTiG are there plans to include this PR in the next release? |
@CCP-Aporia do you mind adding a line in the changelog + your name in the "beloved contributors" (release 0.10.4 in the changelog)? I will then merge @ceelian :) |
Done. Feel free to let me know if there's anything else you'd like to see addressed. :-) |
Hmm, that's interesting. Are the Windows tests for Python 3.7 / 3.8 flaky? 🤔 I wouldn't know why they'd break for a changelog update in a branch that only affects macOS. |
It seems yes. Not a blocker here ;) |
Hm I am curious: what about TODOs you left? |
Those were copy/paste from https://github.com/gorakhargosh/watchdog/blob/master/src/watchdog/observers/fsevents2.py#L210 because I needed the same logic. I've a post-it note on my desk to follow up with some more code improvements, and that's one of the things I wanted to check out when I get around to spend some more time on this. :-) |
Great, go ahead I would love to merge more improvements ;) |
@CCP-Aporia Unfortunately for me it doesn't work reliable. It works blazingly fast (almost no feeling of delay compared to seconds of delay with the normal implementation with around 50.000 files in node_modules) but after 3 or 4 modified events it completely stops emitting events, doesn't matter which file system action I do. Can't see any errors. My System: macOS Catalina (10.15.7 (19H2)) |
Hi @ceelian, That's interesting, we haven't experienced the kind of problem you mention with our 350.000 files use case. Can you maybe create an issue with some more details, maybe even code that exposes the problem? For example, does the benchmark code I posted in #680 (comment) cause the problem as well? |
I would like to release a new version of the module, could you give more details @ceelian? I there is no regression from that PR, I will consider a new bug and a new issue should be opened then. |
If no one else experiences problems it is probably related to my mac environment. I wouldn't hold back the new release just because of my experience because I havn't tested it on other macs beside my main development macbook, so it could be something related to my system environment which then has nothing todo with this PR. I additionally went through your PR line by line and I also couldn't find any sign for a logic error or mistake which could cause that issue. I will further investigate but I am currently limited a bit in time for testing this out on multiple machines. So again thanks for the PR, it is a huge performance improvement on macOS. |
v0.10.4 of the "watchdog" package had an extensive update to the "fsevent" observer, which appears to have broken functionality the Plastron InboxWatcher relies on (gorakhargosh/watchdog#680) The problem still exists in the latest current watchdog version (v1.0.2), so pinning the "watchdog" package to v0.10.3 is the simplest fix for now. The "tests/stomp/test_inbox_watcher.py" passed when using v0.10.3, and fails in v0.10.4 and later, so it can be used to gauge whether "watchdog" is performing as expected by Plastron. https://issues.umd.edu/browse/LIBFCREPO-944
v0.10.4 of the "watchdog" package had an extensive update to the "fsevent" observer, which appears to have broken functionality the Plastron InboxWatcher relies on (gorakhargosh/watchdog#680) The problem still exists in the latest current watchdog version (v1.0.2), so pinning the "watchdog" package to v0.10.3 is the simplest fix for now. The "tests/stomp/test_inbox_watcher.py" passed when using v0.10.3, and fails in v0.10.4 and later, so it can be used to gauge whether "watchdog" is performing as expected by Plastron. https://issues.umd.edu/browse/LIBFCREPO-944
v0.10.4 of the "watchdog" package had an extensive update to the "fsevent" observer, which appears to have broken functionality the Plastron InboxWatcher relies on (gorakhargosh/watchdog#680) The problem still exists in the latest current watchdog version (v1.0.2), so pinning the "watchdog" package to v0.10.3 is the simplest fix for now. The "tests/stomp/test_inbox_watcher.py" passed when using v0.10.3, and fails in v0.10.4 and later, so it can be used to gauge whether "watchdog" is performing as expected by Plastron. https://issues.umd.edu/browse/LIBFCREPO-944
v0.10.4 of the "watchdog" package had an extensive update to the "fsevent" observer, which appears to have broken functionality the Plastron InboxWatcher relies on (gorakhargosh/watchdog#680) The problem still exists in the latest current watchdog version (v1.0.2), so pinning the "watchdog" package to v0.10.3 is the simplest fix for now. In gorakhargosh/watchdog#729 it is noted that the watchdog MacOS tests are failing, and that there are known issues with fsevent. The "tests/stomp/test_inbox_watcher.py" passed when using v0.10.3, and fails in v0.10.4 and later, so it can be used to gauge whether "watchdog" is performing as expected by Plastron. https://issues.umd.edu/browse/LIBFCREPO-944
v0.10.4 of the "watchdog" package had an extensive update to the "fsevent" observer, which appears to have broken functionality the Plastron InboxWatcher relies on (gorakhargosh/watchdog#680). In gorakhargosh/watchdog#729 it is noted hat the watchdog MacOS tests are failing, and that there are known issues with fsevent. The problem still exists in the latest current watchdog version (v1.0.2), so pinning the "watchdog" package to v0.10.3 is the simplest fix for now. The "tests/stomp/test_inbox_watcher.py" passed when using v0.10.3, and fails in v0.10.4 and later, so it can be used to gauge whether "watchdog" is performing as expected by Plastron. https://issues.umd.edu/browse/LIBFCREPO-944
This PR brings the
fsevents
module on par withfsevents2
when it comes to performance on macOS. Effectively it's not much more than exposing theNativeEvent
structure fromfsevents2
via the_watchdog_fsevents
C extension so that theDirectorySnapshot
utility is no longer.There is more room for improvements in the overall structure of the
fsevents
module, but I wanted to put this up for discussion here first as issue #623 makes it sound like this module would become obsolete.