Fix tests on macOS #729

samschott · 2020-12-10T21:20:44Z

This PR fixes the tests on macOS. Closes #546. Notably:

Wait for longer to flush old events from stream before recording.
Skip testing for nested MovedEvents: those are currently not generated by fsevents and there is an in-line TODO.
Other small tweaks.

Most of the tests will pass after all the other PRs for macOS have been merged. One exception to this is the test_fast_subdirectory_creation_deletion test. This is problematic because the native events, while having the correct total count, all have both the is_created and is_removed flags set. I haven't figured out yet if this is a limitation of FSEvents or an error in watchdog.

watches are always recursive

CCP-Aporia · 2020-12-10T21:29:29Z

Thanks a lot!

samschott · 2020-12-10T22:38:09Z

I am happy to help.watchdog is a great package and AFAIK the only Python package for cross-platform file event notifications.

That being said, all of those checks and fixes really should have been part of the review process for #680. Writing a complete overhaul of the macOS backend without detailed manual or automated testing was just asking for trouble...

CCP-Aporia · 2020-12-11T11:33:09Z

Yeah, apologies for the mess that ensued afterwards. I really should have made sure that the automated tests work first to spot many of the issues that cropped up. Manual testing is way too unreliable when it needs to be performed multiple times against different co-existing versions on the same machine.

But that aside, regarding this point:

This is problematic because the native events, while having the correct total count, all have both the is_created and is_removed flags set. I haven't figured out yet if this is a limitation of FSEvents or an error in watchdog.

FSEvents does "temporal coalescing" of events when possible, but there also appears to be a problem with watchdog's C extension. The timeframe within which "temporal coalescing" appears is controlled by the latency parameter that we're passing into FSEventStreamCreate, and it should be rarely happening since we specify 1 millisecond for it.
But there are also problems with the way that the C extension currently works that I'm looking into at the moment. The most famous one shows by the test requiring a time.sleep statement to flush out events we didn't want to be notified about. 🙂 That behaviour happens because with the current C extension code FSEvents will start observing from the time the watch is created, not from the time that we start the observer.

BoboTiG · 2020-12-11T11:42:25Z

The good thing is: you are now both working on improving the macOS support reliably, so thank you both very much ❤️

samschott · 2020-12-11T12:05:26Z

No worries, your quick reaction time makes working on fixes very easy :)

I am still confused about the temporal coalescing of events. The test_fast_subdirectory_creation_deletion test runs 30 creations and deletions without any pause in between, and indeed 60 events are reported by FSEvents. This suggests that no coalescing takes place while the fact that both is_created and is_deleted flags are set suggests that events were combined.

Is setting the latency to 0 feasible? Or a really bad idea?

Edit: Correct that, there are only 30 reported events, even when setting the latency to 0. We could queue both created and deleted events in the emitter for such combined native events but the trouble is determining the correct order. A call to os.path.exists may be flaky because the item at the path may have changed since the event was recorded.

samschott · 2020-12-11T12:08:03Z

@CCP-Aporia, yes, that flush has been confusing me. Any idea why those historic events are reported? I have not encountered this with manual testing, only from the test suite.

CCP-Aporia · 2020-12-11T13:21:20Z

Presumably they are reported in the test suite because the time between the filesystem operations and adding the watch is incredibly short - fseventsd is a separate process, so a bit of a race condition should be expected here. First experiments with a time.sleep(0.1) before creating the emitter in the tests look promising in avoiding the old events.

CCP-Aporia · 2020-12-12T15:54:29Z

So, I've done some more experimenting and noticed that the tests kept failing even with the time.sleep(0.1), but in general because of events getting coalesced, and not because of old events still being in the event queue. For example, test_emitter.test_delete kept failing for me because a single NativeEvent with is_removed and is_created being True was received by the C extension. That then got translated to a FileCreatedEvent because of our translation logic in fsevents.py. So I did some digging, and after lots of googling I arrived at this old ArsTechnica article which says:

The fseventsd log files are written in a compressed binary format. Since only per-directory changes are kept, multiple changes to the same directory occurring within 30 seconds of each other are coalesced into a single event in the log file. The upshot is that, even when running a disk-thrashing server-type workload for 24 hours straight, the fseventsd log files will only grow by a megabyte or two per day. Normal usage will produce a small fraction of that.

Now, that 30 second interval doesn't appear to be entirely true anymore. And fseventsd has seen more improvements since macOS 10.5, most prominently the per-file notifications in 10.7. But the more I read about fsevents, the more I'm inclined to think that my work in #680 was misguided, and we'll still need to do the occasional system call to decide what really happened. For example, given above NativeEvent with is_created and is_removed being True then we'd have to do something like an os.path.exists(event.src_path) to figure out the order of those two coalesced events.

edit: This also means that the fsevents2.py approach that I based #680 on is not going to work as intended, and will need to do these system calls, too.

samschott · 2020-12-12T22:55:09Z

Oh wow, that article is a nice find!

Yes, it does looks like we will need an os.path.exists call to determine the order of those coalesced events. This increasingly points towards a chain of properly sorted if-clauses instead of if-else clauses. For instance, if a modified and a created / deleted event are combined, we should also emit the modified event.

We will also need to generate a tree of events for moved parent folders, similar to what is done on the inotify backend with generate_sub_moved_events.

I am not a huge fan of all of this but it looks like we have no choice. It's certainly better than going back to snapshot diffs. Still, it should probably be done in separate PRs. And maybe after this PR is merged. It's always good to have the tests as a reference point and then work towards making them pass. What do you think?

Finally, it looks like we will still need a long waiting time to flush old events since we really don't want to deal with those leaking into the tests in unpredictable ways.

CCP-Aporia · 2020-12-13T18:56:46Z

Indeed, there are a few things that we need to implement in order to provide a correct FSEvents based observer.

But also, since I assume that we want to have the same behaviour across all operating systems: What do we consider as expected behaviour? The emitter tests contain a bunch of special casing for different operating systems, which makes it difficult to provide consistent behaviour across different platforms. Of course, there are inherent differences between the platforms, but we might want to provide a base line somehow - otherwise the event translation is a bit of a mood point as users of this library would still need to write platform-specific code for a lot of cases.

samschott · 2020-12-13T19:18:47Z

I agree, uniform behaviour across all platforms would be ideal. If this is impossible, a common and documented baseline is the next best thing.

But before we achieve any of that, the more immediate goal to me is a usable emitter on macOS (not crashing, not emitting only half of the events...) that provides at least the granularity of events as the last working release. For this, my aim was to take the tests as they currently are ("expected behaviour"), assuming that any user-land code is already dealing with existing differences between platforms, and ensure that they pass reliably.

IMO, the larger goal of unifying the emitted events across platforms is a breaking change and therefore something for a 2.0 release.

CCP-Aporia · 2020-12-15T17:31:53Z

One thing that crossed my mind regarding the time.sleep() that we're doing on macOS: A less flaky approach would be to simply wait for the events caused by the directory and file operations that we perform for setting up and tearing down a test. Well, the tear down ones aren't quite that important since we shouldn't be re-using the same path for two different tests. But waiting for the setup notifications is certainly useful to avoid race conditions where we receive some old notifications within a single test case.

Additionally increasing the timeout for the event_queue.get() calls to something like 35 seconds might be a good idea, in case that fseventsd really does take 30 seconds to send the notification. :-)

BoboTiG · 2020-12-18T11:41:33Z

Sorry guys, I did the 1.0.2 release. I was trying to add wheels to the current release and it seems the master branch was used ... So let's rebase your work on master.

v0.10.4 of the "watchdog" package had an extensive update to the "fsevent" observer, which appears to have broken functionality the Plastron InboxWatcher relies on (gorakhargosh/watchdog#680) The problem still exists in the latest current watchdog version (v1.0.2), so pinning the "watchdog" package to v0.10.3 is the simplest fix for now. In gorakhargosh/watchdog#729 it is noted that the watchdog MacOS tests are failing, and that there are known issues with fsevent. The "tests/stomp/test_inbox_watcher.py" passed when using v0.10.3, and fails in v0.10.4 and later, so it can be used to gauge whether "watchdog" is performing as expected by Plastron. https://issues.umd.edu/browse/LIBFCREPO-944

v0.10.4 of the "watchdog" package had an extensive update to the "fsevent" observer, which appears to have broken functionality the Plastron InboxWatcher relies on (gorakhargosh/watchdog#680). In gorakhargosh/watchdog#729 it is noted hat the watchdog MacOS tests are failing, and that there are known issues with fsevent. The problem still exists in the latest current watchdog version (v1.0.2), so pinning the "watchdog" package to v0.10.3 is the simplest fix for now. The "tests/stomp/test_inbox_watcher.py" passed when using v0.10.3, and fails in v0.10.4 and later, so it can be used to gauge whether "watchdog" is performing as expected by Plastron. https://issues.umd.edu/browse/LIBFCREPO-944

BoboTiG · 2021-01-19T11:54:31Z

Closed by #734.

Sam Schott added 6 commits December 10, 2020 21:14

fix pruning old events on macos

2576fcd

remove unneeded platform check in test_create_wrong_encoding

898c2d2

increase flush time

f8224f2

fix expected event on test_delete_self

9cd6d4f

skip recursive-off test on macOS

b70643e

watches are always recursive

don't expect nested moves on macOS

8d989c0

BoboTiG added bug fsevents labels Dec 11, 2020

removed unused pdb import

e652da0

CCP-Aporia mentioned this pull request Dec 13, 2020

Support coalesced fsevents #734

Merged

dsteelma-umd mentioned this pull request Jan 14, 2021

LIBFCREPO-944. Pinned "watchdog" package to v0.10.3 umd-lib/plastron#149

Merged

BoboTiG closed this Jan 19, 2021

samschott deleted the macos-test-fixes branch January 20, 2021 16:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix tests on macOS #729

Fix tests on macOS #729

samschott commented Dec 10, 2020 •

edited

Loading

CCP-Aporia commented Dec 10, 2020

samschott commented Dec 10, 2020

CCP-Aporia commented Dec 11, 2020 •

edited

Loading

BoboTiG commented Dec 11, 2020 •

edited

Loading

samschott commented Dec 11, 2020 •

edited

Loading

samschott commented Dec 11, 2020

CCP-Aporia commented Dec 11, 2020 •

edited

Loading

CCP-Aporia commented Dec 12, 2020 •

edited

Loading

samschott commented Dec 12, 2020

CCP-Aporia commented Dec 13, 2020

samschott commented Dec 13, 2020

CCP-Aporia commented Dec 15, 2020

BoboTiG commented Dec 18, 2020 •

edited

Loading

BoboTiG commented Jan 19, 2021

Fix tests on macOS #729

Fix tests on macOS #729

Conversation

samschott commented Dec 10, 2020 • edited Loading

CCP-Aporia commented Dec 10, 2020

samschott commented Dec 10, 2020

CCP-Aporia commented Dec 11, 2020 • edited Loading

BoboTiG commented Dec 11, 2020 • edited Loading

samschott commented Dec 11, 2020 • edited Loading

samschott commented Dec 11, 2020

CCP-Aporia commented Dec 11, 2020 • edited Loading

CCP-Aporia commented Dec 12, 2020 • edited Loading

samschott commented Dec 12, 2020

CCP-Aporia commented Dec 13, 2020

samschott commented Dec 13, 2020

CCP-Aporia commented Dec 15, 2020

BoboTiG commented Dec 18, 2020 • edited Loading

BoboTiG commented Jan 19, 2021

samschott commented Dec 10, 2020 •

edited

Loading

CCP-Aporia commented Dec 11, 2020 •

edited

Loading

BoboTiG commented Dec 11, 2020 •

edited

Loading

samschott commented Dec 11, 2020 •

edited

Loading

CCP-Aporia commented Dec 11, 2020 •

edited

Loading

CCP-Aporia commented Dec 12, 2020 •

edited

Loading

BoboTiG commented Dec 18, 2020 •

edited

Loading