Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix tests on macOS #729

Closed
wants to merge 7 commits into from
Closed

Conversation

samschott
Copy link
Contributor

@samschott samschott commented Dec 10, 2020

This PR fixes the tests on macOS. Closes #546. Notably:

  • Wait for longer to flush old events from stream before recording.
  • Skip testing for nested MovedEvents: those are currently not generated by fsevents and there is an in-line TODO.
  • Other small tweaks.

Most of the tests will pass after all the other PRs for macOS have been merged. One exception to this is the test_fast_subdirectory_creation_deletion test. This is problematic because the native events, while having the correct total count, all have both the is_created and is_removed flags set. I haven't figured out yet if this is a limitation of FSEvents or an error in watchdog.

@CCP-Aporia
Copy link
Contributor

Thanks a lot!

@samschott
Copy link
Contributor Author

I am happy to help.watchdog is a great package and AFAIK the only Python package for cross-platform file event notifications.

That being said, all of those checks and fixes really should have been part of the review process for #680. Writing a complete overhaul of the macOS backend without detailed manual or automated testing was just asking for trouble...

@CCP-Aporia
Copy link
Contributor

CCP-Aporia commented Dec 11, 2020

Yeah, apologies for the mess that ensued afterwards. I really should have made sure that the automated tests work first to spot many of the issues that cropped up. Manual testing is way too unreliable when it needs to be performed multiple times against different co-existing versions on the same machine.

But that aside, regarding this point:

This is problematic because the native events, while having the correct total count, all have both the is_created and is_removed flags set. I haven't figured out yet if this is a limitation of FSEvents or an error in watchdog.

FSEvents does "temporal coalescing" of events when possible, but there also appears to be a problem with watchdog's C extension. The timeframe within which "temporal coalescing" appears is controlled by the latency parameter that we're passing into FSEventStreamCreate, and it should be rarely happening since we specify 1 millisecond for it.
But there are also problems with the way that the C extension currently works that I'm looking into at the moment. The most famous one shows by the test requiring a time.sleep statement to flush out events we didn't want to be notified about. 🙂 That behaviour happens because with the current C extension code FSEvents will start observing from the time the watch is created, not from the time that we start the observer.

@BoboTiG
Copy link
Collaborator

BoboTiG commented Dec 11, 2020

The good thing is: you are now both working on improving the macOS support reliably, so thank you both very much ❤️

@samschott
Copy link
Contributor Author

samschott commented Dec 11, 2020

No worries, your quick reaction time makes working on fixes very easy :)

I am still confused about the temporal coalescing of events. The test_fast_subdirectory_creation_deletion test runs 30 creations and deletions without any pause in between, and indeed 60 events are reported by FSEvents. This suggests that no coalescing takes place while the fact that both is_created and is_deleted flags are set suggests that events were combined.

Is setting the latency to 0 feasible? Or a really bad idea?

Edit: Correct that, there are only 30 reported events, even when setting the latency to 0. We could queue both created and deleted events in the emitter for such combined native events but the trouble is determining the correct order. A call to os.path.exists may be flaky because the item at the path may have changed since the event was recorded.

@samschott
Copy link
Contributor Author

@CCP-Aporia, yes, that flush has been confusing me. Any idea why those historic events are reported? I have not encountered this with manual testing, only from the test suite.

@CCP-Aporia
Copy link
Contributor

CCP-Aporia commented Dec 11, 2020

Presumably they are reported in the test suite because the time between the filesystem operations and adding the watch is incredibly short - fseventsd is a separate process, so a bit of a race condition should be expected here. First experiments with a time.sleep(0.1) before creating the emitter in the tests look promising in avoiding the old events.

@CCP-Aporia
Copy link
Contributor

CCP-Aporia commented Dec 12, 2020

So, I've done some more experimenting and noticed that the tests kept failing even with the time.sleep(0.1), but in general because of events getting coalesced, and not because of old events still being in the event queue. For example, test_emitter.test_delete kept failing for me because a single NativeEvent with is_removed and is_created being True was received by the C extension. That then got translated to a FileCreatedEvent because of our translation logic in fsevents.py. So I did some digging, and after lots of googling I arrived at this old ArsTechnica article which says:

The fseventsd log files are written in a compressed binary format. Since only per-directory changes are kept, multiple changes to the same directory occurring within 30 seconds of each other are coalesced into a single event in the log file. The upshot is that, even when running a disk-thrashing server-type workload for 24 hours straight, the fseventsd log files will only grow by a megabyte or two per day. Normal usage will produce a small fraction of that.

Now, that 30 second interval doesn't appear to be entirely true anymore. And fseventsd has seen more improvements since macOS 10.5, most prominently the per-file notifications in 10.7. But the more I read about fsevents, the more I'm inclined to think that my work in #680 was misguided, and we'll still need to do the occasional system call to decide what really happened. For example, given above NativeEvent with is_created and is_removed being True then we'd have to do something like an os.path.exists(event.src_path) to figure out the order of those two coalesced events.

edit: This also means that the fsevents2.py approach that I based #680 on is not going to work as intended, and will need to do these system calls, too.

@samschott
Copy link
Contributor Author

Oh wow, that article is a nice find!

Yes, it does looks like we will need an os.path.exists call to determine the order of those coalesced events. This increasingly points towards a chain of properly sorted if-clauses instead of if-else clauses. For instance, if a modified and a created / deleted event are combined, we should also emit the modified event.

We will also need to generate a tree of events for moved parent folders, similar to what is done on the inotify backend with generate_sub_moved_events.

I am not a huge fan of all of this but it looks like we have no choice. It's certainly better than going back to snapshot diffs. Still, it should probably be done in separate PRs. And maybe after this PR is merged. It's always good to have the tests as a reference point and then work towards making them pass. What do you think?

Finally, it looks like we will still need a long waiting time to flush old events since we really don't want to deal with those leaking into the tests in unpredictable ways.

@CCP-Aporia
Copy link
Contributor

Indeed, there are a few things that we need to implement in order to provide a correct FSEvents based observer.

But also, since I assume that we want to have the same behaviour across all operating systems: What do we consider as expected behaviour? The emitter tests contain a bunch of special casing for different operating systems, which makes it difficult to provide consistent behaviour across different platforms. Of course, there are inherent differences between the platforms, but we might want to provide a base line somehow - otherwise the event translation is a bit of a mood point as users of this library would still need to write platform-specific code for a lot of cases.

@samschott
Copy link
Contributor Author

I agree, uniform behaviour across all platforms would be ideal. If this is impossible, a common and documented baseline is the next best thing.

But before we achieve any of that, the more immediate goal to me is a usable emitter on macOS (not crashing, not emitting only half of the events...) that provides at least the granularity of events as the last working release. For this, my aim was to take the tests as they currently are ("expected behaviour"), assuming that any user-land code is already dealing with existing differences between platforms, and ensure that they pass reliably.

IMO, the larger goal of unifying the emitted events across platforms is a breaking change and therefore something for a 2.0 release.

@CCP-Aporia
Copy link
Contributor

One thing that crossed my mind regarding the time.sleep() that we're doing on macOS: A less flaky approach would be to simply wait for the events caused by the directory and file operations that we perform for setting up and tearing down a test. Well, the tear down ones aren't quite that important since we shouldn't be re-using the same path for two different tests. But waiting for the setup notifications is certainly useful to avoid race conditions where we receive some old notifications within a single test case.

Additionally increasing the timeout for the event_queue.get() calls to something like 35 seconds might be a good idea, in case that fseventsd really does take 30 seconds to send the notification. :-)

@BoboTiG
Copy link
Collaborator

BoboTiG commented Dec 18, 2020

Sorry guys, I did the 1.0.2 release. I was trying to add wheels to the current release and it seems the master branch was used ... So let's rebase your work on master.

dsteelma-umd added a commit to dsteelma-umd/plastron that referenced this pull request Jan 14, 2021
v0.10.4 of the "watchdog" package had an extensive update to the
"fsevent" observer, which appears to have broken functionality the
Plastron InboxWatcher relies on
(gorakhargosh/watchdog#680)

The problem still exists in the latest current watchdog version
(v1.0.2), so pinning the "watchdog" package to v0.10.3 is the simplest
fix for now.

In gorakhargosh/watchdog#729 it is noted
that the watchdog MacOS tests are failing, and that there are
known issues with fsevent.

The "tests/stomp/test_inbox_watcher.py" passed when using v0.10.3, and
fails in v0.10.4 and later, so it can be used to gauge whether
"watchdog" is performing as expected by Plastron.

https://issues.umd.edu/browse/LIBFCREPO-944
dsteelma-umd added a commit to dsteelma-umd/plastron that referenced this pull request Jan 14, 2021
v0.10.4 of the "watchdog" package had an extensive update to the
"fsevent" observer, which appears to have broken functionality the
Plastron InboxWatcher relies on
(gorakhargosh/watchdog#680).

In gorakhargosh/watchdog#729 it is noted
hat the watchdog MacOS tests are failing, and that there are
known issues with fsevent.

The problem still exists in the latest current watchdog version
(v1.0.2), so pinning the "watchdog" package to v0.10.3 is the simplest
fix for now.

The "tests/stomp/test_inbox_watcher.py" passed when using v0.10.3, and
fails in v0.10.4 and later, so it can be used to gauge whether
"watchdog" is performing as expected by Plastron.

https://issues.umd.edu/browse/LIBFCREPO-944
@BoboTiG
Copy link
Collaborator

BoboTiG commented Jan 19, 2021

Closed by #734.

@BoboTiG BoboTiG closed this Jan 19, 2021
@samschott samschott deleted the macos-test-fixes branch January 20, 2021 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix test_emitter.py on macOS
3 participants