Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support coalesced fsevents #734

Merged
merged 17 commits into from
Jan 19, 2021

Conversation

CCP-Aporia
Copy link
Contributor

@CCP-Aporia CCP-Aporia commented Dec 13, 2020

Starting this as a draft PR to get some early feedback.

The general idea is to expose an is_coalesced property on NativeEvent instances. This allows the NativeEvent -> FileSystemEvent conversion to perform additional checks on NativeEvent.path if necessary. With these additional checks the NativeEvent -> FileSystemEvent mechanism then has a chance to execute "a chain of properly sorted if-clauses instead of if-else clauses".

Do note that in case of coalesced events it is impossible to know how many times NativeEvent.path was created, removed or renamed. It is only possible to know that such an event occurred at least once within an fseventsd update interval.

@samschott - please let me know if you think the list of "coalesced event" bit masks is sufficient or not. As far as I can imagine right now then the tricky bits with coalesced events are around creation, removal, and rename of an item in an observed path.

So that we can effectively decide if we need to perform additional
system calls to figure out what really happened.
@samschott
Copy link
Contributor

samschott commented Dec 14, 2020

@CCP-Aporia, the approach looks good to me. I probably would have implemented this on the Python side. But maybe that's just because I am more comfortable with Python...

One more type of coalescing which I have noticed is with kFSEventStreamEventFlagItemModified, kFSEventStreamEventFlagItemXattrMod, etc, flags. With the current order of checks in the emitter, this would mean that a native event with both kFSEventStreamEventFlagItemXattrMod and kFSEventStreamEventFlagItemCreated flags would be emitted as a FileModified event and the FileCreatedEvent would be suppressed. This was causing some test failures.

I tried to address this in #728 by prioritising created / deleted over modified but realise now that it's probably the wrong approach. Flagging those events as coalesced and emitting multiple events would be better.

@samschott
Copy link
Contributor

Do note that in case of coalesced events it is impossible to know how many times NativeEvent.path was created, removed or renamed. It is only possible to know that such an event occurred at least once within an fseventsd update interval.

This is unavoidable and already was an issue with the previous implementation (snapshot diffs triggered by FSEvents).

It's more pythonic, and the `_event_type` implementation wasn't quite
usable anyway.

NB: the representation is not truly copy/paste python code if there is
a double quote inside event.path, but that should be a rare case so we
don't add the expensive special case handling there.
Some Python debuggers create additional threads, so we shouldn't assume that there is only one.
It might be even better to check for the emitter class, as opposed to platform
Reduce the amount of 'silent breakage'
@CCP-Aporia
Copy link
Contributor Author

Just updated with a new snapshot of the current effort. There are some FIXME comments that I need to address, but more importantly it's now trying to expect the test setup filesystem events on macOS. And it also tries to rerun a test on macOS in order to work around a race condition that we have during test setup. I'm all up for ideas on how to resolve this one, here is how it goes:
First, we create path p which causes a filesystem event. Then, some time later, we create the emitter, telling it to watch for filesystem events for p from the FSEventStreamEventID at the time of creating the relevant FSStream.
Now, if the FSEventStreamEventID for the creation of p is no longer the latest FSEventStreamID then we will not receive the creation event.

@samschott
Copy link
Contributor

There is potentially a way around this race, but it's not exactly elegant. From within start_watching, after starting the emitter, we can create (and delete?) a file inside the watched path and wait to receive the corresponding event. Once that has arrived, we are certain to have cleared all old events and can start the test.

@samschott
Copy link
Contributor

If this works, it would do away with unnecessarily long wait times and flaky retry logic.

@CCP-Aporia
Copy link
Contributor Author

Ah yes, of course, a sentinel event is a great idea, and works perfectly. :-) It also means we don't have to expect any events for filesystem operations before we call start_watching(). Thanks!

@BoboTiG
Copy link
Collaborator

BoboTiG commented Dec 18, 2020

Sorry guys, I did the 1.0.2 release. I was trying to add wheels to the current release and it seems the master branch was used ... So let's rebase your work on master.

@samschott
Copy link
Contributor

Hi @CCP-Aporia, are you still working on this? If yes, I've just had a look at the failing tests and left a few comments inline.

tests/test_emitter.py Outdated Show resolved Hide resolved
tests/test_emitter.py Outdated Show resolved Hide resolved
@samschott
Copy link
Contributor

I've also noticed that for some reason the sentinel event does not register when just using p('') as the root dir. I'm not sure why though...

@samschott
Copy link
Contributor

samschott commented Jan 5, 2021

Regarding the handling of coalesced events, I can see your inline comment:

As such we need to track all events that occurred in order to make a correct decision about which events should be generated.

Are you proposing to manually cache some / all past events? Say we have created a new watch and a file is rapidly created and deleted -- there would be no way to determine the order of events because we don't have any history yet. Of course, we could start off with a directory snapshot, but those can use a lot of memory for large hierarchies.

What is your opinion on just checking if the path exists instead to order created and deleted events? Modified events cannot reasonable be ordered but we can always queue them after creations and before deletions.

What I am proposing is essentially a separate logic for coalesced events, that queues events in an order depending on the current state of the path. It's not ideal but a similar approach is taken for instances in the inotify backend to generate event trees for moved directories (generate_sub_moved_events). Those are generated according to the current state and not actual events.

@CCP-Aporia
Copy link
Contributor Author

CCP-Aporia commented Jan 6, 2021

I'm still working on this and my local version is a bit ahead of what's in here. But progress was somewhat slow because of the holidays.

@CCP-Aporia
Copy link
Contributor Author

Regarding the handling of coalesced events, I can see your inline comment:

As such we need to track all events that occurred in order to make a correct decision about which events should be generated.

Are you proposing to manually cache some / all past events? Say we have created a new watch and a file is rapidly created and deleted -- there would be no way to determine the order of events because we don't have any history yet. Of course, we could start off with a directory snapshot, but those can use a lot of memory for large hierarchies.

My current local implementation is keeping a dictionary that maps an event.path to an os.stat result. This may well consume a lot of memory for large hierarchies, I haven't profiled this yet. However, it shouldn't be worse than the old approach using directory snapshots.
Also, in retrospect, my wording there is bad, a less ambiguous wording is "[...] track all events that occurred so that we can make a correct decision [...]".

What is your opinion on just checking if the path exists instead to order created and deleted events? Modified events cannot reasonable be ordered but we can always queue them after creations and before deletions.

Indeed, an existence check is necessary to decide if a file was created or deleted.

What I am proposing is essentially a separate logic for coalesced events, that queues events in an order depending on the current state of the path. It's not ideal but a similar approach is taken for instances in the inotify backend to generate event trees for moved directories (generate_sub_moved_events). Those are generated according to the current state and not actual events.

After working on the coalesced events a bit more then these seem to be the norm for our test suite since it runs so fast so I haven't worked on two separate code paths. That said, the logic for coalesced events is somewhat complex, and if separate code paths make it a less complicated implementation then I'll transition to it for sure. 🙂

@samschott
Copy link
Contributor

samschott commented Jan 7, 2021

True, separating the logic for coalesced and "regular" events only makes sense if it makes the code easier to understand and maintain. As it stands, it is still possible to follow the code flow quite well (but maybe I'm just saying this because I have been involved in the discussions).

The worst-case scenario for the cache of os.stat results would be a user creating a large number of entries after starting a watch, in which case we would indeed approach the memory usage of the old directory snapshot implementation. The latter roughly uses 40 MB of memory for 15,000 files + folders. This is ok but not great.

From looking at the code, the stat cache is only actually used to suppress false modified events. When do those actually occur? And is suppressing them worth the tradeoff in memory and code complexity? Or will there be other uses for the cache?

Finally, something I noticed about the current logic: if a creation and deletion are coalesced into a single event, the current code will only emit the second event instead of emitting both in the "correct" order. Is this deliberate or accidental?

Of course, there is no way to disentangle a chain with more than two coalesced events of the same type (for example created -> deleted -> created). In this case, emitting a single created event may indeed be better than emitting deleted + created events. Do we actually get more than two events of the same type (e.g., creations) coalesced into a single event in the test suite?

@CCP-Aporia
Copy link
Contributor Author

CCP-Aporia commented Jan 7, 2021

These are great comments and questions.

The worst-case scenario for the cache of os.stat results would be a user creating a large number of entries after starting a watch, in which case we would indeed approach the memory usage of the old directory snapshot implementation. The latter roughly uses 40 MB of memory for 15,000 files + folders. This is ok but not great.

Agreed, and I need to profile this as we use watchdog with a 350.000 file folder.

From looking at the code, the stat cache is only actually used to suppress false modified events. When do those actually occur? And is suppressing them worth the tradeoff in memory and code complexity? Or will there be other uses for the cache?

The reason here is two-fold. One is to filter out erroneous FileModifiedEvent instances. However, it's also to avoid an additional FileCreatedEvent in, for example, test_emitter.py:test_modify. There might be a simpler approach, but I haven't found it yet.

Finally, something I noticed about the current logic: if a creation and deletion are coalesced into a single event, the current code will only emit the second event instead of emitting both in the "correct" order. Is this deliberate or accidental?

This was intentional. When I started going through the emitter tests then the delete test would fail without the existence check. However, I just tested this again, and it seems like this is no longer necessary. The latest version should emit an event for creation and deletion. But I think this scenario is missing an explicit test.

Of course, there is no way to disentangle a chain with more than two coalesced events of the same type (for example created -> deleted -> created). In this case, emitting a single created event may indeed be better than emitting deleted + created events. Do we actually get more than two events of the same type (e.g., creations) coalesced into a single event in the test suite?

I don't think the tests cover that scenario; it might be worthwhile adding such a test. Correction, there's test_emitter.py:test_fast_subdirectory_creation_deletion which I had mark with @pytest.skipif for macOS because it assumes that there's a count of specific events.

@CCP-Aporia CCP-Aporia marked this pull request as ready for review January 7, 2021 16:40
@CCP-Aporia
Copy link
Contributor Author

I think this is ready for review; the CI test failure for py310 is still the issue we've seen before and not related to the changes here.
Still need to profile the changes, and maybe even find a way to re-enable test_fast_subdirectory_creation_deletion on macOS.

@BoboTiG
Copy link
Collaborator

BoboTiG commented Jan 7, 2021

@samschott yes, I was not saying we do not care about such scenarios. As I understand it, the patch improves the situation, so it is a good thing :)
And if we can go better, then this is even better for everyone.

@samschott
Copy link
Contributor

Thank you @CCP-Aporia for the explanations.

However, it's also to avoid an additional FileCreatedEvent in, for example, test_emitter.py:test_modify. There might be a simpler approach, but I haven't found it yet.

Hm, I have just reproduced this myself and I'm a bit puzzled by the behaviour. Why is this additional FileCreatedEvent emitted here? It doesn't event seem to be coalesced, it looks like a plain and simple and wrong created event.

While the `filesystem_view` helps with filtering out additional
`FileCreatedEvent`+`DirModifiedEvent` pairs then it also introduces
a huge amount of edge cases for synthetic events caused by move and
rename operations. On top of that, in order to properly resolve
those edge cases we'd have to go back to a solution very similar to
the old directory snapshots, with all the performance penalties they
suffered from...

As such I think it's better to acknowledge the behaviour for coalesced
events instead, and thus remove the `filesystem_view` again.
@CCP-Aporia
Copy link
Contributor Author

@BoboTiG - is there anything more you'd like to have changed in this PR?

It's including @samschott's work from #729 as well (out of necessity to gain confidence in the changes), and the tests on Python 3.10 are failing due to the six issue.

@BoboTiG
Copy link
Collaborator

BoboTiG commented Jan 18, 2021

If @samschott is OK with that, I am OK to merge :)
Thank you for your hard work both 💪

Closes #546.
Closes #729.

@samschott
Copy link
Contributor

This looks good to me. Thank you @CCP-Aporia for finally having a fast and reliable macOS emitter! The improvements in performance and memory usage are both significant.

The only minor issue lies still with renaming files when changing only the casing. As described in the code comments, this currently emits two FileCreatedEvents, one with the old and one with the new name, instead of a single FileMovedEvent. However, if you both agree, we can leave that for a future PR. I am not even sure if this can be handled reliably...

@BoboTiG
Copy link
Collaborator

BoboTiG commented Jan 19, 2021

However, if you both agree, we can leave that for a future PR. I am not even sure if this can be handled reliably..

We can do that, yes. This is something I really would like to be fixed though.

Thank you both, that's awesome 💪

@BoboTiG BoboTiG merged commit fc1242e into gorakhargosh:master Jan 19, 2021
This was referenced Jan 19, 2021
@samschott
Copy link
Contributor

samschott commented Jan 19, 2021

@BoboTiG, I have just realised that you are also working on desktop client for file syncing (nuxeo desktop). And I can see that you also have watchdog pinned to version 0.10.3. So I expect that our use cases, requirements and tests will be very similar.

I'll see what I can do regarding the case changes. It appears that there might be pattern in the native event IDs again but it would be nice if we wouldn't need to rely an undocumented behaviour here.

@BoboTiG
Copy link
Collaborator

BoboTiG commented Jan 19, 2021

@BoboTiG, I have just realised that you are also working on desktop client for file syncing (nuxeo desktop). And I can see that you also have watchdog pinned to version 0.10.3. So I expect that our use cases, requirements and tests will be very similar.

Exactly :)
I will likely upgrade watchdog with the next release (that one should be the good candidate IMO). But the next release is not yet planned, let's improve fsevents in the meantime.

BoboTiG added a commit that referenced this pull request Feb 5, 2021
* Fix installation of Python 3.7.9 on Windows (#705)

* [windows] winapi.BUFFER_SIZE now defaults to 64000 (instead of 2048)

To handle cases with lot of changes, this seems the highest safest value we can use.

Note: it will fail with `ERROR_INVALID_PARAMETER` when it is greater than 64 KB and
      the application is monitoring a directory over the network.
      This is due to a packet size limitation with the underlying file sharing protocols.
      https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-readdirectorychangesw#remarks

Also introduced `winapi.PATH_BUFFER_SIZE` (defaults to `2048`)
to keep the old behavior with path-realted functions.

* [mac] Regression fixes for native fsevents (#717)

* [mac] Fix missing event_id attribute in fsevents (#722)

* [mac] Return byte paths if a byte path was given in fsevents (#726)

* Pin py.test to a version that supports 2.7

* Disable pypy2 tests on Windows

They hang and timeout after 6 hours for some unknown reason

* [mac] Add compatibility with old macOS versions (#733)

* Ensure version macros are available

* Uniformize event for deletion of watched dir (#727)

* [inotify] Add support for the IN_CLOSE_WRITE event (#747)

* [mac] Support coalesced fsevents (#734)

* Add `is_coalesced` property to `NativeEvent`

So that we can effectively decide if we need to perform additional
system calls to figure out what really happened.

* Replace `NativeEvent._event_type` with `repr()` support

It's more pythonic, and the `_event_type` implementation wasn't quite
usable anyway.

NB: the representation is not truly copy/paste python code if there is
a double quote inside event.path, but that should be a rare case so we
don't add the expensive special case handling there.

* Allow running tests with debugger attached

Some Python debuggers create additional threads, so we shouldn't assume that there is only one.

* Request notifications for watched root

* Expect events on macOS instead of using `time.sleep()`

It might be even better to check for the emitter class, as opposed to platform

* Add exception handling to FSEventsEmitter

Reduce the amount of 'silent breakage'

* Use sentinel event when setting up tests on macOS

So that we can avoid a race between test setup and fseventsd

* Improve handling of coalesced events

* Revert accidental platform check change

* Fix renaming_top_level_directory test on macOS

* Generate sub events for move operations

* Remove `filesystem_view` again

While the `filesystem_view` helps with filtering out additional
`FileCreatedEvent`+`DirModifiedEvent` pairs then it also introduces
a huge amount of edge cases for synthetic events caused by move and
rename operations. On top of that, in order to properly resolve
those edge cases we'd have to go back to a solution very similar to
the old directory snapshots, with all the performance penalties they
suffered from...

As such I think it's better to acknowledge the behaviour for coalesced
events instead, and thus remove the `filesystem_view` again.

* [mac] Improve handling of rename events (#750)

* drop support for macOS 10.12 and lower

Co-authored-by: SamSchott <ss2151@cam.ac.uk>
Co-authored-by: Boris Staletic <boris.staletic@gmail.com>
Co-authored-by: Mickaël Schoentgen <contact@tiger-222.fr>
Co-authored-by: Dustin Ingram <di@users.noreply.github.com>
Co-authored-by: ysard <ysard@users.noreply.github.com>
Co-authored-by: Lukas Šupienis <l.supienis@ncs.lt>
Co-authored-by: Lukas Šupienis <lukas.supienis@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants