Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of rename events on macOS #750

Merged
merged 20 commits into from
Jan 21, 2021

Conversation

samschott
Copy link
Contributor

At the moment, the two native events which are emitted when an item is renamed are paired to a single MovedEvent by relying on an undocumented behaviour of the File System Events API: two such events will have event IDs which differ only by 1. This may stop working without notice and fails to cover two corner-cases:

  1. When an item is renamed by changing only the casing of the name, the native events appear to have IDs differing by 2.
  2. When a renamed event is coalesced with another event, its event ID may differ from the observed but undocumented behaviour.

This PR matches native event pairs by inode instead. This is achieved by actively recording not only the path but also the inode for all events in the extension module. Such extended information is provided when creating the FSEventsStream with the kFSEventStreamCreateFlagUseExtendedData and kFSEventStreamCreateFlagUseCFTypes flags. This requires dealing CFTypes instead of ctypes in the extension module.

Other changes in this PR:

  • Expose a method to flush the FSEventsStream.
  • Simplify the handling of coalesced events in the Python code. This seems to result in more consistent behaviour in case of coalescing.

@samschott
Copy link
Contributor Author

Since this inode information in FSEvents is available only on Mac OSX 10.13 and later, this PR drops support for previous versions. I hope this isn't a blocker since OSX 10.12 is no longer maintained.

@BoboTiG
Copy link
Collaborator

BoboTiG commented Jan 20, 2021

Thanks @samschott!

1. When an item is renamed by changing only the casing of the name, the native events appear to have IDs differing by 2.

Does that mean you will open a PR to handle that? :)

I am OK to drop macOS 10.12 support 👍

I'll be able to do a review tomorrow (French timezone), could you add a line in the chagelog about the 10.12 support drop? There may be other places where the minimum macOS version is described, but I could do it before the release then.

WDYT @CCP-Aporia?

@samschott
Copy link
Contributor Author

This is the PR that handles that. Case-changes should now be fine, see the added test test_emitter.py:test_case_change.

Yeah, I'll add the information to the change log.

Copy link
Collaborator

@BoboTiG BoboTiG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor review.

src/watchdog/observers/fsevents.py Outdated Show resolved Hide resolved
@BoboTiG
Copy link
Collaborator

BoboTiG commented Jan 20, 2021

Yes, I just saw the test... Sorry :)

@samschott
Copy link
Contributor Author

Also, this is my first time coding anything in C. I think I have eliminated all segfaults and possible issues. But please have a very careful look at the extension module!

Co-authored-by: Mickaël Schoentgen <contact@tiger-222.fr>
@BoboTiG
Copy link
Collaborator

BoboTiG commented Jan 20, 2021

FTR I plan to move tests to GitHub Actions soon, because Travis-CI is a pain.

Copy link
Contributor

@CCP-Aporia CCP-Aporia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, thanks a lot! 👍 This looks great overall. :-)

For the requested changes, please see the inline comments. Basically, increasing the reference count on the created PyObject instances in those places cause a memory leak.

CFNumberGetValue(cf_number, kCFNumberSInt64Type, &c_int);

py_long = PyLong_FromLong(c_int);
Py_INCREF(py_long);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyLong_FromLong returns a new reference, so there's no need to call Py_INCREF here. :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks!

py_string = PyUnicode_FromString(c_string_ptr);
}

Py_INCREF(py_string);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyUnicode_FromString returns a new reference, so there's no need to call Py_INCREF here. :-)

@CCP-Aporia
Copy link
Contributor

Just saw that test_observers_api.py:test_event_dispatcher failed on macOS 3.9 on travis. And while it's not related to the changes in this PR, then I believe there's a missing event_dispatcher.join() call at the end of that test case, causing a race condition? 🤔

Sam Schott added 3 commits January 21, 2021 11:49
this is the case for kFSEventStreamEventFlagRootChanged
@samschott
Copy link
Contributor Author

Two more changes which I have made:

  1. Reverted the handling of coalesced events. It's off topic for this PR and not as as straight forward as I thought.
  2. Fix a TypeError in the extension module when NativeEvent is passed an inode which is None. This will be the case for events where the kFSEventStreamEventFlagRootChanged flag is set.

@CCP-Aporia
Copy link
Contributor

Two more changes which I have made:

  1. Reverted the handling of coalesced events. It's off topic for this PR and not as as straight forward as I thought.

The handling of those events sure does look odd, and I wouldn't be surprised if there's an easier way of expressing what we need to guard there. :-)

  1. Fix a TypeError in the extension module when NativeEvent is passed an inode which is None. This will be the case for events where the kFSEventStreamEventFlagRootChanged flag is set.

Oh, that's an interesting one, there's potential for crashes and memory leaks here. It's necessary to Py_INCREF(self->inode) if parsing was successful. And self->inode needs to be initialised to NULL before the argument parsing happens. Additionally NativeEventObject then needs a tp_dealloc function that does a Py_XDECREF for the inode object.

@samschott
Copy link
Contributor Author

samschott commented Jan 21, 2021

Maybe I should have read the Python/C API docs first... I've hopefully fixed all issues with the inode attribute now 🤞

src/watchdog_fsevents.c Outdated Show resolved Hide resolved
tests/test_emitter.py Outdated Show resolved Hide resolved
Co-authored-by: Mickaël Schoentgen <contact@tiger-222.fr>
Copy link
Contributor

@CCP-Aporia CCP-Aporia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C extension looks good to me, so from my point of view this can get merged. 👍

@BoboTiG
Copy link
Collaborator

BoboTiG commented Jan 21, 2021

Great work @samschott! Thanks for the review @CCP-Aporia 💪

@BoboTiG BoboTiG merged commit d9a6e63 into gorakhargosh:master Jan 21, 2021
@samschott samschott deleted the improve-rename-handling-macos branch January 21, 2021 17:29
@samschott
Copy link
Contributor Author

Just for future reference, I have tested which events get emitted now when a user edits a file in TextEdit or a similar editor:

NativeEvent(path="Untitled.txt", inode=76814733, flags=10800, id=363703683): is_file, is_renamed
NativeEvent(path="Untitled.txt.sb-c668ce92-TL24mz", inode=76814733, flags=18800, id=363703692): is_file, is_renamed, is_xattr_mod

=> emits:
   FileMovedEvent(src_path="Untitled.txt", dest_path="Untitled.txt.sb-c668ce92-TL24mz")

NativeEvent(path="Untitled.txt", inode=76815055, flags=18c00, id=363703718): is_file, is_inode_meta_mod, is_renamed, is_xattr_mod

=> emits:
   FileCreatedEvent(src_path="Untitled.txt")
   FileModifiedEvent(src_path="Untitled.txt")

NativeEvent(path="Untitled.txt.sb-c668ce92-TL24mz", inode=76814733, flags=41ca00, id=363703735): is_cloned, is_coalesced, is_file, is_owner_change, is_removed, is_renamed, is_xattr_mod

=> emits:
   FileDeletedEvent(src_path="Untitled.txt.sb-c668ce92-TL24mz")

NativeEvent(path="Untitled.txt", inode=76815055, flags=418c00, id=363703799): is_cloned, is_file, is_inode_meta_mod, is_renamed, is_xattr_mod

=> emits:
   FileCreatedEvent(src_path="Untitled.txt")
   FileModifiedEvent(src_path="Untitled.txt")

This is more verbose than the old emitter but seems quite sane to me. Note the is_cloned flag when the file content is copied over from the temp file.

BoboTiG added a commit that referenced this pull request Feb 5, 2021
* Fix installation of Python 3.7.9 on Windows (#705)

* [windows] winapi.BUFFER_SIZE now defaults to 64000 (instead of 2048)

To handle cases with lot of changes, this seems the highest safest value we can use.

Note: it will fail with `ERROR_INVALID_PARAMETER` when it is greater than 64 KB and
      the application is monitoring a directory over the network.
      This is due to a packet size limitation with the underlying file sharing protocols.
      https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-readdirectorychangesw#remarks

Also introduced `winapi.PATH_BUFFER_SIZE` (defaults to `2048`)
to keep the old behavior with path-realted functions.

* [mac] Regression fixes for native fsevents (#717)

* [mac] Fix missing event_id attribute in fsevents (#722)

* [mac] Return byte paths if a byte path was given in fsevents (#726)

* Pin py.test to a version that supports 2.7

* Disable pypy2 tests on Windows

They hang and timeout after 6 hours for some unknown reason

* [mac] Add compatibility with old macOS versions (#733)

* Ensure version macros are available

* Uniformize event for deletion of watched dir (#727)

* [inotify] Add support for the IN_CLOSE_WRITE event (#747)

* [mac] Support coalesced fsevents (#734)

* Add `is_coalesced` property to `NativeEvent`

So that we can effectively decide if we need to perform additional
system calls to figure out what really happened.

* Replace `NativeEvent._event_type` with `repr()` support

It's more pythonic, and the `_event_type` implementation wasn't quite
usable anyway.

NB: the representation is not truly copy/paste python code if there is
a double quote inside event.path, but that should be a rare case so we
don't add the expensive special case handling there.

* Allow running tests with debugger attached

Some Python debuggers create additional threads, so we shouldn't assume that there is only one.

* Request notifications for watched root

* Expect events on macOS instead of using `time.sleep()`

It might be even better to check for the emitter class, as opposed to platform

* Add exception handling to FSEventsEmitter

Reduce the amount of 'silent breakage'

* Use sentinel event when setting up tests on macOS

So that we can avoid a race between test setup and fseventsd

* Improve handling of coalesced events

* Revert accidental platform check change

* Fix renaming_top_level_directory test on macOS

* Generate sub events for move operations

* Remove `filesystem_view` again

While the `filesystem_view` helps with filtering out additional
`FileCreatedEvent`+`DirModifiedEvent` pairs then it also introduces
a huge amount of edge cases for synthetic events caused by move and
rename operations. On top of that, in order to properly resolve
those edge cases we'd have to go back to a solution very similar to
the old directory snapshots, with all the performance penalties they
suffered from...

As such I think it's better to acknowledge the behaviour for coalesced
events instead, and thus remove the `filesystem_view` again.

* [mac] Improve handling of rename events (#750)

* drop support for macOS 10.12 and lower

Co-authored-by: SamSchott <ss2151@cam.ac.uk>
Co-authored-by: Boris Staletic <boris.staletic@gmail.com>
Co-authored-by: Mickaël Schoentgen <contact@tiger-222.fr>
Co-authored-by: Dustin Ingram <di@users.noreply.github.com>
Co-authored-by: ysard <ysard@users.noreply.github.com>
Co-authored-by: Lukas Šupienis <l.supienis@ncs.lt>
Co-authored-by: Lukas Šupienis <lukas.supienis@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants