Creates a new FileLoader class to separate the logic of watching files #121

hbcarlos · 2023-03-14T09:17:32Z

Creates a new FileLoader class to separate the logic of watching files from the WebSocketHandler.

Code changes:

Splits the classes into different files.
Creates a new FileLoader class to centralize loading, saving, and watching files.
Multiple rooms can get an instance of this class and use it to access the content safely, since we can create a single lock per file to ensure that two rooms do not access the same file simultaneously.
Updates the DocumentRoom to use the FileLoader.

TODO:

Create an abstract class for FileLoader to extend from.
- Some extensions (like jupytercad) may want their own loader.
- Allow registering new file loaders.
Add documentation

codecov-commenter · 2023-03-14T09:21:18Z

Codecov Report

Patch and project coverage have no change.

Comparison is base (eb8456d) 0.00% compared to head (ac92018) 0.00%.

Additional details and impacted files

@@          Coverage Diff           @@
##            main    #121    +/-   ##
======================================
  Coverage   0.00%   0.00%            
======================================
  Files          3       7     +4     
  Lines        297     418   +121     
======================================
- Misses       297     418   +121

Impacted Files	Coverage Δ
jupyter_collaboration/app.py	`0.00% <0.00%> (ø)`
jupyter_collaboration/handlers.py	`0.00% <0.00%> (ø)`
jupyter_collaboration/loaders.py	`0.00% <0.00%> (ø)`
jupyter_collaboration/rooms.py	`0.00% <0.00%> (ø)`
jupyter_collaboration/stores.py	`0.00% <0.00%> (ø)`
jupyter_collaboration/utils.py	`0.00% <0.00%> (ø)`

hbcarlos · 2023-03-14T09:54:05Z

It would be great to have a way to swap loaders so extensions can register new document types and the logic to load those documents. For example, JupyterCAD registers a new document type that uses the FreeCAD format. This format is a compressed folder with multiple XML files. Jupyter Server needs to learn how to open those documents, and the contents service needs a way to plug in new loaders.

I have looked into the contents service of Jupyter Server, trying to find the best solution for this problem. Currently, the ContentsManager only allows registering pre/post save hooks. We could use these hooks to load the content of a specific file type, but we cannot ensure that the hook we register will be the last one executed. In addition, there are only hooks for saving a file; there are no hooks for loading one.

I'm not trying to solve the problem in this PR, but I want to redesign the architecture of jupyter_collaboration with this problem in mind so that it can be solved in a follow-up PR.

At the moment, I see two options:
1 - Create a new file loader manager that allows registering loaders depending on the file types, and create the default FileLoader using Jupyter Server's content manager as we currently do.
2 - Create a new content manager based on the default content manager that allows registering loaders for each file type.

There is a caveat with the second option. If jupyter_collaboration is installed with another extension that also swaps the default content manager, it may not work.
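
To make the first option more concrete, here is a minimal sketch of a loader manager keyed by file type. All names (`FileLoaderManager`, `register`, `get_loader`, `DefaultLoader`, `FreeCADLoader`) are hypothetical and only illustrate the shape of the idea; the default loader here returns a placeholder string instead of calling Jupyter Server's contents manager.

```python
class DefaultLoader:
    """Hypothetical default loader, standing in for one backed by the contents manager."""

    def load(self, path):
        return f"default:{path}"


class FreeCADLoader:
    """Hypothetical loader that an extension such as JupyterCAD might register."""

    def load(self, path):
        return f"freecad:{path}"


class FileLoaderManager:
    """Maps file types to loader factories; unknown types fall back to the default."""

    def __init__(self):
        self._factories = {}

    def register(self, file_type, factory):
        # A later registration for the same file type replaces the earlier one.
        self._factories[file_type] = factory

    def get_loader(self, file_type):
        factory = self._factories.get(file_type, DefaultLoader)
        return factory()


manager = FileLoaderManager()
manager.register("FCStd", FreeCADLoader)  # illustrative file type key

cad_loader = manager.get_loader("FCStd")
notebook_loader = manager.get_loader("notebook")
```

With this shape, the extension stays independent of whichever contents manager is installed, which is the main difference from the second option.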

hbcarlos · 2023-03-14T12:25:56Z

Hi @Zsailer. I'm pinging you since your thoughts here would be appreciated, and you might be interested in the comment I posted above.

I will try to join the jupyter server team meeting this week to discuss this topic. Jupyter Server's collaborators should be aware of and participate in the decisions taken here in the jupyter_collaboration extension since this involves some parts of the jupyter server.

Thanks in advance for your comments and your time!

Zsailer · 2023-03-16T03:20:25Z

Thanks for pinging me here, @hbcarlos. I'll give this a more thorough review over the next couple of days.

I don't know if I'll make it to tomorrow's Jupyter Server meeting—I have a family commitment that conflicts with part of the meeting. If I don't see you then, let's connect over gitter and maybe set up a time to meet separately?

hbcarlos · 2023-03-16T15:06:11Z

Thanks, @Zsailer!

If I don't see you then, let's connect over gitter and maybe set up a time to meet separately?

That's fine. We can continue the discussion here and also discuss it at next week's meeting.

jupyter_collaboration/handlers.py

hbcarlos · 2023-03-29T07:58:42Z

With the current implementation, the content is not synced between different rooms. It doesn't sync because there is only one property, _last_modified, for each file. If room A saves to disk, we update _last_modified. The next time we load the content, we do not notify the other rooms because, as far as the file's timestamp is concerned, the content is already up to date. See:
https://github.com/jupyterlab/jupyter_collaboration/blob/b1aedb92da354cf6c9df8979755145e8a3e952d7/jupyter_collaboration/loaders.py#L89

We could change this logic and keep a timestamp per room, which would allow syncing between rooms. However, I prefer not to sync between rooms for now. If we sync the content between rooms, each time the content changes in one room, on every other room, we will create a Y update replacing the entire content. This could increase memory usage, so I would like to experiment before allowing rooms to sync.
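
The difference between a single timestamp per file and a timestamp per room can be sketched as below. This is an illustration only: `FileSyncState` and its methods are hypothetical names, and plain floats stand in for real modification times.

```python
class FileSyncState:
    """Hypothetical per-file state that remembers what each room has already seen."""

    def __init__(self):
        self.last_modified = 0.0  # stand-in for the file's modification time
        self._seen_by_room = {}  # room id -> last modification time seen by that room

    def record_save(self, room_id, mtime):
        # The saving room is up to date by definition; other rooms are not touched.
        self.last_modified = mtime
        self._seen_by_room[room_id] = mtime

    def rooms_out_of_date(self, room_ids):
        # With a single shared timestamp this list would always be empty after a save.
        return [
            room_id
            for room_id in room_ids
            if self._seen_by_room.get(room_id, float("-inf")) < self.last_modified
        ]

    def mark_synced(self, room_id):
        self._seen_by_room[room_id] = self.last_modified


state = FileSyncState()
state.record_save("room-A", mtime=10.0)
stale = state.rooms_out_of_date(["room-A", "room-B"])
```

Here only `room-B` is reported as stale after `room-A` saves, which is the notification that a single `_last_modified` per file cannot express.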

davidbrochart · 2023-03-29T08:30:38Z

With the current implementation, the content is not synced between different rooms.

You mean with this PR? Before this PR, the content is synced between rooms.

davidbrochart · 2023-03-29T09:05:37Z

If we sync the content between rooms, each time the content changes in one room, on every other room, we will create a Y update replacing the entire content.

See jupyter-server/jupyter_ydoc#15.

jupyter_collaboration/loaders.py

hbcarlos · 2023-03-29T09:20:07Z

You mean with this PR?

Yes, I meant in this PR.

davidbrochart · 2023-03-29T09:24:28Z

So you mean that this PR breaks document synchronization between rooms?

fcollonval

Thanks @hbcarlos

Here are some suggestions. It would be good to add docstrings to document the code, and unit tests that at least check the correctness of the instantiation and clean-up actions.

fcollonval · 2023-04-04T13:27:14Z

jupyter_collaboration/handlers.py


    def check_origin(self, origin):
        return True

+    @classmethod
+    def clean_up(cls):


This is further evidence that the architecture is not yet great, but let's put it aside for this PR.

I know. I'm doing my best, but there are already a lot of changes in this PR.

Feel free to open an issue, and I'll work on it in another PR

fcollonval · 2023-04-04T13:45:19Z

jupyter_collaboration/stores.py

+    prefix_dir = "jupyter_ystore_"
+
+
+class SQLiteYStoreMetaclass(type(LoggingConfigurable), type(_SQLiteYStore)):  # type: ignore


Could you add a comment explaining why we need this complex metaclass-based code rather than classical direct inheritance?

I just moved it to a new file. I'm not sure why we need it. I have not looked into the stores yet.

@davidbrochart Could you open a PR adding documentation?
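
For context, one common reason for this pattern in Python is a metaclass conflict: a class cannot inherit from two bases whose metaclasses are unrelated unless a metaclass deriving from both is supplied. Whether that is the exact motivation here is an assumption. The classes below (`MetaA`, `Configurable`, `Store`, `CombinedMeta`, `Working`) are hypothetical stand-ins and not the real LoggingConfigurable or _SQLiteYStore.

```python
from abc import ABCMeta, abstractmethod


class MetaA(type):
    """Hypothetical custom metaclass, unrelated to ABCMeta."""


class Configurable(metaclass=MetaA):
    """Stand-in for a base class that brings its own metaclass."""


class Store(metaclass=ABCMeta):
    """Stand-in for an abstract base class."""

    @abstractmethod
    def write(self, data):
        ...


def try_direct_inheritance():
    """Return the error message raised by plain multiple inheritance, if any."""
    try:
        class Broken(Configurable, Store):
            def write(self, data):
                pass
    except TypeError as exc:
        return str(exc)
    return None


class CombinedMeta(type(Configurable), type(Store)):
    """Derives from both metaclasses, which resolves the conflict."""


class Working(Configurable, Store, metaclass=CombinedMeta):
    def write(self, data):
        self.last = data


error_message = try_direct_inheritance()
store = Working()
store.write(b"update")
```

In this sketch, the direct `class Broken(Configurable, Store)` definition fails with a `TypeError`, while the version that passes `metaclass=CombinedMeta` can be defined and instantiated.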

davidbrochart · 2023-04-04T13:54:41Z

So you mean that this PR breaks document synchronization between rooms?

I've not been following all the commits, has this been addressed?

hbcarlos · 2023-04-04T14:05:49Z

I've not been following all the commits, has this been addressed?

I'm not restoring it. This PR removes the synchronization between rooms in favor of simplicity and stability. I'll add it back once the extension is stable and we have a pleasant RTC experience.

davidbrochart · 2023-04-04T14:12:11Z

I think it is a big regression.
So what is the user experience when a notebook is opened as a notebook document and as a JSON document, for instance?
What happens to one document when the user makes a change to the other?

hbcarlos · 2023-04-04T15:51:54Z

Users are notified that they might lose changes if they open the same document with multiple views.

If you edit in one room, it won't sync with the other one. The last edited room is the one saved to disk.

davidbrochart · 2023-04-04T16:06:08Z

I think it needs to be discussed, as it is a significant regression and it could lead to a lot of user frustration.
What about changes that are made on the backend, like checking out a git branch that would change the open file? Is the change reflected on the frontend, or is that also not supported anymore?

hbcarlos · 2023-04-04T16:32:13Z

What about changes that are made on the backend, like checking out a git branch that would change the open file? Is the change reflected on the frontend, or is that also not supported anymore?

That is supported

davidbrochart · 2023-04-05T13:35:26Z

So there is something I don't understand. If the frontend reacts to file changes made in the backend, and you have one user with a notebook opened as a notebook document, and another user with the same notebook opened as a JSON document, what is the behavior?
Users won't be warned because they individually only have one view of the notebook, right?

davidbrochart · 2023-04-11T07:59:49Z

I just tried this PR and I'm seeing weird behaviors while synchronizing rooms, like a notebook being reverted to a previous state:

room_sync.mp4

hbcarlos · 2023-04-11T08:08:48Z

Yes, because I prioritize what is on disk instead of automatically overwriting it whenever a change occurs.

I'm not in favor of the synchronization between rooms or the auto-save. It is unpredictable, and I'm sure it is related to some of the issues we are having.

davidbrochart · 2023-04-11T08:21:02Z

It is unclear to me what you do in this PR regarding room synchronization. Commit 1ee3848 says "Fixes sync between rooms", but it doesn't seem to work. Can you clarify what is fixed?

fcollonval

Thanks @hbcarlos

Let's move forward with this one.

davidbrochart · 2023-04-11T14:27:48Z

@fcollonval I was not done testing this PR, and there were still pending questions.

fcollonval · 2023-04-11T14:46:08Z

The synchronization has been fixed.

We can address the remaining questions in a follow-up, but we should move forward to unblock the file loader.

davidbrochart · 2023-04-11T14:58:41Z

The synchronization has been fixed.

It's hard to tell. I think we should give reviewers more time to get their questions answered before merging.

hbcarlos added the enhancement New feature or request label Mar 14, 2023

hbcarlos self-assigned this Mar 14, 2023

blink1073 mentioned this pull request Mar 16, 2023

Meeting Notes 2023 jupyter-server/team-compass#45

Closed

hbcarlos force-pushed the refactor branch from 2c9ded3 to eaa37f7 Compare March 26, 2023 10:33

hbcarlos marked this pull request as ready for review March 28, 2023 08:48

hbcarlos requested review from davidbrochart and fcollonval March 28, 2023 08:48

davidbrochart reviewed Mar 28, 2023

davidbrochart reviewed Mar 28, 2023

davidbrochart reviewed Mar 28, 2023
