Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Portalocker #370

Merged
merged 16 commits into from
Sep 6, 2023
Merged

Replace Portalocker #370

merged 16 commits into from
Sep 6, 2023

Conversation

jburel
Copy link
Member

@jburel jburel commented Apr 21, 2023

Replace internal portalocker by the upstream version. This is the first step to be able to upgrade to Python 3.8+ (windows)

src/omero_ext/cloghandler.py Outdated Show resolved Hide resolved
src/omero/hdfstorageV2.py Show resolved Hide resolved
src/omero/hdfstorageV2.py Outdated Show resolved Hide resolved
@sbesson
Copy link
Member

sbesson commented Apr 27, 2023

0735974 is fully disabling Windows testing. Is it on purpose or is additional work planned to selectively exclude the backend OMERO.tables tests on Windows?

@sbesson
Copy link
Member

sbesson commented May 9, 2023

All CI tests are now successful and the OMERO.tables are skipped on Windows. I assume this will be assessed in a round to functional testing. Trying to identify a few scenarios which be reviewed in a deployment with this PR included:

  • concurrent access (read/write) to OMERO configuration via omero config
  • concurrent access (read/write) to OMERO tables
  • anything else?

Coming back briefly to the code changes and specifically a77749d, ome/openmicroscopy#3431 introduced a lot of logic specifically for handling OMERO.tables on Windows. Is it worth looking into reverting these changes as part of this clean-up to simplify our future maintenance? Or limit this to the minimal amount of changes and capture as an issue?

@jburel
Copy link
Member Author

jburel commented May 9, 2023

I would rather keep this PR as simple as possible i.e. remove dependency so we can use Python 3.8+
We can look at refactoring in a follow-up PR.

@joshmoore
Copy link
Member

👍 for the change in a post-Windows world but also for cleaning up the changes (in the future).

@jburel jburel mentioned this pull request Jun 13, 2023
@jburel
Copy link
Member Author

jburel commented Jun 13, 2023

issue created #373

Now that CI is back @pwalczysko @will-moore could you go over the scenarios highlighted above i.e.

  • concurrent access (read/write) to OMERO configuration via omero config
  • concurrent access (read/write) to OMERO tables

@will-moore
Copy link
Member

OK - I remember we had issues with async parallel reading of tables before - see ome/omero-web#25
and that was "fixed" client-side by avoiding async requests. Was portalocker supposed to prevent that scenario? Is the replacement expected to do better at that than portalocker?
Is that a good testing scenario for this PR?

@jburel
Copy link
Member Author

jburel commented Jun 13, 2023

We still use portalocker but this time it is the upstream version.

@will-moore
Copy link
Member

I'm going to create a Table on merge-ci to test, but I'm not really sure what pass/fail looks like?
If portalocker doesn't work, do we expect to see corrupted files or some other clue?
If the existing behaviour - e.g. omero::InternalException at https://forum.image.sc/t/server-error-after-populating-metadata-for-images-using-web-script/27844/14 is a "pass", what's a fail?

@will-moore
Copy link
Member

Testing with Josh's script at https://forum.image.sc/t/server-error-after-populating-metadata-for-images-using-web-script/27844/14 gives the same results on merge-ci and on latest-ci (but not sure that even tests this PR)?

@jburel
Copy link
Member Author

jburel commented Aug 17, 2023

As part of a general clean-up, it will be good to integrate this PR in the next release

@will-moore
Copy link
Member

@jburel - thinking about a release now, should we get this merged?

@jburel
Copy link
Member Author

jburel commented Aug 23, 2023

I think we can.
any objections @joshmoore @sbesson?

@sbesson
Copy link
Member

sbesson commented Aug 23, 2023

Have all the scenarios mentioned in #370 (comment) been validated? It's not obvious from the following reviews in this PR

@will-moore
Copy link
Member

It's not clear to me how to test concurrent access to OMERO config or OMERO.tables and what does a pass or fail look like?

We probably need a server without portalocker enabled and a clear idea of how to demonstrate something bad happening (file corruption?) if we want to be sure that this is prevented by portalocker.

@sbesson
Copy link
Member

sbesson commented Aug 23, 2023

It's not clear to me how to test concurrent access to OMERO config or OMERO.tables and what does a pass or fail look like?

For OMERO config, a minimal test is to call omero config edit in a terminal. This should lock the XML file. In another terminal to make a call to read/write operation on the configuration file.

With the current OMERO.py release, I get

$ omero config edit
Could not acquire lock on /opt/omero/OMERO.current/etc/grid/config.xml

OMERO.tables will likely need a custom script testing something a similar scenario where a table is being locked and accessed at the same time by another process.

@will-moore
Copy link
Member

With this branch installed locally, this works - (running this in 2 terminals at the same time):

$ omero config edit
Could not acquire lock on /Users/wmoore/Desktop/OMERO.server-5.6.5-ice36-b233/etc/grid/config.xml

Going on the tests as a guide, tried this...

$ touch test_lock
$ python

>>> import omero.hdfstorageV2 as storage_module
>>> import threading
>>> lock = threading.RLock()
>>> HdfStorage = storage_module.HdfStorage
>>> hdf1 = HdfStorage("/Users/wmoore/Desktop/PY/omero-py/test_lock", lock)
>>> hdf2 = HdfStorage("/Users/wmoore/Desktop/PY/omero-py/test_lock", lock)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/wmoore/Desktop/PY/omero-py/target/omero/hdfstorageV2.py", line 203, in __init__
    self.__hdf_file = HDFLIST.addOrThrow(file_path, self, read_only)
  File "/Users/wmoore/Desktop/PY/omero-py/target/omero/util/decorators.py", line 94, in with_lock
    return func(*args, **kwargs)
  File "/Users/wmoore/Desktop/PY/omero-py/target/omero/hdfstorageV2.py", line 126, in addOrThrow
    raise omero.LockTimeout(
omero.LockTimeout: exception ::omero::LockTimeout
{
    serverStackTrace = None
    serverExceptionClass = None
    message = Path already in HdfList: /Users/wmoore/Desktop/PY/omero-py/test_lock
    backOff = 0
    seconds = 0
}

Also tried this, which didn't throw an Exception, but not sure if this should?

# conn.connect() etc...
r = conn.getSharedResources()
t = r.openTable(omero.model.OriginalFileI(2154), conn.SERVICE_OPTS)
t2 = r.openTable(omero.model.OriginalFileI(2154), conn.SERVICE_OPTS)

Also just realised that this is testing server-side code, whereas I only have this PR installed client-side.

@will-moore
Copy link
Member

Any more testing needed?
How do I connect to merge-ci server? merge-ci-devspace.openmicroscopy.org?

$ omero login
Previously logged in to localhost:4064 as will
Server: [localhost:4064]merge-ci-devspace.openmicroscopy.org
Username: [will]user-3
Password:
InternalException: Failed to connect: Ice.ConnectionLostException:
recv() returned zero

@sbesson
Copy link
Member

sbesson commented Aug 28, 2023

Also just realised that this is testing server-side code, whereas I only have this PR installed client-side.

You're correct that the portalocker usage in OMERO.tables is fully server-side and I have not found a reliable way to test it client-side.

After a quick investigation, the most relevant place I found to test this logic are the OMERO.tables unit tests which have been de-activated since the Python 3 migration in b185956.

jburel/omero-py@portalocker...sbesson:omero-py:table_locking_tests proposes 2 commits to re-activate these tests on top of this branch. d677199 in particular tests the omero.LockTimeout exception raised by the portalocker logic. Note that a number of adjustments had to be made to the TestHDFList.testLocking which are quite possibly related to the upstream backwards incompatible changes in the Pytables 3.x line including https://www.pytables.org/MIGRATING_TO_3.x.html#api-name-changes and https://www.pytables.org/release-notes/RELEASE_NOTES_v3.1.x.html#backward-incompatible-changes

@jburel
Copy link
Member Author

jburel commented Aug 28, 2023

@sbesson do you want to push the commits to this branch?

@sbesson
Copy link
Member

sbesson commented Aug 28, 2023

Done. I believe the readthedocs CI build failure is unrelated

@jburel
Copy link
Member Author

jburel commented Aug 28, 2023

Done. I believe the readthedocs CI build failure is unrelated

It is
I have opened a PR. readthedocs is now requiring a flag. see #383

jburel and others added 10 commits August 28, 2023 11:30
As of the migration to Python 3 and f8af68d,
both int and long values are returned as rlong
- Expect tables.open_file to open the file in append mode
- Remove unnecessary print and FILE_OPEN_POLICY statement
- Remove conditional handling of Pytables 2.x openFile API
- Remove Pytables ValueError check (possibly invalid since Pytable 3.1)
@will-moore
Copy link
Member

I was trying to see where those unit tests are actually run, but I don't see anything explicit in the jobs above?
Do we just assume they're running and passing silently?

@sbesson
Copy link
Member

sbesson commented Aug 30, 2023

sbesson@1ab20f7 is one way to address this issue and use the verbose flag to list skipped/passing/failing tests individually. Should I push it to this branch?

@jburel
Copy link
Member Author

jburel commented Aug 30, 2023

@sbesson please push

Copy link
Member

@will-moore will-moore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the logs for last build at https://github.com/ome/omero-py/actions/runs/6023964482/job/16341736551?pr=370
I see...

  [gw2] PASSED test/unit/tablestest/test_hdfstorage.py::TestHdfStorage::testLocking 

So I think we are good.

@jburel jburel merged commit 273c275 into ome:master Sep 6, 2023
7 checks passed
@jburel jburel deleted the portalocker branch April 29, 2024 13:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants