-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simply download the OriginalFiles #264
Conversation
src/main/java/org/openmicroscopy/shoola/env/data/OMEROGateway.java
Outdated
Show resolved
Hide resolved
src/main/java/org/openmicroscopy/shoola/env/data/OMEROGateway.java
Outdated
Show resolved
Hide resolved
Deleted the comment, as I was able to download and unzip on subsequent attemps. |
Test 1:
Result: |
This also happens with the current Insight without this PR. Tbh, if download of the original files and reimport splits the multiimage Dicom image, than the problem is probably Dicom specific and lies deeper. Download and reimport of a vsi image for example works as expected with this PR. |
Test 2:
Result: The downloaded folder structure seems reasonable, see screenshot below. This test passes imho. |
Thank you @dominikl - confirming that with my release insight. The Dicom behaviour is not a regression, and has nothing to do with this PR. |
Would it be easy enough to expand this to "file or directory"? i.e. if anything already exists, you skip? I don't have a working example of how to trigger this, but I worried of downloading:
into a directory which already contains:
and thereby change the interpretation of something. |
Hi there, quick and hopefully not so stupid question... Should the changes from above also be included in Thanks, |
Hi Niko, The changes are certainly available on https://merge-ci.openmicroscopy.org/jenkins/job/OMERO-insight-build/ - just download the latest artifact of insight, that is what I am doing when testing. Not sure about omero-py, leave @dominikl to answer that. |
As Petr said, the changes are included in the current merge build of Insight. But they won't go into omero-py, that's totally unrelated to Insight. |
Last commit should address @joshmoore 's issue. Would be good to have some complex filesets (many files, many directories) to properly test this PR. What's a good image format for that? |
Tested so far
The behaviour was as expected. @dominikl to find one complex file format which will test it all is possibly not realistic ? Could we specify (with examples) what the "if the file already exists, it will be skipped" means, or, if that changed as a result of your last commit, how exactly and what is the expectation ? Only based on that we can set up some testing cases. |
Thanks Dominik. I somehow thought omero-py is calling the Java code when downloading files, but I should actually know better... |
@ehrenfeu See https://forum.image.sc/t/download-full-projects-with-their-datasets/56434/22 and discussion there for downloading via python. Improvements were made in |
Ah, very nice - thanks a lot @will-moore ! I was asking for a friend working on the HRM-OMERO connector 😁 but he already found out that he should just properly deal with the paths when downloading filesets from OMERO. Nevertheless I will look into your code. My parts in the connector are a bit aged by now, they could certainly use some polishing. |
Cross-linking to ome/omero-py#298 which added the Python improvements that @will-moore was referring too. This thread in particular includes many considerations around data integrity and data corruption that are directly relevant to this use case. Adding to the similarities, the former CLI download allowed the user to specify the name of the output file breaking the assumptions of the file readers when doing a round-trip. This is effectively another variant of #188. The agreement made in OMERO.py was to:
Talking with @pwalczysko @dominikl @will-moore and @khaledk2, I would strongly vote for using the occasion and be consistent in terms of the strategy across our download utilities. Probably the biggest differences between
is the one that is the most amenable to ensure the integrity of the original data and reduce the risks of cross-talk between filesets. In a nutshell, this is a small version of the ManagedRepository layout and matches the layout used in Probably the most important design decision for the layout above is the naming of these top-level fileset folders. There are a range of available options which offer trade-offs in terms of user-friendliness vs potential for interference with existing/other fileset folders. |
cc: @sukunis in case there are any cross-interactions with openlink. |
values.put(fs.getOriginalFile(), set); | ||
for (Object tmp : files) { | ||
Fileset fs = (Fileset) tmp; | ||
File filesetDir = new File(dir.getAbsolutePath()+File.separator+"Fileset_"+fs.getId().getValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method like Files.createDirectories
could be used here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't really make it simpler, for Files.createDirectory
I'd need to create a Path object instead of a File.
Testing the behaviour of the repeated dowload of the same file into the same folder. |
Tested so far
The above went fine. I think that the most logical to try now would be a plate download, but this si not enabled on merge-ci. What is the present workflow to change the config on that server please @sbesson ? |
Downloading https://merge-ci.openmicroscopy.org/web/webclient/?show=plate-4256 and https://merge-ci.openmicroscopy.org/web/webclient/?show=plate-2955 now. The https://merge-ci.openmicroscopy.org/web/webclient/?show=plate-3855 is ending up being many single Filesets, possibly because it is a result of Dataset to Plate script ? |
In case you're interested, I have created a very small (~200 kB) multifile VSI dataset using the original Olympus cellSens software here at our facility. My purpose is to use it for automated testing of the HRM-OMERO connector, hence I tried to have it as small as possible while keeping the multifile structure (once it gets too small, cellSens puts everything into a single Let me know if I should upload it somewhere! |
@pwalczysko Unfortunately I can only log the error if the file already exists. At the moment it's not possible to pass an error message up to the UI level. That would need a lot of changes to the Insight code. |
The plates https://merge-ci.openmicroscopy.org/web/webclient/?show=plate-4256 and https://merge-ci.openmicroscopy.org/web/webclient/?show=plate-2955 downloaded nicely, and the reimport as https://merge-ci.openmicroscopy.org/web/webclient/?show=screen-47354 went fine too. I do not see any problem there. |
Any ideas for further testing formats @sbesson ? Tested so far
|
That would make sense, @pwalczysko. |
The list of formats tested in #264 (comment) should cover a wide-range of scenarios in different imaging domains (HCS, pathology). I assume NDPI included the NPDIS case (i.e. a pathology fileset composed of multiple NDPI files)? |
ndpi was https://merge-ci.openmicroscopy.org/web/webclient/?show=image-103615, which is just ndpi. I will try https://merge-ci.openmicroscopy.org/web/webclient/?show=image-176115 which is ndpis. |
Hi Niko, |
zeiss-lsm see https://merge-ci.openmicroscopy.org/web/webclient/?show=image-210326 the three formats as above successfully downloaded and reimported. All MIFs and looked good after download and also after reimport. This would be the coverage as suggested by comment #264 (comment) We have now
As all files and functional tests passed and the coverage seems rich enough, ready to merge fmpov. |
Nothing else from my side. We might want to wait on the data from #264 (comment) to conduct a final test. Otherwise, I assume we are in a state where we should release OMERO.insight. |
Hi @pwalczysko here you go: https://doi.org/10.5281/zenodo.5848987 Thanks a lot! |
@ehrenfeu Thanks a lot Niko ! Downloaded, imported to OMERO and again downloaded from OMERO using insight from this PR. |
cc: @erickmartins since he may have something similar in his upcoming transfer tools https://twitter.com/erickratamero/status/1482060699539058688 |
We live in a python/ezomero world, so our strategy is to use our (we're probably making our transfer thing public this week. the repo is still private because we didn't want a half-half-baked solution in the hands of people, and because we still need a catchier name than |
@pwalczysko @jburel sorry if I'm repeating myself, I feel like having asked this question before... Is there an automatic build of insight available somewhere that includes this merge? |
I am preparing a release. It should be out next week. But you can try the daily build if you want. It is at https://merge-ci.openmicroscopy.org/jenkins/job/OMERO-insight-build/. That builds does not have the dmg, exe so it is very much for testing purpose |
Thanks J-M, highly appreciated! |
@ehrenfeu https://github.com/ome/omero-insight/releases/tag/v5.7.0 containing this PR is now available |
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/download-vsi-files-from-omero-fails/83314/2 |
Fixes #188, successor of closed PR #217.
I'm very tempted to just keep it simple like this @jburel , ie. with this PR the download option will simply download the OriginalFiles and their directory structure. It won't create a separate directory or rename any files. If a file already exists, it will be skipped. If we make it more complex again, we'll run into all sorts of issues with special cases again.
/cc @ehrenfeu @CellKai
(Note:
Haven't tested it on Windows yetTested on Windows now too, incl. the latest commit)