Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix incremental update #329

Merged
merged 24 commits into from
Dec 19, 2023
Merged

fix incremental update #329

merged 24 commits into from
Dec 19, 2023

Conversation

giovp
Copy link
Member

@giovp giovp commented Jul 30, 2023

starting to work on #293

this PR only partially address the issue, in particular it implements the following:

  • enforce uniqueness of names of elements, by introducing the Elements class
  • remove lazy vs. in-memory logic for adding elements.

@giovp
Copy link
Member Author

giovp commented Aug 1, 2023

heads up, had to introduce a new module 😬 called "elements" (happy to change name).
Thing is, the __setitem__ method should be consistent whether a new element is added directly to spatialdata, or to the elements dict.
e.g.

sdata = SpatiaData(images={"myimage":image})
sdata["myimage"] = image # raise error
sdata.images["myimage"] = image # did not raise error, now it does

I must say I'm not 100% sure this is expected behaviour but at least it is consistent, wdyt? @LucaMarconato

EDIT: lots of tests failing because of this

@giovp
Copy link
Member Author

giovp commented Aug 8, 2023

I've started with option 1 from zulip but now the following example doesn't throw warning

sdata = SpatiaData(images={"myimage":image}, labels={"mylabel":label})
sdata["myimage"] = newimage
>>> UserWarning
sdata.labels["myimage"] = newlabel # does not throw warning, but it should..

need to think of best option

@LucaMarconato
Copy link
Member

@giovp still draft or ready for review?

@giovp giovp marked this pull request as draft August 15, 2023 15:51
@giovp
Copy link
Member Author

giovp commented Aug 15, 2023

still draft but very close!

@codecov
Copy link

codecov bot commented Aug 16, 2023

Codecov Report

Merging #329 (859094c) into main (c620c45) will decrease coverage by 0.16%.
The diff coverage is 93.66%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #329      +/-   ##
==========================================
- Coverage   91.75%   91.59%   -0.16%     
==========================================
  Files          36       37       +1     
  Lines        5007     4997      -10     
==========================================
- Hits         4594     4577      -17     
- Misses        413      420       +7     
Files Coverage Δ
src/spatialdata/_io/io_raster.py 96.29% <ø> (-0.07%) ⬇️
src/spatialdata/_io/io_zarr.py 90.29% <100.00%> (ø)
src/spatialdata/_types.py 76.92% <100.00%> (+10.25%) ⬆️
src/spatialdata/_core/spatialdata.py 92.77% <95.08%> (-1.22%) ⬇️
src/spatialdata/_core/_elements.py 92.00% <92.00%> (ø)

@giovp giovp marked this pull request as ready for review September 29, 2023 09:44
@giovp
Copy link
Member Author

giovp commented Oct 5, 2023

@LucaMarconato @kevinyamauchi this is ready for review !

Copy link
Collaborator

@melonora melonora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor questions / comments. Overall looks good to me and in fact a lot cleaner!

name = os.path.basename(node.metadata["name"])
# TODO: what to do with name? For now remove?
# name = os.path.basename(node.metadata["name"])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say a lot of people actually use this to store information that can't be stored in metadata yet. What is the reason for removing it currently?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @giovp in removing this, using a custom name has two downsides:

  1. it makes the access to tensor inside the xarray object more verbose and non uniform, for instance one needs to write something like sdata['my_image_name']['scale0']['my_image_name'], instead of the standardized sdata['my_image_name']['scale0']['image'].
  2. if the user assignes the image to a different name, like with sdata['my_image2'] = sdata['my_image_name'], the internal name will still renamin the old one, which is confusion.

@giovp for the sake of brevity I would actually call the name im instead of image; also in the case of labels I think im sounds better than image.

sdata.add_shapes(name="image", shapes=shapes)
with pytest.raises(ValueError):
sdata.add_labels(name="image", labels=labels)
with pytest.warns(UserWarning):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to the code but I believe this test would fail on a windows system. We currently do not do CI for windows, but for shapes here only np.int64 is supported while on windows this creates np.int32 dtype.

Shall I open an issue for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please opena an issue for this; we should consider adding a Windows CI at some point in the next months.

tests/conftest.py Show resolved Hide resolved
@melonora
Copy link
Collaborator

just for others after some small QA with Giovanni, in case that the scope of this PR is to also allow for partial writes to disk, this should still be implemented as I believe this PR does not do this yet. In case it is implemented it is currently not covered by the tests yet.

@melonora
Copy link
Collaborator

Also upon testing with napari spatialdata, the userwarning is given every time you click on a coordinate space. I will look into it what causes this behaviour.

Copy link
Collaborator

@kevinyamauchi kevinyamauchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I like the additional Elements class. This could be useful for shimming other actions and validators in the future. Thanks!

src/spatialdata/_core/_elements.py Show resolved Hide resolved
Copy link
Collaborator

@kevinyamauchi kevinyamauchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, last review was meant to be an approval

@giovp giovp marked this pull request as ready for review November 28, 2023 21:05
@LucaMarconato
Copy link
Member

I would like to know if there are any downsides to actually using a class attribute here for _shared_keys. Also more from the perspective of clarity, would a name like _existing_keys be more clear? The keys are not actually shared, in fact they should not be shared as they should be unique.

I agree, I would rename to _existing_keys, wdyt @giovp?

@@ -682,292 +585,6 @@ def transform_to_coordinate_system(
elements[element_type][element_name] = transformed
return SpatialData(**elements, table=sdata.table)

def add_image(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep in mind that this PR removed the commented code here, since it will probably be useful for making the "lazy copy" approach, or similar, work in the future.

@@ -1044,8 +661,7 @@ def write(
# reload the image from the Zarr storage so that now the element is lazy loaded,
# and most importantly, from the correct storage
element_path = Path(self.path) / "images" / name
image = _read_multiscale(element_path, raster_type="image")
self._add_image_in_memory(name=name, image=image, overwrite=True)
_read_multiscale(element_path, raster_type="image")
Copy link
Member

@LucaMarconato LucaMarconato Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be removed. It's the residual of some old code was for reloading lazily those data that could be lazy loaded. It's better for the user to reload things explicitly, maybe. I say maybe because certain softwares, for instance illustrator, change the path the current session is pointing to whenever the file is saved.

@@ -1023,7 +640,7 @@ def write(
# self.path = str(file_path)
# else:
# self.path = str(tmp_zarr_file)
self.path = str(file_path)
self.path = Path(file_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a comment on this. See GitHub discussion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shuold replace this with self._path = Path(file_path)

"""Path to the Zarr storage."""
return self._path

@path.setter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a comment on this setter, not sure if we should disallow it. See GitHub discussion. (I added a similar "See GitHub discussion" comment in other places, they all refer to the same discussion.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think this setter should not be public.

Comment on lines 278 to -316

def test_incremental_io_with_backed_elements(self, full_sdata):
# addressing https://github.com/scverse/spatialdata/issues/137
# we test also the non-backed case so that if we switch to the
# backed version in the future we already have the tests

with tempfile.TemporaryDirectory() as tmpdir:
f = os.path.join(tmpdir, "data.zarr")
full_sdata.write(f)

e = full_sdata.images.values().__iter__().__next__()
full_sdata.add_image("new_images", e, overwrite=True)
# support for overwriting backed images has been temporarily removed
with pytest.raises(ValueError):
full_sdata.add_image("new_images", full_sdata.images["new_images"], overwrite=True)

e = full_sdata.labels.values().__iter__().__next__()
full_sdata.add_labels("new_labels", e, overwrite=True)
# support for overwriting backed labels has been temporarily removed
with pytest.raises(ValueError):
full_sdata.add_labels("new_labels", full_sdata.labels["new_labels"], overwrite=True)

e = full_sdata.points.values().__iter__().__next__()
full_sdata.add_points("new_points", e, overwrite=True)
# support for overwriting backed points has been temporarily removed
with pytest.raises(ValueError):
full_sdata.add_points("new_points", full_sdata.points["new_points"], overwrite=True)

e = full_sdata.shapes.values().__iter__().__next__()
full_sdata.add_shapes("new_shapes", e, overwrite=True)
full_sdata.add_shapes("new_shapes", full_sdata.shapes["new_shapes"], overwrite=True)

# commenting out as it is failing
# f2 = os.path.join(tmpdir, "data2.zarr")
# sdata2 = SpatialData(table=full_sdata.table.copy())
# sdata2.write(f2)
# del full_sdata.table
# full_sdata.table = sdata2.table
# full_sdata.write(f2, overwrite=True)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remember that this code was removed here because we will need to add similar tests when working with incremental IO or with lazy copy.

@LucaMarconato
Copy link
Member

LucaMarconato commented Dec 5, 2023

Thanks @giovp for the PR, the code reads much cleaner and is more robust now! All the comments that I have added above are minor and can be very quickly resolved. The only part that requires some extra attention is what I will discuss in this comment and is about the meaning of sdata.path. Anyway, also this comment has an easy fix (removing the setter for sdata.path) and then we can follow up in the incremental-io PR.

In the previous PR, the sdata.path could either be None or a real path. In the first case the object could have been anything: a pure in-memory object or a mix of a in-memory/lazy-loaded/lazy-computed object. But if read() was called (and read() was automatically called after a write()), the path was then changed to a string, and this guaranteed that the lazy-loadable elements (images, labels and points), were necessarily being lazy loaded from the Zarr store.

In fact, if the user called read() and then sdata.add_points(...), the new points would not be added in-memory, but first written on disk and then lazy read. The advantage of this is the consistency of the meaning of sdata.path (I write it again for clarity: sdata.path is a str if and only if all the images, labels and points elements that are present are lazy loaded from the Zarr substores inside that path). But this came with a big disadvantage, the inability to overwrite lazy loaded data and enough to make the current PR necessary.

But this opens up the question of what's the meaning of sdata.path. I think we can postpone this question when we work on incremental IO, but for the moment I would say that we can do the following:

  • we remove the public setter for sdata.path;
  • sdata.path is automatically set when read() is called;
  • the user after calling write() will probably often want to manually call read() again because otherwise the performance improvement given by Zarr multiscale and parquet are not available. We need to make this clear in an new example notebook to make the user aware. Finally the user will be able to load in-memory specific element so that some elements are lazy loaded and some of them are fully loaded, but again, let's postpone this discussion to the incremental-io PR.

Anyway, since all the comments are minor I am pre-approving the PR, great work! 🚀

@giovp
Copy link
Member Author

giovp commented Dec 17, 2023

thanks @LucaMarconato for the comment, I agree on the plan.

  • I have fixed the setter behavior for images/labels/points/shapes and added tests
  • I have removed the path setter and removed the re-reading after write of the elements.

@giovp giovp merged commit 4eed839 into main Dec 19, 2023
9 checks passed
@giovp giovp deleted the fix/incremental-io branch December 19, 2023 10:59
melonora added a commit that referenced this pull request Jan 17, 2024
* [pre-commit.ci] pre-commit autoupdate (#394)

* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/psf/black: 23.10.1 → 23.11.0](psf/black@23.10.1...23.11.0)
- [github.com/pre-commit/mirrors-prettier: v3.0.3 → v3.1.0](pre-commit/mirrors-prettier@v3.0.3...v3.1.0)
- [github.com/pre-commit/mirrors-mypy: v1.6.1 → v1.7.0](pre-commit/mirrors-mypy@v1.6.1...v1.7.0)
- [github.com/astral-sh/ruff-pre-commit: v0.1.3 → v0.1.6](astral-sh/ruff-pre-commit@v0.1.3...v0.1.6)

* ficx pre-precommit

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: giovp <giov.pll@gmail.com>

* initial tests multi_table design

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add mock new table

* create test class and cleanup

* additional cleanup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* additional cleanup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add pseudo methods

* Change table type in init

* make tables plural and add to validation in __init__

* revert to old public accessor

* Validate each table in dictionary

* iterate dict values

* add comment

* adjust table getter

* Add tables getter

* Fix missing parenthesis

* change to warnings.warn DeprecationWarning

* allow for backward compatibility in init

* [pre-commit.ci] pre-commit autoupdate (#408)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* fix dict subscriptable

* fix string representation of sdata

* add deprecation decorator for future

* Allow for tables not annotating elements

* switch to using tables with deprecation

* fix string representation

* write tables element group

* adjust io to multi_table

* Alter io to give None as default value for spatialdata attrs keys

* add tables setter

* raise keyerror table getter

* remove commented tables setter

* raise keyerror in table deleter

* add deprecation warning

* fix tests

* add DeprecationWarning

* comment test

* change setter into method

* circumvent mappingproxy set issue

* adjust set get test

* add get table keys

* add column getters

* add change set target table

* Give default table name

* fix spatialdata without table

* add int32 because of windows and add docstring

* fix filtering by coordinate system

* Change to Path to not be linux / mac specific

* Change to Path to not be linux / mac specific

* table should annotate existing element

* return table with AnnData having 0 rows

* Adjust for windows

* adjust for accessing table elements

* fix change annotation target

* fix set annotation target

* fix/add tests

* fix init from elements

* fix init from elements tests

* add validation check

* add table validation SpatialData.__init

* fix ruff

* only concatenate if annotating

* change into warning because of filtering

* fix last tests

* adjust to tables

* use tables parameter

* fix some mypy

* some mypy fixes

* some more mypy

* fix another mypy

* circumvent typing error on py3.9

* mypy yet again

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix pre_commit

* down to 12 mypy errors

* down to 1mypy error

* fixed mypy errors

* fix set_table_annotation

* added docstring

* refactor data loader (#299)

Co-authored-by: LucaMarconato <2664412+LucaMarconato@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Luca Marconato <m.lucalmer@gmail.com>

* add documentation

* add documentation

* minor adjustment docstring

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added / adjusted docstrings

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix mypy after merge

* refactor function name

This is to avoid confusion. Many not easily resolved errors are created if we let this function generate table values. This makes it clear that
only spatial element values are generated and not tables. This in opposite to gen_elements which does return tables as well.

* [pre-commit.ci] pre-commit autoupdate (#411)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* small fixes

* added gen_elements docstrings

* tiny comments

* fix ruff pre-commit

* removed types from docstring

* refactor of set_table_annotation_target

* add quotes

* fix (?)

* refactor error messages

* fix incremental update (#329)

Co-authored-by: Wouter-Michiel Vierdag <w-mv@hotmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* add concatenate argument

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add util functions to init

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add util functions to init

* add tables class

* add table class

* add deprecation back

* rename function in tests

* rename function in tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix precommit

* update precommit and remove add_table, store_table and general fixes

* adjust tables init to incremental update pr

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix deletion of deprecated table

* revert filter change

* add public generators

* adjust to public generators

* Find element uses public generator

* add validation in sdata for tables

* add deprecation version number

* fix mypy errors

* Fix backing when deleting table

* Fix mypy

* cleanup

* chance target_element_name to region

* refactor test

* adjust concatenate regarding not concatenating tables

* add utility function

* concatenate if in multiple sdata objects

* minor docstring refactor

* fix import

* concatenate with tables

* cleanup

* fix test

* [pre-commit.ci] pre-commit autoupdate (#415)

updates:
- [github.com/psf/black: 23.11.0 → 23.12.1](psf/black@23.11.0...23.12.1)
- [github.com/pre-commit/mirrors-prettier: v4.0.0-alpha.4 → v4.0.0-alpha.8](pre-commit/mirrors-prettier@v4.0.0-alpha.4...v4.0.0-alpha.8)
- [github.com/pre-commit/mirrors-mypy: v1.7.1 → v1.8.0](pre-commit/mirrors-mypy@v1.7.1...v1.8.0)
- [github.com/astral-sh/ruff-pre-commit: v0.1.7 → v0.1.9](astral-sh/ruff-pre-commit@v0.1.7...v0.1.9)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* fixed typings; made pytest raises explicit

* minor fixes human readable strings

* fixed tests

* fix

* Add changes to changelog

* make private

* Remove commented code

* add cache to ignore

* updated changelog with giovp old pr

* refactor into private function

* Fix docstring

* Fix import

* specify key reuse in docstring

* add orphan_table argument

* Change docstring

* remove todo

* add example

* change concatenate logic

* updated changelog

* Allow force-overwriting existing files (non-backing) (#344)

* Add test for writing unbacked data over existing files

* Protect overwriting existing file only if it is backing file

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Simplify assertion, remove try/except

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed pre-commit

* added get_dask_backing_files(); improved sdata.write with overwrite=True

* fix docs

* changed version in changelog

* fix exception string

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Luca Marconato <m.lucalmer@gmail.com>
Co-authored-by: LucaMarconato <2664412+LucaMarconato@users.noreply.github.com>

* Added error message for removed add_elements functions (#420)

* added error message for removed add_elements functions

* moved _error_message_add_element() to _utils

* added validate and set region key

* fix docstring

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added public functions Spatialdata

* fix tests

* add docstrings

* Added subset API; fix behavior with zero-len table (#426)

* added subset API, returning None instead of empty table for APIs with  filter_table=True

* fix 3.9

* [pre-commit.ci] pre-commit autoupdate (#424)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.9 → v0.1.11](astral-sh/ruff-pre-commit@v0.1.9...v0.1.11)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* remove docstring typehint

* Warn user over overwrite in docstring

* Fix query of 2D/3D data with 2D/3D bounding box (#409)

* wip

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* wip

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix 2d/3d bb for raster data

* support for 2d/3d bb for 2d/3d points

* better tests

* applied suggestions from giovp

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added comments

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit autoupdate (#430)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.11 → v0.1.13](astral-sh/ruff-pre-commit@v0.1.11...v0.1.13)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* updated changelog

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactor subset

* remove todo

* Made _locate_spatial_element public, renamed to locate_element() (#427)

* made _locate_spatial_element public, renamed to locate_element()

* returning path instead of tuple in locate_element()

* updated changelog

* locate_elements() now returns a list

* fix test

* Update test_and_deploy.yaml (#434)

Triggering the tests for pull requests to any branch.

* change docstring

* fix query test

* add todo

* refactor filter_by_coordinate_system

* test filter with keep table

* adjust docstring

* adjust docstring

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: giovp <giov.pll@gmail.com>
Co-authored-by: Giovanni Palla <25887487+giovp@users.noreply.github.com>
Co-authored-by: LucaMarconato <2664412+LucaMarconato@users.noreply.github.com>
Co-authored-by: Luca Marconato <m.lucalmer@gmail.com>
Co-authored-by: aeisenbarth <54448967+aeisenbarth@users.noreply.github.com>
melonora added a commit that referenced this pull request Mar 14, 2024
* initial tests multi_table design (#405)

* initial tests multi_table design

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add mock new table

* create test class and cleanup

* additional cleanup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* additional cleanup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add pseudo methods

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Multi table (#410)

* [pre-commit.ci] pre-commit autoupdate (#394)

* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/psf/black: 23.10.1 → 23.11.0](psf/black@23.10.1...23.11.0)
- [github.com/pre-commit/mirrors-prettier: v3.0.3 → v3.1.0](pre-commit/mirrors-prettier@v3.0.3...v3.1.0)
- [github.com/pre-commit/mirrors-mypy: v1.6.1 → v1.7.0](pre-commit/mirrors-mypy@v1.6.1...v1.7.0)
- [github.com/astral-sh/ruff-pre-commit: v0.1.3 → v0.1.6](astral-sh/ruff-pre-commit@v0.1.3...v0.1.6)

* ficx pre-precommit

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: giovp <giov.pll@gmail.com>

* initial tests multi_table design

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add mock new table

* create test class and cleanup

* additional cleanup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* additional cleanup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add pseudo methods

* Change table type in init

* make tables plural and add to validation in __init__

* revert to old public accessor

* Validate each table in dictionary

* iterate dict values

* add comment

* adjust table getter

* Add tables getter

* Fix missing parenthesis

* change to warnings.warn DeprecationWarning

* allow for backward compatibility in init

* [pre-commit.ci] pre-commit autoupdate (#408)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* fix dict subscriptable

* fix string representation of sdata

* add deprecation decorator for future

* Allow for tables not annotating elements

* switch to using tables with deprecation

* fix string representation

* write tables element group

* adjust io to multi_table

* Alter io to give None as default value for spatialdata attrs keys

* add tables setter

* raise keyerror table getter

* remove commented tables setter

* raise keyerror in table deleter

* add deprecation warning

* fix tests

* add DeprecationWarning

* comment test

* change setter into method

* circumvent mappingproxy set issue

* adjust set get test

* add get table keys

* add column getters

* add change set target table

* Give default table name

* fix spatialdata without table

* add int32 because of windows and add docstring

* fix filtering by coordinate system

* Change to Path to not be linux / mac specific

* Change to Path to not be linux / mac specific

* table should annotate existing element

* return table with AnnData having 0 rows

* Adjust for windows

* adjust for accessing table elements

* fix change annotation target

* fix set annotation target

* fix/add tests

* fix init from elements

* fix init from elements tests

* add validation check

* add table validation SpatialData.__init

* fix ruff

* only concatenate if annotating

* change into warning because of filtering

* fix last tests

* adjust to tables

* use tables parameter

* fix some mypy

* some mypy fixes

* some more mypy

* fix another mypy

* circumvent typing error on py3.9

* mypy yet again

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix pre_commit

* down to 12 mypy errors

* down to 1mypy error

* fixed mypy errors

* fix set_table_annotation

* added docstring

* refactor data loader (#299)

Co-authored-by: LucaMarconato <2664412+LucaMarconato@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Luca Marconato <m.lucalmer@gmail.com>

* add documentation

* add documentation

* minor adjustment docstring

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added / adjusted docstrings

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix mypy after merge

* refactor function name

This is to avoid confusion. Many not easily resolved errors are created if we let this function generate table values. This makes it clear that
only spatial element values are generated and not tables. This in opposite to gen_elements which does return tables as well.

* [pre-commit.ci] pre-commit autoupdate (#411)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* small fixes

* added gen_elements docstrings

* tiny comments

* fix ruff pre-commit

* removed types from docstring

* refactor of set_table_annotation_target

* add quotes

* fix (?)

* refactor error messages

* fix incremental update (#329)

Co-authored-by: Wouter-Michiel Vierdag <w-mv@hotmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* add concatenate argument

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add util functions to init

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add util functions to init

* add tables class

* add table class

* add deprecation back

* rename function in tests

* rename function in tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix precommit

* update precommit and remove add_table, store_table and general fixes

* adjust tables init to incremental update pr

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix deletion of deprecated table

* revert filter change

* add public generators

* adjust to public generators

* Find element uses public generator

* add validation in sdata for tables

* add deprecation version number

* fix mypy errors

* Fix backing when deleting table

* Fix mypy

* cleanup

* chance target_element_name to region

* refactor test

* adjust concatenate regarding not concatenating tables

* add utility function

* concatenate if in multiple sdata objects

* minor docstring refactor

* fix import

* concatenate with tables

* cleanup

* fix test

* [pre-commit.ci] pre-commit autoupdate (#415)

updates:
- [github.com/psf/black: 23.11.0 → 23.12.1](psf/black@23.11.0...23.12.1)
- [github.com/pre-commit/mirrors-prettier: v4.0.0-alpha.4 → v4.0.0-alpha.8](pre-commit/mirrors-prettier@v4.0.0-alpha.4...v4.0.0-alpha.8)
- [github.com/pre-commit/mirrors-mypy: v1.7.1 → v1.8.0](pre-commit/mirrors-mypy@v1.7.1...v1.8.0)
- [github.com/astral-sh/ruff-pre-commit: v0.1.7 → v0.1.9](astral-sh/ruff-pre-commit@v0.1.7...v0.1.9)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* fixed typings; made pytest raises explicit

* minor fixes human readable strings

* fixed tests

* fix

* Add changes to changelog

* make private

* Remove commented code

* add cache to ignore

* updated changelog with giovp old pr

* refactor into private function

* Fix docstring

* Fix import

* specify key reuse in docstring

* add orphan_table argument

* Change docstring

* remove todo

* add example

* change concatenate logic

* updated changelog

* Allow force-overwriting existing files (non-backing) (#344)

* Add test for writing unbacked data over existing files

* Protect overwriting existing file only if it is backing file

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Simplify assertion, remove try/except

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed pre-commit

* added get_dask_backing_files(); improved sdata.write with overwrite=True

* fix docs

* changed version in changelog

* fix exception string

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Luca Marconato <m.lucalmer@gmail.com>
Co-authored-by: LucaMarconato <2664412+LucaMarconato@users.noreply.github.com>

* Added error message for removed add_elements functions (#420)

* added error message for removed add_elements functions

* moved _error_message_add_element() to _utils

* added validate and set region key

* fix docstring

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added public functions Spatialdata

* fix tests

* add docstrings

* Added subset API; fix behavior with zero-len table (#426)

* added subset API, returning None instead of empty table for APIs with  filter_table=True

* fix 3.9

* [pre-commit.ci] pre-commit autoupdate (#424)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.9 → v0.1.11](astral-sh/ruff-pre-commit@v0.1.9...v0.1.11)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* remove docstring typehint

* Warn user over overwrite in docstring

* Fix query of 2D/3D data with 2D/3D bounding box (#409)

* wip

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* wip

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix 2d/3d bb for raster data

* support for 2d/3d bb for 2d/3d points

* better tests

* applied suggestions from giovp

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added comments

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit autoupdate (#430)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.11 → v0.1.13](astral-sh/ruff-pre-commit@v0.1.11...v0.1.13)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* updated changelog

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactor subset

* remove todo

* Made _locate_spatial_element public, renamed to locate_element() (#427)

* made _locate_spatial_element public, renamed to locate_element()

* returning path instead of tuple in locate_element()

* updated changelog

* locate_elements() now returns a list

* fix test

* Update test_and_deploy.yaml (#434)

Triggering the tests for pull requests to any branch.

* change docstring

* fix query test

* add todo

* refactor filter_by_coordinate_system

* test filter with keep table

* adjust docstring

* adjust docstring

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: giovp <giov.pll@gmail.com>
Co-authored-by: Giovanni Palla <25887487+giovp@users.noreply.github.com>
Co-authored-by: LucaMarconato <2664412+LucaMarconato@users.noreply.github.com>
Co-authored-by: Luca Marconato <m.lucalmer@gmail.com>
Co-authored-by: aeisenbarth <54448967+aeisenbarth@users.noreply.github.com>

* Enforce instance key to be dtype int (#444)

* change genes in blobs points

* force instance_id of int dtype

* Change error

* Check unique instance_key values per region

* Join elements table (#445)

* change genes in blobs points

* force instance_id of int dtype

* Change error

* Check unique instance_key values per region

* silence warning

* add left inner join

* change to left join

* add left_exclusive join

* add to Enum

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add inner join

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add right join

* add ugly right exclusive join

* return none instead of empty df

* add left join tests

* test left_exclusive_join

* test inner join

* refactor get_table_keys

* assert valid element in elements_dict

* test warnings

* add fail join tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* silence warning

* add left inner join

* change to left join

* add left_exclusive join

* add to Enum

* add inner join

* fix merge conflict

* Add right join

* add ugly right exclusive join

* return none instead of empty df

* add left join tests

* test left_exclusive_join

* test inner join

* refactor get_table_keys

* assert valid element in elements_dict

* test warnings

* add fail join tests

* fix comments

* lowercase + docstring

* complete docstring

* lowercase enum

* explicit instance_id column in test

* test instance_id and region column

* change error

* add get method

* add contain

* get rid of create_element_dict

* add todo

* adjust enums

* add ability for matching rows

* some cleanup

* make any element none in right exclusive join

* add check of table order conservation

* fix tests

* add match_element_to_table

* change docstring

* revert to old match_table_to_element

* readd match_element_to_table

* added tests for label joins

* add points tests

* fixed type and docstring

* fixed comments and docstrings

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* change typehint

* update changelog with joins

* include join functions

* docs for joins

* added test for right match rows

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add validation of match rows

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix docstring

* fix docs

* fix typo in changelog

* attempt fix docs

* fix docs

* fix docs

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Luca Marconato <m.lucalmer@gmail.com>
Co-authored-by: LucaMarconato <2664412+LucaMarconato@users.noreply.github.com>

* fix tests

* fix docs

* add possibility for custom table name (#459)

* add possibility for custom table name

* change docstring

* updated changelog

* added table_name to SpatialData.aggregate()

---------

Co-authored-by: Luca Marconato <m.lucalmer@gmail.com>

* Update locate values (#460)

* add possibility for custom table name

* change docstring

* updated changelog

* added table_name parameter

* update changelog

---------

Co-authored-by: LucaMarconato <2664412+LucaMarconato@users.noreply.github.com>

* Filter table annotate (#462)

* add get_element_annotator

* add docstring

* add test

* wip get_centroids

* implemented get_centroids()

* made _assert_spatialdata_objects_seem_identical() into a util

* fix docs, attemp

* allow table to be None in get_values and _locate_values (#466)

* allow table to be None

* fixes to aggregate

* check annotation

* adjust docstrings

* remove table check

* add sphinx_pytest

* Added `validate_table_annotation_target()` (#468)

* added validate_table_annotation_target()

* removed test that is no longer relevant

* fix docs

* fixed way to test no warnings emitted

* merged validate_table_annotation_target() into validate_table_in_spatialdata()

* fix test

* silence warning

* fix gettin dtype from multiscale

* add else dtype back

* silence scipy.misc.face deprecation

* Operation `to_circles()` (#473)

* `get_centroids()` (#465)

* wip get_centroids

* implemented get_centroids()

* code suggestions from kevin

* wip vectorize.py

* added vectorize; still no tests

* refactored testing functions; wip tests to_circles()

* considering removing the target_coordinate_system parameter in to_circles()

* adjusted tests, docs, changelog

* added changelog

* fix tests, remove inject_docs

* fix sphinx

* attempt fix sphinx

* deedcopy() utils function (#480)

* deedcopy() utils function

* fixed missings seeds for default_rng()

* wip fix

* wip fix

* fix bug due to data being computed in-place and then failing validation

* add pooch requirement

* Update src/spatialdata/_core/_deepcopy.py

* Update src/spatialdata/_core/_deepcopy.py

* Update src/spatialdata/_core/_deepcopy.py

* Update src/spatialdata/_core/_deepcopy.py

---------

Co-authored-by: Wouter-Michiel Vierdag <w-mv@hotmail.com>

* fix bug deepcopy() of points wrong columns order

* workaround wrong order points columns after deepcopy

* rechunking raster data after spatial query (#479)

* rechunking raster data after spatial query

* using xarray chunk() instead of dask rechunk()

* Test joins with string indices and instance id (#485)

* test join strings

* fix dtype aggregate

---------

Co-authored-by: Luca Marconato <m.lucalmer@gmail.com>

* cleanup tests

* remove comments

---------

Co-authored-by: Wouter-Michiel Vierdag <w-mv@hotmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: giovp <giov.pll@gmail.com>
Co-authored-by: Giovanni Palla <25887487+giovp@users.noreply.github.com>
Co-authored-by: aeisenbarth <54448967+aeisenbarth@users.noreply.github.com>
Co-authored-by: Wouter-Michiel Vierdag <michiel.vierdag@embl.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants