Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add function to export spatialdata to legacy anndata format #102

Merged
merged 30 commits into from
Mar 20, 2024

Conversation

grst
Copy link
Contributor

@grst grst commented Jan 25, 2024

As discussed in #47 and nf-core/spatialvi#39 (comment)

@codecov-commenter
Copy link

codecov-commenter commented Jan 25, 2024

Codecov Report

Attention: Patch coverage is 53.19149% with 66 lines in your changes are missing coverage. Please review.

Project coverage is 37.13%. Comparing base (d58382b) to head (3a4f87a).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #102      +/-   ##
==========================================
+ Coverage   35.27%   37.13%   +1.85%     
==========================================
  Files          16       18       +2     
  Lines        1219     1360     +141     
==========================================
+ Hits          430      505      +75     
- Misses        789      855      +66     
Files Coverage Δ
src/spatialdata_io/experimental/__init__.py 100.00% <100.00%> (ø)
src/spatialdata_io/converters/legacy_anndata.py 52.51% <52.51%> (ø)

@grst
Copy link
Contributor Author

grst commented Jan 25, 2024

should I be adding tests anywhere?

@LucaMarconato
Copy link
Member

LucaMarconato commented Jan 25, 2024

Thank you @grst for the PR. I am adding a from_legacy_anndata() function so that I can setup some tests of bidirectional conversion on artificial data.

One note on your implementation. I see that the code assumes that adata.uns is populated with dataset information. This is generally not the case, so I was thinking on generalizing your function, especially for the non-Visium case. I would change the signature by adding a new coordinate_system parameter. I explain how the implementation will change in the docstring below:

def to_legacy_anndata(sdata: SpatialData, coordinate_system: str | None = None) -> AnnData:
    """
    Convert a SpatialData object to a (legacy) spatial AnnData object.

    This is useful for using packages expecting spatial information in AnnData, for example Scanpy and older versions
    of Squidpy. Using this format for any new package is not recommended.

    Parameters
    ----------
    sdata
        SpatialData object
    coordinate_system
        The coordinate system to consider. The AnnData object will be populated with the data aligned to this coordinate
        system.

    Returns
    -------
    The legacy spatial AnnData object

    Notes
    -----
    Limitations and edge cases:

        - The Table can only annotate Shapes, not Labels. If labels are needed, please manually compute the centroids,
          approximate radii from the Labels areas, add this as a Shapes element to the SpatialData object and make the
          table annotate the Shapes element.
        - The Table cannot annotate more than one Shapes element. If more than one Shapes element are present, please
          merge them into a single Shapes element and make the table annotate the new Shapes element.
        - Table rows not annotating geometries in the coordinate system will be dropped. Similarly, Shapes rows
          not annotated by Table rows will be dropped.

    How resolutions, coordinates, indices and Table rows will be affected:

        - The Images will be scaled (see later) and will not maintain the original resolution.
        - The Shapes centroids will change iff their transformation is not an Identity.
        - The Table and Shapes rows will have the same indices but, as mentioned above, some rows may be dropped.

    The AnnData object does support only low-resolution images and limited coordinate transformations; eventual
    transformations are applied before conversion. Precisely, this is how the Images are prepared for the AnnData
    object:

        - For each Image aligned to the specified coordinate system, the coordinate transformation to the coordinate
          system is applied; all the Images are rendered to a common target bounding box, which is the bounding box of
          the union of the Images aligned to the target coordinate system.
        - Practically the above implies that if the Images have very dissimilar locations, most the rendered Images will
          have empty spaces.
        - Each Image is downscaled during rendering to fit a 2000x2000 pixels image (downscaled_hires) and a 600x600
          pixels image (downscaled_lowres).
    """
    pass

What do you think about this? If you agree on the above I will continue your PR and proceed with the implementation.

@LucaMarconato
Copy link
Member

Actually I realized that even if the legacy AnnData is a format that we don't aim to support and we don't encourage, the conversion functions and the tests would be similar to the ones that would be required for supporting bidirectional conversion to emObject and voyager-py objects. Therefore I am going to put more time into this and prepare exhaustive tests.

I won't have time tomorrow, I will get back on ~monday.

@grst
Copy link
Contributor Author

grst commented Jan 26, 2024

Sounds good, I'm just wondering how generalizable this function needs to be. Has the "legacy AnnData" format been used for anything but visium?

@LucaMarconato
Copy link
Member

Yes it has been for many techs, for instance in the legacy squidpy datasets and with SODB. Anyway I'll try to not overgeneralize: I want to lay down some foundations for eventual converters for other libraries but I'll still keep it focused on the legacy anndata.

@LucaMarconato LucaMarconato marked this pull request as draft February 13, 2024 15:13
@LucaMarconato LucaMarconato linked an issue Mar 18, 2024 that may be closed by this pull request
@LucaMarconato LucaMarconato marked this pull request as ready for review March 18, 2024 18:36
@LucaMarconato
Copy link
Member

LucaMarconato commented Mar 18, 2024

I finally finalized this, ready for review 😊

An example of the converter functions in action is in this notebook (updated version of the "squidpy interoperability" notebook in the docs): https://github.com/scverse/spatialdata-notebooks/blob/improve/docs/notebooks/examples/squidpy_integration.ipynb.

Some notes:

  • The function is under the submodule experimental
  • One of the cells fail because of a PR in spatialdata-plot not being merged yet, but the example is valid anyway, and the PR will be ready soon
  • The tests are failing because we need to cut a release in spatialdata, but they pass locally
  • Not sure what's wrong with the docs, if anybody can have a look please go ahead; otherwise I will try later on when the tests pass.

Comment on lines +1 to +6
from spatialdata_io.converters.legacy_anndata import (
from_legacy_anndata,
to_legacy_anndata,
)

__all__ = ["from_legacy_anndata", "to_legacy_anndata"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary for a package with version number 0.0.x?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. The reason why we put experimental (which we use also here: #75) is because we are not super sure about the need/utility of the code and because we would like to get more feedback from the users before adjusting things to a "definitive version" or dropping the support.

Copy link
Member

@LucaMarconato LucaMarconato Mar 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Therefore even if the 0.0.x indicates that the library is still, in a way, work in progress, I would still mark those parts of the code as experimental to suggest extra care/expect that things may not work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I would probably argue that you should be 0.x since some time ;) Actually, the semver spec even suggest to start at 0.1.0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, another good point 😁 We actually want to release a 0.1.0 this week, so we'll fix that!

@LucaMarconato
Copy link
Member

I think I fixed the problem with the docs. Now all the errors (docs and tests) are due to the lack of a release in spatialdata.

@LucaMarconato
Copy link
Member

I made a pre-release in spatialdata and changed the CI configuration to run the tests and build the docs against the latest pre-release. Now all the checks pass.

I will merge already because the code is marked as experimental; happy to eventually reiterate on this.

@LucaMarconato LucaMarconato merged commit 8b6c01c into main Mar 20, 2024
6 checks passed
@LucaMarconato LucaMarconato deleted the feat/to_legacy_anndata branch March 20, 2024 13:10
lucas-diedrich pushed a commit to lucas-diedrich/spatialdata-io that referenced this pull request Nov 26, 2024
Add function to export spatialdata to legacy anndata format
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

anndata2spatialdata conversion
3 participants