dummy_anndata generates data types not supported by anndata #12

Open
rcannood opened this issue Nov 15, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@rcannood
Member

rcannood commented Nov 15, 2024

Report

A recent release of AnnData (0.11.1 is installed here, see the version information below) rejects pandas extension arrays (pandas.core.arrays) in obsm. Is this a false positive on their side (since numpy.ndarray is allowed), or should we change dummy_anndata?

import dummy_anndata as da

adata = da.generate_dataset(
  x_type=None,
  layer_types=[],
  obs_types=[],
  var_types=[],
  obsm_types=["categorical_missing_values"],
  varm_types=[],
  obsp_types=[],
  varp_types=[],
  uns_types=[],
  nested_uns_types=[],
)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rcannood/.local/lib/python3.12/site-packages/dummy_anndata/generate_dataset.py", line 165, in generate_dataset
    return ad.AnnData(
           ^^^^^^^^^^^
  File "/home/rcannood/.local/lib/python3.12/site-packages/anndata/_core/anndata.py", line 252, in __init__
    self._init_as_actual(
  File "/home/rcannood/.local/lib/python3.12/site-packages/anndata/_core/anndata.py", line 442, in _init_as_actual
    self.obsm = obsm
    ^^^^^^^^^
  File "/home/rcannood/.local/lib/python3.12/site-packages/anndata/_core/aligned_mapping.py", line 435, in __set__
    _ = self.construct(obj, store=value)  # Validate
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rcannood/.local/lib/python3.12/site-packages/anndata/_core/aligned_mapping.py", line 408, in construct
    return self.cls(obj, axis=self.axis, store=store)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rcannood/.local/lib/python3.12/site-packages/anndata/_core/aligned_mapping.py", line 297, in __init__
    super().__init__(parent, store=store)
  File "/home/rcannood/.local/lib/python3.12/site-packages/anndata/_core/aligned_mapping.py", line 210, in __init__
    self._data[k] = self._validate_value(v, k)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rcannood/.local/lib/python3.12/site-packages/anndata/_core/aligned_mapping.py", line 279, in _validate_value
    return super()._validate_value(val, key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rcannood/.local/lib/python3.12/site-packages/anndata/_core/aligned_mapping.py", line 100, in _validate_value
    return coerce_array(val, name=name, allow_df=self._allow_df)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rcannood/.local/lib/python3.12/site-packages/anndata/_core/storage.py", line 66, in coerce_array
    raise ValueError(msg) from e
ValueError: Obsm 'categorical_missing_values' needs to be of one of <class 'numpy.ndarray'>, <class 'numpy.ma.core.MaskedArray'>, <class 'scipy.sparse._csr.csr_matrix'>, <class 'scipy.sparse._csc.csc_matrix'>, <class 'scipy.sparse._base.sparray'>, <class 'anndata.compat.AwkArray'>, <class 'h5py._hl.dataset.Dataset'>, <class 'zarr.core.Array'>, <class 'anndata.compat.ZappyArray'>, <class 'anndata.abc.CSRDataset'>, <class 'anndata.abc.CSCDataset'>, <class 'dask.array.core.Array'>, <class 'anndata.compat.CupyArray'>, or <class 'anndata.compat.CupySparseMatrix'>, not <class 'pandas.core.arrays.categorical.Categorical'>.
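
For readers who want to see why the value is rejected without installing dummy_anndata, here is a minimal sketch using only numpy and pandas. It assumes the generated entry is a plain pandas Categorical with a missing value (the example values are made up) and checks it against the first type named in the error message.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionArray

# Illustrative stand-in for the "categorical_missing_values" entry;
# the actual values produced by dummy_anndata may differ.
cat = pd.Categorical(["a", "b", None, "a"])

# A Categorical is a pandas extension array, not a numpy.ndarray subclass,
# so an isinstance-based whitelist of array types will not match it.
is_ndarray = isinstance(cat, np.ndarray)
is_extension_array = isinstance(cat, ExtensionArray)

print(type(cat).__module__, type(cat).__name__)
print("numpy.ndarray:", is_ndarray)
print("pandas ExtensionArray:", is_extension_array)
```

This only illustrates the type mismatch; it does not reproduce AnnData's own validation logic.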

Version information

-----
anndata             0.11.1
dummy_anndata       0.0.2
pandas              2.2.2
session_info        1.0.0
-----
abrt_exception_handler3     NA
asciitree                   NA
cloudpickle                 3.0.0
cython_runtime              NA
cytoolz                     0.12.2
dask                        2024.8.1
dateutil                    2.8.2
dill                        0.3.8
h5py                        3.10.0
jinja2                      3.1.2
markupsafe                  2.1.3
msgpack                     1.0.6
natsort                     8.4.0
numcodecs                   0.13.0
numpy                       1.26.2
packaging                   24.1
paste                       NA
psutil                      5.9.8
pyarrow                     15.0.0
pydot                       2.0.0
pyparsing                   3.1.2
pytz                        2023.3.post1
scipy                       1.12.0
six                         1.16.0
sphinxcontrib               NA
systemd                     NA
tblib                       3.0.0
tlz                         0.12.2
toolz                       0.12.0
torch                       2.2.0+rocm5.7
torchgen                    NA
tqdm                        4.66.5
typing_extensions           NA
xxhash                      NA
yaml                        6.0.1
zarr                        2.18.2
-----
Python 3.12.7 (main, Oct  1 2024, 00:00:00) [GCC 14.2.1 20240912 (Red Hat 14.2.1-3)]
Linux-6.11.5-200.fc40.x86_64-x86_64-with-glibc2.39
-----
Session information updated at 2024-11-15 09:25
@rcannood rcannood added the bug Something isn't working label Nov 15, 2024
rcannood added a commit to scverse/anndataR that referenced this issue Nov 15, 2024
rcannood added a commit to scverse/anndataR that referenced this issue Nov 15, 2024
* migrate obsm varm tests, skip if known issue

* update known issues

* fix linting issues

* add yaml to suggests

* fix error message

* add obsm varm as well

* port layers

* port X

* add note

* also include arrays in obsmvarm

* fix obsvar

* fix styling

* skip categorical

* Update tests/testthat/test-InMemoryAnnData.R

Co-authored-by: Luke Zappia <lazappi@users.noreply.github.com>

* Update tests/testthat/test-roundtrip-obsvar.R

* add temporary workaround for data-intuitive/dummy-anndata#12

---------

Co-authored-by: Luke Zappia <lazappi@users.noreply.github.com>
@LouiseDck
Collaborator

LouiseDck commented Nov 20, 2024

It does make sense: pandas arrays should be 1D I think? So you wouldn't try to put them in .obsm?

This restriction is not spelled out in their on-disk format specification. However, the specification does state: "The group MAY contain a mapping obsm. Entries in obsm MUST be sparse arrays, dense arrays, or dataframes. These entries MUST have a first dimension of size n_obs." Going by that wording, I think it should be possible to put a pandas array, or indeed any 1D array, in there.

I could reach out to verify that this was an intentional change on their part?
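
Since the wording quoted above lists dataframes as valid obsm entries, one option might be to wrap the 1D categorical in a single-column DataFrame. The sketch below uses pandas only; the observation names and column name are hypothetical, and whether AnnData accepts this exact frame in obsm has not been checked here.

```python
import pandas as pd

# Hypothetical observation names; in practice these would be adata.obs_names.
obs_names = ["cell_0", "cell_1", "cell_2", "cell_3"]
cat = pd.Categorical(["a", "b", None, "a"])

# pandas extension arrays such as Categorical are one-dimensional.
print(cat.ndim)

# Wrapping the array in a DataFrame keeps the categorical dtype and the
# missing value, and gives the entry a first dimension of size n_obs.
obsm_entry = pd.DataFrame({"categorical_missing_values": cat}, index=obs_names)
print(obsm_entry.shape)
print(obsm_entry.dtypes)

# Assumption: with anndata installed, this would then be assigned as
#   adata.obsm["categorical_missing_values"] = obsm_entry
```

The index is set to the observation names on the assumption that a dataframe stored in obsm needs to line up with the observations.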

@rcannood
Member Author

I'd say it'd be fine to just fix dummy_anndata accordingly, but feel free to reach out to anndata for further clarification -- I can imagine there is probably already an issue about this
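
As a rough idea of what "fixing dummy_anndata accordingly" could look like, here is a sketch of a conversion step applied to obsm/varm values before they are handed to AnnData. The helper name `coerce_for_obsm` is hypothetical and is not part of dummy_anndata; the choice to convert to a 2D object-dtype numpy array is just one possibility, and whether such an array round-trips to disk is untested.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionArray


def coerce_for_obsm(value):
    """Hypothetical helper: turn pandas extension arrays into numpy arrays.

    Extension arrays (e.g. Categorical) become a 2D object-dtype ndarray
    with one column; every other value is returned unchanged.
    """
    if isinstance(value, ExtensionArray):
        return np.asarray(value, dtype=object).reshape(-1, 1)
    return value


# Illustrative inputs, not the actual dummy_anndata generators.
categorical = pd.Categorical(["a", None, "b"])
dense = np.zeros((3, 2))

converted = coerce_for_obsm(categorical)
print(type(converted), converted.shape)

# Values that are already numpy arrays pass through untouched.
print(coerce_for_obsm(dense) is dense)
```

The simpler alternative would be to drop the pandas-array generators from the obsm and varm type lists entirely, which avoids deciding how categories and missing values should be represented in a numpy array.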


2 participants