release v0.3.0 #273

davidsebfischer · 2021-05-20T08:19:02Z

No description provided.

* add plot_npc and plot_active_latent_units (#9)
* add plot_npc and plot_active_latent_units
* make sure handling of z and z_mean is consistent for VAE embeddings
* clean up and documentation
* formatting
Co-authored-by: Martin König <martin.koenig@ymail.com>
Co-authored-by: le-ander <20015434+le-ander@users.noreply.github.com>
* added data loader for interactive workflows with unprocessed data
* made cell type loading optional in dataset .load()
* enabled usage of type estimator on data without labels in prediction mode
* recursively search custom model repo for weights files
* sort model lookuptable alphabetically before writing it
* make sure mode_path is set correctly in model_lookuptable when recursive weights loading is used
* fix os.path.join usage in dataloaders
* replace path handling through string concatenations with os.path.join and f-strings
* fix bug in lookup table writing
* add model file path to lookup table
* reset index in model lookuptable before saving
* add method to user interface for pushing local model weights to zenodo
* fix bug in user interface
* fix bug in summaries.py
* use absolute model paths when model_lookuptable is used
* fix bug in pretrained weights loading
* fix bug in pretrained weights loading
* automatically create an InteractiveDataset when loading data through the UI
* fix bug in UI data loading
* Explicitly cast indices and indptr of final backed file to int64. (#17) For the background on this: scverse/anndata#453
* update human lung dataset doi
* align mouse organ names with human organ names
* fix typo in trachea organ naming in mouse
* rename mouse ovary organ to femalegonad
* rename mouse ovary organ to femalegonad
* sort by model type in classwise f1 heatmap plot
* another hacky solution to ensure a summary tab can be created when both vae and other models are loaded at once
* allow custom metadata in zenodo submission
* do not return doi but deposit url after depositing to zenodo sandbox as dois don't work on sandbox
* updated model zoo description
* recognise all .h5 and .data-0000... files as sfaira weights when constructing lookuptable
* Update README.rst
* Add selu activation and lecun_normal weight_init scheme for human VAEVAMP. (#19)
* update sfaira repo url and handle .h5 extension in model lookuptable id
* add meta_data download information to all human dataloaders
* updated docs
* updated reference to README in docs
* updated index
* included reference to svensson et al data base in docs
* fixed typo in docs
* fixed typos in docs
* restructured docs
* fixed bug in reference roadmap in docs
* updated data and model zoo description
* added summary picture into index of docs
* fixed typo in docs
* updated summary panel
* add badges to readme and docs index
* updated summary panel (#20)
* Doc updates (#21)
* updated summary panel
* fixed concept figure references
* Doc updates (#22)
* updated zoo panels
* move from `import sfaira.api as sfaira` to `import sfaira` and from `import sfaira_extension.api as sfairae` to `import sfaira_extension`
* add custom genomes to sfaira_extension
* fix loading of custom topology versions from sfaira_extension
* fix circular imports between sfaira_extension and sfaira
* fix dataloader
* fix celltype versioning through sfaira_extension
* fix celltype versioning through sfaira_extension
* formatting
* Doc updates (#25)
* added mention of download scripts into docs
Co-authored-by: mk017 <martin.koenig@tum.de>
Co-authored-by: Martin König <martin.koenig@ymail.com>
Co-authored-by: le-ander <20015434+le-ander@users.noreply.github.com>
Co-authored-by: Abdul Moeed <abdulmoeed444@gmail.com>
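
Several items in the squashed message above concern how weights files are discovered: a recursive search of the model repo, recognition of `.h5` and `.data-0000...` files, `os.path.join` instead of string concatenation, absolute paths, and alphabetical sorting. The sketch below illustrates that combination under stated assumptions. `find_weight_files` and the toy directory layout are hypothetical and are not taken from sfaira's code.

```python
import os
import tempfile


def find_weight_files(repo_dir):
    """Recursively collect paths that look like weights files (hypothetical helper)."""
    hits = []
    for root, _dirs, files in os.walk(repo_dir):
        for name in files:
            # Assumption: a file counts as weights if it ends in ".h5" or
            # contains the ".data-0000" prefix quoted in the commit message.
            if name.endswith(".h5") or ".data-0000" in name:
                # Build paths with os.path.join and store them as absolute paths.
                hits.append(os.path.abspath(os.path.join(root, name)))
    # Sort alphabetically so a lookup table built from this list is stable.
    return sorted(hits)


with tempfile.TemporaryDirectory() as repo:
    os.makedirs(os.path.join(repo, "embedding", "v1"))
    for rel in (os.path.join("embedding", "v1", "model_b.h5"), "model_a.h5", "notes.txt"):
        with open(os.path.join(repo, rel), "w") as fh:
            fh.write("")
    found = [os.path.basename(p) for p in find_weight_files(repo)]
```

Matching on the visible `.data-0000` prefix only is a deliberate simplification, since the commit message elides the rest of the file name.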

* improved and modified meta data usage
* fixed download website to be able to contain multiple entries in meta data saving
* added support for variable uns and obs setting of data meta data that can be cell specific
* added support to check for equality of column entries to condition string to define healthy into super method
* moved _mapped_features into constants
* introduced cell ontology obs keys into base class
* updated cellxgene data loader to new constant field standards
* coupled .annotated to obs keys that indicate presence of cell types
* added dev stage and age meta data
* refactored UNS_STRING_META_IN_OBS as constant
* updated cellxgene data loader to new data loading format
* updated data loaders to new format
* updated interactive data set
* grouped data loaders by study
* made all data loaders raw data loaders
* added new auto caching
* added download method into base class
* removed separate caching from data loaders
* use "organism" instead of "species" across sfaira and optimise imports
* fixed example code block
* excluded data loader tests from git path
* took out external.py files from data loaders to make directories leaner
* added directory oriented, automated data set groups
* added sfaira wide super groups and super groups nesting
* enabled parallelised loading
* refactored unknown cell type identification
* added xlrd dependency
* added github workflows
* homogenized string style to ""
* enabled rapid raw loading of groups saved in one object
* remove superfluous raw loading docstrings
Co-authored-by: le-ander <20015434+le-ander@users.noreply.github.com>
Co-authored-by: Lukas Heumos <lukas.heumos@posteo.net>

* adapt meta_writing to new streamlining backend * fix handling of celltype labels in write_meta * fix donload_url_meta property setter * handle combinatorial batch keys correctly in write_meta * handle combinatorial batch keys correctly in write_meta * handle combinatorial batch keys correctly in write_meta

#255) * fixed store to estimator interface and added further unit tests on store and estimators * fixed zoo TopologyContainer interface and added .obs into store * adapted zoo handling and model ontology; switched to modelclass_name_provider to be more generalisable; modelclass is the only constant beyond provider that is really essential because it maps to a unique estimator class * updated store subsetting * fixed cached_store_writing for testing * fixed store-estimator bugs * stabilised size factor computation, fixed store unit tests and added dense conversion for feature indexing in store generator * improved efficiency of generator queries of store and generator usage by estimator keras * id is retained if uns_to_obs is True in streamlining * removed continuous batches option from store generator

* reduced indices in store to non empty indices * fixed unit test * extended store class documentation * added additional edges to UBERON wrapper class * updated DAG break warning in ontology classes * made empty argument error in get_effective_leaves more verbose * warning instead of error if empty sets are passed to target universe writing * removed empty configs * caught non annotated data sets * fix out path in target writing and some syntax * enforced subsetting by selected indices within data set in target universe writing script Co-authored-by: le-ander <20015434+le-ander@users.noreply.github.com>

* add polioudakis data * fix download function * add manual download for polioudakis * fix polioudakis * fix maps writing * update polioudakis * add celltype ids * fix cli print * update manual download workflow * fix manual download * fix download url polioudakis

review-notebook-app · 2021-05-20T08:19:09Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

* fixes training-related bugs * added zarr-dask distributed access store * fixed indexing bug in h5ad store * fixed store unit test and renamed unknown meta data field in sfaira streamlining * added optional custom cell ontology to estimator constructor (#254) * fix handling of new model_id format * clean up model zoo * fixed grid search summaries * fix celltype handling in UI * include topology files in zenodo deposition Co-authored-by: le-ander <20015434+le-ander@users.noreply.github.com>

github-actions · 2021-05-28T14:59:52Z

Hi @davidsebfischer,

It looks like this pull request has been made against the theislab/sfaira master branch.
The master branch should always contain code from the latest release.
Because of this, PRs to master are only allowed if they come from any theislab/sfaira release or patch branch.

You do not need to close this PR, you can change the target branch to development by clicking the "Edit" button at the top of this page.

Thanks again for your contribution!

davidsebfischer · 2021-05-28T15:04:25Z

commit 950abad was a force push to resolve the conflicts with master.

davidsebfischer and others added 30 commits October 8, 2020 14:19

added first version of cellxgene data format loader

b4e12a0

refactored anndata field entries from data loaders to be named in sep…

27fa462

…arate consts object

adapted adata field refactoring in data base class

4b082e5

updated cellxgene data loader to use refactored constants for adata f…

8aa4ef6

…ields

updated missing refactored gene id fields in data loaders

2a145fd

refactored adata fields constant container classes to reflect core sh…

d70f524

…ared features

updated old usages of ADATA_IDS to ADATA_IDS_SFAIRA

4a280d0

added constants based classes into api to improve interfacing to 3rd…

16e9aba

… parties

refactored lazy dataset properties and meta data objects

60f5bce

lazy datasets now draw from either properties defined in constructor or available in a meta data file. meta data files are streamlined, both in loading and saving.
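
The behaviour described in this commit, a dataset property that is served either from a constructor argument or from a meta data file, can be sketched roughly as follows. This is an illustration only: the `LazyDataset` class, the CSV layout, and the rule that the constructor value takes precedence are assumptions, not sfaira's actual implementation.

```python
import csv
import os
import tempfile


class LazyDataset:
    """Toy dataset whose properties fall back to a meta data file (hypothetical)."""

    def __init__(self, meta_fn=None, organ=None):
        self.meta_fn = meta_fn
        self._organ = organ  # value passed to the constructor, if any
        self._meta = None  # the meta data file is read only on first access

    @property
    def meta(self):
        if self._meta is None and self.meta_fn is not None and os.path.exists(self.meta_fn):
            with open(self.meta_fn, newline="") as fh:
                # Assumption: one header row followed by one row of values.
                self._meta = next(csv.DictReader(fh))
        return self._meta

    @property
    def organ(self):
        # Assumption: the constructor value wins over the meta data file.
        if self._organ is not None:
            return self._organ
        return None if self.meta is None else self.meta.get("organ")


with tempfile.TemporaryDirectory() as tmp:
    fn = os.path.join(tmp, "meta.csv")
    with open(fn, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["organ"])
        writer.writeheader()
        writer.writerow({"organ": "lung"})
    from_file = LazyDataset(meta_fn=fn).organ
    from_ctor = LazyDataset(meta_fn=fn, organ="liver").organ
    missing = LazyDataset().organ
```

Reading the file lazily means that constructing many dataset objects stays cheap, and the file is touched only when a property actually needs it.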

renamed remaining instances of "animal" into "species"

1e32103

allowed maps of meta data file nomenclature

77a6dff

moved meta data code in DatasetBase for readability

9f844c7

introduced meta_fn attribute of dataset class and deprecated 3rd par…

d3e356a

…ty database via meta objects

added datasetgroup subsetting based on meta / lazy properties

4f988a9

Merge branch 'master' into cellxgene_loader

facab5b

Merge branch 'master' into cellxgene_loader

4b85cbb

Merge pull request #10 from theislab/cellxgene_loader

365739c

Cellxgene loader

fixed missing import in mouse trachea (#29)

af3e575

* fixed missing import in mouse trachea * fixed meta data accession and added automatic subsetting of datasetgroups during loading subsets to available data sets * deprecated api and added consts into __init__ api

pass paths correctly to extension datasets

273d7b6

Dataloading fix (#33)

6a37f0b

fixed unit tests and fixed adata field related bugs * updated order of data sets into meta loading dict to be alphabetic * fixed bug in setting pandas dataframe index * deprecated kipoi test * fixed vae unit test * many: field related bugs

self._convert_and_set_var_names(symbol_col=ADATA_IDS_SFAIRA.gene_id_n…

198350e

…ames, ensembl_col=ADATA_IDS_SFAIRA.gene_id_ensembl) uses strings now (#35)

fixed remaining instance of has_celltypes (#36)

ac24ca9

Has celltypes bug (#37)

1356856

* fixed remaining instance of has_celltypes * fixed fields reference

Data generator for model evaluation and prediction. (#46)

582e6c8

* Refactor __get_dataset() to always return data generator. Modify functions to handle generator. * Remove unnecessary sparse tensor conversion.
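
As a rough illustration of the refactor described here, where the data access function always returns a generator and its callers iterate over it, consider the sketch below. The names `get_dataset` and `predict` and the batching scheme are hypothetical and are not sfaira's private `__get_dataset()`.

```python
def get_dataset(rows, batch_size):
    """Always yield batches, even when all rows fit into one batch (hypothetical)."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def predict(rows, batch_size=2):
    # Callers consume the generator batch by batch instead of expecting one array.
    return [sum(batch) for batch in get_dataset(rows, batch_size)]


totals = predict([1, 2, 3, 4, 5])
single = predict([1, 2], batch_size=10)
```

Returning a generator in every case means that evaluation and prediction code needs only one code path, whatever the data size.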

extended data loader documentation (#77)

700cd37

* FAQ with contributions from @xlancelottx and @lauradmartens * more detailed step-by-step explanation of loader contribution

added development FAQ section (#79)

f95ea89

update cached reading to only warn and not throw errors if cached rea…

03d590f

…ding / writing fails
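
The change named in this commit title, warning instead of raising when cached reading or writing fails, could look roughly like the sketch below. It uses `pickle` and a hypothetical `load_with_cache` helper purely for illustration. The cache format and function names that sfaira actually uses are not shown in this pull request.

```python
import os
import pickle
import tempfile
import warnings


def load_with_cache(cache_fn, loader):
    """Return cached data if readable, else call loader(); cache problems only warn."""
    if os.path.exists(cache_fn):
        try:
            with open(cache_fn, "rb") as fh:
                return pickle.load(fh)
        except Exception as e:
            # A broken cache should not abort loading: warn and fall through.
            warnings.warn(f"cached reading failed, loading from raw data: {e}")
    data = loader()
    try:
        with open(cache_fn, "wb") as fh:
            pickle.dump(data, fh)
    except Exception as e:
        # Failing to write the cache is also non-fatal.
        warnings.warn(f"cached writing failed: {e}")
    return data


with tempfile.TemporaryDirectory() as tmp:
    cache_fn = os.path.join(tmp, "cache.pickle")
    with open(cache_fn, "wb") as fh:
        fh.write(b"\x00corrupt")  # simulate an unreadable cache file
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = load_with_cache(cache_fn, lambda: {"n_obs": 3})
    n_warnings = len(caught)
    # The second call reads the cache that the first call rewrote.
    reloaded = load_with_cache(cache_fn, lambda: {"n_obs": -1})
```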

davidsebfischer and others added 19 commits April 27, 2021 13:12

fixed bug in config writing of store and built unit test (#242)

742b593

* fixed bug in config writing of store and built unit test * improved error reporting for Dataset

fixed dev stage assignment (#243)

562a668

fixed subsetting (#244)

5c872d6

added skipping to store writing and fixed hcl bug (#247)

b7ddeda

fixed script (#248)

ca82c32

fix development stage annotation in pisco mouse data. closes #245 (#249)

25ade54

reduce step size in matrix copying to get rid of "cannot convert inte…

8a20260

…ger scalar" scipy error (#251)

change read from backed=True (r+) to r so that read permission suffic…

63f559d

…es for loading data (#256)

fixed config scripts (#257)

914920c

fix filename in config load (#258)

e2c27ff

fix download function

a40fd0a

renamed sfaira lint to validate (#263)

80641f7

* renamed sfaira lint to validate Signed-off-by: zethson <lukas.heumos@posteo.net> * remove clean from instructions Signed-off-by: zethson <lukas.heumos@posteo.net>

add sfaira dataloader schema (#265)

f89eddb

Signed-off-by: zethson <lukas.heumos@posteo.net>

Fix error due to undefined variable. (#266)

4abb78d

Fix model_id docstring to suggest correct format. (#267)

c23b4c8

davidsebfischer marked this pull request as draft May 20, 2021 08:19

Zethson and others added 4 commits May 28, 2021 14:13

partially fix test

7372c4f

Signed-off-by: zethson <lukas.heumos@posteo.net>

Dev merge (#278)

c3198ed

* removed previously deleted files reintroduced by merge * removed old rst files

Merge branch 'master' into dev

950abad

davidsebfischer changed the base branch from master to release May 28, 2021 15:08

davidsebfischer marked this pull request as ready for review May 28, 2021 15:15

davidsebfischer merged commit 45a9fb8 into release May 28, 2021
