Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task denoising method dca #431

Merged

Conversation

wes-lewis
Copy link
Collaborator

Submission type

  • This submission adds a new dataset
  • [x ] This submission adds a new method
  • This submission adds a new metric
  • This submission adds a new task
  • This submission adds a new Docker image
  • This submission fixes a bug (link to related issue: )
  • This submission adds a new feature not listed above

Testing

  • This submission was written on a forked copy of SingleCellOpenProblems
  • GitHub Actions "Run Benchmark" tests are passing on this base branch of this pull request (include link to passed test: )
  • If this pull request is not ready for review (including passing the "Run Benchmark" tests), I will open this PR as a draft (click on the down arrow next to the "Create Pull Request" button)

Submission guidelines

  • This submission follows the guidelines in our Contributing document
  • I have checked to ensure there aren't other open Pull Requests for the same update/change

PR review checklist

This PR will be evaluated on the basis of the following checks:

  • The task addresses a valid open problem in single-cell analysis
  • The latest version of master is merged and tested
  • The methods/metrics are imported to __init__.py and were tested in the pipeline
  • Method and metric decorators are annotated with paper title, year, author, code version, and date
  • [] The README gives an outline of the methods, metrics and datasets in the folder
  • [] The README provides a satisfactory task explanation (for new tasks)
  • [] The sample test data is appropriate to test implementation of all methods and metrics (for new tasks)

wes-lewis and others added 30 commits March 22, 2021 10:20
Add alra.py (includes existing bug).
…ante-immunai/openproblems into scottgigante/bugfix/alra_import
…ante-immunai/openproblems into scottgigante/bugfix/alra_import
…lra_import

Scottgigante/bugfix/alra import
add to_csr() to fix coo matrix error
try adding custom exception to catch shape mismatch from ALRA
@scottgigante-immunai
Copy link
Collaborator

scottgigante-immunai commented Jun 28, 2022

Actually, since adata.obsm['train'] is sampled randomly from adata.X, it's possible by design that there might be zero genes. We either need to a) require that genes have sufficient counts that the likelihood of sampling zero is vanishingly low; or b) require that methods be able to handle zeros. I lean towards the latter. We could however create a helper function to do what I suggested above so that the workflow would be

X_train, missing_idx = utils.filter_zero_genes(adata.obsm["train"])
X_out = my_denoising_function(X_train)
return utils.unfilter_zero_genes(X_out, missing_idx)

or even getting clever (but this requires not copying adata_filt, which is hard to enforce and could lead to confusing errors)

with utils.filter_zero_genes(adata) as adata_filt:
    adata_filt.obsm["X_denoised"] = denoise(adata_filt.obsm["X_train"])
return adata

@LuckyMD
Copy link
Collaborator

LuckyMD commented Jul 4, 2022

Just getting back to this... I think it's actually okay if methods can't deal with 0 genes... this seems like a filtering issue and not a denoising issue practically. I would probably go with just filtering them out after train-test split rather than doing fancy things and not trying to denoise them. In the latter case you would get a good result for genes you didn't take into account at all. Also, I think we do the same 0 filtering in other tasks like label projection as well.

@scottgigante-immunai
Copy link
Collaborator

@wes-lewis once you resolve the open comments, this should be ready to merge. Congrats!

@scottgigante-immunai scottgigante-immunai marked this pull request as ready for review July 12, 2022 16:39
@codecov
Copy link

codecov bot commented Jul 12, 2022

Codecov Report

Merging #431 (2749be8) into main (4baa861) will increase coverage by 0.03%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #431      +/-   ##
==========================================
+ Coverage   93.59%   93.63%   +0.03%     
==========================================
  Files         101      103       +2     
  Lines        2530     2591      +61     
  Branches      127      130       +3     
==========================================
+ Hits         2368     2426      +58     
- Misses        115      116       +1     
- Partials       47       49       +2     
Flag Coverage Δ
unittests 93.63% <100.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...egration/batch_integration_graph/methods/_utils.py 64.28% <ø> (-2.39%) ⬇️
...integration/batch_integration_graph/methods/mnn.py 100.00% <ø> (ø)
...ation/batch_integration_graph/methods/scanorama.py 100.00% <ø> (ø)
openproblems/api/main.py 94.11% <100.00%> (+0.17%) ⬆️
openproblems/api/parser.py 90.54% <100.00%> (+1.30%) ⬆️
openproblems/api/test.py 100.00% <100.00%> (ø)
openproblems/api/utils.py 86.04% <100.00%> (ø)
openproblems/data/utils.py 96.92% <100.00%> (+0.04%) ⬆️
.../_batch_integration/batch_integration_graph/api.py 100.00% <100.00%> (ø)
openproblems/tasks/denoising/api.py 100.00% <100.00%> (ø)
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aa73981...2749be8. Read the comment docs.

@scottgigante-immunai scottgigante-immunai merged commit 903469a into openproblems-bio:main Jul 13, 2022
@wes-lewis
Copy link
Collaborator Author

This is super exciting! Thank you so much Scott

@scottgigante-immunai
Copy link
Collaborator

Thank you for all your work on this!

rcannood pushed a commit that referenced this pull request Sep 4, 2024
* explicitly convert non-string objects in .obs slots to categoricals

* Add info about fix in comments
rcannood pushed a commit that referenced this pull request Sep 4, 2024
* explicitly convert non-string objects in .obs slots to categoricals

* Add info about fix in comments

Former-commit-id: 624c442
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants