[Improvement] Check that coding ASEs share the frame with the primary transcript #134

Closed
lucventurini opened this issue Oct 12, 2018 · 3 comments

@lucventurini
Collaborator

Currently, Mikado only performs a CDS overlap check to decide whether two transcripts are compatible as ASEs (alternative splicing events). However, we do not check whether they actually encode a compatible protein. Doing so requires calculating the CDS codons of each transcript and verifying that at least some of them are shared.

As this is probably an expensive operation, we should not call it until the last possible moment (i.e. during the ASE validation step).
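To make the idea concrete, here is a minimal sketch of a codon-sharing check. It is not Mikado's code: the function names `codons` and `share_codons`, the representation of a CDS as a list of inclusive `(start, end)` genomic segments, and the `phase` argument (bases to skip before the first complete codon) are all assumptions made for illustration. It also shows why the operation is expensive, since every CDS base is enumerated.

```python
from typing import Iterable, List, Set, Tuple

Segment = Tuple[int, int]  # hypothetical: inclusive genomic coordinates


def codons(cds: Iterable[Segment], strand: str, phase: int = 0) -> Set[Tuple[int, ...]]:
    """Return the codons of a CDS as triplets of genomic positions."""
    positions: List[int] = []
    for start, end in sorted(cds):
        positions.extend(range(start, end + 1))
    if strand == "-":
        # On the minus strand translation runs from high to low coordinates.
        positions.reverse()
    # Skip the bases that belong to an incomplete leading codon.
    positions = positions[phase:]
    usable = len(positions) - len(positions) % 3
    return {tuple(positions[i:i + 3]) for i in range(0, usable, 3)}


def share_codons(cds_a: Iterable[Segment], cds_b: Iterable[Segment],
                 strand: str, phase_a: int = 0, phase_b: int = 0) -> bool:
    """True if the two CDSs have at least one codon in common."""
    return not codons(cds_a, strand, phase_a).isdisjoint(codons(cds_b, strand, phase_b))
```

With a primary CDS of `[(101, 109), (201, 209)]`, an alternative CDS `[(104, 109), (201, 206)]` shares codons with it, whereas `[(102, 109), (201, 207)]` overlaps it extensively but shares none because it is shifted by one base.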

@lucventurini lucventurini added this to the 1.3 milestone Oct 12, 2018
lucventurini pushed a commit that referenced this issue Oct 12, 2018
lucventurini added a commit that referenced this issue Oct 15, 2018
…rimary transcript CDS length as the denominator (not the minimum between the two compared transcripts).
lucventurini added a commit that referenced this issue Oct 15, 2018
@lucventurini
Collaborator Author

We still have to decide whether the CDS overlap for ASEs should be calculated relative to the shorter of the two CDSs or always relative to the primary transcript. For now this is implemented as an internal switch, with the old behaviour as the default; it cannot be changed from outside the code.
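The difference between the two options can be sketched as follows. This is an illustration only, not the actual implementation: the function name `cds_overlap_fraction`, the keyword `use_primary_as_denominator` and the segment representation (inclusive `(start, end)` tuples) are hypothetical.

```python
from typing import Iterable, Set, Tuple

Segment = Tuple[int, int]  # hypothetical: inclusive genomic coordinates


def _bases(cds: Iterable[Segment]) -> Set[int]:
    """Expand CDS segments into the set of genomic positions they cover."""
    covered: Set[int] = set()
    for start, end in cds:
        covered.update(range(start, end + 1))
    return covered


def cds_overlap_fraction(primary: Iterable[Segment], candidate: Iterable[Segment],
                         use_primary_as_denominator: bool = False) -> float:
    """Fraction of CDS shared between the primary transcript and a candidate ASE.

    By default the shared length is divided by the shorter of the two CDSs;
    with the switch on, it is always divided by the primary CDS length.
    """
    primary_bases = _bases(primary)
    candidate_bases = _bases(candidate)
    shared = len(primary_bases & candidate_bases)
    if use_primary_as_denominator:
        denominator = len(primary_bases)
    else:
        denominator = min(len(primary_bases), len(candidate_bases))
    return shared / denominator if denominator else 0.0
```

For a primary CDS `[(1, 100)]` and a candidate `[(51, 80)]`, the default gives 1.0 (the candidate is fully contained), while using the primary as denominator gives 0.3. The second option penalises candidates whose CDS is much shorter than the primary one.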

@lucventurini
Collaborator Author

The current method for calculating the frames is too expensive. A cheaper approach, given two exons, would be:

  • check whether they overlap;
  • if they do, sort them, then check whether ((downstream + phase) - (upstream + phase)) % 3 == 0.

Of course, downstream and upstream have to be defined according to the strand.
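One possible reading of this rule, as a rough sketch: compare the position of the first complete codon of each exon and test whether the two differ by a multiple of three. The `CdsExon` type and the function names are hypothetical, and the code assumes the GFF3 phase convention (the number of bases to skip from the 5' end of the segment, which is `start` on the plus strand and `end` on the minus strand).

```python
from typing import NamedTuple


class CdsExon(NamedTuple):
    """Hypothetical container for one CDS segment (inclusive coordinates)."""
    start: int
    end: int
    phase: int  # assumed GFF3 convention: bases to skip to reach the next codon


def overlaps(a: CdsExon, b: CdsExon) -> bool:
    """True if the two segments share at least one base."""
    return a.start <= b.end and b.start <= a.end


def first_codon_start(exon: CdsExon, strand: str) -> int:
    """Genomic position of the first base of the first complete codon."""
    if strand == "+":
        return exon.start + exon.phase
    # On the minus strand the 5' end of the segment is its highest coordinate.
    return exon.end - exon.phase


def in_frame(a: CdsExon, b: CdsExon, strand: str) -> bool:
    """True if two overlapping CDS segments are read in the same frame."""
    if not overlaps(a, b):
        return False
    # Python's % always returns a non-negative result for a positive modulus,
    # so the two exons do not need to be sorted before subtracting.
    return (first_codon_start(a, strand) - first_codon_start(b, strand)) % 3 == 0
```

For example, on the plus strand `CdsExon(100, 200, 0)` and `CdsExon(103, 180, 0)` are in frame, `CdsExon(104, 180, 0)` is not, and `CdsExon(104, 180, 2)` is again in frame because its first complete codon starts at 106. This costs a constant amount of work per exon pair instead of enumerating every codon.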

lucventurini added a commit that referenced this issue Oct 16, 2018
@lucventurini
Collaborator Author

Feature implemented and tested. Closing.

lucventurini added a commit to lucventurini/mikado that referenced this issue Sep 20, 2019
…ed 'in-frame' if at least one of their exons is in-frame. Previously, the only contribution to cds_overlap was given by in-frame CDS segments, which is probably too restrictive.
lucventurini added a commit to lucventurini/mikado that referenced this issue Sep 20, 2019
…uced by default to 50% (again, 75% was probably too restrictive)
lucventurini added a commit that referenced this issue Sep 26, 2019
* Now Mikado pick will use lightweight SQLite databases for interprocess data exchange (#218). It could still be improved by allowing to remove more fragments.
* Small corrections for the `daijin` pipelines.
* Fix #215 
* Fixing the recovery for lost loci.
* Amend for #134. Now min_cds_overlap has been reduced by default to 50% (75% was probably too restrictive)
* Solved a bug in `mikado compare` that led to incorrect statistics when using multiprocessing.
* Needed bug fixes for Mikado serialise.
* Mikado configure was embedding the scoring file within the configuration - now amended.
* Fix #217
lucventurini pushed a commit to lucventurini/mikado that referenced this issue Feb 11, 2021
lucventurini added a commit to lucventurini/mikado that referenced this issue Feb 11, 2021
lucventurini added a commit to lucventurini/mikado that referenced this issue Feb 11, 2021
…lso always uses the primary transcript CDS length as the denominator (not the minimum between the two compared transcripts).
lucventurini added a commit to lucventurini/mikado that referenced this issue Feb 11, 2021
lucventurini added a commit to lucventurini/mikado that referenced this issue Feb 11, 2021
… a bit the unittest coverage and documentation.
lucventurini added a commit to lucventurini/mikado that referenced this issue Feb 11, 2021
* Now Mikado pick will use lightweight SQLite databases for interprocess data exchange (EI-CoreBioinformatics#218). It could still be improved by allowing to remove more fragments.
* Small corrections for the `daijin` pipelines.
* Fix EI-CoreBioinformatics#215 
* Fixing the recovery for lost loci.
* Amend for EI-CoreBioinformatics#134. Now min_cds_overlap has been reduced by default to 50% (75% was probably too restrictive)
* Solved a bug in `mikado compare` that led to incorrect statistics when using multiprocessing.
* Needed bug fixes for Mikado serialise.
* Mikado configure was embedding the scoring file within the configuration - now amended.
* Fix EI-CoreBioinformatics#217