Extra QA for scoring #141

Closed
2 tasks
smlmbrt opened this issue Aug 6, 2023 · 5 comments
Labels
bug Something isn't working enhancement New feature or request

Comments

@smlmbrt
Member

smlmbrt commented Aug 6, 2023

Description of feature

Add additional checks to make sure all variants in the scoring file have been calculated on the samples. Currently this check runs only on test data, but it should also run on real data to ensure the SUMs are always directly comparable across datasets. Related to #139

  • Check .sscore.vars against scoring file variants to ensure that they have been calculated correctly.
  • Update test profile to require scoring file combination and deduplication.
@smlmbrt smlmbrt added the enhancement New feature or request label Aug 6, 2023
@smlmbrt smlmbrt added this to the v2.1.0 milestone Aug 6, 2023
@smlmbrt
Member Author

smlmbrt commented Mar 19, 2024

In #244 we make sure that all scoring files have yielded results.

@DarioS

DarioS commented Mar 23, 2024

Does "score correlation tests" refer to something like a heatmap? It would be nice to see one as standard in the HTML report.
image
This plot tells us whether newer PGS are genuinely novel or just more of the same and of dubious value.

@smlmbrt
Member Author

smlmbrt commented Apr 29, 2024

Specifically checking that the .vars file is identical to the variants in the scoring file using a diff command?

@smlmbrt
Member Author

smlmbrt commented Apr 29, 2024

> Does "score correlation tests" refer to something like a heatmap? It would be nice to see one as standard in the HTML report. image This plot tells us whether newer PGS are genuinely novel or are more of the same and of dubious value.

No, this seems too specific to custom analyses rather than general use of the pipeline. It's also trivial to do by reading in the pgs file, pivoting wide, and running cor in R.

@nebfield nebfield removed this from the v2.0.0-beta.1 milestone Jul 8, 2024
@smlmbrt smlmbrt closed this as completed Jul 29, 2024
@smlmbrt smlmbrt reopened this Aug 2, 2024
@smlmbrt smlmbrt added this to the v.2.0.0-beta.3 milestone Aug 2, 2024
@smlmbrt smlmbrt added the bug Something isn't working label Aug 2, 2024
@smlmbrt
Member Author

smlmbrt commented Aug 9, 2024

Now added to PLINK2_SCORE:

# Count variant IDs that appear in only one of the two sorted lists:
# column 1 of the scoring file (header skipped) and the .sscore.vars file.
n_missing=\$(comm -3 <(zcat --force $scorefile | tail -n +2 | cut -f 1 | sort) <(sort ${output}.sscore.vars) | wc -l | tr -d ' ')
if [ "\$n_missing" -gt 0 ]
then
    echo "ERROR: \$n_missing variant(s) missing from final calculated score!"
    exit 1
else
    echo "INFO: Scoring file variants match listed variants in sscore.vars"
fi
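
To see how this comparison behaves outside the pipeline, here is a minimal standalone sketch on toy data. It assumes the `\$` escapes and the `$scorefile` / `${output}` variables in the snippet above are placeholders filled in by the workflow, so plain shell variables are used instead. The file names and variant IDs are made up for illustration, and temporary files replace the process substitution so the sketch also runs under a plain POSIX shell.

```shell
#!/bin/sh
# Toy demonstration of the comm-based variant check (hypothetical file names and IDs).
workdir=$(mktemp -d)

# Scoring file: header row plus three variants, with the variant ID in column 1 (tab-separated).
printf 'ID\teffect_allele\tweight\n1:100:A:G\tG\t0.1\n1:200:C:T\tT\t0.2\n2:300:G:A\tA\t-0.3\n' > "$workdir/scorefile.txt"

# Variants listed as used in the score; 2:300:G:A is deliberately left out.
printf '1:100:A:G\n1:200:C:T\n' > "$workdir/toy.sscore.vars"

# comm needs sorted input; LC_ALL=C keeps the collation identical for both files.
tail -n +2 "$workdir/scorefile.txt" | cut -f 1 | LC_ALL=C sort > "$workdir/expected.txt"
LC_ALL=C sort "$workdir/toy.sscore.vars" > "$workdir/observed.txt"

# comm -3 drops lines common to both files, leaving IDs unique to either side.
n_missing=$(LC_ALL=C comm -3 "$workdir/expected.txt" "$workdir/observed.txt" | wc -l | tr -d ' ')

if [ "$n_missing" -gt 0 ]; then
    status="ERROR: $n_missing variant(s) missing from final calculated score!"
else
    status="INFO: Scoring file variants match listed variants in sscore.vars"
fi
echo "$status"
rm -rf "$workdir"
```

Because `comm -3` keeps lines unique to either input, the count also rises if the .vars file lists an ID that is absent from the scoring file, so the check is effectively a test for identical variant sets rather than only for missing ones.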

@smlmbrt smlmbrt closed this as completed Aug 9, 2024