Add LOCO support for Plink format (Bimbam is in 0.97) #46

Closed
pjotrp opened this issue Jul 3, 2017 · 12 comments

Labels: enhancement, kinship, lmm, mvlmm, See gemma2/lib https://github.com/genetics-statistics/gemma2lib/issues

Comments

@pjotrp
Member

pjotrp commented Jul 3, 2017

We should add leave-one-chromosome-out (LOCO) support to GEMMA. From what I can tell from the source code, this is not particularly hard because GEMMA already tracks chromosome information. For each of the n chromosomes we need to recompute the kinship matrix with that chromosome left out and then rerun the LMM, so GEMMA runs n+1 times and we present a result file with n+1 scores. I think this should be an internal feature of GEMMA, and one would like to save the precomputed kinship matrices for rerunning with covariates etc.
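To make the kinship step concrete, here is a minimal pure-Python sketch, not GEMMA's implementation. It assumes genotypes are stored as one row per SNP with a parallel list of chromosome labels, and it uses a simple centered relatedness matrix; the names `kinship` and `loco_kinships` are made up for this illustration. It needs at least two chromosomes, otherwise a left-out set would be empty.

```python
def centered(row):
    """Center one SNP's genotypes around their mean."""
    mean = sum(row) / len(row)
    return [g - mean for g in row]


def kinship(snp_rows, n_individuals):
    """Centered relatedness matrix: sum over SNPs of c * c^T, divided by SNP count."""
    k = [[0.0] * n_individuals for _ in range(n_individuals)]
    for row in snp_rows:
        c = centered(row)
        for i in range(n_individuals):
            for j in range(n_individuals):
                k[i][j] += c[i] * c[j]
    p = len(snp_rows)
    return [[value / p for value in k_row] for k_row in k]


def loco_kinships(genotypes, chromosomes):
    """Map each chromosome to the kinship computed from all SNPs NOT on it."""
    n_individuals = len(genotypes[0])
    result = {}
    for left_out in sorted(set(chromosomes)):
        rows = [g for g, c in zip(genotypes, chromosomes) if c != left_out]
        result[left_out] = kinship(rows, n_individuals)
    return result
```

The extra (n+1)th run would use `kinship(genotypes, n_individuals)` over all SNPs. The returned dictionary is the kind of thing one could save to disk and reuse when rerunning with covariates.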

@pjotrp
Member Author

pjotrp commented Jul 8, 2017

I am studying the GEMMA code to see how we add LOCO support.

@pcarbo
Collaborator

pcarbo commented Jul 8, 2017

Okay @pjotrp. It basically would involve an "outer loop" that would run relationship matrix creation + LMM association analysis on each chromosome. Note that I have some R code (see function run.gemma in that file) that does this in case it is useful for helping to implement LOCO within gemma.
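A rough sketch of such an outer loop, in Python rather than R and not taken from the `run.gemma` code mentioned above. The callables `compute_kinship` and `run_lmm` are hypothetical stand-ins for whatever builds the relationship matrix and runs the association test; the input layout (a list of `(chromosome, snp_id, genotype_row)` tuples) is also an assumption.

```python
def run_loco(snps, compute_kinship, run_lmm):
    """Outer LOCO loop over chromosomes.

    snps: list of (chromosome, snp_id, genotype_row) tuples.
    compute_kinship: hypothetical callable, takes the background genotype rows.
    run_lmm: hypothetical callable, takes (kinship, held_out_snps).
    """
    results = {}
    for chrom in sorted({c for c, _, _ in snps}):
        # Relationship matrix from every chromosome except the current one.
        background = [row for c, _, row in snps if c != chrom]
        # Association is tested only on the SNPs of the held-out chromosome.
        held_out = [(sid, row) for c, sid, row in snps if c == chrom]
        results[chrom] = run_lmm(compute_kinship(background), held_out)
    return results
```

With trivial stand-ins such as `compute_kinship=len` and an `run_lmm` that echoes the SNP ids, the loop can be checked on toy data before wiring in real computations.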

@pjotrp
Member Author

pjotrp commented Jul 9, 2017

The outer loop may be the easiest hack for now, though it will be hard to make generic for all supported input formats. If you look at the code you'll see that GEMMA has an inner loop for every file type it supports. I somehow need to plug in K-1 and subset the SNPs. Making that generic looks like it will be pretty intrusive.

@pjotrp
Member Author

pjotrp commented Aug 2, 2017

I have implemented LOCO for Bimbam format. I'll put in a PR after we agree on the coding style, see #62

@pjotrp
Member Author

pjotrp commented Aug 15, 2017

Two more things need to be done:

  1. Add LOCO to plink format
  2. Run all chromosomes instead of just one

I think we need to focus on BIMBAM and Plink formats for core algorithms. Other formats should really be converted into one of these, and eventually we should remove internal support for them. That way we can become more DRY.

wdyt?

@xiangzhou
Collaborator

I agree.

@pcarbo
Collaborator

pcarbo commented Aug 15, 2017

@pjotrp Sounds like a good plan to me!

pjotrp added this to the 0.98 release milestone Sep 12, 2017

@pjotrp
Member Author

pjotrp commented Sep 12, 2017

https://github.com/genetics-statistics/gemma-wrapper now runs all chromosomes and caches results too.

Leaving issue open until I add LOCO to Plink.

pjotrp added a commit to genenetwork/GEMMA that referenced this issue Oct 5, 2017
@pjotrp
Member Author

pjotrp commented Oct 6, 2017

AnalyzeBimBam and AnalyzePlink contain duplicate logic, and they
differ now because the first has LOCO support. To add LOCO support to
PLINK we'll converge the logic and make the functions DRY.

The functions can be split into:

  1. Initialization
  2. Partially read genotype data (batch processing)
  3. LMM on the submatrix
  4. Collate the results
  5. Cleanup

The only part that is specific to the input file format is (2). The
way to solve this is to put that reader in a function that gets passed
in. I'll work on that in a new branch (for 0.98 release).
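The split above can be sketched as follows. This is a Python illustration of the idea only, not the planned C++ refactoring: `analyze`, `bimbam_reader`, `decoded_plink_reader` and `score_snp` are hypothetical names, the BIMBAM reader assumes the mean genotype layout (SNP id, two alleles, then one value per individual, comma separated), and the Plink reader is a stand-in that assumes the binary data has already been decoded into `(snp_id, dosages)` pairs.

```python
def batches(records, size):
    """Group an iterator of records into lists of at most `size` items."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def bimbam_reader(lines, size):
    """Yield batches of (snp_id, genotypes) from BIMBAM mean genotype lines."""
    def parse():
        for line in lines:
            fields = [f.strip() for f in line.split(",")]
            yield fields[0], [float(x) for x in fields[3:]]
    return batches(parse(), size)


def decoded_plink_reader(records, size):
    """Stand-in reader: assumes Plink data is already decoded to (snp_id, dosages)."""
    return batches(iter(records), size)


def analyze(read_batches, score_snp):
    """Format-independent analysis; only `read_batches` knows the file format."""
    results = []                          # 1. initialization
    for batch in read_batches():          # 2. partial read (batch processing)
        for snp_id, genotypes in batch:   # 3. per-SNP test on the submatrix
            results.append((snp_id, score_snp(genotypes)))
    return results                        # 4. collate; 5. nothing to clean up here
```

A caller would pick the reader once, for example `analyze(lambda: bimbam_reader(lines, 1000), score_snp)`, so any LOCO filtering added to `analyze` applies to both formats.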

pjotrp changed the title from "Add LOCO support" to "Add LOCO support for Plink format (Bimbam is in 0.97)" Dec 19, 2017

@flaviahodel

Hello! Is there any news regarding LOCO for Plink format?

@pjotrp
Member Author

pjotrp commented Apr 13, 2018

It will happen sometime this year. Help is welcome if you want to pitch in :). Meanwhile, convert Plink to BIMBAM; that can be done with the plink tool.
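For the conversion, a small Python sketch that builds the command. It assumes PLINK 1.9 and its `--recode bimbam` modifier, with a binary fileset given by `--bfile`; please check the documentation of your PLINK version before relying on it. The function names and the file prefixes are placeholders.

```python
import subprocess


def plink_to_bimbam_cmd(bfile_prefix, out_prefix, plink="plink"):
    """Build the argument list for a Plink-to-BIMBAM conversion.

    Assumes PLINK 1.9 and that `--recode bimbam` is available there.
    """
    return [plink, "--bfile", bfile_prefix, "--recode", "bimbam", "--out", out_prefix]


def convert(bfile_prefix, out_prefix, plink="plink"):
    """Run the conversion; requires the plink executable on PATH."""
    subprocess.run(plink_to_bimbam_cmd(bfile_prefix, out_prefix, plink), check=True)
```

Calling `convert("mydata", "mydata_bimbam")` would run the equivalent of `plink --bfile mydata --recode bimbam --out mydata_bimbam`.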

pjotrp modified the milestones: 0.98 release, Later Aug 25, 2018
pjotrp modified the milestones: Later, faster-lmm-d Nov 21, 2018
pjotrp added the "See gemma2/lib https://github.com/genetics-statistics/gemma2lib/issues" label Sep 29, 2020
pjotrp modified the milestones: faster-lmm-d, Python support (gemma2/lib) Sep 29, 2020

@pjotrp
Member Author

pjotrp commented Sep 29, 2020

It is part of gemma2/lib. Gemma2/lib converts plink and bimbam to a new Rqtl2/GEMMA2 format and that supports LOCO.

@pjotrp pjotrp closed this as completed Sep 29, 2020

4 participants