Update to v3.2.5.3 causes weird reproducible bug with step 1 #413

Closed
freeseek opened this issue May 10, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@freeseek

Regenie step 1 fails after update to REGENIE 3.2.5.3

I have this reproducible error. With version 3.2.5 everything works fine:

Start time: Wed May 10 11:52:55 2023

              |=============================|
              |      REGENIE v3.2.5.gz      |
              |=============================|

Copyright (c) 2020-2022 Joelle Mbatchou, Andrey Ziyatdinov and Jonathan Marchini.
Distributed under the MIT License.
Compiled with Boost Iostream library.
Using Intel MKL with Eigen.

Log of output saved in file : ukb_male_Y_loss.log

Options in effect:
  --step 1 \
  --bed ukb_male.prune \
  --phenoFile ukb_male.phe.Y_loss \
  --covarFile ukb_male.cov \
  --bt \
  --bsize 1000 \
  --loocv \
  --run-l1 ukb_male_Y_loss.master \
  --keep-l0 \
  --gz \
  --write-null-firth \
  --threads 4 \
  --out ukb_male_Y_loss

Fitting null model
 * bim              : [ukb_male.prune.bim] n_snps = 295172
 * fam              : [ukb_male.prune.fam] n_samples = 221517
 * bed              : [ukb_male.prune.bed]
 * phenotypes       : [ukb_male.phe.Y_loss] n_pheno = 1
   -dropping observations with missing values at any of the phenotypes
   -number of phenotyped individuals with no missing data = 221517
 * covariates       : [ukb_male.cov] n_cov = 22
   -number of individuals with covariate data = 221511
 * number of individuals used in analysis = 221511
 * case-control counts for each trait:
   - 'Y_loss': 45546 cases and 175965 controls
   -fitting null logistic regression on binary phenotypes...done (215ms) 
   -residualizing and scaling phenotypes...done (5ms) 
 * using results from running 23 parallel jobs at level 0
 * # threads        : [4]
 * block size       : [1000]
 * # blocks         : [306] for 295172 variants
 * # CV folds       : [221511]
 * ridge data_l1    : [5 : 0.01 0.25 0.5 0.75 0.99 ]
 * approximate memory usage : 4GB
 * writing null Firth estimates to file
 * setting memory...done

 (skipping to level 1 models)
 Level 1 ridge with logistic regression...
   -on phenotype 1 (Y_loss)...done (307814ms) 

Output
------
phenotype 1 (Y_loss) : 
  0.01  : Rsq = 0.128818, MSE = 0.142324, -logLik/N = 0.440873
  0.25  : Rsq = 0.133385, MSE = 0.14158, -logLik/N = 0.439042
  0.5   : Rsq = 0.13374, MSE = 0.141528, -logLik/N = 0.438916
  0.75  : Rsq = 0.134189, MSE = 0.141459, -logLik/N = 0.43874
  0.99  : Rsq = 0.134458, MSE = 0.141433, -logLik/N = 0.438685<- min value
  * making predictions...writing LOCO predictions...writing null approximate Firth estimates...done (128573ms) 

List of blup files written to: [ukb_male_Y_loss_pred.list]
List of files with null Firth estimates written to: [ukb_male_Y_loss_firth.list]

Elapsed time : 439.421s
End time: Wed May 10 12:00:14 2023

However, with version 3.2.5.3 (and also with version 3.2.6) the computation throws an error at the end:

Start time: Wed May 10 12:09:16 2023

              |===============================|
              |      REGENIE v3.2.5.3.gz      |
              |===============================|

Copyright (c) 2020-2022 Joelle Mbatchou, Andrey Ziyatdinov and Jonathan Marchini.
Distributed under the MIT License.
Compiled with Boost Iostream library.
Using Intel MKL with Eigen.

Log of output saved in file : ukb_male_Y_loss.log

Options in effect:
  --step 1 \
  --bed ukb_male.prune \
  --phenoFile ukb_male.phe.Y_loss \
  --covarFile ukb_male.cov \
  --bt \
  --bsize 1000 \
  --loocv \
  --run-l1 ukb_male_Y_loss.master \
  --keep-l0 \
  --gz \
  --write-null-firth \
  --threads 4 \
  --out ukb_male_Y_loss

Fitting null model
 * bim              : [ukb_male.prune.bim] n_snps = 295172
 * fam              : [ukb_male.prune.fam] n_samples = 221517
 * bed              : [ukb_male.prune.bed]
 * phenotypes       : [ukb_male.phe.Y_loss] n_pheno = 1
   -dropping observations with missing values at any of the phenotypes
   -number of phenotyped individuals with no missing data = 221517
 * covariates       : [ukb_male.cov] n_cov = 22
   -number of individuals with covariate data = 221511
 * number of individuals used in analysis = 221511
 * case-control counts for each trait:
   - 'Y_loss': 45546 cases and 175965 controls
   -fitting null logistic regression on binary phenotypes...done (210ms) 
   -residualizing and scaling phenotypes...done (4ms) 
 * using results from running 23 parallel jobs at level 0
 * # threads        : [4]
 * block size       : [1000]
 * # blocks         : [306] for 295172 variants
 * # CV folds       : [221511]
 * ridge data_l1    : [5 : 0.01 0.25 0.5 0.75 0.99 ]
 * approximate memory usage : 4GB
 * writing null Firth estimates to file
 * setting memory...done

 (skipping to level 1 models)
 Level 1 ridge with logistic regression...
   -on phenotype 1 (Y_loss)...done (290844ms) 

Output
------
phenotype 1 (Y_loss) : 
  0.01  : Rsq = 0.128818, MSE = 0.142324, -logLik/N = 0.440873
  0.25  : Rsq = 0.133385, MSE = 0.14158, -logLik/N = 0.439042
  0.5   : Rsq = 0.13374, MSE = 0.141528, -logLik/N = 0.438916
  0.75  : Rsq = 0.134189, MSE = 0.141459, -logLik/N = 0.43874
  0.99  : Rsq = 0.134458, MSE = 0.141433, -logLik/N = 0.438685<- min value
  * making predictions...regenie_v3.2.5.3.gz_x86_64_Linux_mkl: /home/s.joelle.mbatchou/software/regenie/external_libs/eigen-3.4.0/Eigen/src/Core/DenseCoeffsBase.h:427: Eigen::DenseCoeffsBase<Derived, 1>::Scalar& Eigen::DenseCoeffsBase<Derived, 1>::operator()(Eigen::Index) [with Derived = Eigen::Array<int, -1, 1>; Eigen::DenseCoeffsBase<Derived, 1>::Scalar = int; Eigen::Index = long int]: Assertion `index >= 0 && index < size()' failed.
Aborted (core dumped)

Also note that the copyright notice was not updated to 2023.
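The assertion message above says that an `Eigen::Array<int, -1, 1>` was indexed outside the range `[0, size())` while step 1 was at the "making predictions" stage. For anyone checking many runs for the same failure, here is a minimal sketch that classifies a run from its captured stdout and stderr. The function name `classify_run` and the three return labels are hypothetical; the marker strings are copied from the logs in this report.

```python
# Marker strings copied from the logs in this thread.
ASSERTION_MARKER = "Assertion `index >= 0 && index < size()' failed"
COMPLETION_MARKERS = ("List of blup files written to", "Elapsed time")


def classify_run(stdout: str, stderr: str = "") -> str:
    """Return 'eigen_assertion', 'completed', or 'incomplete' for one run.

    Hypothetical helper: the labels are illustrative, not regenie output.
    """
    # The assertion may land in either stream depending on how output was captured.
    if ASSERTION_MARKER in stdout or ASSERTION_MARKER in stderr:
        return "eigen_assertion"
    # A run that finished prints both the blup list line and the elapsed time.
    if all(marker in stdout for marker in COMPLETION_MARKERS):
        return "completed"
    return "incomplete"


if __name__ == "__main__":
    good_stdout = (
        "List of blup files written to: [ukb_male_Y_loss_pred.list]\n"
        "Elapsed time : 439.421s\n"
    )
    bad_stdout = "  * making predictions...\n"
    # Abbreviated copy of the stderr line; only the assertion text matters here.
    bad_stderr = "DenseCoeffsBase.h:427: Assertion `index >= 0 && index < size()' failed.\n"
    print(classify_run(good_stdout))
    print(classify_run(bad_stdout, bad_stderr))
```

The check looks only at text markers, so it says nothing about why the index went out of range.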

@rachitk

rachitk commented May 11, 2023

I can confirm that the same error occurs for Step 1 when using the latest release, Regenie 3.2.6:

* making predictions...regenie: /src/regenie/external_libs/eigen-3.4.0/Eigen/src/Core/DenseCoeffsBase.h:427: Eigen::DenseCoeffsBase<Derived, 1>::Scalar& Eigen::DenseCoeffsBase<Derived, 1>::operator()(Eigen::Index) [with Derived = Eigen::Array<int, -1, 1>; Eigen::DenseCoeffsBase<Derived, 1>::Scalar = int; Eigen::Index = long int]: Assertion `index >= 0 && index < size()' failed.

@joellembatchou
Collaborator

Hi,

Can you please check whether it still persists in the newest release?

Cheers,
Joelle

@freeseek
Author

Version 3.2.7 stdout:

Start time: Fri May 26 21:42:22 2023

              |=============================|
              |      REGENIE v3.2.7.gz      |
              |=============================|

Copyright (c) 2020-2022 Joelle Mbatchou, Andrey Ziyatdinov and Jonathan Marchini.
Distributed under the MIT License.
Compiled with Boost Iostream library.
Using Intel MKL with Eigen.

Log of output saved in file : ukb_male_Y_loss.log

Options in effect:
  --step 1 \
  --bed ukb_male.prune \
  --phenoFile ukb_male.phe.Y_loss \
  --covarFile ukb_male.cov \
  --bt \
  --bsize 1000 \
  --loocv \
  --run-l1 ukb_male_Y_loss.master \
  --keep-l0 \
  --gz \
  --write-null-firth \
  --out ukb_male_Y_loss

Fitting null model
 * bim              : [ukb_male.prune.bim] n_snps = 295172
 * fam              : [ukb_male.prune.fam] n_samples = 221517
 * bed              : [ukb_male.prune.bed]
 * phenotypes       : [ukb_male.phe.Y_loss] n_pheno = 1
   -dropping observations with missing values at any of the phenotypes
   -number of phenotyped individuals with no missing data = 221517
 * covariates       : [ukb_male.cov] n_cov = 22
   -number of individuals with covariate data = 221511
 * number of individuals used in analysis = 221511
 * case-control counts for each trait:
   - 'Y_loss': 45546 cases and 175965 controls
   -fitting null logistic regression on binary phenotypes...done (254ms) 
   -residualizing and scaling phenotypes...done (8ms) 
 * using results from running 23 parallel jobs at level 0
 * # threads        : [1]
 * block size       : [1000]
 * # blocks         : [306] for 295172 variants
 * # CV folds       : [221511]
 * ridge data_l1    : [5 : 0.01 0.25 0.5 0.75 0.99 ]
 * approximate memory usage : 4GB
 * writing null Firth estimates to file
 * setting memory...done

 (skipping to level 1 models)
 Level 1 ridge with logistic regression...
   -on phenotype 1 (Y_loss)...done (689059ms) 

Output
------
phenotype 1 (Y_loss) : 
  0.01  : Rsq = 0.128818, MSE = 0.142324, -logLik/N = 0.440873
  0.25  : Rsq = 0.133385, MSE = 0.14158, -logLik/N = 0.439042
  0.5   : Rsq = 0.13374, MSE = 0.141528, -logLik/N = 0.438916
  0.75  : Rsq = 0.134189, MSE = 0.141459, -logLik/N = 0.43874
  0.99  : Rsq = 0.134458, MSE = 0.141433, -logLik/N = 0.438685<- min value
  * making predictions...

Version 3.2.7 stderr:

regenie_v3.2.7.gz_x86_64_Linux_mkl: /home/s.joelle.mbatchou/software/regenie/external_libs/eigen-3.4.0/Eigen/src/Core/DenseCoeffsBase.h:427: Eigen::DenseCoeffsBase<Derived, 1>::Scalar& Eigen::DenseCoeffsBase<Derived, 1>::operator()(Eigen::Index) [with Derived = Eigen::Array<int, -1, 1>; Eigen::DenseCoeffsBase<Derived, 1>::Scalar = int; Eigen::Index = long int]: Assertion `index >= 0 && index < size()' failed.
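The runs in this thread were not all started with identical flags: the v3.2.7 log above has no `--threads 4` line and reports `# threads : [1]`. When comparing logs across versions, it can help to diff the "Options in effect" blocks first. A minimal sketch follows, assuming the block layout shown in these logs (one flag per line, optional trailing backslash); `parse_options` and `diff_options` are hypothetical helper names.

```python
def parse_options(log_text: str) -> dict:
    """Map each flag in the 'Options in effect:' block to its value (or True)."""
    options = {}
    in_block = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped == "Options in effect:":
            in_block = True
            continue
        if not in_block:
            continue
        if not stripped.startswith("--"):
            if options:
                break  # first non-flag line after the flags ends the block
            continue
        # Drop the trailing line-continuation backslash, then split flag and value.
        parts = stripped.rstrip("\\").split()
        options[parts[0]] = " ".join(parts[1:]) if len(parts) > 1 else True
    return options


def diff_options(a: dict, b: dict) -> dict:
    """Return {flag: (value_in_a, value_in_b)} for flags that differ."""
    return {
        flag: (a.get(flag), b.get(flag))
        for flag in sorted(set(a) | set(b))
        if a.get(flag) != b.get(flag)
    }


if __name__ == "__main__":
    log_a = "Options in effect:\n  --step 1 \\\n  --bt \\\n  --threads 4 \\\n  --out run\n\nFitting null model\n"
    log_b = "Options in effect:\n  --step 1 \\\n  --bt \\\n  --out run\n\nFitting null model\n"
    print(diff_options(parse_options(log_a), parse_options(log_b)))
```

A flag missing from one run shows up with `None` on that side, which makes an omitted `--threads` easy to spot.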

@freeseek
Author

Version 3.2.8 stdout:

Start time: Thu Jun 22 15:13:56 2023

              |=============================|
              |      REGENIE v3.2.8.gz      |
              |=============================|

Copyright (c) 2020-2023 Joelle Mbatchou, Andrey Ziyatdinov and Jonathan Marchini.
Distributed under the MIT License.
Compiled with Boost Iostream library.
Using Intel MKL with Eigen.

Log of output saved in file : ukb_male_Y_loss.log

Options in effect:
  --step 1 \
  --bed ukb_male.prune \
  --phenoFile ukb_male.phe.Y_loss \
  --covarFile ukb_male.cov \
  --bt \
  --bsize 1000 \
  --loocv \
  --run-l1 ukb_male_Y_loss.master \
  --keep-l0 \
  --gz \
  --write-null-firth \
  --threads 4 \
  --out ukb_male_Y_loss

Fitting null model
 * bim              : [ukb_male.prune.bim] n_snps = 295172
 * fam              : [ukb_male.prune.fam] n_samples = 221517
 * bed              : [ukb_male.prune.bed]
 * phenotypes       : [ukb_male.phe.Y_loss] n_pheno = 1
   -dropping observations with missing values at any of the phenotypes
   -number of phenotyped individuals with no missing data = 221517
 * covariates       : [ukb_male.cov] n_cov = 22
   -number of individuals with covariate data = 221511
 * number of individuals used in analysis = 221511
 * case-control counts for each trait:
   - 'Y_loss': 45546 cases and 175965 controls
   -fitting null logistic regression on binary phenotypes...done (252ms) 
   -residualizing and scaling phenotypes...done (8ms) 
 * using results from running 23 parallel jobs at level 0
 * # threads        : [4]
 * block size       : [1000]
 * # blocks         : [306] for 295172 variants
 * # CV folds       : [221511]
 * ridge data_l1    : [5 : 0.01 0.25 0.5 0.75 0.99 ]
 * approximate memory usage : 4GB
 * writing null Firth estimates to file
 * setting memory...done

 (skipping to level 1 models)
 Level 1 ridge with logistic regression...
   -on phenotype 1 (Y_loss)...done (688937ms) 

Output
------
phenotype 1 (Y_loss) : 
  0.01  : Rsq = 0.128818, MSE = 0.142324, -logLik/N = 0.440873
  0.25  : Rsq = 0.133385, MSE = 0.14158, -logLik/N = 0.439042
  0.5   : Rsq = 0.13374, MSE = 0.141528, -logLik/N = 0.438916
  0.75  : Rsq = 0.134189, MSE = 0.141459, -logLik/N = 0.43874
  0.99  : Rsq = 0.134458, MSE = 0.141433, -logLik/N = 0.438685<- min value
  * making predictions...

Version 3.2.8 stderr:

regenie_v3.2.8.gz_x86_64_Linux_mkl: /home/s.joelle.mbatchou/software/regenie/external_libs/eigen-3.4.0/Eigen/src/Core/DenseCoeffsBase.h:427: Eigen::DenseCoeffsBase<Derived, 1>::Scalar& Eigen::DenseCoeffsBase<Derived, 1>::operator()(Eigen::Index) [with Derived = Eigen::Array<int, -1, 1>; Eigen::DenseCoeffsBase<Derived, 1>::Scalar = int; Eigen::Index = long int]: Assertion `index >= 0 && index < size()' failed.

@pmquiros

pmquiros commented Jul 3, 2023

I can confirm the same error after Step 1 with version 3.2.8:

regenie: /src/regenie/external_libs/eigen-3.4.0/Eigen/src/Core/DenseCoeffsBase.h:427: Eigen::DenseCoeffsBase<Derived, 1>::Scalar& Eigen::DenseCoeffsBase<Derived, 1>::operator()(Eigen::Index) [with Derived = Eigen::Array<int, -1, 1>; Eigen::DenseCoeffsBase<Derived, 1>::Scalar = int; Eigen::Index = long int]: Assertion `index >= 0 && index < size()' failed.

joellembatchou added a commit that referenced this issue Jul 31, 2023
bug fix for #413
@joellembatchou
Collaborator

Hi everyone,

This bug has been fixed in v3.2.9. Please let me know if that is not the case.

Cheers,
Joelle

@freeseek
Author

freeseek commented Aug 1, 2023

I can confirm that version 3.2.9 fixes the problem:

Start time: Tue Aug  1 15:05:55 2023

              |=============================|
              |      REGENIE v3.2.9.gz      |
              |=============================|

Copyright (c) 2020-2023 Joelle Mbatchou, Andrey Ziyatdinov and Jonathan Marchini.
Distributed under the MIT License.
Compiled with Boost Iostream library.
Using Intel MKL with Eigen.

Log of output saved in file : ukb_male_Y_loss.log

Options in effect:
  --step 1 \
  --bed ukb_male.prune \
  --phenoFile ukb_male.phe.Y_loss \
  --covarFile ukb_male.cov \
  --bt \
  --bsize 1000 \
  --loocv \
  --run-l1 ukb_male_Y_loss.master \
  --keep-l0 \
  --gz \
  --write-null-firth \
  --out ukb_male_Y_loss

Fitting null model
 * bim              : [ukb_male.prune.bim] n_snps = 295172
 * fam              : [ukb_male.prune.fam] n_samples = 221517
 * bed              : [ukb_male.prune.bed]
 * phenotypes       : [ukb_male.phe.Y_loss] n_pheno = 1
   -dropping observations with missing values at any of the phenotypes
   -number of phenotyped individuals with no missing data = 221517
 * covariates       : [ukb_male.cov] n_cov = 22
   -number of individuals with covariate data = 221511
 * number of individuals used in analysis = 221511
 * case-control counts for each trait:
   - 'Y_loss': 45546 cases and 175965 controls
   -fitting null logistic regression on binary phenotypes...done (266ms) 
   -residualizing and scaling phenotypes...done (8ms) 
 * using results from running 23 parallel jobs at level 0
 * # threads        : [1]
 * block size       : [1000]
 * # blocks         : [306] for 295172 variants
 * # CV folds       : [221511]
 * ridge data_l1    : [5 : 0.01 0.25 0.5 0.75 0.99 ]
 * approximate memory usage : 4GB
 * writing null Firth estimates to file
 * setting memory...done

 (skipping to level 1 models)
 Level 1 ridge with logistic regression...
   -on phenotype 1 (Y_loss)...done (687493ms) 

Output
------
phenotype 1 (Y_loss) : 
  0.01  : Rsq = 0.128818, MSE = 0.142324, -logLik/N = 0.440873
  0.25  : Rsq = 0.133385, MSE = 0.14158, -logLik/N = 0.439042
  0.5   : Rsq = 0.13374, MSE = 0.141528, -logLik/N = 0.438916
  0.75  : Rsq = 0.134189, MSE = 0.141459, -logLik/N = 0.43874
  0.99  : Rsq = 0.134458, MSE = 0.141433, -logLik/N = 0.438685<- min value
  * making predictions...writing LOCO predictions...writing null approximate Firth estimates...done (222839ms) 

List of blup files written to: [ukb_male_Y_loss_pred.list]
List of files with null Firth estimates written to: [ukb_male_Y_loss_firth.list]

Elapsed time : 914.442s
End time: Tue Aug  1 15:21:09 2023
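In the logs posted in this thread, the Level 1 ridge table printed by v3.2.9 matches the one from v3.2.5, and the failing versions print the same table before aborting at "making predictions". A minimal sketch for checking that mechanically is below. It assumes a single phenotype per log, as in this report, since rows are keyed by the ridge parameter alone; `parse_level1_table` is a hypothetical helper name.

```python
import re

# Matches rows such as:
#   0.99  : Rsq = 0.134458, MSE = 0.141433, -logLik/N = 0.438685<- min value
ROW_RE = re.compile(
    r"^\s*([0-9.]+)\s*:\s*Rsq = ([0-9.eE+-]+), MSE = ([0-9.eE+-]+), "
    r"-logLik/N = ([0-9.eE+-]+)"
)


def parse_level1_table(log_text: str) -> dict:
    """Return {ridge_parameter: (Rsq, MSE, negLogLik_per_N)} from one log.

    Assumes one phenotype per log; with several, rows would overwrite each other.
    """
    table = {}
    for line in log_text.splitlines():
        match = ROW_RE.match(line)
        if match:
            table[match.group(1)] = tuple(float(x) for x in match.group(2, 3, 4))
    return table


if __name__ == "__main__":
    # Two rows copied from the logs in this thread.
    sample = (
        "  0.01  : Rsq = 0.128818, MSE = 0.142324, -logLik/N = 0.440873\n"
        "  0.99  : Rsq = 0.134458, MSE = 0.141433, -logLik/N = 0.438685<- min value\n"
    )
    old_run = parse_level1_table(sample)
    new_run = parse_level1_table(sample)
    print(old_run == new_run)
```

Equal tables only show that the ridge fit agrees to the printed precision. They do not verify the LOCO prediction files themselves.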
