treeinfo_compute_loglh: Assertion `total_loglh < 0.' failed #93

Closed
davipatti opened this issue Jun 2, 2020 · 3 comments

@davipatti

Hello, I've been running into this error:

david@puck~/d/c/P/b/a/r/debug> raxml-ng --search1 --msa ali.fasta --model FLU+G --seed 42 --log VERBOSE

RAxML-NG v. 0.9.0git released on 26.11.2019 by The Exelixis Lab.
Developed by: Alexey M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml

RAxML-NG was called at 02-Jun-2020 09:30:23 as follows:

raxml-ng --search1 --msa ali.fasta --model FLU+G --seed 42 --log VERBOSE

Analysis options:
  run mode: ML tree search
  start tree(s): random (1)
  random seed: 42
  tip-inner: OFF
  pattern compression: ON
  per-rate scalers: OFF
  site repeats: ON
  fast spr radius: AUTO
  spr subtree cutoff: 1.000000
  branch lengths: proportional (ML estimate, algorithm: NR-FAST)
  SIMD kernels: AVX2
  parallelization: PTHREADS (6 threads), thread pinning: OFF

[00:00:00] Reading alignment from file: ali.fasta
[00:00:00] Loaded alignment with 4000 taxa and 570 sites
[00:00:00] Extracting partitions... 
[00:00:00] Checking the alignment...
[00:00:00] Compressing alignment patterns... 

Alignment comprises 1 partitions and 569 patterns

Partition 0: noname
Model: FLU+G4m
Alignment sites / patterns: 570 / 569
Gaps: 19.26 %
Invariant sites: 7.19 %


NOTE: Binary MSA file created: ali.fasta.raxml.rba


NOTE: Per-rate scalers were automatically enabled to prevent numerical issues on taxa-rich alignments.
NOTE: You can use --force switch to skip this check and fall back to per-site scalers.

[00:00:00] Generating 1 random starting tree(s) with 4000 taxa

Initial model parameters:
   Partition: noname
   Rate heterogeneity: GAMMA (4 cats, mean),  alpha: 1.000000 (ML),  weights&rates: (0.250000,0.136954) (0.250000,0.476752) (0.250000,1.000000) (0.250000,2.386294) 
   Base frequencies (model): 0.047072 0.050910 0.074214 0.047860 0.025022 0.033304 0.054587 0.076373 0.019964 0.067134 0.071498 0.056785 0.018151 0.030496 0.050656 0.088409 0.074339 0.018524 0.031474 0.063229 
   Substitution rates (model): 0.138659 0.053367 0.584852 0.026447 0.353754 1.484235 1.132313 0.214758 0.149927 0.023117 0.474334 0.058745 0.080491 0.659311 3.011345 5.418298 0.196000 0.018289 3.532005 0.161001 0.006772 0.167207 3.292717 0.124898 1.190624 1.879570 0.246117 0.296046 15.300097 0.890162 0.016100 0.154027 0.950138 0.183077 1.369429 0.099855 0.103964 7.737393 0.000013 0.530643 0.061652 0.322525 1.387096 0.218572 0.000836 2.646848 0.005252 0.000836 0.036400 3.881311 2.140332 0.000536 0.373102 0.010258 0.014100 0.145469 5.370511 1.934833 0.887571 0.014086 0.005731 0.290043 0.041763 0.000001 0.188539 0.338372 0.135481 0.000015 0.525399 0.297124 0.002547 0.000000 0.116941 0.021800 0.001112 0.005614 0.000004 0.111457 0.104054 0.000000 0.336263 0.011975 0.094107 0.601692 0.054905 1.195629 0.108051 5.330313 0.028840 1.020367 2.559587 0.190259 0.032681 0.712770 0.487822 0.602341 0.044000 0.072206 0.406698 1.593099 0.256492 0.014200 0.016500 3.881489 0.313974 0.001004 0.319559 0.307140 0.280125 0.155245 0.104093 0.285048 0.058775 0.000016 0.006516 0.264149 0.001500 0.001237 0.038632 1.585647 0.018808 0.196486 0.074815 0.337230 0.243190 0.321612 0.347303 0.001274 0.119029 0.924467 0.580704 0.368714 0.022400 6.448954 0.098631 3.512072 0.227708 9.017954 1.463357 0.080543 0.290381 2.904052 0.032132 0.273934 14.394052 0.129224 6.746936 2.986800 0.634309 0.570767 0.044926 0.431278 0.340058 0.890599 1.331292 0.320000 0.195751 0.283808 1.526964 0.000050 0.012416 0.073128 0.279911 0.056900 0.007027 2.031511 0.070460 0.874272 4.904842 0.007132 0.996686 0.000135 0.814753 5.393924 0.592588 2.087385 0.542251 0.000431 0.000182 0.058972 2.206860 0.099836 0.392552 0.088256 0.207066 0.124898 0.654109 0.427755 0.256900 0.167582 

[00:00:00] Data distribution: max. partitions/sites/weight per thread: 1 / 95 / 7600

thread#	part#	start	length	weight
0	0	475	94	7520
1	0	380	95	7600
2	0	285	95	7600
3	0	190	95	7600
4	0	95	95	7600
5	0	0	95	7600


Starting ML tree search with 1 distinct starting trees

[00:00:00 -352943.122177] Initial branch length optimization
[00:00:16 -278338.347051] Model parameter optimization (eps = 10.000000)
raxml-ng: ~/Downloads/raxml-ng/libs/pll-modules/src/tree/treeinfo.c:1073: treeinfo_compute_loglh: Assertion `total_loglh < 0.' failed.
raxml-ng: ~/Downloads/raxml-ng/libs/pll-modules/src/tree/treeinfo.c:1073: treeinfo_compute_loglh: Assertion `total_loglh < 0.' failed.
fish: “raxml-ng --search1 --msa ali.fa…” terminated by signal SIGABRT (Abort)
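
For readers unfamiliar with the check: the assertion at treeinfo.c:1073 requires the total log-likelihood to be strictly negative. Below is a minimal Python sketch of what a check of that form accepts and rejects. It assumes the total is a weighted sum of per-pattern log-likelihoods; the function names are hypothetical and are not part of pll-modules.

```python
import math

def total_loglh(site_lh, weights):
    """Weighted sum of per-pattern log-likelihoods (illustrative only)."""
    return sum(w * math.log(p) for p, w in zip(site_lh, weights))

def passes_assertion(loglh):
    """Mirror of `total_loglh < 0.`: false for zero, positive values, and NaN."""
    return loglh < 0.0

# Hypothetical per-pattern likelihoods and pattern weights.
ok = total_loglh([0.01, 0.2, 0.5], [1, 2, 1])

print(passes_assertion(ok))            # ordinary negative log-likelihood
print(passes_assertion(float("nan")))  # NaN compares false against any number
print(passes_assertion(0.0))           # a non-negative total is rejected as well
```

The log alone does not show which of these cases occurred in this run.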

I have investigated a bit:

  • --model FLU and --model FLU+F+I run fine. +G seems to be causing the issue.
  • --threads 6 and --threads 1 both cause the same error.
  • Here is a link to ali.fasta. It contains 4000 amino acid sequences. The tree search runs fine using either the first 2000 or the last 2000 sequences. Also --parse runs fine. So, I don't think the error is caused by problematic data.
  • This error occurs on two Linux machines, running Pop!_OS 19.10 and CentOS 7.
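
The subsetting check described in the list above (first 2000 versus last 2000 sequences) can be reproduced with a small script. This is a minimal sketch and not necessarily how the subsets were produced in this report; the helper names and output file names are hypothetical.

```python
def read_fasta(path):
    """Return a list of (header, sequence) tuples in file order."""
    records = []
    header, chunks = None, []
    with open(path) as fh:
        for line in fh:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line, []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def write_fasta(records, path):
    """Write records with each sequence on a single line."""
    with open(path, "w") as fh:
        for header, seq in records:
            fh.write(f"{header}\n{seq}\n")

def split_halves(in_path, first_path, last_path, n):
    """Write the first n and the last n records to two separate files."""
    records = read_fasta(in_path)
    write_fasta(records[:n], first_path)
    write_fasta(records[-n:], last_path)
    return len(records)
```

For the alignment in this report, a call such as `split_halves("ali.fasta", "first2000.fasta", "last2000.fasta", 2000)` would produce the two inputs, each of which can then be passed to `--msa` in place of `ali.fasta`.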

Any help would be great, thanks.

@amkozlov
Owner

amkozlov commented Jun 3, 2020

Hi @davipatti,

that's an interesting one :)

Apparently rounding some very small substitution rates to zero leads to numerical problems on this particular dataset.

Could you please try to re-run with the original FLU model (attached, also available from ftp://ftp.sanger.ac.uk/pub/1000genomes/lsq/FLU), using the following command:

raxml-ng --search1 --msa ali.fasta --model PROTGTR{FLU.txt}+G --seed 42 

This seems to fix the problem for me. If you can confirm, I will update the rates in the built-in FLU model.

FLU.txt
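
To see which entries are affected by the rounding, one could scan the printed rates for values that are exactly zero at the displayed precision. The sketch below uses a short excerpt copied from the "Substitution rates (model)" line of the log above. `zero_rate_positions` is a hypothetical helper, the value `4e-7` is made up for illustration, and nothing here reflects how RAxML-NG stores rates internally.

```python
# Ten consecutive values copied from the "Substitution rates (model)" log line.
EXCERPT = (
    "0.000001 0.188539 0.338372 0.135481 0.000015 "
    "0.525399 0.297124 0.002547 0.000000 0.116941"
)

def zero_rate_positions(text):
    """Return 0-based positions of rates that parse to exactly zero."""
    values = [float(tok) for tok in text.split()]
    return [i for i, v in enumerate(values) if v == 0.0]

print(zero_rate_positions(EXCERPT))

# A small positive rate (hypothetical value) disappears at six decimals.
print(f"{4e-7:.6f}")
```

Running the same scan over the full rates line, or over a model file with more digits of precision, would show whether any zeros are a product of rounding rather than of the model itself.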

@davipatti
Author

I've tried --model PROTGTR{FLU.txt}+G and that runs fine. Thanks!

amkozlov added a commit that referenced this issue Jun 8, 2020
- parsimony starting tree construction used to be very inefficient, now runs 20x-30x faster
@amkozlov
Owner

amkozlov commented Jun 8, 2020

Thanks for the confirmation! This has been fixed by 12f68c5

@amkozlov amkozlov closed this as completed Jun 8, 2020