Per-rate scaling notification #133

Closed
boopsboops opened this issue Jan 15, 2022 · 4 comments

@boopsboops

I'm trying to speed up raxml-ng by disabling the automatic per-rate scaling (it worked fine on old raxml which I think did not implement this feature?).

But with --rate-scalers off --force I get the following note:

NOTE: Per-rate scalers were automatically enabled to prevent numerical issues on taxa-rich alignments.
NOTE: You can use --force switch to skip this check and fall back to per-site scalers.

I'm assuming that it was disabled, but can't be sure, given this message?

Also, is --force required when using --rate-scalers off? I would rather avoid it if possible.

Full log below.

Cheers!

RAxML-NG v. 1.1-master released on 29.11.2021 by The Exelixis Lab.
Developed by: Alexey M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml

System: Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz, 4 cores, 15 GB RAM

RAxML-NG was called at 15-Jan-2022 16:37:02 as follows:

raxml-ng --search --msa ali.rba --tree pars{1} --rate-scalers off --force --seed 42 --redo --threads auto

Analysis options:
  run mode: ML tree search
  start tree(s): parsimony (1)
  random seed: 42
  tip-inner: OFF
  pattern compression: ON
  per-rate scalers: OFF
  site repeats: ON
  fast spr radius: AUTO
  spr subtree cutoff: 1.000000
  branch lengths: proportional (ML estimate, algorithm: NR-FAST)
  SIMD kernels: AVX
  parallelization: coarse-grained (auto), PTHREADS (auto)

WARNING: Running in REDO mode: existing checkpoints are ignored, and all result files will be overwritten!

WARNING: Running in FORCE mode: all safety checks are disabled!

[00:00:00] Loading binary alignment from file: ali.rba
[00:00:00] Alignment comprises 5473 taxa, 1 partitions and 335 patterns

Partition 0: noname
Model: TN93+FC+G4m
Alignment sites / patterns: 338 / 335
Gaps: 10.40 %
Invariant sites: 6.21 %


Parallelization scheme autoconfig: 1 worker(s) x 4 thread(s)


NOTE: Per-rate scalers were automatically enabled to prevent numerical issues on taxa-rich alignments.
NOTE: You can use --force switch to skip this check and fall back to per-site scalers.

Parallel reduction/worker buffer size: 1 KB  / 0 KB

[00:00:00] Generating 1 parsimony starting tree(s) with 5473 taxa
[00:00:09] Data distribution: max. partitions/sites/weight per thread: 1 / 84 / 1344
[00:00:09] Data distribution: max. searches per worker: 1

Starting ML tree search with 1 distinct starting trees

@amkozlov
Owner

I don't think that rate scalers are the main problem here. Rather, you have a very low sites-to-taxa ratio (<0.1), which means the signal in the MSA is insufficient to resolve most branches. Hence, the extensive and time-consuming ML search as implemented in raxml-ng (or old RAxML) is pretty pointless on this dataset.
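
For reference, the ratio can be checked from the numbers in the log above (338 alignment sites, 5473 taxa). This is only a minimal sketch; the function name is illustrative and not part of raxml-ng:

```python
def sites_to_taxa_ratio(num_sites: int, num_taxa: int) -> float:
    """Return the number of alignment sites per taxon."""
    if num_taxa <= 0:
        raise ValueError("num_taxa must be positive")
    return num_sites / num_taxa


# Values taken from the log in this report: 338 sites, 5473 taxa.
ratio = sites_to_taxa_ratio(338, 5473)
print(f"sites-to-taxa ratio: {ratio:.3f}")  # prints: sites-to-taxa ratio: 0.062
```

The result is well below the 0.1 mentioned above.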

We have discussed this topic multiple times on our RAxML Google group; please search for keywords like "poor signal":

https://groups.google.com/g/raxml/search?q=poor%20signal

In short, your options are:

  1. subsampling / clustering
  2. use FastTree or parsimony
  3. increase convergence epsilon, e.g. --lh-epsilon 10

@boopsboops
Author

Many apologies for confusing matters by mentioning why I was disabling the rate scalers, but thanks for the tips anyway!

The main purpose of the report was simply to highlight that I had turned the scalers off, but the software reported that they had been enabled regardless. This was a bit misleading, so I was concerned that it might be a bug.
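
The mismatch can be spotted mechanically by scanning a log for both the "per-rate scalers" option line and the automatic-enable note. The sketch below assumes the log wording is exactly as shown in this report (other versions may phrase it differently); the function name and the returned keys are illustrative only:

```python
import re

# Matches the option line from the "Analysis options" block of the log.
OPTION_RE = re.compile(r"^\s*per-rate scalers:\s*(ON|OFF)\s*$", re.MULTILINE)
# Start of the NOTE line quoted in this report.
AUTO_NOTE = "Per-rate scalers were automatically enabled"


def scaler_status(log_text: str) -> dict:
    """Report the requested scaler setting and whether the log also
    contains the automatic-enable note."""
    match = OPTION_RE.search(log_text)
    requested = match.group(1) if match else None
    auto_enabled = AUTO_NOTE in log_text
    return {
        "requested": requested,
        "auto_enabled_note": auto_enabled,
        # OFF in the options plus the NOTE is the situation reported here.
        "contradictory": requested == "OFF" and auto_enabled,
    }


sample_log = """\
Analysis options:
  per-rate scalers: OFF
  site repeats: ON

NOTE: Per-rate scalers were automatically enabled to prevent numerical issues on taxa-rich alignments.
"""
status = scaler_status(sample_log)
print(status)
```

For the log in this report, `contradictory` comes out true: the options say OFF while the note says the scalers were enabled.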

@amkozlov
Owner

ok I see, thanks for reporting!

I guess you are right and it's currently not possible to disable rate scalers for alignments with >2000 taxa. I will take care of this.

amkozlov added a commit that referenced this issue Jan 20, 2022
@amkozlov
Owner

ok, this should be fixed now: you can disable automatic rate scalers with --force model_rate_scalers
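
As a rough sketch, a rerun could reuse the command line from the original report with plain `--force` replaced by `--force model_rate_scalers`. Keeping `--rate-scalers off` alongside it is an assumption that this thread does not confirm, and a build that includes the fix is required. The snippet only assembles and prints the command; it does not run raxml-ng:

```python
import shlex

# Arguments reused from the original report; only the --force value changes.
# Assumption: --rate-scalers off is still passed together with the new value.
cmd = [
    "raxml-ng", "--search",
    "--msa", "ali.rba",
    "--tree", "pars{1}",
    "--rate-scalers", "off",
    "--force", "model_rate_scalers",
    "--seed", "42",
    "--redo",
    "--threads", "auto",
]

# Print a shell-quoted version for copy and paste.
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` if running from a script is preferred. After the run, the "per-rate scalers" line and any NOTE lines in the log show whether the setting took effect.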
