Hi Adrian! Is this an initial or an incremental reindex? If the former, then 3 hours for the initial index of 360 GB of data/code is not bad at all, I'd say. Also, I wonder how exactly you disabled the history; there is no tunable in the official image that would allow this. Unless the container is configured with CPU limits, it should use all the CPU power available. The indexer in the container is run via the code at lines 11 to 15 in 8a7aa08 (RLIMIT_NOFILE resource setting might be too low - maybe it should be dynamic).
This means that there is no thread cap imposed by the tooling. Consequently, the indexer will choose the parallelism level based on the number of available CPUs. When the indexer is run without history processing, only the 2nd phase of the indexing will be executed. This phase is heavily parallelized. That said, back to the Docker image. When projects are enabled, the
And again, each of these will run with as many threads as there are CPUs in the system, assuming the container has not been configured otherwise. These threads from distinct
One way to check this would be to determine the number of threads and/or ctags processes (there is a 1-1 correspondence) spawned by the indexer while it is running. If there are no ctags processes spawned by the indexer process, the indexer has not reached the 2nd phase of the indexing yet. For example, when I run the indexing on my laptop with 8 "CPUs" (I confirmed the indexer parallelism level by running
Anyhow, while this is definitely way above the 5% CPU utilization, it seems there might be some room for improvement, assuming the I/O does not block the indexer too much. Perhaps raising the thread count above the CPU count could help.
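The check described above (counting the indexer's threads and its ctags child processes) could be sketched roughly as follows. This is only an illustration for Linux, reading `/proc` directly; the helper names are made up for this example, and it assumes the ctags binary runs under the command name `ctags` as a direct child of the indexer's Java process, which may differ on a given setup.

```python
import os
from pathlib import Path


def thread_count(pid: int, proc_root: str = "/proc") -> int:
    """Count the threads of a process: one entry per thread in /proc/<pid>/task."""
    return len(os.listdir(Path(proc_root) / str(pid) / "task"))


def child_processes_named(parent_pid: int, name: str, proc_root: str = "/proc") -> list:
    """Return sorted PIDs of direct children of parent_pid whose command name is `name`.

    `name` is an assumption: pass whatever the ctags binary is called on your system.
    """
    matches = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            stat = (entry / "stat").read_text()
        except OSError:
            continue  # the process exited while we were scanning
        # Format: "pid (comm) state ppid ..."; comm may contain spaces or
        # parentheses, so locate it via the first "(" and the last ")".
        comm = stat[stat.index("(") + 1 : stat.rindex(")")]
        fields = stat[stat.rindex(")") + 2 :].split()
        ppid = int(fields[1])  # fields[0] is the state, fields[1] the parent PID
        if ppid == parent_pid and comm == name:
            matches.append(int(entry.name))
    return sorted(matches)
```

With the PID of the running indexer in hand, `thread_count(pid)` and `len(child_processes_named(pid, "ctags"))` can be sampled periodically; if the second number stays at zero, the indexer has presumably not reached the 2nd phase yet.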
Hi,
I would like to share some information about OpenGrok performance in our environment.
Node description:
OpenGrok parameters:
Typically, indexing the source base takes ~2h50m.
But I noticed that CPU utilization is really low. I've been monitoring the indexing in htop and dstat, and they show 300-800% CPU usage, so only a small portion of the Threadripper's power (5%) is used. Clearly there is some bottleneck here.
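For reference, the arithmetic behind such a figure can be sketched as below. htop reports per-process CPU usage relative to a single logical CPU, so 100% means one fully busy CPU. The function name is made up for this illustration, and the logical CPU count of 128 in the usage note is purely a hypothetical value, since the node description is not reproduced here.

```python
def cpu_utilization_fraction(htop_percent: float, logical_cpus: int) -> float:
    """Convert an htop-style per-process CPU percentage into a fraction of the whole machine.

    htop_percent: value as shown by htop (100 == one fully busy logical CPU).
    logical_cpus: number of logical CPUs on the node (an input, not detected here).
    """
    if logical_cpus <= 0:
        raise ValueError("logical_cpus must be positive")
    return htop_percent / (100.0 * logical_cpus)
```

Under the hypothetical 128 logical CPUs, `cpu_utilization_fraction(300, 128)` and `cpu_utilization_fraction(800, 128)` give roughly 0.023 and 0.063, that is, a few percent of the machine.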
I have a question for you guys: what was the highest CPU load you saw when indexing?
Maybe you have some tips on how to unlock the full potential?
Best Regards,
Adrian