Elasticsearch version: 6.2.3
JVM version:
openjdk version "1.8.0_162"
OpenJDK Runtime Environment (build 1.8.0_162-8u162-b12-0ubuntu0.16.04.2-b12)
OpenJDK 64-Bit Server VM (build 25.162-b12, mixed mode)
OS version:
Linux elasticsearch-6-2-3-client-eu-0 4.13.0-1012-gcp #16-Ubuntu SMP Thu Mar 15 12:00:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
The instances are running in Google Cloud. Client and data nodes both run on n1-standard-16 machines (16 vCPUs and 60 GB of memory).
I'm trying to optimise the indexing rate, which is currently around 25k documents/sec but needs to be much faster. One possible issue is that a single data node takes on most of the work during bulk imports.
I bulk import in 10 MB batches and have tried lowering that all the way to 3 MB with no change. A new connection is created to each client node, and I round-robin between them when importing. I have also tried a single connection rather than one per client node, with no difference.
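For reference, a minimal sketch of the batching and round-robin scheme described above (the client-node URLs and size limit are hypothetical placeholders, and the actual HTTP POST to each node's `_bulk` endpoint is left to whatever client you use):

```python
import itertools
import json

# Hypothetical client-node endpoints; adjust to your cluster.
CLIENT_NODES = ["http://es-client-0:9200", "http://es-client-1:9200"]

def bulk_bodies(docs, index, max_bytes=10 * 1024 * 1024):
    """Yield NDJSON _bulk request bodies, each at most max_bytes."""
    lines, size = [], 0
    for doc in docs:
        action = json.dumps({"index": {"_index": index}})
        entry = action + "\n" + json.dumps(doc) + "\n"
        # Flush the current batch before it would exceed the size limit.
        if size + len(entry) > max_bytes and lines:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(entry)
        size += len(entry)
    if lines:
        yield "".join(lines)

def round_robin_targets(bodies, nodes=CLIENT_NODES):
    """Pair each bulk body with the next client node in rotation."""
    for node, body in zip(itertools.cycle(nodes), bodies):
        # POST body to f"{node}/_bulk" with Content-Type: application/x-ndjson
        yield node, body
```

Note that this only round-robins which client node receives each request; which data node does the indexing work is determined by shard routing, so a hot data node usually points to shard distribution rather than connection handling.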
The setup that I have is as follows:
Master Nodes: Three
Client Nodes: Two
Data Nodes: Four
The bulk thread pool size is set to 17 on the client and data nodes.
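For reference, the corresponding setting in elasticsearch.yml on a 6.2.x node would look something like this (in 6.3+ the bulk pool was renamed to write):

```yaml
# elasticsearch.yml — bulk thread pool sizing (ES 6.2.x)
thread_pool.bulk.size: 17
```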
Elastic provides a forum for asking general questions and prefers to reserve GitHub for verified bug reports and feature requests. There's an active community there that should be able to help answer your question. As such, I hope you don't mind that I close this.