Index sample apm events - errors, spans and transactions

Abmun/rally-apm-data

APM Rally Track

For general information about Rally tracks and how to use them, see the Rally docs and the rally and rally-tracks GitHub repos.

Starting a Race

The APM track currently offers four challenges. To store results in a file, preserve the temporary Elasticsearch installation, use an already running Elasticsearch instance, and so on, refer to Rally's command line reference.

Default challenge

Ingests error, transaction, and span events into Elasticsearch in parallel. You can start a race for the default challenge with:

esrally --track-path=<local-path-to-apm-track>

When the challenge runs for the first time, Rally downloads and decompresses the prepared corpora for error, transaction, and span events. The data used in this track was dumped from example Opbeans applications instrumented with different agents. The dump was created in October 2018 and is stored in an S3 bucket.

Single Event Type challenge

To test a single event type, use the second challenge by running:

esrally --track-path=<local-path-to-apm-track> --track-params="event_type:'<event_type>'" --challenge=ingest-event-type

This challenge uses the same test data as the default challenge.
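To race each event type in turn, the command above can be generated per type. The following is a minimal sketch, not part of the track: the checkout path is a placeholder, and the singular type names are an assumption based on the `<span|transaction>` values shown for the field explosion challenge.

```python
# Assumption: hypothetical location of the local track checkout.
TRACK_PATH = "/path/to/rally-apm-data"

# Assumption: event types use the singular names that appear in the
# <span|transaction> placeholder of the field explosion challenge.
EVENT_TYPES = ("error", "transaction", "span")


def event_type_command(track_path, event_type):
    """Return the shell command for one ingest-event-type race."""
    return (
        f"esrally --track-path={track_path} "
        f"--track-params=\"event_type:'{event_type}'\" "
        "--challenge=ingest-event-type"
    )


# Print one command per event type; copy them into a shell to run the races.
for event_type in EVENT_TYPES:
    print(event_type_command(TRACK_PATH, event_type))
```

The sketch only prints the commands, so you can review them before starting a race.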

High Field cardinality challenge

With this challenge you can test for field explosion created by high cardinality of tags in spans or transactions.

A preparation step is necessary to bring the data into the expected format.

To prepare the benchmark tests with the default data, run:

virtualenv -p python3 rally/.venv
source rally/.venv/bin/activate
pip install -r rally/_tools/requirements.txt
python ./rally/_tools/prepare.py --skip-daily

By default, 500 random tags are created. You can change this by passing command-line options to the preparation step. Run python ./rally/_tools/prepare.py --help for more information about the available options.

The base data used for this track was dumped from example Opbeans applications instrumented with different agents. The dump was created in October 2018. The base data is stored in an S3 bucket and consists of error, span, and transaction data.

After preparing the test data, run:

esrally --track-path=<local-path-to-apm-track> --track-params="event_type:'<span|transaction>'" --challenge=ingest-field-explosion

Test Query performance

When testing query performance, it can be important to have the data split into multiple daily indices. A preparation step is necessary to bring the data into the expected format.

To prepare the benchmark tests with the default data, run:

virtualenv -p python3 rally/.venv
source rally/.venv/bin/activate
pip install -r rally/_tools/requirements.txt
python ./rally/_tools/prepare.py

By default, test data for 10 days is created. You can change this by passing command-line options to the preparation step. Run python ./rally/_tools/prepare.py --help for more information about the available options.

The base data used for this track was dumped from example Opbeans applications instrumented with different agents. The dump was created in October 2018. The base data is stored in an S3 bucket and consists of error, span, and transaction data.

After preparing the test data, you can run the Rally track. You can specify the number of days for which data should be ingested. (Note: the number of days cannot be higher than the number of days for which you created data in the preparation step.)

Run esrally --track-path=<local-path-to-apm-track> --track-params="index_count:<days>" --challenge=query-apm
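The limit on index_count can be checked before starting a race. The sketch below is illustrative only: the function name is hypothetical, the default of 10 comes from the preparation step described above, and the lower bound of 1 is an assumption.

```python
# Default number of days created by prepare.py, as stated above.
DEFAULT_PREPARED_DAYS = 10


def index_count_param(index_count, prepared_days=DEFAULT_PREPARED_DAYS):
    """Return the track-params value, rejecting counts without prepared data."""
    # Assumption: at least one daily index is required.
    if not 1 <= index_count <= prepared_days:
        raise ValueError(
            f"index_count must be between 1 and {prepared_days}, got {index_count}"
        )
    return f"index_count:{index_count}"


# Build the value to pass as --track-params for a five-day race.
print(index_count_param(5))
```

If you prepared data for a different number of days, pass that number as prepared_days.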

Parameters

When providing --track-params, you can override the following default parameters for the challenges:

  • ingest_percentage: Percentage defining how much of the document corpus should be indexed. Defaults to 100.
  • bulk-size: Number of documents per bulk operation. Defaults to 1_000.
  • bulk_indexing_clients: Number of clients running bulk indexing requests.
  • event_type: Event type to index. Only used for the ingest-event-type challenge.
  • index_count: Number of daily indices to create. Only used for the query-apm challenge.

Read more about track-params in general.
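Several parameters can be combined in one --track-params value as comma-separated key:value pairs. The sketch below shows one way to assemble that string; the helper name is hypothetical and the values are illustrative, not recommendations.

```python
def format_track_params(params):
    """Join key/value pairs into a comma-separated key:value string."""
    return ",".join(f"{key}:{value}" for key, value in params.items())


# Illustrative overrides using parameter names from the list above.
overrides = {"ingest_percentage": 50, "bulk_indexing_clients": 4}
print(f'--track-params="{format_track_params(overrides)}"')
```

The printed option can be appended to any of the esrally commands in this README.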

It is recommended to run Rally on a different host than the one where your Elasticsearch instance is running. By default, Rally downloads and starts an Elasticsearch instance. You can instead provide the URL of a running Elasticsearch instance by adding --pipeline=benchmark-only --target-hosts=<ES-host>:<ES-port>
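As an illustration, the default challenge can be pointed at an existing cluster by appending those two options. This is a sketch under assumptions: the track path and host name are placeholders, and 9200 is used because it is Elasticsearch's default HTTP port.

```python
def benchmark_only_command(track_path, host, port):
    """Return the command for racing against an already running cluster."""
    return (
        f"esrally --track-path={track_path} "
        f"--pipeline=benchmark-only --target-hosts={host}:{port}"
    )


# Assumptions: hypothetical track path and host name.
print(benchmark_only_command("/path/to/rally-apm-data", "es-host.example", 9200))
```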

Using a different dataset

If you want to use the benchmark tool with different data, you can build your own corpora as a base. There is a Python script you can use to fetch data from an Elasticsearch instance. In this case, you also need to change the document-count attribute in track.json.
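To find the new document-count value, you can count the documents in your corpus. The sketch below assumes an uncompressed corpus file with one JSON document per line; the function name is hypothetical and it does not modify track.json.

```python
import json


def count_documents(corpus_path):
    """Count the JSON documents in a newline-delimited corpus file."""
    count = 0
    with open(corpus_path, encoding="utf-8") as corpus:
        for line in corpus:
            if line.strip():
                # Parse each line so malformed documents fail early.
                json.loads(line)
                count += 1
    return count
```

Call count_documents with the path to your decompressed corpus file and copy the result into the document-count attribute.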
