open-telemetry · tigrannajaryan · Nov 10, 2020 · Jul 28, 2020 · Sep 8, 2020 · Sep 11, 2020
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -35,6 +35,8 @@ New:
   ([#981](https://github.com/open-telemetry/opentelemetry-specification/pull/981))
 - Define PropagationOnly Span to simplify active Span logic in Context
   ([#994](https://github.com/open-telemetry/opentelemetry-specification/pull/994))
+- Add performance benchmark specification
+  ([#748](https://github.com/open-telemetry/opentelemetry-specification/pull/748))
 
 Updates:
 

diff --git a/specification/performance-benchmark.md b/specification/performance-benchmark.md
@@ -0,0 +1,58 @@
+# Performance Benchmark of OpenTelemetry API
+
+This document provides common guidelines for measuring and reporting the
+performance of OpenTelemetry SDKs.
+
+## Benchmark Configuration
+
+### Span Configuration
+
+- No parent `Span` and `SpanContext`.
+- Default Span [Kind](./trace/api.md#spankind) and
+  [Status](./trace/api.md#set-status).
+- Associated with a [resource](overview.md#resources) that has the attributes
+  `service.name`, `service.version`, and `name`, each with a 10-character
+  string value.
+- 1 [attribute](./common/common.md#attributes) with a signed 64-bit integer
+  value.
+- 1 [event](./trace/api.md#add-events) without any attributes.
+- The `AlwaysOn` sampler should be enabled.
+
+## Throughput Measurement
+
+### Create Spans
+
+Number of spans that can be created and exported via the OTLP exporter in 1
+second, reported per logical core and as the average over all logical cores.
+Each span contains 10 attributes, and each attribute consists of two
+20-character strings: one as the attribute name and the other as the value.
+
+## Instrumentation Cost
+
+### CPU Usage Measurement
+
+For a span throughput specified by the user, or 10,000 spans per second by
+default if the user does not specify one, measure and report the CPU usage of
+the SDK with both the simple and the batching span processors, together with
+the OTLP exporter. The benchmark should create an out-of-process OTLP
+receiver that listens on the export target, or adopt an existing OTLP exporter
+that runs out-of-process; the receiving side responds with a success status
+immediately and drops the data. The collector should not add significant CPU
+overhead to the measurement. Because the benchmark does not include user
+processing logic, the total CPU consumption of the benchmark program can be
+considered an approximation of the SDK's CPU consumption.
+
+The suggested total running time for one test iteration is at least 15
+seconds. The average and peak CPU usage should be reported.
+
+### Memory Usage Measurement
+
+Measure dynamic memory consumption, e.g. heap, for the same scenario as in the
+CPU Usage Measurement section above, with a duration of 15 seconds.
+
+## Report
+
+### Report Format
+
+All the numbers above should be measured multiple times (at least 10 times is
+suggested) and reported.
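
Note (not part of the diff): the throughput and report guidelines in the new file could be exercised with a harness along the following lines. This is only a minimal sketch under stated assumptions, not the specification's method. The names `make_attributes`, `measure_throughput`, `summarize`, and `fake_create_span` are hypothetical, and the `create_span` callable is a stand-in for whatever a real SDK would do to create a span and hand it to the OTLP exporter. The sketch runs on a single thread, so it corresponds to one logical core; a per-core average would need one such loop per core.

```python
import statistics
import time
from typing import Callable, Dict, List


def make_attributes(count: int = 10, length: int = 20) -> Dict[str, str]:
    """Build `count` attributes whose names and values are `length`-character strings."""
    attrs = {}
    for i in range(count):
        name = f"k{i}".ljust(length, "x")   # padded to exactly `length` characters
        value = f"v{i}".ljust(length, "y")
        attrs[name] = value
    return attrs


def measure_throughput(create_span: Callable[[Dict[str, str]], None],
                       duration_s: float = 1.0) -> float:
    """Call `create_span` in a loop for about `duration_s` seconds; return calls per second."""
    attrs = make_attributes()
    count = 0
    start = time.perf_counter()
    deadline = start + duration_s
    while time.perf_counter() < deadline:
        create_span(attrs)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed


def summarize(samples: List[float]) -> Dict[str, float]:
    """Summarize repeated measurements for the report."""
    return {
        "runs": len(samples),
        "mean": statistics.mean(samples),
        "min": min(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Hypothetical stand-in: a real benchmark would create and end an SDK span here.
    def fake_create_span(attrs: Dict[str, str]) -> None:
        dict(attrs)

    # Short windows keep the demo quick; the spec measures spans per 1 second.
    samples = [measure_throughput(fake_create_span, duration_s=0.1) for _ in range(10)]
    print(summarize(samples))
```

Repeating the measurement 10 times and reporting the mean alongside the minimum and maximum follows the "at least 10 times" suggestion in the Report Format section; which summary statistics to report beyond that is a design choice the specification leaves open.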