Continuous benchmark tests as part of the CI #1174

Merged · 6 commits · Jan 21, 2022
72 changes: 72 additions & 0 deletions .github/workflows/benchmark.yml
@@ -0,0 +1,72 @@
name: OpenTelemetry-cpp benchmarks
on:
  push:
    branches:
      - main

permissions:
  contents: write
  deployments: write

jobs:
  benchmark:
    name: Run OpenTelemetry-cpp benchmarks
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          submodules: 'recursive'
      - name: Mount Bazel Cache
        uses: actions/cache@v2
        env:
          cache-name: bazel_cache
        with:
          path: /home/runner/.cache/bazel
          key: bazel_benchmark
      - name: setup
        run: |
          sudo ./ci/setup_cmake.sh
          sudo ./ci/setup_ci_environment.sh
      - name: Run benchmark
        id: run_benchmarks
        run: |
          ci/do_ci.sh bazel.benchmark
          mkdir -p benchmarks
          mv api-benchmark_result.json benchmarks
          mv sdk-benchmark_result.json benchmarks
          mv exporters-benchmark_result.json benchmarks
      - uses: actions/upload-artifact@master
        with:
          name: benchmark_results
          path: benchmarks
  store_benchmark:
    needs: benchmark
    strategy:
      matrix:
        components: ["api", "sdk", "exporters"]
    name: Store benchmark result
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/download-artifact@master
        with:
          name: benchmark_results
          path: benchmarks
      - name: Print json files
        id: print_json
        run: |
          cat benchmarks/*
      - name: Push benchmark result
        uses: benchmark-action/github-action-benchmark@v1
        with:
          name: OpenTelemetry-cpp ${{ matrix.components }} Benchmark
          tool: 'googlecpp'
          output-file-path: benchmarks/${{ matrix.components }}-benchmark_result.json
          github-token: ${{ secrets.GITHUB_TOKEN }}
          auto-push: true
          # Show alert with commit comment on detecting possible performance regression
          alert-threshold: '200%'

Member:
Is this the threshold for comparison with the previous result from the main branch? Trying to understand the flow here.

Member Author:
Yes, there will be an alert similar to this when there is a slowdown bigger than 200% (i.e. when a result is more than twice as slow as the previous run from main).
A smaller threshold could be an issue, as the machines used for CI are not always the same.

Member:
Ok, will fail-on-alert block the PR merge if the threshold is exceeded? If yes, is it possible to let the CI job fail but allow the merge on a case-by-case basis?

Member Author:
> Ok, will fail-on-alert block the PR merge if the threshold is exceeded? If yes, is it possible to let the CI job fail but allow the merge on a case-by-case basis?

No, the merge will go through; only the job will fail, with an alert posted as a comment on the commit.

          comment-on-alert: true
          fail-on-alert: true
          gh-pages-branch: gh-pages
          benchmark-data-dir-path: benchmarks
6 changes: 4 additions & 2 deletions bazel/otel_cc_benchmark.bzl
@@ -24,16 +24,17 @@ def otel_cc_benchmark(name, srcs, deps, tags = [""]):
srcs = srcs,
deps = deps + ["@com_github_google_benchmark//:benchmark"],
tags = tags + ["manual"],
defines = ["BAZEL_BUILD"],
)

# The result of running the benchmark, captured into a text file.
native.genrule(
name = name + "_result",
outs = [name + "_result.txt"],
outs = [name + "_result.json"],
tools = [":" + name],
tags = tags + ["benchmark_result", "manual"],
testonly = True,
cmd = "$(location :" + name + (") --benchmark_color=false --benchmark_min_time=.1 &> $@"),
cmd = "$(location :" + name + (") --benchmark_format=json --benchmark_color=false --benchmark_min_time=.1 &> $@"),
)

# This is run as part of "bazel test ..." to smoke-test benchmarks. It's
@@ -44,4 +45,5 @@ def otel_cc_benchmark(name, srcs, deps, tags = [""]):
deps = deps + ["@com_github_google_benchmark//:benchmark"],
args = ["--benchmark_min_time=0"],
tags = tags + ["benchmark"],
defines = ["BAZEL_BUILD"],
)
47 changes: 43 additions & 4 deletions ci/do_ci.sh
@@ -17,6 +17,42 @@ function install_prometheus_cpp_client
popd
}

function run_benchmarks
{
docker run -d --rm -it -p 4317:4317 -p 4318:4318 -v \
$(pwd)/examples/otlp:/cfg otel/opentelemetry-collector:0.38.0 \
--config=/cfg/opentelemetry-collector-config/config.dev.yaml

[ -z "${BENCHMARK_DIR}" ] && export BENCHMARK_DIR=$HOME/benchmark
mkdir -p $BENCHMARK_DIR
bazel $BAZEL_STARTUP_OPTIONS build $BAZEL_OPTIONS -c opt -- \
$(bazel query 'attr("tags", "benchmark_result", ...)')
echo ""
echo "Benchmark results in $BENCHMARK_DIR:"
(
cd bazel-bin
find . -name \*_result.json -exec bash -c \
'echo "$@" && mkdir -p "$BENCHMARK_DIR/$(dirname "$@")" && \
cp "$@" "$BENCHMARK_DIR/$@" && chmod +w "$BENCHMARK_DIR/$@"' _ {} \;
)

# collect benchmark results into one array
pushd $BENCHMARK_DIR
components=(api sdk exporters)

Member:
nit - this is already defined.

Member Author:
thanks, cleaned.

for component in "${components[@]}"
do
out=$component-benchmark_result.json
find ./$component -type f -name "*_result.json" -exec cat {} \; > $component_tmp_bench.json
cat $component_tmp_bench.json | docker run -i --rm itchyny/gojq:0.12.6 -s \

Member:
nit - one minor comment: if it's not a major rework, can we use jq instead of gojq? It's more lightweight, with no external dependencies.

Member Author:
I've tried jq; it showed different behavior on my local machine (Ubuntu 20.04) and on the GHA VM (Ubuntu latest). Collecting the benchmark results of all the tests into one array didn't work with the jq on the GHA VM, so I switched to gojq, which has a docker image.

'.[0].benchmarks = ([.[].benchmarks] | add) |
if .[0].benchmarks == null then null else .[0] end' > $BENCHMARK_DIR/$out
done

mv *benchmark_result.json ${SRC_DIR}
popd
docker kill $(docker ps -q)
}

[ -z "${SRC_DIR}" ] && export SRC_DIR="`pwd`"
[ -z "${BUILD_DIR}" ] && export BUILD_DIR=$HOME/build
mkdir -p "${BUILD_DIR}"
@@ -124,6 +160,10 @@ elif [[ "$1" == "cmake.exporter.otprotocol.test" ]]; then
make -j $(nproc)
cd exporters/otlp && make test
exit 0
elif [[ "$1" == "bazel.with_abseil" ]]; then
bazel $BAZEL_STARTUP_OPTIONS build $BAZEL_OPTIONS --//api:with_abseil=true //...
bazel $BAZEL_STARTUP_OPTIONS test $BAZEL_TEST_OPTIONS --//api:with_abseil=true //...
exit 0
elif [[ "$1" == "cmake.test_example_plugin" ]]; then
# Build the plugin
cd "${BUILD_DIR}"
@@ -162,9 +202,8 @@ elif [[ "$1" == "bazel.test" ]]; then
bazel $BAZEL_STARTUP_OPTIONS build $BAZEL_OPTIONS //...
bazel $BAZEL_STARTUP_OPTIONS test $BAZEL_TEST_OPTIONS //...
exit 0
elif [[ "$1" == "bazel.with_abseil" ]]; then
bazel $BAZEL_STARTUP_OPTIONS build $BAZEL_OPTIONS --//api:with_abseil=true //...
bazel $BAZEL_STARTUP_OPTIONS test $BAZEL_TEST_OPTIONS --//api:with_abseil=true //...
elif [[ "$1" == "bazel.benchmark" ]]; then
run_benchmarks
exit 0
elif [[ "$1" == "bazel.macos.test" ]]; then
bazel $BAZEL_STARTUP_OPTIONS build $BAZEL_MACOS_OPTIONS //...
@@ -200,7 +239,7 @@ elif [[ "$1" == "benchmark" ]]; then
echo "Benchmark results in $BENCHMARK_DIR:"
(
cd bazel-bin
find . -name \*_result.txt -exec bash -c \
find . -name \*_result.json -exec bash -c \
'echo "$@" && mkdir -p "$BENCHMARK_DIR/$(dirname "$@")" && \
cp "$@" "$BENCHMARK_DIR/$@" && chmod +w "$BENCHMARK_DIR/$@"' _ {} \;
)
1 change: 1 addition & 0 deletions exporters/otlp/BUILD
@@ -279,5 +279,6 @@ otel_cc_benchmark(
],
deps = [
":otlp_grpc_exporter",
"//examples/common/foo_library:common_foo_library",
],
)
42 changes: 42 additions & 0 deletions exporters/otlp/test/otlp_grpc_exporter_benchmark.cc
@@ -4,6 +4,22 @@
#include "opentelemetry/exporters/otlp/otlp_grpc_exporter.h"
#include "opentelemetry/exporters/otlp/otlp_recordable.h"

#include <benchmark/benchmark.h>
#include "opentelemetry/sdk/trace/simple_processor.h"
#include "opentelemetry/sdk/trace/tracer_provider.h"
#include "opentelemetry/trace/provider.h"

#ifdef BAZEL_BUILD
# include "examples/common/foo_library/foo_library.h"
#else
# include "foo_library/foo_library.h"
#endif

namespace trace = opentelemetry::trace;
namespace nostd = opentelemetry::nostd;
namespace trace_sdk = opentelemetry::sdk::trace;
namespace otlp = opentelemetry::exporter::otlp;

#include <benchmark/benchmark.h>

OPENTELEMETRY_BEGIN_NAMESPACE
@@ -171,4 +187,30 @@ BENCHMARK(BM_OtlpExporterDenseSpans);
} // namespace exporter
OPENTELEMETRY_END_NAMESPACE

namespace
{
opentelemetry::exporter::otlp::OtlpGrpcExporterOptions opts;
void InitTracer()
{
// Create OTLP exporter instance
auto exporter = std::unique_ptr<trace_sdk::SpanExporter>(new otlp::OtlpGrpcExporter(opts));
auto processor = std::unique_ptr<trace_sdk::SpanProcessor>(
new trace_sdk::SimpleSpanProcessor(std::move(exporter)));
auto provider =
nostd::shared_ptr<trace::TracerProvider>(new trace_sdk::TracerProvider(std::move(processor)));
// Set the global trace provider
trace::Provider::SetTracerProvider(provider);
}

void BM_otlp_grpc_with_collector(benchmark::State &state)

Member:
Do we need to measure the benchmark with the actual collector, or would it be sufficient to have results using a fake service stub, since we are more interested in the stats resulting from the otel-cpp code?

Member Author:
I think it's a good-to-have number for the users.
Can we have both with and without the actual collector?

Member:
I would ideally like to keep the stats with an actual collector. However, we have lately seen CI failures because of transient network timeout issues, and I am just concerned that testing with a real collector instance might add to that. Also, adding docker instances to the VM may consume more resources and slow the CI jobs, and we are spawning another docker instance for jq parsing. We can keep them if you don't see any such slowness in CI over multiple iterations.

Member Author:
The test with the collector was pretty stable in my test CI, though that can't be guaranteed to always be the case. The job will be executed only when we merge a commit to the main branch, so we will see the issues, if any, only after merging to main; it can't be part of the PR checks as it would be noisy.
Shall we keep it as it is? In case we get failures, I can raise a PR to use a mock.

Member:
> Shall we keep it as it is? In case we get failures, I can raise a PR to use a mock.

Should be fine for me. Let's wait for suggestions from @ThomsonTan too.

Contributor:
Agree with @lalitb, I prefer to use a mock to avoid the result being affected by environment workload and network traffic, but we could do this in later PRs.

{
InitTracer();
while (state.KeepRunning())
{
foo_library();
}
}
BENCHMARK(BM_otlp_grpc_with_collector);
} // namespace

BENCHMARK_MAIN();
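
For reference, regarding the mock-versus-collector discussion above: a minimal sketch of a companion benchmark that exercises the same foo_library() span pipeline without a running collector, assuming the SDK's in-memory span exporter (opentelemetry/exporters/memory/in_memory_span_exporter.h) is available and added as a dependency of this benchmark target. The names InitTracerWithoutCollector and BM_spans_without_collector are illustrative and not part of this PR.

// Sketch only: a hypothetical "without collector" companion benchmark.
// Assumes exporters/memory (InMemorySpanExporter) is added to the deps of this target;
// the function and benchmark names below are illustrative, not part of this PR.
#include <memory>
#include <utility>

#include <benchmark/benchmark.h>

#include "opentelemetry/exporters/memory/in_memory_span_exporter.h"
#include "opentelemetry/sdk/trace/simple_processor.h"
#include "opentelemetry/sdk/trace/tracer_provider.h"
#include "opentelemetry/trace/provider.h"

#ifdef BAZEL_BUILD
#  include "examples/common/foo_library/foo_library.h"
#else
#  include "foo_library/foo_library.h"
#endif

namespace
{
void InitTracerWithoutCollector()
{
  // Same pipeline as InitTracer() above, but spans stay in an in-memory buffer instead
  // of being sent over gRPC, so the measurement excludes collector and network cost.
  auto exporter = std::unique_ptr<opentelemetry::sdk::trace::SpanExporter>(
      new opentelemetry::exporter::memory::InMemorySpanExporter());
  auto processor = std::unique_ptr<opentelemetry::sdk::trace::SpanProcessor>(
      new opentelemetry::sdk::trace::SimpleSpanProcessor(std::move(exporter)));
  auto provider = opentelemetry::nostd::shared_ptr<opentelemetry::trace::TracerProvider>(
      new opentelemetry::sdk::trace::TracerProvider(std::move(processor)));
  opentelemetry::trace::Provider::SetTracerProvider(provider);
}

void BM_spans_without_collector(benchmark::State &state)
{
  InitTracerWithoutCollector();
  while (state.KeepRunning())
  {
    foo_library();
  }
}
BENCHMARK(BM_spans_without_collector);
}  // namespace

A fake TraceService stub, as the reviewers suggest, would additionally keep the exporter's protobuf serialization in the measured path while still avoiding the network; the in-memory variant above only gives a lower-bound baseline for the SDK span pipeline itself.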