Skip to content

Commit

Permalink
feat: Tracing using OpenTelemetry (#1728)
Browse files Browse the repository at this point in the history
* feat: Add FirestoreOpenTelemetryOptions to FirestoreOptions. (#1531)

* feat: Add FirestoreOpenTelemetryOptions to FirestoreOptions.

* Address code review feedback.

* feat: Add com.google.cloud.firestore.telemetry package. (#1533)

* feat: Add FirestoreOpenTelemetryOptions to FirestoreOptions.

* feat: Add com.google.cloud.firestore.telemetry package.

* Address code review feedback.

* Factor out the otel version in pom.xml.

* fix: Remove OpenCensus tracing code. (#1589)

* feat: Add FirestoreOpenTelemetryOptions to FirestoreOptions.

* feat: Add com.google.cloud.firestore.telemetry package.

* fix: Remove OpenCensus tracing code.

* feat: tracing for aggregate queries, bulkwriter, partition queries, a… (#1590)

* feat: Add FirestoreOpenTelemetryOptions to FirestoreOptions.

* feat: Add com.google.cloud.firestore.telemetry package.

* fix: Remove OpenCensus tracing code.

* feat: tracing for aggregate queries, bulkwriter, partition queries, and listDocuments.

* Address code review feedback.

* Address feedback.

* don't use wildcard imports.

* feat: trace instrumentation for DocumentReference methods. (#1591)

* feat: Add FirestoreOpenTelemetryOptions to FirestoreOptions.

* feat: Add com.google.cloud.firestore.telemetry package.

* fix: Remove OpenCensus tracing code.

* feat: tracing for aggregate queries, bulkwriter, partition queries, and listDocuments.

* feat: trace instrumentation for DocumentReference methods.

* feat: trace instrumentation for queries and transactions. (#1592)

* feat: Add FirestoreOpenTelemetryOptions to FirestoreOptions.

* feat: Add com.google.cloud.firestore.telemetry package.

* fix: Remove OpenCensus tracing code.

* feat: tracing for aggregate queries, bulkwriter, partition queries, and listDocuments.

* feat: trace instrumentation for DocumentReference methods.

* feat: trace instrumentation for queries and transactions.

* test: Adding first e2e client-tracing test w/ Custom Root Span (#1621)

* test: Adding first e2e client-tracing test w/ Custom Root Span

* Roll back E2E tests commit.

* Address feedback.

* Address feedback (better event log message).

* Address feedback.

---------

Co-authored-by: Jimit J Shah <57637300+jimit-j-shah@users.noreply.github.com>

* test: End-to-End Integration Test for Client-side Tracing in Firestore Java Server SDK using OpenTelemetry SDK and Cloud Trace Exporter against Cloud Trace. (#1635)

* Adding first e2e client-tracing test w/ Custom Root Span

* test: Adding first e2e client-tracing test w/ Custom Root Span

* Fixing test dependencies and use default GCP testing project.

Fixing

* Fixing test dependencies and use default GCP testing project.

* Fixing formatting

* Add aggregationQueryGet Test

* Add bulkWriterCommitTrace Test

* Fixing running multiple-tests

* Add partitionQuery Test

* Add collectionListDocumentsTrace Test

* Add docRef*Trace Tests

* Add docRefUpdate*Trace and docRefDelete*Trace Tests

* Fixing Trace fetching using retries for missing or incomplete traces due to eventual consistency of Cloud Trace

* Add get/query Trace Tests

* Add Transaction test

* Added TraceContainer to be able to test transaction test-cases

* test: Adding Transaction tests

* test: Adding Transaction tests

* test: Adding TestParameterInjector to run the test for global and non-global opentelemetry SDK instances

* test: formatting and cleanup

* test: Adding first e2e client-tracing test w/ Custom Root Span

* test: Add aggregationQueryGet Test

* test: Add bulkWriterCommitTrace Test and fixed running multiple-tests

* test: Add partitionQuery Test

* test: Add collectionListDocumentsTrace Test

* test: Add docRefUpdate*Trace and docRefDelete*Trace Tests and fixed Trace fetching using retries for missing or incomplete traces due to eventual consistency of Cloud Trace

* test: Add get/query Trace Tests

* test: Added Transaction tests using TraceContainer to verify traces for Transaction ops (BeginTransaction, Rollback etc)

* test: Adding TestParameterInjector to run the test for global and non-global opentelemetry SDK instances

* test: Formatting and cleanup

* test: review comments

* test: fixing dfs to handle case where the compareTo callstack may be shorter than the trace callstack - don't need to throw an exception in that case

* test: Consolidating verification methods

* test: review comments

* fix: Make telemetry-related fields transient. (#1638)

* fix: Rename 'enabled' to 'tracingEnabled'. (#1639)

* fix: Rename 'enabled' to 'tracingEnabled'.

In the future, FirestoreOpenTelemetryOptions will support enabling/disabling
Logging and Metrics as well. So we should use a better name for this field.

* address feedback.

* fix: Minor improvement to the ITE2ETracingTest. (#1637)

* fix: Minor improvement to the ITE2ETracingTest.

* revert the numExpectedSpans change.

* feat: Add 'isTransactional' attribute. (#1657)

* fix: Necessary test improvements for CI environments. (#1673)

* fix: Necessary test improvements for CI environments.

* Address feedback.

* feat: Disable the tracing feature and remove public APIs.

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* Add the Firestore SDK version to the attributes.

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* Update the "test" dependency versions.

* Address feedback related to attributes.

* Add 'project_id' attribute.

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* Revert a736fcc ("Disable the tracing feature and remove public APIs").

* GlobalOtel reset for test must happen in `before`, not `after`.

* Address feedback.

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

---------

Co-authored-by: Jimit J Shah <57637300+jimit-j-shah@users.noreply.github.com>
Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
  • Loading branch information
3 people authored Jul 18, 2024
1 parent 05a6f73 commit 00dc240
Show file tree
Hide file tree
Showing 33 changed files with 4,388 additions and 591 deletions.
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -57,13 +57,13 @@ implementation 'com.google.cloud:google-cloud-firestore'
If you are using Gradle without BOM, add this to your dependencies:

```Groovy
implementation 'com.google.cloud:google-cloud-firestore:3.22.0'
implementation 'com.google.cloud:google-cloud-firestore:3.23.1'
```

If you are using SBT, add this to your dependencies:

```Scala
libraryDependencies += "com.google.cloud" % "google-cloud-firestore" % "3.22.0"
libraryDependencies += "com.google.cloud" % "google-cloud-firestore" % "3.23.1"
```
<!-- {x-version-update-end} -->

Expand Down Expand Up @@ -222,7 +222,7 @@ Java is a registered trademark of Oracle and/or its affiliates.
[kokoro-badge-link-5]: http://storage.googleapis.com/cloud-devrel-public/java/badges/java-firestore/java11.html
[stability-image]: https://img.shields.io/badge/stability-stable-green
[maven-version-image]: https://img.shields.io/maven-central/v/com.google.cloud/google-cloud-firestore.svg
[maven-version-link]: https://central.sonatype.com/artifact/com.google.cloud/google-cloud-firestore/3.22.0
[maven-version-link]: https://central.sonatype.com/artifact/com.google.cloud/google-cloud-firestore/3.23.1
[authentication]: https://github.com/googleapis/google-cloud-java#authentication
[auth-scopes]: https://developers.google.com/identity/protocols/oauth2/scopes
[predefined-iam-roles]: https://cloud.google.com/iam/docs/understanding-roles#predefined_roles
Expand Down
89 changes: 81 additions & 8 deletions google-cloud-firestore/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
</parent>
<properties>
<site.installationModule>google-cloud-firestore</site.installationModule>
<opentelemetry.version>1.38.0</opentelemetry.version>
</properties>
<dependencies>
<dependency>
Expand All @@ -39,10 +40,6 @@
<artifactId>grpc-google-cloud-firestore-v1</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.opencensus</groupId>
<artifactId>opencensus-contrib-grpc-util</artifactId>
</dependency>
<dependency>
<groupId>com.google.code.findbugs</groupId>
<artifactId>jsr305</artifactId>
Expand Down Expand Up @@ -91,10 +88,6 @@
<groupId>io.grpc</groupId>
<artifactId>grpc-stub</artifactId>
</dependency>
<dependency>
<groupId>io.opencensus</groupId>
<artifactId>opencensus-api</artifactId>
</dependency>
<dependency>
<groupId>com.google.auth</groupId>
<artifactId>google-auth-library-credentials</artifactId>
Expand All @@ -113,6 +106,20 @@
<artifactId>protobuf-java-util</artifactId>
</dependency>

<!-- OpenTelemetry -->
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-api</artifactId>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-context</artifactId>
</dependency>
<dependency>
<groupId>io.opentelemetry.instrumentation</groupId>
<artifactId>opentelemetry-grpc-1.6</artifactId>
</dependency>
<!-- END OpenTelemetry -->

<!-- Test dependencies -->
<dependency>
Expand Down Expand Up @@ -173,6 +180,72 @@
<version>3.15.0</version>
<scope>test</scope>
</dependency>
<!-- OpenTelemetry -->
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-sdk</artifactId>
<version>${opentelemetry.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-sdk-testing</artifactId>
<version>${opentelemetry.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-semconv</artifactId>
<version>1.30.1-alpha</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-sdk-trace</artifactId>
<version>${opentelemetry.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-sdk-common</artifactId>
<version>${opentelemetry.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.google.cloud.opentelemetry</groupId>
<artifactId>exporter-trace</artifactId>
<version>0.15.0</version>
<scope>test</scope>
</dependency>
<!-- END OpenTelemetry -->
<!-- Cloud Ops -->
<dependency>
<groupId>com.google.api.grpc</groupId>
<artifactId>proto-google-cloud-trace-v1</artifactId>
<version>1.3.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.google.cloud.opentelemetry</groupId>
<artifactId>exporter-trace</artifactId>
<version>0.15.0</version>
<scope>test</scope>
</dependency>
<!-- END OpenTelemetry -->
<!-- Cloud Ops -->
<dependency>
<groupId>com.google.api.grpc</groupId>
<artifactId>proto-google-cloud-trace-v1</artifactId>
<version>1.3.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-trace</artifactId>
<version>1.3.0</version>
<scope>test</scope>
</dependency>
<!-- END Cloud Ops -->
</dependencies>

<reporting>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,9 @@

package com.google.cloud.firestore;

import static com.google.cloud.firestore.telemetry.TraceUtil.ATTRIBUTE_KEY_ATTEMPT;
import static com.google.cloud.firestore.telemetry.TraceUtil.SPAN_NAME_RUN_AGGREGATION_QUERY;

import com.google.api.core.ApiFuture;
import com.google.api.core.InternalExtensionOnly;
import com.google.api.core.SettableApiFuture;
Expand All @@ -24,6 +27,8 @@
import com.google.api.gax.rpc.StatusCode;
import com.google.api.gax.rpc.StreamController;
import com.google.cloud.Timestamp;
import com.google.cloud.firestore.telemetry.TraceUtil;
import com.google.cloud.firestore.telemetry.TraceUtil.Scope;
import com.google.cloud.firestore.v1.FirestoreSettings;
import com.google.common.collect.ImmutableMap;
import com.google.firestore.v1.RunAggregationQueryRequest;
Expand All @@ -35,6 +40,7 @@
import com.google.firestore.v1.Value;
import com.google.protobuf.ByteString;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
Expand All @@ -59,6 +65,11 @@ public class AggregateQuery {
this.aliasMap = new HashMap<>();
}

@Nonnull
private TraceUtil getTraceUtil() {
return query.getFirestore().getOptions().getTraceUtil();
}

/** Returns the query whose aggregations will be calculated by this object. */
@Nonnull
public Query getQuery() {
Expand All @@ -85,34 +96,57 @@ public ApiFuture<AggregateQuerySnapshot> get() {
*/
@Nonnull
public ApiFuture<ExplainResults<AggregateQuerySnapshot>> explain(ExplainOptions options) {
AggregateQueryExplainResponseDeliverer responseDeliverer =
new AggregateQueryExplainResponseDeliverer(
/* transactionId= */ null,
/* readTime= */ null,
/* startTimeNanos= */ query.rpcContext.getClock().nanoTime(),
/* explainOptions= */ options);
runQuery(responseDeliverer);
return responseDeliverer.getFuture();
TraceUtil.Span span = getTraceUtil().startSpan(TraceUtil.SPAN_NAME_AGGREGATION_QUERY_GET);
try (Scope ignored = span.makeCurrent()) {
AggregateQueryExplainResponseDeliverer responseDeliverer =
new AggregateQueryExplainResponseDeliverer(
/* transactionId= */ null,
/* readTime= */ null,
/* startTimeNanos= */ query.rpcContext.getClock().nanoTime(),
/* explainOptions= */ options);
runQuery(responseDeliverer, /* attempt */ 0);
ApiFuture<ExplainResults<AggregateQuerySnapshot>> result = responseDeliverer.getFuture();
span.endAtFuture(result);
return result;
} catch (Exception error) {
span.end(error);
throw error;
}
}

@Nonnull
ApiFuture<AggregateQuerySnapshot> get(
@Nullable final ByteString transactionId, @Nullable com.google.protobuf.Timestamp readTime) {
AggregateQueryResponseDeliverer responseDeliverer =
new AggregateQueryResponseDeliverer(
transactionId, readTime, /* startTimeNanos= */ query.rpcContext.getClock().nanoTime());
runQuery(responseDeliverer);
return responseDeliverer.getFuture();
TraceUtil.Span span =
getTraceUtil()
.startSpan(
transactionId == null
? TraceUtil.SPAN_NAME_AGGREGATION_QUERY_GET
: TraceUtil.SPAN_NAME_TRANSACTION_GET_AGGREGATION_QUERY);
try (Scope ignored = span.makeCurrent()) {
AggregateQueryResponseDeliverer responseDeliverer =
new AggregateQueryResponseDeliverer(
transactionId,
readTime,
/* startTimeNanos= */ query.rpcContext.getClock().nanoTime());
runQuery(responseDeliverer, /* attempt= */ 0);
ApiFuture<AggregateQuerySnapshot> result = responseDeliverer.getFuture();
span.endAtFuture(result);
return result;
} catch (Exception error) {
span.end(error);
throw error;
}
}

private <T> void runQuery(ResponseDeliverer<T> responseDeliverer) {
private <T> void runQuery(ResponseDeliverer<T> responseDeliverer, int attempt) {
RunAggregationQueryRequest request =
toProto(
responseDeliverer.getTransactionId(),
responseDeliverer.getReadTime(),
responseDeliverer.getExplainOptions());
AggregateQueryResponseObserver<T> responseObserver =
new AggregateQueryResponseObserver<T>(responseDeliverer);
new AggregateQueryResponseObserver<T>(responseDeliverer, attempt);
ServerStreamingCallable<RunAggregationQueryRequest, RunAggregationQueryResponse> callable =
query.rpcContext.getClient().runAggregationQueryCallable();
query.rpcContext.streamRequest(request, responseObserver, callable);
Expand Down Expand Up @@ -249,20 +283,34 @@ private final class AggregateQueryResponseObserver<T>
private Timestamp readTime = Timestamp.MAX_VALUE;
@Nullable private Map<String, Value> aggregateFieldsMap = null;
@Nullable private ExplainMetrics metrics = null;
private int attempt;

AggregateQueryResponseObserver(ResponseDeliverer<T> responseDeliverer) {
AggregateQueryResponseObserver(ResponseDeliverer<T> responseDeliverer, int attempt) {
this.responseDeliverer = responseDeliverer;
this.attempt = attempt;
}

Map<String, Object> getAttemptAttributes() {
return Collections.singletonMap(ATTRIBUTE_KEY_ATTEMPT, attempt);
}

private boolean isExplainQuery() {
return this.responseDeliverer.getExplainOptions() != null;
}

@Override
public void onStart(StreamController streamController) {}
public void onStart(StreamController streamController) {
getTraceUtil()
.currentSpan()
.addEvent(SPAN_NAME_RUN_AGGREGATION_QUERY + " Stream started.", getAttemptAttributes());
}

@Override
public void onResponse(RunAggregationQueryResponse response) {
getTraceUtil()
.currentSpan()
.addEvent(
SPAN_NAME_RUN_AGGREGATION_QUERY + " Response Received.", getAttemptAttributes());
if (response.hasReadTime()) {
readTime = Timestamp.fromProto(response.getReadTime());
}
Expand All @@ -288,8 +336,19 @@ public void onResponse(RunAggregationQueryResponse response) {
@Override
public void onError(Throwable throwable) {
if (shouldRetry(throwable)) {
runQuery(responseDeliverer);
getTraceUtil()
.currentSpan()
.addEvent(
SPAN_NAME_RUN_AGGREGATION_QUERY + ": Retryable Error",
Collections.singletonMap("error.message", throwable.getMessage()));

runQuery(responseDeliverer, attempt + 1);
} else {
getTraceUtil()
.currentSpan()
.addEvent(
SPAN_NAME_RUN_AGGREGATION_QUERY + ": Error",
Collections.singletonMap("error.message", throwable.getMessage()));
responseDeliverer.deliverError(throwable);
}
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,13 +21,10 @@
import com.google.api.gax.rpc.ApiException;
import com.google.cloud.Timestamp;
import com.google.common.base.Preconditions;
import com.google.common.collect.ImmutableMap;
import com.google.common.util.concurrent.MoreExecutors;
import com.google.firestore.v1.BatchWriteRequest;
import com.google.firestore.v1.BatchWriteResponse;
import io.grpc.Status;
import io.opencensus.trace.AttributeValue;
import io.opencensus.trace.Tracing;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
Expand Down Expand Up @@ -69,18 +66,10 @@ ApiFuture<WriteResult> wrapResult(int writeIndex) {
* <p>The writes in the batch are not applied atomically and can be applied out of order.
*/
ApiFuture<Void> bulkCommit() {

// Follows same thread safety logic as `UpdateBuilder::commit`.
committed = true;
BatchWriteRequest request = buildBatchWriteRequest();

Tracing.getTracer()
.getCurrentSpan()
.addAnnotation(
TraceUtil.SPAN_NAME_BATCHWRITE,
ImmutableMap.of(
"numDocuments", AttributeValue.longAttributeValue(request.getWritesCount())));

ApiFuture<BatchWriteResponse> response =
processExceptions(
firestore.sendRequest(request, firestore.getClient().batchWriteCallable()));
Expand Down
Loading

0 comments on commit 00dc240

Please sign in to comment.