Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Lineage metrics for BigtableIO #32068

Merged
merged 4 commits into from
Aug 6, 2024
Merged

Conversation

Abacn
Copy link
Contributor

@Abacn Abacn commented Aug 2, 2024

Please add a meaningful description for your change here


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.

@@ -71,6 +71,7 @@
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO.BigtableSource;
import org.apache.beam.sdk.io.range.ByteKeyRange;
import org.apache.beam.sdk.metrics.Distribution;
import org.apache.beam.sdk.metrics.Lineage;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the challenges is that BigtableIO has a complicated fallback logic to resolve projectId / instanceId. From transform setting / Bigtable setting / pipeline options, and there is BigtableConfig/BigtableWriteOptions/BigtableDataSettings. This was partly due to client migration that supported both old and new config class.

For this reason this PR chose to emit metrics at the low level (BigtableServiceImpl) where the worker actually talking to the server, and where projectId/instanceId has been resolved definitely

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

@@ -23,11 +23,8 @@
public class Lineage {

public static final String LINEAGE_NAMESPACE = "lineage";
public static final String SOURCE_METRIC_NAME = "sources";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is breaking change but Lineage is just introduced for one release and only used internally. I would suggest change the constant String to a enum so we can have a query method guarantees the metric name is among defined ones

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG. Thanks.

checkLineageSourceMetric(r, tableId);
}

private void checkLineageSourceMetric(PipelineResult r, String tableId) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one cannot check Lineage metrics in unit test in BigtableIO as what BigQueryIO does is because the metrics are emitted in ServiceImpl, where unit tests used a fake Reader + ServiceImpl which does not emit Lineage (and it does not make sense add Lineage metrics there as well, because it is completely differerent code path from the real Reader / ServiceImpl)

@Abacn
Copy link
Contributor Author

Abacn commented Aug 3, 2024

testE2EBigtableSegmentRead failed Lineage metrics not exist, however the test passed locally

The reason is that there is actually an OOM seen in log:

SEVERE: Error occurred within org.apache.beam.runners.direct.DirectTransformExecutor@7c79c330
java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Arrays.java:3236)
	at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:191)
	at org.apache.beam.sdk.util.ExposedByteArrayOutputStream.toByteArray(ExposedByteArrayOutputStream.java:105)
	at org.apache.beam.sdk.util.CoderUtils.encodeToByteArray(CoderUtils.java:71)
	at org.apache.beam.sdk.util.CoderUtils.encodeToByteArray(CoderUtils.java:55)
	at org.apache.beam.sdk.util.CoderUtils.clone(CoderUtils.java:168)
	at org.apache.beam.runners.direct.CloningBundleFactory$CloningBundle.add(CloningBundleFactory.java:87)
	at org.apache.beam.runners.direct.ImmutabilityCheckingBundleFactory$ImmutabilityEnforcingBundle.add(ImmutabilityCheckingBundleFactory.java:119)
	at org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$BoundedReadEvaluator.processElement(BoundedReadEvaluatorFactory.java:154)
	at org.apache.beam.runners.direct.DirectTransformExecutor.processElements(DirectTransformExecutor.java:165)
	at org.apache.beam.runners.direct.DirectTransformExecutor.run(DirectTransformExecutor.java:129)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

consequently, the pipeline actually failed, but the test does not assert it! This test actually has always been failing

@Abacn Abacn marked this pull request as ready for review August 3, 2024 03:30
Copy link
Contributor

github-actions bot commented Aug 3, 2024

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

@Abacn
Copy link
Contributor Author

Abacn commented Aug 5, 2024

R: @rohitsinha54

Copy link
Contributor

github-actions bot commented Aug 5, 2024

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment assign set of reviewers

@Abacn
Copy link
Contributor Author

Abacn commented Aug 5, 2024

R: @robertwb

@Abacn Abacn mentioned this pull request Aug 6, 2024
3 tasks
@@ -23,11 +23,8 @@
public class Lineage {

public static final String LINEAGE_NAMESPACE = "lineage";
public static final String SOURCE_METRIC_NAME = "sources";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG. Thanks.

@@ -218,6 +218,10 @@ task integrationTest(type: Test, dependsOn: processTestResources) {

useJUnit {
excludeCategories "org.apache.beam.sdk.testing.UsesKms"
filter {
// https://github.com/apache/beam/issues/32071
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting.....
Thanks for catching this.

@@ -71,6 +71,7 @@
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO.BigtableSource;
import org.apache.beam.sdk.io.range.ByteKeyRange;
import org.apache.beam.sdk.metrics.Distribution;
import org.apache.beam.sdk.metrics.Lineage;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

@@ -38,4 +38,35 @@ public static StringSet getSources() {
public static StringSet getSinks() {
return SINKS;
}

/** Query {@link StringSet} metrics from {@link MetricResults}. */
public static Set<String> query(MetricResults results, Type type) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@Abacn Abacn merged commit 5ab908b into apache:master Aug 6, 2024
30 checks passed
@Abacn Abacn deleted the lineagebigtable branch August 6, 2024 19:04
@Abacn Abacn mentioned this pull request Aug 9, 2024
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants