Add storeCompactionState flag support to msq #15965

gargvishesh · 2024-02-26T05:55:30Z

Compaction in the native engine by default records the state of compaction for each segment in the lastCompactionState segment field. This PR adds support for doing the same in the MSQ engine, targeted for future cases such as REPLACE and compaction done via MSQ.

Note that this PR doesn't implicitly store the compaction state for MSQ replace tasks; the state is stored only when the flag "storeCompactionState": true is set in the query context.
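As a rough illustration of the opt-in behavior described above, the sketch below reads a boolean flag from a query context map and falls back to a caller-supplied default. The class and method names are hypothetical and not taken from the Druid code base; only the key name "storeCompactionState" comes from this PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Druid: reads an opt-in boolean flag from a query context.
final class ContextFlagSketch {
  static final String STORE_COMPACTION_STATE = "storeCompactionState";

  // Returns the flag value, or defaultValue when the key is absent.
  static boolean readBoolean(Map<String, Object> context, String key, boolean defaultValue) {
    Object value = context.get(key);
    if (value == null) {
      return defaultValue;
    }
    if (value instanceof Boolean) {
      return (Boolean) value;
    }
    // Context values often arrive as strings when deserialized from JSON.
    return Boolean.parseBoolean(String.valueOf(value));
  }

  public static void main(String[] args) {
    Map<String, Object> context = new HashMap<>();
    context.put(STORE_COMPACTION_STATE, true);
    System.out.println(readBoolean(context, STORE_COMPACTION_STATE, false));
  }
}
```

The default is passed in by the caller here because this sketch makes no claim about the value of the real default constant.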

Release Note

storeCompactionState context flag is now supported for MSQ Replace tasks.

cryptoe

Left some comments.
We should also add an assert for the compaction config in MSQReplaceTests.

cryptoe · 2024-02-26T07:43:21Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

+      List<String> partitionDimensions
+  )
+  {
+    final boolean storeCompactionState = task.getContextValue(


We should document this in the SQLBasedIngestion context parameter docs, stating that MSQ supports these context parameters, and link them to the original documentation of storeCompactionState.

I'm just thinking if it would be better to move it to query context instead of task context to enable setting it from the web-console. Any thoughts on that?

Moved this to query context

cryptoe · 2024-02-26T09:16:27Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

+      IndexSpec indexSpec = task.getQuerySpec().getTuningConfig().getIndexSpec();
+      GranularitySpec granularitySpec = dataSchema.getGranularitySpec();
+      DimensionsSpec dimensionsSpec = dataSchema.getDimensionsSpec();
+      Map<String, Object> transformSpec = dataSchema.getTransformSpec() == null


Curious to know why this is required for compaction state.

All these fields are captured when setting compaction state in the native flow.

cryptoe · 2024-02-26T09:17:33Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

+        Tasks.DEFAULT_STORE_COMPACTION_STATE
+    );
+
+    if (storeCompactionState) {


Where are we checking that the SQL statement is a REPLACE?

That would be part of the logic to set this flag itself. Currently, this PR doesn't incorporate the logic to implicitly set it in REPLACE commands, so this would have to be explicitly set.

If the flag is set and the user issues an insert command, it should be an error.

You can check if it's a replace query using:

druid/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/destination/DataSourceMSQDestination.java

Line 131 in aeaf41f

public boolean isReplaceTimeChunks()

Done. Thanks!
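A minimal sketch of the check discussed in this thread, assuming a destination object that exposes isReplaceTimeChunks() as in the snippet referenced above. The Destination interface, the validate method, and the exception type are hypothetical stand-ins; the actual change may report the error differently.

```java
// Hypothetical sketch: reject storeCompactionState for non-REPLACE queries.
final class ReplaceOnlyFlagSketch {
  // Stand-in for a destination that knows whether it replaces time chunks.
  interface Destination {
    boolean isReplaceTimeChunks();
  }

  static void validate(boolean storeCompactionState, Destination destination) {
    if (storeCompactionState && !destination.isReplaceTimeChunks()) {
      throw new IllegalArgumentException(
          "storeCompactionState is only supported for REPLACE queries"
      );
    }
  }

  public static void main(String[] args) {
    validate(true, () -> true);   // REPLACE with the flag set: accepted
    validate(false, () -> false); // INSERT without the flag: accepted
    try {
      validate(true, () -> false); // INSERT with the flag set: rejected
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```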

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

gargvishesh · 2024-02-26T09:57:46Z

Left some comments. We should also add an assert for the compaction config in MSQReplaceTests.

A clarification: this PR doesn't implicitly set the compaction state for all MSQ replace tasks


extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQReplaceTest.java

AmatyaAvadhanula

Would it be better to extract the dataSchema, indexSpec, granularitySpec and (compute) the partitionsSpec, and use a common utility for both native batch and MSQ?

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

gargvishesh · 2024-03-22T05:22:27Z

Would it be better to extract the dataSchema, indexSpec, granularitySpec and (compute) the partitionsSpec, and use a common utility for both native batch and MSQ?

The bulk of the code is to extract/compute these values themselves, so I think the common utility will be of little value other than just creating the CompactionState object and the transform fn.

AmatyaAvadhanula · 2024-03-22T05:43:02Z

other than just creating the CompactionState object and the transform fn.

I think we are interested in storing the compaction state to prevent additional compactions from running on an already compacted interval.
If there are two separate methods for the native batch and MSQ based compaction state computations, it's possible that they diverge, leading to unwanted compactions.

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

kfaraz · 2024-04-01T07:46:04Z

extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQReplaceTest.java

+
+  @MethodSource("data")
+  @ParameterizedTest(name = "{index}:with context {0}")
+  public void testReplaceSegmentsWithQuarterSegmentGranularity(String contextName, Map<String, Object> context)


This parameter needs to be removed.

It is required and is there in every other test -- though unused.

I see, thanks for the clarification. The second parameter is used just to name the test in JUnit. There also seems to be a lot of repetition of the annotations.

Edit: Apparently this is just the way JUnit 5 works. There is no way (yet) to parameterize the constructor of the entire test class.


gargvishesh · 2024-03-22T13:07:56Z

I think we are interested in storing the compaction state to prevent additional compactions from running on an already compacted interval.

I've moved the annotation function calculation to a common place now

website/.spelling

cryptoe

Left some comments. Overall lgtm.
@gargvishesh Please test this on a local dev cluster, confirming that compaction does not trigger if we set this flag, and that it does trigger if there is a change in the "compaction spec".

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

cryptoe · 2024-04-01T07:07:31Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

+    } else if (Objects.equals(shardSpec.getType(), ShardSpec.Type.NUMBERED)) {
+      partitionSpec = new DynamicPartitionsSpec(task.getQuerySpec().getTuningConfig().getRowsPerSegment(), null);
+    } else {
+      log.error(


I feel this should be an MSQ error, i.e. throw an MSQ fault and fail the job, since if we add new shardSpecs to MSQ, we should also add support for storing their compaction state. If we do not add code here, the user's jobs would pass with only this error message in the logs, and it would take a lot of debugging to figure out that we missed adding support here.

Throwing an MSQException now.
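The fail-fast idea from this thread could be sketched as follows. The enum, the returned labels, and the exception type are hypothetical; the real code works with Druid's shard spec types and throws an MSQException.

```java
// Hypothetical sketch: map a shard spec type to a partitions spec label,
// failing loudly for any type that has no mapping yet.
final class ShardSpecMappingSketch {
  enum ShardType { RANGE, NUMBERED, SOME_FUTURE_TYPE }

  static String partitionsSpecFor(ShardType type) {
    switch (type) {
      case RANGE:
        return "range";
      case NUMBERED:
        return "dynamic";
      default:
        // Logging and continuing would let the job succeed without a stored
        // compaction state; throwing makes the missing support visible at once.
        throw new IllegalStateException(
            "Cannot store compaction state for shard spec type " + type
        );
    }
  }

  public static void main(String[] args) {
    System.out.println(partitionsSpecFor(ShardType.NUMBERED));
  }
}
```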

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

kfaraz · 2024-04-01T07:54:54Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

@@ -1715,9 +1726,109 @@ private void publishSegmentsIfNeeded(
      //noinspection unchecked
      @SuppressWarnings("unchecked")
      final Set<DataSegment> segments = (Set<DataSegment>) queryKernel.getResultObjectForStage(finalStageId);
+
+      Function<Set<DataSegment>, Set<DataSegment>> compactionStateAnnotateFunction = Function.identity();


Rather than declaring this function outside the if-else and assigning identity() to it, it should just be declared where necessary. You can make the segments non-final.

Applying the function in the branch itself now before sending to publish.
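A small sketch of the shape suggested here: the annotation function is applied only inside the branch that needs it, rather than being pre-assigned to identity(). Segments are modeled as plain strings and all names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: annotate segments inside the branch, then publish.
final class AnnotateInBranchSketch {
  static Set<String> segmentsToPublish(
      Set<String> segments,
      boolean storeCompactionState,
      Function<Set<String>, Set<String>> annotate
  ) {
    if (storeCompactionState) {
      // Reassign the non-final local only when annotation is required.
      segments = annotate.apply(segments);
    }
    return segments;
  }

  // Example annotation: tag every segment id with a marker suffix.
  static Set<String> tagAll(Set<String> segments) {
    Set<String> tagged = new LinkedHashSet<>();
    for (String segment : segments) {
      tagged.add(segment + "#compacted");
    }
    return tagged;
  }

  public static void main(String[] args) {
    Set<String> segments = new LinkedHashSet<>();
    segments.add("seg-1");
    System.out.println(segmentsToPublish(segments, true, AnnotateInBranchSketch::tagAll));
  }
}
```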

kfaraz

@gargvishesh , thanks for the changes.

I have left some comments. There are also some from @cryptoe that need to be addressed before this PR can be merged.

I guess at some point, we would need a test that verifies that segments written by MSQ REPLACE are indeed not picked up by compaction if the desired state matches. But maybe we can do that in an integration-test in a follow up PR.

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

...xing-service/src/main/java/org/apache/druid/indexing/common/task/AbstractBatchIndexTask.java

extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQReplaceTest.java

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

kfaraz · 2024-04-01T08:19:54Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

+    if ((Objects.equals(shardSpec.getType(), ShardSpec.Type.SINGLE)
+         || Objects.equals(shardSpec.getType(), ShardSpec.Type.RANGE))) {
+      List<String> partitionDimensions = ((DimensionRangeShardSpec) shardSpec).getDimensions();
+      partitionSpec = new DimensionRangePartitionsSpec(


Since we are using DimensionRangePartitionsSpec for single-dim segments, is it possible that segments partitioned by single-dim would get re-picked by compaction if compaction config has single as the desired state?
I am not entirely sure if we still allow users to use single in the compaction config.

From the code, the equals method compares the class type first before comparing the fields themselves. So you are right: a single type spec should be stored in the corresponding instance. Have updated the handling now. Thanks!
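To make the equals point concrete, here is a simplified model with two hypothetical classes standing in for the range and single-dimension partitions specs. It assumes, as described above, that equals compares the runtime class before the fields; it is not the Druid implementation.

```java
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical model: two spec classes with identical fields but different types.
final class SpecEqualitySketch {
  static class RangeSpec {
    final List<String> dimensions;

    RangeSpec(List<String> dimensions) {
      this.dimensions = dimensions;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      // The class check comes first, so a subclass instance never equals a RangeSpec.
      if (o == null || getClass() != o.getClass()) {
        return false;
      }
      return dimensions.equals(((RangeSpec) o).dimensions);
    }

    @Override
    public int hashCode() {
      return Objects.hash(dimensions);
    }
  }

  static class SingleDimSpec extends RangeSpec {
    SingleDimSpec(String dimension) {
      super(Collections.singletonList(dimension));
    }
  }

  public static void main(String[] args) {
    RangeSpec stored = new RangeSpec(Collections.singletonList("country"));
    RangeSpec desired = new SingleDimSpec("country");
    // Same dimension list, different classes: the states do not match.
    System.out.println(stored.equals(desired));
  }
}
```

Under this model, a segment stored with the range-style spec would look "not yet compacted" to a config that asks for the single-dimension spec, which is the re-pick scenario raised in the question.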

gargvishesh · 2024-04-02T06:02:21Z

@cryptoe @kfaraz
Tested it on local cluster. Storing compaction state and triggering compaction only if states differ are working as expected. However, the defaults for DynamicPartitionsSpec weren't matching with those used by compaction. I've changed the spec creation from using DynamicPartitionsSpec(task.getQuerySpec().getTuningConfig().getRowsPerSegment(), null) to DynamicPartitionsSpec(null, DynamicPartitionsSpec.DEFAULT_COMPACTION_MAX_TOTAL_ROWS). The original task.getQuerySpec().getTuningConfig().getRowsPerSegment() uses 3M by default and is meant to be target #rows whereas default maxRowsPerSegment used by DynamicPartitionsSpec is 5M.

cryptoe · 2024-04-02T10:27:34Z

Tested it on local cluster. Storing compaction state and triggering compaction only if states differ are working as expected. However, the defaults for DynamicPartitionsSpec weren't matching with those used by compaction. I've changed the spec creation from using DynamicPartitionsSpec(task.getQuerySpec().getTuningConfig().getRowsPerSegment(), null) to DynamicPartitionsSpec(null, DynamicPartitionsSpec.DEFAULT_COMPACTION_MAX_TOTAL_ROWS). The original task.getQuerySpec().getTuningConfig().getRowsPerSegment() uses 3M by default and is meant to be target #rows whereas default maxRowsPerSegment used by DynamicPartitionsSpec is 5M.

Very important catch @gargvishesh . Thank you.
I feel the fix is not clean though. We would have to adjust the shuffle specs in MSQ.
Let me think through a bit more.

docs/multi-stage-query/reference.md

kfaraz · 2024-04-05T04:39:49Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

+    final MSQTuningConfig tuningConfig = task.getQuerySpec().getTuningConfig();
+    PartitionsSpec partitionSpec;
+
+    // There is currently no way of specifying either maxRowsPerSegment or maxTotalRows for an MSQ task.


Please add another line of comment to explain the implications of this fact for the code here.

Added and relocated to the appropriate line.

cryptoe · 2024-04-08T10:34:09Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

+    } else if (Objects.equals(shardSpec.getType(), ShardSpec.Type.NUMBERED)) {
+      // There is currently no way of specifying either maxRowsPerSegment or maxTotalRows for an MSQ task.
+      // Hence using null for both which ends up translating to DEFAULT_MAX_ROWS_PER_SEGMENT for maxRowsPerSegment.
+      partitionSpec = new DynamicPartitionsSpec(null, null);


maxRowsPerSegment = numRowsPerSegment, no?

maybe use this spec :

druid/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/MSQTuningConfig.java

Line 118 in c72e69a

public int getRowsPerSegment()

cryptoe

Left 2 comments. LGTM otherwise.

cryptoe · 2024-04-08T10:54:31Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

+    PartitionsSpec partitionSpec;
+
+    if (Objects.equals(shardSpec.getType(), ShardSpec.Type.SINGLE)) {
+      String partitionDimension = ((SingleDimensionShardSpec) shardSpec).getDimension();


I think the shard spec cannot be single in MSQ. Let's just check for the Range shard spec.

cryptoe

Changes LGTM!!


cryptoe · 2024-04-09T11:18:09Z

@gargvishesh Could you please add the release notes in the PR description.

gargvishesh added 2 commits February 23, 2024 21:49

Add storeCompactionState annotation function

d2c28d4

Add flag and change some config sources

555d5d5

github-actions bot added the Area - Batch Ingestion and Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262) labels Feb 26, 2024

Add type check for shard spec before casting

0ac20e4

cryptoe reviewed Feb 26, 2024

gargvishesh added 5 commits February 26, 2024 15:37

Check if there is a segment granularity in the context and revise the granularity spec accordingly

a6d3dc0

Check if there is a segment granularity in the context and revise the granularity spec accordingly

b218280

Address review comments

33b5a82

Add tests for compaction state

cf37c65

Corrections

f877b91

kfaraz self-requested a review March 5, 2024 05:39

github-advanced-security bot found potential problems Mar 5, 2024

extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQReplaceTest.java (fixed)

AmatyaAvadhanula self-requested a review March 7, 2024 03:18

AmatyaAvadhanula reviewed Mar 7, 2024

gargvishesh added 2 commits March 22, 2024 10:45

Address review comments

6464605

Remove unused var

b24a7c9

gargvishesh added 2 commits March 22, 2024 10:55

Merge branch 'master' into add-store-compaction-state-to-msq

ac2d8f5

Fix compilation errors due to junit5 migration

1a0517c

AmatyaAvadhanula reviewed Mar 22, 2024

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java (outdated, resolved)

AmatyaAvadhanula reviewed Mar 22, 2024

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java (outdated, resolved)

github-advanced-security bot found potential problems Mar 22, 2024

Separate compactionStateAnnotationFunction to a common place, and other minor changes.

f402523

github-actions bot added the Area - Ingestion label Mar 22, 2024

gargvishesh added 2 commits March 22, 2024 18:43

Checkstyle fixes

13f2c99

Try again

3b57dfa

kfaraz reviewed Apr 1, 2024

website/.spelling (outdated, resolved)

cryptoe reviewed Apr 1, 2024

kfaraz reviewed Apr 1, 2024

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java (outdated, resolved)

kfaraz reviewed Apr 1, 2024

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java (outdated, resolved)

kfaraz reviewed Apr 1, 2024

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java (outdated, resolved)

kfaraz reviewed Apr 1, 2024

gargvishesh added 2 commits April 2, 2024 11:05

Address review comments

c87ff9f

Resolve checkstyle errors

f18650d

gargvishesh requested review from cryptoe and kfaraz April 2, 2024 06:06

Remove redundant comment

49053db

Revert maxTotalRows to null

29ea760

kfaraz reviewed Apr 5, 2024

Address review comments and fix tests

7e43b5d

kfaraz approved these changes Apr 5, 2024

cryptoe reviewed Apr 8, 2024

Correct values in DynamicPartitionSpec.

a282e32

cryptoe approved these changes Apr 8, 2024

gargvishesh added 4 commits April 8, 2024 17:11

Fix checkstyle

b6bc0a5

Fix tests

6a4edc9

Merge branch 'master' into add-store-compaction-state-to-msq

be761ed

Merge branch 'master' into add-store-compaction-state-to-msq

659fe07

# Conflicts: # extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQReplaceTest.java

cryptoe merged commit 3d595cf into apache:master Apr 9, 2024
85 checks passed

adarshsanjeev added this to the 30.0.0 milestone May 6, 2024

adarshsanjeev mentioned this pull request May 28, 2024

[DRAFT] 30.0.0 release notes #16505

Closed
