Cleanup serialization of TaskReportMap #16217
Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>
…nto cleanup_build_task_report
Assert.assertEquals(reportMap1, reportMap2);
}

@Test
public void testWriteReportMapToStringAndRead() throws Exception
Could we also add a test to verify that a serialized old type task report `Map<String, TaskReport>` deserializes correctly into the new type `TaskReport.ReportMap`? Should address any upgrade concerns.
Ditto for the reverse roundtrip for a downgrade scenario.
Sure, that makes sense.
 * a TaskReport is serialized without the type information and cannot be
 * deserialized back into a concrete implementation.
 */
class ReportMap extends LinkedHashMap<String, TaskReport>
Are there any tests that verify the reports are indeed ordered since we rely on a `LinkedHashMap`? Just looking at the callers of `buildTaskReports()`, I don't seem to find any.
No, but I can add a test to verify the order. That said, I don't see any actual task writing a report map that contains multiple entries. I'm also not sure why the order was considered important in the first place; it's JSON anyway.
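A minimal sketch of what such an ordering check could look like, using only the JDK. `Report`, `buildReports` and `keysInIterationOrder` are hypothetical stand-ins introduced for illustration; they are not the actual Druid classes or the signature of `buildTaskReports()`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;

final class ReportOrderSketch
{
  // Hypothetical stand-in for TaskReport; only the report key matters here.
  static final class Report
  {
    final String key;

    Report(String key)
    {
      this.key = key;
    }
  }

  // Same shape as the class under review: a LinkedHashMap subclass.
  static class ReportMap extends LinkedHashMap<String, Report>
  {
  }

  // Hypothetical builder: inserts the reports in argument order.
  static ReportMap buildReports(Report... reports)
  {
    ReportMap map = new ReportMap();
    for (Report report : reports) {
      map.put(report.key, report);
    }
    return map;
  }

  // Returns the keys in the order the map iterates over them.
  static List<String> keysInIterationOrder(ReportMap map)
  {
    return new ArrayList<>(map.keySet());
  }
}
```

A test along these lines would insert keys in a non-alphabetical order and assert that iteration returns them in insertion order, which is what `LinkedHashMap` documents.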
@@ -546,12 +548,17 @@ public ListenableFuture<Void> runTask(String taskId, Object taskObject)

@Override
public ListenableFuture<Map<String, Object>> taskReportAsMap(String taskId)
{
  return Futures.immediateFuture(null);
Should this call `getLiveReportsForTask(taskId)`?
It was doing that originally but not needed right now as I have added the other method `getLiveReportsForTask()` just below this one.
This is anyway used only in the tests and I plan to fix it back up once I replace the `Map<String, Object>` in the `OverlordClient` with `TaskReport.ReportMap`.
Ok, makes sense
  return Futures.immediateFuture(null);
}

public TaskReport.ReportMap getLiveReportsForTask(String taskId)
Suggested change:
- public TaskReport.ReportMap getLiveReportsForTask(String taskId)
+ protected TaskReport.ReportMap getLiveReportsForTask(String taskId)
@@ -546,12 +548,17 @@ public ListenableFuture<Void> runTask(String taskId, Object taskObject)

@Override
public ListenableFuture<Map<String, Object>> taskReportAsMap(String taskId)
Can `taskReportAsMap()` now return the concrete type `TaskReport.ReportMap` instead of `Map<String, Object>`?
Yes, I have that change in a follow up PR. Didn't do it here as it requires moving all the `TaskReport` related classes to the `druid-processing` module, so that `OverlordClient` can use it.
Thanks a lot for the review, @abhishekrb19!
LGTM, thanks! I'm ok with doing the suggestions in a follow-up 👍
Follow up to #16217

Changes:
- Update `OverlordClient.getReportAsMap()` to return `TaskReport.ReportMap`
- Move the following classes to `org.apache.druid.indexer.report` in the `druid-processing` module
  - `TaskReport`
  - `KillTaskReport`
  - `IngestionStatsAndErrorsTaskReport`
  - `TaskContextReport`
  - `TaskReportFileWriter`
  - `SingleFileTaskReportFileWriter`
  - `TaskReportSerdeTest`
- Remove `MsqOverlordResourceTestClient` as it had only one method which is already present in `OverlordResourceTestClient` itself
Issue

While serializing a `Map` or even a `List` containing `TaskReport` objects, the `type` information is lost. Thus, the object cannot be deserialized back. This is a known issue with Jackson.
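
The underlying cause can be illustrated without Jackson, using only JDK reflection. Because of type erasure, a plain `LinkedHashMap<String, TaskReport>` instance keeps no runtime record of its value type, whereas a subclass declared with concrete type arguments (the shape of `ReportMap` in this PR) does. The sketch below is a rough illustration under that assumption: `Report` and `valueTypeOf` are hypothetical names, and the connection to how Jackson decides whether to write type information is my understanding rather than something this code verifies.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.LinkedHashMap;
import java.util.Map;

final class ReportMapTypeSketch
{
  // Hypothetical stand-in for the polymorphic TaskReport interface.
  interface Report
  {
  }

  // Same shape as the proposed class: the type arguments are fixed in the declaration.
  static class ReportMap extends LinkedHashMap<String, Report>
  {
  }

  // Returns the concrete value type recorded in the class declaration of the
  // given map, or null if only an unresolved type variable is available.
  static Type valueTypeOf(Map<?, ?> map)
  {
    Type superType = map.getClass().getGenericSuperclass();
    if (superType instanceof ParameterizedType) {
      Type[] typeArgs = ((ParameterizedType) superType).getActualTypeArguments();
      // typeArgs[0] is the key type, typeArgs[1] is the value type.
      Type valueType = typeArgs[1];
      return valueType instanceof Class ? valueType : null;
    }
    return null;
  }
}
```

For a plain `new LinkedHashMap<String, Report>()`, the generic superclass is `HashMap<K, V>` with unresolved type variables, so `valueTypeOf` returns `null`. For a `ReportMap` instance it returns the `Report` class, which is the kind of information a serializer needs in order to treat the values as a known polymorphic type.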

Existing solution in Druid

The serialization of a `Map<String, TaskReport>` was originally fixed in #12938. The way this has been tackled in the Druid code till now is:
- `SingleFileTaskReportFileWriter.writeReportToStream()` to write out each `TaskReport` object one by one
- `TaskReport` object

Proposed solution

Add a new `ReportMap` class.

Changes
- `TaskReport.ReportMap`
- Make `TaskReport.buildTaskReports()` return the new class
- Replace `Map<String, TaskReport>` with `TaskReport.ReportMap` to ensure that we always use this class for serialization of reports
- `AbstractBatchIndexTask.buildLiveIngestionStatsReport()` to reduce duplication and hard-coding of serializable field names.

Important classes
- `TaskReport`
- `AbstractBatchIndexTask`
- `SingleFileTaskReportFileWriter`
- `ParallelIndexSupervisorTask`
- `SinglePhaseSubTask`
- `IndexTask`

Rolling upgrade concerns
None