Job reporting
Easy Batch records several metrics during batch processing and provides a complete report at the end of execution. This report is an instance of the JobReport class and contains the following information:
- The job start and end times
- The job status
- The number of read records
- The number of written records
- The number of filtered records
- The number of errors
You can contribute custom metrics by adding them to JobMetrics with the addMetric method.
To get access to JobMetrics, implement the JobListener interface: its afterJobEnd method receives the JobReport, whose getMetrics method returns the JobMetrics instance.
The following example is a listener that calculates the average record processing time and adds it to the job report as a custom metric:
public class RecordProcessingTimeCalculator implements PipelineListener, JobListener {

    private long startTime;                // start time of the record currently being processed
    private long nbRecords;                // number of records seen so far
    private long recordProcessingTimesSum; // cumulative processing time in ms

    @Override
    public Record beforeRecordProcessing(Record record) {
        nbRecords++;
        startTime = System.currentTimeMillis();
        return record;
    }

    @Override
    public void afterRecordProcessing(Record input, Record output) {
        recordProcessingTimesSum += System.currentTimeMillis() - startTime;
    }

    @Override
    public void onRecordProcessingException(Record record, Throwable throwable) {
        // Failed records still count towards the total processing time
        recordProcessingTimesSum += System.currentTimeMillis() - startTime;
    }

    @Override
    public void afterJobEnd(JobReport jobReport) {
        // Report 0 instead of NaN when no record has been processed
        double average = nbRecords == 0 ? 0.0 : (double) recordProcessingTimesSum / nbRecords;
        jobReport.getMetrics().addMetric("Record processing time average (in ms)", average);
    }
}
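To experiment with the averaging logic outside of a job, here is a minimal standalone sketch. It does not use any Easy Batch type: the class name ProcessingTimeAverager, its method names, and the injected clock are assumptions made for illustration, so that the calculation can be checked with fixed time values instead of System.currentTimeMillis().

```java
import java.util.function.LongSupplier;

// Hypothetical helper for illustration only; it is not part of Easy Batch.
final class ProcessingTimeAverager {

    private final LongSupplier clock; // returns the current time in milliseconds
    private long startTime;
    private long nbRecords;
    private long processingTimesSum;

    ProcessingTimeAverager(LongSupplier clock) {
        this.clock = clock;
    }

    // Call just before a record is processed
    void beforeRecord() {
        nbRecords++;
        startTime = clock.getAsLong();
    }

    // Call once the record has been processed, whether it succeeded or failed
    void afterRecord() {
        processingTimesSum += clock.getAsLong() - startTime;
    }

    // Average processing time in ms, or 0 when no record has been seen
    double averageMillis() {
        return nbRecords == 0 ? 0.0 : (double) processingTimesSum / nbRecords;
    }
}
```

In a real listener the clock would be System::currentTimeMillis, and the value returned by averageMillis() is what would be passed to addMetric.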
When you run multiple jobs to process a data source in parallel, each job generates a partial report for the data partition it has processed.
You may want to merge these partial reports into a consolidated one. This is the role of the JobReportMerger.
The merged report is defined as follows:
- The start time is the minimum of start times in partial reports
- The end time is the maximum of end times in partial reports
- The total read records is the sum of total read records in partial reports
- The total written records is the sum of total written records in partial reports
- The total filtered records is the sum of total filtered records in partial reports
- The total error records is the sum of total error records in partial reports
- The final status is COMPLETED if all partial reports are completed, or FAILED if at least one of them has failed.
- The final name is the concatenation of partial job names.
You can use the job report merger as follows:
JobReportMerger reportMerger = new DefaultJobReportMerger();
JobReport finalReport = reportMerger.mergeReports(report1, report2);
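The merge rules listed above can be sketched in plain Java as follows. This is only an illustration of the rules and not the code of DefaultJobReportMerger: the PartialReport and ReportMergeSketch classes, their fields, and the "|" separator used to concatenate job names are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a job report; it is NOT the Easy Batch JobReport class.
final class PartialReport {
    final String jobName;
    final long startTime;   // epoch milliseconds
    final long endTime;     // epoch milliseconds
    final long readCount;
    final long writeCount;
    final long filterCount;
    final long errorCount;
    final boolean failed;   // true for FAILED, false for COMPLETED

    PartialReport(String jobName, long startTime, long endTime, long readCount,
                  long writeCount, long filterCount, long errorCount, boolean failed) {
        this.jobName = jobName;
        this.startTime = startTime;
        this.endTime = endTime;
        this.readCount = readCount;
        this.writeCount = writeCount;
        this.filterCount = filterCount;
        this.errorCount = errorCount;
        this.failed = failed;
    }
}

final class ReportMergeSketch {

    static PartialReport merge(List<PartialReport> partials) {
        if (partials.isEmpty()) {
            throw new IllegalArgumentException("at least one partial report is required");
        }
        long start = Long.MAX_VALUE;
        long end = Long.MIN_VALUE;
        long read = 0, written = 0, filtered = 0, errors = 0;
        boolean failed = false;
        List<String> names = new ArrayList<>();
        for (PartialReport partial : partials) {
            start = Math.min(start, partial.startTime);  // minimum of start times
            end = Math.max(end, partial.endTime);        // maximum of end times
            read += partial.readCount;                   // counters are summed
            written += partial.writeCount;
            filtered += partial.filterCount;
            errors += partial.errorCount;
            failed |= partial.failed;                    // FAILED if any partial failed
            names.add(partial.jobName);
        }
        // The "|" separator is an assumption; the rules only say names are concatenated
        return new PartialReport(String.join("|", names), start, end,
                read, written, filtered, errors, failed);
    }
}
```

For example, merging a completed job that ran from 100 to 500 with a failed job that ran from 50 to 700 gives a merged report that starts at 50, ends at 700, and is marked as failed.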
Easy Batch is created by Mahmoud Ben Hassine with the help of some awesome contributors