Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Show model plot info for partitions that do not contain anomalies #53839

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,11 @@ declare interface Ml {

results: {
getMaxAnomalyScore: (jobIds: string[], earliestMs: number, latestMs: number) => Promise<any>;
fetchPartitionFieldsValues: (
criteriaFields: Array<{ fieldName: string; fieldValue: any }>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, I think job ID should be passed as a separate parameter, rather than including it in criteriaFields.

earliestMs: number,
latestMs: number
) => Observable<Record<string, string[]>>;
};

jobs: {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -75,4 +75,15 @@ export const results = {
},
});
},

fetchPartitionFieldsValues(criteriaFields, earliestMs, latestMs) {
return http$(`${basePath}/results/partition_fields_values`, {
method: 'POST',
body: {
criteriaFields,
earliestMs,
latestMs,
},
});
},
};
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ import {
getModelPlotOutput,
getRecordsForCriteria,
getScheduledEventsByBucket,
fetchPartitionFieldsValues,
} from './result_service_rx';
import {
getEventDistributionData,
Expand Down Expand Up @@ -42,6 +43,7 @@ export const mlResultsService = {
getEventDistributionData,
getModelPlotOutput,
getRecordMaxScoreByTime,
fetchPartitionFieldsValues,
};

type time = string;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -532,3 +532,11 @@ export function getScheduledEventsByBucket(
})
);
}

export function fetchPartitionFieldsValues(
criteriaFields: Array<{ fieldName: string; fieldValue: any }>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, for consistency with the other functions in here, I think job ID should be passed as a separate parameter, rather than including it in criteriaFields.

earliestMs: number,
latestMs: number
) {
return ml.results.fetchPartitionFieldsValues(criteriaFields, earliestMs, latestMs);
}
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
* React component for rendering Single Metric Viewer.
*/

import { chain, difference, each, find, first, get, has, isEqual, without } from 'lodash';
import { difference, each, find, first, get, has, isEqual, without } from 'lodash';
import moment from 'moment-timezone';
import { Subject, Subscription, forkJoin } from 'rxjs';
import { map, debounceTime, switchMap, tap, withLatestFrom } from 'rxjs/operators';
Expand Down Expand Up @@ -419,7 +419,11 @@ export class TimeSeriesExplorer extends React.Component {
);
};

loadEntityValues = (callback = () => {}) => {
/**
* Loads available entity values.
* @param callback
*/
loadEntityValues = async (callback = () => {}) => {
const { timefilter } = this.props;
const { detectorId, entities, selectedJob } = this.state;

Expand All @@ -428,49 +432,44 @@ export class TimeSeriesExplorer extends React.Component {
const bounds = timefilter.getActiveBounds();
const detectorIndex = +detectorId;

mlResultsService
.getRecordsForCriteria(
[selectedJob.job_id],
[{ fieldName: 'detector_index', fieldValue: detectorIndex }],
0,
const {
partition_field: partitionField,
over_field: overField,
by_field: byField,
} = await mlResultsService
.fetchPartitionFieldsValues(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above here needs amending, as it no longer no populates the dropdowns with the values for the top records.

[
{
fieldName: 'job_id',
fieldValue: selectedJob.job_id,
},
{
fieldName: 'detector_index',
fieldValue: detectorIndex,
},
],
bounds.min.valueOf(),
bounds.max.valueOf(),
ANOMALIES_TABLE_DEFAULT_QUERY_SIZE
bounds.max.valueOf()
)
.toPromise()
.then(resp => {
if (resp.records && resp.records.length > 0) {
const firstRec = resp.records[0];
.toPromise();

this.setState(
{
entities: entities.map(entity => {
const newEntity = { ...entity };
if (firstRec.partition_field_name === newEntity.fieldName) {
newEntity.fieldValues = chain(resp.records)
.pluck('partition_field_value')
.uniq()
.value();
}
if (firstRec.over_field_name === newEntity.fieldName) {
newEntity.fieldValues = chain(resp.records)
.pluck('over_field_value')
.uniq()
.value();
}
if (firstRec.by_field_name === newEntity.fieldName) {
newEntity.fieldValues = chain(resp.records)
.pluck('by_field_value')
.uniq()
.value();
}
return newEntity;
}),
},
callback
);
}
});
this.setState(
{
entities: entities.map(entity => {
if (partitionField?.name === entity.fieldName) {
entity.fieldValues = partitionField.values;
}
if (overField?.name === entity.fieldName) {
entity.fieldValues = overField.values;
}
if (byField?.name === entity.fieldName) {
entity.fieldValues = byField.values;
}
return entity;
}),
},
callback
);
};

loadForForecastId = forecastId => {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -402,11 +402,111 @@ export function resultsServiceProvider(callWithRequest) {
return definition;
}

/**
* Gets the record of partition filed - values pairs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: small typo field - value

* @param criteriaFields
* @param earliestMs
* @param latestMs
* @returns {Promise<*>}
*/
async function getPartitionFieldsValues(criteriaFields, earliestMs, latestMs) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the structure of the query+aggs here.
For example partition_field_name will return all field names, and partition_field_values will return all values.
Is the query done so we can be sure there will be only one partition_field_name returned? Otherwise the values could refer to different names.
To be more on the safe side, the aggregations could be done nested so the values agg is nested below the names agg. That way we'd have a definitive relationship between names+values defined in the response data.

const resp = await callWithRequest('search', {
index: ML_RESULTS_INDEX_PATTERN,
size: 0,
body: {
query: {
bool: {
filter: [
...criteriaFields.map(({ fieldName, fieldValue }) => {
return {
term: {
[fieldName]: fieldValue,
},
};
}),
{
range: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As commented in the follow-up, should consider adding an extra term to the filter here, for result_type, depending on whether model plot has been enabled or not.

timestamp: {
gte: earliestMs,
lte: latestMs,
format: 'epoch_millis',
},
},
},
],
},
},
aggs: {
partition_field_name: {
terms: {
field: 'partition_field_name',
},
},
over_field_name: {
terms: {
field: 'over_field_name',
},
},
by_field_name: {
terms: {
field: 'by_field_name',
},
},
partition_field_values: {
terms: {
size: 100,
field: 'partition_field_value',
},
},
over_field_values: {
terms: {
size: 100,
field: 'over_field_value',
},
},
by_field_values: {
terms: {
size: 100,
field: 'by_field_value',
},
},
},
},
});

const {
partition_field_name: partitionFieldName,
over_field_name: overFieldName,
by_field_name: byFieldName,
partition_field_values: partitionFieldValues,
over_field_values: overFieldValues,
by_field_values: byFieldValues,
} = resp.aggregations;

return {
...getFieldObject(partitionFieldName, partitionFieldValues, 'partition_field'),
...getFieldObject(overFieldName, overFieldValues, 'over_field'),
...getFieldObject(byFieldName, byFieldValues, 'by_field'),
};
}

function getFieldObject(fieldName, values, field) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We should probably define this before it gets used as good practice.

return fieldName.buckets.length > 0
? {
[field]: {
name: fieldName.buckets[0].key,
values: values.buckets.map(({ key }) => key),
},
}
: {};
}

return {
getAnomaliesTableData,
getCategoryDefinition,
getCategoryExamples,
getLatestBucketTimestampByJob,
getMaxAnomalyScore,
getPartitionFieldsValues,
};
}
20 changes: 20 additions & 0 deletions x-pack/legacy/plugins/ml/server/routes/results_service.js
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,12 @@ function getMaxAnomalyScore(callWithRequest, payload) {
return rs.getMaxAnomalyScore(jobIds, earliestMs, latestMs);
}

function getPartitionFieldsValues(callWithRequest, payload) {
const rs = resultsServiceProvider(callWithRequest);
const { criteriaFields, earliestMs, latestMs } = payload;
return rs.getPartitionFieldsValues(criteriaFields, earliestMs, latestMs);
}

export function resultsServiceRoutes({ commonRouteConfig, elasticsearchPlugin, route }) {
route({
method: 'POST',
Expand Down Expand Up @@ -103,4 +109,18 @@ export function resultsServiceRoutes({ commonRouteConfig, elasticsearchPlugin, r
...commonRouteConfig,
},
});

route({
method: 'POST',
path: '/api/ml/results/partition_fields_values',
handler(request) {
const callWithRequest = callWithRequestFactory(elasticsearchPlugin, request);
return getPartitionFieldsValues(callWithRequest, request.payload).catch(resp =>
wrapError(resp)
);
},
config: {
...commonRouteConfig,
},
});
}