Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HLRC: ML Flush job #33187

Merged
merged 7 commits into from
Sep 1, 2018
Merged

Conversation

benwtrent
Copy link
Member

This relates to (#29827)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

Copy link
Contributor

@dimitris-athanasiou dimitris-athanasiou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Left some minor comments and an important one in the documentation.

String endpoint = new EndpointBuilder()
.addPathPartAsIs("_xpack")
.addPathPartAsIs("ml")
.addPathPartAsIs("anomaly_detectors").addPathPart(flushJobRequest.getJobId())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: put the second in a new line (it'll make it more readable)

* Specifies to advance to a particular time value.
* Results are generated and the model is updated for data from the specified time interval.
*
* @param advanceTime String representation of an EPOCH timestamp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this may be an epoch seconds, epoch millis or an ISO string. We're actually already inconsistent on the docs wherever we can set a date. We'll have to make them consistent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what do you suggest? Just saying String representation of a timestamp or pushing users into a specific direction?

* Specifies to skip to a particular time value.
* Results are not generated and the model is not updated for data from the specified time interval.
*
* @param skipTime String representation of an EPOCH timestamp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

boolean flushed = (boolean) a[0];

Date date = null;
if (a[1] != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could use ternary to simplify

}

/**
* provides the timestamp (in milliseconds-since-the-epoch) of the end of the last bucket that was processed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: capitalize first letter

//tag::x-pack-ml-flush-job-request
FlushJobRequest flushJobRequest = new FlushJobRequest("flushing-my-first-machine-learning-job"); //<1>
flushJobRequest.setCalcInterim(true); //<2>
flushJobRequest.setAdvanceTime("1400"); //<3>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather have ISO timestamps in the example as it's what I'd prefer users to use. Same for start and end.


@Override
protected boolean supportsUnknownFields() {
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The request should not be supporting unknown fields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed many of the request and response objects support unknown fields I thought this was by design. For the requests the only time they are parsed is in the unit tests so that seems redundant. Responses are parsed by the client and if the client is not the same version as the server is may see unknown fields in which case we should error as nothing is learned by silently ignoring the field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to leniently parse responses to enable compatibility of the HLRC client with multiple versions. Imagine you use rest directly and we add some new fields in a response that your script is simply ignoring. It is the same. On the request objects, on the other hand, we don't need to be lenient. In fact, for those it doesn't matter as they only have a parser for testing but there is no point doing the extra work of randomizing fields in the tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the requests the only time they are parsed is in the unit tests so that seems redundant

True

Responses are parsed by the client and if the client is not the same version as the server is may see unknown fields in which case we should error

A 7.0 client needs to be compatible with 7.6, and between 7.0 and 7.6 we may have added extra fields to the responses. So I think the HLRC should ignore unknown fields in responses. Obviously by using a 7.0 client with 7.6 you lose out on any new functionality, but we should not have broken backwards compatibility so missing out on the new fields shouldn't break anything that worked in 7.0. #11184 is related to this, although I think any changes will be too late to affect what we do in these classes in the short term.

--------------------------------------------------
include-tagged::{doc-tests}/MlClientDocumentationIT.java[x-pack-ml-flush-job-request]
--------------------------------------------------
<1> Constructing a new request referencing an existing `jobId`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These need to be in an optional arguments section. See get-buckets.asciidoc as a similar example.

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left a comment about a comment

@@ -288,6 +290,58 @@ public void getBucketsAsync(GetBucketsRequest request, RequestOptions options, A
Collections.emptySet());
}

/**
* Flushes a given Machine Learning Job
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are missing quick overview how about changing this to: 'Flushes internally buffered data for the given Machine Learning Job ensuring all data sent to the has been processed. This may cause new results to be calculated depending on the contents of the buffer'

@benwtrent
Copy link
Member Author

Jenkins retest this please

@benwtrent benwtrent merged commit 6770a45 into elastic:master Sep 1, 2018
@benwtrent benwtrent deleted the feature/hlrc-ml-flush-job branch September 1, 2018 21:01
benwtrent added a commit that referenced this pull request Sep 1, 2018
* HLRC: ML Flush job

* Fixing package, paths, and test

* Addressing comments
dnhatn added a commit that referenced this pull request Sep 2, 2018
* master:
  HLRC: ML Flush job (#33187)
  HLRC: Adding ML Job stats (#33183)
  LLREST: Drop deprecated methods (#33223)
  Mute testSyncerOnClosingShard
  [DOCS] Moves machine learning APIs to docs folder (#31118)
dnhatn added a commit that referenced this pull request Sep 2, 2018
* 6.x:
  HLRC: ML Flush job (#33187)
  Switch more LLREST usage to new style Requests (#33171)
  HLRC: Adding ML Job stats (#33183)
  HLREST: add reindex API (#32679)
  Mute testSyncerOnClosingShard
  [DOCS] Moves machine learning APIs to docs folder (#31118)
jasontedor added a commit to martijnvg/elasticsearch that referenced this pull request Sep 2, 2018
* master: (64 commits)
  HLREST: add update by query API (elastic#32760)
  TEST: Increase timeout testFollowIndexAndCloseNode (elastic#33333)
  HLRC: ML Flush job (elastic#33187)
  HLRC: Adding ML Job stats (elastic#33183)
  LLREST: Drop deprecated methods (elastic#33223)
  Mute testSyncerOnClosingShard
  [DOCS] Moves machine learning APIs to docs folder (elastic#31118)
  Mute test watcher usage stats output
  [Rollup] Fix FullClusterRestart test
  Adjust soft-deletes version after backport into 6.5
  completely drop `index.shard.check_on_startup: fix` for 7.0 (elastic#33194)
  Fix AwaitsFix issue number
  Mute SmokeTestWatcherWithSecurityIT testsi
  drop `index.shard.check_on_startup: fix` (elastic#32279)
  tracked at
  [DOCS] Moves ml folder from x-pack/docs to docs (elastic#33248)
  [DOCS] Move rollup APIs to docs (elastic#31450)
  [DOCS] Rename X-Pack Commands section (elastic#33005)
  TEST: Disable soft-deletes in ParentChildTestCase
  Fixes SecurityIntegTestCase so it always adds at least one alias (elastic#33296)
  ...
jasontedor added a commit to debadair/elasticsearch that referenced this pull request Sep 2, 2018
* master: (283 commits)
  HLREST: add update by query API (elastic#32760)
  TEST: Increase timeout testFollowIndexAndCloseNode (elastic#33333)
  HLRC: ML Flush job (elastic#33187)
  HLRC: Adding ML Job stats (elastic#33183)
  LLREST: Drop deprecated methods (elastic#33223)
  Mute testSyncerOnClosingShard
  [DOCS] Moves machine learning APIs to docs folder (elastic#31118)
  Mute test watcher usage stats output
  [Rollup] Fix FullClusterRestart test
  Adjust soft-deletes version after backport into 6.5
  completely drop `index.shard.check_on_startup: fix` for 7.0 (elastic#33194)
  Fix AwaitsFix issue number
  Mute SmokeTestWatcherWithSecurityIT testsi
  drop `index.shard.check_on_startup: fix` (elastic#32279)
  tracked at
  [DOCS] Moves ml folder from x-pack/docs to docs (elastic#33248)
  [DOCS] Move rollup APIs to docs (elastic#31450)
  [DOCS] Rename X-Pack Commands section (elastic#33005)
  TEST: Disable soft-deletes in ParentChildTestCase
  Fixes SecurityIntegTestCase so it always adds at least one alias (elastic#33296)
  ...
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Sep 3, 2018
* master: (197 commits)
  Prevent NPE parsing the stop datafeed request. (elastic#33347)
  HLRC: Add ML get overall buckets API (elastic#33297)
  Core: Fix epoch millis java time formatter (elastic#33302)
  [Docs] Improve tuning for speed advice (elastic#33315)
  [Rollup] Fix Caps Comparator to handle calendar/fixed time (elastic#33336)
  [CI] Mute  IndexShardTests#testIndexCheckOnStartup fails elastic#33345
  [CI] Mute LuceneChangesSnapshotTests#testUpdateAndReadChangesConcurrently
  Security for _field_names field should not override field statistics (elastic#33261)
  Add early termination support to BucketCollector (elastic#33279)
  Fix extractjar task  ci  (elastic#33272)
  Mute testFollowIndexAndCloseNode
  Logging: Drop Settings from some logging ctors (elastic#33332)
  HLREST: add update by query API (elastic#32760)
  TEST: Increase timeout testFollowIndexAndCloseNode (elastic#33333)
  HLRC: ML Flush job (elastic#33187)
  HLRC: Adding ML Job stats (elastic#33183)
  LLREST: Drop deprecated methods (elastic#33223)
  Mute testSyncerOnClosingShard
  [DOCS] Moves machine learning APIs to docs folder (elastic#31118)
  Mute test watcher usage stats output
  ...
@benwtrent benwtrent removed the :ml Machine learning label Oct 25, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants