[ML] Removing old per-partition normalization code #32816

edsavage · 2018-08-13T16:46:39Z

Per-partition normalization is an old, undocumented feature that was
never used by clients. It has been superseded by per-partition maximum
scoring (see #32748).

This PR removes the now redundant code.

A PR containing the corresponding changes to the ml-cpp code will follow.

elasticmachine · 2018-08-13T16:46:41Z

Pinging @elastic/ml-core

davidkyle

LGTM

dimitris-athanasiou

This looks good, but we need to address the BWC issues I raised. Also, to avoid breaking the build, we need to follow this process:

1. Add the version checks against 7 on master to get a green CI.
2. Backport with the version changed to 6.5, and also disable the BWC tests.
3. Once we have successful builds, change the version check to 6.5 on master and re-enable the BWC tests.

dimitris-athanasiou · 2018-08-14T10:48:04Z

x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/job/config/AnalysisConfig.java

@@ -164,8 +160,6 @@ public AnalysisConfig(StreamInput in) throws IOException {
                }
            }
        }
-
-        usePerPartitionNormalization = in.readBoolean();


Here we need to check that if we are reading from an older node we consume the boolean (although we do nothing with it).

dimitris-athanasiou · 2018-08-14T10:48:21Z

x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/job/config/AnalysisConfig.java

@@ -194,8 +188,6 @@ public void writeTo(StreamOutput out) throws IOException {
        if (out.getVersion().before(Version.V_6_5_0)) {
            out.writeBoolean(false);
        }
-
-        out.writeBoolean(usePerPartitionNormalization);


And here we need to check that if we are writing to an older node we write a false.
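For illustration, the two checks requested above can be sketched with plain java.io streams. This is a minimal sketch, not the code in AnalysisConfig: the class name, the integer version ids, and the single string field are hypothetical stand-ins for the real StreamInput/StreamOutput and Version types.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical config object whose wire format dropped a boolean flag in 6.5. */
final class RemovedFlagCompat {
    // Hypothetical numeric version ids standing in for the real Version constants.
    static final int V_6_4_0 = 60400;
    static final int V_6_5_0 = 60500;

    /** What a node that still has the flag puts on the wire. */
    static byte[] writeOld(String field, boolean flag) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(field);
            out.writeBoolean(flag);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Writer after the removal: older peers still get a placeholder false. */
    static byte[] write(String field, int peerVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(field);
            if (peerVersion < V_6_5_0) {
                out.writeBoolean(false);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reader after the removal: consume the flag from older peers and ignore it. */
    static String read(byte[] data, int peerVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String field = in.readUTF();
            if (peerVersion < V_6_5_0) {
                in.readBoolean(); // value is discarded on purpose
            }
            if (in.available() != 0) {
                throw new IllegalStateException("unread bytes left on the stream");
            }
            return field;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Without the read-side check, the flag byte sent by an older node would be left on the stream and misread as the start of whatever is deserialized next.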

dimitris-athanasiou · 2018-08-14T10:49:46Z

x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/job/results/Bucket.java

@@ -143,7 +137,6 @@ public Bucket(StreamInput in) throws IOException {
        if (in.getVersion().before(Version.V_5_5_0)) {
            in.readGenericValue();
        }
-        partitionScores = in.readList(PartitionScore::new);


I believe we can get away without doing anything for BWC for the buckets because they are not being transferred between nodes. But I would like @droberts195 to confirm as well.

droberts195

I think we do need to consider BWC for these lists. If you look at the implementation of readList() and writeList(), they start by reading/writing the list length. So we need to write an empty list to versions before 6.5, and read a list of something. We can replace PartitionScore::new with a function in Bucket that reads the same stuff that PartitionScore::new read but just discards it.

dimitris-athanasiou

That is true when there is a transport client, which I didn't think of in the first place. So, yes, we'll need to do the trick of reading the scores. There is another place where I'm doing this: https://github.com/elastic/elasticsearch/blob/6.x/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/job/config/Detector.java#L253. You can take a look and follow a similar approach. Note we only need that code in the 6.x branch.

edsavage

Thanks @dimitris-athanasiou! That all makes sense.
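The read-and-discard approach agreed in this thread can be sketched as follows. This is a minimal sketch using plain java.io streams, not the code in Bucket: the class name, the version ids, the element layout (one string plus one double), and the trailing eventCount field are all hypothetical, and the real stream classes use their own length encoding rather than a fixed 4-byte int.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical result object whose wire format dropped a list field in 6.5. */
final class RemovedListCompat {
    // Hypothetical numeric version ids standing in for the real Version constants.
    static final int V_6_4_0 = 60400;
    static final int V_6_5_0 = 60500;

    /** What an older node sends: a length prefix, then each element, then later fields. */
    static byte[] writeOld(String bucketId, String[] partitions, double[] scores, long eventCount) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(bucketId);
            out.writeInt(partitions.length);
            for (int i = 0; i < partitions.length; i++) {
                out.writeUTF(partitions[i]);
                out.writeDouble(scores[i]);
            }
            out.writeLong(eventCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Writer after the removal: older peers get an empty list, newer peers get nothing. */
    static byte[] write(String bucketId, long eventCount, int peerVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(bucketId);
            if (peerVersion < V_6_5_0) {
                out.writeInt(0); // length prefix of an empty list
            }
            out.writeLong(eventCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reader after the removal: skip every element so later fields stay aligned. */
    static long readEventCount(byte[] data, int peerVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readUTF(); // bucketId, not needed for this sketch
            if (peerVersion < V_6_5_0) {
                int count = in.readInt();
                for (int i = 0; i < count; i++) {
                    discardElement(in);
                }
            }
            return in.readLong();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads exactly the bytes the removed element type used to read, keeping none of them. */
    private static void discardElement(DataInputStream in) throws IOException {
        in.readUTF();
        in.readDouble();
    }
}
```

The trailing field is there to show why the elements must be consumed rather than ignored: if the reader skipped the loop, readLong() would start in the middle of the first element.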

To maintain communication compatibility with nodes prior to 6.5 it is necessary to maintain/cope with the old wire format

dimitris-athanasiou · 2018-08-14T14:59:36Z

x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/job/results/Bucket.java

@@ -167,6 +184,10 @@ public void writeTo(StreamOutput out) throws IOException {
        if (out.getVersion().before(Version.V_5_5_0)) {
            out.writeGenericValue(Collections.emptyMap());
        }
+        // bwc for perPartitionNormalization
+        if (out.getVersion().before(Version.V_6_5_0)) {
+            out.writeGenericValue(Collections.emptyList());


I was expecting this to be out.writeList(Collections.emptyList()). Did you try that out?

edsavage

I'll make the change now. You're right, writeList is the better option here.
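To illustrate why the two calls are not interchangeable on the wire, here is a minimal sketch. It assumes that a generic value is prefixed with a type marker while a list written with writeList() starts directly with its length. The class name and tag value are hypothetical, and plain java.io streams with fixed-width ints stand in for the real stream classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Shows what a list reader sees when the writer used a tagged "generic" encoding. */
final class ListEncodingMismatch {
    // Hypothetical type marker for "this generic value is a list".
    static final byte LIST_TAG = 7;

    /** List-style encoding of an empty list: just the length. */
    static byte[] encodeAsList() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Generic-style encoding of an empty list: type marker first, then the length. */
    static byte[] encodeAsGeneric() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(LIST_TAG);
            out.writeInt(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** A list reader takes the first int on the stream as the element count. */
    static int lengthSeenByListReader(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Under these assumptions, a peer that reads with the list-style reader interprets the type marker as part of the length and then tries to read elements that were never written.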

dimitris-athanasiou · 2018-08-14T15:01:40Z

Also, I just realised we should remove the partition scores from the Bucket class in the high-level REST client.

dimitris-athanasiou

LGTM

[ML] Removing old per-partition normalization code

b66f9e7

Per-partition normalization is an old, undocumented feature that was never used by clients. It has been superseded by per-partition maximum scoring.

edsavage added >non-issue review v7.0.0 :ml Machine learning v6.5.0 labels Aug 13, 2018

Removed unused import

6b0c35a

davidkyle approved these changes Aug 14, 2018

edsavage added 2 commits August 14, 2018 09:57

Merge branch 'master' into remove_per_partition_normalization

4ea36a6

Removed unused imports from test cases

66e05ae

dimitris-athanasiou suggested changes Aug 14, 2018

edsavage added 2 commits August 14, 2018 14:02

Added version checks for BWC

bcbfbc2

BWC version checks for Bucket to/from wire format

b260c8e

To maintain communication compatibility with nodes prior to 6.5 it is necessary to maintain/cope with the old wire format

dimitris-athanasiou reviewed Aug 14, 2018

Attending to code review comments

80b4f63

dimitris-athanasiou approved these changes Aug 15, 2018

edsavage merged commit 8ce1ab3 into elastic:master Aug 15, 2018

edsavage added a commit that referenced this pull request Aug 15, 2018

[ML] Temporarily disabling rolling-upgrade tests

51cece1

BWC tests disabled while backporting #32816

edsavage added a commit that referenced this pull request Aug 16, 2018

[ML] BWC changes for backport of PR #32816

c08dd01

edsavage mentioned this pull request Aug 16, 2018

[ML] Remove old per-partition normalization code elastic/ml-cpp#184

Merged

edsavage added a commit that referenced this pull request Aug 16, 2018

Temporarily disabled ML BWC tests for backporting

d604b3e

#32816

edsavage added a commit to edsavage/elasticsearch that referenced this pull request Aug 16, 2018

[ML] Re-enabling BWC tests

cc6a256

Re-enable BWC tests for ML now that elastic#32816 has been backported to 6.x

edsavage mentioned this pull request Aug 16, 2018

Re enable ml bwc tests #32916

Merged

edsavage added a commit that referenced this pull request Aug 16, 2018

Re enable ml bwc tests (#32916)

62559d2

[ML] Re-enabling BWC tests Re-enable BWC tests for ML now that #32816 has been backported to 6.x

jimczi pushed a commit that referenced this pull request Aug 17, 2018

Temporarily disabled ML BWC tests for backporting

28b5ce5

#32816

jimczi pushed a commit that referenced this pull request Aug 17, 2018

Re enable ml bwc tests (#32916)

b5ae8ff

[ML] Re-enabling BWC tests Re-enable BWC tests for ML now that #32816 has been backported to 6.x

edsavage mentioned this pull request Aug 20, 2018

[ML] Remove old per-partition normalization code (#184) elastic/ml-cpp#186

Merged

edsavage deleted the remove_per_partition_normalization branch August 20, 2018 12:13

jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
