Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move tokenizers to analysis common module #30538

Merged

Conversation

martijnvg
Copy link
Member

The following tokenizers were moved: classic, edge_ngram,
letter, lowercase, ngram, path_hierarchy, pattern, thai, uax_url_email and
whitespace.

The only tokenizer that I didn't move in this PR is keyword. This is because
normalizer infrastructure directly dependents on this tokenizer and should
be tackled in a separate PR. Also quite some tests use this tokenizer. I plan
to do this after this PR.

This PR is mainly mechanical and tests were either moved to analysis-common module
or changed to use the standard tokenizer or mock whitespace tokenizer.

Relates to #23658

martijnvg added 3 commits May 11, 2018 11:59
The following tokenizers were moved: classic, edge_ngram, keyword,
letter, lowercase, ngram, path_hierarchy, pattern, thai, uax_url_email and
whitespace.

Relates to elastic#23658
normalizers directly depend on it.This should be addressed on a
follow up change.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

// see #5120
// Note: moved from org.elasticsearch.search.query.SearchQueryIT as it relies on this module now.
public void testNGramCopyField() {
CreateIndexRequestBuilder builder = prepareCreate("test").setSettings(Settings.builder()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should move both of these to yaml tests. The second one feels like the kind of yaml tests that I was adding when I moved things if my memory serves me. It has been a while....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the first one feels fairly special around copy and ngram together? But I it can be a yaml one too without too much trouble.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will move the first one to a yaml test and you're right about the second test. It already exists :), so I will just remove that one.

…test) and

moved test from java integtest to yaml test and
also added yaml tests for provided tokenizers
@martijnvg martijnvg force-pushed the move_tokenizers_to_analysis_common_module branch from 707d15e to 3a3dec9 Compare May 11, 2018 16:23
@martijnvg martijnvg merged commit 7b95470 into elastic:master May 14, 2018
martijnvg added a commit that referenced this pull request May 14, 2018
The following tokenizers were moved: classic, edge_ngram,
letter, lowercase, ngram, path_hierarchy, pattern, thai, uax_url_email and
whitespace.

Left keyword tokenizer factory in server module, because
normalizers directly depend on it.This should be addressed on a
follow up change.

Relates to #23658
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request May 14, 2018
…them-all

* elastic/master:
  Add missing dependencies on testClasses (elastic#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (elastic#30531)
  Moved tokenizers to analysis common module (elastic#30538)
dnhatn added a commit that referenced this pull request May 14, 2018
* master:
  Default to one shard (#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  Moved tokenizers to analysis common module (#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  Deprecate not copy settings and explicitly disallow (#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  Delete temporary blobs before creating index file (#30528)
  Watcher: Remove TriggerEngine.getJobCount() (#30395)
  [ML] Fix wire BWC for JobUpdate (#30512)
  Use simpler write-once semantics for FS repository (#30435)
  Derive max composite buffers from max content len
  Use simpler write-once semantics for HDFS repository (#30439)
  SQL: Improve correctness of SYS COLUMNS & TYPES (#30418)
  Mute two tests in FlushIT with @AwaitsFix.
  Fix incorrect template name in test case
  Build: Remove legacy bwc files from xpack (#30485)
  Mute UnicastZenPingTests#testSimplePings with @AwaitsFix.
  Security: cleanup code in file stores (#30348)
  Security: fix TokenMetaData equals and hashcode (#30347)
  Mute two tests from SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.
  Mute SharedClusterSnapshotRestoreIT#testSnapshotSucceedsAfterSnapshotFailure with @AwaitsFix.
  SQL: Improve compatibility with MS query (#30516)
  SQL: Fix parsing of dates with milliseconds (#30419)
dnhatn added a commit that referenced this pull request May 14, 2018
* 6.x:
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Moved tokenizers to analysis common module (#30538)
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  [ML] Hide internal Job update options from the REST API (#30537)
  Deprecate not copy settings and explicitly disallow (#30404)
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  Bump Gradle heap to 1792m (#30484)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  Delete temporary blobs before creating index file (#30528)
  Watcher: Remove TriggerEngine.getJobCount() (#30395)
  Use simpler write-once semantics for FS repository (#30435)
  Use simpler write-once semantics for HDFS repository (#30439)
  SQL: Improve correctness of SYS COLUMNS & TYPES (#30418)
  Mute two tests in FlushIT with @AwaitsFix.
  Fix incorrect template name in test case
  Build: Remove legacy bwc files from xpack (#30485)
  Security: Simplify security index listeners (#30466)
  Mute SharedClusterSnapshotRestoreIT#testSnapshotSucceedsAfterSnapshotFailure with @AwaitsFix.
  Add proper longitude validation in geo_polygon_query (#30497)
  Mute UnicastZenPingTests#testSimplePings with @AwaitsFix.
  Security: cleanup code in file stores (#30348)
  Security: fix TokenMetaData equals and hashcode (#30347)
  Mute two tests from SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.
  Fix incorrect merged entry in changelog
  SQL: Improve compatibility with MS query (#30516)
  SQL: Fix parsing of dates with milliseconds (#30419)
martijnvg added a commit that referenced this pull request May 15, 2018
* es/ccr: (37 commits)
  Default to one shard (#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  Moved tokenizers to analysis common module (#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  Deprecate not copy settings and explicitly disallow (#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  ...
martijnvg added a commit that referenced this pull request May 15, 2018
* es/ccr: (37 commits)
  Default to one shard (#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  Moved tokenizers to analysis common module (#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  Deprecate not copy settings and explicitly disallow (#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  ...
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 15, 2018
* es/ccr: (37 commits)
  Default to one shard (elastic#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (elastic#30151)
  Docs: Update HighLevelRestClient migration docs (elastic#30544)
  Clients: Switch to new performRequest (elastic#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (elastic#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (elastic#30531)
  Moved tokenizers to analysis common module (elastic#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (elastic#30530)
  Deprecate not copy settings and explicitly disallow (elastic#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (elastic#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (elastic#30535)
  SQL: Use request flavored methods in tests (elastic#30345)
  Suppress hdfsFixture if there are spaces in the path (elastic#30302)
  ...
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 15, 2018
* es/ccr: (37 commits)
  Default to one shard (elastic#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (elastic#30151)
  Docs: Update HighLevelRestClient migration docs (elastic#30544)
  Clients: Switch to new performRequest (elastic#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (elastic#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (elastic#30531)
  Moved tokenizers to analysis common module (elastic#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (elastic#30530)
  Deprecate not copy settings and explicitly disallow (elastic#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (elastic#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (elastic#30535)
  SQL: Use request flavored methods in tests (elastic#30345)
  Suppress hdfsFixture if there are spaces in the path (elastic#30302)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants