Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Multiple Datasource][Version Decoupling] Add data source engine type to data source saved object #7026

Merged
merged 3 commits into from
Jun 14, 2024

Conversation

ZilongX
Copy link
Collaborator

@ZilongX ZilongX commented Jun 13, 2024

Description

  • This one is to enrich the data source saved object by adding a new meta data dataSourceEngineType
  • This feature is required per discussion and feedback collected from RFC [RFC] Plugins Version Decoupling #5877.
  • In order to decouple the plugin versions with OS cluster versions, we will need to specify a supported data source version range for example >= 2.1.0, yet given we support both OpenSearch and ElasticSearch clusters, the version range could be overlapped since only the numeric part is recorded for now, so for an ES 7.10.2 data source the version would be recorded as 7.10.2, which would break the support range check as it would also satisfy the semver check of >=2.1.0. This extra piece of engine type information would get this issue resolved.

Issues Resolved

#7021

Screenshot

  • Newly created OS datasource with dataSourceEngineType as OpenSearch
    image

  • Newly created ES datasource with dataSourceEngineType as Elasticsearch
    image

  • Newly created OSS datasource with dataSourceEngineType as OpenSearch Serverless
    image

Testing the changes

All related test cases passed as

image

Changelog

  • fix: [MDS] Add data source engine type to data source saved object

Check List

  • All tests pass
    • yarn test:jest
    • yarn test:jest_integration
  • New functionality includes testing.
  • New functionality has been documented.
  • Update CHANGELOG.md
  • Commits are signed per the DCO using --signoff

… to data source saved object

Signed-off-by: Zilong Xia <zilongx@amazon.com>
Copy link
Contributor

❌ Changelog Entry Missing Hyphen

Changelog entries must begin with a hyphen (-).

Copy link
Contributor

❌ Invalid Prefix For Manual Changeset Creation

Invalid description prefix. Found "feat". Only "skip" entry option is permitted for manual commit of changeset files.

If you were trying to skip the changelog entry, please use the "skip" entry option in the ##Changelog section of your PR description.

1 similar comment
Copy link
Contributor

❌ Invalid Prefix For Manual Changeset Creation

Invalid description prefix. Found "feat". Only "skip" entry option is permitted for manual commit of changeset files.

If you were trying to skip the changelog entry, please use the "skip" entry option in the ##Changelog section of your PR description.

Copy link
Contributor

❌ Invalid Prefix For Manual Changeset Creation

Invalid description prefix. Found "fix". Only "skip" entry option is permitted for manual commit of changeset files.

If you were trying to skip the changelog entry, please use the "skip" entry option in the ##Changelog section of your PR description.

Copy link

codecov bot commented Jun 13, 2024

Codecov Report

Attention: Patch coverage is 71.42857% with 4 lines in your changes missing coverage. Please review.

Project coverage is 67.44%. Comparing base (a4aa682) to head (26c64e4).

Files Patch % Lines
.../server/routes/data_source_connection_validator.ts 60.00% 3 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7026      +/-   ##
==========================================
- Coverage   67.44%   67.44%   -0.01%     
==========================================
  Files        3444     3444              
  Lines       67865    67873       +8     
  Branches    11027    11029       +2     
==========================================
+ Hits        45772    45777       +5     
- Misses      19429    19431       +2     
- Partials     2664     2665       +1     
Flag Coverage Δ
Linux_1 33.07% <ø> (ø)
Linux_2 55.11% <ø> (ø)
Linux_3 45.22% <71.42%> (-0.01%) ⬇️
Linux_4 34.86% <ø> (ø)
Windows_1 33.10% <ø> (ø)
Windows_2 55.06% <ø> (ø)
Windows_3 45.24% <71.42%> (+<0.01%) ⬆️
Windows_4 34.86% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -52,3 +53,10 @@ export enum SigV4ServiceName {
}

export { DataSourceError } from './error';

export enum DataSourceEngineType {
OpenSearch = 'OpenSearch',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come not default?

If the data source then is undefined then it's assumed default. One thing I noticed related to the flint stuff passing around types with spaces causes some issues if the global url states.

vs using default which is the implied OpenSearch and is BWC if we assume default if no type explicitly defined.

I know in the repo it's a mixed casing with enums but not sure if using default will cause some issues lowercase so maybe

export enum DATA_SOURCE_ENGINE_TYPE {
  DEFAULT = 'default',
  SERVERLESS = 'serverless',
  LEGACY = 'legacy',
  NONE = 'none'
}

the serverless banking off that we assume it's OpenSearch. The legacy because internally we reference it as legacy internally [example]

and NONE because it's matches the pattern to other enums in the repo [example]

Copy link
Collaborator Author

@ZilongX ZilongX Jun 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is default needed?

This engine type is retrieved when the data source object is pulled together and assuming a default just makes no sense to the users where the default just mean OS cluster ? This info is not leveraged in global url states but maybe share the flint issues link if any so we can double check. (is flint even open sourced ?)

Per the legacy it's pretty vague where it could be meaning either Elasticsearch or Legacy OpenSearch (say when we have OpenSearch 3.0 out and OpenSearch 1.x retired which become Legacy OpenSearch). For supported data source types there are three now explicitly so given three exact type rather than giving general terms looks more accurate.

Copy link
Member

@kavilla kavilla Jun 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(is flint even open sourced ?)

but i dont think elasticsearch is either which is why internally codewise we refer to elasticsearch references as legacy

a default just makes no sense to the users where the default just mean OS cluster

do we display this info to the user in functional purposes? have we verified this with legal? previous requirements were no trademark displays from that legacy engine. so if we plan on displaying this info could be an issue. if not then i would consider that internally codewise since forking we reference elasticsearch as legacy. so it would be an antipattern

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this info is not displayed to user, would like to follow up on this legal verification mentioned here, do you have any previous ref or poc from legal ? have no problem to switch all reference of elasticsearch to legacy

@@ -10,6 +10,7 @@ export interface DataSourceAttributes extends SavedObjectAttributes {
description?: string;
endpoint: string;
dataSourceVersion?: string;
dataSourceEngineType?: DataSourceEngineType;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need a migration? i dont think so right

'2.4.0': flow(migrateDataSource), // 2.4.0 is the version that introduces the datasource

it's optional field. so think it doesn't need it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's why this field is marked optional yes

@@ -10,6 +10,7 @@ export interface DataSourceAttributes extends SavedObjectAttributes {
description?: string;
endpoint: string;
dataSourceVersion?: string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: prolly keeping in line with version which makes sense too.

but might be redundant to put dataSource in the field name since it's apart of the DataSource attributes. but i also understand the concept of data sources has different meanings even within this repo so might make sense to specify.

Copy link
Collaborator Author

@ZilongX ZilongX Jun 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah you got the point, the dataSource prefix was added intentionally just to distinguish with other data source terms

@kavilla
Copy link
Member

kavilla commented Jun 14, 2024

just an opinion on the enums: #7026 (comment)

but this is great thank you!

@ZilongX
Copy link
Collaborator Author

ZilongX commented Jun 14, 2024

just an opinion on the enums: #7026 (comment)

but this is great thank you!

Thanks @kavilla, all great feedbacks ~~~

Copy link
Contributor

@yujin-emma yujin-emma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, hope it can resolve the version filter well

@ZilongX
Copy link
Collaborator Author

ZilongX commented Jun 14, 2024

Great, hope it can resolve the version filter well

Thanks @yujin-emma , yeah first baby step ...

if (
body.version.distribution !== null &&
body.version.distribution !== undefined &&
body.version.distribution === 'opensearch'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:
body.version.distribution === 'opensearch' should be enough here.
But no hurts to add !== null and !== undefined explicitly.

Comment on lines 374 to 377
expect(result.body).toEqual({
dataSourceVersion: '2.11.0',
dataSourceEngineType: 'OpenSearch',
installedPlugins: ['opensearch-ml', 'opensearch-sql'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this is the only place where we are comparing result.body during testing? What if we are testing result.body at other places like functional, integration? It will break, right? We will need to include dataSourceEngineType everyehwhere.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes as dataSourceEngineType is marked as optional to ensure existing test won't fail

Comment on lines +62 to +65
if (
body.version.distribution !== null &&
body.version.distribution !== undefined &&
body.version.distribution === 'opensearch'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (
body.version.distribution !== null &&
body.version.distribution !== undefined &&
body.version.distribution === 'opensearch'
if (
body.version.distribution === 'opensearch'

This can be simplified

@ZilongX ZilongX merged commit 190dab0 into opensearch-project:main Jun 14, 2024
71 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jun 14, 2024
… to data source saved object (#7026)

* [Multiple Datasource][Version Decoupling] Add data source engine type to data source saved object

Signed-off-by: Zilong Xia <zilongx@amazon.com>

* Changeset file for PR #7026 created/updated

---------

Signed-off-by: Zilong Xia <zilongx@amazon.com>
Co-authored-by: opensearch-changeset-bot[bot] <154024398+opensearch-changeset-bot[bot]@users.noreply.github.com>
(cherry picked from commit 190dab0)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
ZilongX pushed a commit that referenced this pull request Jun 14, 2024
… to data source saved object (#7026) (#7033)

* [Multiple Datasource][Version Decoupling] Add data source engine type to data source saved object



* Changeset file for PR #7026 created/updated

---------



(cherry picked from commit 190dab0)

Signed-off-by: Zilong Xia <zilongx@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: opensearch-changeset-bot[bot] <154024398+opensearch-changeset-bot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants