Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MDS] Support Vega Visualizations #5975

Merged
merged 11 commits into from
Mar 8, 2024

Conversation

huyaboo
Copy link
Member

@huyaboo huyaboo commented Feb 28, 2024

Description

This PR adds Vega support for MDS. If data_source.enabled: true and a data_source_id field is provided in the Vega spec, then the Vega visualization will render with data from the data source (assuming the data source has the specified index).

Issues Resolved

closes #5927

Screenshot

Screen.Recording.2024-03-06.at.2.01.38.PM.mov

Testing the changes

Pre-requisites

  1. Enable MDS (setdata_source.enabled: true in opensearch_dashboards.yml)
  2. Add data source (in this example, it is vegaDataSource)
  3. Add the newly created data source name to the data_source_name field under the url body

Testing

  1. Go to Visualize -> Create visualization -> Vega
  2. Copy-paste existing Vega visualization from the data source (or create a new one)
  3. Add a field data_source_id to the vega spec under url and set the value to the data source id.
  4. Vega should render, update when timefield is updated, and save.

Check List

  • All tests pass
    • yarn test:jest
    • yarn test:jest_integration
  • New functionality includes testing.
  • New functionality has been documented.
  • Update CHANGELOG.md
  • Commits are signed per the DCO using --signoff

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>
Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>
Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>
@@ -70,7 +71,9 @@ export class SearchAPI {
requestResponders[requestId].json(params.body);
}

return search({ params }, { abortSignal: this.abortSignal }).pipe(
const searchApiParams = dataSourceId ? { params, dataSourceId } : { params };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should check if data source plugin is enabled or not

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this API public? I mean without UI, can I make API call?

What is dataSourceId is empty and there is no OpenSearch running locally (or opensearch.hosts:[] in OSD YML?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are the four cases and expected behaviors:

  1. MDS is disabled and no datasource name is provided: visualization will fetch data from local cluster as usual
  2. MDS is disabled but a datasource name is provided: visualization will throw error that MDS is disabled
  3. MDS is enabled and no datasource name is provided: visualization will fetch data from local cluster as usual
  4. MDS is enabled and a datasource name is provided: visualization will (attempt) to fetch data from remote

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if - MDS is enabled and no datasource name is provided and opensearch.hosts:[]

return search({ params }, { abortSignal: this.abortSignal }).pipe(
const searchApiParams = dataSourceId ? { params, dataSourceId } : { params };

return search(searchApiParams, { abortSignal: this.abortSignal }).pipe(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we support imported visualization for vega?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Also I am curious about how do we support Sample Vega when we add sample data from remote data source?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for import/export

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import visualization will be limited in support.

If Vega visualization were to be imported from a non-MDS enabled cluster -> MDS enabled cluster, then if a datasource is specified, the associated datasource name can be added to each opensearch url body (which would require editing the Vega spec directly). This is what can happen when importing sample data.

However, if the visualization were to be imported from an MDS enabled cluster -> MDS enabled cluster, then there is no current method to map new datasource names to url bodies. This is because the import saved objects can only determine 1 datasource. Since the Vega spec supports multiple url queries to multiple datasources, the import would inevitably break.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huyaboo import and export can be addressed in separate issue, we can check the vega script inside the saved object to see if datasource name is specified, if not, then we can add the data source name, if yes, then we don't add but checks if the data source exists. Does that simplify it?

@@ -61,6 +61,7 @@ export class SearchAPI {
return combineLatest(
searchRequests.map((request) => {
const requestId = request.name;
const dataSourceId = request.data_source_id;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add test coverage? thanks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some test cases to parse the id from the data_source_name

@bandinib-amzn
Copy link
Member

This need not to be scope in this PR

Instead of adding a field data_source_id to the vega spec under url, can we show a screen to choose data source? cc: @kgcreative

Once we have selected datasource, we will have the data source id and we can inject that data source id into vega spec(under url) under the hood. I think this will improve user experience overall.

On top of my mind, flow will remain same as existing vizualization. Instead of showing index pattern, searches, we will show data sources.

Screenshot 2024-03-01 at 4 03 40 PM

CHANGELOG.md Outdated
@@ -70,6 +70,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- [Multiple Datasource] Concatenate data source name with index pattern name and change delimiter to double colon ([#5907](https://github.com/opensearch-project/OpenSearch-Dashboards/pull/5907))
- [Multiple Datasource] Refactor client and legacy client to use authentication registry ([#5881](https://github.com/opensearch-project/OpenSearch-Dashboards/pull/5881))
- [Multiple Datasource] Improved error handling for the search API when a null value is passed for the dataSourceId ([#5882](https://github.com/opensearch-project/OpenSearch-Dashboards/pull/5882))
- [Multiple Datasource] Add Vega support to MDS by specifying a data source id in the Vega spec ([#5975](https://github.com/opensearch-project/OpenSearch-Dashboards/pull/5975))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move this change log to unreleased?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to unreleased

@kgcreative
Copy link
Member

I think the challenge here is Vega allows for customers to select both index patterns, but also raw indexes

@seraphjiang
Copy link
Member

vega is for advanced user who could write scripting.

I'm leaning for a simple user experience in the initial release if possible.

In the future we might considering some auto completion or code intelligence feature, or integrate with AI assistant based on requirements

BTW
@bandinib-amzn @Flyingliuhub @ZilongX @BionIT appreciate if you could help to review the current implementation

@seraphjiang
Copy link
Member

@BionIT @yujin-emma

Do we need to change saved object export code in order to support vega in import/export scenarios for MDS. If so, could we create a follow up issue to track?

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>
Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>
Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>
BionIT
BionIT previously approved these changes Mar 8, 2024
@BionIT BionIT added the multiple datasource multiple datasource project label Mar 8, 2024
bandinib-amzn
bandinib-amzn previously approved these changes Mar 8, 2024
Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>
@huyaboo huyaboo dismissed stale reviews from bandinib-amzn and BionIT via fb8fe84 March 8, 2024 04:04
BionIT
BionIT previously approved these changes Mar 8, 2024
bandinib-amzn
bandinib-amzn previously approved these changes Mar 8, 2024
Comment on lines +106 to +117
const dataSources = await this.dataSourceFindQuery(dataSourceName);

// In the case that data_source_name is a prefix of another name, match exact data_source_name
const possibleDataSourceIds = dataSources.savedObjects.filter(
(obj) => obj.attributes.title === dataSourceName
);

if (possibleDataSourceIds.length !== 1) {
throw new Error(
`Expected exactly 1 result for data_source_name "${dataSourceName}" but got ${possibleDataSourceIds.length} results`
);
}
Copy link
Member

@xinruiba xinruiba Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is to check no duplicate datasource names between all datasources.

But we set the paging size 10 here: https://github.com/opensearch-project/OpenSearch-Dashboards/pull/5975/files#diff-f1c265abe9e3b3c11a64d4110cf3e6a010dfe369cf6eed9a20ea532763489fbdR125

In that case, will the duplication check only applies to first 10 datasources?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are two cases where possibleDataSourceIds before filtering is > 1:

  1. The data_source_name is a duplicate of another (which means querying cannot be done)
  2. The name is a prefix of other data_source_names. For example, if the name were Vega Data Source and the other were Vega Data Source Copy, then the find query for Vega Data Source will return both these data source names. The filtering is done to ensure that Vega Data Source is chosen and not Vega Data Source Copy.

The paging limit of 10 helps keep the number of possible datasources low since this is client-side.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for the initial paging to be free of duplicates, but duplication happens when applied across all data sources?

Copy link
Member

@xinruiba xinruiba Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is not a blocker of this PR.
I don't think we are able to create datasource with duplication name, so this logic should be fine.

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>
@huyaboo huyaboo dismissed stale reviews from bandinib-amzn and BionIT via f9747f3 March 8, 2024 20:09
@bandinib-amzn bandinib-amzn merged commit 1c5ad6c into opensearch-project:main Mar 8, 2024
68 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Mar 19, 2024
* Add MDS support for Vega

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Refactor field to data_source_id

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Add to CHANGELOG.md

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Added test cases and renamed field to use data_source_name

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Add prefix datasource name test case and add example in default hjson

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Move CHANGELOG to appropriate section

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Increased test coverage of search() method

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

---------

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>
(cherry picked from commit 1c5ad6c)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

# Conflicts:
#	CHANGELOG.md
@huyaboo huyaboo mentioned this pull request Mar 19, 2024
7 tasks
BionIT pushed a commit that referenced this pull request Mar 19, 2024
* Add MDS support for Vega

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Refactor field to data_source_id

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Add to CHANGELOG.md

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Added test cases and renamed field to use data_source_name

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Add prefix datasource name test case and add example in default hjson

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Move CHANGELOG to appropriate section

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Increased test coverage of search() method

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

---------

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>
(cherry picked from commit 1c5ad6c)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

# Conflicts:
#	CHANGELOG.md

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
opensearch-trigger-bot bot pushed a commit that referenced this pull request Mar 20, 2024
* Add MDS support for Vega

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Refactor field to data_source_id

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Add to CHANGELOG.md

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Added test cases and renamed field to use data_source_name

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Add prefix datasource name test case and add example in default hjson

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Move CHANGELOG to appropriate section

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

* Increased test coverage of search() method

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>

---------

Signed-off-by: Huy Nguyen <73027756+huyaboo@users.noreply.github.com>
(cherry picked from commit 1c5ad6c)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

# Conflicts:
#	CHANGELOG.md

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
(cherry picked from commit 05ede47)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
BionIT pushed a commit that referenced this pull request Mar 20, 2024
* Add MDS support for Vega



* Refactor field to data_source_id



* Add to CHANGELOG.md



* Added test cases and renamed field to use data_source_name



* Add prefix datasource name test case and add example in default hjson



* Move CHANGELOG to appropriate section



* Increased test coverage of search() method



---------


(cherry picked from commit 1c5ad6c)


# Conflicts:
#	CHANGELOG.md


(cherry picked from commit 05ede47)

Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@huyaboo huyaboo deleted the vega-mds-support branch April 18, 2024 00:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFC] Vega support with MDS
7 participants