Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Projected Views #11957

Merged
merged 63 commits into from
Feb 20, 2024
Merged

Projected Views #11957

merged 63 commits into from
Feb 20, 2024

Conversation

peternied
Copy link
Member

@peternied peternied commented Jan 19, 2024

Description

A new kind of resource that abstracts indices from searches. A view define a projection of targeted indices. I'm trying to start out with as barebones of an implementation to get design feedback and find a good matching vertical scenario that could be added to dashboards.

Related Issues

High Level Design

Views are stored in cluster metadata, standard CRUD operations. A view can be used for _search which validates and transforms the request onto the 'physical' structure of the data like indices. See more in the views-design.md [link] included in this PR.

sequenceDiagram
    participant Client
    participant HTTP_Request as ActionHandler
    participant Cluster_Metadata as Cluster Metadata Store
    participant Data_Store as Indices

    Client->>HTTP_Request: View List/Get/Update/Create/Delete<BR>/views or /views/{view_id}
    HTTP_Request->>Cluster_Metadata: Query Views
    alt Update/Create/Delete
        Cluster_Metadata->>Cluster_Metadata: Refresh Cluster
    end
    Cluster_Metadata-->>HTTP_Request: Return
    HTTP_Request-->>Client: Return

    Client->>HTTP_Request: Search View<br>/views/{view_id}/search
    HTTP_Request->>Cluster_Metadata: Query Views
    Cluster_Metadata-->>HTTP_Request: Return
    HTTP_Request->>HTTP_Request: Rewrite Search Request
    HTTP_Request->>HTTP_Request: Validate Search Request
    HTTP_Request->>Data_Store: Search indices
    Data_Store-->>HTTP_Request: Return
    HTTP_Request-->>Client: Return
Loading

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Copy link
Contributor

github-actions bot commented Jan 19, 2024

Compatibility status:

Checks if related components are compatible with change 2ac6e68

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/performance-analyzer.git]

Copy link
Contributor

❌ Gradle check result for 61e8c98: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
@peternied
Copy link
Member Author

@msfroh I'm going to keep this in draft until that checklist in the description has been completed, but this PR would be in a good place for feedback around the metadata changes / CreatViewAction / ViewService. Feedback would be greatly appreciated - thanks!

Copy link
Contributor

github-actions bot commented Feb 1, 2024

❌ Gradle check result for 2f615cf: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Peter Nied <petern@amazon.com>
@peternied
Copy link
Member Author

@cwperks @DarshitChanpura could you take another look? I've snapped to the patterns associated with the OpenSearch repo and created integration tests. While I still need to add some additional transport actions and associated tests (Get/Deleted/Update/Enumerate) they will follow the same conventions established with CreateViewAction and the ViewService.

Copy link
Contributor

github-actions bot commented Feb 3, 2024

❌ Gradle check result for d45fcb9: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Peter Nied <peternied@hotmail.com>
Copy link
Member

@cwperks cwperks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peternied Took a first pass and added some high-level comments.

Copy link
Contributor

github-actions bot commented Feb 5, 2024

❌ Gradle check result for 2832b24: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Feb 5, 2024

❌ Gradle check result for 12296ce: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@jainankitk
Copy link
Collaborator

jainankitk commented Feb 16, 2024

@peternied - Can you share more details on how we are planning to use the views differently than aliases / index-pattern before we make View first class citizen of Opensearch?

Signed-off-by: Peter Nied <peternied@hotmail.com>
Copy link
Contributor

❕ Gradle check result for 2ac6e68: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.search.SearchWeightedRoutingIT.testShardRoutingWithNetworkDisruption_FailOpenEnabled
      1 org.opensearch.search.SearchWeightedRoutingIT.testMultiGetWithNetworkDisruption_FailOpenEnabled

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@peternied
Copy link
Member Author

@jainankitk Thanks for looking into this PR and the related RFC

before we make View first class citizen

I'm not sure views will become a first class citizen. This feature is experimental until we can get more solid scenarios behind it. I'm trying to organically build up use cases and clear value before shipping this as part of the product. Even this 'basic' implementation is 3k lines of code, I like to start here to make incremental process.

In this pull request #11957 (comment) also has more context around some of these thoughts

how we are [...] views different than aliases / index-pattern

Index patterns don't exist in the backend, they are a dashboards only concept. I think that views can become the representation of the Dashboards index pattern object which includes metadata for the field to sort time series data on and derived fields and transformations.

Alias are directly attached to concrete indexes. Views don't have those limitations. While there is some existing filtering logic into alias, I'd rather not muddle what an index is with a view - as both of these can live side by side associated with different use cases.

@peternied
Copy link
Member Author

@andrross @reta @msfroh any other items that you'd like to see resolved before merging? I think I've covered all outstanding feedback

@jainankitk
Copy link
Collaborator

This feature is experimental until we can get more solid scenarios behind it. I'm trying to organically build up use cases and clear value before shipping this as part of the product.

IMO, we should work backwards from the use cases instead of shipping experimental APIs and establish use cases and clear value later

Alias are directly attached to concrete indexes. Views don't have those limitations.

I don't understand this. Alias can point to multiple indices and index patterns. How are views different than that?

While there is some existing filtering logic into alias, I'd rather not muddle what an index is with a view - as both of these can live side by side associated with different use cases.

And, this is why we need clarity on strong use case Views are serving to not extend alias and introduce Views

Even this 'basic' implementation is 3k lines of code, I like to start here to make incremental process.

Please correct me if I am wrong, but most of the 'basic' implementation is standard template for CRUD APIs. So, I will not worry about the number of lines in the code! :)
I am big proponent of making incremental progress, but would really to see the value proposition clearly before investing resources in it.

@reta
Copy link
Collaborator

reta commented Feb 19, 2024

@andrross @reta @msfroh any other items that you'd like to see resolved before merging? I think I've covered all outstanding feedback

@peternied from implementation perspective, LGTM. As we discussed offline (and in comments here), there are clearly concerns regarding the compelling use cases of this feature and how the end result looks like. I would love to hear others' opinions on the subject, personally I feel like I am missing something to say definitive "yes" or "no" with respect to getting it in.

@peternied
Copy link
Member Author

Thanks for your thoughts @reta @jainankitk. I'm trying to keep this PR and the View construct as lean as possible - as it is expensive to get a feature built up in OpenSearch. If we can get a minimal version in core, it makes the next set of improvements faster.

I like the notion of iterative development - its fair if you think that this feature has too wide of a possibility space. I am stuck with business realities that I do not yet know what is going to justify the next phase of 'funding' for this project. Maybe it will be perf improvements, document level security, sandboxing request resources, search pipeline improvements, or something completely new to me.

I believe with this foundation unblocks further work that is largely outside of core [1]. I think keeping this work in main (v3.0.0) gives lots of headroom to improve or revert if it doesn't go anywhere.

@jainankitk
Copy link
Collaborator

Thanks for your thoughts @reta @jainankitk. I'm trying to keep this PR and the View construct as lean as possible - as it is expensive to get a feature built up in OpenSearch. If we can get a minimal version in core, it makes the next set of improvements faster.

I like the notion of iterative development - its fair if you think that this feature has too wide of a possibility space. I am stuck with business realities that I do not yet know what is going to justify the next phase of 'funding' for this project. Maybe it will be perf improvements, document level security, sandboxing request resources, search pipeline improvements, or something completely new to me.

While I am fine with getting clarity on all the use cases for new feature over time, but it is good to have atleast 1-2 accompanying use cases.

I am also okay to wait for funding on those 1-2 use cases before we start development (since feature is marked experimental until then), but there should be more clarity on how that ties or is different compared to other existing constructs in the system

I believe with this foundation unblocks further work that is largely outside of core [1]. I think keeping this work in main (v3.0.0) gives lots of headroom to improve or revert if it doesn't go anywhere.

* [1] [[Feature Request] Views features wishlist #12322](https://github.com/opensearch-project/OpenSearch/issues/12322)

Probably other maintainers can chime in here, but adding logic to core with intent of potentially reverting does not make much sense to me. Maybe we can keep the PR open and merge as and when we get clarity on the use cases. Given most of the logic is in new files, I don't foresee many conflicts as well.

@stephen-crawford
Copy link
Contributor

In my opinion, we have created a moving target that will prevent us from ever seeing delivery.

There are frequent changes in OpenSearch core which lack immediate use cases or which are contingent on further work to be valuable. I am happy to provide examples privately but will avoid doing so publicly since it is not considerate of those who have put effort into features I would list. I don't think there is anything wrong with these types of changes, but would ask we stick to a consistent standard when reviewing changes.

I would also like to address a couple of the main blockers that have come up here:

@reta you had previously voiced support of the change #11957 (comment) but changed your mind and are now pushing @peternied to add additional code to see this move forward. Of course it is fine to change your mind--everyone does--but I think we should try to be mindful of the feedback we are providing when reviewing PRs. Reviewing the discussion, I have not seen much actionable input provided to Peter. We are setting a moving target when we ask him to add something to convince us the change is worthwhile. Instead we should be providing clear feedback saying what we would like to see for this to move forward. What issues have been raised that you feel need to be addressed before this can go forward? In my opinion, Peter has provided a basic use case and a framework for expansion. But if you feel we need a concrete difference between Views and index patterns or aliases can you please put that down so we can make sure that we give Peter direction in our feedback?

@jainankitk I understand your concerns about Views and agree that we should not be making changes which we expect to be reverted. That being said, I think Peter has provided some resources (for example his documentation of Views: opensearch-project/documentation-website#6394) which can help address some of your questions. You also mentioned concerns about shipping experimental APIs. I believe the intent is for Views to only be a part of 3.0 for now so that this can be expanded upon. This means it will not be shipped. 3.0 is full of similar changes which are not completely supported at this time. I don't see any reason why we should hold a higher bar for Views versus other changes which have already gone in--to this point we have released entire repositories which we do not actively support. I do not think it will be realistic for Peter to make any further changes in places other than core without having the foundation in place. Adding this code in core offers very little cost (worse case scenario is we do have to revert the change a year from now) while providing a base for development which Peter has explored. As with Reta, if you are still unconvinced could you please list the actionable feedback which you would like to see for this to move forward? Is the issue the overlap with existing features that you want clarified or something else? I think this would greatly help this move forward.

Thanks everyone for reading; I appreciate how you are raising the bar and expecting the best but I want to keep us grounded.

@reta
Copy link
Collaborator

reta commented Feb 20, 2024

@reta you had previously voiced support of the change #11957 (comment) but changed your mind and are now pushing @peternied to add additional code to see this move forward. Of course it is fine to change your mind--everyone does--but I think we should try to be mindful of the feedback we are providing when reviewing PRs.

Thanks @scrawfor99 , I didn't change my mind to be fair - the code looks good, as I stated, but I do not see the compelling use case for this feature (and so does @peternied in offline conversations), aka "killer app" that clears out all doubts and confirms that we need this feature. What I have seen though - the people (other reviewers) asking the same questions I've been asking and that makes me think that we may be pushing the feature we don't have a clear vision for yet. Also, again this came out from comments thread, we are doomed to provide multiple ways to do the same things - this will cause confusion for users and additional work for maintainers. As per Peter's comment, we won't be targeting 2.x (no direct user exposure), this is on the plus side to get it merged, I just fail to see what we could do with the views now (as per this change set), that we cannot do with core features (aliases / index patterns) or sql plugin.

The only thing I am looking for is buy-in from other maintainers here, since I acknowledge the fact I may be missing something out of the large picture.

@msfroh
Copy link
Collaborator

msfroh commented Feb 20, 2024

I would really like "something" that is not an index that we can attach search pipelines to, so that they can be used to enforce rules on requests/responses (like what's proposed in #10938).

I really like the the idea of a construct where we can say "All search access through this window will follow these rules". I can see how aliases could be that construct. From an enforcement angle, if we can manage permissions on aliases (rather than the underlying indices), it would also solve my problem.

On the other hand, I see how aliases have a lot of preexisting baggage. Creating something new that doesn't have the baggage is also appealing (which is why I've been an eager cheerleader for views).

@reta
Copy link
Collaborator

reta commented Feb 20, 2024

On the other hand, I see how aliases have a lot of preexisting baggage. Creating something new that doesn't have the baggage is also appealing (which is why I've been an eager cheerleader for views).

The views address only the read side (hence views) but aliases do both (write and read), I would be inclined to deprecate aliases in favour of views but that would not work.

@jainankitk
Copy link
Collaborator

@scrawfor99 - Thank you for providing your inputs. Please see my response below.

That being said, I think Peter has provided some resources (for example his documentation of Views: opensearch-project/documentation-website#6394) which can help address some of your questions.

I went through the mentioned documentation and could see big overlap between the definition for alias and view.

Opensearch documentation for alias mentions:

An alias is a virtual index name that can point to one or more indexes.
If your data is spread across multiple indexes, rather than keeping track of which indexes to query, you can create an alias and query it instead.

Above documentation of view mentions:

A View in OpenSearch is a virtual index that allows you to define a projection over one or more physical indexes. It enables you to perform search operations across multiple indexes as if they were a single index. Views can be used hide index schema, dynamically reshape from its indexed format, and limit data exposure.

I believe the intent is for Views to only be a part of 3.0 for now so that this can be expanded upon. This means it will not be shipped. 3.0 is full of similar changes which are not completely supported at this time. I don't see any reason why we should hold a higher bar for Views versus other changes which have already gone in--to this point we have released entire repositories which we do not actively support.

I will not comment on the other features or repositories that have already gone in. But as principal, we have to start somewhere and keep raising the bar. Mistakes made in past should not be a reason to keep repeating them, unless ofcourse we agree it was not a mistake.

I do not think it will be realistic for Peter to make any further changes in places other than core without having the foundation in place. Adding this code in core offers very little cost (worse case scenario is we do have to revert the change a year from now) while providing a base for development which Peter has explored. As with Reta, if you are still unconvinced could you please list the actionable feedback which you would like to see for this to move forward? Is the issue the overlap with existing features that you want clarified or something else? I think this would greatly help this move forward.

It is primarily the significant overlap with existing alias feature due to which I am not completely sure if we need Views. And thats where I am wondering if there is some documentation on clarifying how Views are different than alias and why can't aliases evolve to what we are envisioning for Views. I am generally in favor of improving existing feature over introducing new features.

@peternied peternied merged commit 041373b into opensearch-project:main Feb 20, 2024
31 of 32 checks passed
@peternied peternied deleted the views-core branch February 20, 2024 23:40
@peternied
Copy link
Member Author

On the other hand, I see how aliases have a lot of preexisting baggage. Creating something new that doesn't have the baggage is also appealing (which is why I've been an eager cheerleader for views).

Well put @msfroh

I appreciate the back and forth, merging this change isn't a one-way door for OpenSearch, lets keep iterating on these ideas.

@peternied peternied mentioned this pull request Feb 21, 2024
5 tasks
peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Mar 1, 2024
Views, simplify data access and manipulation by providing a virtual layer over one or more indices

Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <peternied@hotmail.com>
amkhar pushed a commit to amkhar/OpenSearch that referenced this pull request Mar 12, 2024
Views, simplify data access and manipulation by providing a virtual layer over one or more indices

Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <peternied@hotmail.com>
Signed-off-by: Aman Khare <amkhar@amazon.com>
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024
Views, simplify data access and manipulation by providing a virtual layer over one or more indices

Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <peternied@hotmail.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
Views, simplify data access and manipulation by providing a virtual layer over one or more indices

Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Peter Nied <peternied@hotmail.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Basic View operations in OpenSearch
9 participants