[RFC] Configurable Automation for OpenSearch AI Use Cases #9213
Comments
Proposed Implementation
The backend will contain a minimal Processor interface taking a map parameter to pass the necessary context between workflow steps. In particular, the output of one step needs to be available as the input for the next step. Other configuration may need to persist across multiple steps, and partial results may be stored during execution to provide status if requested. Processors will be similar to (and in some cases link directly to) Search Processors and Ingest Processors. If a processor needs to make a REST API call as part of its execution, it can do so similarly to the Kendra Ranking Processor. Workflow validation and execution is demonstrated below for a Semantic Search example.
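As a purely illustrative sketch (the step names and keys here are hypothetical, not a committed format), the context map passed between two steps of a semantic search workflow could carry the model ID produced by a deploy step into the ingest pipeline step that follows:

```json
{
  "workflow_id": "semantic-search-demo",
  "steps_completed": ["create_connector", "deploy_model"],
  "deploy_model": {
    "model_id": "WWQI44MBbzI2oUKAvNUt",
    "status": "DEPLOYED"
  },
  "create_ingest_pipeline": {
    "pipeline_id": "nlp-ingest-pipeline",
    "model_id": "WWQI44MBbzI2oUKAvNUt"
  }
}
```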
Workflow Sequence Diagram
The sequence below shows the complete interaction between the user and OpenSearch.
@dbwiddis, thanks for putting together the RFC. I'd like to request that we adjust the name, because I don't want to give the community the impression that we are building an application framework like Streamlit or business process automation software. We are building backend functionality, and we have no intention of creating a framework that makes it difficult for users to decouple app and data tier logic. I propose we describe this work along the lines of a no-code designer and configurable automation for AI-augmented search and ingest pipelines. I also think there are at least two projects here: one is the no-code designer and one is the backend APIs.
Naming things is hard! I agree we can find a better name and am happy to discuss alternatives here.
Yes, as mentioned in the first section, "This No-Code frontend application will be covered in a separate RFC". This RFC is for the templates and back-end framework. "Configurable automation for AI-augmented search and ingest pipelines" is a rather verbose name that I think limits the scope of what we're doing here. I do think "Configurable automation for AI-augmented workflows" could work?
Questions about how this integrates with various things: What would it take to make this a backend tool for Langflow or Flowise (or whatever they call it)? Will this make it easy to have OpenSearch-based AI blocks in my external AI workflow, or is it an either/or - OpenSearchFlow vs. Flowise? It feels to me like there's a fair amount of overlap with the Agent Framework in terms of configuring complicated CoT/GenAI workflows. What should be done with that RFC versus with this one, and where do they meet?
Regarding what to call these things, how about OpenSearch [AI] Studio and OpenSearch [AI] Workflows? What I'm reading here sounds a lot like AWS Glue Studio and Glue Workflows. Glue also offers blueprints, which are constructs built on workflows, and I sense this might be headed in that direction too. I put "AI" as [optional] since none of this sounds specifically tied to AI workloads. Is the backend going to live in OpenSearch ("core") itself, or will it run as a separate thing? Playing devil's advocate, I want to ask why this can't all be done using Apache Airflow (or something similar). I can see there being something new to bring for the frontend, but I'm not seeing a compelling reason for the backend to be built anew unless it has to be deeply integrated inside the core.
@ylwu-amzn, @dbwiddis. I agree. Configurable workflow support will provide a flexible but generic way to support variations of RAG with or without CoT, which may require multi-pass model invocations--the goal of our project is to leave it to the user to decide how they design the query pipelines and what belongs in application-tier logic. The intent of this feature isn't to be overly prescriptive; we want to leave app developers with full control over agent execution logic. IMHO, agents should be built in the app layer, and data stores like OpenSearch should provide users with the flexibility to determine how to execute AI-enriched information retrieval workflows for those agent applications. Let's chat.
We are considering this, but there are some challenges: 1/ how do we provide compatibility and a seamless experience for LangChain backends that have functionality beyond the scope of OpenSearch (e.g., other vector stores)? 2/ how do we extend these frameworks beyond LLMs? There are a multitude of predictive time-series analytics use cases and non-LLM search use cases, like visual and multi-modal search, that our framework aims to support.
Hi Austin, the workflows that we are creating will cover ingest and query workflows. The latter is beyond what Glue does and is executed in real time within the OpenSearch engine (Search Pipelines). Secondly, we are not building/re-building any data processing engine. On the ingest side, we'll provide the option for users to describe ingest workflows that can be translated to a target engine like OpenSearch Ingestion or Apache Spark. We'll look to expand our ML extensibility features so that the community can easily plug in alternative engines. Our framework handles the orchestration, integration, and a portable interface between supported engines.
It will run on the cluster, either as a plugin or in core. As described in the RFC, we're using existing OpenSearch components, which are plugins (e.g., ml-commons) or core (e.g., Search Pipelines). The intent of this work is to simplify what could be done manually. Users could configure and build on the individual components in the RFCs (with enhancements) to accomplish the same thing; we feel we need to simplify the developer and user experiences.
I think we need to adjust the title of this RFC, because I fear people might think we're building something like an Airflow alternative (@dbwiddis), and we are not. One way to think of this is that we are enhancing Search Pipelines. Unlike Airflow, our intent isn't to provide general-purpose batch-oriented workflows. We are building something exclusively for OpenSearch users. As well, Search Pipelines is not batch--it performs query-time processing. The ingest part could call out to Airflow, but we also don't want to build something that forces people to use Airflow with OpenSearch. As commented above, I am going to advocate for our team to make this component extensible and support multiple ingest job execution engines. A user could also build an Airflow workflow that uses this API to prep OpenSearch for a vector database hydration process. A user might choose this option because our intent is to provide a simpler, higher-level interface than if they were to build on lower-level components like the OpenSearch ingest
I agree. But I think we need to be more specific than just "workflows", because I already see comments asking how this is different from general-purpose data processing and workflow engines. Let's think of something that fits. Some thoughts...
This is a thoughtful proposal. After reading it, I asked myself why this is not "just an enhancement over search pipelines". So that seems like the best incremental approach to take. That should solve a lot of the open questions, such as storage.
We could consider calling the project that if it's easier for people to understand the intent. A lot of the work revolves around creating search processors, making search pipelines configurable, and improving usability. However, there are also ingest workflows and extensibility elements that are beyond the scope of search pipelines.
Responding to @HenryL27:
Both of those products (and many others) depend on ReactFlow. Accordingly, their workflows are in the same JSON format cited in the examples earlier, here (Flowise) and here (Langflow). From the backend framework's perspective, as long as the same fields are present in either product, we could use either one. The actual selection of a front-end UI will be covered in another RFC, but just as a preview, to answer "what would it take":
Certainly this can be done quickly and easily, but it would then also require OpenSearch users who want to edit these templates to download/install/set up the appropriate editor. Both Langflow and Flowise require operating a webserver (one can deploy it locally on their own machine with Docker). So we're distributing the (repetitive) setup effort among many users, who can follow the setup steps in line with their corporate security requirements.
We can easily do both. We are defining a template format compatible with either. Getting it out there faster in these other frameworks can happen in parallel with any UI we choose to build (if we choose to do it ourselves).
They meet at the REST API layer. I wrote this considering the current state of the APIs available in 2.9, aiming to make them easily available to as wide an audience as possible. It may be that an API is not available today but will be available in 3 months. This framework makes it easily accessible today, until first-class API support is available (conceptually, much like a set of "custom tool" modules in Flowise could do the same thing until a first-class module replaces it). In my initial research I considered building this inside ml-commons, but reasoned that some use cases will be external to that plugin. From what I understand of that RFC (and other RFCs such as Conversations, existing work improving the capabilities of Search Pipelines, etc.), there will be new APIs made available to streamline many specific processes. This is a good thing. The first few templates created may end up obsolete in 6 months, by which time we'll be building templates for even more things to enable them earlier, even if they become obsolete in 6 more months...
Responding to @austintlee:
I'm planning initial rapid development in an OpenSearch plugin. Some general functionality may be moved into core later, particularly if it's useful for other projects. There will be many overlapping concepts/interfaces/code with search pipelines, ingest pipelines, and others, which will probably find their way into core to be depended on by all projects.
I think the best counter to that is a perspective of "where does the automation live". An external server automates remotely and sends API queries. Automation inside OpenSearch (via a plugin) can be triggered by a single API call and is under the control of the cluster to execute. When using an external webserver-hosted automation framework, we are automating a bunch of API calls coming in to OpenSearch externally.
Automation directly on the cluster has direct access to cluster statistics (e.g., indexing pressure), allowing more stability. Task completion can execute asynchronously with immediate chaining of the next step(s). There's no network latency for API requests, and checks of versioning/compatibility and plugin availability are easier.
Another reply to @HenryL27
There is some commonality in the low-level "tools" that could be leveraged/co-developed by both teams. However, we are solving fundamentally different problems. This framework is solving known execution sequences that can be articulated in a directed graph and executed in a known order. Our primary purpose is enabling rapid prototyping by exchanging particular components of a known workflow, such as trying different models to improve semantic search. The Agent Framework says it's solving a "complex problem, the process generally is hard to be predefined. We need to find some way to solve the problem step by step, identify potential solutions to reach a resolution." There, the order of execution is unknown in advance. I don't see an obvious application in rapid prototyping, as you don't even know if/when a particular tool will be used.
Responding to @dylan-tong-aws
Yes, but... One question is "why don't we build this in ml-commons", and the answer is that the capability is more generic than ml-commons. Yes, we are initially focusing on AI use cases, but we are building a generic capability that will be more easily reused/adapted outside of these cases.
I think this is closest to where we're going. "Configurable automation for OpenSearch AI use cases"?
I'm not sure "priming" is technically specific enough.
@dbwiddis thanks for putting up the proposal. I tried to follow the whole conversation but might have missed a few things, so please let me know if this question is already answered and I will check that response. The question I have is:
@navneet1v, we plan to create a pipeline, similar to Search and Ingest pipelines, for chaining all the plugin APIs required for a specific use case, as mentioned here. Later, this pipeline can be used in the hot path of OpenSearch (e.g., a SearchRequest), similar to how a Search Pipeline is used today.
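A minimal sketch of such a request (the index name, query parameter name, and query text here are illustrative only):

```
GET /my-index/_search?pipeline=my_pipeline
{
  "query": {
    "match": {
      "review_body": "glassware with broken handle"
    }
  }
}
```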
@owaiskazi19 still not clear. Looking at the above request, what I can see is that it's a simple _search query. You added a query param with the value my_pipeline. But the bigger question is why this query looks like a text search query. Is the expectation that we will convert this query to a neural search query with a "neural" clause and add the model_id to the payload as well?
Yes, since we would take care of uploading the model (and will have access to the model_id as well), or, for that matter, any other OpenSearch use case like multi-modal search, vector search, etc. We would mold the query based on the use case selected by the user. The only requirement from the user would be to provide the pipeline_id with the request once the pipeline has been created using the drag/drop option. The aim of the framework is to provide the user with minimal setup for any use case; the heavy lifting would be done by the plugin.
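For instance, a hypothetical sketch of how the framework could rewrite the plain match query above into a neural query, with the model_id filled in from the stored pipeline configuration (field names and values are illustrative):

```
GET /my-index/_search
{
  "query": {
    "neural": {
      "review_embedding": {
        "query_text": "glassware with broken handle",
        "model_id": "<model_id stored by the framework>",
        "k": 10
      }
    }
  }
}
```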
The RFC for the frontend no-code designer is here: opensearch-project/OpenSearch-Dashboards#4755
Hey @navneet1v let me try to address your question.
The customer does not need to change anything. The search API will remain as it is. A customer with less experience with the _search API will be empowered to do more and experiment more easily.
For "builders":
For end users:
For a not-yet-existing experience that requires multiple queries, the user experience will be a single query. Take, for example, RAG. We currently don't have RAG in a single query (although one is proposed in opensearch-project/ml-commons#1150, with an open PR implementing it as a Processor in search pipelines; this is great and may be available in a future version of OpenSearch). So a RAG template today may be obsolete in 3-6 months, but we still might have an easier builder for experimenting with processors to drag and drop into search pipelines, etc. For an existing single-query use case such as semantic search, the primary benefit is that the user needs to provide less information and know fewer technical details. Let's consider the query used in this blog post:
GET amazon-review-index-nlp/_search?size=5
{
"query": {
"neural": {
"review_embedding": {
"query_text": "I love this product so much i bought it twice! But their customer service is TERRIBLE. I received the second glassware broken and did not receive a response for one week and STILL have not heard from anyone to receive my refund. I received it on time, but am not happy at the moment., review_title: Would recommend this product, but not the seller if something goes wrong.",
"model_id": <model_id>,
"k": 10
}
}
},
"fields": ["review_body", "stars"],
"_source": false
}

A user using this API needs to know:
But they also need to know many things about the _search API that may be relevant for this use case, such as using a size parameter to limit results, or using a match query to apply a painless script in some use cases, such as in this example. Ultimately, the end user just needs to know the template/pipeline ID, a list of required fields (like the indices to search), and the optional ones for which the default is pre-populated but can be overridden. Search templates can help simplify some of this, but they're more limited; you could conceptualize these use case templates as somewhat of a superset of search templates, search pipelines, and more.
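As a purely hypothetical sketch (the endpoint and field names below are illustrative, not a proposed API), the streamlined experience could look something like this, with everything else pre-populated by the template:

```
POST /_plugins/_ai_workflow/semantic-search-template/_query
{
  "index": "amazon-review-index-nlp",
  "query_text": "I love this product so much i bought it twice! ...",
  "size": 5
}
```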
I synced with @dbwiddis on Slack and cleared up a few things. The main contention for me was happening because of this example:
It is touching the main search API path by hijacking the _search request; ideally it would have been a different, dedicated API. After talking to @dbwiddis, I got the clarity I needed.
I'm wondering how much incremental progress could be made just by adding an optional self-describing interface to search and ingest pipeline processors. Essentially, this interface could return some kind of processor spec that includes a name and description for the processor, and enumerate the available configuration parameters (with the types, constraints, and description for each parameter). Basically, you could ask the cluster to describe the processors it supports and how to configure them. As a bonus, this self-describing interface could be used to generate documentation.
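A hypothetical spec returned by such an interface (the processor name and parameter fields are illustrative) might look like:

```json
{
  "name": "truncate_hits",
  "description": "Discards search hits after a configurable count.",
  "parameters": {
    "target_size": {
      "type": "integer",
      "required": false,
      "description": "Maximum number of hits to keep; defaults to the request size."
    }
  }
}
```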
I think this is a great idea, as it would provide a path forward to automatically map the required inputs for any processor; that way we can support new processor types without having to manually specify the required inputs for each processor used within a use case template.
@msfroh I could see this as a potentially big maintainability win on the frontend UI, like you've mentioned. We are exploring how to persist some of these interfaces on the frontend for this framework, including the parameters for different search and ingest processors. We want to make sure we can scale well as the processor libraries continue to grow.
Final Backend Design Approach
A workflow setup would be needed to chain all the components together in one place. This setup should take care of the sequencing of the different plugin APIs required for a specific use case, and also of creating a Search or Ingest Pipeline based on the operation selected by the user. A sample flow is outlined below for Semantic Search. This will be a one-time setup, and once the above steps are completed a workflow ID would be returned to the frontend. All the responses received from the respective APIs would be stored in the global context. This also allows a dependent component to utilize the response of the previous component by reading the global context. The global context can be a system index stored in OpenSearch. Collaborator: @joshpalis

Orchestrator
The Orchestrator is responsible for mapping the workflow ID to the use case template sub-workflows to ascertain the necessary API calls and the order of execution. Upon retrieving the chosen sub-workflow, the Orchestrator then iterates over each API and uses the Payload Generator to transform the user request into the correct format. The resulting query produced by the Payload Generator is then used to invoke the API, and the subsequent response will be used to pre-fill the next API request body if applicable. Upon consolidating all the responses, the final response will be reformatted to remove unnecessary information, such as the number of hits, prior to being sent back to the user.

Payload Generator
The Payload Generator will be responsible for mapping the request to the correct query template and filling in the required fields from both the global context and the user input. The query templates will be mapped to a specific API and will format all the user-provided input. Query templates will take inspiration from Search Templates, which allow us to specify the query body and required parameters. These query templates will be stored as mustache scripts within the cluster state, similar to how search templates are configured and stored, and will be used to define the required inputs for the user on the front-end plugin. Currently, only Search Templates utilize mustache scripts to re-write queries, but this can be expanded by the Payload Generator such that any API payload can be re-written into the desired format. A sketch of what such a query template could look like is shown below.
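As an illustrative sketch (not a committed format), a query template for a neural search step could be a mustache script whose parameters are filled from the global context (e.g., model_id) and from user input (e.g., query_text):

```json
{
  "query": {
    "neural": {
      "{{embedding_field}}": {
        "query_text": "{{query_text}}",
        "model_id": "{{model_id}}",
        "k": "{{k}}"
      }
    }
  }
}
```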
Will this support arbitrary plugins and extensions (let's say they support ML/AI use cases)?
@austintlee, there will be multiple integration/extension points, for instance: 1. via the external OpenSearch API, 2. by publishing a use case template, 3. by contributing building blocks, e.g., a data processor like a text chunker, or a specialized ML processor like a connector to an AI service, etc. Let's connect sometime so I can understand how you'd like to extend this system.
Will we be able to bring Hugging Face models as well? In the JSON/YAML document, will we just have the settings below, with the user unable to adjust the response value? Will the user have the flexibility to use any LLM, or just pre-defined, hardcoded models like we have now?
Note: This RFC is for a back-end framework and templates that enable it. See No-code designer for AI-augmented workflows for the corresponding front-end RFC.
Proposal
The current process of using ML offerings in OpenSearch, such as Semantic Search, requires users to handle complex setup and pre-processing tasks and to send verbose queries, both of which can be time-consuming and error-prone.
The directional idea is to provide OpenSearch users with use case templates, which provide a compact description (e.g., a JSON document). These templates would describe configurations for automated workflows such as Retrieval Augmented Generation (RAG), AI connectors, and other components that prime OpenSearch as a backend to leverage generative models. Once primed, builders can query OpenSearch directly without building middleware logic to stitch together data flows and ML models.
While a range of pre-configured templates will be available, we also envision a no-code drag-and-drop frontend application similar to LangFlow, Flowise, and other offerings, which would enable quick editing of these templates for rapid prototyping. This No-Code frontend application will be covered in a separate RFC, but it is important to note that these templates will be written in user-readable form (e.g., JSON or YAML), so use of the no-code builder is not required for the backend framework to use them.
Once a workflow is uploaded to OpenSearch, it will enable a streamlined API call with only a minimal set of (required and optional) parameters provided by the user.
This proposal is the collaboration of the following additional contributors:
Goals and Benefits
The goal is to improve the developer experience that we created in 2.9 for developing semantic search and GenAI chatbot solutions (RAG), as well as future capabilities being developed, by creating a framework that simplifies these tasks, hiding the complexity and enabling users to leverage ML offerings seamlessly.
This framework will provide a complete solution, streamlining the setup process with an easy-to-use interface for creating and accessing workflows supporting ML use cases. Using the framework, builders can create workflows leveraging OpenSearch and external AI apps to support visual, semantic, or multi-modal search, intelligent personalization, anomaly detection, forecasting, and other AI-powered features faster and more easily. End-user app integrations will be greatly simplified and easy to incorporate into third-party app frameworks.
These templates help users automatically configure search pipelines, ml-commons models, AI connectors, and other lower-level framework components through declarative-style APIs. Once a template successfully executes, the OpenSearch cluster should be fully configured and the vector database hydrated, allowing an app developer to run direct AI-enriched queries. Since the API is built on framework components (neural search, the ML framework, AI connectors, and search pipelines), there's no middleware like LangChain microservices to manage. In some cases, the ingest process may require sophisticated customizations, so templates might leave a builder with an environment in which to touch up workflows and control the right time to hydrate OpenSearch.
Background
Between OpenSearch 2.4 and 2.9, we released a number of platform features that simplify the creation of semantic search apps on OpenSearch. Our intent is to continue building on these features to improve the user experience, performance, and economics. More importantly, we want to enhance our framework so that it's not limited to semantic search type use cases. One of our goals is to support a broader set of AI use cases suited for OpenSearch, for example the upcoming Conversational Memory (Chat History) API and Conversation Plugin. Currently, our framework is built around the following:
RAG extends the semantic search query workflow. The first part of RAG is the retrieval workflow, which is the same as the semantic search query flow, but instead of returning similar documents to the user, it sends them to a generative LLM. The LLM processes those results to return a modified response (e.g., a summarization or recommendations as a conversational response).
Search Pipelines provide APIs to configure query workflows, including processors based on painless scripts. In 2.9 we leverage search pipelines to support the RAG workflow; a minimal example of creating and using a search pipeline is sketched below.
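For reference, a search pipeline can be created and attached to a query like this (the processor choice and filter criteria here are illustrative):

```
PUT /_search/pipeline/my_pipeline
{
  "request_processors": [
    {
      "filter_query": {
        "description": "Only return documents visible to the public.",
        "query": { "term": { "visibility": "public" } }
      }
    }
  ]
}

GET /my-index/_search?search_pipeline=my_pipeline
{
  "query": { "match": { "review_body": "glassware" } }
}
```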
These framework components provide a lot of the building blocks required to assemble an AI app on OpenSearch. There are still some feature gaps and variation between use cases that depend on future OpenSearch capabilities. The initial design must anticipate the need to extend and evolve as these new features are added.
High Level Design
The design will center around use case templates which describe the sequence of execution of building blocks in a Directed Acyclic Graph (DAG). Both series and parallel execution will be supported. Each building block implements a logical step in a workflow. Example building blocks include:
Drag-and-drop editors under consideration depend on ReactFlow. While we are considering using an existing one, we may also develop our own, likewise based on ReactFlow. Exporting a flow produces a JSON object with fields that we can extend as necessary. A minimal default flow with two nodes connected by one edge shows the format identifying the graph underlying the workflow.
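A representative sketch in the ReactFlow export format (node positions, labels, and IDs are illustrative):

```json
{
  "nodes": [
    { "id": "1", "position": { "x": 0, "y": 0 }, "data": { "label": "Node 1" } },
    { "id": "2", "position": { "x": 0, "y": 100 }, "data": { "label": "Node 2" } }
  ],
  "edges": [
    { "id": "e1-2", "source": "1", "target": "2" }
  ]
}
```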
For more detailed example templates, consider this example in Flowise using Anthropic Claude to ingest a document, or this basic example in Langflow. We intend to use a similar format, and potentially enable the ability to parse/import formats from these and other popular no-code builders.
These templates can be generated using a no-code editor, or manually constructed/edited using the required fields. The interaction between a no-code front end and the execution layer in a plugin is shown here.
Template Fields
The example below outlines potential fields that we will likely include. Key components include version compatibility (some APIs require minimum OpenSearch versions), which client interfaces/APIs are needed (to permit validation of appropriate plugin installation), which external connectors are required, which Integrations may be used, and other definitions. Some workflows associated with common use cases, such as Retrieval Augmented Generation (RAG), will be available from the backend framework by name, offering a streamlined template definition, while others may be assembled by the user from a selection of building-block nodes.
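As a purely illustrative sketch (the field names and values below are hypothetical, not the final schema), such a template might look something like:

```json
{
  "name": "semantic-search",
  "description": "Configure OpenSearch for semantic search on a text field.",
  "use_case": "SEMANTIC_SEARCH",
  "version": {
    "template": "1.0.0",
    "compatibility": ["2.9.0", "3.0.0"]
  },
  "apis": ["ml-commons", "search-pipelines"],
  "connectors": [
    { "name": "embedding-model-connector", "endpoint": "<external model endpoint>" }
  ],
  "workflows": {
    "provision": {
      "nodes": ["create_connector", "deploy_model", "create_ingest_pipeline"],
      "edges": [
        { "source": "create_connector", "target": "deploy_model" },
        { "source": "deploy_model", "target": "create_ingest_pipeline" }
      ]
    }
  }
}
```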
Open Questions