[RFC] Offline Background Tasks #12361

Open
linuxpi opened this issue Feb 19, 2024 · 3 comments
Assignees
Labels
enhancement Enhancement or improvement to existing feature or request merges RFC Issues requesting major changes Roadmap:Cost/Performance/Scale Project-wide roadmap label Storage Issues and PRs relating to data and metadata storage

Comments

@linuxpi
Collaborator

linuxpi commented Feb 19, 2024

[Detailed Design Proposal] #13554

Introduction

An OpenSearch process running with the data role is responsible for executing various background tasks in addition to indexing and search. Some of these are:

  • Segment Merges
  • Force Merges
  • Re-indexing
  • Remote Garbage Collection
  • Shard Split/Shrink
  • Snapshots etc.

These tasks are a crucial part of an OpenSearch cluster. For example, segment merges keep indices in an optimal state. As an index grows and data is constantly added or updated, its segments need to be merged periodically to maintain efficient search performance and a minimal storage footprint.

This is even more important for indices where data ingestion is sparse over time, which leads to a high number of small segments. Segment merges combine these segments into larger ones, ensuring better overall index performance.

Similarly, each background task has its own importance.

Is your feature request related to a problem? Please describe

While they are a crucial part of OpenSearch, these tasks consume resources and take a toll on the process that is supposed to deliver predictable and consistent indexing and search throughput. For example, segment merging is an important, frequent, and heavy operation that demands a good chunk of the available resources. Force merging down to a smaller number of segments takes an even heavier toll.

In addition, the resources configured on the node might not be sufficient to perform these operations alongside incoming traffic within the expected timeframe, which leads to timeouts and failures and eventually delays the background operations.
Any failures or bugs in these background operations can also interfere with core operations.

Describe the solution you'd like

Give users the ability to segregate such operations onto separate, dedicated node(s). This helps them scale indexing and search performance predictably, without core operations having to compete for resources with background tasks. Similarly, background tasks won't be impacted by any surge in core operations traffic.

With the introduction of Remote Store, offloading background operations makes even more sense, as the data is separated out into the Remote Store and is efficient to interact with from a separate, dedicated node.

The proposal is to introduce a separate fleet of nodes (the Offline Fleet) to execute all background tasks. This ensures full segregation from core operations and allows users to scale this fleet independently, based on the pending background tasks.

[Image: architecture_queue]
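
The queue-based arrangement proposed above can be illustrated with a rough sketch. It is not OpenSearch code and the RFC does not prescribe this design: `TaskQueue`, `OfflineNode`, and the shard names are hypothetical, and the queue lives in memory only. It shows data nodes only enqueueing work, offline nodes pulling it when they have capacity, and the backlog size serving as the signal for scaling the fleet.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class BackgroundTask:
    task_id: int
    kind: str    # e.g. "merge" or "force_merge"
    shard: str   # shard whose segments live in the Remote Store


class TaskQueue:
    """Hypothetical shared queue between data nodes and the Offline Fleet."""

    def __init__(self):
        self._pending = deque()
        self._next_id = 0

    def submit(self, kind, shard):
        # Called by a data node: it only enqueues the work, it does not run it.
        task = BackgroundTask(self._next_id, kind, shard)
        self._next_id += 1
        self._pending.append(task)
        return task

    def poll(self):
        # Called by an offline node when it has spare capacity.
        return self._pending.popleft() if self._pending else None

    def backlog(self):
        # Pending task count; a possible signal for scaling the fleet.
        return len(self._pending)


class OfflineNode:
    def __init__(self, name):
        self.name = name
        self.completed = []

    def run_once(self, queue):
        task = queue.poll()
        if task is None:
            return False
        # A real node would download segments, merge them, and upload the result.
        self.completed.append(task.task_id)
        return True


queue = TaskQueue()
for shard in ["logs[0]", "logs[1]", "metrics[0]"]:
    queue.submit("merge", shard)

fleet = [OfflineNode("offline-1"), OfflineNode("offline-2")]
while queue.backlog():
    for node in fleet:
        node.run_once(queue)
```

Because the nodes pull work instead of having it pushed to them, adding a node to `fleet` increases throughput without any change on the submitting side.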

To begin with, we can target segment merges and force merges, and give Remote Store clusters the ability to separate out merges. Later we can extend this to other background tasks and even think about how to extend the functionality to non Remote Store clusters.

Here is a high-level view of how the flow looks with the Offline Fleet for a cluster.

[Image: HighLevelSequenceDiagram]

The Added Cost

Not all users would want to spin up separate nodes for background operations, so however we choose to implement and execute this, we would ensure the status quo is maintained.

There is obviously an added cost for the Offline Fleet, which depends directly on the number of nodes provisioned in it.

Apart from that, with the Offline Fleet there would be 2 additional downloads. Consider segment merges:

  1. The Offline Fleet node would have to download the segments to be merged. Today, since the segments are already present locally on the data node, no download is needed.
  2. Once the merged segments are uploaded to the Remote Store, the data node hosting the corresponding shard would download those merged segments.
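
As a back-of-the-envelope illustration of the two downloads listed above, the extra transfer for a single merge can be estimated as follows. The function name and the segment sizes are made up for the example; they are not measurements from a real cluster.

```python
def extra_transfer_mb(input_segment_sizes_mb, merged_segment_size_mb):
    """Estimate the additional data downloaded when one merge runs offline."""
    # Download 1: the offline node fetches every segment taking part in the merge.
    offline_node_download = sum(input_segment_sizes_mb)
    # Download 2: the data node fetches the merged segment from the Remote Store.
    data_node_download = merged_segment_size_mb
    return offline_node_download + data_node_download


# Hypothetical merge: three small segments combined into one 90 MB segment.
input_sizes_mb = [40, 25, 35]
merged_size_mb = 90
extra_mb = extra_transfer_mb(input_sizes_mb, merged_size_mb)
```

With these assumed sizes, `extra_mb` comes to 190, so the added transfer is roughly the size of the merge inputs plus the size of its output.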

In the future, we could also support a hybrid model where lightweight tasks run locally on data nodes while others are offloaded to the Offline Fleet.
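
One possible shape for such a hybrid policy is sketched below. The size-based cutoff, the `LOCAL_THRESHOLD_BYTES` value, and the `choose_executor` name are all assumptions made for illustration; the RFC does not define how a task would be classified as lightweight.

```python
from dataclasses import dataclass


@dataclass
class Task:
    kind: str
    estimated_bytes: int  # rough amount of data the task has to process


# Hypothetical cutoff below which a task counts as lightweight.
LOCAL_THRESHOLD_BYTES = 256 * 1024 * 1024


def choose_executor(task, offline_fleet_available):
    # Without an Offline Fleet everything stays on the data node (status quo).
    if not offline_fleet_available:
        return "local"
    # Lightweight tasks avoid the extra downloads by running locally.
    if task.estimated_bytes <= LOCAL_THRESHOLD_BYTES:
        return "local"
    return "offline"


small_merge = Task("merge", 10 * 1024 * 1024)
big_force_merge = Task("force_merge", 8 * 1024 * 1024 * 1024)
```

Here `choose_executor(small_merge, True)` yields `"local"`, `choose_executor(big_force_merge, True)` yields `"offline"`, and both stay `"local"` when no Offline Fleet is provisioned.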

As we progress, I plan on adding more details about the individual components involved and how they interact with each other and with existing components.

Related component

Storage

Describe alternatives you've considered

Apart from the approach mentioned above, another option would be to isolate resources on the data node itself between core (indexing/search) operations and other ad hoc operations like merges and snapshots. This would cause less friction for users during adoption, as they would not have to provision a separate fleet. But it has some caveats that make it less appealing:

  • We wouldn’t be able to independently scale resources for merges without affecting core operations.
  • Reserving resources for ad hoc operations on every data node might not be optimal, as not all nodes will have merges to perform all the time. Pooling the merge operations from all nodes onto dedicated nodes would instead give better utilization of the dedicated resources.
  • Complete isolation of resources on the same node is not trivial to achieve.

Additional context

No response

@linuxpi added the enhancement and untriaged labels Feb 19, 2024
@github-actions bot added the Storage label Feb 19, 2024
@linuxpi self-assigned this Feb 19, 2024
@linuxpi changed the title from "[RFC] Offline Merge" to "[RFC] Offline Background Tasks" Feb 19, 2024
@peternied added the RFC label Feb 21, 2024
@peternied
Member

[Triage - attendees 1 2 3 4 5]
@linuxpi Thanks for filing, looking forward to seeing how this progresses

@linuxpi
Collaborator Author

linuxpi commented Mar 21, 2024

Phases

Phase #1

Goal - Have a basic framework ready to run segment merges (including force merges) on dedicated background task nodes while maintaining the status quo

Meta - #12725

To achieve the goal mentioned above, we need to explore concrete solutions for the following items, which could be extended in upcoming phases to other background tasks such as snapshots.

Separate out Merge Functionality to an independent Component

Most of the OpenSearch codebase today exists as a monolith in :server, which hosts code related to various background tasks, including merges. It would be an anti-pattern to build the entire :server jar and host it on an offline node that is only responsible for performing merges. We need a way to separate out individual components like “Merge” and run them separately on the Offline Fleet.

Build a Task Coordination Framework to manage task lifecycle.

With the Offline Fleet, both data nodes and Offline Fleet nodes themselves can submit background tasks to the Offline Fleet. At any point, the number of tasks submitted might be more than the available nodes in the Offline Fleet can distribute amongst themselves.

Even if we do try to assign a task to a particular node right after it is submitted, the node may or may not have the resources to start the task at that time and would need to put it into a “queue”. Moreover, if the node goes down and this queue is not persisted remotely, all the tasks in the queue are lost.
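
One common way to avoid losing queued tasks when a node dies is to keep task state outside the worker and hand out time-limited leases. The sketch below illustrates that idea only; it is not the design chosen for this RFC. `DurableTaskStore` is a hypothetical in-memory stand-in for a store persisted remotely, and time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRecord:
    task_id: str
    state: str = "PENDING"           # PENDING -> CLAIMED -> DONE
    owner: Optional[str] = None
    lease_expires_at: float = 0.0


class DurableTaskStore:
    """In-memory stand-in for a task store that outlives any single worker."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.records = {}

    def submit(self, task_id):
        self.records[task_id] = TaskRecord(task_id)

    def claim(self, worker, now):
        # Hand out the first task that is pending or whose lease has expired.
        for record in self.records.values():
            lease_expired = (
                record.state == "CLAIMED" and record.lease_expires_at <= now
            )
            if record.state == "PENDING" or lease_expired:
                record.state = "CLAIMED"
                record.owner = worker
                record.lease_expires_at = now + self.lease_seconds
                return record
        return None

    def complete(self, task_id, worker):
        # Only the current lease holder may mark the task as done.
        record = self.records[task_id]
        if record.owner != worker or record.state != "CLAIMED":
            return False
        record.state = "DONE"
        return True


store = DurableTaskStore(lease_seconds=30)
store.submit("merge-shard-0")

store.claim("offline-1", now=0)                   # offline-1 takes the task, then crashes
blocked = store.claim("offline-2", now=10)        # lease still held: nothing to claim
retried = store.claim("offline-2", now=31)        # lease expired: task is handed over
stale = store.complete("merge-shard-0", "offline-1")     # old owner is rejected
finished = store.complete("merge-shard-0", "offline-2")  # new owner succeeds
```

The task survives the crash of `offline-1` because its record never lived on that node; the cost is that a slow but healthy worker must renew or outpace its lease, which this sketch does not model.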

Phase #2

Goal - Onboard more use cases like Remote GC and Snapshots

@dblock
Member

dblock commented May 22, 2024

Late to this game coming from another PR. I think the name "offline" is confusing, would call these "worker" nodes.

Projects
Status: New
Status: 🏗 In progress
Development

No branches or pull requests

5 participants