Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redis State Storage #962

Closed

Conversation

zbirenbaum
Copy link
Contributor

@zbirenbaum zbirenbaum commented Jun 4, 2024

  • Add serialize and deserialize to structs
  • Add OperationStateManager for redis

Description

Basic Implementation of WorkerStateManager, ClientStateManager, and MatchingEngineStateManager for RedisStateManager. Adds Serialize and Deserialize to all structs required for storing in redis as json.

One of the following must be merged for this to work:

  1. Serialize and Deserialize added to structs (done in a commit on this PR)
  2. Redis wrapper types #941 must be merged and types here can be changed. Redis wrapper types #941 would enable storage of non-json values in Redis

Type of change

Please delete options that aren't relevant.

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

TDB

Checklist

  • Tests added/amended
  • bazel test //... passes locally
  • PR is contained in a single commit, using git amend see some docs

This change is Reviewable

Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 3 of 5 files at r4.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-22.04, and 14 discussions need to be resolved


-- commits line 1 at r4:
nit: Please condense this.


nativelink-scheduler/Cargo.toml line 34 at r4 (raw file):

redis = { version = "0.25.2", features = ["aio", "tokio", "json"] }
serde = "1.0.203"
serde_json = "1.0.117"

nit: unused?


nativelink-scheduler/src/redis_operation_state.rs line 1 at r4 (raw file):

use std::sync::Arc;

nit: Please add copyright header.


nativelink-scheduler/src/redis_operation_state.rs line 14 at r4 (raw file):

pub struct RedisStateManager  {
    client: Client,

Instead, lets bring this in as a RedisStore. Add some custom functions to get the underlying redis-store client so it's accessible here.

In the constructor have it take in a Store. It may also be good to rebase on top of:
https://reviewable.io/reviews/TraceMachina/nativelink/964

This should allow you to do everything but pub/sub, but you can use the redis-store-client-getter for that.

This will dramatically reduce the amount of changes needed and will allow us to eventually have this take any store.


nativelink-scheduler/src/redis_operation_state.rs line 80 at r4 (raw file):

impl RedisOperation {
    pub fn action_stage(&self) -> Result<ActionStage, Error> {
        match self.stage {

Why not just store the stage as an ActionStage? This would also remove the need for self.result.


nativelink-scheduler/src/redis_operation_state.rs line 97 at r4 (raw file):

        }
    }
    pub fn unique_qualifier(&self) -> &ActionInfoHashKey {

nit: no need to be public.


nativelink-scheduler/src/redis_operation_state.rs line 121 at r4 (raw file):

#[async_trait]
impl MatchingEngineStateManager for RedisStateManager {
    /// Returns a stream of operations that match the filter.

nit: No need to comment here. This is documented in the api.


nativelink-scheduler/src/redis_operation_state.rs line 126 at r4 (raw file):

        filter: OperationFilter,
    ) -> Result<ActionStateResultStream, Error> {
        let mut con = self.get_multiplex_connection().await.unwrap();

nit: don't use unwrap() use .err_tip().


nativelink-scheduler/src/redis_operation_state.rs line 127 at r4 (raw file):

    ) -> Result<ActionStateResultStream, Error> {
        let mut con = self.get_multiplex_connection().await.unwrap();
        let operations = {

nit: Just hoist the data directly, there doesn't appear to be any reason to add an additional 3 lines that don't make it more readable and don't have functionality.


nativelink-scheduler/src/redis_operation_state.rs line 128 at r4 (raw file):

        let mut con = self.get_multiplex_connection().await.unwrap();
        let operations = {
            // returns list of all operations

nit: caps and period.


nativelink-scheduler/src/redis_operation_state.rs line 129 at r4 (raw file):

        let operations = {
            // returns list of all operations
            let Json(operations): Json<Vec<RedisOperation>> = con.json_get("operations:", "$").await?;

nit: Needs .err_tip.


nativelink-scheduler/src/redis_operation_state.rs line 132 at r4 (raw file):

            operations
        };
        let mut v: Vec<Arc<dyn ActionStateResult>> = Vec::new();

nit: Lets use the raw stream api. use operations.into_iter() as the state in unfold().


nativelink-scheduler/src/redis_operation_state.rs line 147 at r4 (raw file):

        _action_stage: Result<ActionStage, Error>,
    ) -> Result<(), Error> {
        let mut con = self.get_multiplex_connection().await.unwrap();

nit: Just use todo!() don't add any of the other code.


nativelink-scheduler/src/redis_operation_state.rs line 153 at r4 (raw file):

    }

    /// Remove an operation from the state manager.

nit: No need to comment here. This is documented in the api.

Copy link
Member

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 1 of 4 files at r2, 4 of 4 files at r3, 3 of 5 files at r4.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-22.04, and 14 discussions need to be resolved

@zbirenbaum zbirenbaum force-pushed the redis-operation-filter branch 4 times, most recently from eb7491e to f01a3ec Compare June 4, 2024 05:01
Copy link
Contributor Author

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Installation / ubuntu-22.04, Local / ubuntu-22.04, Vercel, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), pre-commit-checks, ubuntu-20.04 / stable, vale, and 7 discussions need to be resolved


-- commits line 1 at r4:

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: Please condense this.

This relies on #960 #959 and Either #962 or the Add serialize and deserialize to structs commit as a seperate PR. Those will go away once the other PR's are merged.


nativelink-scheduler/Cargo.toml line 34 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: unused?

It was used at one point and will be in the future but removed for now.


nativelink-scheduler/src/redis_operation_state.rs line 80 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

Why not just store the stage as an ActionStage? This would also remove the need for self.result.

ActionStage::CompletedFromCache(Proto) isn't serializable/deserializable and can't be made so.


nativelink-scheduler/src/redis_operation_state.rs line 97 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: no need to be public.

Done.


nativelink-scheduler/src/redis_operation_state.rs line 121 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: No need to comment here. This is documented in the api.

Done.


nativelink-scheduler/src/redis_operation_state.rs line 126 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: don't use unwrap() use .err_tip().

Done. I can just use ? here since they both return Results.


nativelink-scheduler/src/redis_operation_state.rs line 127 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: Just hoist the data directly, there doesn't appear to be any reason to add an additional 3 lines that don't make it more readable and don't have functionality.

Done.


nativelink-scheduler/src/redis_operation_state.rs line 128 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: caps and period.

Done.


nativelink-scheduler/src/redis_operation_state.rs line 147 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: Just use todo!() don't add any of the other code.

Done. This is implemented now.


nativelink-scheduler/src/redis_operation_state.rs line 153 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: No need to comment here. This is documented in the api.

Done.

Copy link
Contributor Author

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Installation / ubuntu-22.04, Local / ubuntu-22.04, Vercel, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, vale, and 7 discussions need to be resolved


nativelink-scheduler/src/redis_operation_state.rs line 129 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: Needs .err_tip.

I'm a bit confused on this one. Should we use .err_tip everywhere ? is used? Pretty much every function in redis-rs returns a result so that might get a bit unwieldy.

@zbirenbaum zbirenbaum changed the title Redis Operation Filter Redis State Storage Jun 4, 2024
@zbirenbaum zbirenbaum force-pushed the redis-operation-filter branch 4 times, most recently from 2c88755 to c7af92e Compare June 11, 2024 19:31
Copy link
Member

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 4 files at r9, 9 of 12 files at r10.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Installation / macos-13, Installation / macos-14, Installation / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-lre-cc, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable, and 8 discussions need to be resolved


nativelink-scheduler/src/redis_operation_state.rs line 1 at r10 (raw file):

// Copyright 2024 The NativeLink Authors. All rights reserved.

Could we put this into its own folder and separate out some of this into different files?


nativelink-scheduler/tests/redis_state_manager_test.rs line 130 at r10 (raw file):

    async fn basic_update_operation_test() -> Result<(), Error> {
        let worker_id = WorkerId(uuid::Uuid::new_v4());
        let state_manager = RedisStateManager::new("redis://localhost/".to_string());

Does this mean we assume that redis is running when executing unit tests? Unsure if we have that wired up in our test harness cc: @blakehatch ?

Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 9 of 12 files at r10, all commit messages.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable, and 10 discussions need to be resolved


nativelink-scheduler/src/redis_operation_state.rs line 30 at r10 (raw file):

pub struct RedisOperationState {
    client: Client,

nit: Have this instead take a Store and make a helper function:

fn get_redis_client(&self) -> Result<Client, Error>

And downcast it to the redis store.


nativelink-scheduler/src/redis_operation_state.rs line 43 at r10 (raw file):

}

impl TryFrom<RedisOperationImpl> for ActionState {

nit: Move this below to the rest of RedisOperation stuff.


nativelink-scheduler/src/redis_operation_state.rs line 60 at r10 (raw file):

        let mut sub = client.get_async_pubsub().await?;
        let sub_channel = format!("{}:*", &self.inner.operation_id.unique_qualifier.action_name());
        // Subscribe to action name: any completed operation can return status

nit: period.


nativelink-scheduler/src/redis_operation_state.rs line 61 at r10 (raw file):

        let sub_channel = format!("{}:*", &self.inner.operation_id.unique_qualifier.action_name());
        // Subscribe to action name: any completed operation can return status
        sub.subscribe(sub_channel).await.unwrap();

This could easily cause memory leaks. It looks like we need to do something similar to what was done with the StoreSubscription and make this return a Boxinstead. Because we need to callunsubscribe()` if no-one is subscribed any more.

This is also were we'll have the spawn's handle live.


nativelink-scheduler/src/redis_operation_state.rs line 64 at r10 (raw file):

        let mut stream = sub.into_on_message();
        let action_state: ActionState = self.inner.clone().try_into()?;
        // let arc_action_state: Arc<ActionState> = Arc::new();

nit: Cleanup.


nativelink-scheduler/src/redis_operation_state.rs line 65 at r10 (raw file):

        let action_state: ActionState = self.inner.clone().try_into()?;
        // let arc_action_state: Arc<ActionState> = Arc::new();
        // This hangs forever atm

nit: cleanup?


nativelink-scheduler/src/redis_operation_state.rs line 67 at r10 (raw file):

        // This hangs forever atm
        let (tx, rx) = tokio::sync::watch::channel(Arc::new(action_state));
        // Hand tuple of rx and future to pump the rx

nit: period.


nativelink-scheduler/src/redis_operation_state.rs line 69 at r10 (raw file):

        // Hand tuple of rx and future to pump the rx
        // Note: nativelink spawn macro name field doesn't accept variables so for now we have to use this to avoid conflicts.
        #[allow(clippy::disallowed_methods)]

nit: remove.


nativelink-scheduler/src/redis_operation_state.rs line 270 at r10 (raw file):

#[derive(Serialize, Deserialize, Clone, ToRedisArgs, FromRedisValue)]
pub struct RedisOperationImpl {

nit: This is not an impl, it's just RedisOperation

Copy link
Member

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 12 files at r10, all commit messages.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable, and 10 discussions need to be resolved

@zbirenbaum zbirenbaum force-pushed the redis-operation-filter branch 4 times, most recently from 45e8cbb to d9234fb Compare June 17, 2024 20:22
Copy link
Contributor Author

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Local / ubuntu-22.04, Vercel, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, vale, and 4 discussions need to be resolved


nativelink-scheduler/src/redis_operation_state.rs line 14 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

Instead, lets bring this in as a RedisStore. Add some custom functions to get the underlying redis-store client so it's accessible here.

In the constructor have it take in a Store. It may also be good to rebase on top of:
https://reviewable.io/reviews/TraceMachina/nativelink/964

This should allow you to do everything but pub/sub, but you can use the redis-store-client-getter for that.

This will dramatically reduce the amount of changes needed and will allow us to eventually have this take any store.

Done.


nativelink-scheduler/src/redis_operation_state.rs line 132 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: Lets use the raw stream api. use operations.into_iter() as the state in unfold().

Done. I don't think this applies anymore.


nativelink-scheduler/src/redis_operation_state.rs line 60 at r10 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: period.

Done. This function no longer exists as it uses store now.


nativelink-scheduler/src/redis_operation_state.rs line 61 at r10 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

This could easily cause memory leaks. It looks like we need to do something similar to what was done with the StoreSubscription and make this return a Boxinstead. Because we need to callunsubscribe()` if no-one is subscribed any more.

This is also were we'll have the spawn's handle live.

Done. This function no longer exists as it uses store now.


nativelink-scheduler/src/redis_operation_state.rs line 64 at r10 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: Cleanup.

Done. This function no longer exists as it uses store now.


nativelink-scheduler/src/redis_operation_state.rs line 65 at r10 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: cleanup?

Done. This function no longer exists as it uses store now.


nativelink-scheduler/src/redis_operation_state.rs line 67 at r10 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: period.

Done. This function no longer exists as it uses store now.


nativelink-scheduler/src/redis_operation_state.rs line 69 at r10 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: remove.

Done. This function no longer exists as it uses store now.


nativelink-scheduler/src/redis_operation_state.rs line 270 at r10 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: This is not an impl, it's just RedisOperation

Done.

Copy link
Contributor Author

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Local / ubuntu-22.04, Vercel, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, vale, and 4 discussions need to be resolved


nativelink-scheduler/src/redis_operation_state.rs line 1 at r4 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: Please add copyright header.

Done.

Copy link
Contributor Author

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Local / ubuntu-22.04, Vercel, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, vale, and 2 discussions need to be resolved


nativelink-scheduler/src/redis_operation_state.rs line 30 at r10 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: Have this instead take a Store and make a helper function:

fn get_redis_client(&self) -> Result<Client, Error>

And downcast it to the redis store.

Done.


nativelink-scheduler/src/redis_operation_state.rs line 43 at r10 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: Move this below to the rest of RedisOperation stuff.

Done.

Basic Implementation of WorkerStateManager, ClientStateManager,
and MatchingEngineStateManager for RedisStateManager.
@zbirenbaum
Copy link
Contributor Author

No longer working on this here, split into:
#1020
#1021
#1022
#1023

@zbirenbaum zbirenbaum closed this Jun 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants