Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_coalescing: improve error handling #834

Merged
merged 2 commits into from
Oct 23, 2023
Merged

Conversation

cvybhu
Copy link
Contributor

@cvybhu cvybhu commented Oct 13, 2023

test_coalescing spawns a lot of tokio tasks, each of which runs a query on the database.
If a query fails, the task will panic because of the unwrap().

The problem is that all of the tasks are joined using join_all, which doesn't check for errors in the joined tasks.
For example, this piece of code prints "Ok!" and the program finishes successfuly:

let mut tasks = Vec::new();
for _i in 0..4 {
    tasks.push(tokio::spawn(async { panic!("Panic!") }));
}

futures::future::join_all(tasks).await;

println!("Ok!");

There are some error messages in the output, but the important thig is that they don't stop the control flow.

To fix it, let's use try_join_all instead of join_all. try_join_all will check whether the task panicked or not, and return an error if there was a panic.

I manually tested that it works by inserting a panic!() after conn.query().

Let's also make sure that the result of join_all doesn't contain any Result<>, as it could hide the errors. The result of join_all should be just a Vec<> of unit values.

The PR also adds a check that Connection::query didn't return QueryResponse::Error. It is required to check it in order to ensure that the query didn't fail.

Refs: #828

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

@cvybhu cvybhu requested a review from piodul October 13, 2023 20:30
Comment on lines 2023 to 2028
let futs = (base..base + batch_size).map(|j| {
let q = Query::new("INSERT INTO t (p, v) VALUES (?, ?)");
let conn = conn.clone();
async move {
conn.query(&q, (j, vec![j as u8; j as usize]), None)
.await
.unwrap()
}
async move { conn.query(&q, (j, vec![j as u8; j as usize]), None).await }
});
futures::future::join_all(futs).await;
futures::future::try_join_all(futs).await
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this part is OK, actually. If you have a compound future (i.e. a future that polls other futures, like the one returned by join_all), panics from the inner futures will be propagated properly. I'm not sure that changes are needed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess we can just panic there.
I was scared of panics because I don't fully understand how tokio handles panics, but it looks like it would be ok.

Although I found one more issue here. Connection::query returns a QueryResponse, which can be QueryResponse::Error, so we have to also check that. Fixed in the new version.

}));

tokio::task::yield_now().await;
}

futures::future::join_all(futs).await;
futures::future::try_join_all(futs).await.unwrap();
Copy link
Collaborator

@piodul piodul Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, I'm not sure this line is correct now after your changes. If you look at the definition of tokio::task::spawn, you will see that it returns a handle that is a future which returns its own Result - if the future polled by the tokio task panics, the handle will contain Err. So, because you changed the futures to return Result<>s, now try_join_all returns a Result<Result<_>>.

Assuming that you won't change the futs.push part of the code, you will need to do two unwraps here. On the other hand, if you revert changes you did to the futs.push part, you should keep the line directly above as it is and that's it.

To sum up, the whole change should be an one-liner (unless I messed something in my reasoning).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh you're right. It's a shame that rust didn't complain about unused Results here.
I uploaded new version which also checks that the result of join_all is always a Vec<()>

`test_coalescing` spawns a lot of tokio tasks, each of which
runs a query on the database.
If a query fails, the task will panic because of the `unwrap()`.

The problem is that all of the tasks are joined using `join_all`,
which doesn't check for errors in the joined tasks.
For example, this piece of code prints "Ok!" and the program finishes successfuly:
```rust
let mut tasks = Vec::new();
for _i in 0..4 {
    tasks.push(tokio::spawn(async { panic!("Panic!") }));
}

futures::future::join_all(tasks).await;

println!("Ok!");
```
There are some error messages in the output, but the important thig is
that they don't stop the control flow.

To fix it, let's use `try_join_all` instead of `join_all`.
`try_join_all` will check whether the task panicked or not,
and return an error if there was a panic.

I manually tested that it works by inserting a panic!()
after `conn.query()`.

Let's also make sure that the result of join_all doesn't
contain any `Result<>`, as it could hide the errors.
The result of `join_all` should be just a `Vec<>` of unit values.

Refs: scylladb#828

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
`test_coalescing` runs a bunch of insert queries using `Connection::query`.
It unwraps the result of `Connection::query`, but this function returns
a `QueryResponse`, which might be `QueryResponse::Error`.
To make sure that the query was successful we have to check that the response
doesn't contain any errors. This can by done using `into_non_error_query_response()`.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
@cvybhu
Copy link
Contributor Author

cvybhu commented Oct 19, 2023

v2:

  • Went back to doing panic! if the query fails, unwrap() is easier to use than Result. I manually tested that the test fails when I add a panic!() next to conn.query.
  • Ensure that join_all doesn't return a Vec of results, it should always be a vec of unit values
  • I noticed that conn.query returns a QueryResponse which could be QueryResponse::Error. To make sure that a query was successful we also have to check that the response isn't an error. Fixed that in a new commit.

@piodul piodul merged commit 535c724 into scylladb:main Oct 23, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants