Silently fails to write GCS cache #1384
Comments
I think failing to write should not itself be an error; there might be sporadic issues that should not cause a build to abort. Failing to authorize is a different ballpark though, and if it happens repeatedly it should be an error. PRs welcome, but it should be somewhat generalizable. Not sure which heuristic to use for aborting. Note that this is a breaking change (it changes user-observable behavior).
Yeah I didn't mean to say that every write failure should be an error, just that entirely failing to ever write because it can't authenticate should be. I think I'd always abort on an unauthorized response. It might also be good to keep track of other errors and somehow detect high error rates, but that's a much more nuanced and large undertaking. So, I'd propose:
WDYT?
sounds good. anyway, it doesn't have to be perfect at first!
Sounds good. I'll figure out how to build Rust and take a look at contributing :-)
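The policy discussed above (always abort on an unauthorized response, tolerate sporadic write failures) could be sketched roughly as follows. This is only an illustration: `WriteFailure`, `Action`, `WritePolicy`, and both thresholds are hypothetical names and values, not sccache or opendal types, and the error-rate check is the "more nuanced" part that may well need a different heuristic.

```rust
/// Hypothetical classification of a failed cache write (not an sccache type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteFailure {
    /// The backend rejected our credentials (for example HTTP 401 or 403).
    Unauthorized,
    /// Anything else: timeouts, 5xx responses, connection resets.
    Transient,
}

/// What the caller should do after a failed write.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Log a warning and keep building.
    Continue,
    /// Stop: writes are not going to succeed with the current setup.
    Abort,
}

/// Counts write outcomes and decides when to give up.
#[derive(Debug, Default)]
pub struct WritePolicy {
    attempts: u32,
    failures: u32,
}

impl WritePolicy {
    // Assumed thresholds, chosen only for illustration.
    const MIN_ATTEMPTS: u32 = 10;
    const MAX_FAILURE_PERCENT: u32 = 50;

    pub fn record_success(&mut self) {
        self.attempts += 1;
    }

    pub fn record_failure(&mut self, failure: WriteFailure) -> Action {
        self.attempts += 1;
        self.failures += 1;
        match failure {
            // An unauthorized response aborts immediately.
            WriteFailure::Unauthorized => Action::Abort,
            // Other failures abort only once the failure rate is high
            // over a reasonable number of attempts.
            WriteFailure::Transient => {
                let enough_data = self.attempts >= Self::MIN_ATTEMPTS;
                let too_many =
                    self.failures * 100 > self.attempts * Self::MAX_FAILURE_PERCENT;
                if enough_data && too_many {
                    Action::Abort
                } else {
                    Action::Continue
                }
            }
        }
    }
}
```

A single transient failure returns `Action::Continue`, while the first `Unauthorized` failure returns `Action::Abort` regardless of history.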
It actually looks like this issue is more widespread. Failing to create an Azure cache is similarly a warning: Line 342 in b6b2005
The cache creation function doesn't appear to be configured to return an error. It can log an error, but that doesn't actually halt the server from starting. Should that return a
Here are some notes from OpenDAL's side (since s3/azure/gcs have been migrated to opendal). Services that have been migrated to opendal will return opendal::Error. To address this issue, we can check the kind returned by the error, like the following:

    match err.kind() {
        ErrorKind::ObjectNotFound => ...,
        _ => ...
    }

Writing a

By the way, opendal provides a LoggingLayer which can print informative logs for every operation.

    use anyhow::Result;
    use opendal::layers::LoggingLayer;
    use opendal::Operator;
    use opendal::Scheme;

    let _ = Operator::from_env(Scheme::Fs)
        .expect("must init")
        .layer(LoggingLayer::default());

Logs will be like the following:

    [2022-12-13T06:29:52Z WARN opendal::services] service=s3 operation=stat path=x/x/y -> errored: ObjectNotFound (permanent) at stat
    Context:
        response: Parts { status: 404, version: HTTP/1.1, headers: {"x-amz-request-id": "HRX2A7QGVQ7X6EXY", "x-amz-id-2": "y7DKQ9PhRq3jFyWlH6/oReVQXL0bO+HAyicycw1H6qqZHzW/TAb2yHqeTwKXAxA4B+azRJ2mgKE=", "content-type": "applicati***/xml", "date": "Tue, 13 Dec 2022 06:29:52 GMT", "server": "AmazonS3"} }
        service: s3
        path: x/x/y
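To make the kind-matching idea above concrete, here is a minimal sketch of mapping an error kind to how loudly it should be reported. The `Kind` enum is a local stand-in written for this example, not opendal's `ErrorKind`, and its variant names and the chosen severities are assumptions.

```rust
/// Stand-in for a storage error kind. This is NOT opendal's `ErrorKind`;
/// the variant names are assumptions made for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    NotFound,
    PermissionDenied,
    Other,
}

/// How loudly a failure should be reported.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Debug,
    Warn,
    Error,
}

/// Map an error kind to a reporting severity.
pub fn severity_for(kind: Kind) -> Severity {
    match kind {
        // A missing object is just a cache miss.
        Kind::NotFound => Severity::Debug,
        // Bad or missing credentials will not fix themselves on retry.
        Kind::PermissionDenied => Severity::Error,
        // Everything else may be sporadic, so only warn.
        Kind::Other => Severity::Warn,
    }
}
```

With a mapping like this, the 404 in the log above would be reported at debug level rather than as a warning.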
If sccache is unable to authenticate to GCS, it just silently proceeds without authentication. If it gets an "unauthorized" response when writing, it proceeds. Instead, I think several of these things should be an error, with the option to configure it to bypass the errors. I found #1180, which I think covers some of these issues, but there are others that should
Here are some logs demonstrating the issue: https://gist.github.com/GMNGeoffrey/b0f7453349b8211b3af355bc9a9765f9
This should be an error. I think just changing
sccache/src/cache/cache.rs
Line 378 in b6b2005
This should be DEBUG. A 404 is just a cache miss. #1180 covers this.
Should this be an error? No authentication with ReadWrite seems exceptionally unlikely to work... I wonder if any cloud bucket is configured for unauthenticated write access.
This should be an error.
I am happy to send patches for these issues.
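As a rough illustration of the startup check suggested above (ReadWrite mode without credentials becomes an error, with a configurable bypass), something like the following could work. `RwMode`, `GcsSettings`, `validate`, and the `allow_unauthenticated_writes` flag are all hypothetical names for this sketch and do not reflect sccache's actual configuration.

```rust
/// Hypothetical read/write mode for the cache (names are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RwMode {
    ReadOnly,
    ReadWrite,
}

/// Hypothetical settings struct, not sccache's real configuration type.
#[derive(Debug, Clone)]
pub struct GcsSettings {
    pub rw_mode: RwMode,
    pub has_credentials: bool,
    /// Hypothetical opt-out for buckets that really do accept anonymous writes.
    pub allow_unauthenticated_writes: bool,
}

/// Fail at startup instead of discovering the problem on every write.
pub fn validate(settings: &GcsSettings) -> Result<(), String> {
    let writes_without_auth =
        settings.rw_mode == RwMode::ReadWrite && !settings.has_credentials;
    if writes_without_auth && !settings.allow_unauthenticated_writes {
        return Err(
            "GCS cache is configured as ReadWrite but no credentials were found".to_string(),
        );
    }
    Ok(())
}
```

Returning a `Result` here lets the server refuse to start, rather than logging a warning and continuing with a cache that can never be written.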