-
Notifications
You must be signed in to change notification settings - Fork 258
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: get stats from cgroup file directly #1464
Conversation
We do this because getting it through docker can take up to 1 second per project (container) which makes the ambulance task unacceptably slow.
bbb20d3
to
95626f0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left comments from private channel
gateway/src/project.rs
Outdated
|
||
let usage: u64 = | ||
std::fs::read_to_string(format!("/sys/fs/cgroup/cpuacct/docker/{id}/cpuacct.usage")) | ||
.unwrap_or_default() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should report errors reading this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed. This will now return an error, so we need to make sure this will be available, or else all projects will be errored. We will test this on staging. We are also thinking of adding a check before starting gateway, so the old gateway will be used rather than starting the new one.
gateway/src/project.rs
Outdated
let id = safe_unwrap!(container.id); | ||
|
||
let usage: u64 = | ||
std::fs::read_to_string(format!("/sys/fs/cgroup/cpuacct/docker/{id}/cpuacct.usage")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to update the compose file to have this mounted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, but in a simple way. We need to re-think it a bit to make local development easier, but we will deploy it like this to staging to unblock testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
# filters: | ||
# branches: | ||
# only: main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please uncomment before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
let docker_stats_path = | ||
PathBuf::from_str(DOCKER_STATS_PATH).expect("to parse docker stats path"); | ||
|
||
// Return an error early if the docker stats path is not in the expected location. | ||
if !docker_stats_path.exists() { | ||
return Err(std::io::Error::new( | ||
io::ErrorKind::NotFound, | ||
format!( | ||
"could not find docker stats at path: {:?}", | ||
DOCKER_STATS_PATH | ||
), | ||
)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should just ensure this works for local development, then we can merge it. I don't imagine it will work on Windows, but developing Shuttle locally on Windows is already limited. For Linux, we need to see which version of cgroups are available, or just check if either path exists. For mac, I'm not sure, wdyt @iulianbarbu? See the get_container_stats
function for which paths we use on linux.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll release this without the local development fix, we can do that after.
Description of change
This cherry-picks this commit: 0f99d46, minus the parts that were extracted in #1463
How has this been tested? (if applicable)
I verified that this doesn't cause deserialization errors in the old gateway when deploying to staging, as long as the old gateway has already been deployed with the #1463 fix.
We will also load test this change on staging before deploying to production.