Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨Autoscaling: use label instead of draining machine #5340

Merged
merged 38 commits into from
Feb 19, 2024

Conversation

sanderegg
Copy link
Member

@sanderegg sanderegg commented Feb 15, 2024

What do these changes do?

It seems that the performance of starting a dynamic-sidecar on an autoscaled node is reduced when the docker node is first transitioning from drain to active state.
This PR changes how the autoscaling service treats "drained" nodes.

Before:

  • drained nodes would be set to the drained state as defined in docker terminology (docker node in state drained)

After:

  • drained nodes are maintained as active in the docker terminology but a docker node label is added
  • the docker node label is named osparc-services-ready
  • when the node is "drained", osparc-services-ready=false and when the node is "active" it is set as osparc-services-ready=true
  • an ENV variable allows to control the behavior on the autoscaling service, AUTOSCALING_LABELIZE_DRAINED_NODES it is set currently by default to False so we can test the results in AWS-master. Therefore changes in osparc-config might come later if this proves fruitful.

Related issue/s

How to test

Dev Checklist

DevOps Checklist

@sanderegg sanderegg added the a:autoscaling autoscaling service in simcore's stack label Feb 15, 2024
@sanderegg sanderegg added this to the Schoggilebe milestone Feb 15, 2024
@sanderegg sanderegg self-assigned this Feb 15, 2024
Copy link

codecov bot commented Feb 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (734e40e) 87.3% compared to head (60941f1) 87.3%.

Additional details and impacted files

Impacted file tree graph

@@          Coverage Diff           @@
##           master   #5340   +/-   ##
======================================
  Coverage    87.3%   87.3%           
======================================
  Files        1321    1321           
  Lines       54146   54167   +21     
  Branches     1172    1172           
======================================
+ Hits        47278   47307   +29     
+ Misses       6618    6610    -8     
  Partials      250     250           
Flag Coverage Δ
integrationtests 64.9% <ø> (-0.1%) ⬇️
unittests 85.1% <100.0%> (+<0.1%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...g/src/simcore_service_autoscaling/core/settings.py 97.9% <100.0%> (+<0.1%) ⬆️
...e_service_autoscaling/modules/auto_scaling_core.py 94.2% <100.0%> (+<0.1%) ⬆️
...vice_autoscaling/modules/auto_scaling_mode_base.py 81.4% <100.0%> (-0.4%) ⬇️
...scaling/modules/auto_scaling_mode_computational.py 91.0% <100.0%> (ø)
...e_autoscaling/modules/auto_scaling_mode_dynamic.py 100.0% <100.0%> (ø)
.../simcore_service_autoscaling/utils/utils_docker.py 99.0% <100.0%> (+<0.1%) ⬆️

... and 3 files with indirect coverage changes

@sanderegg sanderegg marked this pull request as ready for review February 16, 2024 16:38
Copy link
Member

@pcrespov pcrespov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx!

Copy link
Contributor

@matusdrobuliak66 matusdrobuliak66 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! Q: Now other containers such as exporter will still keep running correct?

@sanderegg sanderegg force-pushed the autoscaling/use-drain-label branch from 6d67a44 to 2326465 Compare February 19, 2024 07:24
@YuryHrytsuk
Copy link
Contributor

Q: do we need to update some director ENV to make sure dynamic sidecar has proper node label constraints?

Copy link
Contributor

@GitHK GitHK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

Copy link

Quality Gate Passed Quality Gate passed

Issues
0 New issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@YuryHrytsuk
Copy link
Contributor

🚀 🚀 🚀

@sanderegg sanderegg merged commit 68b681c into ITISFoundation:master Feb 19, 2024
55 checks passed
@sanderegg sanderegg deleted the autoscaling/use-drain-label branch February 19, 2024 08:46
jsaq007 pushed a commit to jsaq007/osparc-simcore that referenced this pull request Feb 20, 2024
jsaq007 pushed a commit to jsaq007/osparc-simcore that referenced this pull request Feb 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a:autoscaling autoscaling service in simcore's stack
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Autoscaling: use labels instead of draining nodes
5 participants