
Enable worker core pinning in CPU nightly benchmark #2166

Merged

Conversation

@min-jean-cho (Collaborator) commented Mar 1, 2023

This PR enables worker core pinning by default for better CPU performance.

Below is a nightly CPU benchmark performance comparison (charts: torchserve_nightly_benchmark_throughput and torchserve_nightly_benchmark_latency).

default: https://github.com/pytorch/serve/actions/runs/4238841923
OMP_NUM_THREADS=1: https://github.com/pytorch/serve/actions/runs/4255376669
core pinning, --use_logical_cores: https://github.com/pytorch/serve/actions/runs/4320222197
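
For context, core pinning restricts each worker process to a fixed set of CPU cores so workers do not migrate across cores or contend for the same ones. Below is a minimal, hypothetical Python sketch of the effect; TorchServe's CPU launcher applies it externally by wrapping each worker with numactl or taskset rather than calling anything like this from worker code:

    import os

    # Pin the current process (pid 0 = self) to cores 0-3; Linux only.
    # The launcher achieves the same result per worker via numactl/taskset.
    os.sched_setaffinity(0, {0, 1, 2, 3})
    print(os.sched_getaffinity(0))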

@msaroufim (Member) commented Mar 16, 2023

Just retriggered CI - @min-jean-cho, I noticed CI failures last week and a bunch of upstream changes to torch; do you need any help resolving this PR?

@min-jean-cho (Collaborator, Author):

Thanks @msaroufim, let me convert this PR to WIP. The remaining work on this PR is to address the concerns @lxning brought up about deploying this in production. I'll update it when ready, thanks!

@min-jean-cho changed the title from "Enable worker core pinning by default" to "[WIP] Enable worker core pinning by default" on Mar 16, 2023
@codecov (bot) commented Apr 12, 2023

Codecov Report

Merging #2166 (86a6898) into master (a460afb) will not change coverage.
The diff coverage is n/a.

❗ Current head 86a6898 differs from pull request most recent head 26d5c39. Consider uploading reports for the commit 26d5c39 to get more accurate results

@@           Coverage Diff           @@
##           master    #2166   +/-   ##
=======================================
  Coverage   71.41%   71.41%           
=======================================
  Files          73       73           
  Lines        3348     3348           
  Branches       57       57           
=======================================
  Hits         2391     2391           
  Misses        954      954           
  Partials        3        3           


@min-jean-cho changed the title from "[WIP] Enable worker core pinning by default" to "Enable worker core pinning in CPU nightly benchmark" on Apr 13, 2023
Comment on lines +113 to +116
    def install_numactl(self):
        # Install numactl via apt if it is not already available (or when --force is given).
        if os.system("numactl --show") != 0 or args.force:
            os.system(f"{self.sudo_cmd}apt-get install -y numactl")

Collaborator:
Do we need to add numactl to the Docker container for CPU?

@msaroufim (Member) left a comment:

Same question as the one Li asked, otherwise LGTM. Approving because this isn't breaking the Docker CI.

In our current dev Docker images we do indeed run the install-dependencies script (https://github.com/pytorch/serve/blob/master/docker/Dockerfile.dev#L70), but since we're moving to the regular Docker images going forward, we should also add numactl there - cc @agunapal

@min-jean-cho (Collaborator, Author):

We can add numactl in the Docker CPU image as well. But I recall that in Docker, numactl required privileged mode -- taskset is available in this case.

@agunapal (Collaborator) left a comment:

@min-jean-cho Could you please tell me where numactl is being used here? I'm trying to understand why the Docker CI is not complaining.

@min-jean-cho (Collaborator, Author) commented Apr 13, 2023

Could you please tell me where numactl is being used here

@agunapal numactl is used internally by the launcher here, here.

Trying to understand why docker CI is not complaining.

I think it's because if cpu_launcher_enable=true and numactl is not available, enabling the launcher just skips pinning. When we upgrade to the next torch 2.x release, if cpu_launcher_enable=true and numactl is not available, then the launcher automatically tries taskset here instead of just skipping.
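
For reference, the fallback order described above amounts to something like the following sketch (a hypothetical helper, not the launcher's actual code): prefer numactl if it is on PATH, otherwise try taskset, otherwise run the worker unpinned.

    import shutil
    import subprocess

    def pinning_prefix(core_list="0-3"):
        # Hypothetical illustration of the numactl -> taskset fallback.
        if shutil.which("numactl"):
            return ["numactl", "-C", core_list]
        if shutil.which("taskset"):
            return ["taskset", "-c", core_list]
        return []  # no pinning tool available; run unpinned

    # Example: launch a worker command pinned to cores 0-3 when possible.
    subprocess.run(pinning_prefix("0-3") + ["python", "-c", "print('worker')"])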

@min-jean-cho merged commit 9edd461 into pytorch:master on Apr 13, 2023