handle cpu vs gpu nodes on gust/derecho #89

Open
jedwards4b opened this issue Mar 16, 2023 · 10 comments

@jedwards4b
Collaborator

CPU nodes have a MAX_TASKS_PER_NODE of 128, while GPU nodes have 64. How do we handle each case independently,
and how do we handle the hybrid case?

jedwards4b self-assigned this Mar 16, 2023

@sjsprecious
Collaborator

Could we add something like MAX_TASKS_PER_GPU_NODE and use it only when ngpus_per_node > 0?

@fischer-ncar
Collaborator

Going along with @sjsprecious, we could change MAX_TASKS_PER_NODE to MAX_TASKS_PER_CPU_NODE.

@jedwards4b
Collaborator Author

@fischer-ncar I think your change would not be backward compatible and may cause confusion. MAX_CPUTASKS_PER_GPU_NODE could be the solution.

@sjsprecious
Collaborator

@jedwards4b so we will add a new XML variable MAX_CPUTASKS_PER_GPU_NODE and only use it when ngpus_per_node > 0?
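
For illustration only, the selection rule being discussed could be sketched as below. This is a minimal sketch under stated assumptions, not CIME code: the function name `tasks_per_node` and its parameters are hypothetical, and the values 128 and 64 are taken from the issue description.

```python
from typing import Optional


def tasks_per_node(
    max_tasks_per_node: int,
    ngpus_per_node: int,
    max_cputasks_per_gpu_node: Optional[int] = None,
) -> int:
    """Pick the per-node task limit (hypothetical helper, not a CIME API).

    Use the GPU-node limit only when GPUs are requested and the new
    variable is defined; otherwise fall back to MAX_TASKS_PER_NODE so
    that existing machine definitions keep working unchanged.
    """
    if ngpus_per_node > 0 and max_cputasks_per_gpu_node is not None:
        return max_cputasks_per_gpu_node
    return max_tasks_per_node


# CPU-only case: the existing variable applies.
cpu_limit = tasks_per_node(128, ngpus_per_node=0, max_cputasks_per_gpu_node=64)
# GPU case: the proposed variable applies (4 GPUs per node is an assumed value).
gpu_limit = tasks_per_node(128, ngpus_per_node=4, max_cputasks_per_gpu_node=64)
```

Falling back to the existing variable when the new one is undefined is one way to address the backward-compatibility concern raised above. The sketch does not cover the hybrid CPU/GPU case.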

@jedwards4b
Collaborator Author

I want to figure out how to run on GPU nodes but also hybrid CPU/GPU before I decide.

@sjsprecious
Collaborator

> I want to figure out how to run on GPU nodes but also hybrid CPU/GPU before I decide.

Can you explain more about "hybrid CPU/GPU"? The GPU nodes are "hybrid" to some extent since there are CPU cores on the GPU nodes as well.

@jedwards4b
Collaborator Author

jedwards4b commented Mar 17, 2023 via email

@amametjanov
Member

In E3SM cime_config, runs on CPUs and GPUs are configured with CIME compiler names such as gnu vs. gnugpu.
E.g.: https://github.com/E3SM-Project/E3SM/blob/master/cime_config/machines/config_machines.xml#L263

  • compiler: gnugpu, gnu
  • MAX_MPITASKS_PER_NODE: 4 for gnugpu, 64 for cpu-only gnu

The gnugpu compiler adds GPU-specific compile flags.
I hope this is the case-configuration that you want to enable.
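
As a rough illustration of this compiler-keyed approach, the lookup could be modeled as below. This is a sketch under stated assumptions, not E3SM or CIME code: the dictionary and function names are hypothetical, and the values 4 and 64 come from the list above.

```python
# Hypothetical model of a compiler-keyed setting; not E3SM or CIME code.
# The values mirror the list above: 4 for gnugpu, 64 for CPU-only gnu.
MAX_MPITASKS_PER_NODE_BY_COMPILER = {
    "gnugpu": 4,
    "gnu": 64,
}


def max_mpitasks_per_node(compiler: str) -> int:
    """Return the per-node MPI task limit for a given compiler name."""
    try:
        return MAX_MPITASKS_PER_NODE_BY_COMPILER[compiler]
    except KeyError:
        raise ValueError(f"no MAX_MPITASKS_PER_NODE defined for compiler {compiler!r}")


gpu_tasks = max_mpitasks_per_node("gnugpu")
cpu_tasks = max_mpitasks_per_node("gnu")
```

In this model the choice of compiler name alone decides whether a case is treated as a CPU or a GPU run.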

@rljacob
Member

rljacob commented Mar 17, 2023

If the component has no GPU pieces, then it just compiles as if it's a CPU-only run. Then it's all about getting the layout right.

@jedwards4b
Collaborator Author

This is what I am trying to move away from. Thank you
