Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dtc/develop: speed up static compilation #277

Merged

Conversation

climbfuji
Copy link
Collaborator

@climbfuji climbfuji commented Apr 2, 2020

Changes in ccpp-framework:

  • scripts/mkstatic.py: avoid recreating the auto-generated caps and API when no changes are made

Associated PRs:
#277
NCAR/ccpp-physics#425
NCAR/fv3atm#35
NCAR/ufs-weather-model#33

For regression testing information, see NCAR/ufs-weather-model#33.

…efile/cmakefile/sourcefile snippets if nothing has changed
@climbfuji climbfuji marked this pull request as ready for review April 3, 2020 02:15
@climbfuji climbfuji requested a review from gold2718 April 3, 2020 02:16
@climbfuji
Copy link
Collaborator Author

climbfuji commented Apr 3, 2020

@gold2718 I am adding you as a reviewer here, because this change will go to master very soon (right now it goes to dtc/develop). This change is needed to speed up the compilation of the CCPP code. Without it, the auto-generated caps are recompiled every time, even when no change is made.

I realize that this solution is only required for a short time before transitioning to cap_gen.py.

Copy link
Collaborator

@gold2718 gold2718 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay. FWIW, I do not have any such logic in capgen. I always felt it was the host model's responsibility to decide whether or not to run capgen. We should talk about this sometime.

@climbfuji
Copy link
Collaborator Author

Looks okay. FWIW, I do not have any such logic in capgen. I always felt it was the host model's responsibility to decide whether or not to run capgen. We should talk about this sometime.

Thanks for reviewing. We'll probably need to add this capability (not now). EMC has complained that recompiling CCPP takes too long, and if you make a change in one scheme you don't need to recompile all auto-generated files (but you still need to run cap_gen.py to regenerate the one cap that has changed).

@gold2718
Copy link
Collaborator

gold2718 commented Apr 3, 2020

I have a couple of issues with this approach.

  1. Enshrining EMC's build system in the framework will force other models to adapt that workflow or institute weird workarounds. On the other hand, I can share my solution which accomplishes the same thing (using checksums instead of filecmp but that can easily be configured or modified). It can be inserted into the build process for EMC CCPP models without forcing other host models into a particular workflow.
  2. capgen, in general, may be more atomic than prebuild. For instance, a change in the host model can affect the suite cap code so I am not confident that we can guarantee any partial generation. It would be an interesting, but time consuming exercise to use the new capgen database to limit file re-generation. One file that I think it will be easy and important to control is ccpp_kinds.F90 as that is (or should be) widely used. However, it you only want to rebuild the host cap and you also need to rebuild/recompile the suite cap(s), that does not sound like a lot more compiling. What am I missing?

@climbfuji
Copy link
Collaborator Author

I have a couple of issues with this approach.

  1. Enshrining EMC's build system in the framework will force other models to adapt that workflow or institute weird workarounds. On the other hand, I can share my solution which accomplishes the same thing (using checksums instead of filecmp but that can easily be configured or modified). It can be inserted into the build process for EMC CCPP models without forcing other host models into a particular workflow.
  2. capgen, in general, may be more atomic than prebuild. For instance, a change in the host model can affect the suite cap code so I am not confident that we can guarantee any partial generation. It would be an interesting, but time consuming exercise to use the new capgen database to limit file re-generation. One file that I think it will be easy and important to control is ccpp_kinds.F90 as that is (or should be) widely used. However, it you only want to rebuild the host cap and you also need to rebuild/recompile the suite cap(s), that does not sound like a lot more compiling. What am I missing?

One could put the burden on the developer to decide/set a flag whether to run ccpp_prebuild.py / cap_gen.py or not. However, this is dangerous and may lead to unexpected errors and many help requests sent to us (people forget to rerun ccpp_prebuild.py, then their code breaks because they forgot that they made changes to the metadata, ...).

I don't see why comparing the caps is specific to the UFS host model build system, it makes sense to not update/rewrite the cap if it hasn't changed. If an underlying module has changed (e.g. ccpp_kinds.F90 or machine.F in our world), then the cmake build system recognizes that this file and all files using the module will need to be recompiled automatically.

I was thinking about checksums as well, but decided that this would be more development work and not worth the effort, given that ccpp_prebuild.py is a dying horse.

@gold2718
Copy link
Collaborator

gold2718 commented Apr 3, 2020

Okay, maybe this makes everyone happy:
capgen gets a switch (e.g., --force) which reproduces the current behavior. The new default (because I think --no-force looks dumb) would be similar to what prebuild does:

  1. Generate all files to a temporary directory (e.g., if the capgen directory is ccpp, generate into ccpp_tmp).
  2. Only copy files from the temporary directory if they do not exist in the configured destination or have changed.
  3. Remove the temporary directory.

Sound like a plan? This is easy to implement and may even benefit CESM builds. Currently, we re-run capgen if any metadata file or SDF has changed. However, if the only change is to a variable's long name or a comment, the generated files will not be different. Using both techniques might help.

Copy link
Contributor

@llpcarson llpcarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like a good solution for now - approved

@climbfuji climbfuji merged commit 16b6047 into NCAR:dtc/develop Apr 3, 2020
@climbfuji climbfuji deleted the avoid_recompile_without_change branch June 27, 2022 02:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants