Speed-up dramatically proj.db build time. #4280

Merged
merged 1 commit into from
Oct 22, 2024

rouault
Member

@rouault rouault commented Oct 16, 2024

(follow-up / improvement of #4279)

Current proj.db build time is typically 50 to 60 seconds (and up to 7.5 hours on arm64 cross-compilation with full emulation!). Most of that time is spent running consistency checks. Those checks only need to run once, each time we update the content of the database. When they are skipped, the build time drops to about 3 seconds.
So in data/CMakeLists.txt, let's keep track of the expected md5sum of the concatenation of the data/sql/*.sql files. When building proj.db, we check whether the actual and expected md5sums match. If they do, build proj.db by inserting the consistency check triggers after the data records have been inserted. If there's a mismatch, do a one-time build with the triggers inserted before the data records, check that proj.db builds fine that way, and if so, emit a CMake error message telling the user to update the PROJ_DB_SQL_EXPECTED_MD5 variable in data/CMakeLists.txt with the provided value. Subsequent runs go through the fast build path until the content is updated again.
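
For readers who want to see the control flow in isolation, here is a minimal Python sketch of the idea. It is not the code in generate_proj_db.cmake: the table, trigger, SQL snippets and the `sql_md5` / `build_db` helpers are hypothetical stand-ins, and Python's `hashlib` and `sqlite3` modules are used only to illustrate the logic that the CMake script applies to the real data/sql/*.sql files.

```python
import hashlib
import sqlite3

# Hypothetical stand-ins for the data/sql/*.sql files (not the real PROJ schema).
SCHEMA_SQL = "CREATE TABLE unit(code INTEGER PRIMARY KEY, conv_factor REAL);"
TRIGGER_SQL = """
CREATE TRIGGER unit_check BEFORE INSERT ON unit
BEGIN
    SELECT RAISE(ABORT, 'conv_factor must be positive')
    WHERE NEW.conv_factor <= 0;
END;
"""
DATA_SQL = (
    "INSERT INTO unit VALUES (9001, 1.0);"
    "INSERT INTO unit VALUES (9002, 0.3048);"
)


def sql_md5(parts):
    """Return the hex MD5 of the concatenated SQL sources."""
    return hashlib.md5("".join(parts).encode("utf-8")).hexdigest()


def build_db(expected_md5, schema=SCHEMA_SQL, triggers=TRIGGER_SQL, data=DATA_SQL):
    """Build an in-memory database, choosing the fast or validating path."""
    actual = sql_md5([schema, triggers, data])
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema)
    if actual == expected_md5:
        # Fast path: the content was already validated once, so load the data
        # first and create the triggers afterwards. Triggers do not fire for
        # rows that were inserted before they existed.
        conn.executescript(data)
        conn.executescript(triggers)
        return conn, "fast"
    # Validating path: create the triggers first so every insert is checked.
    # Invalid data raises a sqlite3.Error subclass here.
    conn.executescript(triggers)
    conn.executescript(data)
    conn.close()
    # The data is valid but the recorded checksum is stale: ask for an update.
    raise RuntimeError(f"Update the expected MD5 to {actual}")
```

With this sketch, calling `build_db` with a stale checksum either fails on invalid data or, if the data is valid, raises an error carrying the new checksum to record. Calling it again with that value takes the fast path, mirroring the one-time validation described above.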

Demo of first time with outdated value in PROJ_DB_SQL_EXPECTED_MD5:

CMake Warning at generate_proj_db.cmake:35 (message):
  all.sql.in content has changed.  Running extra validation checks when
  building proj.db...


CMake Error at generate_proj_db.cmake:43 (message):
  Update 'set(PROJ_DB_SQL_EXPECTED_MD5 ...)' line in data/CMakeLists.txt with
  9a6b21de7b18f68719acb2260c3492fb value

@rouault rouault force-pushed the fast_proj_db_build branch from a5ddd97 to 842240d Compare October 16, 2024 21:18
@rouault rouault merged commit 98974da into OSGeo:master Oct 22, 2024
23 checks passed
Labels
backport 9.5 Backport to 9.5 branch