Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactored & simplified Settings.hpp and pmtv format #402

Merged
merged 2 commits into from
Sep 5, 2024

Conversation

RalphSteinhagen
Copy link
Member

@RalphSteinhagen RalphSteinhagen commented Sep 4, 2024

Details:

  • code and templating structure (via and in response to -ftime-trace analysis)
  • replaced std::ranges::find(..) by more compile-time friendly version
  • replaced std::sort by binary-search+merge-sort
  • added non-inline to methods to avoid unnecessary compile-time optimisations
  • optimised custom pmtv formatted by reducing the std::visit(...) pattern (not yet fully eliminated.

Compile-time differences (~4% ... hoped to be more, but it's a start).

before:

real    20m21.229s
user    89m12.953s
sys     3m8.205s

this PR:

real    19m33.855s
user    85m44.586s
sys     3m35.693s

ClangAnalyser profile with this PR (based on Clang's -ftime-trace feature) in the ./build directory:

ClangBuildAnalyzer --all . gr4_all_before.bin; ClangBuildAnalyzer --analyze gr4_all_before.bin > gr4_all_before.txt

before: gr4_all_before.txt

Analyzing build trace from 'gr4_all_before.bin'...
**** Time summary:
Compilation (184 times):
  Parsing (frontend):         2265.8 s
  Codegen & opts (backend):   8065.3 s
  [..]

after: gr4_all_after.txt

Analyzing build trace from 'gr4_all_after.bin'...
**** Time summary:
Compilation (184 times):
  Parsing (frontend):         2121.1 s
  Codegen & opts (backend):   4473.3 s
  [..]

 * code and templating structure (via and in response to -ftime-trace analysis)
 * replaced std::ranges::find(..) by more compile-time friendly version
 * replaced std::sort by binary-search+merge-sort
 * added non-inline to methods to avoid unnecessary compile-time optimisations

Signed-off-by: Ralph J. Steinhagen <r.steinhagen@gsi.de>
Signed-off-by: Ralph J. Steinhagen <r.steinhagen@gsi.de>
Copy link
Contributor

@drslebedev drslebedev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Your changes look good.
I left some comments in the code for consideration, but the PR can be merged as is.

Comment on lines +539 to 549
const std::set<std::string>& currentAutoUpdateParams = _autoUpdateParameters.at(bestMatchSettingsCtx.value());
// auto notAutoUpdateView = parameters.value() | std::views::filter([&](const auto& pair) { return !currentAutoUpdateParams.contains(pair.first); });
// property_map notAutoUpdateParams(notAutoUpdateView.begin(), notAutoUpdateView.end());

// the following is more compile-time friendly
property_map notAutoUpdateParams;
for (const auto& pair : parameters.value()) {
if (!currentAutoUpdateParams.contains(pair.first)) {
notAutoUpdateParams.insert(pair);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the gain of compilation time, but we might still need to consider using ranges, where possible?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re gain: see numbers above. Unfortunately quite significant and not in favour of ranges. :-|

Comment on lines +942 to +945
std::vector<std::pair<SettingsCtx, property_map>>& sortedVectorForContext = _storedParameters[ctx.context];
// binary search and merge-sort
auto it = std::ranges::lower_bound(sortedVectorForContext, ctx.time, std::less<>{}, [](const auto& pair) { return pair.first.time; });
sortedVectorForContext.insert(it, {ctx, newParameters});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::vector::insert can be quite costly, so using std::sort might be a better option in this case. However, without concrete benchmark measurements, it's difficult to make a definitive judgment, and I don't have a strong preference either way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cost for this insert sort is lower than a regular sort because the list is already sorted. I.e. in most cases the field is added to the end. In the other it's one insert followed by a linear shift ( O = N) ... much faster than any other O = N log N sorts

@RalphSteinhagen RalphSteinhagen merged commit 63f82c7 into main Sep 5, 2024
9 of 11 checks passed
@RalphSteinhagen RalphSteinhagen deleted the pmtFormat branch September 5, 2024 12:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants