Performance improvements by providing caching for Configuration objects built from JSON configuration files #950

Merged
merged 16 commits into from
May 31, 2019

Contributor

@smgallo smgallo commented May 29, 2019

Description

The Configuration class and its subclasses, such as EtlConfiguration and XdmodConfiguration, are now used to process all XDMoD configuration files and local overrides (e.g., roles.json and files in roles.d) as well as ETL configuration files. Because these files are extremely flexible, processing them is expensive, and the overhead becomes significant when the same configuration file is required in multiple locations throughout the code to perform a given operation.

This implements an object cache so that the same global configuration file (e.g., roles.json) and local configuration files (e.g., files in roles.d) only need to be parsed once, with the resulting object stored in the cache. Subsequent accesses retrieve the object from the cache. The unique cache key is generated from the following pieces of information:

  • Fully qualified path to the global configuration file. This determines the set of global defaults and the path to the local configuration directory, if any.
  • Name of the class being instantiated. This is obtained using get_called_class() so we can distinguish between Configuration, EtlConfiguration, etc.
  • Any additional options passed to the Configuration or child classes. Options such as substitution variables will affect the generated object.
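
The keying scheme described in the list above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `CachedConfigSketch`, `EtlConfigSketch`, the `factory()` method, the `$parseCount` counter, and the `/path/to/roles.json` path are hypothetical names chosen for the example. Only the use of `get_called_class()`, `serialize()`, and `md5()` comes from the description and the diff.

```php
<?php
// Hypothetical sketch of a per-request object cache; not the XDMoD implementation.
class CachedConfigSketch
{
    // Parsed objects indexed by cache key.
    private static $objectCache = array();

    // Number of times the (expensive) parse step actually ran.
    public static $parseCount = 0;

    public $filename;
    public $options;

    protected function __construct($filename, array $options)
    {
        $this->filename = $filename;
        $this->options = $options;
        // Stand-in for reading the global file, merging local overrides, etc.
        self::$parseCount++;
    }

    public static function factory($filename, array $options = array())
    {
        // Key on the file path, the concrete class, and the options.
        $class = get_called_class();
        $cacheKey = md5($filename . '|' . $class . '|' . serialize($options));

        if ( ! isset(self::$objectCache[$cacheKey]) ) {
            self::$objectCache[$cacheKey] = new static($filename, $options);
        }
        return self::$objectCache[$cacheKey];
    }
}

// A subclass gets its own cache entries because the class name is part of the key.
class EtlConfigSketch extends CachedConfigSketch
{
}

$a = CachedConfigSketch::factory('/path/to/roles.json');
$b = CachedConfigSketch::factory('/path/to/roles.json');  // served from the cache
$c = EtlConfigSketch::factory('/path/to/roles.json');     // different class, parsed again
```

In this sketch `$a` and `$b` are the same object, while `$c` is a separate `EtlConfigSketch` instance, so the parse step runs twice for three requests.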

The result is a significant reduction in the number of times that Configuration-related methods are called.

Current Profile (8.5.0)

Usage Explorer get_data request: /controllers/user_interface.php?&operation=get_data

curl 'http://localhost:8080/controllers/user_interface.php' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Referer: https://aristotle-hub.ccr.xdmod.org/' -H 'Content-Type: application/x-www-form-urlencoded' -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Upgrade-Insecure-Requests: 1' --data 'public_user=true&scale=1&aggregation_unit=Auto&dataset_type=timeseries&thumbnail=n&query_group=tg_usage&limit=10&offset=0&log_scale=n&show_guide_lines=y&show_trend_line=n&show_error_bars=n&show_aggregate_labels=n&show_error_labels=n&hide_tooltip=false&show_title=y&legend_type=bottom_center&font_size=3&drilldowns=%5Bobject+Object%5D&none=-9999&format=png&inline=n&operation=get_data&display_type=area&combine_type=stack&realm=Cloud&width=916&height=484&group_by=resource&statistic=cloud_core_time&timeframe_label=Quarter+to+date&start_date=2016-12-01&end_date=2017-01-31&XDEBUG_PROFILE=yes' -o image.png

Longest Calls:

Percent  Time   Calls  Function
57.54    46.23  694    Configuration->processKeyTransformers()
13.03    12.87  646    Configuration->recursivelySubstituteVariables()
5.40     5.33   1,969  StripMergePrefixTransformer->keyMatches()
5.21     5.15   1,969  CommentTransformer->keyMatches()

Function list: xdmod850-vanilla get_data_function_list

Simulate a click on a Usage Tab tree node to view summary charts: /controllers/user_interface.php?&operation=get_charts

curl 'http://localhost:8080/controllers/user_interface.php' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/20100101 Firefox/66.0' -H 'Accept: */*' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Referer: http://localhost:8080/' -H 'X-Requested-With: XMLHttpRequest' -H 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Cookie: PHPSESSID=6qopa2rn6p8ho7hgaqfpek5763; xdmod_token=public-1557842538.9428-5cdaca6ae6350' --data 'public_user=true&realm=Jobs&group_by=none&start_date=2016-12-01&end_date=2017-01-31&timeframe_label=Previous%20month&aggregation_unit=Auto&width=411.0197368421052&height=246.61184210526312&scale=1&dataset_type=timeseries&thumbnail=y&query_group=tg_usage&display_type=line&combine_type=side&limit=3&offset=0&log_scale=n&show_guide_lines=y&show_trend_line=n&show_error_bars=n&show_aggregate_labels=n&show_error_labels=n&hide_tooltip=false&format=session_variable&legend_type=off&font_size=0&show_title=n&operation=get_charts&controller_module=user_interface&XDEBUG_PROFILE=yes'

Longest Calls (no cycle detection):

Percent  Time   Called  Function
58.98    47.46  10,865  Configuration->processKeyTransformers()
11.90    11.77  9,813   Configuration->recursivelySubstituteVariables()
5.42     5.35   31,872  CommentTransformer->keyMatches()
5.42     5.34   31,872  StripMergePrefixTransformer->keyMatches()
3.66     2.00   2,239   Statistic->__construct()
1.14     1.12   2,105   Query->addStatField()

Function list: xdmod850-vanilla usage_tab_tree_node_function_list

Profile with Config Object Cache

Note the significant reduction in calls to Configuration-related code. For example, calls to Configuration->processKeyTransformers() are reduced from 10,865 to 300 for the Usage Tab summary charts.

Usage Explorer get_data request: /controllers/user_interface.php?&operation=get_data

Longest Calls:

Percent  Time   Calls  Function
41.67    33.64  300    Configuration->processKeyTransformers()
10.36    10.26  273    Configuration->recursivelySubstituteVariables()
5.40     3.96   880    StripMergePrefixTransformer->keyMatches()
5.21     3.57   880    CommentTransformer->keyMatches()

Function list: xdmod850-configcache get_data_function_list

Simulate a click on a Usage Tab tree node to view summary charts: /controllers/user_interface.php?&operation=get_charts

Longest Calls (no cycle detection):

Percent  Time   Called  Function
15.09    8.12   3,081   Query\Statistic->__construct()
5.43     5.36   8,210   Common\Identity->__toString()
4.69     4.61   2,895   Query\Query->getStatField()
5.88     3.82   3,633   Query\Query::get_statistic_name_to_class_name()
3.69     3.63   6,550   Common\Identity->__construct()
...
4.17     3.34   300     Configuration->processKeyTransformers()
0.81     0.80   273     Configuration->recursivelySubstituteVariables()

Function list: xdmod850-configcache usage_tab_tree_node_function_list

Motivation and Context

Performance improvements.

Tests performed

Ran tests locally and via shippable.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project as found in the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@smgallo smgallo added the enhancement and Category:General labels May 29, 2019
@smgallo smgallo added this to the 8.5.0 milestone May 29, 2019
@jpwhite4
Member

I was curious why the keyMatches() function of the Comment and StripMergePrefix transformers showed up in the profile but the keyMatches() calls from the other transformers did not. The two functions use different ways of checking whether the key starts with a specific character:

return ( 0 === strpos($key, self::COMMENT_PREFIX) );

and

return substr($key, 0, 1) === self::MERGE_PREFIX;

Certainly the strpos() call is not ideal, since it has to scan the whole string if the string does not contain a comment character (i.e., most of the time). In my tests the following was about twice as fast:

return $key[0] === self::COMMENT_PREFIX;
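
For comparison, the three prefix checks discussed here can be written side by side. This is a minimal sketch: the function names and the `PREFIX_SKETCH` constant (assumed to be a single character, `#` in this example) are hypothetical, whereas the real transformers use their own class constants. It makes no timing claim and only shows that the checks agree on string keys.

```php
<?php
// Assumed single-character prefix, for illustration only.
const PREFIX_SKETCH = '#';

// Searches for the prefix anywhere in the key, then checks it was found at offset 0.
function startsWithStrpos($key)
{
    return 0 === strpos($key, PREFIX_SKETCH);
}

// Extracts the first character as a new string before comparing.
function startsWithSubstr($key)
{
    return substr($key, 0, 1) === PREFIX_SKETCH;
}

// Reads the first byte directly. The empty-string guard avoids an
// "Uninitialized string offset" notice or warning on an empty key.
function startsWithIndex($key)
{
    return $key !== '' && $key[0] === PREFIX_SKETCH;
}
```

All three return the same result for string keys such as `'#comment'`, `'key'`, and `'a#b'`. The indexed form assumes the key is a string, and without the guard it would emit a notice or warning for an empty key.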

@jpwhite4
Member

I also wonder if there is any benefit to changing the recursivelySubstituteVariables() function to only call variableStore->substitute() if the string could contain a variable. Something like:

 if ( is_string($value) ) {
       // strpos() returns false (never null) when the marker is absent
       if (strpos($value, '${') !== false) {
                $value = $this->variableStore->substitute($value);
       }
 } ...

@smgallo
Contributor Author

smgallo commented May 29, 2019

The original idea for comments was to treat any key whose first non-whitespace character is the comment character as a comment, but it turns out that we don't use it that way, so it makes sense to compare only the first character instead of using strpos().

@smgallo
Contributor Author

smgallo commented May 29, 2019

I think that we can gain some performance improvement if we add the strpos($value, '${') !== false check right at the start of VariableStore::substitute(). That will keep all of the substitution machinery in one place.
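
A minimal sketch of what such an early return might look like. `VariableStoreSketch` is a hypothetical stand-in and not the real VariableStore class; the `${NAME}` variable syntax is inferred from the `'${'` marker in the comment, and the `str_replace()` slow path is an assumption made only so the example runs.

```php
<?php
// Hypothetical stand-in for VariableStore; only the early return mirrors the suggestion.
class VariableStoreSketch
{
    private $variables;

    public function __construct(array $variables)
    {
        $this->variables = $variables;
    }

    public function substitute($value)
    {
        // Fast path: without the "${" marker there is nothing to substitute.
        // strpos() returns false when the marker is absent and 0 when it is at
        // the start of the string, so the comparison must be strict.
        if ( false === strpos($value, '${') ) {
            return $value;
        }

        // Slow path (assumed): replace each ${NAME} with its stored value.
        $search = array();
        $replace = array();
        foreach ( $this->variables as $name => $replacement ) {
            $search[] = '${' . $name . '}';
            $replace[] = $replacement;
        }
        return str_replace($search, $replace, $value);
    }
}

$store = new VariableStoreSketch(array('START_DATE' => '2016-12-01'));
$unchanged = $store->substitute('no variables here');
$expanded = $store->substitute('date >= ${START_DATE}');
```

Placing the check inside `substitute()` means every caller benefits from it without having to repeat the test.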

// object. We use serialize() instead of json_encode() because the latter only takes
// into account public member variables and we have more complex objects that can be
// passed as options such as VariableStore.
$cacheKey = md5($filename . '|' . get_called_class() . '|' . serialize($options));
Member

If we ever have a hash collision then it will be a pain in the neck to try to debug the problem!

Contributor Author

True. Although I can generate the key leaving the filename and class as plain text and calculating the md5 only on the serialized object.
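
One way that alternative key could look, as a hedged sketch: `buildCacheKeySketch()` is a hypothetical helper and the `|` separator is simply carried over from the line under review. Leaving the filename and class name readable means a dump of the cache keys shows which file and class each entry belongs to, while only the options are hashed.

```php
<?php
// Hypothetical helper: readable filename and class, hash only for the options.
function buildCacheKeySketch($filename, $class, array $options)
{
    return $filename . '|' . $class . '|' . md5(serialize($options));
}

$keyA = buildCacheKeySketch('/path/to/roles.json', 'Configuration', array());
$keyB = buildCacheKeySketch('/path/to/roles.json', 'Configuration', array('flag' => true));
```

Here `$keyA` and `$keyB` share the same readable prefix and differ only in the trailing 32-character hash of the options.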

@smgallo
Contributor Author

smgallo commented May 30, 2019

@jpwhite4 For whatever reason, Xdebug or cachegrind appears to be grouping some of the calls to keyMatches() together. The sum seems reasonable, but there are 4 classes in use, not 2 as reported.

@smgallo
Contributor Author

smgallo commented May 30, 2019

@jpwhite4 I saw significant relative performance improvements from using return $key[0] === self::COMMENT_PREFIX; and from returning early from $this->variableStore->substitute($value) when no variables are found. There are other potential performance gains as well.

jpwhite4 previously approved these changes May 30, 2019
@smgallo
Contributor Author

smgallo commented May 30, 2019

@jpwhite4 Can you re-approve this PR? I found and fixed a bug and made another performance improvement.

@smgallo smgallo merged commit 4e4fe89 into ubccr:xdmod8.5 May 31, 2019
@smgallo smgallo deleted the config/enable-caching branch May 31, 2019 13:45