Investigate performance of config loading for big projects #3893
Comments
I'd like to see us add a CLI command which users can run to produce a flamegraph. It would massively reduce the guesswork here.
|
@datajoely flamegraph for the entire pipeline run (how much time each node takes) or just the config resolution / pipeline initialization? |
In my mind, it would run the whole command as normal, but also generate the profiling data. Perhaps, if we were to take this seriously, a full-on memray integration would be incredible. |
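For illustration, a minimal sketch of what the core of such a command could look like, assuming pyinstrument and a 0.19-style Kedro project; `profile_run` and the output file name are made up, not an existing Kedro CLI command. A memory flame graph could be produced similarly by wrapping the run in `memray.Tracker`.

```python
# Hypothetical sketch: run the pipeline as normal while recording a profile.
# profile_run and the output path are illustrative, not an existing Kedro command.
from pathlib import Path

from pyinstrument import Profiler

from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project


def profile_run(project_path: str = ".", output: str = "kedro_profile.html") -> None:
    bootstrap_project(Path(project_path))
    profiler = Profiler()
    profiler.start()
    try:
        with KedroSession.create(project_path=project_path) as session:
            session.run()
    finally:
        profiler.stop()
        # Open the HTML report in a browser for a flame-chart view of the run.
        Path(output).write_text(profiler.output_html())


if __name__ == "__main__":
    profile_run()
```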
Continuing the discussion on creating custom commands in #3908. |
Originally posted by @idanov in #3732 (comment): "The solution works, but …"

From the discussion in the PR:
- There were a few thumbs up to the idea, and it was brought up again in #3973 (@datajoely please do confirm that this is what you had in mind 😄).
- @merelcht pointed out that there's a pending research item on how users use parameters and for what (#2240).
- @ElenaKhaustova agreed that this is relevant in the context of the ongoing …

Ideally, if there's a way we can tackle this issue without blocking it on #2240, the time to look at it would be now. But I have very little visibility on what the implications are, or whether we would actually solve the performance problem at all. So, leaving the decision to the team. |
Would you really call this coupling? The way I read it is that it uses omegaconf to parse the parameters config. We already have a dependency on omegaconf anyway, and I actually quite like that we can leverage it in more places than just the … |
Sorry to keep moving the conversation, but I'd rather not discuss the specifics of a particular solution outside the corresponding PR; I addressed your question in context at #3732 (comment). |
Now that we're working on this, some goals for this ticket:
The outcomes should be …
Hopefully, by the end of the analysis we should either have …
|
@astrojuanlu My worry is that we will likely find nothing actionable unless there is a project that is actually slow with OmegaConfigLoader. Judging by the benchmark results, it seems reasonably fast. We can still do the exponential scaling (not necessarily a combination of all) to better understand the performance of the config loader (this should probably move into the benchmarks once done):
The result of this should be a table (one axis being the thing being tested, the other axis the number of entries) + profiling. |
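A minimal sketch of what such a scaling experiment could look like, assuming a recent Kedro where `OmegaConfigLoader` accepts the `conf_source`, `base_env`, and `default_run_env` keyword arguments; the catalog entry template and entry counts are illustrative, not the actual benchmark scripts.

```python
# Illustrative scaling experiment: generate a catalog with N entries, time
# OmegaConfigLoader, and collect one row per N.
import time
from pathlib import Path
from tempfile import TemporaryDirectory

from kedro.config import OmegaConfigLoader


def make_catalog(conf_dir: Path, n_entries: int) -> None:
    """Write conf/base/catalog.yml with n_entries simple dataset entries."""
    (conf_dir / "base").mkdir(parents=True, exist_ok=True)
    entries = "\n".join(
        f"dataset_{i}:\n  type: pandas.CSVDataset\n  filepath: data/{i}.csv"
        for i in range(n_entries)
    )
    (conf_dir / "base" / "catalog.yml").write_text(entries)


results = []
for n in (100, 1_000, 10_000):
    with TemporaryDirectory() as tmp:
        conf_dir = Path(tmp) / "conf"
        make_catalog(conf_dir, n)
        loader = OmegaConfigLoader(
            conf_source=str(conf_dir), base_env="base", default_run_env="base"
        )
        start = time.perf_counter()
        loader["catalog"]
        results.append((n, time.perf_counter() - start))

# One axis: number of entries; other axis: wall-clock loading time.
for n, seconds in results:
    print(f"{n:>6} entries: {seconds:.3f} s")
```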
I had started working on something for fun a few months ago to solve this (potential) problem. So if you can find cases where … |
Did someone hear a 🦀 walking? |
Hi Team, as suggested by @astrojuanlu and @noklam, I tried creating stress test scripts and analyzing how OmegaConfigLoader scales. You can find the test scripts here under …

Machine used: …

1. Single Catalog file with increasing variable interpolations

Line Profiler - kernprof -lvr --unit 1 ocl_plot_variables.py
```
Total time: 265.218 s
File: /KedroOrg/kedro/kedro/config/omegaconf_config.py
Function: load_and_merge_dir_config at line 272

Line #  Hits    Time  Per Hit  % Time  Line Contents
   326    16   103.0      6.4    38.8  config = OmegaConf.load(tmp_fo)
   353    32   134.1      4.2    50.6  for k, v in OmegaConf.to_container(
   354    16    28.0      1.8    10.6      OmegaConf.merge(*aggregate_config), resolve=True
```

2. Single Catalog file without variable interpolations

Line profiler - kernprof -lvr --unit 1 ocl_plot_datasets.py
```
Total time: 50.2196 s
File: /KedroOrg/kedro/kedro/config/omegaconf_config.py
Function: load_and_merge_dir_config at line 272

Line #  Hits    Time  Per Hit  % Time  Line Contents
   326    16    37.6      2.3    74.9  config = OmegaConf.load(tmp_fo)
   353    32     2.4      0.1     4.8  for k, v in OmegaConf.to_container(
   354    16    10.1      0.6    20.1      OmegaConf.merge(*aggregate_config), resolve=True
```

3. Multiple catalog files following …

Line Profiler - kernprof -lvr --unit 1 ocl_plot_multifile.py
```
Total time: 106.144 s
File: /KedroOrg/kedro/kedro/config/omegaconf_config.py
Function: load_and_merge_dir_config at line 272

Line #  Hits    Time  Per Hit  % Time  Line Contents
   322  3615    16.8      0.0    15.8  with self._fs.open(str(config_filepath.as_posix())) as open_config:
   326  3615    58.3      0.0    54.9  config = OmegaConf.load(tmp_fo)
   354    10    19.7      2.0    18.5      OmegaConf.merge(*aggregate_config), resolve=True
```

Summary: the lines above are where most of the time goes when resolving the catalog. All of them are part of the load_and_merge_dir_config function, and all of them come from the OmegaConf module, which we use under the hood. So, based on the above analysis, we could try alternatives to OmegaConf for better performance (I am not sure there are any better alternatives; I found Hydra, which again uses OmegaConf under the hood). Thank you! |
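For reference, per-line numbers like the ones above can be reproduced without editing Kedro's source by registering the method with line_profiler explicitly; a sketch assuming line_profiler is installed (the `conf` path is a placeholder, and this is not the actual `ocl_plot_*` setup).

```python
# Register load_and_merge_dir_config with line_profiler, load the catalog once,
# then print per-line timings in seconds (like `kernprof --unit 1`).
from line_profiler import LineProfiler

from kedro.config import OmegaConfigLoader

profiler = LineProfiler()
profiler.add_function(OmegaConfigLoader.load_and_merge_dir_config)

loader = OmegaConfigLoader(conf_source="conf", base_env="base", default_run_env="base")
profiler.enable_by_count()
loader["catalog"]
profiler.disable_by_count()
profiler.print_stats(output_unit=1)  # report times in seconds
```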
I'm not sure if it's worth the engineering overhead, but it's cool. |
Some more on the same theme: https://github.com/leptonyu/cfg-rs |
Thanks a lot for the analysis @ravi-kumar-pilla, it looks good. Looks like everything scales linearly. We're still waiting for feedback from a user who was struggling with high latency. |
Thanks @ravi-kumar-pilla, this aligns with the results here: https://kedro-org.github.io/kedro-benchmark-results/#benchmark_ocl.TimeOmegaConfigLoader.time_loading_catalog This looks reasonably fast (1,000 datasets in 1 second). Let's wait for the feedback from the user. |
Great analysis, thanks @ravi-kumar-pilla! Are there by any chance any parts of the Kedro code that are also introducing latency, or is it only the …? And for number 3 … |
I'm also curious to hear how you think we can add this kind of profiling to the QA/benchmarking tests @ravi-kumar-pilla ? And did you find the newly added benchmarking setup useful for this kind of testing at all? |
I did use pyinstrument and speedscope to analyze the overall behavior. Most of the time was spent on …
Number 3 was without variable interpolations (I think it would take more time with variable interpolations based on the individual test).
Yes, 100 catalog files is a crazy simulation which might not happen in real projects. I think the overall behavior of OmegaConf was reasonable. Thank you |
The benchmark setup does show a similar performance trajectory and is useful for testing these cases. We can definitely iterate and add more use cases for generating the catalog. |
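A minimal sketch of the kind of pyinstrument/speedscope capture described above, assuming pyinstrument >= 4.3 for the speedscope renderer; the `conf` path is a placeholder and the exact scripts used differ.

```python
# Capture a speedscope-compatible profile of catalog loading; the resulting JSON
# can be opened at https://www.speedscope.app for a flame-chart view.
from pathlib import Path

from pyinstrument import Profiler
from pyinstrument.renderers import SpeedscopeRenderer

from kedro.config import OmegaConfigLoader

profiler = Profiler()
with profiler:
    loader = OmegaConfigLoader(conf_source="conf", base_env="base", default_run_env="base")
    loader["catalog"]

Path("ocl_profile.speedscope.json").write_text(
    profiler.output(renderer=SpeedscopeRenderer())
)
```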
Hi Team, based on the materials we received, it is evident that the bottleneck is OmegaConf usage (…). These observations are in line with the above benchmark test plots. |
Closing this as the investigation is completed. Opened a follow-up issue to improve the time taken by OmegaConf in resolving global interpolations here. |
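To illustrate what "resolving global interpolations" refers to, a rough comparison sketch: the same generated catalog with and without `${globals:...}` references, assuming a Kedro version whose OmegaConfigLoader supports the globals resolver; entry counts and layout are made up, not taken from the follow-up issue.

```python
# Compare catalog loading time with and without ${globals:...} interpolations.
import time
from pathlib import Path
from tempfile import mkdtemp

from kedro.config import OmegaConfigLoader


def build_conf(use_globals: bool) -> Path:
    conf = Path(mkdtemp()) / "conf"
    (conf / "base").mkdir(parents=True)
    prefix = "${globals:data_root}" if use_globals else "data/01_raw"
    entries = "\n".join(
        f"ds_{i}:\n  type: pandas.CSVDataset\n  filepath: {prefix}/file_{i}.csv"
        for i in range(5_000)
    )
    (conf / "base" / "catalog.yml").write_text(entries)
    (conf / "base" / "globals.yml").write_text("data_root: data/01_raw")
    return conf


for use_globals in (False, True):
    loader = OmegaConfigLoader(
        conf_source=str(build_conf(use_globals)), base_env="base", default_run_env="base"
    )
    start = time.perf_counter()
    loader["catalog"]
    print(f"globals interpolation={use_globals}: {time.perf_counter() - start:.2f} s")
```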
Description
Earlier this week a user reached out to me in private saying that it was taking 3 minutes for Kedro to load their configuration (KedroContext._get_catalog). Today another user mentioned that "Looking at the logs, it gets stuck at the kedro.config module for more than 50% of the pipeline run duration, but we do have a lot of inputs and outputs".
I still don't have specific reproducers, but I'm noticing enough qualitative evidence to open an issue about it.
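For anyone who wants to contribute a reproducer, a rough way to measure this on a project, assuming a 0.19-style Kedro session; the timing code is illustrative, only the `KedroContext._get_catalog` path comes from the report above.

```python
# Run from the project root: time how long building the catalog takes.
import time
from pathlib import Path

from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

bootstrap_project(Path.cwd())
with KedroSession.create() as session:
    context = session.load_context()
    start = time.perf_counter()
    catalog = context.catalog  # catalog config loading happens here, via KedroContext._get_catalog
    print(f"Catalog construction took {time.perf_counter() - start:.1f} s")
```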