Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[spike] Improve Kedro CLI startup time #1476

Closed
noklam opened this issue Apr 25, 2022 · 37 comments · Fixed by #3883
Closed

[spike] Improve Kedro CLI startup time #1476

noklam opened this issue Apr 25, 2022 · 37 comments · Fixed by #3883
Assignees
Labels
Component: CLI Issue/PR that addresses the CLI for Kedro Issue: Feature Request New feature or improvement to existing feature

Comments

@noklam
Copy link
Contributor

noklam commented Apr 25, 2022

Description

Currently, Kedro CLI startup time can be slow even if very little work is done, for example, doing kedro to show the available command or help can take a few seconds. These commands should feel instant since it is just printing out some help text without any heavy work.

Useful tool to profile import time

  1. The kedro command itself takes 0.7s, which isn't fast but still an acceptable response time. Initial investigation suggests that it can take few seconds mainly due to a slow import during plugin registration.

Kedro startup time with plugin installed

(0.7s, removed screenshot)

Kedro startup time with no plugin installed (telemetry + viz)

image

Causes:

  1. Slow import from plugin
  2. Slow import from kedro itself
  3. Slow import from pkg_resource (something that is run with console_script)

Context

Why is this change important to you? How would you use it? How can it benefit other users?

The Kedro CLI can be slow sometimes and lead to a bad user experience. It should be the most common way that users interact with kedro.

Possible Implementation

(Optional) Suggest an idea for implementing the addition or change.
The first improvement that can be done is to do lazy loading in plugin registration to address (1). (2) & (3) will requires more time to study.

Possible Alternatives

(Optional) Describe any alternative solutions or features you've considered.
Compile some of the kedro CLI, so it prints out help message without actually loading any python code.

e.g.
kedro -> Load some pre-compiled text files and print, it may be tricky with plugins though.

@noklam noklam added Issue: Feature Request New feature or improvement to existing feature Component: CLI Issue/PR that addresses the CLI for Kedro labels Apr 25, 2022
@merelcht
Copy link
Member

merelcht commented May 4, 2022

Some older issues about this topic: https://github.com/quantumblacklabs/private-kedro/issues/930 & https://github.com/quantumblacklabs/private-kedro/pull/537

@znfgnu
Copy link

znfgnu commented Dec 8, 2022

I created PR that should solve this problem in kedro-org/kedro-viz#1196 - any help with making tests passed would be appreciated

@astrojuanlu
Copy link
Member

(From #3033)

Still an issue. For example, kedro info on a spaceflights project takes a bit more than 6 seconds on my machine (M1 MacBook Pro from 2022):

$ time kedro info
 _            _
| | _____  __| |_ __ ___
| |/ / _ \/ _` | '__/ _ \
|   <  __/ (_| | | | (_) |
|_|\_\___|\__,_|_|  \___/
v0.18.13

Kedro is a Python framework for
creating reproducible, maintainable
and modular data science code.

Installed plugins:
kedro_telemetry: 0.2.5 (entry points:cli_hooks,hooks)
kedro_viz: 6.5.0 (entry points:global,hooks,line_magic)

________________________________________________________
Executed in    6.27 secs    fish           external
   usr time    1.15 secs    0.26 millis    1.15 secs
   sys time    0.67 secs    2.65 millis    0.67 secs

After warmup (i.e. executing the command again immediately after) this can go down to ~1 second.

Ahead of this I did some profiling using https://github.com/joerick/pyinstrument and https://www.speedscope.app/, and this is what I found:

image

For kedro info with instrumentation on and no warmup, 7.5 seconds were spent importing the various kedro.framework.cli.* subpackages, and 1.7 seconds actually executing the function.

@astrojuanlu
Copy link
Member

Some users are puzzled that kedro gets stuck and sometimes it's difficult to tell whether it's just slow https://linen-slack.kedro.org/t/16630828/i-m-encountering-issues-when-running-the-load-ext-kedro-ipyt#3758284a-34fb-435c-b054-dd3713cd78c9

Today with @rashidakanchwala kedro viz run took like 30 seconds on my M1 Mac.

@astrojuanlu
Copy link
Member

I don't know what happened lately but Kedro has got unacceptably slow. I'm raising the priority of this.

@astrojuanlu
Copy link
Member

❯ time kedro registry list
- __default__
- data_processing
- data_science
- reporting


________________________________________________________
Executed in   24.37 secs    fish           external
   usr time    5.08 secs  245.00 micros    5.08 secs
   sys time    4.95 secs  593.00 micros    4.95 secs

❯ time kedro registry list
- __default__
- data_processing
- data_science
- reporting


________________________________________________________
Executed in    3.72 secs    fish           external
   usr time    2.82 secs    0.24 millis    2.82 secs
   sys time    4.56 secs    1.33 millis    4.56 secs

On my computer, the first time it takes 24 seconds to run kedro registry list, second time it's 3 seconds.

As a first step, we may need a better understanding of how slow Kedro CLI startup time is, and why. Some questions I have:

  • Are there any commands that are slower? (Looks like kedro info is faster than kedro registry list)
  • What's the effect of having plugins? Specifically, what's the effect of having/not having kedro-viz, kedro-datasets?
  • All other things equal, are there significant performance differences in different operating systems?
    • The "all other things equal" might be difficult to test due to differences in hardware
    • Still, it's important to understand because different systems will do different smart things for caching (see the 24 vs 3 second difference)

Then as a next step, we'd need an assessment on what can we do about this. Ideally it shouldn't take this long to run seemingly simple commands, even with a cold cache. Can we aim for a combination of slashing the time in ~half + showing something during those initial seconds so the user knows that it didn't get stuck?

@merelcht merelcht changed the title Improve Kedro CLI startup time [spike] Improve Kedro CLI startup time May 13, 2024
@ankatiyar ankatiyar moved this from To Do to In Progress in Kedro Framework May 15, 2024
@astrojuanlu
Copy link
Member

I don't know if the architecture of pluggy or kedro allows this, but if we don't expect plugins to be able to add pipelines to the registry, maybe we should try to delay the plugin discovery process as much as possible. If one runs kedro registry list or kedro info in principle that should already be enough information to not load the CLI extensions, right?

@ankatiyar
Copy link
Contributor

ankatiyar commented May 30, 2024

Update re: Lazy Loading of Plugins

TLDR: It's possible to lazy load plugins and their commands on Kedro side

Prototype PR: #3901

Profile of regular kedro registry list

image

Updated proposal

Note

this time I have kedro-mlflow + kedro-airflow + kedro-telemetry+kedro-viz installed in the system for the below profiles. Also have consented to collection of kedro-telemetry data.

Lazy loading of plugins and their commands

In #3901 -

  • We don't load the entry points of all the plugins eagerly, just get the entry points and pass them to CommandCollection. We already use a custom version of click's CommandCollection -
    class CommandCollection(click.CommandCollection):
  • If a command is not in existing list of commands CommandCollection.list_commands(), add plugins till it is.
  • Execute the command

Profile of kedro registry list with lazy loading of plugins

image

Send telemetry info in after_command_run [Out of scope]

See: kedro-org/kedro-plugins#709
As mentioned in #1476 (comment), this would not reduce the TOTAL time taken by a command but will reduce the apparent time taken by the command.

Profile of kedro registry list with lazy loading of plugins + telemetry after_command_run

image

Profile data

kedro registry list regular with lazy plugin loading lazy + telemetry after
initialisation of CLI module 858ms 576ms 553
KedroCLI init 4.11s 270ms 287ms
sending telemetry 1s 0.8s 1s
time before the main code runs ~6s ~1.7s ~0.8s
total 6.96s 3.76s 3.8s
kedro info regular with lazy plugin loading lazy + telemetry after
initialisation of CLI module 860ms 737ms 846ms
KedroCLI init 3.97s 292ms 304ms
sending telemetry 1.05s 880ms 858ms
time before the main code runs ~5.7s ~1.9s ~0.8s
total 5.99s 2.03s 2.12s

Things to note

  • It might take same or more time to run plugin commands - there might be a better way to load only the plugin for which the command we're running than just iterating through the entrypoints till we load the plugin which has the command
  • There'll also be some delay when the command is wrong or doesn't exist. For eg kedro infroo will take a while before the error message is displayed because we'll spend some time loading plugins. I don't think this is too bad since we eagerly load these commands already right now.
  • We could still consider combining this with the LazyGroup strategy in Refactor CLI with lazy subcommands and deferring imports kedro-viz#1920 and Lazy load click subcommands #3883 or just look into improving the CLI setup on kedro-viz in general. Kedro viz could be sped up a little bit by delaying imports at least 🤔

@ankatiyar ankatiyar moved this from In Progress to In Review in Kedro Framework May 30, 2024
@datajoely
Copy link
Contributor

I'm amazed kedro-telemetry is so chunky, we should absolutely defer it till after the important stuff, but I wonder if async is sensible here too.

@astrojuanlu
Copy link
Member

but I wonder if async is sensible here too.

Yeah in this case it would make sense to make it async, but... pytest-dev/pluggy#320

@ankatiyar
Copy link
Contributor

ankatiyar commented May 30, 2024

I'm amazed kedro-telemetry is so chunky, we should absolutely defer it till after the important stuff, but I wonder if async is sensible here too.

I was just testing the switch to after_command_run hook on kedro-telemetry, seems like the hook doesn't get executed when there is an error in the pipeline. (cc @noklam and @DimedS also because they asked before)
I think async would be the way to go

I'll create a separate issue for kedro-telemetry stuff and we can focus on the plugin loading stuff here.

@noklam
Copy link
Contributor Author

noklam commented May 30, 2024

after_command_run
https://github.com/kedro-org/kedro-plugins/blob/064c5d9bc491a3f96f4341a2c2d1f034b7b354cb/kedro-telemetry/kedro_telemetry/plugin.py#L185

We still have the project hook that send telemetry immediately after. We should combine the two hooks to avoid checking consent twice as well.

@astrojuanlu
Copy link
Member

For now, shall we then proceed with

? How does kedro-org/kedro-viz#1920 interact or conflict with #3901 ?

@ankatiyar
Copy link
Contributor

For now, shall we then proceed with

? How does kedro-org/kedro-viz#1920 interact or conflict with #3901 ?

These solutions can be used together of course,

I think it's definitely worth refactoring CLI code on kedro-viz side as there is a lot of heavy imports in their cli.py

There's another thing to note -
Right now, if you do kedro --help, you also see all plugin commands:
Screenshot 2024-06-03 at 13 53 37

After lazy plugin loading (i.e after #3901), we lose the plugin commands from the top-level help text:
Screenshot 2024-06-03 at 13 53 29

But you can always do kedro airflow --help or kedro mlflow --help

@astrojuanlu
Copy link
Member

Good point about losing visibility of the full list in kedro --help. Wondering if it's any way to retain both.

@znfgnu
Copy link

znfgnu commented Jun 6, 2024

Are there any contraindications to:

  • start kedro process in the background,
  • make cli communicate with it,
  • handle cli commands by forking the process?

This way, once loaded, the process with libraries would be ready to fork, unless stopped using cli command.

@noklam
Copy link
Contributor Author

noklam commented Jun 6, 2024

@znfgnu IPython essentially?

@datajoely
Copy link
Contributor

I think - and correct me if I'm wrong - the suggestion is about creating a long living kedro process and then a client process expose the CLI and can interact with it without the start-up penalty

@noklam
Copy link
Contributor Author

noklam commented Jun 7, 2024

Would the user need to do anything differently? If we can make this almost invisible to the users then it's great.

Is there any examples of Python base CLI doing this?

@astrojuanlu
Copy link
Member

The current notion of Kedro is that it's a library and a framework. I don't think we should underestimate the consequences of making it a service, in terms of software architecture, reusability, perception etc. Feel free to open a new discussion https://github.com/kedro-org/kedro/discussions/

@ankatiyar
Copy link
Contributor

Discussed in Tech Design on 19th June:

Two approaches discussed:

  1. Make subcommands of Kedro and Kedro-viz lazy, described in this command and prototypes in Refactor CLI with lazy subcommands and deferring imports kedro-viz#1920 and Lazy load click subcommands #3883
  2. Make loading of plugins lazy, described in this comment and prototype in Lazy load commands from plugins #3901

The drawbacks of second approach:

Decision: To go with approach 1 for now - lazy subcommands on Kedro and Kedro Viz with delayed imports, and make sure that the help texts still show up.

@astrojuanlu
Copy link
Member

Dropping this here as a interesting historical reference.

Consider the example of a Python command line program (CLI) with a number of subcommands. Each subcommand may perform different tasks, requiring the import of different dependencies. But a given invocation of the program will only execute a single subcommand, or possibly none (i.e. if just --help usage info is requested). Top-level eager imports in such a program will result in the import of many modules that will never be used at all; the time spent (possibly compiling and) executing these modules is pure waste.

To improve startup time, some large Python CLIs make imports lazy by manually placing imports inline into functions to delay imports of expensive subsystems. This manual approach is labor-intensive and fragile; one misplaced import or refactor can easily undo painstaking optimization work.

https://peps.python.org/pep-0690/#motivation (rejected)

This is just a hard problem in Python in general.

@astrojuanlu
Copy link
Member

Any follow-up tasks here @ankatiyar? Should we then encourage plugins to make their CLI groups lazy, do we need any documentation updates, are we going with the after_command_run for kedro-telemetry?

@noklam
Copy link
Contributor Author

noklam commented Jul 11, 2024

kedro-org/kedro-plugins#707 Same question as above, are we still planning to move telemetry to after_xxx_run. If not what's the reason?

@astrojuanlu
Copy link
Member

Thanks, I had lost track of that PR. The other questions stand

@ankatiyar
Copy link
Contributor

@noklam @astrojuanlu
We have the following follow up actions -

Both are in current sprints of framework and viz respectively.

Not sure about docs, maybe we can add a recommended way to define CLI commands i.e. with LazyGroup here - https://docs.kedro.org/en/stable/extend_kedro/plugins.html

@astrojuanlu
Copy link
Member

maybe we can add a recommended way to define CLI commands i.e. with LazyGroup here - https://docs.kedro.org/en/stable/extend_kedro/plugins.html

yes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: CLI Issue/PR that addresses the CLI for Kedro Issue: Feature Request New feature or improvement to existing feature
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

8 participants