Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimisation of CLI startup time #272

Closed
wants to merge 9 commits into from
Closed

Conversation

ajjackson
Copy link

@ajjackson ajjackson commented Aug 13, 2024

Fixes #267

Had to do some nasty things here to get rid of the import overheads, but this seems to get the CLI to just-about-tolerable responsiveness. Lets see what can be done to keep this performance without hurting maintainability.

A fair bit of the remaining overhead could be typer itself 🤷

If you would like to join in, my workhorse command for profiling data was

PYTHONPROFILEIMPORTTIME=1 janus --help 2> import_perf.log

@ajjackson
Copy link
Author

ajjackson commented Aug 19, 2024

Having read up a bit more on python annotation progress, it would be better to:

  • gate "unnecessary" import for type annotation with if typing.TYPE_CHECKING instead of #type: comments.
  • from __future__ import annotations will handle the forward- (or otherwise unresolvable) references without needing to wrap them in quote marks

See
https://docs.python.org/3/library/__future__.html#module-contents
https://peps.python.org/pep-0649/

After this the only remaining doc problem (on local sphinx build...)
was caused by an incorrect import, so this is also fixed.
Harmless-looking whitespace changes from

  ruff check --fix --unsafe-fixes
@ajjackson
Copy link
Author

Because this touches a lot of files ruff wants to change all the type hints in those files from Union/Optional syntax to use | instead. I don't mind this, but it horribly clutters the diff so should be a separate PR.

@ajjackson ajjackson marked this pull request as ready for review August 21, 2024 09:50
@ajjackson ajjackson changed the title WIP: optimisation of CLI startup time Optimisation of CLI startup time Aug 21, 2024
@ajjackson
Copy link
Author

It's not ideal that so many import statements are moved inside functions, but this does seem to help. I think this is a good point to do some code review!

Copy link
Member

@ElliottKasoar ElliottKasoar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, it's looking good, and there does seem to be a noticeable speed up.

It's a shame lazy importing e.g. https://peps.python.org/pep-0690/ hasn't gone anywhere.

Did you see discussions e.g. fastapi/typer#231 and (inspired by that issue) initialcommit-com/git-sim#77?

It definitely looks like there are limitations from typer itself, but there also seem to be suggestions that go a bit further to ensure typer only gets the minimum it needs, which may be worth looking into.

I'm less sure about the TYPE_CHECKING changes, particularly when the same import is needed elsewhere, as it'll become difficult to know where to expect imports, whereas at least the the CLI modules the changes are fairly consistent and predictable.

@@ -6,7 +6,6 @@
from typing import Any, Optional, get_args

from ase import Atoms
from ase.io import read
Copy link
Member

@ElliottKasoar ElliottKasoar Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may matter not too much if we merge #283, but is there a reason for to move this import specifically, but nothing else in the calculation modules?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only that it was quite slow! It may well be that a few other things could be moved for consistency.

@oerc0122
Copy link
Collaborator

@ElliottKasoar
Copy link
Member

How would lazy imports fare? https://docs.python.org/3/library/importlib.html#implementing-lazy-imports

The LazyLoader is mentioned in the PEP 690 justification, which suggests it's not ideal, but may be workable:

The Python standard library already includes built-in support for lazy imports, via importlib.util.LazyLoader. There are also third-party packages such as demandimport. These provide a “lazy module object” which delays its own import until first attribute access. This is not sufficient to make all imports lazy: imports such as from foo import a, b will still eagerly import the module foo since they immediately access an attribute from it. It also imposes noticeable runtime overhead on every module attribute access, since it requires a Python-level getattr or getattribute implementation.

@ajjackson
Copy link
Author

Yeah, it seems like in most cases we do access a module attribute so it wouldn't help with much.

@ajjackson
Copy link
Author

What is the status of this, then? Do we try to make this approach work? The imports do seem to be the main problem, so one way or another they have to be deferred in order to have a good time using Janus with completion.

An alternative approach would be to add a whole extra abstraction layer, so the Typer application interfaces are in their own module file(s) and only at runtime do they import the actual application functions from another module. This module could then have more "normal"-looking imports at the top of its file.

It would make for a messier project directory but cleaner code files...

@ElliottKasoar
Copy link
Member

Sorry, I've been away most of the month, and if possible want to look into potential solutions myself too.

I definitely agree it is unreasonably slow at the moment, and I agree some form of delaying imports seems to be necessary.

I'd probably lean towards tidier code but messier directories, within reason. At least the form of the directories would be static, as opposed to imports in arbitrary places within the code.

@ajjackson
Copy link
Author

Ok! Maybe you can have a go at implementing that version in a fresh branch, and then we can do a speed-test to see if both branches deliver similar performance?

@ElliottKasoar
Copy link
Member

Could you take a look at #330 and see what you think, @ajjackson?

As I mention in the PR, having worked through the options I think I've come to pretty similar conclusions as you did for this PR.

@ElliottKasoar
Copy link
Member

Resolved by #330.

Thank you so much for your help, @ajjackson!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tab completion slow and raises warnings
3 participants