Annif takes several seconds to start even when it's doing nothing but printing the version number or help text:
$ time annif --version
0.54.0.dev0
real 0m4,398s
user 0m4,322s
sys 0m0,470s
I investigated this a little bit using the -X importtime feature in Python 3.7+ and the tuna tool for visualizing profiling information. It seems that the time is mostly spent importing large libraries such as tensorflow, scikit-learn, optuna, connexion and nltk.
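For anyone who wants to reproduce the measurement without tuna, here is a minimal sketch that runs an import in a fresh interpreter with -X importtime and lists the slowest entries. The helper names `parse_importtime` and `slowest_imports` are made up for this illustration, and the parser assumes the `import time: self | cumulative | name` line format that CPython writes to stderr.

```python
import subprocess
import sys


def parse_importtime(stderr_text):
    """Parse `python -X importtime` output into (cumulative_us, module) pairs."""
    rows = []
    for line in stderr_text.splitlines():
        if not line.startswith("import time:"):
            continue  # unrelated stderr output
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3:
            continue
        try:
            cumulative_us = int(fields[1])
        except ValueError:
            continue  # header line: "self [us] | cumulative | imported package"
        rows.append((cumulative_us, fields[2].strip()))
    return rows


def slowest_imports(module_name, top=5):
    """Import module_name in a fresh interpreter and return its slowest imports."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return sorted(parse_importtime(result.stderr), reverse=True)[:top]
```

Calling `slowest_imports("annif")` in an environment where Annif is installed should list the heaviest imports by cumulative time in microseconds. The exact numbers will vary between machines and runs.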
These libraries are all unnecessary in simple operations such as annif --help and --version so it would be better to avoid importing them altogether. There are some tutorials on lazy importing (e.g. this one) and the importlib library contains (since Python 3.5) a LazyLoader utility class that could be used here.
I experimented a bit with this lazy_import function but couldn't get it to work for nltk submodules:
# Adapted from: https://stackoverflow.com/questions/42703908/
import importlib.util
import sys

def lazy_import(fullname):
    """lazily import a module the first time it is used"""
    try:
        return sys.modules[fullname]
    except KeyError:
        spec = importlib.util.find_spec(fullname)
        module = importlib.util.module_from_spec(spec)
        loader = importlib.util.LazyLoader(spec.loader)
        # Make module with proper locking and get it inserted into sys.modules.
        loader.exec_module(module)
        return module
This needs more experimentation but for now I'm just opening the issue...
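For comparison, here is a sketch that follows the lazy-import recipe in the Python importlib documentation more closely. The main differences are that it replaces `spec.loader` with the `LazyLoader` before creating the module, and that it registers the module in `sys.modules` explicitly. I have not checked whether this fixes the nltk case. Note also that `importlib.util.find_spec` imports the parent package when given a dotted name, so a submodule such as `nltk.tokenize` still triggers an eager import of `nltk` itself.

```python
import importlib.util
import sys


def lazy_import(fullname):
    """Return a module whose code only runs on first attribute access."""
    if fullname in sys.modules:
        return sys.modules[fullname]
    spec = importlib.util.find_spec(fullname)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot find module {fullname!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    # Register before exec_module so later imports reuse the lazy module.
    sys.modules[fullname] = module
    # With LazyLoader this only sets up the deferred load; the module body
    # is executed the first time an attribute is accessed.
    loader.exec_module(module)
    return module
```

A call like `colorsys = lazy_import("colorsys")` returns immediately, and the real import happens on the first access such as `colorsys.rgb_to_hsv(...)`.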
Implementing lazy import of backends could partially solve the problem of using AVX instructions within VirtualBox, which quite frequently causes problems for participants of the Annif tutorial (see Troubleshooting in the VirtualBox install instructions). With lazy import, TensorFlow (which requires AVX) would only be imported if the NN ensemble backend is used, so the AVX problem would not affect other backends.
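One possible shape for this, as a rough sketch only: keep a registry that maps backend names to module paths and import a module only when its backend is requested. The `_BACKENDS` table and `get_backend` function are hypothetical and do not reflect Annif's actual code. Standard-library modules stand in for the heavy dependencies so that the example runs anywhere.

```python
import importlib

# Hypothetical registry: backend name -> (module path, attribute name).
# Standard-library modules stand in for heavy dependencies such as
# TensorFlow; nothing listed here is imported until it is requested.
_BACKENDS = {
    "json": ("json", "dumps"),
    "csv": ("csv", "writer"),
}


def get_backend(name):
    """Import the module for `name` only when that backend is requested."""
    try:
        module_path, attr = _BACKENDS[name]
    except KeyError:
        raise ValueError(f"unknown backend: {name}") from None
    module = importlib.import_module(module_path)
    return getattr(module, attr)
```

With this layout, a failing import (for example TensorFlow on a CPU without AVX) would only surface when that particular backend is looked up, not at startup.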