kedro run does not work after single machine deployment #4043

Open
jan-kaufmann opened this issue Jul 30, 2024 · 1 comment

Description

After coding in PyCharm, I tried to deploy my code to another machine following the git clone workflow described in the docs:

git clone my_repo
cd my_repo
pip install kedro
pip install -r requirements.txt
kedro run

This failed with an exception because Kedro was unable to locate my custom dataset class. Running the same code from within PyCharm works just fine.
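For context, the custom dataset is an AbstractDataset subclass roughly along these lines (a minimal hypothetical sketch; only the class name and module path appear in the error log below, the I/O details are made up):

# Hypothetical sketch of src/log_analytics/datatypes.py (or datatypes/__init__.py);
# only the class name IncrementalLogsDataset comes from the error log, the rest is illustrative.
from pathlib import Path
from typing import Any

import pandas as pd
from kedro.io import AbstractDataset


class IncrementalLogsDataset(AbstractDataset[pd.DataFrame, pd.DataFrame]):
    """Custom dataset that Kedro has to import when parsing the catalog."""

    def __init__(self, filepath: str):
        self._filepath = Path(filepath)

    def _load(self) -> pd.DataFrame:
        return pd.read_csv(self._filepath)

    def _save(self, data: pd.DataFrame) -> None:
        data.to_csv(self._filepath, index=False)

    def _describe(self) -> dict[str, Any]:
        return {"filepath": str(self._filepath)}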

I managed to get the code working by using

python -m kedro run

instead. This is not what the documentation suggests.
I experienced the same behaviour on both the development machine and the deployment target. Development runs Windows 10; deployment is on Fedora 39 with Kedro 0.19.6.

I also tried activating the Python environment I used for development and then running kedro run - same issue.
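My guess at why the two commands behave differently (an assumption, not verified): python -m kedro run puts the current working directory at the front of sys.path, so the src.-prefixed dotted path from the catalog can be imported when run from the project root, while the kedro console script only gets <project>/src added by Kedro's bootstrapping, so 'src.log_analytics...' never resolves. A small sketch of that resolution step:

# Hedged sketch of the suspected difference; the dotted path is copied from the
# error log below, everything else is illustrative.
import importlib
import sys

dotted_path = "src.log_analytics.datatypes.IncrementalLogsDataset"
module_path, _, class_name = dotted_path.rpartition(".")

# `python -m ...` effectively does this with the current working directory;
# the `kedro` console script does not, so the same import fails there.
sys.path.insert(0, ".")

try:
    print("resolved:", getattr(importlib.import_module(module_path), class_name))
except (ImportError, AttributeError) as exc:
    print("failed to resolve:", exc)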

Error log

kedro run
[07/30/24 12:08:14] INFO     Using `conf/logging.yml` as logging configuration. You can change this by setting the KEDRO_LOGGING_CONFIG environment variable accordingly.     __init__.py:249
                    WARNING  /home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro_viz/__init__.py:13: KedroVizPythonVersionWarning: Please be advised that Kedro Viz is not yet fully compatible with the Python version you are currently using.     warnings.py:110
                               warnings.warn(

[07/30/24 12:08:16] INFO     Kedro project log-analytics                                                                                                                       session.py:324
                    DEBUG    Registered Ctrl-C handler                                                                                                                            hooks.py:22
                    WARNING  /home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/framework/project/__init__.py:432: UserWarning: An error occurred while importing the 'log_analytics.pipelines.load_data' module. Nothing defined therein will be returned by 'find_pipelines'.     warnings.py:110

                             Traceback (most recent call last):
                               File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/io/core.py", line 152, in from_config
                                 class_obj, config = parse_dataset_definition(
                                                     ^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/io/core.py", line 405, in parse_dataset_definition
                                 raise DatasetError(f"Class '{dataset_type}' not found, is this a typo?")
                             kedro.io.core.DatasetError: Class 'src.log_analytics.datatypes.IncrementalLogsDataset' not found, is this a typo?

                             The above exception was the direct cause of the following exception:

                             Traceback (most recent call last):
                               File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/framework/project/__init__.py", line 424, in find_pipelines
                                 pipeline_module = importlib.import_module(pipeline_module_name)
                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "/usr/lib64/python3.12/importlib/__init__.py", line 90, in import_module
                                 return _bootstrap._gcd_import(name[level:], package, level)
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
                               File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
                               File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
                               File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
                               File "<frozen importlib._bootstrap_external>", line 995, in exec_module
                               File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
                               File "/home/jan.kaufmann/grafana_log_analytics/log-analytics/src/log_analytics/pipelines/load_data/__init__.py", line 1, in <module>
                                 from .pipeline import create_pipeline  # NOQA
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "/home/jan.kaufmann/grafana_log_analytics/log-analytics/src/log_analytics/pipelines/load_data/pipeline.py", line 4, in <module>
                                 from .nodes import update_dataset
                               File "/home/jan.kaufmann/grafana_log_analytics/log-analytics/src/log_analytics/pipelines/load_data/nodes.py", line 12, in <module>
                                 catalog = DataCatalog.from_config(conf_loader['catalog'])
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/io/data_catalog.py", line 299, in from_config
                                 datasets[ds_name] = AbstractDataset.from_config(
                                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/io/core.py", line 156, in from_config
                                 raise DatasetError(
                             kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'raw_log_data':
                             Class 'src.log_analytics.datatypes.IncrementalLogsDataset' not found, is this a typo?

                               warnings.warn(

                    WARNING  /home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/framework/project/__init__.py:432: UserWarning: An error occurred while importing the 'log_analytics.pipelines.analyse_data' module. Nothing defined therein will be returned by 'find_pipelines'.     warnings.py:110

                             Traceback (most recent call last):
                               File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/io/core.py", line 152, in from_config
                                 class_obj, config = parse_dataset_definition(
                                                     ^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/io/core.py", line 405, in parse_dataset_definition
                                 raise DatasetError(f"Class '{dataset_type}' not found, is this a typo?")
                             kedro.io.core.DatasetError: Class 'src.log_analytics.datatypes.IncrementalLogsDataset' not found, is this a typo?

                             The above exception was the direct cause of the following exception:

                             Traceback (most recent call last):
                               File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/framework/project/__init__.py", line 424, in find_pipelines
                                 pipeline_module = importlib.import_module(pipeline_module_name)
                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "/usr/lib64/python3.12/importlib/__init__.py", line 90, in import_module
                                 return _bootstrap._gcd_import(name[level:], package, level)
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
                               File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
                               File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
                               File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
                               File "<frozen importlib._bootstrap_external>", line 995, in exec_module
                               File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
                               File "/home/jan.kaufmann/grafana_log_analytics/log-analytics/src/log_analytics/pipelines/analyse_data/__init__.py", line 1, in <module>
                                 from .pipeline import create_pipeline  # NOQA
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "/home/jan.kaufmann/grafana_log_analytics/log-analytics/src/log_analytics/pipelines/analyse_data/pipeline.py", line 4, in <module>
                                 from .nodes import portscan_analysis, report_upload
                               File "/home/jan.kaufmann/grafana_log_analytics/log-analytics/src/log_analytics/pipelines/analyse_data/nodes.py", line 12, in <module>
                                 catalog = DataCatalog.from_config(conf_loader['catalog'])
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/io/data_catalog.py", line 299, in from_config
                                 datasets[ds_name] = AbstractDataset.from_config(
                                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/io/core.py", line 156, in from_config
                                 raise DatasetError(
                             kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'raw_log_data':
                             Class 'src.log_analytics.datatypes.IncrementalLogsDataset' not found, is this a typo?

                               warnings.warn(

Traceback (most recent call last):
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/framework/session/session.py", line 341, in run
    pipeline = pipelines[name]
               ~~~~~~~~~^^^^^^
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/framework/project/__init__.py", line 142, in inner
    self._load_data()
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/framework/project/__init__.py", line 187, in _load_data
    project_pipelines = register_pipelines()
                        ^^^^^^^^^^^^^^^^^^^^
  File "/home/jan.kaufmann/grafana_log_analytics/log-analytics/src/log_analytics/pipeline_registry.py", line 16, in register_pipelines
    pipelines["load"] = pipelines['load_data']
                        ~~~~~~~~~^^^^^^^^^^^^^
KeyError: 'load_data'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jan.kaufmann/.local/bin/kedro", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/framework/cli/cli.py", line 233, in main
    cli_collection()
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/framework/cli/cli.py", line 130, in main
    super().main(
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/framework/cli/project.py", line 225, in run
    session.run(
  File "/home/jan.kaufmann/.local/lib/python3.12/site-packages/kedro/framework/session/session.py", line 343, in run
    raise ValueError(
ValueError: Failed to find the pipeline named '__default__'. It needs to be generated and returned by the 'register_pipelines' function.
@ankatiyar
Contributor

Hey @jan-kaufmann, thanks for reporting this. Could you check whether the folder that contains your custom dataset has an __init__.py file? It might be that.
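For example, something like this should show whether the file is there and whether the class's module can be imported (a rough sketch, run from the project root; the paths are inferred from your traceback, adjust if your package lives elsewhere):

# Rough check, assuming the layout shown in the traceback.
import importlib
from pathlib import Path

pkg_dir = Path("src/log_analytics/datatypes")
if pkg_dir.is_dir():
    print("__init__.py present:", (pkg_dir / "__init__.py").exists())
else:
    print("datatypes.py present:", Path("src/log_analytics/datatypes.py").exists())

# The catalog `type` is resolved by importing it; check both spellings.
for dotted in ("log_analytics.datatypes", "src.log_analytics.datatypes"):
    try:
        importlib.import_module(dotted)
        print(f"import {dotted}: OK")
    except ImportError as exc:
        print(f"import {dotted}: {exc}")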
