Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Make the Dockerfile a tiny bit cache friendlier #149

Merged
merged 1 commit into from
Aug 23, 2024

Conversation

maringuu
Copy link
Collaborator

Still not great as the whole installation happens after copying but at least copying is not the first thing we do anymore.

Related #124

@jstucke
Copy link
Collaborator

jstucke commented Aug 16, 2024

The code looks fine, the container builds without errors but when trying to use it in FACT it crashes directly:

Traceback (most recent call last):
  File "/venv/lib/python3.11/site-packages/gunicorn/arbiter.py", line 609, in spawn_worker
    worker.init_process()
  File "/venv/lib/python3.11/site-packages/gunicorn/workers/base.py", line 134, in init_process
    self.load_wsgi()
  File "/venv/lib/python3.11/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
    self.wsgi = self.app.wsgi()
                ^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
                    ^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
    return self.load_wsgiapp()
           ^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
    return util.import_app(self.app_uri)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/gunicorn/util.py", line 371, in import_app
    mod = importlib.import_module(module)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1142, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'server'

Maybe because server.py is not found in the cwd because the WORKDIR changed

@maringuu
Copy link
Collaborator Author

maringuu commented Aug 22, 2024

Mhh, well good catch. The underlying problem here is that the fact_extractor does not have a proper python module structure.
Lets conside the tree -L 2 of this repository.

.
├── Dockerfile
├── extract.py
├── fact_extractor # This is a namespace pacakge
│   ├── config
│   ├── docker_extraction.py
│   ├── fact_extract.py # Does not use relative imports, but resides in the same directory as the modules it imports
│   ├── helperFunctions # This is a package
│   ├── install
│   ├── install.py
│   ├── plugins
│   ├── __pycache__
│   ├── server.py # Should use relative imports
│   ├── test
│   ├── unpacker
│   └── version.py

I would expect fact_extractor to be a regular package and the files to use imports like import fact_extractor.helperFunctions.
Interestingly, this is not the case but the imports look like import helperFunctions.
Actually, the package fact_extractor is not used at all.
The reason this works is, that python adds the current working directory to the module search path.

Before this patch the dockerfile changed the workdir to the path the modules reside in.
To acoomodate this, I set the PYTHONPATH to /app.
The easy fix here is to simply change the PYTHONPATH to /app/fact_extractor (I'll do this) but I think in the future a proper module strucutre should be used.

Still not great as the whole installation happens after copying
but at least copying is not the first thing we do anymore.

Related #124
Copy link
Collaborator

@jstucke jstucke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@jstucke jstucke merged commit 967e44f into master Aug 23, 2024
2 checks passed
@jstucke jstucke deleted the dockerfile-cache branch August 23, 2024 09:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants