Error in KubernetesPodOperator while fetching logs from kube API #21727
Comments
Thanks for opening your first issue here! Be sure to follow the issue template!
@uranusjr I noticed that @raphaelauv does not have this problem when using apache-airflow-providers-cncf-kubernetes versions greater than 3.0.0. My version is 3.0.2, so I don't know where the problem is.
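When comparing setups like this, it can help to print the versions that are actually installed on each worker. The following is a minimal sketch using only the standard library; the helper name `installed_versions` and the package tuple are illustrative assumptions, not part of Airflow.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Sequence

# Packages relevant to this report; adjust the list as needed.
PACKAGES = (
    "apache-airflow",
    "apache-airflow-providers-cncf-kubernetes",
    "kubernetes",
)


def installed_versions(packages: Sequence[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Return the installed version of each package, or None if it is missing."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this on the scheduler machine and on each Celery worker shows whether all of them really use the same provider and `kubernetes` library versions.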
download = KubernetesPodOperator( |
Please downgrade your |
Kubernetes and Celery are both providers and part of the core. The dependencies for both are added via "extras", which makes them "soft" limits, and in case of serious dependency bumps this might end up in a mess (as we experienced when bumping the min K8S library version from 11.0.0 to 22.*, which resulted in yanking 4 versions of the `cncf.kubernetes` provider).

After this learning, we approach K8S and Celery dependencies a bit differently than any other dependencies:

* For Celery and K8S (and Dask, but this is rather an afterthought) we do not strip off the dependencies from the extra (so, for example, the [cncf.kubernetes] extra will have dependencies both on 'apache-airflow-providers-cncf-kubernetes' and directly on the kubernetes library).
* We add upper-bound limits for both Celery and Kubernetes to prevent accidental upgrades. Both the Celery and Kubernetes Python libraries follow SemVer, and they are crucial components of Airflow, so they both squarely fit our "do not upper-bound" exceptions.
* We also add a rule that whenever a dependency upper-bound limit is raised, we should also make sure that additional testing is done and an appropriate `apache-airflow` lower-bound limit is added for the `apache-airflow-providers-cncf-kubernetes` and `apache-airflow-providers-celery` providers.

As part of this change we also had to fix two issues:

* The image was needlessly rebuilt during constraint generation, although we already have the image and we even warn that it should be built before we run constraint generation.
* After this change, the currently released, unyanked cncf.kubernetes provider cannot be installed with airflow, because it has conflicting requirements for the kubernetes library (provider has <11 and airflow has > 22.7). Therefore, during constraint generation with PyPI providers, when we install providers from PyPI, we explicitly install the yanked 3.1.2 version. This should be removed after we release the next K8S provider version.

That should protect our users in all scenarios where they might unknowingly attempt to upgrade Kubernetes or Celery to an incompatible version.

Related to: apache#22560, apache#21727
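To illustrate why two such requirements cannot be installed together, here is a small sketch that checks candidate versions against both bounds quoted in the commit message (`<11` on the provider side, `> 22.7` on the Airflow side). The version parser is deliberately simplistic and the candidate list is illustrative; a real resolver such as pip follows PEP 440 instead.

```python
from typing import List, Tuple

Version = Tuple[int, int, int]


def parse(text: str) -> Version:
    """Parse 'X', 'X.Y' or 'X.Y.Z' into a 3-tuple, padding with zeros."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def provider_allows(text: str) -> bool:
    # Bound quoted in the commit message for the released provider: kubernetes < 11
    return parse(text) < parse("11")


def airflow_allows(text: str) -> bool:
    # Bound quoted in the commit message for Airflow: kubernetes > 22.7
    return parse(text) > parse("22.7")


def compatible_versions(candidates: List[str]) -> List[str]:
    """Return the candidates accepted by both requirements."""
    return [v for v in candidates if provider_allows(v) and airflow_allows(v)]


# Illustrative candidates only; 22.6.0 is the version from the report above.
candidates = ["10.0.1", "11.0.0", "12.0.1", "22.6.0", "23.3.0"]
print(compatible_versions(candidates))
```

Because the two ranges do not overlap, no candidate can pass both checks, which is the conflict the commit works around by pinning the yanked provider version during constraint generation.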
Hi @chengzi0103, it appears that your task likely failed for reasons unrelated to the traceback shown. Sometimes the log read is interrupted due to a connection issue. In that case we catch the error and resume logging, and that is what the traceback is about. Note that it is only a warning, that the logs resume later, and that the task doesn't fail for another 8 minutes. Your issue report inspired us to move that traceback to the DEBUG level in #22595, so as not to cause false alarm or confusion.
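The catch-and-resume behaviour described above can be sketched roughly as follows. This is not the provider's actual code: `follow_logs`, `make_flaky_stream`, and the `max_reconnects` parameter are hypothetical names for illustration, and the "stream" is a simulated generator rather than a real Kubernetes API call.

```python
import logging
from typing import Callable, Iterable, Iterator, List

log = logging.getLogger("pod_log_reader")


def follow_logs(
    open_stream: Callable[[int], Iterator[str]],
    max_reconnects: int = 3,
) -> List[str]:
    """Read all lines, reopening the stream at the current offset after a drop."""
    lines: List[str] = []
    reconnects = 0
    while True:
        try:
            for line in open_stream(len(lines)):
                lines.append(line)
            return lines  # stream ended normally
        except ConnectionError:
            reconnects += 1
            if reconnects > max_reconnects:
                raise
            # Logged at DEBUG so an interrupted read does not look like a task failure.
            log.debug(
                "Log stream interrupted; resuming from line %d",
                len(lines),
                exc_info=True,
            )


def make_flaky_stream(all_lines: List[str], fail_at: Iterable[int]) -> Callable[[int], Iterator[str]]:
    """Build a simulated stream that drops once at each index in fail_at."""
    failures = set(fail_at)

    def open_stream(offset: int) -> Iterator[str]:
        for i in range(offset, len(all_lines)):
            if i in failures:
                failures.discard(i)
                raise ConnectionError("simulated connection drop")
            yield all_lines[i]

    return open_stream


stream = make_flaky_stream(["line 1", "line 2", "line 3", "line 4"], fail_at=[2])
print(follow_logs(stream))
```

The point of the sketch is that an interrupted read is recoverable: the traceback records the interruption, while the task itself continues and the remaining lines are still collected.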
Yes my setting is
|
Thank you for your answers. I will try to use larger clusters and resources later. This problem occurs less often. I will use your suggestions. Thanks again.
Apache Airflow version
2.2.3 (latest released)
What happened
Airflow logs an error when running multiple k8s pods.
What you expected to happen
How can I fix it?
How to reproduce
I have three machines:
Machine A: runs the Airflow webserver and scheduler
Machines B and C: run Celery workers
All operators run on the k8s cluster through k8s_config.
As soon as I run multiple tasks, the log error is reported automatically. I don't know how to solve it.
Operating System
Debian GNU/Linux 11
Versions of Apache Airflow Providers
alembic 1.7.5
amqp 5.0.9
anyio 3.5.0
apache-airflow 2.2.3
apache-airflow-providers-celery 2.1.0
apache-airflow-providers-cncf-kubernetes 3.0.2
apache-airflow-providers-docker 2.4.1
apache-airflow-providers-ftp 2.0.1
apache-airflow-providers-http 2.0.3
apache-airflow-providers-imap 2.2.0
apache-airflow-providers-sqlite 2.1.0
apispec 3.3.2
argcomplete 1.12.3
attrs 20.3.0
Babel 2.9.1
billiard 3.6.4.0
bleach 4.1.0
blinker 1.4
cachetools 5.0.0
cattrs 1.6.0
celery 5.2.2
certifi 2021.10.8
cffi 1.15.0
charset-normalizer 2.0.12
click 8.0.4
click-didyoumean 0.3.0
click-plugins 1.1.1
click-repl 0.2.0
clickclick 20.10.2
colorama 0.4.4
colorlog 5.0.1
commonmark 0.9.1
coverage 6.3.2
croniter 1.0.15
cryptography 36.0.1
datacompy 0.7.3
defusedxml 0.7.1
dill 0.3.4
dnspython 2.2.0
docker 5.0.3
docutils 0.16
email-validator 1.1.3
Flask 1.1.2
Flask-AppBuilder 3.4.4
Flask-Babel 2.0.0
Flask-Caching 1.10.1
Flask-JWT-Extended 3.25.1
Flask-Login 0.4.1
Flask-OpenID 1.3.0
Flask-SQLAlchemy 2.5.1
Flask-WTF 0.14.3
flower 1.0.0
gevent 21.12.0
google-auth 2.6.0
graphviz 0.19.1
greenlet 1.1.2
gunicorn 20.1.0
h11 0.12.0
httpcore 0.13.7
httpx 0.19.0
humanize 4.0.0
idna 3.3
importlib-metadata 4.11.1
importlib-resources 5.4.0
inflection 0.5.1
iniconfig 1.1.1
iso8601 1.0.2
itsdangerous 1.1.0
jeepney 0.7.1
Jinja2 3.0.3
jsonschema 3.2.0
keyring 23.5.0
kombu 5.2.3
kubernetes 22.6.0
lazy-object-proxy 1.7.1
lockfile 0.12.2
Mako 1.1.6
Markdown 3.3.6
MarkupSafe 2.1.0
marshmallow 3.14.1
marshmallow-enum 1.5.1
marshmallow-oneofschema 3.0.1
marshmallow-sqlalchemy 0.26.1
numexpr 2.8.1
numpy 1.22.2
oauthlib 3.2.0
openapi-schema-validator 0.2.3
openapi-spec-validator 0.4.0
ordered-set 4.1.0
packaging 21.3
pandas 1.3.5
pendulum 2.1.2
pip 22.0.3
pkginfo 1.8.2
pluggy 1.0.0
prettytable 3.1.1
prison 0.2.1
prometheus-client 0.13.1
prompt-toolkit 3.0.28
psutil 5.9.0
psycopg2-binary 2.9.3
py 1.11.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycparser 2.21
Pygments 2.11.2
PyJWT 1.7.1
pyparsing 3.0.7
pyrsistent 0.18.1
pytest 7.0.1
pytest-cov 3.0.0
python-daemon 2.3.0
python-dateutil 2.8.2
python-nvd3 0.15.0
python-slugify 4.0.1
python3-openid 3.2.0
pytz 2021.3
pytzdata 2020.1
PyYAML 6.0
readme-renderer 32.0
redis 3.5.3
requests 2.27.1
requests-oauthlib 1.3.1
requests-toolbelt 0.9.1
rfc3986 1.5.0
rich 11.2.0
rsa 4.8
SecretStorage 3.3.1
semantic-version 2.9.0
setproctitle 1.2.2
setuptools 59.0.1
setuptools-rust 1.1.2
six 1.16.0
sniffio 1.2.0
SQLAlchemy 1.4.31
SQLAlchemy-JSONField 1.0.0
SQLAlchemy-Utils 0.38.2
swagger-ui-bundle 0.0.9
tables 3.6.1
tabulate 0.8.9
tenacity 8.0.1
termcolor 1.1.0
text-unidecode 1.3
tomli 2.0.1
tornado 6.1
tqdm 4.62.3
twine 3.8.0
typing_extensions 4.1.1
unicodecsv 0.14.1
urllib3 1.26.8
vine 5.0.0
wcwidth 0.2.5
webencodings 0.5.1
websocket 0.2.1
websocket-client 1.2.3
Werkzeug 1.0.1
wheel 0.37.0
WTForms 2.3.3
zipp 3.7.0
zope.event 4.5.0
zope.interface 5.4.0
Deployment
Docker-Compose
Deployment details
I have three machines:
Machine A: runs the Airflow webserver and scheduler
Machines B and C: run Celery workers
All operators run on the k8s cluster through k8s_config.
Anything else
No response