
3.x Task Runs went missing from 3.0.0rc18 to 3.0.0rc19 #15153

Closed
tothandor opened this issue Aug 30, 2024 · 50 comments
Labels: bug (Something isn't working)

Comments

@tothandor

Bug summary

Up until 3.0.0rc18, task run completions were registered correctly with a local Prefect server, but with 3.0.0rc19 and 3.0.0rc20 the list of task runs remains empty even after all tasks have completed successfully.
So nothing is listed on a dashboard URL like this:
http://localhost:4200/runs/flow-run/a161f610-630f-43a5-8d22-93845c4659a2?tab=Task+Runs

I am still investigating the issue, but downgrading solved it for me.

The flow I use loads an sqlalchemy-connector block in the first task and passes the engine to downstream tasks for issuing queries. Maybe this is not a viable approach, so I am planning to check this with a simplified flow of dummy tasks.

Version info (prefect version output)

Version:             3.0.0rc18
API version:         0.8.4
Python version:      3.9.19
Git commit:          99428dea
Built:               Thu, Aug 15, 2024 1:17 PM
OS/Arch:             linux/x86_64
Profile:             ephemeral
Server type:         server
Pydantic version:    2.8.2
Integrations:
  prefect-sqlalchemy: 0.5.0rc2

Additional context

I get a lot of errors like this from the worker:

12:45:07.687 | ERROR   | GlobalEventLoopThread | prefect._internal.concurrency - Service 'EventsWorker' failed with 1 pending items.
tothandor added the bug label on Aug 30, 2024
@cicdw
Member

cicdw commented Aug 30, 2024

Hi @tothandor, thanks for the bug report! Could you confirm two things for me:

  • are you running a server with prefect server start?
  • if so, what version of Prefect is that server running?

Thank you!

@tothandor
Author

Hi @cicdw ,
Yes, I do start the server with prefect server start, except that I add the --hostname $HOSTNAME option so the server is accessible on the local network.
The server version is 3.0.0rc20, according to http://{prefecthost}:4200/settings.

I was trying to reproduce the missing task runs with a simplified flow, but all I could achieve was an empty "run graph".
[screenshot: empty run graph]

This is the flow code so far.

# dbflow/dbflow_class.py
import logging

import pandas as pd
import sqlalchemy as sqla

from prefect import flow, task
from prefect_sqlalchemy import SqlAlchemyConnector

log = logging.getLogger()

class DBFlow:
    @task
    def dbconnect(self):
        log.info("Connecting to database")
        self.db = SqlAlchemyConnector.load("dbconn").get_engine()

    @staticmethod
    @task
    def q1(db):
        log.info(f"QUERY 1")
        df = pd.read_sql('select * from cities limit 10', db)
        return df

    @task
    def q2(self):
        log.info(f"QUERY 2")
        df = pd.read_sql('select * from forecast limit 10', self.db)
        return df

    @task
    def l2(self, df: pd.DataFrame):
        log.info(f"LOADING TO DATABASE")
        df.to_sql('forecast_test', self.db, schema='test', if_exists='replace')


    @flow(name='DBFlow')
    def run(self=None):
        self.dbconnect()

        self.cities_ = self.q1(self.db)
        self.forecast_ = self.q2()
        self.l2(self.forecast_)

dbflow = DBFlow()
dbflow_flow = dbflow.run
# prefect.yaml
name: prefect3
prefect-version: 3.0.0rc20

# build section allows you to manage and build docker images
build:

# push section allows you to manage if and how this project is uploaded to remote locations
push:

# pull section allows you to provide instructions for cloning this project in remote locations
pull:
- prefect.deployments.steps.set_working_directory:
    directory: /home/andor.toth/work/prefect3

# the deployments section allows you to provide configuration for deploying flows
deployments:
- name: dbflow
  version:
  tags: []
  concurrency_limit: 1
  description:
  entrypoint: dbflow.dbflow_class:dbflow_flow
  parameters: {}
  work_pool:
    name: default
    work_queue_name:
    job_variables: {}
  enforce_parameter_schema: true
  schedules:
  - cron: '*/5 * * * *'
    timezone: Europe/Budapest
    day_or: true
    active: true
    max_active_runs:
    catchup: false

@tothandor
Author

tothandor commented Sep 2, 2024

I have retried the upgrade from 3.0.0rc18 to 3.0.0rc20, paying particular attention to every step, including prefect server database upgrade and changes in pip packages, but the result is the same: task runs are not registered, although the flow exits cleanly.

[screenshot]

[screenshot]

Maybe it's not important, but I get a lot of these errors. Both in rc18 and rc20.
ERROR | GlobalEventLoopThread | prefect._internal.concurrency - Service 'EventsWorker' failed with 1 pending items.

And in most cases when a flow fails, I get this stack trace.

...
  File "/srv/prefect/venv/lib64/python3.9/site-packages/prefect/utilities/engine.py", line 736, in emit_task_run_state_change_event
    return emit_event(
  File "/srv/prefect/venv/lib64/python3.9/site-packages/prefect/events/utilities.py", line 84, in emit_event
    worker_instance.send(event_obj)
  File "/srv/prefect/venv/lib64/python3.9/site-packages/prefect/_internal/concurrency/services.py", line 108, in send
    raise RuntimeError("Cannot put items in a stopped service instance.")

If I revert back to rc18, then everything is fine again.
I have also tried resetting the database, but it makes no difference.

@tothandor
Author

I have tried some intricate dummy flows, but all I could achieve was an incomplete task run graph. The task runs themselves appeared correctly in the list.

# dbflow/forkmerge.py
import logging                                                                                                                                                                                    

from prefect import flow, task
# from prefect_sqlalchemy import SqlAlchemyConnector

log = logging.getLogger()

class ForkMergeFlow:
    @task
    def dbconnect(self):
        log.info("Connecting to database")
        self.db = lambda: 'dummyresource'
        # self.db = SqlAlchemyConnector.load("dbconn").get_engine()

    @staticmethod
    @task
    def q1(db):
        log.info(f"QUERY 1")
        return [1,2,3,4]

    @task
    def q2(self):
        log.info(f"QUERY 2")
        return list('abcd')

    @classmethod
    @task
    def q3(cls, db):
        log.info(f"QUERY 3")
        return dict(c=3, d=4, e=5, f=6)

    @task
    def p11(self, data): # result #1 - processor #1
        log.info('PROCESS 1 RESULT 1')
        return sorted(data)

    @task
    def p21(self, data): # result #2 - processor #1
        log.info('PROCESS 1 RESULT 2')
        return list(enumerate(data))

    @task
    def p22(self, data): # result #2 - processor #2
        log.info('PROCESS 2 RESULT 2')
        return list(reversed(data))

    @task
    def m12(self, r1, r2): # merge process result #1 & #2
        log.info('MERGE RESULT 1 & 2')
        return list(zip(r1, r2))

    @task
    def l1(self, data):
        log.info('LOAD 1')
        print(data)

    @task
    def l2(self, data):
        log.info(f"LOAD 2")
        print(data)

    @task
    def l22(self, data):
        log.info(f"LOAD 2 PROCESSED RESULT 2")
        print(data)

    @task
    def lm12(self, data):
        log.info(f"LOADING MERGED DATA")
        print(data)


    @flow(name='ForkMerge')
    def run(self=None):
        self.dbconnect()

        # raw results
        self.r1_ = self.q1(self.db)
        self.l1(self.r1_)

        self.r2_ = self.q2()
        self.l2(self.r2_)

        self.r2_ = self.q2() # once more
        self.rp21_ = self.p21(self.r2_)
        self.rp22_ = self.p22(self.r2_)
        self.l22(self.rp22_)

        self.r3_ = self.q3(self.db)

        # processed results
        self.rp11_ = self.p11(self.r1_)
        self.l1(self.rp11_)

        # merged results
        self.rm12_ = self.m12(self.rp11_, self.rp21_)

        # load results
        self.lm12(self.rm12_)

fmflow = ForkMergeFlow()
fmflow_run = fmflow.run

The entrypoint for the deployment is dbflow.forkmerge:fmflow_run

@jakekaplan
Contributor

jakekaplan commented Sep 2, 2024

@tothandor thanks for this! A few more questions to help diagnose:

  • Can you run your flow with PREFECT_DEBUG_MODE=True and share the full logs? You'll see a lot of extra output, but we should be able to see the actual error that EventsWorker is failing with.
  • What is your PREFECT_API_URL set to on the client?
  • Do you see the EventsWorker error in your logs on previous prefect versions?

3.0.0rc19 changed how task run data is sent to the server. It is now sent via websocket along with the rest of the event data. Is there anything in your setup that would block a websocket connection from being created?

@tothandor
Author

Hi @jakekaplan ,

I have added PREFECT_DEBUG_MODE=True to the configuration (prefect config set PREFECT_DEBUG_MODE=True) and restarted the server and the worker. Both became very verbose.

The 'EventsWorker' failed error is gone, and I get this stack trace instead.

10:49:14.506 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.9999814350157976) in thread 'APILogWorkerThread'
10:49:14.506 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
10:49:16.507 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=2.00> exited
10:49:16.507 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Encountered exception in call get(<dropped>)
Traceback (most recent call last):
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/prefect/_internal/concurrency/calls.py", line 346, in _run_sync
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/queue.py", line 179, in get
    raise Empty
_queue.Empty

The server shows some sqlite3 exceptions:

sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) near ",": syntax error
[SQL: WITH intervals AS
(
                -- recursive CTE to mimic the behavior of `generate_series`,
                -- which is only available as a compiled extension
                WITH RECURSIVE intervals(interval_start, interval_end, counter) AS (
                    VALUES(
                        strftime('%Y-%m-%d %H:%M:%f000', :start_time),
                        strftime('%Y-%m-%d %H:%M:%f000', :start_time, :interval),
                        1
                        )

                    UNION ALL

                    SELECT interval_end, strftime('%Y-%m-%d %H:%M:%f000', interval_end, :interval), counter + 1
                    FROM intervals
                    -- subtract interval because recursive where clauses are effectively evaluated on a t-1 lag
                    WHERE
                        interval_start < strftime('%Y-%m-%d %H:%M:%f000', :end_time, :negative_interval)
                        -- don't compute more than 500 intervals
                        AND counter < 500
                )
 SELECT counts.interval_start, counts.interval_end, coalesce(json_group_array(json(counts.state_agg)) FILTER (WHERE counts.state_agg IS NOT NULL), '[]') AS states
FROM (SELECT intervals.interval_start AS interval_start, intervals.interval_end AS interval_end, CASE WHEN (count(runs.id) = :count_1) THEN NULL ELSE json_object(:json_object_1, runs.state_type, :json_object_2, runs.state_name, :json_object_3, count(runs.id), :json_objec
t_4, sum(max(:max_1, CAST(STRFTIME('%s', runs.estimated_run_time) AS INTEGER))), :json_object_5, sum(max(:max_2, CAST(STRFTIME('%s', runs.estimated_start_time_delta) AS INTEGER)))) END AS state_agg
FROM intervals LEFT OUTER JOIN (SELECT task_run.id AS id, task_run.expected_start_time AS expected_start_time, (SELECT CASE WHEN (task_run.state_type = :state_type_1) THEN strftime(:strftime_1, (julianday(task_run.total_run_time) + julianday(strftime(:strftime_2, (:julia
nday_1 + julianday(strftime('%Y-%m-%d %H:%M:%f000', 'now'))) - julianday(task_run.state_timestamp)))) - :param_1) ELSE task_run.total_run_time END AS anon_1) AS estimated_run_time, CASE WHEN (task_run.start_time > task_run.expected_start_time) THEN strftime(:strftime_3,
(:julianday_2 + julianday(task_run.start_time)) - julianday(task_run.expected_start_time)) WHEN (task_run.start_time IS NULL AND (task_run.state_type NOT IN (:state_type_2_1, :state_type_2_2, :state_type_2_3, :state_type_2_4)) AND task_run.expected_start_time < strftime(
'%Y-%m-%d %H:%M:%f000', 'now')) THEN strftime(:strftime_4, (:julianday_3 + julianday(strftime('%Y-%m-%d %H:%M:%f000', 'now'))) - julianday(task_run.expected_start_time)) ELSE :param_2 END AS estimated_start_time_delta, task_run.state_type AS state_type, task_run.state_na
me AS state_name
FROM task_run
WHERE task_run.start_time <= :start_time_1 AND task_run.start_time >= :start_time_2 AND (EXISTS (SELECT flow_run.flow_id, flow_run.deployment_id, flow_run.work_queue_name, flow_run.flow_version, flow_run.deployment_version, flow_run.parameters, flow_run.idempotency_key,
flow_run.context, flow_run.empirical_policy, flow_run.tags, flow_run.created_by, flow_run.infrastructure_pid, flow_run.job_variables, flow_run.infrastructure_document_id, flow_run.parent_task_run_id, flow_run.auto_scheduled, flow_run.state_id, flow_run.work_queue_id, flo
w_run.name, flow_run.state_type, flow_run.state_name, flow_run.state_timestamp, flow_run.run_count, flow_run.expected_start_time, flow_run.next_scheduled_start_time, flow_run.start_time, flow_run.end_time, flow_run.total_run_time, flow_run.id, flow_run.created, flow_run.
updated
FROM flow_run
WHERE flow_run.id = task_run.flow_run_id))) AS runs ON runs.expected_start_time >= intervals.interval_start AND runs.expected_start_time < intervals.interval_end GROUP BY intervals.interval_start, intervals.interval_end, runs.state_type, runs.state_name) AS counts GROUP
BY counts.interval_start, counts.interval_end ORDER BY counts.interval_start
 LIMIT :param_3 OFFSET :param_4]
[parameters: {'count_1': 0, 'json_object_1': 'state_type', 'json_object_2': 'state_name', 'json_object_3': 'count_runs', 'json_object_4': 'sum_estimated_run_time', 'max_1': 0, 'json_object_5': 'sum_estimated_lateness', 'max_2': 0, 'start_time': '2024-09-02 08:48:35+00:00
', 'interval': '+4320.0 seconds', 'end_time': '2024-09-03 08:48:35+00:00', 'negative_interval': '-4320.0 seconds', 'state_type_1': 'RUNNING', 'strftime_1': '%Y-%m-%d %H:%M:%f000', 'strftime_2': '%Y-%m-%d %H:%M:%f000', 'julianday_1': 2440587.5, 'param_1': 2440587.5, 'strf
time_3': '%Y-%m-%d %H:%M:%f000', 'julianday_2': 2440587.5, 'strftime_4': '%Y-%m-%d %H:%M:%f000', 'julianday_3': 2440587.5, 'param_2': '1970-01-01 00:00:00.000000', 'start_time_1': '2024-09-03 08:48:35.000000', 'start_time_2': '2024-09-02 08:48:35.000000', 'param_3': 500,
 'param_4': 0, 'state_type_2_1': 'COMPLETED', 'state_type_2_2': 'CRASHED', 'state_type_2_3': 'CANCELLED', 'state_type_2_4': 'FAILED'}]
(Background on this error at: https://sqlalche.me/e/20/e3q8)

And there's another one regarding database locking.

...
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/prefect/server/api/concurrency_limits_v2.py", line 256, in bulk_decrement_active_slots                                                                                                    [831/1962]
    await models.concurrency_limits_v2.bulk_decrement_active_slots(
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/prefect/server/database/dependencies.py", line 168, in async_wrapper
    return await func(db, *args, **kwargs)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/prefect/server/models/concurrency_limits_v2.py", line 296, in bulk_decrement_active_slots
    result = await session.execute(query)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/ext/asyncio/session.py", line 461, in execute
    result = await greenlet_spawn(
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 201, in greenlet_spawn
    result = context.throw(*sys.exc_info())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/orm/session.py", line 2362, in execute
    return self._execute_internal(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/orm/session.py", line 2247, in _execute_internal
    result: Result[Any] = compile_state_cls.orm_execute_statement(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/orm/bulk_persistence.py", line 1649, in orm_execute_statement
    return super().orm_execute_statement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/orm/context.py", line 293, in orm_execute_statement
    result = conn.execute(
             ^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/engine/base.py", line 1418, in execute
    return meth(
           ^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/sql/elements.py", line 515, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/engine/base.py", line 1640, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/engine/base.py", line 1986, in _exec_single_context
    self._handle_dbapi_exception(
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/engine/base.py", line 2355, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
    self.dialect.do_execute(
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/engine/default.py", line 941, in do_execute
    cursor.execute(statement, parameters)
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/dialects/sqlite/aiosqlite.py", line 147, in execute
    self._adapt_connection._handle_exception(error)
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/dialects/sqlite/aiosqlite.py", line 298, in _handle_exception
    raise error
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/dialects/sqlite/aiosqlite.py", line 129, in execute
    self.await_(_cursor.execute(operation, parameters))
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 132, in await_only
    return current.parent.switch(awaitable)  # type: ignore[no-any-return,attr-defined] # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 196, in greenlet_spawn
    value = await result
            ^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/aiosqlite/cursor.py", line 48, in execute
    await self._execute(self._cursor.execute, sql, parameters)
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/aiosqlite/cursor.py", line 40, in _execute
    return await self._conn._execute(fn, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/aiosqlite/core.py", line 132, in _execute
    return await future
           ^^^^^^^^^^^^
  File "/home/andor.toth/work/prefect3/venv/lib64/python3.12/site-packages/aiosqlite/core.py", line 115, in run
    result = function()
             ^^^^^^^^^^
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked
[SQL: UPDATE concurrency_limit_v2 SET active_slots=CASE WHEN (CASE WHEN (concurrency_limit_v2.active_slots - floor(concurrency_limit_v2.slot_decay_per_second * (julianday(:julianday_1) - julianday(concurrency_limit_v2.updated)) * :param_1) < :param_2) THEN :param_3 ELSE
concurrency_limit_v2.active_slots - floor(concurrency_limit_v2.slot_decay_per_second * (julianday(:julianday_1) - julianday(concurrency_limit_v2.updated)) * :param_1) END - :param_4 < :param_5) THEN :param_6 ELSE CASE WHEN (concurrency_limit_v2.active_slots - floor(concu
rrency_limit_v2.slot_decay_per_second * (julianday(:julianday_2) - julianday(concurrency_limit_v2.updated)) * :param_7) < :param_8) THEN :param_9 ELSE concurrency_limit_v2.active_slots - floor(concurrency_limit_v2.slot_decay_per_second * (julianday(:julianday_2) - julian
day(concurrency_limit_v2.updated)) * :param_7) END - :param_10 END, denied_slots=CASE WHEN (concurrency_limit_v2.denied_slots - floor(CASE WHEN (concurrency_limit_v2.slot_decay_per_second > :slot_decay_per_second_1) THEN concurrency_limit_v2.slot_decay_per_second ELSE :p
aram_11 / (CAST(concurrency_limit_v2.avg_slot_occupancy_seconds AS FLOAT) + 0.0) END * (julianday(:julianday_3) - julianday(concurrency_limit_v2.updated)) * :param_12) < :param_13) THEN :param_14 ELSE concurrency_limit_v2.denied_slots - floor(CASE WHEN (concurrency_limit
_v2.slot_decay_per_second > :slot_decay_per_second_1) THEN concurrency_limit_v2.slot_decay_per_second ELSE :param_11 / (CAST(concurrency_limit_v2.avg_slot_occupancy_seconds AS FLOAT) + 0.0) END * (julianday(:julianday_3) - julianday(concurrency_limit_v2.updated)) * :para
m_12) END, avg_slot_occupancy_seconds=((concurrency_limit_v2.avg_slot_occupancy_seconds + :param_15 / ((concurrency_limit_v2."limit" * :limit_1) + 0.0)) - concurrency_limit_v2.avg_slot_occupancy_seconds / ((concurrency_limit_v2."limit" * :limit_2) + 0.0)), updated=strfti
me('%Y-%m-%d %H:%M:%f000', 'now') WHERE concurrency_limit_v2.id IN (SELECT 1 FROM (SELECT 1) WHERE 1!=1) AND concurrency_limit_v2.active = 1]
[parameters: {'julianday_1': 'now', 'param_1': 86400.0, 'param_2': 0, 'param_3': 0, 'param_4': 1, 'param_5': 0, 'param_6': 0, 'julianday_2': 'now', 'param_7': 86400.0, 'param_8': 0, 'param_9': 0, 'param_10': 1, 'slot_decay_per_second_1': 0.0, 'param_11': 1.0, 'julianday_
3': 'now', 'param_12': 86400.0, 'param_13': 0, 'param_14': 0, 'param_15': 4.316197, 'limit_1': 2, 'limit_2': 2}]
(Background on this error at: https://sqlalche.me/e/20/e3q8)

EventsWorker errors were present in earlier versions too.

The API URL is set in the profile config to the following value: PREFECT_API_URL='http://t84:4200/api'
t84 is the hostname, and it resolves correctly; TCP port 4200 is accessible (nc -vz t84 4200).

I hope this helps!

@jakekaplan
Contributor

jakekaplan commented Sep 3, 2024

Thank you @tothandor, that is helpful! The EventsWorker errors being present in prior versions indicate to me that you haven't been able to send Prefect events to your server for a while. You're not seeing task runs starting in 3.0.0rc19 because task runs began using the events connection to send data. The underlying issue seems to be that this websocket connection cannot be established.

Here's a simplified test you can run:

import asyncio

from prefect.events.clients import PrefectEventsClient

async def main():
    async with PrefectEventsClient() as client:
        print(f"Connected to: {client._events_socket_url}")
        pong = await client._websocket.ping()
        pong_time = await pong
        print(f"Response received in: {pong_time}")

if __name__ == '__main__':
    asyncio.run(main())

When connecting successfully, this gives me:

Connected to: ws://127.0.0.1:4200/api/events/in
Response received in: 0.0001704159949440509

Is your server traffic whitelisted or proxied in some way? I've seen instances in the past where http:// URLs are set up to be forwarded/accepted and ws:// is not.

@lucasdepetrisd

Hi!

I’m encountering the same issue after upgrading from Prefect 2.x to the new 3.0.0. My setup includes PREFECT_API_URL='http://192.168.1.13:442/api' and PREFECT_HOME='C:\Users\tareas\.prefect'.

Even though the flows execute successfully, task runs are not always registered in the dashboard. When running a simple flow with PREFECT_DEBUG_MODE=True, I observed that while tasks complete successfully, an error occurs when Prefect tries to access the stored result:

From the UI: [screenshot]

From the terminal: [screenshot]

The error ValueError: Path C:\Users\tareas\.prefect\storage\2845aef2c337538faed6bb7ab7c99317 does not exist. is the same task ID as shown in the UI image, so I suspect there might be a connection. However, the strange part is that this error still occurs even in runs where the tasks are visible in the dashboard.

It’s strange because this issue doesn’t occur consistently; in some runs, tasks are visible in the dashboard even though the ValueError appears anyway. Any insights on resolving this would be greatly appreciated.

Additionally, running the script provided above results in the following:
Connected to: ws://192.168.1.13:442/api/events/in
Response received in: 0.000411600005463697

Thanks!

@tothandor
Author

tothandor commented Sep 4, 2024

Hi @jakekaplan ,
I have this virtual machine (VM) on a local network without any obstacles or proxies. No firewalls or anything in between.
The client connection test runs without a problem.

Connected to: ws://t84:4200/api/events/in
Response received in: 0.00022578425705432892

I have since upgraded to 3.0.0, but it doesn't make a difference.

@cicdw
Member

cicdw commented Sep 4, 2024

Hey @lucasdepetrisd -- the ValueError is unrelated; we will suppress it in an upcoming release. Tasks always look up their configured result persistence location: if a result is found, the task is cached, and when there is no result, the task executes normally. This error is raised by Prefect's internal machinery but is captured, so it shouldn't bubble up to users like this. Apologies for the confusion on that point!
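
For anyone curious what that lookup looks like in practice, here is a minimal sketch (not from this thread, just an illustration) of the behaviour described above: with result persistence and a cache key configured, a repeated call loads the persisted result instead of re-executing, and when no persisted result exists the task simply runs.

# Illustrative sketch of result-persistence-backed caching (names are assumptions, not from this issue)
from prefect import flow, task
from prefect.tasks import task_input_hash

@task(persist_result=True, cache_key_fn=task_input_hash)
def add(x: int, y: int) -> int:
    return x + y

@flow
def demo():
    add(1, 2)  # no persisted result yet: the task executes and stores one
    add(1, 2)  # cache hit: the persisted result is loaded instead of re-running

if __name__ == "__main__":
    demo()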

@jakekaplan
Contributor

jakekaplan commented Sep 4, 2024

hey @tothandor thanks for trying out the test script. Unfortunately I'm not able to reproduce what you're seeing here, so I may need some more help on your end investigating! Let me give a brief overview of what's happening and hopefully we can piece things together.

3.0.0rc19 introduced client-side task run orchestration. Previously, every time a task run changed state (going to RUNNING, etc.) it would make a POST request to the server "asking permission" to go to that state.

After 3.0.0rc19, any determination about a task run moving to the next state is made entirely on the client. The client still records those state changes after the fact, so the underlying information about your task run still gets recorded on the server. It does that as follows (you'll need PREFECT_DEBUG_MODE=1 to see most of these logs):

Client Side :

  • When you run a task, prefect will emit events to the EventsWorker. You'll see a log like this, sending a task run event to the EventsWorker.
16:47:09.303 | DEBUG   | MainThread   | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x106671300> enqueuing item Event(occurred=DateTime(2024, 9, 4, 20, 47, 9, 303661, tzinfo=Timezone('UTC')), event='prefect.task-run.Running', resource=Resource(root={'prefect.resource.id': 'prefect.task-run.49d25245-7972-4b4b-83d8-3868ce7af378', ..., ))
  • The EventsWorker has a client that maintains a websocket connection to the server. When it receives an event you should see a matching log to the one above, like this:
16:47:09.330 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x106671300> handling item Event(occurred=DateTime(2024, 9, 4, 20, 47, 9, 303661, tzinfo=Timezone('UTC')), event='prefect.task-run.Running', resource=Resource(root={'prefect.resource.id': 'prefect.task-run.49d25245-7972-4b4b-83d8-3868ce7af378', ..., )
  • If the EventsWorker cannot send an event for any reason you'll see a log like below, followed by a traceback of what went wrong:
16:58:51.007 | ERROR   | GlobalEventLoopThread | prefect._internal.concurrency - Service 'EventsWorker' failed with 1 pending items.
Traceback (most recent call last):
...

Server Side:

  • When an event comes in over the websocket connection to ws://.../api/events/in, it gets published to an in-memory queue on the server, which distributes the same event to a number of services (you can think of this like pub/sub).
  • One service will record the event to the events table. If you inspect the database for our event above you should see:
-- SELECT * FROM events where resource_id = 'prefect.task-run.49d25245-7972-4b4b-83d8-3868ce7af378' and event = 'prefect.task-run.Running'

  {
    "occurred": "2024-09-04 21:04:10.780012",
    "event": "prefect.task-run.Running",
    "resource_id": "prefect.task-run.49d25245-7972-4b4b-83d8-3868ce7af378",
     ...,
  }
  • Another service will record the event to the task_run and task_run_state tables:
-- SELECT * FROM task_run_state where task_run_id = '49d25245-7972-4b4b-83d8-3868ce7af378' and type = 'RUNNING'
  {
    "id": "9679e882-890a-43fe-a537-7e1be6409beb",
    "created": "2024-09-04 20:47:09.351794",
    "updated": "2024-09-04 20:47:09.352106",
    "type": "RUNNING",
    ...,
  }

I know this is a lot of information, but I'm hoping you can run a no-op task, track the events from client to server following the above, and see if you notice anything out of the ordinary. Happy to answer any other questions as well; thanks in advance for your assistance in working through this!
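
If it helps, here is a minimal no-op flow (a sketch, nothing specific to this thread) that can be run with PREFECT_DEBUG_MODE=1 to exercise exactly this pipeline and produce the client-side logs described above:

# Minimal no-op flow for tracing one task run's events end to end
from prefect import flow, task

@task
def noop():
    return None

@flow
def noop_flow():
    noop()

if __name__ == "__main__":
    noop_flow()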

@lucasdepetrisd

Thanks for the clarification @cicdw! It makes sense that the ValueError is unrelated.

@lucasdepetrisd

lucasdepetrisd commented Sep 5, 2024

Hey @jakekaplan, thanks for the explanation! I ran a basic flow with a no-op task and checked the database and logs as you suggested.

Here’s what I found in the logs about the task:

01:32:13.692 | DEBUG   | MainThread   | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x000001AE78587F10> enqueuing item Event(occurred=DateTime(2024, 9, 5, 4, 32, 13, 692362, tzinfo=Timezone('UTC')), event='prefect.task-run.Running', resource=Resource(root={'prefect.resource.id': 'prefect.task-run.c0734825-cabd-4a79-a77e-4c4e462f29f7', 'prefect.resource.name': 'my_task-4a4', 'prefect.state-message': '', 'prefect.state-name': 'Running', 'prefect.state-timestamp': '2024-09-05T04:32:13.692362+00:00', 'prefect.state-type': 'RUNNING', 'prefect.orchestration': 'client'}), related=[], payload={'intended': {'from': 'PENDING', 'to': 'RUNNING'}, 'initial_state': {'type': 'PENDING', 'name': 'Pending', 'message': '', 'state_details': {}}, 'validated_state': {'type': 'RUNNING', 'name': 'Running', 'message': '', 'state_details': {}, 'data': None}, 'task_run': {'name': 'my_task-4a4', 'task_key': 'my_task-2d616e3a', 'dynamic_key': '4a42ee4d-1709-47c3-a9c5-4d32cc73cc77', 'empirical_policy': {'max_retries': 0, 'retry_delay_seconds': 0.0, 'retries': 0, 'retry_delay': 0}, 'tags': [], 'task_inputs': {}, 'run_count': 1, 'flow_run_run_count': 1, 'expected_start_time': '2024-09-05T04:32:13.641478Z', 'start_time': '2024-09-05T04:32:13.692362Z', 'total_run_time': 0.0}}, id=UUID('a3e96151-50da-463e-bd8e-e2b64511c8cd'), follows=UUID('359b6948-718e-4f79-9f10-6ba8393ca689'))

01:32:13.748 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - Running call get() in thread 'EventsWorkerThread'

01:32:13.839 | INFO    | Task 'my_task' - Finished in state Completed(message=None, type=COMPLETED, result=PersistedResult(type='reference', serializer_type='pickle', storage_key='C:\\Users\\tareas\\.prefect\\storage\\7dfc6a8d5f5969a6f2cbc5c770e5d4e9', storage_block_id=None, expiration=None, serialize_to_none=True))

It seems everything is as expected on the client side, with no signs of an error except for that ValueError @cicdw explained before.

However, on the server side, I queried the database and found:

For SELECT * FROM events WHERE resource_id = 'prefect.task-run.c0734825-cabd-4a79-a77e-4c4e462f29f7';, data is returned for the three states, and the resource column of the Pending state contains:

{
  "prefect.resource.id": "prefect.task-run.c0734825-cabd-4a79-a77e-4c4e462f29f7",
  "prefect.resource.name": "my_task-4a4",
  "prefect.state-message": "",
  "prefect.state-name": "Pending",
  "prefect.state-timestamp": "2024-09-05T04:32:13.641478+00:00",
  "prefect.state-type": "PENDING",
  "prefect.orchestration": "client"
}

[screenshot]

But for SELECT * FROM task_run_state WHERE task_run_id = 'c0734825-cabd-4a79-a77e-4c4e462f29f7';, there were no results and no new tasks recently with or without that ID.

Additionally, I observed that when restarting the local server, tasks are briefly stored and then stop registering after a few minutes.

Any ideas on why this might be happening?

@jakekaplan
Contributor

@lucasdepetrisd thanks for doing that. Can you confirm for me that both your client and server versions are >=3.0.0 (not an rc)?

@lucasdepetrisd

Yes, I ran the prefect version command and got:

prefect version
13:39:07.366 | DEBUG   | prefect.profiles - Using profile 'dev'
Version:             3.0.0
API version:         0.8.4
Python version:      3.10.6
Git commit:          c40d069d
Built:               Tue, Sep 3, 2024 11:13 AM
OS/Arch:             win32/AMD64
Profile:             dev
Server type:         server
Pydantic version:    2.8.2

I also checked the server version using the endpoint http://192.168.1.13:442/api/admin/version and confirmed it's reporting 3.0.0.

However, when querying http://192.168.1.13:442/api/version (described in the docs as the server version), I got API version 0.8.4.
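
A quick way to print both values side by side is a small script like this (a sketch; it assumes httpx, which ships with Prefect, and reuses the base URL from my setup above):

import httpx

base = "http://192.168.1.13:442/api"
print("server version:", httpx.get(f"{base}/admin/version").text)
print("API version:   ", httpx.get(f"{base}/version").text)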

@tothandor
Author

Hi @jakekaplan ,

Thanks a lot for the detailed explanation!

I have another development server (dev15) that runs Prefect 3 with a somewhat more involved workflow called CountryWeatherFlow, which collects and processes weather data. After failing to upgrade it to rc19, I modelled the dummy ForkMergeFlow on the test server (t84) after it, and did my experimentation and debugging on that simplified flow.

Now I have made a new attempt at the upgrade on the dev server (from rc18 to GA), resetting the database, recreating blocks, and redeploying the workflows. Unlike the test server, the dev server sits behind an Apache 2.4 reverse proxy, so I aligned its configuration to handle both HTTP and websockets (with the wstunnel module).
It seems that the task runs are now registered correctly; only the dependencies are not recognized, and some of the tasks do not appear on the run graph.

I am sorry, but I may have mixed up some of the conclusions from the different servers and believed that task runs could be reliably checked on the run graph. In the end, the blocked websockets caused the problem, as you guessed early on.

Thanks for all your time and efforts to solve this issue!

@lucasdepetrisd

lucasdepetrisd commented Sep 9, 2024

Hi @jakekaplan! I upgraded to version 3.0.1, restarted the server, and let it run over the weekend, but the issue persists. Would you prefer I open a new issue, or should we keep this one closed and continue the conversation here or at the new #15274 issue?

@jakekaplan
Contributor

jakekaplan commented Sep 9, 2024

@lucasdepetrisd I'm going to re-open this issue. The other issue, while showing a similar symptom, seems to be from the client not being able to open a websocket connection to the server, which is how you would get the Service 'EventsWorker' failed with 'n' pending items error log. From what I can see above, you are able to send events successfully to the server, but they are not being inserted as task runs.

We've added some additional server side debug logging. If I could ask you to enable the following settings server side before restarting your server again for them to kick in:

PREFECT_LOGGING_SERVER_LEVEL=DEBUG
PREFECT_DEBUG_MODE=1

(also double check you're on 3.0.1 for both the client and server, although it seems like you are).
I'd expect you to now see the following:

  1. Indication of the TaskRunRecorder service starting
11:27:36.517 | INFO    | prefect.server - TaskRunRecorder service scheduled to start in-app
...
11:27:36.517 | DEBUG   | prefect.server.services.task_run_recorder - TaskRunRecorder started
  2. Record of your task run event being received by the server:
11:28:26.267 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.task-run.Pending with id: 5900a1a5-d3ad-479c-931b-ef9c9c73d151 for resource: prefect.task-run.ae14085a-88f1-43f4-b3d5-c5afb799beb4
  3. Record of your task run event being received by the internal service:
11:28:26.267 | DEBUG   | prefect.server.services.task_run_recorder - Received event: prefect.task-run.Pending with id: 5900a1a5-d3ad-479c-931b-ef9c9c73d151 for resource: prefect.task-run.ae14085a-88f1-43f4-b3d5-c5afb799beb4
  4. Record of your task run being inserted into the DB:
11:28:26.301 | DEBUG   | prefect.server.services.task_run_recorder - Recorded task run state change

The server debug logs can be pretty noisy as there is a lot going on; if you're having difficulty parsing them, you can temporarily disable some of the other services (PREFECT_API_SERVICES_SCHEDULER_ENABLED=False, for example) while debugging. Thanks for your assistance trying to work through this!

@jakekaplan jakekaplan reopened this Sep 9, 2024
@jakekaplan
Contributor

jakekaplan commented Sep 20, 2024

@lucasdepetrisd were you able to investigate any further?

@jakekaplan
Contributor

jakekaplan commented Sep 20, 2024

If anyone is coming to this issue because task runs are not showing up, please follow these debug steps (consolidating a couple of comments from above).

When you execute a task run, it will emit events via websocket to the server. These events are converted and inserted into the task_run and task_run_state tables via the TaskRunRecorder service on the server.

Following the steps below, you should be able to track a task run from execution time, through client-side event emission, server-side receipt, and finally DB insertion. We're looking for any error logs or an indication of the point in the process where something breaks down.

Debug Steps:

1. Confirm that your client and server are both on 3.0.2 (or whatever the latest version is)

2. Setup debug env vars

  • Set PREFECT_DEBUG_MODE=1 on your client (where your flow runs are actually executing, not your worker) and server
  • Set PREFECT_LOGGING_SERVER_LEVEL=DEBUG on your server (and restart so it takes effect)

3. Setup test task

Run the following task (either directly or in a flow, whichever method was leading to you not seeing task runs before):

from prefect import task

@task(log_prints=True)
def hello_task():
    print("Hello, world!")

4. Track the task run through the client:

  • When you run a task, prefect will emit events to the EventsWorker. You'll see a log like this, sending a task run event to the EventsWorker.
16:47:09.303 | DEBUG   | MainThread   | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x106671300> enqueuing item Event(occurred=DateTime(2024, 9, 4, 20, 47, 9, 303661, tzinfo=Timezone('UTC')), event='prefect.task-run.Running', resource=Resource(root={'prefect.resource.id': 'prefect.task-run.49d25245-7972-4b4b-83d8-3868ce7af378', ..., ))
  • The EventsWorker has a client that maintains a websocket connection to the server. When it receives an event you should see a matching log to the one above, like this:
16:47:09.330 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x106671300> handling item Event(occurred=DateTime(2024, 9, 4, 20, 47, 9, 303661, tzinfo=Timezone('UTC')), event='prefect.task-run.Running', resource=Resource(root={'prefect.resource.id': 'prefect.task-run.49d25245-7972-4b4b-83d8-3868ce7af378', ..., )
  • If the EventsWorker cannot send an event for any reason you'll see a log like below, followed by a traceback of what went wrong:
16:58:51.007 | ERROR   | GlobalEventLoopThread | prefect._internal.concurrency - Service 'EventsWorker' failed with 1 pending items.
Traceback (most recent call last):
...

You should see 3 task run state events make their way to the EventsWorker and be emitted: one each for PENDING, RUNNING, and COMPLETED.

5. Track the task run through the server:

The server will accept the task run events via the websocket and you should see the following debug logs:

  1. Record of your task run event being received by the server:
11:28:26.267 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.task-run.Pending with id: 5900a1a5-d3ad-479c-931b-ef9c9c73d151 for resource: prefect.task-run.ae14085a-88f1-43f4-b3d5-c5afb799beb4
  2. Record of your task run event being received by the internal service that will convert it into a full task run:
11:28:26.267 | DEBUG   | prefect.server.services.task_run_recorder - Received event: prefect.task-run.Pending with id: 5900a1a5-d3ad-479c-931b-ef9c9c73d151 for resource: prefect.task-run.ae14085a-88f1-43f4-b3d5-c5afb799beb4
  3. Record of your task run actually being inserted into the DB:
11:28:26.301 | DEBUG   | prefect.server.services.task_run_recorder - Recorded task run state change
  4. After which you should be able to observe the task run and task run states in the DB directly:
-- SELECT * FROM events where resource_id = 'prefect.task-run.49d25245-7972-4b4b-83d8-3868ce7af378' and event = 'prefect.task-run.Running'

  {
    "occurred": "2024-09-04 21:04:10.780012",
    "event": "prefect.task-run.Running",
    "resource_id": "prefect.task-run.49d25245-7972-4b4b-83d8-3868ce7af378",
     ...,
  }
  • Another service will record the event to the task_run and task_run_state tables:
-- SELECT * FROM task_run_state where task_run_id = '49d25245-7972-4b4b-83d8-3868ce7af378' and type = 'RUNNING'
  {
    "id": "9679e882-890a-43fe-a537-7e1be6409beb",
    "created": "2024-09-04 20:47:09.351794",
    "updated": "2024-09-04 20:47:09.352106",
    "type": "RUNNING",
    ...,
  }

@noemtz

noemtz commented Nov 22, 2024

Hi @jakekaplan, you were exactly right: jobs were not able to communicate with the Prefect server API due to an incorrect certificate store configuration (missing CA certificates). I was confused by the failed GlobalEventLoopThread logs seen from the worker pod, especially because I was able to reach the Prefect server over a websocket connection when testing from the worker pod; I didn't realize the failing logs were coming from the jobs. I confirmed this by looking at the job logs and finding the same GlobalEventLoopThread failures, and my ad-hoc websocket test also failed when using the jobs' container image.

The problem was fixed after adding the right certificates into the jobs/flows container image.
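
For anyone hitting the same thing, a check along these lines (a sketch with a placeholder URL and path) can be run from inside the job/flow container image to confirm whether the image's trust store accepts the server certificate before Prefect is involved at all:

import asyncio
import ssl

import websockets

async def check():
    ctx = ssl.create_default_context()  # uses the container image's CA bundle
    # ctx.load_verify_locations("/path/to/internal-ca.pem")  # if a private CA is required (placeholder path)
    async with websockets.connect("wss://prefect.example.internal/api/events/in", ssl=ctx):
        print("TLS handshake and websocket upgrade succeeded")

asyncio.run(check())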

@kavvkon

kavvkon commented Nov 28, 2024

Hi All,

We seem to have run into the same issue using the current prefect server version, running open source.
[...]
Now for the odd part, found this thread and changed the two debug options recommended above, recreated both server & agent and the issue seemed to have gone away.

We will do further debugging this week, seemingly some odd behaviour here.

So far the logs do not indicate any errors except the below; however, we assume this is expected.

Same for me: when I enabled both debugging options, the problem disappeared!

@GalLadislav
Contributor

Hi @kavvkon, @jbw-vtl, @lucasdepetrisd. I believe your issue is probably more closely related to #15607 and the follow-up #16100. If your clients are on Windows, please look at these issues.

@jbw-vtl
Contributor

jbw-vtl commented Nov 28, 2024

Afraid we're running within docker compose on Debian.
So far it has not surfaced again; however, I will report back in case it pops up.

@GalLadislav
Contributor

GalLadislav commented Nov 28, 2024

@jbw-vtl

Hi All,

We seem to have run into the same issue using the current prefect server version, running open source.

prefect version output

Version:             3.1.1
API version:         0.8.4
Python version:      3.11.10
Git commit:          6b50a2b9
Built:               Fri, Nov 8, 2024 12:38 PM
OS/Arch:             win32/AMD64
Profile:             default
Server type:         server
Pydantic version:    2.9.2

We are using a docker agent, both the agent & server running on the same VM within docker compose. Everything is executing as expected, however task runs are not generated.

...

In your first comment you provided prefect version output with OS/Arch: win32/AMD64, which is why I assumed that.
If you executed flows on a Windows client, that might have caused Prefect Server to enter an infinite loop on the TaskRunState table, which affected all other flows running elsewhere. The infinite loop was fixed in 3.1.4.

@flakeman2

flakeman2 commented Dec 6, 2024

I had the same problem upgrading from Prefect 2.20.6 to Prefect 3.1.5 in my k8s instance:

13:22:08.805 | WARNING | prefect.events.clients - Unable to connect to 'wss://workflows.arte.adobe.net/api/events/in'. Please check your network settings to ensure websocket connections to the API are allowed. Otherwise event data (including task run data) may be lost. Reason: server rejected WebSocket connection: HTTP 403. Set PREFECT_DEBUG_MODE=1 to see the full error.
13:22:08.805 | ERROR   | GlobalEventLoopThread | prefect._internal.concurrency - Service 'EventsWorker' failed with 11 pending items.

Thanks to @zzstoatzz for confirming that outbound websockets were the issue. Updating my IngressRoute fixed it: enableWebsockets: true

apiVersion: contour.heptio.com/v1beta1
kind: IngressRoute
metadata:
  name: prefect-ingress-tls
  labels:
    environment: production
    app: prefect
spec:
  virtualhost:
    fqdn: workflows.arte.adobe.net
    tls:
      secretName: arte-star-tls
  routes:
    - enableWebsockets: true
      match: /
      timeout: 900s
      permitInsecure: false
      services:
        - name: prefect-server
          port: 4200

@jbw-vtl
Contributor

jbw-vtl commented Dec 19, 2024

Also ran into the websocket-related issue following an upgrade to 3.1.8, for both the OSS server and agent, using Python 3.12.
Potentially this was the same issue we had a while ago; hard to say, as we didn't have verbose logs.

We use a fairly basic docker compose setup, with both the agent and server being in the same docker network.

The agent communicates directly with the server using http://server:4200/api (server being the service name for the server container).

This is running on Debian Linux.

The flows show subflow runs in the task_run table and UI, but not the actual task runs.

Any help is greatly appreciated; despite recreating all the server and agent containers, we can't seem to resolve this one.

Have not yet tried resetting the whole database.

prefect-agent  | 13:02:48.076 | WARNING | prefect.events.clients - Unable to connect to 'ws://server:4200/api/events/in'. Please check your network settings to ensure websocket connections to the API are allowed. Otherwise event data (including task run data) may be lost. Reason: did not receive a valid HTTP response. Set PREFECT_DEBUG_MODE=1 to see the full error.
prefect-agent  | Traceback (most recent call last):
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 134, in read_http_response
prefect-agent  |     status_code, reason, headers = await read_response(self.reader)
prefect-agent  |                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/http.py", line 130, in read_response
prefect-agent  |     raise ValueError(f"unsupported HTTP version: {d(version)}")
prefect-agent  | ValueError: unsupported HTTP version: HTTP/1.0
prefect-agent  |
prefect-agent  | The above exception was the direct cause of the following exception:
prefect-agent  |
prefect-agent  | Traceback (most recent call last):
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/clients.py", line 348, in _reconnect
prefect-agent  |     self._websocket = await self._connect.__aenter__()
prefect-agent  |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 635, in __aenter__
prefect-agent  |     return await self
prefect-agent  |            ^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/clients.py", line 123, in _proxy_connect
prefect-agent  |     proto = await self.__await_impl__()
prefect-agent  |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 656, in __await_impl__
prefect-agent  |     await protocol.handshake(
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 316, in handshake
prefect-agent  |     status_code, response_headers = await self.read_http_response()
prefect-agent  |                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 136, in read_http_response
prefect-agent  |     raise InvalidMessage("did not receive a valid HTTP response") from exc
prefect-agent  | websockets.legacy.exceptions.InvalidMessage: did not receive a valid HTTP response
prefect-agent  | 13:02:48.079 | ERROR   | GlobalEventLoopThread | prefect._internal.concurrency - Service 'EventsWorker' failed with 1 pending items.
prefect-agent  | Traceback (most recent call last):
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 134, in read_http_response
prefect-agent  |     status_code, reason, headers = await read_response(self.reader)
prefect-agent  |                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/http.py", line 130, in read_response
prefect-agent  |     raise ValueError(f"unsupported HTTP version: {d(version)}")
prefect-agent  | ValueError: unsupported HTTP version: HTTP/1.0
prefect-agent  |
prefect-agent  | The above exception was the direct cause of the following exception:
prefect-agent  |
prefect-agent  | Traceback (most recent call last):
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/prefect/_internal/concurrency/services.py", line 125, in _run
prefect-agent  |     async with self._lifespan():
prefect-agent  |   File "/usr/local/lib/python3.11/contextlib.py", line 204, in __aenter__
prefect-agent  |     return await anext(self.gen)
prefect-agent  |            ^^^^^^^^^^^^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/worker.py", line 70, in _lifespan
prefect-agent  |     async with self._client:
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/clients.py", line 329, in __aenter__
prefect-agent  |     await self._reconnect()
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/clients.py", line 348, in _reconnect
prefect-agent  |     self._websocket = await self._connect.__aenter__()
prefect-agent  |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 635, in __aenter__
prefect-agent  |     return await self
prefect-agent  |            ^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/clients.py", line 123, in _proxy_connect
prefect-agent  |     proto = await self.__await_impl__()
prefect-agent  |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 656, in __await_impl__
prefect-agent  |     await protocol.handshake(
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 316, in handshake
prefect-agent  |     status_code, response_headers = await self.read_http_response()
prefect-agent  |                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 136, in read_http_response
prefect-agent  |     raise InvalidMessage("did not receive a valid HTTP response") from exc
prefect-agent  | websockets.legacy.exceptions.InvalidMessage: did not receive a valid HTTP response

I have ruled out the basic connection not working; I can successfully connect from within the agent container to the websocket on the server using something like the below:

import asyncio
import websockets

def test_url(url, data="{}"):
    async def inner():
        print(f"Connecting to {url}")
        async with websockets.connect(url) as websocket:
            print("Connected")
            await websocket.send(data)
            print("Successfully send data")
    return asyncio.run(inner())

test_url("ws://server:4200/api/events/in")

The schema of course is wrong (sending an empty JSON), but the connection itself succeeds, with the server failing when validating the request with pydantic:

prefect-server  | 13:30:53.822 | DEBUG   | uvicorn.error - = connection is CONNECTING
prefect-server  | 13:30:53.822 | DEBUG   | uvicorn.error - < GET /api/events/in HTTP/1.1
prefect-server  | 13:30:53.824 | DEBUG   | uvicorn.error - < host: server:4200
prefect-server  | 13:30:53.824 | DEBUG   | uvicorn.error - < upgrade: websocket
prefect-server  | 13:30:53.825 | DEBUG   | uvicorn.error - < connection: Upgrade
prefect-server  | 13:30:53.825 | DEBUG   | uvicorn.error - < sec-websocket-key: vy0jD//Qu2DgDAqJFIul2A==
prefect-server  | 13:30:53.825 | DEBUG   | uvicorn.error - < sec-websocket-version: 13
prefect-server  | 13:30:53.825 | DEBUG   | uvicorn.error - < sec-websocket-extensions: permessage-deflate; client_max_window_bits
prefect-server  | 13:30:53.826 | DEBUG   | uvicorn.error - < user-agent: Python/3.12 websockets/13.1
prefect-server  | 13:30:53.826 | INFO    | uvicorn.error - ('192.168.144.4', 47446) - "WebSocket /api/events/in" [accepted]
prefect-server  | 13:30:53.829 | DEBUG   | uvicorn.error - > HTTP/1.1 101 Switching Protocols
prefect-server  | 13:30:53.829 | DEBUG   | uvicorn.error - > Upgrade: websocket
prefect-server  | 13:30:53.829 | DEBUG   | uvicorn.error - > Connection: Upgrade
prefect-server  | 13:30:53.829 | DEBUG   | uvicorn.error - > Sec-WebSocket-Accept: cZajcYowAa8XeBKtGzVgoH6Ji1E=
prefect-server  | 13:30:53.830 | DEBUG   | uvicorn.error - > Sec-WebSocket-Extensions: permessage-deflate
prefect-server  | 13:30:53.830 | DEBUG   | uvicorn.error - > date: Thu, 19 Dec 2024 13:30:53 GMT
prefect-server  | 13:30:53.830 | DEBUG   | uvicorn.error - > server: uvicorn
prefect-server  | 13:30:53.831 | INFO    | uvicorn.error - connection open
prefect-server  | 13:30:53.831 | DEBUG   | uvicorn.error - = connection is OPEN
prefect-server  | 13:30:53.838 | DEBUG   | uvicorn.error - < TEXT '{}' [2 bytes]
prefect-server  | 13:30:53.838 | DEBUG   | uvicorn.error - < CLOSE 1000 (OK) [2 bytes]
prefect-server  | 13:30:53.838 | DEBUG   | uvicorn.error - = connection is CLOSING
prefect-server  | 13:30:53.839 | DEBUG   | uvicorn.error - > CLOSE 1000 (OK) [2 bytes]
prefect-server  | 13:30:53.839 | DEBUG   | uvicorn.error - x half-closing TCP connection
prefect-server  | 13:30:53.840 | ERROR   | uvicorn.error - Exception in ASGI application
...
prefect-server  | occurred
prefect-server  |   Field required [type=missing, input_value={}, input_type=dict]
prefect-server  |     For further information visit https://errors.pydantic.dev/2.10/v/missing
prefect-server  | event
prefect-server  |   Field required [type=missing, input_value={}, input_type=dict]
prefect-server  |     For further information visit https://errors.pydantic.dev/2.10/v/missing
prefect-server  | resource
prefect-server  |   Field required [type=missing, input_value={}, input_type=dict]
prefect-server  |     For further information visit https://errors.pydantic.dev/2.10/v/missing
prefect-server  | id
prefect-server  |   Field required [type=missing, input_value={}, input_type=dict]
prefect-server  |     For further information visit https://errors.pydantic.dev/2.10/v/missing
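Based purely on those validation errors, a minimal payload that supplies the flagged fields might look like the sketch below (reusing test_url from above). The event name, resource id, and overall shape here are guesses rather than the documented schema:

import json
import uuid
from datetime import datetime, timezone

# Hypothetical minimal event; fields guessed from the "Field required" errors above.
minimal_event = {
    "id": str(uuid.uuid4()),
    "occurred": datetime.now(timezone.utc).isoformat(),
    "event": "external.test.event",
    "resource": {"prefect.resource.id": "external.test.resource"},
}

test_url("ws://server:4200/api/events/in", data=json.dumps(minimal_event))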

@jbw-vtl
Copy link
Contributor

jbw-vtl commented Dec 20, 2024

Just providing a bit more info: we tried fully resetting and recreating the database, and the same issue occurs on 3.1.8.
We are struggling to downgrade to earlier versions because installing the Docker extras via EXTRA_PIP_PACKAGES=prefect-docker in the agent container seems to also upgrade prefect.

@jakekaplan
Copy link
Contributor

jakekaplan commented Dec 20, 2024

Hi @jbw-vtl:

You'll see the error in your logs is:

prefect-agent  | 13:02:48.076 | WARNING | prefect.events.clients - Unable to connect to 'ws://server:4200/api/events/in'. Please check your network settings to ensure websocket connections to the API are allowed. Otherwise event data (including task run data) may be lost. Reason: did not receive a valid HTTP response. Set PREFECT_DEBUG_MODE=1 to see the full error.
prefect-agent  | Traceback (most recent call last):
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 134, in read_http_response
prefect-agent  |     status_code, reason, headers = await read_response(self.reader)
prefect-agent  |                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
prefect-agent  |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/http.py", line 130, in read_response
prefect-agent  |     raise ValueError(f"unsupported HTTP version: {d(version)}")
prefect-agent  | ValueError: unsupported HTTP version: HTTP/1.0

WebSockets in general are only supported on HTTP/1.1 or newer, and the underlying HTTP/websocket libraries Prefect uses should only communicate over HTTP/1.1 or HTTP/2.

In the test you ran, you can see the server respond with the correct HTTP version:

prefect-server  | 13:30:53.822 | DEBUG   | uvicorn.error - < GET /api/events/in HTTP/1.1

Some questions that may help debug:

  • What version were you on previously?
  • Is it possible that something else changed here? Are you able to share your docker compose?
  • Do you have any intermediate networking that could be downgrading the HTTP version (like a proxy or load balancer)? A quick way to check is sketched below.
  • Do you have the server logs from when you see the error client side?
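To help with the last two questions, one quick way to see which HTTP version the client actually receives (and whether something in between is downgrading it) is a check like the one below. This is only a sketch; it assumes httpx is installed in the agent container and that the server's /api/health endpoint is reachable:

import httpx

# Hedged sketch: print the HTTP version of the response as seen by the client.
# "HTTP/1.1" (or "HTTP/2") is expected; "HTTP/1.0" would match the error above.
resp = httpx.get("http://server:4200/api/health")
print(resp.http_version, resp.status_code)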

@jbw-vtl
Copy link
Contributor

jbw-vtl commented Dec 20, 2024

We were previously running 3.1.5.

We do use nginx as a reverse proxy for SSL when exposing the Prefect server; however, we can reproduce the issue with just a server and an agent in Docker Compose, without any further components.
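(For reference, when the nginx proxy is in the path, the websocket endpoints generally have to be forwarded over HTTP/1.1 with explicit upgrade headers, since nginx defaults to HTTP/1.0 for proxied requests. A rough sketch only, with a placeholder upstream name and port:)

# sketch only, not our actual nginx config
location /api/events/ {
    proxy_pass http://server:4200;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}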
Below is as simple a configuration as I could put together that reproduces the issue.

Full docker compose below (company-related content in <...> excluded, otherwise everything as is):

---
services:
  server:
    image: nexus.<company>.com/docker/pyc/<company>-prefect:3.1.8-python3.12
    container_name: pycom-prefect-server
    restart: always
    # Temporary workaround to not block prefect deployments and local use while we explore EntraID integration
    ports:
      - "4200:4200"
    entrypoint: ["/opt/prefect/entrypoint.sh", "prefect", "server", "start"]
    # We do not expose the prefect service directly, instead using Nginx to enforce SSL and later authentication / authorization
    environment:
      - PREFECT_UI_URL=http://<ec2-instance>:4200
      - PREFECT_API_URL=http://<ec2-instance>:4200/api
      - PREFECT_SERVER_API_HOST=0.0.0.0
      - PREFECT_API_DATABASE_CONNECTION_URL=postgresql+asyncpg://<...>
      - PREFECT_DEBUG_MODE=1
      - PREFECT_LOGGING_SERVER_LEVEL=DEBUG

  agent:
    image: nexus.<company>.com/docker/pyc/<company>-prefect:3.1.8-python3.12
    container_name: pycom-prefect-agent
    restart: always
    entrypoint: ["/opt/prefect/entrypoint.sh", "prefect", "worker", "start", "--pool", "docker-pool", "--work-queue", "default", "--type", "docker"]
    environment:
      - PREFECT_API_URL=http://server:4200/api
      - PREFECT_LOGGING_LEVEL=DEBUG
      - PREFECT_DEBUG_MODE=1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /usr/bin/docker:/usr/bin/docker
    depends_on:
      - server
    links:
      - server

networks:
  default:
    name: pycom-prefect-network

We run a slightly modified version of the Prefect base image because the VM itself is air-gapped, so installing prefect[docker] via the EXTRA_PIP_PACKAGES mechanism was not an option.

# Slightly modified version of the Prefect base image to include Docker dependencies for an air-gapped environment
FROM prefecthq/prefect:3.1.8-python3.12

RUN pip install prefect[docker]==3.1.8 --no-cache-dir

Full debug logs of both the server and the agent are below, following a deployment manually triggered through the UI:

 pycom-prefect-server  | 20:05:04.702 | INFO    | uvicorn.access - 172.24.0.3:42606 - "GET /api/deployments/4e779212-3425-4019-854f-358417860d46 HTTP/1.1" 200
pycom-prefect-server  | 20:05:04.719 | INFO    | uvicorn.access - 172.24.0.3:42606 - "GET /api/flows/efc28063-d9d2-4631-b98b-cac3ded99963 HTTP/1.1" 200
pycom-prefect-agent   | 20:05:04.733 | DEBUG   | MainThread   | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x7f0d293b2510> enqueuing item Event(occurred=DateTime(2024, 12, 20, 20, 5, 4, 733002, tzinfo=Timezone('UTC')), event='prefect.worker.submitted-flow-run', resource=Resource(root={'prefect.resource.id': 'prefect.worker.docker.dockerworker-d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.resource.name': 'DockerWorker d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.version': '3.1.8', 'prefect.worker-type': 'docker'}), related=[RelatedResource(root={'prefect.resource.id': 'prefect.deployment.4e779212-3425-4019-854f-358417860d46', 'prefect.resource.role': 'deployment', 'prefect.resource.name': 'update_nexus_repo_packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow.efc28063-d9d2-4631-b98b-cac3ded99963', 'prefect.resource.role': 'flow', 'prefect.resource.name': 'scan-nexus-packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow-run.46f10173-5139-4276-8ee2-aa4ecda242b7', 'prefect.resource.role': 'flow-run', 'prefect.resource.name': 'realistic-caracal'}), RelatedResource(root={'prefect.resource.id': 'prefect.work-pool.446e8d8e-394c-4722-a65d-0e1b749db8f1', 'prefect.resource.role': 'work-pool', 'prefect.resource.name': 'docker-pool'})], payload={}, id=UUID('bd936d3b-f114-42d0-a343-91d51f838bef'), follows=None)
pycom-prefect-agent   | 20:05:04.734 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=11.57> exited
pycom-prefect-agent   | 20:05:04.734 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - Finished call get(<dropped>)
pycom-prefect-agent   | 20:05:04.734 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x7f0d293b2510> handling item Event(occurred=DateTime(2024, 12, 20, 20, 5, 4, 733002, tzinfo=Timezone('UTC')), event='prefect.worker.submitted-flow-run', resource=Resource(root={'prefect.resource.id': 'prefect.worker.docker.dockerworker-d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.resource.name': 'DockerWorker d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.version': '3.1.8', 'prefect.worker-type': 'docker'}), related=[RelatedResource(root={'prefect.resource.id': 'prefect.deployment.4e779212-3425-4019-854f-358417860d46', 'prefect.resource.role': 'deployment', 'prefect.resource.name': 'update_nexus_repo_packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow.efc28063-d9d2-4631-b98b-cac3ded99963', 'prefect.resource.role': 'flow', 'prefect.resource.name': 'scan-nexus-packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow-run.46f10173-5139-4276-8ee2-aa4ecda242b7', 'prefect.resource.role': 'flow-run', 'prefect.resource.name': 'realistic-caracal'}), RelatedResource(root={'prefect.resource.id': 'prefect.work-pool.446e8d8e-394c-4722-a65d-0e1b749db8f1', 'prefect.resource.role': 'work-pool', 'prefect.resource.name': 'docker-pool'})], payload={}, id=UUID('bd936d3b-f114-42d0-a343-91d51f838bef'), follows=None)
pycom-prefect-agent   | 20:05:04.735 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - Running call get() in thread 'EventsWorkerThread'
pycom-prefect-agent   | 20:05:04.735 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-server  | 20:05:04.735 | DEBUG   | uvicorn.error - < TEXT '{"occurred":"2024-12-20T20:05:04.733002Z","even...838bef","follows":null}' [1051 bytes]
pycom-prefect-server  | 20:05:04.736 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.worker.submitted-flow-run with id: bd936d3b-f114-42d0-a343-91d51f838bef for resource: prefect.worker.docker.dockerworker-d142205c-9bb2-484a-a26a-d562f76b2b10
pycom-prefect-server  | 20:05:04.737 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.worker.submitted-flow-run with id: bd936d3b-f114-42d0-a343-91d51f838bef for resource: prefect.worker.docker.dockerworker-d142205c-9bb2-484a-a26a-d562f76b2b10
pycom-prefect-agent   | 20:05:04.747 | INFO    | prefect.workers.docker.dockerworker d142205c-9bb2-484a-a26a-d562f76b2b10 - Creating Docker container 'realistic-caracal'...
pycom-prefect-agent   | 20:05:04.796 | INFO    | prefect.workers.docker.dockerworker d142205c-9bb2-484a-a26a-d562f76b2b10 - Docker container 'realistic-caracal' has status 'created'
pycom-prefect-agent   | 20:05:04.797 | DEBUG   | AnyIO worker thread | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x7f0d293b2510> enqueuing item Event(occurred=DateTime(2024, 12, 20, 20, 5, 4, 797202, tzinfo=Timezone('UTC')), event='prefect.docker.container.created', resource=Resource(root={'prefect.resource.id': 'prefect.docker.container.352c5ca8ff5d96dfde1db502fc313c251bd0bd187f388b1dda4ae24a95191b5e', 'prefect.resource.name': 'realistic-caracal'}), related=[RelatedResource(root={'prefect.resource.id': 'prefect.deployment.4e779212-3425-4019-854f-358417860d46', 'prefect.resource.role': 'deployment', 'prefect.resource.name': 'update_nexus_repo_packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow.efc28063-d9d2-4631-b98b-cac3ded99963', 'prefect.resource.role': 'flow', 'prefect.resource.name': 'scan-nexus-packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow-run.46f10173-5139-4276-8ee2-aa4ecda242b7', 'prefect.resource.role': 'flow-run', 'prefect.resource.name': 'realistic-caracal'}), RelatedResource(root={'prefect.resource.id': 'prefect.work-pool.446e8d8e-394c-4722-a65d-0e1b749db8f1', 'prefect.resource.role': 'work-pool', 'prefect.resource.name': 'docker-pool'}), RelatedResource(root={'prefect.resource.id': 'prefect.worker.docker.dockerworker-d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.resource.name': 'DockerWorker d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.version': '3.1.8', 'prefect.worker-type': 'docker', 'prefect.resource.role': 'worker'})], payload={}, id=UUID('e2c9ff93-bc48-4420-b049-dc48f9e4a7b3'), follows=None)
pycom-prefect-agent   | 20:05:04.798 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=0.06> exited
pycom-prefect-agent   | 20:05:04.798 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - Finished call get(<dropped>)
pycom-prefect-agent   | 20:05:04.799 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x7f0d293b2510> handling item Event(occurred=DateTime(2024, 12, 20, 20, 5, 4, 797202, tzinfo=Timezone('UTC')), event='prefect.docker.container.created', resource=Resource(root={'prefect.resource.id': 'prefect.docker.container.352c5ca8ff5d96dfde1db502fc313c251bd0bd187f388b1dda4ae24a95191b5e', 'prefect.resource.name': 'realistic-caracal'}), related=[RelatedResource(root={'prefect.resource.id': 'prefect.deployment.4e779212-3425-4019-854f-358417860d46', 'prefect.resource.role': 'deployment', 'prefect.resource.name': 'update_nexus_repo_packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow.efc28063-d9d2-4631-b98b-cac3ded99963', 'prefect.resource.role': 'flow', 'prefect.resource.name': 'scan-nexus-packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow-run.46f10173-5139-4276-8ee2-aa4ecda242b7', 'prefect.resource.role': 'flow-run', 'prefect.resource.name': 'realistic-caracal'}), RelatedResource(root={'prefect.resource.id': 'prefect.work-pool.446e8d8e-394c-4722-a65d-0e1b749db8f1', 'prefect.resource.role': 'work-pool', 'prefect.resource.name': 'docker-pool'}), RelatedResource(root={'prefect.resource.id': 'prefect.worker.docker.dockerworker-d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.resource.name': 'DockerWorker d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.version': '3.1.8', 'prefect.worker-type': 'docker', 'prefect.resource.role': 'worker'})], payload={}, id=UUID('e2c9ff93-bc48-4420-b049-dc48f9e4a7b3'), follows=None)
pycom-prefect-agent   | 20:05:04.800 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - Running call get() in thread 'EventsWorkerThread'
pycom-prefect-agent   | 20:05:04.800 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-server  | 20:05:04.799 | DEBUG   | uvicorn.error - < TEXT '{"occurred":"2024-12-20T20:05:04.797202Z","even...e4a7b3","follows":null}' [1243 bytes]
pycom-prefect-server  | 20:05:04.800 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.docker.container.created with id: e2c9ff93-bc48-4420-b049-dc48f9e4a7b3 for resource: prefect.docker.container.352c5ca8ff5d96dfde1db502fc313c251bd0bd187f388b1dda4ae24a95191b5e
pycom-prefect-server  | 20:05:04.803 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.docker.container.created with id: e2c9ff93-bc48-4420-b049-dc48f9e4a7b3 for resource: prefect.docker.container.352c5ca8ff5d96dfde1db502fc313c251bd0bd187f388b1dda4ae24a95191b5e
pycom-prefect-agent   | 20:05:05.148 | INFO    | prefect.workers.docker.dockerworker d142205c-9bb2-484a-a26a-d562f76b2b10 - Docker container 'realistic-caracal' has status 'running'
pycom-prefect-agent   | 20:05:05.150 | DEBUG   | AnyIO worker thread | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x7f0d293b2510> enqueuing item Event(occurred=DateTime(2024, 12, 20, 20, 5, 5, 149608, tzinfo=Timezone('UTC')), event='prefect.docker.container.running', resource=Resource(root={'prefect.resource.id': 'prefect.docker.container.352c5ca8ff5d96dfde1db502fc313c251bd0bd187f388b1dda4ae24a95191b5e', 'prefect.resource.name': 'realistic-caracal'}), related=[RelatedResource(root={'prefect.resource.id': 'prefect.deployment.4e779212-3425-4019-854f-358417860d46', 'prefect.resource.role': 'deployment', 'prefect.resource.name': 'update_nexus_repo_packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow.efc28063-d9d2-4631-b98b-cac3ded99963', 'prefect.resource.role': 'flow', 'prefect.resource.name': 'scan-nexus-packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow-run.46f10173-5139-4276-8ee2-aa4ecda242b7', 'prefect.resource.role': 'flow-run', 'prefect.resource.name': 'realistic-caracal'}), RelatedResource(root={'prefect.resource.id': 'prefect.work-pool.446e8d8e-394c-4722-a65d-0e1b749db8f1', 'prefect.resource.role': 'work-pool', 'prefect.resource.name': 'docker-pool'}), RelatedResource(root={'prefect.resource.id': 'prefect.worker.docker.dockerworker-d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.resource.name': 'DockerWorker d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.version': '3.1.8', 'prefect.worker-type': 'docker', 'prefect.resource.role': 'worker'})], payload={}, id=UUID('02c99b08-fd78-4532-9bb0-c7975973699c'), follows=UUID('e2c9ff93-bc48-4420-b049-dc48f9e4a7b3'))
pycom-prefect-agent   | 20:05:05.151 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=0.35> exited
pycom-prefect-agent   | 20:05:05.151 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - Finished call get(<dropped>)
pycom-prefect-agent   | 20:05:05.154 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x7f0d293b2510> handling item Event(occurred=DateTime(2024, 12, 20, 20, 5, 5, 149608, tzinfo=Timezone('UTC')), event='prefect.docker.container.running', resource=Resource(root={'prefect.resource.id': 'prefect.docker.container.352c5ca8ff5d96dfde1db502fc313c251bd0bd187f388b1dda4ae24a95191b5e', 'prefect.resource.name': 'realistic-caracal'}), related=[RelatedResource(root={'prefect.resource.id': 'prefect.deployment.4e779212-3425-4019-854f-358417860d46', 'prefect.resource.role': 'deployment', 'prefect.resource.name': 'update_nexus_repo_packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow.efc28063-d9d2-4631-b98b-cac3ded99963', 'prefect.resource.role': 'flow', 'prefect.resource.name': 'scan-nexus-packages'}), RelatedResource(root={'prefect.resource.id': 'prefect.flow-run.46f10173-5139-4276-8ee2-aa4ecda242b7', 'prefect.resource.role': 'flow-run', 'prefect.resource.name': 'realistic-caracal'}), RelatedResource(root={'prefect.resource.id': 'prefect.work-pool.446e8d8e-394c-4722-a65d-0e1b749db8f1', 'prefect.resource.role': 'work-pool', 'prefect.resource.name': 'docker-pool'}), RelatedResource(root={'prefect.resource.id': 'prefect.worker.docker.dockerworker-d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.resource.name': 'DockerWorker d142205c-9bb2-484a-a26a-d562f76b2b10', 'prefect.version': '3.1.8', 'prefect.worker-type': 'docker', 'prefect.resource.role': 'worker'})], payload={}, id=UUID('02c99b08-fd78-4532-9bb0-c7975973699c'), follows=UUID('e2c9ff93-bc48-4420-b049-dc48f9e4a7b3'))
pycom-prefect-agent   | 20:05:05.155 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - Running call get() in thread 'EventsWorkerThread'
pycom-prefect-agent   | 20:05:05.155 | DEBUG   | EventsWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-server  | 20:05:05.155 | DEBUG   | uvicorn.error - < TEXT '{"occurred":"2024-12-20T20:05:05.149608Z","even...420-b049-dc48f9e4a7b3"}' [1277 bytes]
pycom-prefect-server  | 20:05:05.156 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.docker.container.running with id: 02c99b08-fd78-4532-9bb0-c7975973699c for resource: prefect.docker.container.352c5ca8ff5d96dfde1db502fc313c251bd0bd187f388b1dda4ae24a95191b5e
pycom-prefect-server  | 20:05:05.157 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.docker.container.running with id: 02c99b08-fd78-4532-9bb0-c7975973699c for resource: prefect.docker.container.352c5ca8ff5d96dfde1db502fc313c251bd0bd187f388b1dda4ae24a95191b5e
pycom-prefect-server  | 20:05:05.165 | INFO    | uvicorn.access - 172.24.0.3:42606 - "PATCH /api/flow_runs/46f10173-5139-4276-8ee2-aa4ecda242b7 HTTP/1.1" 204
pycom-prefect-agent   | 20:05:05.168 | DEBUG   | MainThread   | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f0d2ddf7560> enqueuing item {'name': 'prefect.flow_runs.worker', 'level': 20, 'message': "Completed submission of flow run '46f10173-5139-4276-8ee2-aa4ecda242b7'", 'timestamp': '2024-12-20T20:05:05.167893Z', 'flow_run_id': '46f10173-5139-4276-8ee2-aa4ecda242b7', 'task_run_id': None, '__payload_size__': 255}
pycom-prefect-agent   | 20:05:05.167 | INFO    | prefect.flow_runs.worker - Completed submission of flow run '46f10173-5139-4276-8ee2-aa4ecda242b7'
pycom-prefect-agent   | 20:05:05.169 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=0.64> exited
pycom-prefect-agent   | 20:05:05.169 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Finished call get(<dropped>)
pycom-prefect-agent   | 20:05:05.170 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f0d2ddf7560> added item {'name': 'prefect.flow_runs.worker', 'level': 20, 'message': "Completed submission of flow run '46f10173-5139-4276-8ee2-aa4ecda242b7'", 'timestamp': '2024-12-20T20:05:05.167893Z', 'flow_run_id': '46f10173-5139-4276-8ee2-aa4ecda242b7', 'task_run_id': None} to batch (size 556/3000000)
pycom-prefect-agent   | 20:05:05.170 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.359314669854939) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:05.170 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-server  | 20:05:05.294 | INFO    | uvicorn.access - 10.122.16.154:63760 - "GET /assets/FlowRun-Bldb8VNX.js HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.327 | INFO    | uvicorn.access - 10.122.16.154:63763 - "GET /assets/FlowRun-CrGsvBik.css HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.389 | INFO    | uvicorn.access - 10.122.16.154:63763 - "GET /favicon-dark.ico HTTP/1.1" 304
pycom-prefect-server  | 20:05:05.658 | INFO    | uvicorn.access - 10.122.16.154:63756 - "OPTIONS /api/flow_runs/46f10173-5139-4276-8ee2-aa4ecda242b7 HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.664 | INFO    | uvicorn.access - 10.122.16.154:63763 - "GET /ico/undefined.svg HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.725 | INFO    | uvicorn.access - 10.122.16.154:63756 - "GET /api/flow_runs/46f10173-5139-4276-8ee2-aa4ecda242b7 HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.830 | INFO    | uvicorn.access - 10.122.16.154:63756 - "OPTIONS /api/work_pools/docker-pool/queues/default HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.832 | INFO    | uvicorn.access - 10.122.16.154:63752 - "OPTIONS /api/events/filter HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.849 | INFO    | uvicorn.access - 10.122.16.154:63751 - "POST /api/task_runs/count HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.853 | INFO    | uvicorn.access - 10.122.16.154:63763 - "GET /ico/pending.svg HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.861 | INFO    | uvicorn.access - 10.122.16.154:63756 - "OPTIONS /api/flows/efc28063-d9d2-4631-b98b-cac3ded99963 HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.863 | INFO    | uvicorn.access - 10.122.16.154:63752 - "OPTIONS /api/deployments/4e779212-3425-4019-854f-358417860d46 HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.871 | INFO    | uvicorn.access - 10.122.16.154:63766 - "OPTIONS /api/logs/filter HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.889 | INFO    | uvicorn.access - 10.122.16.154:63765 - "GET /api/work_pools/docker-pool HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.909 | DEBUG   | prefect.server.events.storage.database - Running PostgreSQL query: SELECT events.occurred, events.event, events.resource_id, events.resource, events.related_resource_ids, events.related, events.payload, events.received, events.recorded, events.follows, events.id, events.created, events.updated
pycom-prefect-server  | FROM events
pycom-prefect-server  | WHERE events.occurred >= :occurred_1 AND events.occurred <= :occurred_2 AND events.event IN (__[POSTCOMPILE_event_1]) AND ((events.resource_id LIKE :resource_id_1 || '%') OR (events.resource_id LIKE :resource_id_2 || '%')) AND events.id IN (SELECT event_resources.event_id
pycom-prefect-server  | FROM event_resources
pycom-prefect-server  | WHERE event_resources.occurred >= :occurred_3 AND event_resources.occurred <= :occurred_4 AND event_resources.resource_id IN (__[POSTCOMPILE_resource_id_3]) AND event_resources.resource_role != :resource_role_1) ORDER BY events.occurred DESC
pycom-prefect-server  |  LIMIT :param_1 OFFSET :param_2
pycom-prefect-server  | 20:05:05.924 | INFO    | uvicorn.access - 10.122.16.154:63767 - "GET /api/work_pools/docker-pool/queues/default HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.927 | INFO    | uvicorn.access - 10.122.16.154:63751 - "POST /api/events/filter HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.932 | INFO    | uvicorn.access - 10.122.16.154:63756 - "GET /api/flows/efc28063-d9d2-4631-b98b-cac3ded99963 HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.938 | INFO    | uvicorn.access - 10.122.16.154:63766 - "POST /api/logs/filter HTTP/1.1" 200
pycom-prefect-server  | 20:05:05.960 | INFO    | uvicorn.access - 10.122.16.154:63752 - "GET /api/deployments/4e779212-3425-4019-854f-358417860d46 HTTP/1.1" 200
pycom-prefect-agent   | 20:05:06.530 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=1.36> exited
pycom-prefect-agent   | 20:05:06.530 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Encountered exception in call get(<dropped>)
pycom-prefect-agent   | Traceback (most recent call last):
pycom-prefect-agent   |   File "/usr/local/lib/python3.12/site-packages/prefect/_internal/concurrency/calls.py", line 364, in _run_sync
pycom-prefect-agent   |     result = self.fn(*self.args, **self.kwargs)
pycom-prefect-agent   |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/usr/local/lib/python3.12/queue.py", line 179, in get
pycom-prefect-agent   |     raise Empty
pycom-prefect-agent   | _queue.Empty
pycom-prefect-agent   | 20:05:06.532 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f0d2ddf7560> processing batch of size 556
pycom-prefect-server  | 20:05:06.537 | INFO    | uvicorn.access - 172.24.0.3:42612 - "GET /api/csrf-token?client=37b3c344-2402-4159-94e7-15465757b872 HTTP/1.1" 422
pycom-prefect-server  | 20:05:06.559 | INFO    | uvicorn.access - 172.24.0.3:42612 - "POST /api/logs/ HTTP/1.1" 201
pycom-prefect-agent   | 20:05:06.562 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.9999768850393593) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:06.562 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-server  | 20:05:07.015 | DEBUG   | prefect.server.services.flowrunnotifications - About to run FlowRunNotifications...
pycom-prefect-server  | 20:05:07.029 | DEBUG   | prefect.server.services.flowrunnotifications - Got 0 notifications from queue.
pycom-prefect-server  | 20:05:07.031 | DEBUG   | prefect.server.services.flowrunnotifications - Finished running FlowRunNotifications. Next run at 2024-12-20 20:05:11.015359+00:00
pycom-prefect-server  | 20:05:08.025 | DEBUG   | prefect.server.services.recentdeploymentsscheduler - About to run RecentDeploymentsScheduler...
pycom-prefect-server  | 20:05:08.028 | DEBUG   | prefect.server.services.marklateruns - About to run MarkLateRuns...
pycom-prefect-server  | 20:05:08.030 | DEBUG   | prefect.server.services.failexpiredpauses - About to run FailExpiredPauses...
pycom-prefect-server  | 20:05:08.052 | INFO    | prefect.server.services.marklateruns - Finished monitoring for late runs.
pycom-prefect-server  | 20:05:08.052 | DEBUG   | prefect.server.services.marklateruns - Finished running MarkLateRuns. Next run at 2024-12-20 20:05:13.028971+00:00
pycom-prefect-server  | 20:05:08.053 | INFO    | prefect.server.services.failexpiredpauses - Finished monitoring for late runs.
pycom-prefect-server  | 20:05:08.053 | DEBUG   | prefect.server.services.failexpiredpauses - Finished running FailExpiredPauses. Next run at 2024-12-20 20:05:13.030366+00:00
pycom-prefect-server  | 20:05:08.073 | DEBUG   | prefect.server.services.proactivetriggers - About to run ProactiveTriggers...
pycom-prefect-server  | 20:05:08.073 | DEBUG   | prefect.server.services.proactivetriggers - Finished running ProactiveTriggers. Next run at 2024-12-20 20:05:13.073200+00:00
pycom-prefect-server  | 20:05:08.134 | DEBUG   | prefect.server.events.services.event_persister - Persisting 5 events...
pycom-prefect-server  | 20:05:08.201 | DEBUG   | prefect.server.events.services.event_persister - Finished persisting events.
pycom-prefect-server  | 20:05:08.376 | INFO    | prefect.server.services.recentdeploymentsscheduler - Scheduled 0 runs.
pycom-prefect-server  | 20:05:08.377 | DEBUG   | prefect.server.services.recentdeploymentsscheduler - Finished running RecentDeploymentsScheduler. Next run at 2024-12-20 20:05:13.025382+00:00
pycom-prefect-agent   | 20:05:08.563 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=2.00> exited
pycom-prefect-agent   | 20:05:08.563 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Encountered exception in call get(<dropped>)
pycom-prefect-agent   | Traceback (most recent call last):
pycom-prefect-agent   |   File "/usr/local/lib/python3.12/site-packages/prefect/_internal/concurrency/calls.py", line 364, in _run_sync
pycom-prefect-agent   |     result = self.fn(*self.args, **self.kwargs)
pycom-prefect-agent   |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/usr/local/lib/python3.12/queue.py", line 179, in get
pycom-prefect-agent   |     raise Empty
pycom-prefect-agent   | _queue.Empty
pycom-prefect-agent   | 20:05:08.564 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.9999809791333973) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:08.564 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-server  | 20:05:08.608 | DEBUG   | prefect.server.events.triggers - Running periodic evaluation as of 2024-12-20 20:05:08.599711+00:00 (offset -0.008213043212890625s)
pycom-prefect-server  | 20:05:09.646 | INFO    | uvicorn.access - 172.24.0.4:43384 - "GET /api/flow_runs/46f10173-5139-4276-8ee2-aa4ecda242b7 HTTP/1.1" 200
pycom-prefect-server  | 20:05:09.690 | INFO    | uvicorn.access - 172.24.0.4:43384 - "GET /api/csrf-token?client=b6bb48c7-e305-4fbe-b572-fed3086ec8ee HTTP/1.1" 422
pycom-prefect-server  | 20:05:09.709 | INFO    | uvicorn.access - 172.24.0.4:43384 - "POST /api/flow_runs/filter HTTP/1.1" 200
pycom-prefect-server  | 20:05:09.732 | INFO    | uvicorn.access - 172.24.0.4:43384 - "POST /api/flow_runs/filter HTTP/1.1" 200
pycom-prefect-agent   | 20:05:10.565 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=2.00> exited
pycom-prefect-agent   | 20:05:10.565 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Encountered exception in call get(<dropped>)
pycom-prefect-agent   | Traceback (most recent call last):
pycom-prefect-agent   |   File "/usr/local/lib/python3.12/site-packages/prefect/_internal/concurrency/calls.py", line 364, in _run_sync
pycom-prefect-agent   |     result = self.fn(*self.args, **self.kwargs)
pycom-prefect-agent   |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/usr/local/lib/python3.12/queue.py", line 179, in get
pycom-prefect-agent   |     raise Empty
pycom-prefect-agent   | _queue.Empty
pycom-prefect-agent   | 20:05:10.566 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.999982148874551) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:10.566 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-server  | 20:05:10.670 | INFO    | uvicorn.access - 10.122.16.154:63752 - "GET /api/flow_runs/46f10173-5139-4276-8ee2-aa4ecda242b7 HTTP/1.1" 200
pycom-prefect-server  | 20:05:10.867 | INFO    | uvicorn.access - 10.122.16.154:63752 - "POST /api/logs/filter HTTP/1.1" 200
pycom-prefect-server  | 20:05:11.016 | DEBUG   | prefect.server.services.flowrunnotifications - About to run FlowRunNotifications...
pycom-prefect-server  | 20:05:11.030 | DEBUG   | prefect.server.services.flowrunnotifications - Got 0 notifications from queue.
pycom-prefect-server  | 20:05:11.032 | DEBUG   | prefect.server.services.flowrunnotifications - Finished running FlowRunNotifications. Next run at 2024-12-20 20:05:15.016360+00:00
pycom-prefect-server  | 20:05:11.688 | INFO    | uvicorn.access - 172.24.0.4:43390 - "GET /api/csrf-token?client=bd961b74-7529-4b65-a2d1-7e5dc2ab059f HTTP/1.1" 422
pycom-prefect-server  | 20:05:11.708 | INFO    | uvicorn.access - 172.24.0.4:43390 - "POST /api/logs/ HTTP/1.1" 201
pycom-prefect-server  | 20:05:11.857 | INFO    | uvicorn.access - 172.24.0.4:43394 - "GET /api/flow_runs/46f10173-5139-4276-8ee2-aa4ecda242b7 HTTP/1.1" 200
pycom-prefect-server  | 20:05:11.912 | INFO    | uvicorn.access - 172.24.0.4:43402 - "GET /api/deployments/4e779212-3425-4019-854f-358417860d46 HTTP/1.1" 200
pycom-prefect-agent   | 20:05:09.108 | DEBUG   | prefect.profiles - Using profile 'ephemeral'
pycom-prefect-agent   | 20:05:09.581 | DEBUG   | prefect.runner - Starting runner...
pycom-prefect-agent   | 20:05:09.616 | DEBUG   | prefect.client - Connecting to API at http://server:4200/api/
pycom-prefect-agent   | 20:05:09.616 | DEBUG   | prefect.runner - Limit slot acquired for flow run '46f10173-5139-4276-8ee2-aa4ecda242b7'
pycom-prefect-agent   | 20:05:09.660 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Running call start() in thread 'GlobalEventLoopThread'
pycom-prefect-agent   | 20:05:09.660 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='start' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:09.660 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Starting service <prefect.logging.handlers.APILogWorker object at 0x7f11de2f6a50>
pycom-prefect-agent   | 20:05:09.660 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='start' COMPLETED, runtime=0.00> exited
pycom-prefect-agent   | 20:05:09.660 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Finished call start(<dropped>)
pycom-prefect-agent   | 20:05:09.661 | DEBUG   | MainThread   | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f11de2f6a50> enqueuing item {'name': 'prefect.flow_runs.runner', 'level': 20, 'message': 'Opening process...', 'timestamp': '2024-12-20T20:05:09.656752Z', 'flow_run_id': '46f10173-5139-4276-8ee2-aa4ecda242b7', 'task_run_id': None, '__payload_size__': 202}
pycom-prefect-agent   | 20:05:09.656 | INFO    | prefect.flow_runs.runner - Opening process...
pycom-prefect-agent   | 20:05:09.672 | DEBUG   | prefect.utilities.services.critical_service_loop - Starting run of functools.partial(<bound method Runner._check_for_cancelled_flow_runs of Runner(name='runner-dadb361c-0fa0-40ca-aefb-fbe6c23d6033')>, should_stop=<function Runner.execute_flow_run.<locals>.<lambda> at 0x7f11dd5cf1a0>, on_stop=<bound method CancelScope.cancel of <anyio._backends._asyncio.CancelScope object at 0x7f11dd621550>>)
pycom-prefect-agent   | 20:05:09.673 | DEBUG   | prefect.runner - Checking for cancelled flow runs...
pycom-prefect-agent   | 20:05:09.680 | DEBUG   | prefect.client - Connecting to API at http://server:4200/api/
pycom-prefect-agent   | 20:05:09.686 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.9999772529117763) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:09.688 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:09.688 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=0.00> exited
pycom-prefect-agent   | 20:05:09.688 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Finished call get(<dropped>)
pycom-prefect-agent   | 20:05:09.689 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f11de2f6a50> added item {'name': 'prefect.flow_runs.runner', 'level': 20, 'message': 'Opening process...', 'timestamp': '2024-12-20T20:05:09.656752Z', 'flow_run_id': '46f10173-5139-4276-8ee2-aa4ecda242b7', 'task_run_id': None} to batch (size 202/3000000)
pycom-prefect-agent   | 20:05:09.690 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.990762717090547) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:09.690 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:11.681 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=1.99> exited
pycom-prefect-agent   | 20:05:11.681 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Encountered exception in call get(<dropped>)
pycom-prefect-agent   | Traceback (most recent call last):
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/prefect/_internal/concurrency/calls.py", line 364, in _run_sync
pycom-prefect-agent   |     result = self.fn(*self.args, **self.kwargs)
pycom-prefect-agent   |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/usr/local/lib/python3.11/queue.py", line 179, in get
pycom-prefect-agent   |     raise Empty
pycom-prefect-agent   | _queue.Empty
pycom-prefect-agent   | 20:05:11.682 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f11de2f6a50> processing batch of size 202
pycom-prefect-agent   | 20:05:11.711 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.999973313882947) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:11.711 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:11.817 | DEBUG   | prefect.profiles - Using profile 'ephemeral'
pycom-prefect-agent   | 20:05:11.868 | DEBUG   | RunSyncEventLoopThread | prefect._internal.concurrency - Running call coroutine_wrapper() in thread 'RunSyncEventLoopThread'
pycom-prefect-agent   | 20:05:11.868 | DEBUG   | RunSyncEventLoopThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='coroutine_wrapper' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:11.868 | DEBUG   | RunSyncEventLoopThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='coroutine_wrapper' COMPLETED, runtime=0.00> exited
pycom-prefect-agent   | 20:05:11.868 | DEBUG   | RunSyncEventLoopThread | prefect._internal.concurrency - Scheduling coroutine for call coroutine_wrapper(<dropped>) in running loop <_UnixSelectorEventLoop running=True closed=False debug=False>
pycom-prefect-agent   | 20:05:11.868 | DEBUG   | RunSyncEventLoopThread | prefect._internal.concurrency - <AsyncCancelScope, name='coroutine_wrapper' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:11.977 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Running call start() in thread 'GlobalEventLoopThread'
pycom-prefect-agent   | 20:05:11.977 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='start' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:11.977 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Starting service <prefect.logging.handlers.APILogWorker object at 0x7f125c10b710>
pycom-prefect-agent   | 20:05:11.978 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='start' COMPLETED, runtime=0.00> exited
pycom-prefect-agent   | 20:05:11.978 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Finished call start(<dropped>)
pycom-prefect-agent   | 20:05:11.978 | DEBUG   | RunSyncEventLoopThread | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f125c10b710> enqueuing item {'name': 'prefect.flow_runs', 'level': 10, 'message': 'Running 1 deployment pull step(s)', 'timestamp': '2024-12-20T20:05:11.974499Z', 'flow_run_id': '46f10173-5139-4276-8ee2-aa4ecda242b7', 'task_run_id': None, '__payload_size__': 210}
pycom-prefect-agent   | 20:05:11.974 | DEBUG   | Flow run 'realistic-caracal' - Running 1 deployment pull step(s)
pycom-prefect-agent   | 20:05:11.990 | DEBUG   | prefect.client - Connecting to API at http://server:4200/api/
pycom-prefect-agent   | 20:05:11.993 | DEBUG   | prefect.client - Connecting to API at http://server:4200/api/
pycom-prefect-agent   | 20:05:11.993 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.9999765758402646) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:11.995 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:11.995 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=0.00> exited
pycom-prefect-agent   | 20:05:11.995 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Finished call get(<dropped>)
pycom-prefect-agent   | 20:05:11.996 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f125c10b710> added item {'name': 'prefect.flow_runs', 'level': 10, 'message': 'Running 1 deployment pull step(s)', 'timestamp': '2024-12-20T20:05:11.974499Z', 'flow_run_id': '46f10173-5139-4276-8ee2-aa4ecda242b7', 'task_run_id': None} to batch (size 210/3000000)
pycom-prefect-agent   | 20:05:11.996 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.996932223904878) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:11.996 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:12.001 | DEBUG   | prefect.client - Connecting to API at http://server:4200/api/
pycom-prefect-agent   | 20:05:12.566 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=2.00> exited
pycom-prefect-agent   | 20:05:12.567 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Encountered exception in call get(<dropped>)
pycom-prefect-agent   | Traceback (most recent call last):
pycom-prefect-agent   |   File "/usr/local/lib/python3.12/site-packages/prefect/_internal/concurrency/calls.py", line 364, in _run_sync
pycom-prefect-agent   |     result = self.fn(*self.args, **self.kwargs)
pycom-prefect-agent   |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/usr/local/lib/python3.12/queue.py", line 179, in get
pycom-prefect-agent   |     raise Empty
pycom-prefect-agent   | _queue.Empty
pycom-prefect-agent   | 20:05:12.567 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.9999809912405908) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:12.568 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-server  | 20:05:13.018 | DEBUG   | prefect.server.services.foreman - About to run Foreman...
pycom-prefect-server  | 20:05:13.025 | DEBUG   | prefect.server.services.recentdeploymentsscheduler - About to run RecentDeploymentsScheduler...
pycom-prefect-server  | 20:05:13.029 | DEBUG   | prefect.server.services.marklateruns - About to run MarkLateRuns...
pycom-prefect-server  | 20:05:13.030 | DEBUG   | prefect.server.services.failexpiredpauses - About to run FailExpiredPauses...
pycom-prefect-server  | 20:05:13.039 | INFO    | prefect.server.services.recentdeploymentsscheduler - Scheduled 0 runs.
pycom-prefect-server  | 20:05:13.040 | DEBUG   | prefect.server.services.recentdeploymentsscheduler - Finished running RecentDeploymentsScheduler. Next run at 2024-12-20 20:05:18.025727+00:00
pycom-prefect-server  | 20:05:13.041 | INFO    | prefect.server.services.marklateruns - Finished monitoring for late runs.
pycom-prefect-server  | 20:05:13.042 | DEBUG   | prefect.server.services.marklateruns - Finished running MarkLateRuns. Next run at 2024-12-20 20:05:18.029455+00:00
pycom-prefect-server  | 20:05:13.042 | INFO    | prefect.server.services.failexpiredpauses - Finished monitoring for late runs.
pycom-prefect-server  | 20:05:13.043 | DEBUG   | prefect.server.services.failexpiredpauses - Finished running FailExpiredPauses. Next run at 2024-12-20 20:05:18.030942+00:00
pycom-prefect-server  | 20:05:13.073 | DEBUG   | prefect.server.services.proactivetriggers - About to run ProactiveTriggers...
pycom-prefect-server  | 20:05:13.074 | DEBUG   | prefect.server.services.proactivetriggers - Finished running ProactiveTriggers. Next run at 2024-12-20 20:05:18.073890+00:00
pycom-prefect-server  | 20:05:13.082 | DEBUG   | prefect.server.services.foreman - Finished running Foreman. Next run at 2024-12-20 20:05:28.017996+00:00
pycom-prefect-server  | 20:05:13.165 | DEBUG   | uvicorn.error - < PING 'TrkJ' [text, 4 bytes]
pycom-prefect-server  | 20:05:13.166 | DEBUG   | uvicorn.error - > PONG 'TrkJ' [text, 4 bytes]
pycom-prefect-server  | 20:05:13.167 | DEBUG   | uvicorn.error - % sending keepalive ping
pycom-prefect-server  | 20:05:13.168 | DEBUG   | uvicorn.error - > PING 42 1c 20 d9 [binary, 4 bytes]
pycom-prefect-server  | 20:05:13.168 | DEBUG   | uvicorn.error - < PONG 42 1c 20 d9 [binary, 4 bytes]
pycom-prefect-server  | 20:05:13.169 | DEBUG   | uvicorn.error - % received keepalive pong
pycom-prefect-server  | 20:05:13.426 | INFO    | uvicorn.access - 172.24.0.4:43408 - "GET /api/block_types/slug/secret/block_documents/name/postgres-uri?include_secrets=true HTTP/1.1" 200
pycom-prefect-agent   | 20:05:12.004 | DEBUG   | WorkerThread-0 | prefect._internal.concurrency - Running call set_working_directory(directory='/src') in thread 'WorkerThread-0'
pycom-prefect-agent   | 20:05:12.004 | DEBUG   | WorkerThread-0 | prefect._internal.concurrency - <WatcherThreadCancelScope, name='set_working_directory' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:12.004 | DEBUG   | WorkerThread-0 | prefect._internal.concurrency - <WatcherThreadCancelScope, name='set_working_directory' COMPLETED, runtime=0.00> exited
pycom-prefect-agent   | 20:05:12.004 | DEBUG   | WorkerThread-0 | prefect._internal.concurrency - Finished call set_working_directory(<dropped>)
pycom-prefect-agent   | 20:05:12.004 | INFO    | WorkerThread-0 | prefect._internal.concurrency - Exiting worker thread 'WorkerThread-0'
pycom-prefect-agent   | 20:05:12.006 | DEBUG   | RunSyncEventLoopThread | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f125c10b710> enqueuing item {'name': 'prefect.flow_runs', 'level': 10, 'message': "Changing working directory to '/src'", 'timestamp': '2024-12-20T20:05:12.005465Z', 'flow_run_id': '46f10173-5139-4276-8ee2-aa4ecda242b7', 'task_run_id': None, '__payload_size__': 213}
pycom-prefect-agent   | 20:05:12.005 | DEBUG   | Flow run 'realistic-caracal' - Changing working directory to '/src'
pycom-prefect-agent   | 20:05:12.007 | DEBUG   | RunSyncEventLoopThread | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f125c10b710> enqueuing item {'name': 'prefect.flow_runs', 'level': 10, 'message': "Importing flow code from 'src/xxx/viper_prefect/flows/scan_nexus_packages.py:scan_nexus_packages'", 'timestamp': '2024-12-20T20:05:12.007670Z', 'flow_run_id': '46f10173-5139-4276-8ee2-aa4ecda242b7', 'task_run_id': None, '__payload_size__': 276}
pycom-prefect-agent   | 20:05:12.007 | DEBUG   | Flow run 'realistic-caracal' - Importing flow code from 'src/xxx/viper_prefect/flows/scan_nexus_packages.py:scan_nexus_packages'
pycom-prefect-agent   | 20:05:12.008 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=0.01> exited
pycom-prefect-agent   | 20:05:12.009 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Finished call get(<dropped>)
pycom-prefect-agent   | 20:05:12.010 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f125c10b710> added item {'name': 'prefect.flow_runs', 'level': 10, 'message': "Changing working directory to '/src'", 'timestamp': '2024-12-20T20:05:12.005465Z', 'flow_run_id': '46f10173-5139-4276-8ee2-aa4ecda242b7', 'task_run_id': None} to batch (size 423/3000000)
pycom-prefect-agent   | 20:05:12.012 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.9818994118832052) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:12.012 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:12.012 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=0.00> exited
pycom-prefect-agent   | 20:05:12.012 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Finished call get(<dropped>)
pycom-prefect-agent   | 20:05:12.015 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Service <prefect.logging.handlers.APILogWorker object at 0x7f125c10b710> added item {'name': 'prefect.flow_runs', 'level': 10, 'message': "Importing flow code from 'src/xxx/viper_prefect/flows/scan_nexus_packages.py:scan_nexus_packages'", 'timestamp': '2024-12-20T20:05:12.007670Z', 'flow_run_id': '46f10173-5139-4276-8ee2-aa4ecda242b7', 'task_run_id': None} to batch (size 699/3000000)
pycom-prefect-agent   | 20:05:12.016 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - Running call get(timeout=1.9777771718800068) in thread 'APILogWorkerThread'
pycom-prefect-agent   | 20:05:12.016 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:13.432 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Running call start() in thread 'GlobalEventLoopThread'
pycom-prefect-agent   | 20:05:13.432 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='start' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:13.432 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Starting service <prefect.events.worker.EventsWorker object at 0x7f124ee77610>
pycom-prefect-agent   | 20:05:13.433 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='start' COMPLETED, runtime=0.00> exited
pycom-prefect-agent   | 20:05:13.433 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Finished call start(<dropped>)
pycom-prefect-agent   | 20:05:13.434 | DEBUG   | AnyIO worker thread | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x7f124ee77610> enqueuing item Event(occurred=DateTime(2024, 12, 20, 20, 5, 13, 433552, tzinfo=Timezone('UTC')), event='prefect.block.secret.loaded', resource=Resource(root={'prefect.resource.id': 'prefect.block-document.46e1a40f-3f3b-4228-8cb4-8bfdba830be0', 'prefect.resource.name': 'postgres-uri'}), related=[RelatedResource(root={'prefect.resource.id': 'prefect.block-type.secret', 'prefect.resource.role': 'block-type'})], payload={}, id=UUID('fcf755a7-ae2d-45f4-b0a2-fa05a3bd6408'), follows=None)
pycom-prefect-agent   | 20:05:13.484 | WARNING | prefect.events.clients - Unable to connect to 'ws://server:4200/api/events/in'. Please check your network settings to ensure websocket connections to the API are allowed. Otherwise event data (including task run data) may be lost. Reason: did not receive a valid HTTP response. Set PREFECT_DEBUG_MODE=1 to see the full error.
pycom-prefect-agent   | Traceback (most recent call last):
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 134, in read_http_response
pycom-prefect-agent   |     status_code, reason, headers = await read_response(self.reader)
pycom-prefect-agent   |                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/http.py", line 130, in read_response
pycom-prefect-agent   |     raise ValueError(f"unsupported HTTP version: {d(version)}")
pycom-prefect-agent   | ValueError: unsupported HTTP version: HTTP/1.0
pycom-prefect-agent   |
pycom-prefect-agent   | The above exception was the direct cause of the following exception:
pycom-prefect-agent   |
pycom-prefect-agent   | Traceback (most recent call last):
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/clients.py", line 348, in _reconnect
pycom-prefect-agent   |     self._websocket = await self._connect.__aenter__()
pycom-prefect-agent   |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 635, in __aenter__
pycom-prefect-agent   |     return await self
pycom-prefect-agent   |            ^^^^^^^^^^
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/clients.py", line 123, in _proxy_connect
pycom-prefect-agent   |     proto = await self.__await_impl__()
pycom-prefect-agent   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 656, in __await_impl__
pycom-prefect-agent   |     await protocol.handshake(
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 316, in handshake
pycom-prefect-agent   |     status_code, response_headers = await self.read_http_response()
pycom-prefect-agent   |                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 136, in read_http_response
pycom-prefect-agent   |     raise InvalidMessage("did not receive a valid HTTP response") from exc
pycom-prefect-agent   | websockets.legacy.exceptions.InvalidMessage: did not receive a valid HTTP response
pycom-prefect-agent   | 20:05:13.486 | ERROR   | GlobalEventLoopThread | prefect._internal.concurrency - Service 'EventsWorker' failed with 1 pending items.
pycom-prefect-agent   | Traceback (most recent call last):
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 134, in read_http_response
pycom-prefect-agent   |     status_code, reason, headers = await read_response(self.reader)
pycom-prefect-agent   |                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/http.py", line 130, in read_response
pycom-prefect-agent   |     raise ValueError(f"unsupported HTTP version: {d(version)}")
pycom-prefect-agent   | ValueError: unsupported HTTP version: HTTP/1.0
pycom-prefect-agent   |
pycom-prefect-agent   | The above exception was the direct cause of the following exception:
pycom-prefect-agent   |
pycom-prefect-agent   | Traceback (most recent call last):
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/prefect/_internal/concurrency/services.py", line 125, in _run
pycom-prefect-agent   |     async with self._lifespan():
pycom-prefect-agent   |   File "/usr/local/lib/python3.11/contextlib.py", line 204, in __aenter__
pycom-prefect-agent   |     return await anext(self.gen)
pycom-prefect-agent   |            ^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-server  | 20:05:13.632 | DEBUG   | prefect.server.events.triggers - Running periodic evaluation as of 2024-12-20 20:05:13.624208+00:00 (offset -0.008213043212890625s)
pycom-prefect-server  | 20:05:13.698 | INFO    | uvicorn.access - 172.24.0.4:43420 - "GET /api/block_types/slug/secret/block_documents/name/postgres-dev-uri?include_secrets=true HTTP/1.1" 200
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/worker.py", line 70, in _lifespan
pycom-prefect-agent   |     async with self._client:
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/clients.py", line 329, in __aenter__
pycom-prefect-agent   |     await self._reconnect()
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/clients.py", line 348, in _reconnect
pycom-prefect-agent   |     self._websocket = await self._connect.__aenter__()
pycom-prefect-agent   |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 635, in __aenter__
pycom-prefect-agent   |     return await self
pycom-prefect-agent   |            ^^^^^^^^^^
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/prefect/events/clients.py", line 123, in _proxy_connect
pycom-prefect-agent   |     proto = await self.__await_impl__()
pycom-prefect-agent   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 656, in __await_impl__
pycom-prefect-agent   |     await protocol.handshake(
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 316, in handshake
pycom-prefect-agent   |     status_code, response_headers = await self.read_http_response()
pycom-prefect-agent   |                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pycom-prefect-agent   |   File "/src/.venv/lib/python3.11/site-packages/websockets/legacy/client.py", line 136, in read_http_response
pycom-prefect-agent   |     raise InvalidMessage("did not receive a valid HTTP response") from exc
pycom-prefect-agent   | websockets.legacy.exceptions.InvalidMessage: did not receive a valid HTTP response
pycom-prefect-agent   | 20:05:13.489 | INFO    | EventsWorkerThread | prefect._internal.concurrency - Exiting worker thread 'EventsWorkerThread'
pycom-prefect-agent   | 20:05:13.700 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Running call start() in thread 'GlobalEventLoopThread'
pycom-prefect-agent   | 20:05:13.700 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='start' RUNNING, runtime=0.00> entered
pycom-prefect-agent   | 20:05:13.700 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Starting service <prefect.events.worker.EventsWorker object at 0x7f124c0c1a50>
pycom-prefect-agent   | 20:05:13.701 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='start' COMPLETED, runtime=0.00> exited
pycom-prefect-agent   | 20:05:13.701 | DEBUG   | GlobalEventLoopThread | prefect._internal.concurrency - Finished call start(<dropped>)
pycom-prefect-agent   | 20:05:13.701 | DEBUG   | AnyIO worker thread | prefect._internal.concurrency - Service <prefect.events.worker.EventsWorker object at 0x7f124c0c1a50> enqueuing item Event(occurred=DateTime(2024, 12, 20, 20, 5, 13, 701539, tzinfo=Timezone('UTC')), event='prefect.block.secret.loaded', resource=Resource(root={'prefect.resource.id': 'prefect.block-document.d7e50557-85e3-4961-95b5-a14c3da32edb', 'prefect.resource.name': 'postgres-dev-uri'}), related=[RelatedResource(root={'prefect.resource.id': 'prefect.block-type.secret', 'prefect.resource.role': 'block-type'})], payload={}, id=UUID('0c63029a-3d86-45c0-8e83-6bfcc42f4bb8'), follows=None)
pycom-prefect-agent   | 20:05:13.711 | DEBUG   | APILogWorkerThread | prefect._internal.concurrency - <WatcherThreadCancelScope, name='get' COMPLETED, runtime=2.00> exited

@jbw-vtl
Copy link
Contributor

jbw-vtl commented Dec 20, 2024

Hmm, upon further inspection, downgrading to 3.1.5 (including downgrading the database revisions) runs into the same websockets errors.

However, at this point I'm not sure what could have changed here.

Will keep investigating over the coming days to see if anything sticks out.

Job execution etc. is all working as expected.

@jakekaplan
Copy link
Contributor

jakekaplan commented Dec 21, 2024

Thank you all, that info is really helpful. Will try and see if I can reproduce something similar on my end.

We did add some proxy support for websockets as part of 3.1.7 in this PR, which I thought could somehow be related (although I don't immediately see how, as it doesn't look like you're configuring HTTP_PROXY/HTTPS_PROXY). But if it's still happening on 3.1.5, it seems unrelated.

@kavvkon
Copy link

kavvkon commented Dec 23, 2024

> Hi All,
> We seem to have run into the same issue using the current prefect server version, running open source.
> [...]
> Now for the odd part, found this thread and changed the two debug options recommended above, recreated both server & agent and the issue seemed to have gone away.
> We will do further debugging this week, seemingly some odd behaviour here.
> So far the logs do not indicate any errors except the below, however assuming this is expected

Same for me, when I enabled both debugging options the problem disappeared!

After updating to 3.1.9 the problem still persists. Task runs are created but not plotted. And this time changing the logging level to debug does not make it work.

@jbw-vtl
Copy link
Contributor

jbw-vtl commented Dec 23, 2024

Okay, after a lot more debugging, we did identify the issue for us.
In the end it turned out to be a bad proxy configuration for the containers spawned by the agent on our side...
I got myself quite a bit confused here, but the issue is now very clear; apologies for the run-around.

  • The server and agent themselves are fully air-gapped, without any configured proxy or outgoing network connectivity (this led me down the wrong path)
  • The agent spawns docker containers which themselves need network connectivity, configured by providing them with HTTP_PROXY and HTTPS_PROXY
  • To avoid this affecting connections that are not supposed to go through the proxy, we use NO_PROXY
  • This is configured via the environment variables in the work pool:
{
  "NO_PROXY": "<company>.com,server,127.0.0.1,0.0.0.0,localhost",
  "HTTP_PROXY": "<proxy>",
  "HTTPS_PROXY": "<proxy>"
}

Unsure whether there are helper functions around somewhere to validate whether a host matches a given NO_PROXY string, as there are some edge cases to consider.
Happy to take a stab at this with a PR; keen to get Prefect set up for development.
Will create a separate issue, as this is potentially unrelated to the initial question.

Short term we will just remove HTTP_PROXY, since the connections we care about use HTTPS and it seems the Prefect agents use ws:// (which checks whether HTTP_PROXY is present); for now everything is working as expected.
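
For what it's worth, a minimal sketch of the kind of NO_PROXY check the wrapper could do, leaning on Python's standard urllib helpers (the function name here is hypothetical, not Prefect's actual client code):

from urllib.parse import urlparse
from urllib.request import getproxies, proxy_bypass


def should_proxy_websocket(api_url: str) -> bool:
    """Hypothetical helper: decide whether a ws:// connection to the API host
    should go through HTTP_PROXY, honouring NO_PROXY."""
    host = urlparse(api_url).hostname or ""
    # getproxies() reads HTTP_PROXY / HTTPS_PROXY (and their lowercase variants).
    if "http" not in getproxies():
        return False
    # proxy_bypass() consults NO_PROXY on Linux (and system proxy settings on macOS/Windows).
    return not proxy_bypass(host)


# With the work pool variables above, should_proxy_websocket("http://server:4200/api")
# would return False, i.e. the connection bypasses the proxy.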

@kavvkon
Copy link

kavvkon commented Dec 23, 2024

Some more logs:

Running the log_repo_info() example with version 3.1.9:

image
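
For context, the flow being run is (roughly) the log_repo_info example from the Prefect quickstart; a sketch of it, assuming the standard tutorial code and that httpx is installed:

import httpx
from prefect import flow, task


@task(retries=2)
def get_repo_info(repo_owner: str, repo_name: str):
    # Fetch basic repository metadata from the GitHub API.
    response = httpx.get(f"https://api.github.com/repos/{repo_owner}/{repo_name}")
    response.raise_for_status()
    return response.json()


@task
def get_contributors(repo_info: dict):
    # Fetch the contributor list linked from the repository metadata.
    response = httpx.get(repo_info["contributors_url"])
    response.raise_for_status()
    return response.json()


@flow(log_prints=True)
def log_repo_info(repo_owner: str = "PrefectHQ", repo_name: str = "prefect"):
    repo_info = get_repo_info(repo_owner, repo_name)
    print(f"Stars 🌠 : {repo_info['stargazers_count']}")
    contributors = get_contributors(repo_info)
    print(f"Number of contributors 👷: {len(contributors)}")


if __name__ == "__main__":
    log_repo_info()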

Client logs

13:10:56.857 | INFO    | prefect.engine - Created flow run 'elite-bee' for flow 'log-repo-info'
13:10:56.860 | INFO    | prefect.engine - View at http://192.168.1.100:4200/runs/flow-run/4da64fef-ca24-4f79-b19a-db91f97288ed
13:10:58.593 | INFO    | Task run 'get_repo_info-508' - Finished in state Completed()
13:10:58.596 | INFO    | Flow run 'elite-bee' - Stars 🌠 : 17850
13:11:00.235 | INFO    | Task run 'get_contributors-95a' - Finished in state Completed()
13:11:00.235 | INFO    | Flow run 'elite-bee' - Number of contributors 👷: 30
13:11:00.285 | INFO    | Flow run 'elite-bee' - Finished in state Completed()

Server logs

13:10:57.551 | INFO    | uvicorn.access - 192.168.1.10:55528 - "GET /api/admin/version HTTP/1.1" 200
13:10:57.560 | INFO    | uvicorn.access - 192.168.1.10:55528 - "GET /api/csrf-token?client=8586729b-aeae-4b2b-90ad-9fc90c165973 HTTP/1.1" 422
13:10:57.574 | INFO    | uvicorn.access - 192.168.1.10:55528 - "POST /api/flows/ HTTP/1.1" 200
13:10:57.606 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.flow-run.Pending with id: 5177174e-0b2f-4d5c-a50f-cf4e0ff41d2d for resource: prefect.flow-run.4da64fef-ca24-4f79-b19a-db91f97288ed
13:10:57.609 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.flow-run.Pending with id: 5177174e-0b2f-4d5c-a50f-cf4e0ff41d2d for resource: prefect.flow-run.4da64fef-ca24-4f79-b19a-db91f97288ed
13:10:57.613 | INFO    | uvicorn.access - 192.168.1.10:55528 - "POST /api/flow_runs/ HTTP/1.1" 201
13:10:57.643 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.flow-run.Running with id: 04cb9c1d-6138-4175-8adf-c74793a40a9d for resource: prefect.flow-run.4da64fef-ca24-4f79-b19a-db91f97288ed
13:10:57.646 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.flow-run.Running with id: 04cb9c1d-6138-4175-8adf-c74793a40a9d for resource: prefect.flow-run.4da64fef-ca24-4f79-b19a-db91f97288ed
13:10:57.650 | INFO    | uvicorn.access - 192.168.1.10:55528 - "POST /api/flow_runs/4da64fef-ca24-4f79-b19a-db91f97288ed/set_state HTTP/1.1" 201
13:10:57.664 | INFO    | uvicorn.access - 192.168.1.10:55528 - "GET /api/flow_runs/4da64fef-ca24-4f79-b19a-db91f97288ed HTTP/1.1" 200
13:10:57.852 | DEBUG   | prefect.server.services.flowrunnotifications - About to run FlowRunNotifications...
13:10:57.857 | DEBUG   | prefect.server.services.flowrunnotifications - Got 0 notifications from queue.
13:10:57.858 | DEBUG   | prefect.server.services.flowrunnotifications - Finished running FlowRunNotifications. Next run at 2024-12-23 11:11:01.852279+00:00
13:10:58.847 | DEBUG   | uvicorn.error - = connection is CONNECTING
13:10:58.847 | DEBUG   | uvicorn.error - < GET /api/events/in HTTP/1.1
13:10:58.848 | DEBUG   | uvicorn.error - < host: 192.168.1.100:4200
13:10:58.848 | DEBUG   | uvicorn.error - < upgrade: websocket
13:10:58.848 | DEBUG   | uvicorn.error - < connection: Upgrade
13:10:58.848 | DEBUG   | uvicorn.error - < sec-websocket-key: mknU8lv3Sz2BFg7jy5pitA==
13:10:58.848 | DEBUG   | uvicorn.error - < sec-websocket-version: 13
13:10:58.848 | DEBUG   | uvicorn.error - < sec-websocket-extensions: permessage-deflate; client_max_window_bits
13:10:58.849 | DEBUG   | uvicorn.error - < user-agent: Python/3.12 websockets/13.1
13:10:58.849 | INFO    | uvicorn.error - ('192.168.1.10', 55533) - "WebSocket /api/events/in" [accepted]
13:10:58.850 | DEBUG   | uvicorn.error - > HTTP/1.1 101 Switching Protocols
13:10:58.850 | DEBUG   | uvicorn.error - > Upgrade: websocket
13:10:58.850 | DEBUG   | uvicorn.error - > Connection: Upgrade
13:10:58.850 | DEBUG   | uvicorn.error - > Sec-WebSocket-Accept: KS4QZH68Kj8YiUn2SXKS8n4N4DM=
13:10:58.850 | DEBUG   | uvicorn.error - > Sec-WebSocket-Extensions: permessage-deflate
13:10:58.850 | DEBUG   | uvicorn.error - > date: Mon, 23 Dec 2024 11:10:57 GMT
13:10:58.851 | DEBUG   | uvicorn.error - > server: uvicorn
13:10:58.851 | INFO    | uvicorn.error - connection open
13:10:58.851 | DEBUG   | uvicorn.error - = connection is OPEN
13:10:58.852 | DEBUG   | uvicorn.error - < PING 40 c7 42 8d [binary, 4 bytes]
13:10:58.853 | DEBUG   | uvicorn.error - > PONG 40 c7 42 8d [binary, 4 bytes]
13:10:58.864 | INFO    | uvicorn.access - 192.168.1.10:55534 - "GET /api/flows/3ada983d-00a8-41bc-9365-00dc8dac869c HTTP/1.1" 200
13:10:58.868 | DEBUG   | uvicorn.error - < TEXT '{"occurred":"2024-12-23T11:10:57.467392Z","even...859fba","follows":null}' [1334 bytes]
13:10:58.868 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.task-run.Pending with id: bbf732b0-b012-408f-916e-f4ae2f859fba for resource: prefect.task-run.bf1a3571-b87f-4fea-8048-32fb72b5795f
13:10:58.869 | DEBUG   | prefect.server.services.task_run_recorder - Received event: prefect.task-run.Pending with id: bbf732b0-b012-408f-916e-f4ae2f859fba for resource: prefect.task-run.bf1a3571-b87f-4fea-8048-32fb72b5795f
13:10:58.870 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.task-run.Pending with id: bbf732b0-b012-408f-916e-f4ae2f859fba for resource: prefect.task-run.bf1a3571-b87f-4fea-8048-32fb72b5795f
13:10:58.871 | DEBUG   | uvicorn.error - < TEXT '{"occurred":"2024-12-23T11:10:57.471691Z","even...08f-916e-f4ae2f859fba"}' [1479 bytes]
13:10:58.872 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.task-run.Running with id: e2d5b99a-3301-4416-b7aa-667027a73eb4 for resource: prefect.task-run.bf1a3571-b87f-4fea-8048-32fb72b5795f
13:10:58.872 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.task-run.Running with id: e2d5b99a-3301-4416-b7aa-667027a73eb4 for resource: prefect.task-run.bf1a3571-b87f-4fea-8048-32fb72b5795f
13:10:58.892 | DEBUG   | prefect.server.services.task_run_recorder - Recorded task run state change
13:10:58.892 | DEBUG   | prefect.server.services.task_run_recorder - Received event: prefect.task-run.Running with id: e2d5b99a-3301-4416-b7aa-667027a73eb4 for resource: prefect.task-run.bf1a3571-b87f-4fea-8048-32fb72b5795f
13:10:58.908 | INFO    | uvicorn.access - 192.168.1.10:55525 - "POST /api/task_runs/count HTTP/1.1" 200
13:10:58.912 | DEBUG   | prefect.server.services.task_run_recorder - Recorded task run state change
13:10:58.918 | INFO    | uvicorn.access - 192.168.1.10:55535 - "GET /api/flow_runs/dc851e66-6406-49fa-b211-f0bea2c71254 HTTP/1.1" 200
13:10:59.973 | DEBUG   | uvicorn.error - < TEXT '{"occurred":"2024-12-23T11:10:58.593322Z","even...416-b7aa-667027a73eb4"}' [1562 bytes]
13:10:59.974 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.task-run.Completed with id: fac4f4c8-4781-4963-b7ca-4fbffc5e42bd for resource: prefect.task-run.bf1a3571-b87f-4fea-8048-32fb72b5795f
13:10:59.974 | DEBUG   | prefect.server.services.task_run_recorder - Received event: prefect.task-run.Completed with id: fac4f4c8-4781-4963-b7ca-4fbffc5e42bd for resource: prefect.task-run.bf1a3571-b87f-4fea-8048-32fb72b5795f
13:10:59.976 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.task-run.Completed with id: fac4f4c8-4781-4963-b7ca-4fbffc5e42bd for resource: prefect.task-run.bf1a3571-b87f-4fea-8048-32fb72b5795f
13:10:59.985 | DEBUG   | uvicorn.error - < TEXT '{"occurred":"2024-12-23T11:10:59.214188Z","even...e30c1f","follows":null}' [1396 bytes]
13:10:59.986 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.task-run.Pending with id: b8f6841a-a997-4f32-9eb4-5f0339e30c1f for resource: prefect.task-run.b500110c-c035-4c70-9d7d-79f8df5352b3
13:10:59.986 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.task-run.Pending with id: b8f6841a-a997-4f32-9eb4-5f0339e30c1f for resource: prefect.task-run.b500110c-c035-4c70-9d7d-79f8df5352b3
13:10:59.987 | DEBUG   | uvicorn.error - < TEXT '{"occurred":"2024-12-23T11:10:59.219211Z","even...f32-9eb4-5f0339e30c1f"}' [1541 bytes]
13:10:59.987 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.task-run.Running with id: 09c3c603-3b81-4aa8-ad6d-63de8d011f0a for resource: prefect.task-run.b500110c-c035-4c70-9d7d-79f8df5352b3
13:10:59.988 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.task-run.Running with id: 09c3c603-3b81-4aa8-ad6d-63de8d011f0a for resource: prefect.task-run.b500110c-c035-4c70-9d7d-79f8df5352b3
13:10:59.999 | DEBUG   | prefect.server.services.task_run_recorder - Recorded task run state change
13:10:59.999 | DEBUG   | prefect.server.services.task_run_recorder - Received event: prefect.task-run.Pending with id: b8f6841a-a997-4f32-9eb4-5f0339e30c1f for resource: prefect.task-run.b500110c-c035-4c70-9d7d-79f8df5352b3
13:11:00.013 | DEBUG   | prefect.server.services.task_run_recorder - Recorded task run state change
13:11:00.014 | DEBUG   | prefect.server.services.task_run_recorder - Received event: prefect.task-run.Running with id: 09c3c603-3b81-4aa8-ad6d-63de8d011f0a for resource: prefect.task-run.b500110c-c035-4c70-9d7d-79f8df5352b3
13:11:00.026 | DEBUG   | prefect.server.services.task_run_recorder - Recorded task run state change
13:11:00.852 | DEBUG   | prefect.server.services.recentdeploymentsscheduler - About to run RecentDeploymentsScheduler...
13:11:00.854 | DEBUG   | prefect.server.services.marklateruns - About to run MarkLateRuns...
13:11:00.856 | DEBUG   | prefect.server.services.failexpiredpauses - About to run FailExpiredPauses...
13:11:00.858 | DEBUG   | prefect.server.services.proactivetriggers - About to run ProactiveTriggers...
13:11:00.858 | DEBUG   | prefect.server.services.proactivetriggers - Finished running ProactiveTriggers. Next run at 2024-12-23 11:11:05.858061+00:00
13:11:00.869 | INFO    | prefect.server.services.recentdeploymentsscheduler - Scheduled 0 runs.
13:11:00.870 | DEBUG   | prefect.server.services.recentdeploymentsscheduler - Finished running RecentDeploymentsScheduler. Next run at 2024-12-23 11:11:05.852823+00:00
13:11:00.871 | INFO    | prefect.server.services.marklateruns - Finished monitoring for late runs.
13:11:00.871 | DEBUG   | prefect.server.services.marklateruns - Finished running MarkLateRuns. Next run at 2024-12-23 11:11:05.854947+00:00
13:11:00.872 | INFO    | prefect.server.services.failexpiredpauses - Finished monitoring for late runs.
13:11:00.872 | DEBUG   | prefect.server.services.failexpiredpauses - Finished running FailExpiredPauses. Next run at 2024-12-23 11:11:05.856602+00:00
13:11:01.007 | DEBUG   | uvicorn.error - < TEXT '{"occurred":"2024-12-23T11:11:00.235255Z","even...aa8-ad6d-63de8d011f0a"}' [1599 bytes]
13:11:01.009 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.task-run.Completed with id: 8717251e-9be4-4462-9091-242e857b665d for resource: prefect.task-run.b500110c-c035-4c70-9d7d-79f8df5352b3
13:11:01.009 | DEBUG   | prefect.server.services.task_run_recorder - Received event: prefect.task-run.Completed with id: 8717251e-9be4-4462-9091-242e857b665d for resource: prefect.task-run.b500110c-c035-4c70-9d7d-79f8df5352b3
13:11:01.011 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.task-run.Completed with id: 8717251e-9be4-4462-9091-242e857b665d for resource: prefect.task-run.b500110c-c035-4c70-9d7d-79f8df5352b3
13:11:01.033 | DEBUG   | prefect.server.services.task_run_recorder - Recorded task run state change
13:11:01.043 | DEBUG   | prefect.server.events.messaging - Publishing event: prefect.flow-run.Completed with id: 212f5724-683a-44b8-a60c-0b3c46d54cc4 for resource: prefect.flow-run.4da64fef-ca24-4f79-b19a-db91f97288ed
13:11:01.045 | DEBUG   | prefect.server.events.services.event_persister - Received event: prefect.flow-run.Completed with id: 212f5724-683a-44b8-a60c-0b3c46d54cc4 for resource: prefect.flow-run.4da64fef-ca24-4f79-b19a-db91f97288ed
13:11:01.048 | INFO    | uvicorn.access - 192.168.1.10:55528 - "POST /api/flow_runs/4da64fef-ca24-4f79-b19a-db91f97288ed/set_state HTTP/1.1" 201
13:11:01.059 | INFO    | uvicorn.access - 192.168.1.10:55538 - "GET /api/csrf-token?client=809630e9-7402-487d-a4cf-cb56f6d6193f HTTP/1.1" 422
13:11:01.075 | INFO    | uvicorn.access - 192.168.1.10:55538 - "POST /api/logs/ HTTP/1.1" 201
13:11:01.081 | DEBUG   | uvicorn.error - < CLOSE 1000 (OK) [2 bytes]
13:11:01.081 | DEBUG   | uvicorn.error - = connection is CLOSING
13:11:01.082 | DEBUG   | uvicorn.error - > CLOSE 1000 (OK) [2 bytes]
13:11:01.082 | DEBUG   | uvicorn.error - x half-closing TCP connection
13:11:01.084 | DEBUG   | uvicorn.error - = connection is CLOSED
13:11:01.084 | INFO    | uvicorn.error - connection closed
13:11:01.588 | DEBUG   | prefect.server.events.services.event_persister - Persisting 9 events...
13:11:01.590 | DEBUG   | prefect.server.events.triggers - Running periodic evaluation as of 2024-12-23 11:11:01.558435+00:00 (offset -0.03208208084106445s)

@jakekaplan
Copy link
Contributor

@jbw-vtl no worries at all, thank you for the breakdown. That's very helpful. If you could file an issue for supporting NO_PROXY in the wrapper, happy to get a PR up for it.

@jakekaplan
Copy link
Contributor

@kavvkon from your logs I can see task runs inserted into the database. Does task run b500110c-c035-4c70-9d7d-79f8df5352b3 belong to flow run id 4da64fef-ca24-4f79-b19a-db91f97288ed? Do you see it in the task run/task run state table? If so, when looking at the graph what does your network tab show for /graph-v2?
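
(To check that outside the UI, something along these lines should hit the same endpoint; the exact path is an assumption based on the UI request name and may differ by version:)

import httpx

# Flow run id taken from the logs above; server address as configured in this setup.
flow_run_id = "4da64fef-ca24-4f79-b19a-db91f97288ed"
resp = httpx.get(f"http://192.168.1.100:4200/api/flow_runs/{flow_run_id}/graph-v2")
print(resp.status_code)  # a 500 here would match the AxiosError seen in the UI
print(resp.text[:500])   # body of the failed response, if any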

@kavvkon
Copy link

kavvkon commented Dec 23, 2024

Yes, I can see task run b500110c-c035-4c70-9d7d-79f8df5352b3 (get-contributors-95a) in flow run 4da64fef-ca24-4f79-b19a-db91f97288ed (elite-bee).

image

Indeed, there is an error in the graph-v2 request:
message: 'Request failed with status code 500', name: 'AxiosError', code: 'ERR_BAD_RESPONSE'

"AxiosError: Request failed with status code 500\n at tee (http://192.168.1.100:4200/assets/index-D4VbDMHP.js:115:1042)\n at XMLHttpRequest.W (http://192.168.1.100:4200/assets/index-D4VbDMHP.js:115:6094)\n at kv.request (http://192.168.1.100:4200/assets/index-D4VbDMHP.js:117:1971)\n at async Cnt.getFlowRunsGraph (http://192.168.1.100:4200/assets/index-D4VbDMHP.js:118:123221)\n at async Object.a [as start] (http://192.168.1.100:4200/assets/index-D4VbDMHP.js:1318:21173)"

@jakekaplan
Copy link
Contributor

jakekaplan commented Dec 23, 2024

@kavvkon can you look at the server logs when you hit /graph-v2? You should see the actual error occurring there. It seems like your task runs are written fine, but there's some issue with reading them out.

@kavvkon
Copy link

kavvkon commented Dec 23, 2024

> @kavvkon can you look at the server logs when you hit /graph-v2? You should see the actual error occurring there. It seems like your task runs are written fine, but there's some issue with reading them out.

Hmm, not what I expected:

Traceback (most recent call last):
  File "/home/user1/miniconda3/envs/prefect/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1815, in _execute_context
    context = constructor(
              ^^^^^^^^^^^^
  File "/home/user1/miniconda3/envs/prefect/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 1493, in _init_compiled
    flattened_processors[key](compiled_params[key])
  File "/home/user1/miniconda3/envs/prefect/lib/python3.12/site-packages/sqlalchemy/sql/type_api.py", line 2037, in process
    fixed_process_param(value, dialect)
  File "/home/user1/miniconda3/envs/prefect/lib/python3.12/site-packages/prefect/server/utilities/database.py", line 127, in process_bind_param
    if value.tzinfo is None:
       ^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'tzinfo'
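
The failing spot is process_bind_param reading .tzinfo off what is apparently a plain string. For illustration only (this is not Prefect's actual Timestamp type), a defensive bind-param that coerces ISO-8601 strings first would look roughly like:

from datetime import datetime, timezone

import sqlalchemy as sa
from sqlalchemy.types import TypeDecorator


class DefensiveTimestamp(TypeDecorator):
    """Illustrative timezone-aware timestamp type, not Prefect's implementation."""

    impl = sa.TIMESTAMP(timezone=True)
    cache_ok = True

    def process_bind_param(self, value, dialect):
        if value is None:
            return None
        # Hypothetical guard: coerce ISO-8601 strings before touching .tzinfo,
        # which is the attribute access that raised the AttributeError above.
        # (datetime.fromisoformat accepts a trailing 'Z' on Python 3.11+.)
        if isinstance(value, str):
            value = datetime.fromisoformat(value)
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)
        return value.astimezone(timezone.utc)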

@jakekaplan
Copy link
Contributor

jakekaplan commented Dec 23, 2024

That's not what I would have expected either... Maybe one of the task run timestamps doesn't have a tz? I'm not positive how that's possible.

@kavvkon Could you open a separate issue for this?

An MRE with a @task that reproduces this reliably every time in a fresh setup would really help in trying to solve it.

@jbw-vtl
Copy link
Contributor

jbw-vtl commented Dec 29, 2024

> @jbw-vtl no worries at all, thank you for the breakdown. That's very helpful. If you could file an issue for supporting NO_PROXY in the wrapper, happy to get a PR up for it.

Took a stab at implementing the behaviour within #16537

@jakekaplan
Copy link
Contributor

Going to close this issue, as the likely primary causes have been addressed and it has become a bit of a catch-all.

If anyone continues to see missing task runs, please follow these debugging steps and open a new issue with a MRE. Thank you!

@github-project-automation github-project-automation bot moved this from In progress to Done in OSS Backlog Jan 10, 2025