fix(dataflow): catch all exceptions when creating a KafkaStreams object #5405

Merged
merged 1 commit into from
Mar 7, 2024

Conversation

lc525
Member

@lc525 lc525 commented Mar 7, 2024

Previously, we only caught a StreamsException. However, the creation might fail for many reasons (for example, incorrect configuration).

We want to catch any exception so that we mark the pipeline creation as failed and we don't stop the connection to the scheduler.

Previously, on configuration errors, the exception would bubble up to the PipelineSubscriber event loop and break the connection to the scheduler. We would then try to reconnect, but on reconnect the scheduler would try to re-init the problematic pipeline (with the same id). This led to an error about existing, uncleaned KafkaStreams state in /tmp. That second error was handled cleanly (i.e., it no longer broke the connection to the scheduler), but it masked the real reason for the failure when looking at the pipeline status (via k8s or seldon cli).
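
For illustration only, here is a minimal Java sketch of the difference between the narrow and the broad catch described above. It is not the code changed in this PR: the class, the `Result` type, and the two stand-in exception classes are hypothetical, and they only mimic the situation where a configuration error is not a `StreamsException`, so that the example runs without Kafka on the classpath.

```java
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from the Seldon codebase.
class PipelineCreationSketch {

    // Stand-in for the exception type that was caught before this change.
    static class StreamsLikeException extends RuntimeException {
        StreamsLikeException(String message) { super(message); }
    }

    // Stand-in for a configuration error, which is NOT a StreamsLikeException.
    static class ConfigLikeException extends RuntimeException {
        ConfigLikeException(String message) { super(message); }
    }

    // Hypothetical outcome reported back as the pipeline status.
    static final class Result {
        final boolean created;
        final String reason;

        Result(boolean created, String reason) {
            this.created = created;
            this.reason = reason;
        }
    }

    // Narrow catch: any other exception type escapes to the caller
    // (in the PR's description, the event loop holding the scheduler connection).
    static Result createCatchingStreamsOnly(Supplier<Object> factory) {
        try {
            factory.get();
            return new Result(true, "");
        } catch (StreamsLikeException e) {
            return new Result(false, e.getMessage());
        }
    }

    // Broad catch: every failure becomes a "creation failed" result that
    // carries the original reason, and nothing propagates to the caller.
    static Result createCatchingAll(Supplier<Object> factory) {
        try {
            factory.get();
            return new Result(true, "");
        } catch (Exception e) {
            return new Result(false, e.getClass().getSimpleName() + ": " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        // A factory that fails the way a misconfigured pipeline might.
        Supplier<Object> badConfig = () -> {
            throw new ConfigLikeException("invalid value for some setting");
        };

        Result broad = createCatchingAll(badConfig);
        System.out.println("created=" + broad.created + ", reason=" + broad.reason);

        try {
            createCatchingStreamsOnly(badConfig);
        } catch (RuntimeException e) {
            System.out.println("escaped to caller: " + e.getMessage());
        }
    }
}
```

With the broad catch, the first failure is the one recorded, so a later error about leftover state would not be the only thing visible in the pipeline status.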

@lc525 lc525 requested a review from sakoush as a code owner March 7, 2024 16:30
@lc525 lc525 changed the title catch all exceptions when creating a KafkaStreams object fix(dataflow): catch all exceptions when creating a KafkaStreams object Mar 7, 2024
@sakoush sakoush added the v2 label Mar 7, 2024
Member

@sakoush sakoush left a comment

lgtm

@lc525 lc525 merged commit 6620fbc into SeldonIO:v2 Mar 7, 2024
3 of 4 checks passed
@lc525 lc525 deleted the fix.dataflow.pipeline-stuck branch March 8, 2024 09:27