Query activity state when GetExecBatchResults call fails due to GSB error #587

Closed
azawlocki opened this issue Aug 9, 2021 · 1 comment · Fixed by #588
Labels: beta.3, enhancement (New feature or request)

Comments

azawlocki commented Aug 9, 2021

An idea from @mfranciszkiewicz (from Discord):

(...) Failing calls to the GetExecBatchResults endpoint are not unusual, but the cause is usually limited to "endpoint not registered". In this kind of situation, the requestor could try to query activity state, which would provide us with the following information:

  • if it's not possible to query the activity state, the provider is gone or is having networking issues,
  • if we can retrieve the state (most probably "Terminated"), we can log a message about premature activity termination on the provider's side. The last known activity state is stored by the provider in their database for such cases.

The latter could be extended with state metadata / cause of the issue; this task already resides in the core team's backlog.

My refinement:

I'm trying to figure out where call retrying fits in this picture. Does the following look reasonable to you?

if call failed with "endpoint address not found":
    if try_query_activity_state() == "Terminated":
        log("Premature activity termination by provider")
    else:  # a different state or query failed
        repeat the call
else:  # call failed for other reason
    repeat the call

@mfranciszkiewicz: Yes, that's basically it.

Edit (per mfranciszkiewicz): changed the error message to "endpoint address not found".
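
For illustration, the retry logic in the pseudocode above could be sketched in Python roughly as follows. This is only a sketch under stated assumptions, not the actual implementation: `call_with_state_check`, `ActivityTerminatedError`, the `max_attempts` bound and the substring match on the error message are all hypothetical names and choices made for the example. The pseudocode itself does not say how many times the call should be repeated.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Assumption: the GSB error can be recognised by this substring of its message.
ENDPOINT_NOT_FOUND = "endpoint address not found"


class ActivityTerminatedError(Exception):
    """Hypothetical error: the provider reported the activity as Terminated."""


def call_with_state_check(
    call: Callable[[], T],
    query_activity_state: Callable[[], str],
    max_attempts: int = 3,  # assumption: the issue does not specify a retry limit
    log: Callable[[str], None] = print,
) -> T:
    """Run `call`, querying the activity state when it fails with a GSB error."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return call()
        except Exception as err:
            last_error = err
            if ENDPOINT_NOT_FOUND in str(err).lower():
                try:
                    state: Optional[str] = query_activity_state()
                except Exception:
                    # Query failed: provider is gone or has networking issues.
                    state = None
                if state == "Terminated":
                    log("Premature activity termination by provider")
                    raise ActivityTerminatedError(str(err)) from err
            # A different state, a failed query, or another error: repeat the call.
    assert last_error is not None
    raise last_error
```

Here `call` would wrap the GetExecBatchResults request and `query_activity_state` the activity state query; both are passed in as plain callables so the sketch stays independent of any particular client API.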

azawlocki added the bug, enhancement and beta.3 labels, then removed the bug label, on Aug 9, 2021
azawlocki commented Aug 9, 2021

Supersedes #584
