Retrieval deals fail if running many in parallel #2743

Closed
mishmosh opened this issue Jul 31, 2020 · 6 comments

Comments

@mishmosh
Contributor

From @stefangeorgerobert:

I ran parallel retrievals for approximately 10 hours on our dev environment, and most of them failed with a timeout.

2020/07/31:43:23:984  [ INFO ]   RetrieveDeals: TOTAL : 61
2020/07/31:43:23:984  [ INFO ]   RetrieveDeals: SUCCESSFUL : 3
  • To run a retrieval bot that doesn't crash its lotus node, we are heavily constraining the retrieval volume. MAX_RETRIEVE_DEALS_RUNS is currently set to 2, which means the bot can have only 2 retrievals in flight and must wait for their results before requesting more.
  • Serial retrievals seem more successful: the calibration.spacerace dashboard currently shows an Average Retrieval Deal Success Rate of 26%.
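
To illustrate the kind of cap described above, here is a minimal sketch of bounding in-flight retrievals with a fixed-size worker pool. It is not the bot's actual code: `retrieve_deal` and `run_retrievals` are hypothetical names, the retrieval itself is simulated with a short sleep rather than a call to a lotus node, and only the `MAX_RETRIEVE_DEALS_RUNS` name and its value of 2 come from the report.

```python
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Cap on in-flight retrievals. The variable name and the default of 2 come
# from the report above; reading it from the environment is an assumption.
MAX_RETRIEVE_DEALS_RUNS = int(os.environ.get("MAX_RETRIEVE_DEALS_RUNS", "2"))

_lock = threading.Lock()
_in_flight = 0
peak_in_flight = 0  # highest number of simultaneous retrievals observed


def retrieve_deal(deal_id):
    """Hypothetical stand-in for one retrieval; sleeps instead of calling a node."""
    global _in_flight, peak_in_flight
    with _lock:
        _in_flight += 1
        peak_in_flight = max(peak_in_flight, _in_flight)
    try:
        time.sleep(0.01)  # simulated work
        return deal_id, "ok"
    finally:
        with _lock:
            _in_flight -= 1


def run_retrievals(deal_ids, limit=MAX_RETRIEVE_DEALS_RUNS):
    """Run all retrievals, never more than `limit` at the same time."""
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # pool.map returns results in the same order as deal_ids
        return list(pool.map(retrieve_deal, deal_ids))


results = run_retrievals(range(6), limit=2)
```

With a cap of 2, the bot's throughput is bounded by how long each pair of retrievals takes, which is why slow or timing-out retrievals reduce the deal-making rate so sharply.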

Based on @jimpick's manual testing:

Based on manual testing with my custom testground instance, if I do a single retrieval, it completes in 1 minute. If I do 2 in parallel, each takes 2 minutes to complete. If I do 6 in parallel, each takes 7 minutes to complete.
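
As a rough check on these figures, the sketch below converts them into aggregate throughput (retrievals completed per minute). It is only arithmetic on the numbers quoted above, not a diagnosis; the function name is made up for illustration.

```python
# (parallel retrievals, minutes each one took), as quoted above
observations = [(1, 1), (2, 2), (6, 7)]


def throughput_per_minute(parallel, minutes_each):
    """Aggregate retrievals completed per minute when `parallel` run at once."""
    return parallel / minutes_each


rates = {n: throughput_per_minute(n, m) for n, m in observations}
for n, rate in rates.items():
    print(f"{n} in parallel: {rate:.2f} retrievals/minute")
# 1 in parallel: 1.00 retrievals/minute
# 2 in parallel: 1.00 retrievals/minute
# 6 in parallel: 0.86 retrievals/minute
```

Aggregate throughput stays at or below one retrieval per minute regardless of the parallelism, which is what one would expect if the retrievals were effectively being processed one at a time.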

@dirkmc has a PR (#2640) that may address this, but this has not been confirmed.

@whyrusleeping
Member

Thanks for the report! I think this should be resolved by #2640, but I'm not 100% sure.

@dirkmc what do you think?

@mishmosh
Contributor Author

mishmosh commented Aug 4, 2020

Short-term, this is causing major delays to the retrieval deal-making rate from the Space Race bot. Proceeding with the workaround, which is to run more parallel retrieval dealbot nodes.

Long-term, the ability to do multiple retrievals in parallel is essential for production usage.

@jimpick
Contributor

jimpick commented Aug 6, 2020

#2640 has landed and when I test parallel deals using my testground setup, they run as fast as single deals.

@TippyFlitsUK
Contributor

Hi @mishmosh

This looks like a reminder. Should we close it or keep it open?

@Reiers added the need/author-input label on Apr 18, 2022
@github-actions

Oops, it seems we need more information for this issue. Please comment with more details, or this issue will be closed in 24 hours.

@github-actions

This issue was closed because it is missing author input.

6 participants