[SPARK-4188] [Core] Perform network-level retry of shuffle file fetches #3101

Closed
wants to merge 6 commits into from

Commits on Nov 5, 2014

  1. [SPARK-4238] [Core] Perform network-level retry of shuffle file fetches

    This adds a RetryingBlockFetcher to the NettyBlockTransferService. It wraps our typical OneForOneBlockFetcher and adds retry logic in the event of an IOException.
    
    This sort of retry allows us to avoid marking an entire executor as failed due to garbage collection or high network load.
    
    TODO:
    - [ ] unit tests
    - [ ] put in ExternalShuffleClient too
    aarondav committed Nov 5, 2014
    66e5a24
  2. 05ff43c
  3. Fix unit test

    aarondav committed Nov 5, 2014
    6f594cd
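
The first commit above describes wrapping a fetcher with retry logic that triggers on an IOException. As a rough illustration of that idea only, here is a minimal sketch. The class `RetryingFetchSketch`, the method `fetchWithRetry`, and the parameters `maxRetries` and `waitMillis` are hypothetical names chosen for this example; they are not taken from the PR, and the real RetryingBlockFetcher may be structured quite differently (for example, it may retry asynchronously rather than blocking).

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical sketch of retry-on-IOException; not the PR's RetryingBlockFetcher. */
final class RetryingFetchSketch {

    /**
     * Runs the fetch and retries it when it throws an IOException.
     * maxRetries is the number of additional attempts after the first failure;
     * waitMillis is a fixed pause between attempts (both are assumed parameters).
     * Any other exception type is propagated immediately without retrying.
     */
    static <T> T fetchWithRetry(Callable<T> fetch, int maxRetries, long waitMillis)
            throws Exception {
        int retriesUsed = 0;
        while (true) {
            try {
                return fetch.call();
            } catch (IOException e) {
                if (retriesUsed >= maxRetries) {
                    throw e; // retries exhausted, surface the last failure
                }
                retriesUsed++;
                Thread.sleep(waitMillis); // give a transient problem time to clear
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated fetch: fails twice with an IOException, then succeeds.
        String result = fetchWithRetry(() -> {
            if (calls[0]++ < 2) {
                throw new IOException("simulated transient failure");
            }
            return "block-data";
        }, 3, 10L);
        System.out.println(result + " fetched after " + calls[0] + " attempts");
    }
}
```

Retrying only on IOException, and letting every other exception through, mirrors the commit message's wording: a transient network-level failure gets another chance, while other errors still fail fast.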

Commits on Nov 6, 2014

  1. Address initial comments

    aarondav committed Nov 6, 2014
    e80e4c2
  2. Fix unit tests

    aarondav committed Nov 6, 2014
    c7fd107

Commits on Nov 7, 2014

  1. 72a2a32