Configuration Properties

SparkRDMA has several runtime properties that can be set alongside other Spark properties:
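
As a rough illustration of setting these properties next to ordinary Spark ones, the sketch below renders a dictionary of properties as `spark-submit` `--conf key=value` arguments. The helper name `to_conf_args` and the chosen values are hypothetical and only for illustration; the `spark.shuffle.rdma.*` names come from the tables on this page.

```python
# Hypothetical helper (not part of SparkRDMA or Spark): renders properties
# as spark-submit "--conf key=value" arguments.
def to_conf_args(props):
    args = []
    for key, value in sorted(props.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


props = {
    "spark.executor.memory": "4g",              # ordinary Spark property
    "spark.shuffle.rdma.recvQueueDepth": 2048,  # SparkRDMA property (example value)
    "spark.shuffle.rdma.useOdp": "false",       # SparkRDMA property (default value)
}

conf_args = to_conf_args(props)
print(" ".join(conf_args))
```

The same key/value pairs could equally be placed in `spark-defaults.conf`; the command-line form is used here only because it is easy to show in one line.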

General Properties

Property Name	Default/Min/Max	Description
spark.shuffle.rdma.driverPort	random/1025/65535	Port the RDMA driver instance will listen on.
spark.shuffle.rdma.executorPort	random/1025/65535	Port the RDMA executor instances will listen on.
spark.shuffle.rdma.portMaxRetries	16/1/1000	Maximum number of attempts to bind to an RDMA port before failing. Each retry will increment the previously attempted port number by 1. This value applies to both the RDMA driverPort and RDMA executorPort.
spark.shuffle.rdma.cpuList	All CPUs/--/--	The list of CPUs that the RDMA services should use for thread creation and event processing. It is recommended to use only the CPU cores associated with the NUMA node that the Mellanox NIC is attached to. Specify the value as a comma-separated list; hyphenated ranges are also accepted. Invalid syntax results in reverting to the default value. Examples: `1,3,5`, `1-5`, or `1-4,10-12`.
spark.shuffle.rdma.useOdp	false	Whether to use On-Demand Paging (ODP), a technique that eases memory registration. With ODP, applications do not need to pin the underlying physical pages of the address space or track the validity of the mappings. Instead, the HCA (Host Channel Adapter) requests the latest translations from the OS when pages are not present, and the OS invalidates translations that are no longer valid because of non-present pages or mapping changes.
spark.shuffle.rdma.collectOdpStats	true	Collect and report ODP statistics.
spark.shuffle.rdma.device.num	0	Device number used to read ODP statistics from sysfs (`/sys/class/infiniband_verbs/uverbs$DEVICE_NUMBER/`). Applies only if `spark.shuffle.rdma.useOdp=true` and `spark.shuffle.rdma.collectOdpStats=true`.
spark.shuffle.rdma.preAllocateBuffers		Comma-separated list of `buffer size:buffer count` pairs, e.g. `4k:1000,16k:500`.
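
The value syntaxes for `spark.shuffle.rdma.cpuList` and `spark.shuffle.rdma.preAllocateBuffers` can be illustrated with a small parser. This is a minimal sketch, not SparkRDMA's own parsing code: the function names are hypothetical, the `k`/`m`/`g` suffixes are assumed to be powers of 1024, and returning `None` stands in for "revert to the default value".

```python
# Assumption: size suffixes are binary multiples (k = 1024 bytes, and so on).
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a string such as '4k' or '512' into a number of bytes."""
    text = text.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def parse_cpu_list(spec):
    """Expand e.g. '1-4,10-12' into a list of CPU ids.

    Returns None on invalid syntax, which the caller can treat as
    "fall back to the default (all CPUs)".
    """
    cpus = []
    for part in spec.split(","):
        part = part.strip()
        try:
            if "-" in part:
                lo, hi = (int(x) for x in part.split("-"))
                if lo > hi:
                    return None
                cpus.extend(range(lo, hi + 1))
            else:
                cpus.append(int(part))
        except ValueError:
            return None
    return cpus


def parse_prealloc(spec):
    """Turn 'size:count,size:count' into a list of (bytes, count) tuples."""
    pairs = []
    for item in spec.split(","):
        size, count = item.split(":")
        pairs.append((parse_size(size), int(count)))
    return pairs


print(parse_cpu_list("1-4,10-12"))        # [1, 2, 3, 4, 10, 11, 12]
print(parse_cpu_list("1-x"))              # None
print(parse_prealloc("4k:1000,16k:500"))  # [(4096, 1000), (16384, 500)]
```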

RDMA Queue Pair (QP) Properties

Property Name	Default/Min/Max	Description
spark.shuffle.rdma.recvQueueDepth	1024/256/65535	The maximum number of outstanding receive work requests that can be posted to the QP.
spark.shuffle.rdma.sendQueueDepth	4096/256/65535	The maximum number of outstanding send work requests that can be posted to the QP.
spark.shuffle.rdma.recvWrSize	4k/2k/1m	The size (in bytes) of the buffers used to receive data from a SEND operation.
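
The Default/Min/Max column can be read as a range check. The sketch below validates a candidate value against the triples listed in the table above before it is passed to Spark. The helper names are hypothetical, size suffixes are assumed to be powers of 1024, and raising an error is a choice made for this example; how SparkRDMA itself reacts to an out-of-range value is not stated on this page.

```python
# Assumption: size suffixes are binary multiples (k = 1024 bytes, and so on).
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

# Default/Min/Max triples copied from the table above.
LIMITS = {
    "spark.shuffle.rdma.recvQueueDepth": "1024/256/65535",
    "spark.shuffle.rdma.sendQueueDepth": "4096/256/65535",
    "spark.shuffle.rdma.recvWrSize": "4k/2k/1m",
}


def parse_size(text):
    """Convert a string such as '4k' or '1024' into a plain integer."""
    text = text.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def check(name, value):
    """Return the parsed value if it lies within [min, max], else raise ValueError."""
    _default, lo, hi = (parse_size(p) for p in LIMITS[name].split("/"))
    parsed = parse_size(str(value))
    if not lo <= parsed <= hi:
        raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")
    return parsed


print(check("spark.shuffle.rdma.recvWrSize", "8k"))      # 8192
print(check("spark.shuffle.rdma.sendQueueDepth", 4096))  # 4096
```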

RDMA Connection Management Properties

Property Name	Default/Min/Max	Description
spark.shuffle.rdma.rdmaCmEventTimeout	20000/-1/60000	The amount of time to wait (in milliseconds) for RDMA CM events before failing. A value of -1 means to wait forever.
spark.shuffle.rdma.teardownListenTimeout	50/-1/60000	The amount of time to wait (in milliseconds) for RDMA disconnect events before failing. A value of -1 means to wait forever.
spark.shuffle.rdma.resolvePathTimeout	2000/-1/60000	The amount of time to wait (in milliseconds) for RDMA resolve address and resolve route events before failing. A value of -1 means to wait forever.
spark.shuffle.rdma.maxConnectionAttempts	5/1/100	Maximum number of attempts to set up a remote connection before failing the task.
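
The timeout convention above (milliseconds, with -1 meaning "wait forever") and the bounded number of connection attempts can be sketched as follows. This is an illustration of the semantics only, built on Python's standard `threading.Event`; the function names are hypothetical and the code does not touch any RDMA API.

```python
import threading


def wait_for_event(event, timeout_ms):
    """Block until `event` is set; -1 waits forever, otherwise raise after timeout_ms."""
    timeout_s = None if timeout_ms == -1 else timeout_ms / 1000.0
    if not event.wait(timeout_s):
        raise TimeoutError(f"no event within {timeout_ms} ms")


def connect_with_retries(connect, max_attempts=5):
    """Call `connect` up to max_attempts times, then give up (failing the task)."""
    last_error = None
    for _attempt in range(max_attempts):
        try:
            return connect()
        except (TimeoutError, ConnectionError) as err:
            last_error = err
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


# Usage: an event that is already set returns immediately, even with -1.
ready = threading.Event()
ready.set()
wait_for_event(ready, -1)
print(connect_with_retries(lambda: "connected"))
```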

Shuffle Writer (Mapper) Properties

Property Name	Default/Min/Max	Description
spark.shuffle.rdma.shuffleWriteBlockSize	8m/4k/512m	The storage block size used by the shuffle writer. With the "ChunkedPartitionAgg" writer method, it is the size of each memory buffer used to store shuffle write data. In "Wrapper" mode, it is the size of each file mapping; for example, a 120MB file is broken down into 8MB file mappings.
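
The 120MB example works out to 15 mappings of 8MB each. A minimal sketch of that arithmetic is shown below; the function name is hypothetical, and the handling of a trailing partial block (a shorter final mapping) is an assumption made for this example rather than documented SparkRDMA behavior.

```python
MB = 1024 * 1024


def split_into_mappings(file_size, block_size):
    """Return (offset, length) pairs that cover file_size in block_size steps."""
    return [
        (offset, min(block_size, file_size - offset))
        for offset in range(0, file_size, block_size)
    ]


mappings = split_into_mappings(120 * MB, 8 * MB)
print(len(mappings))          # 15
print(mappings[-1][1] // MB)  # 8
```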

Shuffle Reader (Reducer) Properties

Property Name	Default/Min/Max	Description
spark.shuffle.rdma.shuffleReadBlockSize	256k/0/512m	The transfer size to be used for block fetches on shuffle read operations. The SparkRDMA layer will aggregate the blocks into a single buffer until it reaches this size. When set to "0", no aggregation will be performed on the reader side.
spark.shuffle.rdma.maxBytesInFlight	64m/128k/100g	Maximum bytes that shuffle read operations will attempt to fetch at any given moment. If this threshold is reached, then fetches will resume only once outstanding requests complete.
spark.shuffle.rdma.partitionLocationFetchTimeout	30000/1000/MAX_INT	The amount of time to wait (in milliseconds) when fetching shuffle metadata.
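
The aggregation controlled by `spark.shuffle.rdma.shuffleReadBlockSize` can be pictured as grouping consecutive blocks into fetches. The sketch below is one plausible reading, not SparkRDMA's actual algorithm: it assumes a group is closed as soon as adding the next block would exceed the limit, and that an oversized block is fetched on its own. The function name is hypothetical.

```python
KB = 1024


def aggregate_blocks(block_sizes, read_block_size):
    """Group consecutive block sizes into fetches.

    A group is closed when adding the next block would push it past
    read_block_size; a block larger than the limit gets a group of its own.
    A limit of 0 disables aggregation (one block per fetch).
    """
    if read_block_size == 0:
        return [[size] for size in block_sizes]
    groups, current, total = [], [], 0
    for size in block_sizes:
        if current and total + size > read_block_size:
            groups.append(current)
            current, total = [], 0
        current.append(size)
        total += size
    if current:
        groups.append(current)
    return groups


blocks = [100 * KB, 100 * KB, 100 * KB, 300 * KB, 50 * KB]
for group in aggregate_blocks(blocks, 256 * KB):
    print([size // KB for size in group])
```

With the default of 256k, the first two 100KB blocks share one fetch, while the remaining blocks are each fetched separately in this model.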
