RDMA SRQ implementation for NVMe-oF target #3

EugeneKochetov · 2018-10-04T15:24:41Z

nvmf/rdma: Add shared receive queue support

This is a new feature for the NVMe-oF RDMA target. It is intended to reduce resource allocation
(by sharing resources) and to exploit locality (of completions and memory) to get the best
performance with Shared Receive Queues (SRQs). We create one SRQ per core (poll group)
per device and associate each created QP/CQ with the appropriate SRQ.

Our testing environment has 2 hosts.
Host 1:
CPU: Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz dual-socket (8 cores total)
Network: ConnectX-5, ConnectX-5 VPI, 100GbE, single-port QSFP28, PCIe3.0 x16
Disk: Intel Optane SSD 900P Series
OS: Fedora 27 x86_64
Host 2:
CPU: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz dual-socket (24 cores total)
Network: ConnectX-4 VPI, 100GbE, dual-port QSFP28
Disk: Intel Optane SSD 900P Series
OS: CentOS 7.5.1804 x86_64
The hosts are connected via a Spectrum switch.
Host 1 runs the SPDK NVMe-oF target. Host 2 is the initiator and runs fio with the SPDK plugin.

Configuration:

- SPDK NVMe-oF target: cpu mask 0x0F (4 cores), max queue depth 128, max SRQ depth 1024, max QPs per controller 1024
- Single NVMf subsystem with a single namespace backed by a physical SSD
- fio with SPDK plugin: randread pattern, 1-256 jobs, block size 4k, IO depth 16, cpu_mask 0xFFF0, IO rate 10k, rate process “poisson”

Here is the full fio command line (the filename is quoted because it contains spaces):
fio --name=Job --stats=1 --group_reporting=1 --idle-prof=percpu --loops=1 --numjobs=1 --thread=1 --time_based=1 --runtime=30s --ramp_time=5s --bs=4k --size=4G --iodepth=16 --readwrite=randread --rwmixread=75 --randrepeat=1 --ioengine=spdk --direct=1 --gtod_reduce=0 --cpumask=0xFFF0 --rate_iops=10k --rate_process=poisson --filename='trtype=RDMA adrfam=IPv4 traddr=1.1.79.1 trsvcid=4420 ns=1'

SPDK allocates the following entities for every work request in a receive queue (shared or not): reqs (1024 bytes), recvs (96 bytes), cmds (64 bytes), cpls (16 bytes), and in_capsule_buffer. All except the last are fixed size, 1200 bytes in total. The in-capsule data size is configured to 4096 bytes.
Memory consumption calculation (target):

- Multiple SRQ: core_num * ib_devs_num * SRQ_depth * (1200 + in_capsule_data_size)
- Multiple RQ: queue_num * RQ_depth * (1200 + in_capsule_data_size)

We ignore admin queues in the calculations for simplicity.

Cases:

1. Multiple SRQ with 1024 entries:
   - Mem = 4 * 1 * 1024 * (1200 + 4096) = 20.7 MiB (constant; does not depend on the number of initiators)
2. RQ with 128 entries for 64 initiators:
   - Mem = 64 * 128 * (1200 + 4096) = 41.4 MiB
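The two formulas can be checked with a few lines of Python. This is only a sketch of the arithmetic above; the function names are made up for illustration and are not part of SPDK. The break-even helper is derived from the two formulas by setting them equal.

```python
# Fixed per-entry allocations listed above: reqs + recvs + cmds + cpls.
FIXED_BYTES_PER_ENTRY = 1024 + 96 + 64 + 16  # = 1200
MIB = 1024 * 1024


def srq_memory_bytes(core_num, ib_devs_num, srq_depth, in_capsule_data_size):
    """Memory for one SRQ per core (poll group) per device."""
    per_entry = FIXED_BYTES_PER_ENTRY + in_capsule_data_size
    return core_num * ib_devs_num * srq_depth * per_entry


def rq_memory_bytes(queue_num, rq_depth, in_capsule_data_size):
    """Memory for one receive queue per QP."""
    per_entry = FIXED_BYTES_PER_ENTRY + in_capsule_data_size
    return queue_num * rq_depth * per_entry


def break_even_queue_num(core_num, ib_devs_num, srq_depth, rq_depth):
    """Number of I/O queues at which per-QP RQs need as much memory as the SRQs."""
    return core_num * ib_devs_num * srq_depth / rq_depth


srq = srq_memory_bytes(4, 1, 1024, 4096)
rq = rq_memory_bytes(64, 128, 4096)
print(f"SRQ: {srq / MIB:.1f} MiB")  # SRQ: 20.7 MiB
print(f"RQ:  {rq / MIB:.1f} MiB")   # RQ:  41.4 MiB
print(break_even_queue_num(4, 1, 1024, 128))  # 32.0
```

With the tested configuration the model predicts that SRQ starts to save memory once more than 32 I/O queues are connected.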

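Results:

| FIO_JOBS | kIOPS (RQ) | kIOPS (SRQ) | Bandwidth, MiB/s (RQ) | Bandwidth, MiB/s (SRQ) | AvgLatency, us (RQ) | AvgLatency, us (SRQ) | MaxResidentSize, kiB (RQ) | MaxResidentSize, kiB (SRQ) |
|---|---|---|---|---|---|---|---|---|
| 1 | 8.623 | 8.623 | 33.7 | 33.7 | 13.89 | 14.03 | 144376 | 155624 |
| 2 | 17.3 | 17.3 | 67.4 | 67.4 | 14.03 | 14.1 | 145776 | 155700 |
| 4 | 34.5 | 34.5 | 135 | 135 | 14.15 | 14.23 | 146540 | 156184 |
| 8 | 69.1 | 69.1 | 270 | 270 | 14.64 | 14.49 | 148116 | 156960 |
| 16 | 138 | 138 | 540 | 540 | 14.84 | 15.38 | 151216 | 158668 |
| 32 | 276 | 276 | 1079 | 1079 | 16.5 | 16.61 | 157560 | 161936 |
| 64 | 513 | 502 | 2005 | 1960 | 1673.31 | 1612.38 | 170408 | 168440 |
| 128 | 535 | 526 | 2092 | 2054 | 3329.79 | 3344.03 | 195796 | 181524 |
| 256 | 571 | 571 | 2232 | 2233 | 6854.57 | 6873.37 | 246484 | 207856 |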

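The benefit is in memory consumption. Up to 32 jobs SRQ has a slightly larger resident size, because the SRQ memory is a fixed cost that does not depend on the number of initiators. From 64 jobs onward SRQ uses less memory than RQ, and at 256 jobs the maximum resident size is 207856 kiB versus 246484 kiB (about 16% less). IOPS, bandwidth, and average latency stay within a few percent of each other in all rows.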

The drawback of using SRQ is the risk of RNR (receiver not ready) errors when multiple clients initiate a large number of IOs simultaneously.
With an RQ per QP this is handled by Submission Queue flow control, so RNR cannot occur.
This patch does not contain any changes to solve the RNR issue, but we see at least two options:

- increase the RNR retry count, which is currently hardcoded to 0, and make it a configurable parameter.
- implement a mechanism for dynamic SRQ extension. The SRQ could grow with the number of IO queues or when its usage reaches some threshold.
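To see when RNR becomes possible, compare the worst-case number of commands that initiators can have in flight on one poll group with the SRQ depth. The following Python sketch assumes that every QP in a poll group shares a single SRQ and that each QP may have up to its full queue depth outstanding. The function names are hypothetical and this is not SPDK code.

```python
def worst_case_receives(qps_in_poll_group, queue_depth):
    """Receive buffers needed if every QP fills its queue at the same time."""
    return qps_in_poll_group * queue_depth


def rnr_possible(qps_in_poll_group, queue_depth, srq_depth):
    """True if simultaneous submissions could exhaust the shared receive queue."""
    return worst_case_receives(qps_in_poll_group, queue_depth) > srq_depth


def max_safe_qps(queue_depth, srq_depth):
    """Largest number of QPs per SRQ that can never exhaust it."""
    return srq_depth // queue_depth


# Target limits from the configuration above: queue depth 128, SRQ depth 1024.
print(max_safe_qps(128, 1024))       # 8
print(rnr_possible(9, 128, 1024))    # True
# With the fio IO depth of 16 used in the test, far more QPs fit.
print(max_safe_qps(16, 1024))        # 64
```

Under these assumptions a dynamic extension policy would have to keep the SRQ depth at or above `worst_case_receives(...)` for the QPs currently attached, or accept RNR retries when it does not.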

Change-Id: I28f21649feae8022cd0f0afd5b01ae3ee2b3592e Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com> Reviewed-on: https://review.gerrithub.io/424618 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>

When emulating write_zeroes commands on devices that don't natively support them, we submit a write with a zeroed buffer. We used to just reuse the original bdev_io, but that was recently changed due to other splitting code added for iovs. When making those changes, we forgot to free the bdev_io for the write that was sent down to the device. Fixes: 183f37e (bdev: do not reuse bdev_io when...) Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: If08782c65f6305c0a9f9d15d74fd8823e1158e9b Reviewed-on: https://review.gerrithub.io/424733 Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>

The previous patch fixed the underlying issue. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ic5ccaf82aba4d2ed8644b34bd4d0294784fd1de2 Reviewed-on: https://review.gerrithub.io/424734 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>

It used to be enough for ioatdma to be unloaded - but now we require the channels to be bound to uio_pci_generic or vfio-pci. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I790fd909a4aa92adead5d711cce093e18e9f9595 Reviewed-on: https://review.gerrithub.io/424725 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>

spdk_bdev_close() must be called on the same thread as spdk_bdev_open(). Further, the remove callback on the descriptor will also be run on the same thread as spdk_bdev_open(). Change-Id: I949d6dd67de1e63d39f06944d473e4aa7134111b Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/424738 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: GangCao <gang.cao@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>

When a new bdev is created, struct spdk_bdev_module::examine_disk() may open and close the bdev. On the other hand, if something goes wrong, the creation procedure may unregister the newly created bdev, so a race condition appeared between _remove_notify() and spdk_bdev_close(). Add the new fields "closed" and "remove_notified" to struct spdk_bdev_desc, so that _remove_notify() and spdk_bdev_close() know how to deal with this situation. Change-Id: Ibfe915a4d76096796b039a13a4f49f26669eba2c Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com> Reviewed-on: https://review.gerrithub.io/423369 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>
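The idea of guarding a deferred notification with flags can be illustrated with a small model. This is a sketch, not the SPDK code: the class, the message queue, and the exact meaning given to each flag here are assumptions made for illustration, and the real logic in struct spdk_bdev_desc may differ.

```python
from collections import deque


class Desc:
    """Toy descriptor carrying the two guard flags named in the commit message."""

    def __init__(self, remove_cb):
        self.remove_cb = remove_cb
        self.closed = False            # set once the owner has closed the descriptor
        self.remove_notified = False   # set once the remove callback has been invoked


messages = deque()  # stand-in for deferred thread messages


def remove_notify(desc):
    # Skip the callback if the descriptor was closed in the meantime,
    # or if the owner has already been notified.
    if desc.closed or desc.remove_notified:
        return
    desc.remove_notified = True
    desc.remove_cb()


def unregister(open_descs):
    # Notification is deferred, so a close can happen before it runs.
    for desc in open_descs:
        messages.append(lambda d=desc: remove_notify(d))


def close(desc):
    desc.closed = True


def run_messages():
    while messages:
        messages.popleft()()


calls = []
desc = Desc(lambda: calls.append("removed"))
unregister([desc])   # schedules the deferred notification
close(desc)          # the close wins the race
run_messages()       # the deferred message sees closed=True and does nothing
print(calls)         # []
```

In this model the deferred message never touches a descriptor that has already been closed, which is the situation the commit message describes.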

spdk_bdev_io_put_buf() is responsible for reclaiming bdev-allocated buffers from a bdev_io. If there are bdev_ios waiting for one of these buffers, it calls spdk_bdev_io_set_buf() on the next bdev_io in the queue. This will set the iov_base and iov_len on the bdev_io to point to the bdev-allocated buffer. But spdk_bdev_io_put_buf() was calling spdk_bdev_io_set_buf() on the just completed bdev_io, not the next bdev_io in the queue. So fix that. Fixes: 844aedf ("bdev: Simplify get/set/put buf functions") Reported-by: Alan Tu Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ibbcad6e35a3db6991bd7deb3516229572f021638 Reviewed-on: https://review.gerrithub.io/424880 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
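The hand-off described above can be modeled in a few lines of Python. This is an illustrative sketch only: `BufferPool` and the dictionary-based I/O objects are hypothetical stand-ins, not the SPDK data structures.

```python
from collections import deque


class BufferPool:
    """Toy model of a buffer pool with a queue of I/Os waiting for a buffer."""

    def __init__(self, buffers):
        self.free = list(buffers)
        self.waiting = deque()

    def get_buf(self, io):
        if self.free:
            io["buf"] = self.free.pop()
        else:
            self.waiting.append(io)  # no buffer available, wait for a put

    def put_buf(self, io):
        buf = io["buf"]
        io["buf"] = None
        if self.waiting:
            # The fix: give the buffer to the next waiting I/O,
            # not back to the I/O that just completed.
            next_io = self.waiting.popleft()
            next_io["buf"] = buf
        else:
            self.free.append(buf)


pool = BufferPool(["buf0"])
first = {"name": "first", "buf": None}
second = {"name": "second", "buf": None}
pool.get_buf(first)    # takes the only buffer
pool.get_buf(second)   # has to wait
pool.put_buf(first)    # buffer moves to the waiter
print(first["buf"], second["buf"])  # None buf0
```

The bug corresponds to assigning `buf` back to `io` inside `put_buf`, which would leave the waiting I/O without a buffer.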

Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I69101b142b99403ecdd1b651fba85b8aa10df20e Reviewed-on: https://review.gerrithub.io/424721 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Lance Hartmann <lance.hartmann@oracle.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>

When --with-shared is specified, we need to make sure the shared libraries have been built before we start linking applications. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I98510b29cca67e09f0c33ac9c8e823e61ac5dc8d Reviewed-on: https://review.gerrithub.io/424722 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Lance Hartmann <lance.hartmann@oracle.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>

Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I3fa790389f56715629c0301f101a5244dc7ae516 Reviewed-on: https://review.gerrithub.io/424723 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Lance Hartmann <lance.hartmann@oracle.com> Reviewed-by: Seth Howell <seth.howell5141@gmail.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>

Change-Id: Idd365df7fb61eafb502f415adf70638bb91ded0e Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/424773 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>

Change-Id: I2fda87d6176e18f2face59c9d916db2b4631c05b Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/424774 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>

Latest scan-build finds a legitimate issue here - although it's not clear why it was only found when testing against a seemingly unrelated patch... Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ic981f6f403c638b0b2d454c0be9f7fbe145171d9 Reviewed-on: https://review.gerrithub.io/424887 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>

We cannot split an iov if a buffer hasn't been allocated yet. So always call spdk_bdev_io_get_buf on reads before trying to split. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I2c26efb9dc6cb2c7c3e3b7ae5bab2c37844b9113 Reviewed-on: https://review.gerrithub.io/424879 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>

This is an artifact from before SPDK had a configure script or a DPDK submodule. Make configure the only supported way for specifying the location of the DPDK installation to use with SPDK. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I5c197c46220928bb18b97c8807755967d76ea42c Reviewed-on: https://review.gerrithub.io/424893 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>

Change-Id: I6ae1f380aebbcf090a0ff31ff96fc4592fc29591 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/421173 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>

There is a race condition with the following sequence: (1) spdk_bdev_open(); (2) spdk_bdev_unregister(), which starts a deferred message; (3) spdk_bdev_close(); (4) the deferred message runs and crashes. Change-Id: I81fbced0849949cfb2dae5a7cc6f60c9685a8885 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/424739 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>

Change-Id: I51b312a086f18a5b5f63de27dd69e43a8cc7225d Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/424914 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>

Currently, nvme_ctrlr_start() executes the initialization process as a whole, so when there are many controllers in one system, they must call the function one by one. Here we add several states, which will help refactor the initialization process. Change-Id: I209cf964bbf6e151823a7ecdc6a3f6e6e69df297 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/424157 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>

Change-Id: Ide0c81b1cc29d67cec0c10ab877360db3699141e Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/424775 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>

Change-Id: I1e5be0e282b9e29f7bf7ca7d2720b9fd00539be0 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/424776 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>

Make clear that building with a DPDK other than SPDK's dpdk submodule is not limited to pointing at a full DPDK source directory, but also supports building against pre-built libraries and includes provided by an installation of the DPDK packages available from several distros. Change-Id: I40cd8132e45dbd366a4c93c891a95e8952b6620d Signed-off-by: Lance Hartmann <lance.hartmann@oracle.com> Reviewed-on: https://review.gerrithub.io/425003 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>

block_size should be extended_sector_size when metadata or PI is enabled. Change-Id: I2cba61975b0541ef64839a8cd117eb42f19742b3 Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com> Reviewed-on: https://review.gerrithub.io/425061 Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>

Now that it is required to be on the same thread, the message isn't necessary. Change-Id: I714b77b46467dbcfa51186c8404c5976eaeea08a Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/424593 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>

This initiates an error recovery instead of a disconnect. The error recovery may result in a disconnect if the qpair is not recoverable. This also resolves an issue where the disconnect may immediately release the resources associated with the rqpair, but upcoming wc entries may still reference it. Change-Id: I9d9e212a83129412e049c91c02725699ce2cac11 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/425010 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>

The underlying DPDK function we use reads an array at the provided index without checking for any out of bounds access. The array is RTE_MAX_LCORE elements long, so always manually check against that to keep our APIs safe. This fixes potential crashes with lcore == SPDK_ENV_LCORE_ID_ANY Change-Id: I3081b888275fbecba8ab95feb20d2074341e2fc7 Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-on: https://review.gerrithub.io/425042 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>

Change-Id: I859a390e830d43e921244a8a482a63e5c8afe56c Signed-off-by: Takeshi Yoshimura <tyos@jp.ibm.com> Reviewed-on: https://review.gerrithub.io/424016 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>

Changed "/bin/sh" to "/usr/bin/env bash" in scripts/pkgdep.sh. Previously the script failed on my Ubuntu server with the error: trap: ERR: bad trap Change-Id: I054c0388c462e2d0340b6e3d4e581a8e6cdc7097 Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com> Reviewed-on: https://review.gerrithub.io/424998 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>

This was not being tested on every patch previously. Change-Id: I0a1756f3709da6608f66a2160962ed9b0fb38c7b Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/424890 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>

This prevents us from using the bdev embedded in the lun after it has been freed. Change-Id: I780cf3eccca05a58d3461366fec024be42b8ff74 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/425174 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>

This allows a user to run the script without specifying a configuration file. Removing this dependency opens us up to things like calling vm_setup directly from a vagrantfile without having to first copy the spdk source over. Change-Id: I72074a445f8befc714c03cab57a2da539350c092 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/426944 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>

For some use cases (e.g. creating a test pool vm), we won't want to rsync spdk from the host, and instead will want to clone a new one from scratch. Change-Id: I4c27f8ffc6c04aa0901dfe5b536b7e6ba94f7693 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/426945 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>

Allow the option to run vm_setup.sh during provisioning on a fedora system. This is one step closer to fully automating the setup of a build pool test machine. Change-Id: Ia3965b31e0a9217d176ffe3c165b8eb6b3ccad13 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/426946 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>

Keeping consistent with the distributions we support in the build pool. Change-Id: Iec13686507f890e9e18ecd5a1f4f238b2370f1d7 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/427161 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>

Update the branch to point at our code rebased on the most recent mainline release of nvme-cli. This branch includes fixes for compatibility with gcc 8. Change-Id: Ie1bdb046d1e24e832bda585dd9841619451ec5f5 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/426985 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>

There were some inaccuracies in the vagrant documentation, such as typos, incorrect versions, and incorrect names. This commit fixes them. Change-Id: Ibe01f24c43bc105a19b27b81cea771c6711af7c5 Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com> Reviewed-on: https://review.gerrithub.io/427023 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>

As a side effect, if SPDK_LOG_DISABLED is used in spdk_log() as log level then no message will be printed. Change-Id: I2d57b60a5a310a9ef2a1187a81088d0acf828742 Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com> Reviewed-on: https://review.gerrithub.io/425105 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>

This patch is to introduce the specific QoS related structure and the enumeration for types of QoS rate limits. Later new types of QoS rate limits can be supported easily. Change-Id: Idb8d2e7627fd145bf2b0ddb296c968b6b068f48c Signed-off-by: GangCao <gang.cao@intel.com> Reviewed-on: https://review.gerrithub.io/424459 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>

Print shorter output about unsupported option instead of printing full "usage" text. Change-Id: I92a98b0bdf0b2ed9ac56da644f24777f76e7df29 Signed-off-by: Karol Latecki <karol.latecki@intel.com> Reviewed-on: https://review.gerrithub.io/426471 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com> Reviewed-by: Seth Howell <seth.howell5141@gmail.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>

Add create/delete/get methods from nvmf subsystem to spdkcli. Change-Id: Id1f4e539d14746c4d6108bb58df921c301d47e96 Signed-off-by: Pawel Kaminski <pawelx.kaminski@intel.com> Reviewed-on: https://review.gerrithub.io/423425 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paweł Niedźwiecki <pawelx.niedzwiecki@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>

Change-Id: Idf6dd895033393858e49a494f8aac67b8034c67d Signed-off-by: Pawel Kaminski <pawelx.kaminski@intel.com> Reviewed-on: https://review.gerrithub.io/423650 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paweł Niedźwiecki <pawelx.niedzwiecki@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>

Change-Id: I555abe8500547be81ca5a6566539dae7c8d7e690 Signed-off-by: Pawel Niedzwiecki <pawelx.niedzwiecki@intel.com> Reviewed-on: https://review.gerrithub.io/427244 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>

Change the system manager timeout for stopping services. Lagging services cause timeouts while shutting down the VM. Change-Id: I7428724cf8795ffe3c215bacdb56cb3f58fbdff0 Signed-off-by: Pawel Niedzwiecki <pawelx.niedzwiecki@intel.com> Reviewed-on: https://review.gerrithub.io/427715 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>

In newer versions of the kernel, there is a race condition that causes the kernel block layer to misbehave if an NVMe-oF subsystem is disconnected before it is fully removed. This can cause the shutdown test to hang during a sync call after nvmf tgt shutdown. See https://ci.spdk.io/spdk/builds/review/d1ecb02290a63c7e17cbe9becd7d48e3d0f245d6.1537982067/fedora-06/build.log for details. This problem has also been observed and fixed in the RPC tests: https://review.gerrithub.io/c/spdk/spdk/+/426772 Change-Id: I9ec8517e067448be323e69979adf5d3915195c80 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/427152 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>

This patch fixes the following scan build failure. nbd.c:697:6: warning: Use of memory after it is freed if (io->state == NBD_IO_XMIT_RESP) { ^~~~~~~~~ Change-Id: Icba8b509604b064bff36d6ef63ecf02617ad7666 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/427366 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>

Consolidating multiple reinsertions into one place will make the logic a little cleaner. Change-Id: Iab7e9f8e7dcdebbec9d51e151b1d838567c1dcc4 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/427441 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>

Following commit "bdev: add spdk_bdev_queue_io_wait", this patch adds io_wait support to nbd. Change-Id: I9c8a6f5d20afbada45cedffe9d49846a992d2581 Signed-off-by: Ni Xun <nixun@baidu.com> Signed-off-by: Li Lin <lilin24@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/425594 Reviewed-by: Jim Harris <james.r.harris@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>

We need to include it in devel package. Change-Id: I823200632e8bcb9fdb86c8cb5fbf3a651a710b78 Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com> Reviewed-on: https://review.gerrithub.io/426839 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>

Each file that needs to check SPDK_CONFIG_* options needs to include spdk/config.h explicitly. Change-Id: If9f2a91ac4c2b1a300dcf88ec3e2a12714ad344a Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com> Reviewed-on: https://review.gerrithub.io/427221 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>

This is a new feature for NVMEoF RDMA target, that is intended to save resource allocation (by sharing them) and utilize the locality (completions and memory) to get the best performance with Shared Receive Queues (SRQs). We'll create a SRQ per core (poll group), per device and associate each created QP/CQ with an appropriate SRQ. Our testing environment has 2 hosts. Host 1: CPU: Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz dual socket (8 cores total) Network: ConnectX-5, ConnectX-5 VPI , 100GbE, single-port QSFP28, PCIe3.0 x16 Disk: Intel Optane SSD 900P Series OS: Fedora 27 x86_64 Host 2: CPU: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz dual-socket (24 cores total) Network: ConnectX-4 VPI , 100GbE, dual-port QSFP28 Disk: Intel Optane SSD 900P Series OS : CentOS 7.5.1804 x86_64 Hosts are connected via Spectrum switch. Host 1 is running SPDK NVMeoF target. Host 2 is used as initiator running fio with SPDK plugin. Configuration: - SPDK NVMeoF target: cpu mask 0x0F (4 cores), max queue depth 128, max SRQ depth 1024, max QPs per controller 1024 - Single NVMf subsystem with single namespace backed by physical SSD disk - fio with SPDK plugin: randread pattern, 1-256 jobs, block size 4k, IO depth 16, cpu_mask 0xFFF0, IO rate 10k, rate process “poisson” Here is a full fio command line: fio --name=Job --stats=1 --group_reporting=1 --idle-prof=percpu --loops=1 --numjobs=1 --thread=1 --time_based=1 --runtime=30s --ramp_time=5s --bs=4k --size=4G --iodepth=16 --readwrite=randread --rwmixread=75 --randrepeat=1 --ioengine=spdk --direct=1 --gtod_reduce=0 --cpumask=0xFFF0 --rate_iops=10k --rate_process=poisson --filename=trtype=RDMA adrfam=IPv4 traddr=1.1.79.1 trsvcid=4420 ns=1 SPDK allocates the following entities for every work request in receive queue (shared or not): reqs (1024 bytes), recvs (96 bytes), cmds (64 bytes), cpls (16 bytes), in_capsule_buffer. All except the last one are fixed size. In capsule data size is configured to 4096. 
Memory consumption calculation (target):
- Multiple SRQ: core_num * ib_devs_num * SRQ_depth * (1200 + in_capsule_data_size)
- Multiple RQ: queue_num * RQ_depth * (1200 + in_capsule_data_size)

Here 1200 bytes is the sum of the fixed-size entities per work request (1024 + 96 + 64 + 16). We ignore admin queues in calculations for simplicity.

Cases:
1. Multiple SRQ with 1024 entries:
   - Mem = 4 * 1 * 1024 * (1200 + 4096) = 20.7 MiB (constant; does not depend on the number of initiators)
2. RQ with 128 entries for 64 initiators:
   - Mem = 64 * 128 * (1200 + 4096) = 41.4 MiB

Results:

| FIO_JOBS | kIOPS (RQ) | kIOPS (SRQ) | Bandwidth, MiB/s (RQ) | Bandwidth, MiB/s (SRQ) | AvgLatency, us (RQ) | AvgLatency, us (SRQ) | MaxResidentSize, kiB (RQ) | MaxResidentSize, kiB (SRQ) |
|---|---|---|---|---|---|---|---|---|
| 1 | 8.623 | 8.623 | 33.7 | 33.7 | 13.89 | 14.03 | 144376 | 155624 |
| 2 | 17.3 | 17.3 | 67.4 | 67.4 | 14.03 | 14.1 | 145776 | 155700 |
| 4 | 34.5 | 34.5 | 135 | 135 | 14.15 | 14.23 | 146540 | 156184 |
| 8 | 69.1 | 69.1 | 270 | 270 | 14.64 | 14.49 | 148116 | 156960 |
| 16 | 138 | 138 | 540 | 540 | 14.84 | 15.38 | 151216 | 158668 |
| 32 | 276 | 276 | 1079 | 1079 | 16.5 | 16.61 | 157560 | 161936 |
| 64 | 513 | 502 | 2005 | 1960 | 1673.31 | 1612.38 | 170408 | 168440 |
| 128 | 535 | 526 | 2092 | 2054 | 3329.79 | 3344.03 | 195796 | 181524 |
| 256 | 571 | 571 | 2232 | 2233 | 6854.57 | 6873.37 | 246484 | 207856 |

We can see the benefit in memory consumption at high job counts: from 64 jobs up the SRQ configuration has the smaller maximum resident size (207856 kiB versus 246484 kiB at 256 jobs), while below 64 jobs it uses slightly more memory than RQ. IOPS, bandwidth and latency stay close in both configurations.

The drawback of using SRQ is a risk of RNR (Receiver Not Ready) errors when multiple clients initiate a large number of IOs simultaneously. In "RQ per QP" this is handled by Submission Queue flow control and RNR is not possible. This patch does not contain any changes to solve the RNR issue, but we see at least two options here:
- try to increase the RNR retry count, which is now hardcoded to 0, and make it a configurable parameter.
- implement some mechanism for dynamic SRQ extension. It may scale with the number of IO queues or grow when it reaches some threshold.

Change-Id: I40c70f6ccbad7754918bcc6cb397e955b09d1033
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
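The two memory formulas above can be checked with a few lines of C. This is only a sketch of the arithmetic from the commit message; the function and macro names are made up for illustration and are not SPDK APIs.

```c
#include <stdint.h>

/* Fixed per-work-request overhead quoted in the commit message:
 * reqs (1024) + recvs (96) + cmds (64) + cpls (16) = 1200 bytes. */
#define DEMO_PER_WR_FIXED_BYTES (1024u + 96u + 64u + 16u)

/* Multiple SRQ: one SRQ per core (poll group) per IB device. */
static uint64_t
demo_srq_mem_bytes(uint64_t core_num, uint64_t ib_devs_num, uint64_t srq_depth,
		   uint64_t in_capsule_data_size)
{
	return core_num * ib_devs_num * srq_depth *
	       (DEMO_PER_WR_FIXED_BYTES + in_capsule_data_size);
}

/* Multiple RQ: one receive queue per IO queue (admin queues ignored). */
static uint64_t
demo_rq_mem_bytes(uint64_t queue_num, uint64_t rq_depth,
		  uint64_t in_capsule_data_size)
{
	return queue_num * rq_depth *
	       (DEMO_PER_WR_FIXED_BYTES + in_capsule_data_size);
}
```

With the values from the two cases, `demo_srq_mem_bytes(4, 1, 1024, 4096)` gives 21692416 bytes (about 20.7 MiB) and `demo_rq_mem_bytes(64, 128, 4096)` gives 43384832 bytes (about 41.4 MiB).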

OCF queue list needs to be managed synchronously. This patch uses our own mutex to achieve that because we cannot rely on ocf_mngt_cache_lock() as it may produce deadlocks when using cleaner. An alternative would be to use trylock, but we would need to register a poller for it and do the locking concurrently, which doesn't seem to be possible in callbacks of io channel. We agreed with the OCF team that we are going to change this part once OCF delivers a safe ocf_mngt_cache_lock() function.

This patch fixes a very rare failure on our CI that looked like this:
```
04:33:02 vbdev_ocf.c: 134:stop_vbdev: *NOTICE*: Not stopping cache instance 'Malloc0' because it is referenced by other OCF bdev
04:33:03 MalCache1: Core core2 successfully removed
04:33:03 MalCache1: Stopping cache
04:33:03 MalCache1: Done saving cache state!
04:33:03 src/ocf/mngt/ocf_mngt_cache.c:1576:2: runtime error: pointer index expression with base 0x000000000000 overflowed to 0xffffffffffffffa8
04:33:03 #0 0x7f3c52d54c26 in _ocf_mngt_cache_stop src/ocf/mngt/ocf_mngt_cache.c:1576
04:33:03 #1 0x7f3c52d5579f in ocf_mngt_cache_stop src/ocf/mngt/ocf_mngt_cache.c:1657
04:33:03 #2 0x7f3c52cbe9f4 in stop_vbdev /var/jenkins/workspace/NVMe_tests/nvme_phy_autotest/spdk/lib/bdev/ocf/vbdev_ocf.c:147
04:33:03 #3 0x7f3c52cbf4a0 in vbdev_ocf_destruct /var/jenkins/workspace/NVMe_tests/nvme_phy_autotest/spdk/lib/bdev/ocf/vbdev_ocf.c:216
```

Change-Id: Id6fafb444958f3becdc480e44762074c6c081e1f
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450682
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
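The "own mutex around the queue list" idea might look roughly like the sketch below. The types and function names are hypothetical and do not come from vbdev_ocf.c; the point is only that every add and remove takes the same private lock, so the list is never modified concurrently.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a per-channel queue context. */
struct demo_queue {
	int id;
	TAILQ_ENTRY(demo_queue) link;
};

/* The list owns its mutex instead of relying on an external cache lock. */
struct demo_queue_list {
	pthread_mutex_t lock;
	TAILQ_HEAD(, demo_queue) head;
	size_t count;
};

static int
demo_list_init(struct demo_queue_list *l)
{
	TAILQ_INIT(&l->head);
	l->count = 0;
	return pthread_mutex_init(&l->lock, NULL);
}

static void
demo_list_add(struct demo_queue_list *l, struct demo_queue *q)
{
	pthread_mutex_lock(&l->lock);
	TAILQ_INSERT_TAIL(&l->head, q, link);
	l->count++;
	pthread_mutex_unlock(&l->lock);
}

static void
demo_list_remove(struct demo_queue_list *l, struct demo_queue *q)
{
	pthread_mutex_lock(&l->lock);
	TAILQ_REMOVE(&l->head, q, link);
	l->count--;
	pthread_mutex_unlock(&l->lock);
}
```

A blocking lock keeps the callbacks simple; the trylock alternative mentioned above would need a poller to retry, which is what the commit message says is impractical here.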

For the VMD driver we'll need some way of iterating over all spdk pci device objects, and we would like to achieve that with simple spdk_pci_get_first_dev()/get_next_dev() APIs. To make that thread safe, we would have to expose a public pci mutex to be locked around the iteration, and we don't want to do that. Instead we'll make the PCI APIs usable from only a single thread, which prevents any pci devices from being removed between subsequent get_first/get_next calls.

We currently have the following players accessing pci device state:
1) public APIs, obviously (on any thread right now)
2) VFIO hotremove callback (dpdk interrupt thread)
3) rte_eal_alarm for detaching rte_pci_devices (dpdk interrupt thread)
4) DPDK hotplug IPC (dpdk interrupt thread)

There is g_pci_mutex providing the thread safety, but even today it doesn't protect #3 and #4, making the entire pci layer prone to data corruption. To make #3 and #4 safe, we would have to lock inside device init/fini callbacks (spdk_pci_device_init/fini), but those are called directly inside the public device attach/detach functions which already lock.

So now, with the decision to drop thread safety from public pci APIs, we narrow down the locks inside public functions and introduce locks inside those lower-level init/fini callbacks.

Change-Id: I5dcbc9cdcbab65ee76cd3c42890f596069ec9a8a
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458930
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
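To illustrate the get_first/get_next style of iteration described above, here is a minimal sketch over a TAILQ. All `demo_` names are hypothetical and the code is not the SPDK implementation; it assumes, as the commit message does, that every caller runs on the same thread, so no lock is held across the iteration.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a pci device object. */
struct demo_pci_dev {
	int id;
	TAILQ_ENTRY(demo_pci_dev) tailq;
};

static TAILQ_HEAD(, demo_pci_dev) g_demo_devs = TAILQ_HEAD_INITIALIZER(g_demo_devs);

/* Both iterators assume single-threaded use: nothing may be removed
 * from the list between a get_first call and the following get_next calls. */
static struct demo_pci_dev *
demo_pci_get_first_dev(void)
{
	return TAILQ_FIRST(&g_demo_devs);
}

static struct demo_pci_dev *
demo_pci_get_next_dev(struct demo_pci_dev *prev)
{
	return TAILQ_NEXT(prev, tailq);
}

/* Example consumer: count the devices with the two-call iteration. */
static size_t
demo_pci_count_devs(void)
{
	size_t n = 0;
	struct demo_pci_dev *dev;

	for (dev = demo_pci_get_first_dev(); dev != NULL; dev = demo_pci_get_next_dev(dev)) {
		n++;
	}
	return n;
}
```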

Not all JSON methods require the 'params' field to be supplied. Verification of the JSON is done on the server side in parse_single_request(). We should not attempt to process garbage values from a correct JSON config file during app start.

A segfault can be observed if the following valid JSON config is supplied:
{ "method": "framework_wait_init" }

Resulting in:
json_config.c:388:13: runtime error: applying non-zero offset 18446744073709551600 to null pointer
AddressSanitizer:DEADLYSIGNAL
=================================================================
==3386067==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x0000007260ff bp 0x7ffe6ea06890 sp 0x7ffe6ea067e0 T0)
==3386067==The signal is caused by a READ memory access.
==3386067==Hint: this fault was caused by a dereference of a high value address (see register values below). Dissassemble the provided pc to learn which register was used.
    #0 0x7260ff in app_json_config_load_subsystem_config_entry /home/tzawadzk/spdk/lib/event/json_config.c:391
    #1 0x7cbb13 in msg_queue_run_batch /home/tzawadzk/spdk/lib/thread/thread.c:505
    #2 0x7cd00a in thread_poll /home/tzawadzk/spdk/lib/thread/thread.c:581
    #3 0x7cfe18 in spdk_thread_poll /home/tzawadzk/spdk/lib/thread/thread.c:689
    #4 0x71d6ef in _reactor_run /home/tzawadzk/spdk/lib/event/reactor.c:326
    #5 0x71eb00 in reactor_run /home/tzawadzk/spdk/lib/event/reactor.c:382
    #6 0x71f911 in spdk_reactors_start /home/tzawadzk/spdk/lib/event/reactor.c:477
    #7 0x718237 in spdk_app_start /home/tzawadzk/spdk/lib/event/app.c:691
    #8 0x407e94 in main /home/tzawadzk/spdk/app/spdk_tgt/spdk_tgt.c:120
    #9 0x7f0f2eef2041 in __libc_start_main ../csu/libc-start.c:308
    #10 0x4079ad in _start (/home/tzawadzk/spdk/build/bin/spdk_tgt+0x4079ad)

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I7ef1a764467817ad788fdf5dbe17eaeb99dcc22e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3256
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
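The failure mode above is pointer arithmetic on a 'params' value that was never decoded. A minimal sketch of the guard, with hypothetical types rather than the real json_config.c structures, could be:

```c
#include <stddef.h>

/* Hypothetical decoded JSON value: a span inside the config buffer. */
struct demo_json_val {
	const char *start;
	size_t len;
};

/* Hypothetical decoded request; params stays NULL when the field is absent. */
struct demo_rpc_request {
	const char *method;
	const struct demo_json_val *params;
};

/* Number of params bytes to forward to the server, or 0 when the
 * request carries no 'params' field at all. */
static size_t
demo_rpc_params_len(const struct demo_rpc_request *req)
{
	if (req->params == NULL) {
		/* Nothing was supplied: do not derive offsets from a null pointer. */
		return 0;
	}
	return req->params->len;
}
```

For a request such as `{ "method": "framework_wait_init" }` the helper returns 0 and the caller can send the method without any parameters.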

The controller data structure may be freed before the subsystem resume-done callback runs. Taking the endpoint as the callback's input parameter instead avoids this issue.

AddressSanitizer: heap-use-after-free on address 0x625000046100 at pc 0x00000082818f bp 0x7fff7b09bd10 sp 0x7fff7b09bd00
READ of size 8 at 0x625000046100 thread T0 (reactor_0)
    #0 0x82818e in vfio_user_dev_quiesce_resume_done /spdk/lib/nvmf/vfio_user.c:2147
    #1 0x782cc0 in subsystem_state_change_done /spdk/lib/nvmf/subsystem.c:634
    #2 0xad047b in _call_completion /spdk/lib/thread/thread.c:2344
    #3 0xabc48d in msg_queue_run_batch /spdk/lib/thread/thread.c:710
    #4 0xac0670 in thread_poll /spdk/lib/thread/thread.c:926
    #5 0xac0ead in spdk_thread_poll /spdk/lib/thread/thread.c:986
    #6 0x9a5b4f in _reactor_run /spdk/lib/event/reactor.c:920
    #7 0x9a6442 in reactor_run /spdk/lib/event/reactor.c:958
    #8 0x9a717c in spdk_reactors_start /spdk/lib/event/reactor.c:1060
    #9 0x99884a in spdk_app_start /spdk/lib/event/app.c:643
    #10 0x407e82 in main /spdk/app/nvmf_tgt/nvmf_main.c:75
    #11 0x7f822095ff42 in __libc_start_main (/lib64/libc.so.6+0x23f42)
    #12 0x407abd in _start (/spdk/build/bin/nvmf_tgt+0x407abd)

0x625000046100 is located 0 bytes inside of 8320-byte region [0x625000046100,0x625000048180)
freed by thread T0 (reactor_0) here:
    #0 0x7f82219ff91f in __interceptor_free (/lib64/libasan.so.5+0x10d91f)
    #1 0x837059 in _free_ctrlr /spdk/lib/nvmf/vfio_user.c:2976
    #2 0x837327 in free_ctrlr /spdk/lib/nvmf/vfio_user.c:2996
    #3 0x843541 in nvmf_vfio_user_close_qpair /spdk/lib/nvmf/vfio_user.c:3742
    #4 0x7d1d91 in nvmf_transport_qpair_fini /spdk/lib/nvmf/transport.c:604
    #5 0x7ad922 in _nvmf_qpair_destroy /spdk/lib/nvmf/nvmf.c:1055
    #6 0x761362 in nvmf_qpair_request_cleanup /spdk/lib/nvmf/ctrlr.c:4026
    #7 0x761906 in spdk_nvmf_request_free /spdk/lib/nvmf/ctrlr.c:4041
    #8 0x75a931 in nvmf_qpair_free_aer /spdk/lib/nvmf/ctrlr.c:3576
    #9 0x7ae626 in spdk_nvmf_qpair_disconnect /spdk/lib/nvmf/nvmf.c:1127
    #10 0x83db36 in _vfio_user_qpair_disconnect /spdk/lib/nvmf/vfio_user.c:3433
    #11 0xabc48d in msg_queue_run_batch /spdk/lib/thread/thread.c:710
    #12 0xac0670 in thread_poll /spdk/lib/thread/thread.c:926
    #13 0xac0ead in spdk_thread_poll /spdk/lib/thread/thread.c:986
    #14 0x9a5b4f in _reactor_run /spdk/lib/event/reactor.c:920
    #15 0x9a6442 in reactor_run /spdk/lib/event/reactor.c:958
    #16 0x9a717c in spdk_reactors_start /spdk/lib/event/reactor.c:1060
    #17 0x99884a in spdk_app_start /spdk/lib/event/app.c:643
    #18 0x407e82 in main /spdk/app/nvmf_tgt/nvmf_main.c:75
    #19 0x7f822095ff42 in __libc_start_main (/lib64/libc.so.6+0x23f42)

previously allocated by thread T0 (reactor_0) here:
    #0 0x7f82219fff16 in __interceptor_calloc (/lib64/libasan.so.5+0x10df16)
    #1 0x837413 in nvmf_vfio_user_create_ctrlr /spdk/lib/nvmf/vfio_user.c:3010
    #2 0x83bc68 in nvmf_vfio_user_accept /spdk/lib/nvmf/vfio_user.c:3313
    #3 0xabfbd8 in thread_execute_timed_poller /spdk/lib/thread/thread.c:872
    #4 0xac0c75 in thread_poll /spdk/lib/thread/thread.c:960
    #5 0xac0ead in spdk_thread_poll /spdk/lib/thread/thread.c:986
    #6 0x9a5b4f in _reactor_run /spdk/lib/event/reactor.c:920
    #7 0x9a6442 in reactor_run /spdk/lib/event/reactor.c:958
    #8 0x9a717c in spdk_reactors_start /spdk/lib/event/reactor.c:1060
    #9 0x99884a in spdk_app_start /spdk/lib/event/app.c:643
    #10 0x407e82 in main /spdk/app/nvmf_tgt/nvmf_main.c:75
    #11 0x7f822095ff42 in __libc_start_main (/lib64/libc.so.6+0x23f42)

SUMMARY: AddressSanitizer: heap-use-after-free /spdk/lib/nvmf/vfio_user.c:2147 in vfio_user_dev_quiesce_resume_done

Change-Id: Icf5e5b360b9107a3c5eb960ae59b7fe10ace1c66
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11420
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
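A rough sketch of the "pass the endpoint, not the controller" pattern follows. The structures are hypothetical, and the sketch assumes two things the commit message does not spell out: that the endpoint outlives the controller, and that the endpoint's controller pointer is cleared when the controller is freed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical controller; may be freed while a callback is still queued. */
struct demo_ctrlr {
	bool resumed;
};

/* Hypothetical endpoint; assumed to outlive its controller. */
struct demo_endpoint {
	struct demo_ctrlr *ctrlr; /* assumed to be set to NULL when the controller is freed */
};

/* Completion callback: the argument is the endpoint, so the controller
 * is looked up at callback time instead of being captured earlier. */
static int
demo_resume_done(void *cb_arg)
{
	struct demo_endpoint *endpoint = cb_arg;

	if (endpoint->ctrlr == NULL) {
		/* Controller already gone: nothing to resume, nothing to touch. */
		return -1;
	}
	endpoint->ctrlr->resumed = true;
	return 0;
}
```

The design choice is that a long-lived object carries the callback context, so a late completion finds either a valid controller or none at all.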

Ubsan with clang complains when using spdk_ioviter with more iters than declared in the array:

iov.c:69:9: runtime error: index 3 out of bounds for type 'struct spdk_single_ioviter[2]'
    #0 0x5df709 in spdk_ioviter_firstv /home/vagrant/spdk_repo/spdk/lib/util/iov.c:69:9
    #1 0x53780b in raid5f_xor_stripe /home/vagrant/spdk_repo/spdk/module/bdev/raid/raid5f.c:270:24
    #2 0x531bd8 in raid5f_submit_write_request /home/vagrant/spdk_repo/spdk/module/bdev/raid/raid5f.c:520:2
    #3 0x52a03a in raid5f_submit_rw_request /home/vagrant/spdk_repo/spdk/module/bdev/raid/raid5f.c:596:9
    #4 0x548c17 in test_raid5f_write_request /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:550:2
    #5 0x544e18 in test_raid5f_submit_rw_request /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:714:3
    #6 0x553e61 in __test_raid5f_submit_full_stripe_write_request /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:878:3
    #7 0x543f84 in run_for_each_raid5f_config /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:748:3
    #8 0x527ac1 in test_raid5f_submit_full_stripe_write_request /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:885:2
    #9 0x7f4a71a0960a (/usr/lib64/libcunit.so.1+0x460a) (BuildId: 9c82dd336cbccd99721651ac0a04435e746e0fc0)
    #10 0x7f4a71a09937 (/usr/lib64/libcunit.so.1+0x4937) (BuildId: 9c82dd336cbccd99721651ac0a04435e746e0fc0)
    #11 0x7f4a71a0a897 in CU_run_all_tests (/usr/lib64/libcunit.so.1+0x5897) (BuildId: 9c82dd336cbccd99721651ac0a04435e746e0fc0)
    #12 0x524fe8 in main /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:1006:2
    #13 0x7f4a711d750f in __libc_start_call_main (/usr/lib64/libc.so.6+0x2750f) (BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)
    #14 0x7f4a711d75c8 in __libc_start_main@GLIBC_2.2.5 (/usr/lib64/libc.so.6+0x275c8) (BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)
    #15 0x4235b4 in _start (/home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut+0x4235b4) (BuildId: 028d075edd1a7cd17881fd678ef076adfdbac13d)

Fix this by making iters a zero-length array and putting it in a union with a two-element array to keep the default size for compatibility.

Change-Id: I8573b015755e9986cdadbfa1705d269d51a7c2b7
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/18402
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
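The union trick described in the fix can be sketched as follows. The struct and field names are illustrative only and are not copied from the SPDK headers; the zero-length array is a GNU C extension accepted by gcc and clang, which is an assumption about the toolchain.

```c
#include <stddef.h>

/* Hypothetical per-iov iterator state. */
struct demo_single_iter {
	const void *base;
	size_t len;
};

struct demo_ioviter {
	unsigned count;
	union {
		/* Keeps sizeof(struct demo_ioviter) at the old two-iterator size. */
		struct demo_single_iter iters_default[2];
		/* Zero-length array (GNU extension): indexable past 2 without
		 * the declared bound that the bounds check complained about. */
		struct demo_single_iter iters[0];
	};
};

/* Bytes a caller would have to provide for an iterator over n iovs. */
static size_t
demo_ioviter_size(unsigned n)
{
	if (n <= 2) {
		return sizeof(struct demo_ioviter);
	}
	return offsetof(struct demo_ioviter, iters) + n * sizeof(struct demo_single_iter);
}
```

Callers that only ever use two iterators can keep declaring the struct on the stack as before, while callers that need more allocate `demo_ioviter_size(n)` bytes and index `iters[]` freely.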

String allocated in avahi_string_list_to_string() was not freed after it was no longer needed.

Fixes spdk#3122

Direct leak of 102 byte(s) in 2 object(s) allocated from:
    #0 0x7fc255cd92ef in malloc (/usr/lib64/libasan.so.8+0xd92ef) (BuildId: 6f17f87dc4c1aa9f9dde7c4856604c3a25ba4872)
    #1 0x7fc255bf5fb4 in avahi_malloc (/usr/lib64/libavahi-common.so.3+0x3fb4) (BuildId: e99b431894489a77a31921493f198df4850090e0)
    #2 0x7fc255bf6e74 in avahi_string_list_to_string (/usr/lib64/libavahi-common.so.3+0x4e74) (BuildId: e99b431894489a77a31921493f198df4850090e0)
    #3 0x4a3c93 in mdns_resolve_callback /home/tzawadzk/spdk/module/bdev/nvme/bdev_mdns_client.c:261
    ....

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib0eae278d59152c97572531bef59efef36600f13
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/20849
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>

first_fused_req is touched on the hot completion path, but fused commands are not considered hot path in general. To work around that, a first_fused flag is introduced in the first cache line of the nvmf request.

Change-Id: Ibd5e9278101cbad6883e937b2830dc8e84b3cb3c
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/21442
Reviewed-by: Jim Harris <jim.harris@samsung.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
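The layout idea can be sketched with a made-up request structure. The field names other than first_fused and first_fused_req are hypothetical, the 64-byte cache line is an assumption, and the offsets only translate to cache lines if the structure itself is allocated cache-line aligned.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE_SIZE 64 /* assumed cache line size */

struct demo_nvmf_request {
	/* Hot fields, read on every completion. */
	void *qpair;
	uint32_t length;
	uint8_t xfer;
	bool first_fused; /* true only when first_fused_req is set */

	/* Cold fields, far away from the first cache line. */
	uint8_t cold_data[256];
	struct demo_nvmf_request *first_fused_req;
};

_Static_assert(offsetof(struct demo_nvmf_request, first_fused) < DEMO_CACHE_LINE_SIZE,
	       "flag must sit in the first cache line");

/* Completion-path helper: the cold pointer is loaded only for fused commands. */
static struct demo_nvmf_request *
demo_get_first_fused(struct demo_nvmf_request *req)
{
	if (!req->first_fused) {
		return NULL; /* common case: cold line is never touched */
	}
	return req->first_fused_req;
}
```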

As per the typedef in nvme.h, the spdk_nvme_cpl argument should be a pointer to a const struct. This fixes a runtime error under clang >= 17.x, which now makes -fsanitize=function available for C; on our end it is enabled via -fsanitize=undefined under UBSAN.

Error in question:

Test: test_spdk_nvme_detach ...passed
Test: test_nvme_completion_poll_cb ...passed
Test: test_nvme_user_copy_cmd_complete .../root/spdk/lib/nvme/nvme.c:417:2: runtime error: call to function dummy_cb through pointer to incorrect function type 'void (*)(void *, const struct spdk_nvme_cpl *)'
/root/spdk/test/unit/lib/nvme/nvme.c/nvme_ut.c:584: note: dummy_cb defined here
    #0 0x5098e0 in nvme_user_copy_cmd_complete /root/spdk/lib/nvme/nvme.c:417:2
    #1 0x532161 in test_nvme_user_copy_cmd_complete /root/spdk/test/unit/lib/nvme/nvme.c/nvme_ut.c:604:2
    #2 0x7f08c952266a (/usr/lib64/libcunit.so.1+0x466a) (BuildId: d99e3b60795f2ce01ada820b4b7e3cd84d8150fe)
    #3 0x7f08c95229c7 (/usr/lib64/libcunit.so.1+0x49c7) (BuildId: d99e3b60795f2ce01ada820b4b7e3cd84d8150fe)
    #4 0x7f08c9523a9f in CU_run_all_tests (/usr/lib64/libcunit.so.1+0x5a9f) (BuildId: d99e3b60795f2ce01ada820b4b7e3cd84d8150fe)
    #5 0x55555e in run_tests /root/spdk/lib/ut/ut.c:169:3
    #6 0x552aec in spdk_ut_run_tests /root/spdk/lib/ut/ut.c:225:8
    #7 0x522d52 in main /root/spdk/test/unit/lib/nvme/nvme.c/nvme_ut.c:1664:17
    #8 0x7f08c8c28149 in __libc_start_call_main (/usr/lib64/libc.so.6+0x28149) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756)
    #9 0x7f08c8c2820a in __libc_start_main@GLIBC_2.2.5 (/usr/lib64/libc.so.6+0x2820a) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756)
    #10 0x42b6a4 in _start (/root/spdk/test/unit/lib/nvme/nvme.c/nvme_ut+0x42b6a4) (BuildId: 6fc2caaf777030becad2d0f660ec68443f3380b4)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /root/spdk/lib/nvme/nvme.c:417:2 in
./test/unit/unittest.sh: line 85: 75349 Aborted (core dumped) $valgrind $testdir/lib/nvme/nvme.c/nvme_ut

Change-Id: Iddbd5fc0dee0ef6a6cc1f032e079f6119e76aed9
Signed-off-by: Michal Berger <michal.berger@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/22025
Reviewed-by: Jim Harris <jim.harris@samsung.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
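The mismatch that -fsanitize=function reports is between the callback typedef and the definition of the function being called through it. A small sketch with stand-in types (none of these names are the real nvme.h declarations) shows a definition that matches the typedef exactly, including the const qualifier:

```c
#include <stddef.h>

/* Hypothetical stand-in for a completion entry. */
struct demo_cpl {
	unsigned status;
};

/* Callback type: the completion is passed as pointer to const. */
typedef void (*demo_cmd_cb)(void *ctx, const struct demo_cpl *cpl);

/* The definition repeats the typedef's parameter types exactly. Declaring
 * the second parameter as plain 'struct demo_cpl *' would give the function
 * a different type than the pointer it is later called through. */
static void
demo_dummy_cb(void *ctx, const struct demo_cpl *cpl)
{
	*(unsigned *)ctx = cpl->status;
}

/* Caller side: invokes whatever callback was registered. */
static void
demo_complete(demo_cmd_cb cb_fn, void *cb_arg, const struct demo_cpl *cpl)
{
	cb_fn(cb_arg, cpl);
}
```

Keeping the definition in sync with the typedef also lets the compiler check the assignment without a cast, which is usually what hides this kind of mismatch in the first place.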

wuzhouhui and others added 30 commits September 6, 2018 16:53

Seth5141 and others added 20 commits October 1, 2018 16:15

EugeneKochetov closed this Nov 23, 2018

EugeneKochetov deleted the nvmf_rdma_srq_upstream branch November 23, 2018 10:11
