Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RDMA SRQ implementation for NVMe-oF target #3

Closed
wants to merge 189 commits into from
Closed

RDMA SRQ implementation for NVMe-oF target #3

wants to merge 189 commits into from

Conversation

EugeneKochetov
Copy link

nvmf/rdma: Add shared receive queue support

This is a new feature for NVMEoF RDMA target, that is intended to save resource allocation
(by sharing them) and utilize the locality (completions and memory) to get the best
performance with Shared Receive Queues (SRQs). We'll create a SRQ per core (poll group),
per device and associate each created QP/CQ with an appropriate SRQ.

Our testing environment has 2 hosts.
Host 1:
CPU: Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz dual socket (8 cores total)
Network: ConnectX-5, ConnectX-5 VPI , 100GbE, single-port QSFP28, PCIe3.0 x16
Disk: Intel Optane SSD 900P Series
OS: Fedora 27 x86_64
Host 2:
CPU: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz dual-socket (24 cores total)
Network: ConnectX-4 VPI , 100GbE, dual-port QSFP28
Disk: Intel Optane SSD 900P Series
OS : CentOS 7.5.1804 x86_64
Hosts are connected via Spectrum switch.
Host 1 is running SPDK NVMeoF target. Host 2 is used as initiator running fio with SPDK plugin.

Configuration:

  • SPDK NVMeoF target: cpu mask 0x0F (4 cores), max queue depth 128, max SRQ depth 1024, max QPs per controller 1024
  • Single NVMf subsystem with single namespace backed by physical SSD disk
  • fio with SPDK plugin: randread pattern, 1-256 jobs, block size 4k, IO depth 16, cpu_mask 0xFFF0, IO rate 10k, rate process “poisson”

Here is a full fio command line:
fio --name=Job --stats=1 --group_reporting=1 --idle-prof=percpu --loops=1 --numjobs=1 --thread=1 --time_based=1 --runtime=30s --ramp_time=5s --bs=4k --size=4G --iodepth=16 --readwrite=randread --rwmixread=75 --randrepeat=1 --ioengine=spdk --direct=1 --gtod_reduce=0 --cpumask=0xFFF0 --rate_iops=10k --rate_process=poisson --filename=trtype=RDMA adrfam=IPv4 traddr=1.1.79.1 trsvcid=4420 ns=1

SPDK allocates the following entities for every work request in receive queue (shared or not): reqs (1024 bytes), recvs (96 bytes), cmds (64 bytes), cpls (16 bytes), in_capsule_buffer. All except the last one are fixed size. In capsule data size is configured to 4096.
Memory consumption calculation (target):

  • Multiple SRQ: core_num * ib_devs_num * SRQ_depth * (1200 + in_capsule_data_size)
  • Multiple RQ: queue_num * RQ_depth * (1200 + in_capsule_data_size)
    We ignore admin queues in calculations for simplicity.

Cases:

  1. Multiple SRQ with 1024 entries:
    • Mem = 4 * 1 * 1024 * (1200 + 4096) = 20.7 MiB (Constant number – does not depend on initiators number)
  2. RQ with 128 entries for 64 initiators:
    • Mem = 64 * 128 * (1200 + 4096) = 41.4 MiB

Results:
FIO_JOBS kIOPS Bandwidth, MiB/s AvgLatency, us MaxResidentSize, kiB
RQ SRQ RQ SRQ RQ SRQ RQ SRQ
1 8.623 8.623 33.7 33.7 13.89 14.03 144376 155624
2 17.3 17.3 67.4 67.4 14.03 14.1 145776 155700
4 34.5 34.5 135 135 14.15 14.23 146540 156184
8 69.1 69.1 270 270 14.64 14.49 148116 156960
16 138 138 540 540 14.84 15.38 151216 158668
32 276 276 1079 1079 16.5 16.61 157560 161936
64 513 502 2005 1960 1673.31 1612.38 170408 168440
128 535 526 2092 2054 3329.79 3344.03 195796 181524
256 571 571 2232 2233 6854.57 6873.37 246484 207856

We can see the benefit in memory consumption.

The drawback of using SRQ is a risk of RNR errors when multiple clients initiate large number of IOs simultaneously.
In "RQ per QP" this is handled by Submission Queue flow control and RNR is not possible.
This patch does not contain any changes to solve RNR issue but we see at least two options here:

  • try to increase the RNR retry count to more than 0 which is now hardcoded and make it a configurable parameter.
  • implement some mechanism for dynamic SRQ extension. It may scale with number of IO queues or when it reaches some threshold.

wuzhouhui and others added 30 commits September 6, 2018 16:53
Change-Id: I28f21649feae8022cd0f0afd5b01ae3ee2b3592e
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/424618
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When emulating write_zeroes commands on device that
don't natively support it, we submit a write with
a zeroed buffer.  We used to just reuse the original
bdev_io, but that was recently changed due to other
splitting code added for iovs.  But when making those
changes, we forgot to free the bdev_io for the
write that was sent down to the device.

Fixes: 183f37e (bdev: do not reuse bdev_io when...)

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If08782c65f6305c0a9f9d15d74fd8823e1158e9b

Reviewed-on: https://review.gerrithub.io/424733
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
The previous patch fixed the underlying issue.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic5ccaf82aba4d2ed8644b34bd4d0294784fd1de2

Reviewed-on: https://review.gerrithub.io/424734
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It used to be enough for ioatdma to be unloaded - but now
we require the channels to be bound to uio_pci_generic or
vfio-pci.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I790fd909a4aa92adead5d711cce093e18e9f9595

Reviewed-on: https://review.gerrithub.io/424725
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
spdk_bdev_close() must be called on the same thread as
spdk_bdev_open(). Further, the remove callback on the
descriptor will also be run on the same thread as
spdk_bdev_open().

Change-Id: I949d6dd67de1e63d39f06944d473e4aa7134111b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/424738
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When new bdev was created, the struct spdk_bdev_module::examine_disk()
may open and close bdev. On the other hand, if something goes wrong,
the creation procedure may unregister new created bdev, so race
condition appeared between _remove_notify() and spdk_bdev_close().

Add the new field "closed" and "remove_notified" in struct spdk_bdev_desc,
so _remove_notify() and spdk_bdev_close() knows how to deal with this
situation.

Change-Id: Ibfe915a4d76096796b039a13a4f49f26669eba2c
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/423369
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_bdev_io_put_buf() is responsible for reclaiming
bdev-allocated buffers from a bdev_io.  If there are
bdev_ios waiting for one of these buffers, it calls
spdk_bdev_io_set_buf() on the next bdev_io in the queue.
This will set the iov_base and iov_len on the bdev_io
to point to the bdev-allocated buffer.

But spdk_bdev_io_put_buf() was calling spdk_bdev_io_set_buf()
on the just completed bdev_io, not the next bdev_io in the
queue.  So fix that.

Fixes: 844aedf ("bdev: Simplify get/set/put buf functions")
Reported-by: Alan Tu
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibbcad6e35a3db6991bd7deb3516229572f021638

Reviewed-on: https://review.gerrithub.io/424880
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I69101b142b99403ecdd1b651fba85b8aa10df20e

Reviewed-on: https://review.gerrithub.io/424721
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Lance Hartmann <lance.hartmann@oracle.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When --with-shared is specified, we need to make sure the
shared libraries have been built before we start linking
applications.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I98510b29cca67e09f0c33ac9c8e823e61ac5dc8d

Reviewed-on: https://review.gerrithub.io/424722
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Lance Hartmann <lance.hartmann@oracle.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3fa790389f56715629c0301f101a5244dc7ae516

Reviewed-on: https://review.gerrithub.io/424723
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Lance Hartmann <lance.hartmann@oracle.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idd365df7fb61eafb502f415adf70638bb91ded0e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/424773
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Change-Id: I2fda87d6176e18f2face59c9d916db2b4631c05b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/424774
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Latest scan-build finds a legitimate issue here - although
it's not clear why it was only found when testing against
a seemingly unrelated patch...

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic981f6f403c638b0b2d454c0be9f7fbe145171d9

Reviewed-on: https://review.gerrithub.io/424887
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
We cannot split an iov if a buffer hasn't been
allocated yet.  So always call spdk_bdev_io_get_buf
on reads before trying to split.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2c26efb9dc6cb2c7c3e3b7ae5bab2c37844b9113

Reviewed-on: https://review.gerrithub.io/424879
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This is an artifact from before SPDK had a configure
script or a DPDK submodule.  Make configure the
only supported way for specifying the location of the
DPDK installation to use with SPDK.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5c197c46220928bb18b97c8807755967d76ea42c

Reviewed-on: https://review.gerrithub.io/424893
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Change-Id: I6ae1f380aebbcf090a0ff31ff96fc4592fc29591
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/421173
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There is a race condition with the following sequence:

spdk_bdev_open()
spdk_bdev_unregister() <-- starts deferred message
spdk_bdev_close()
deferred message runs, crashes

Change-Id: I81fbced0849949cfb2dae5a7cc6f60c9685a8885
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/424739
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I51b312a086f18a5b5f63de27dd69e43a8cc7225d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/424914
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently in the function nvme_ctrlr_start() the initialization
process is executed as a whole, in the case there are many controllers
in one system, which means other controllers must call the function
one by one.  While here, we add several states here, which can
help refactoring the initialization process.

Change-Id: I209cf964bbf6e151823a7ecdc6a3f6e6e69df297
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/424157
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ide0c81b1cc29d67cec0c10ab877360db3699141e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/424775
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I1e5be0e282b9e29f7bf7ca7d2720b9fd00539be0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/424776
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Make clear that building with an alternate DPDK than the SPDK's
dpdk submodule is not limited to pointing only to a full DPDK
directory of sources, but also supports the ability to build
against pre-built libraries and includes provided by an
installation of the DPDK packages available from several
distro's.

Change-Id: I40cd8132e45dbd366a4c93c891a95e8952b6620d
Signed-off-by: Lance Hartmann <lance.hartmann@oracle.com>
Reviewed-on: https://review.gerrithub.io/425003
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
block_size should be extended_sector_size in case of
metadata or PI is enabled.

Change-Id: I2cba61975b0541ef64839a8cd117eb42f19742b3
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/425061
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Now that it is required to be on the same thread, the
message isn't necessary.

Change-Id: I714b77b46467dbcfa51186c8404c5976eaeea08a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/424593
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This initiates an error recovery instead of a disconnect. The
error recovery may result in a disconnect if the qpair is not
recoverable. This also resolves an issue where the disconnect
may immediately release the resources associated with the rqpair,
but upcoming wc entries may still reference it.

Change-Id: I9d9e212a83129412e049c91c02725699ce2cac11
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/425010
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The underlying DPDK function we use reads an array
at the provided index without checking for any out
of bounds access. The array is RTE_MAX_LCORE elements
long, so always manually check against that to keep
our APIs safe.

This fixes potential crashes with lcore == SPDK_ENV_LCORE_ID_ANY

Change-Id: I3081b888275fbecba8ab95feb20d2074341e2fc7
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/425042
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I859a390e830d43e921244a8a482a63e5c8afe56c
Signed-off-by: Takeshi Yoshimura <tyos@jp.ibm.com>
Reviewed-on: https://review.gerrithub.io/424016
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Changed "/bin/sh" to "/usr/bin/env bash" in scripts/pkgdep.sh
Previously script failed on my Ubuntu server, with error:
	trap: ERR: bad trap

Change-Id: I054c0388c462e2d0340b6e3d4e581a8e6cdc7097
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/424998
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This was not being tested on every patch previously.

Change-Id: I0a1756f3709da6608f66a2160962ed9b0fb38c7b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/424890
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This prevents us from using the bdev embedded in the lun after it has
been freed.

Change-Id: I780cf3eccca05a58d3461366fec024be42b8ff74
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/425174
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Seth5141 and others added 20 commits October 1, 2018 16:15
This allows a userto run the script without specifying a configuration
file. Removing this dependency opens us up to things like calling
vm_setup directly from a vagrantfile without having to first copy the
spdk source over.

Change-Id: I72074a445f8befc714c03cab57a2da539350c092
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/426944
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
For some use cases (e.g. creating a test pool vm), we won't want to
rsync spdk from the host, and instead will want to clone a new one from
scratch.

Change-Id: I4c27f8ffc6c04aa0901dfe5b536b7e6ba94f7693
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/426945
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Allow the option to run vm_setup.sh during provisioning on a fedora
system. This is one step closer to fully automating the setup of a build
pool test machine.

Change-Id: Ia3965b31e0a9217d176ffe3c165b8eb6b3ccad13
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/426946
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Keeping consistent with the distributions we support in the build pool.

Change-Id: Iec13686507f890e9e18ecd5a1f4f238b2370f1d7
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/427161
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Update the branch to point at our code rebased on the most recent
mainline release of nvme-cli. This branch includes fixes for
compatibility with gcc 8.

Change-Id: Ie1bdb046d1e24e832bda585dd9841619451ec5f5
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/426985
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
There were some innacuracies in vagrant documentation,
 such as typos, incorrect versions and incorrect names.
This commit fixes them.

Change-Id: Ibe01f24c43bc105a19b27b81cea771c6711af7c5
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/427023
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
As a side effect, if SPDK_LOG_DISABLED is used in spdk_log() as log
level then no message will be printed.

Change-Id: I2d57b60a5a310a9ef2a1187a81088d0acf828742
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/425105
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch is to introduce the specific QoS related structure
and the enumeration for types of QoS rate limits. Later new
types of QoS rate limits can be supported easily.

Change-Id: Idb8d2e7627fd145bf2b0ddb296c968b6b068f48c
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/424459
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Print shorter output about unsupported option instead of
printing full "usage" text.

Change-Id: I92a98b0bdf0b2ed9ac56da644f24777f76e7df29
Signed-off-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-on: https://review.gerrithub.io/426471
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add create/delete/get methods from nvmf subsystem to spdkcli.

Change-Id: Id1f4e539d14746c4d6108bb58df921c301d47e96
Signed-off-by: Pawel Kaminski <pawelx.kaminski@intel.com>
Reviewed-on: https://review.gerrithub.io/423425
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paweł Niedźwiecki <pawelx.niedzwiecki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idf6dd895033393858e49a494f8aac67b8034c67d
Signed-off-by: Pawel Kaminski <pawelx.kaminski@intel.com>
Reviewed-on: https://review.gerrithub.io/423650
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paweł Niedźwiecki <pawelx.niedzwiecki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I555abe8500547be81ca5a6566539dae7c8d7e690
Signed-off-by: Pawel Niedzwiecki <pawelx.niedzwiecki@intel.com>
Reviewed-on: https://review.gerrithub.io/427244
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change system manager timeout for stopping services.
Lagging services cause timeouts during shutting down VM.

Change-Id: I7428724cf8795ffe3c215bacdb56cb3f58fbdff0
Signed-off-by: Pawel Niedzwiecki <pawelx.niedzwiecki@intel.com>
Reviewed-on: https://review.gerrithub.io/427715
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In newer versions of the kernel, there is a race condition that causes
the kernel block layer to mess up if an nvme-oF subsystem is
disconnected before it is fully removed. This can cause the shutdown
test to hang during a sync call aftr nvmf tgt shutdown. see:
https://ci.spdk.io/spdk/builds/review/d1ecb02290a63c7e17cbe9becd7d48e3d0f245d6.1537982067/fedora-06/build.log
for details.

This problem has been observed and fixed in the RPC tests also.
https://review.gerrithub.io/c/spdk/spdk/+/426772

Change-Id: I9ec8517e067448be323e69979adf5d3915195c80
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/427152
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch fixes the following scan build failure.

nbd.c:697:6: warning: Use of memory after it is freed
        if (io->state == NBD_IO_XMIT_RESP) {
            ^~~~~~~~~

Change-Id: Icba8b509604b064bff36d6ef63ecf02617ad7666
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/427366
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Consolidating multiptle reinsertions into a place will make the logic
a little cleaner.

Change-Id: Iab7e9f8e7dcdebbec9d51e151b1d838567c1dcc4
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/427441
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
according to commit:
     bdev: add spdk_bdev_queue_io_wait
This patch will make io_wait to support nbd

Change-Id: I9c8a6f5d20afbada45cedffe9d49846a992d2581
Signed-off-by: Ni Xun <nixun@baidu.com>
Signed-off-by: Li Lin <lilin24@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/425594
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We need to include it in devel package.


Change-Id: I823200632e8bcb9fdb86c8cb5fbf3a651a710b78
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/426839
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Each file that need to check SPDK_CONFIG_* options need to include
spdk/config.h explicitly.

Change-Id: If9f2a91ac4c2b1a300dcf88ec3e2a12714ad344a
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/427221
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This is a new feature for NVMEoF RDMA target, that is intended to save resource allocation
(by sharing them) and utilize the locality (completions and memory) to get the best
performance with Shared Receive Queues (SRQs). We'll create a SRQ per core (poll group),
per device and associate each created QP/CQ with an appropriate SRQ.

Our testing environment has 2 hosts.
Host 1:
  CPU: Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz dual socket (8 cores total)
  Network: ConnectX-5, ConnectX-5 VPI , 100GbE, single-port QSFP28, PCIe3.0 x16
  Disk: Intel Optane SSD 900P Series
  OS: Fedora 27 x86_64
Host 2:
  CPU: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz dual-socket (24 cores total)
  Network: ConnectX-4 VPI , 100GbE, dual-port QSFP28
  Disk: Intel Optane SSD 900P Series
  OS : CentOS 7.5.1804 x86_64
Hosts are connected via Spectrum switch.
Host 1 is running SPDK NVMeoF target. Host 2 is used as initiator running fio with SPDK plugin.

Configuration:
- SPDK NVMeoF target: cpu mask 0x0F (4 cores), max queue depth 128, max SRQ depth 1024, max QPs per controller 1024
- Single NVMf subsystem with single namespace backed by physical SSD disk
- fio with SPDK plugin: randread pattern, 1-256 jobs, block size 4k, IO depth 16, cpu_mask 0xFFF0, IO rate 10k, rate process “poisson”

Here is a full fio command line:
fio  --name=Job --stats=1 --group_reporting=1 --idle-prof=percpu --loops=1 --numjobs=1 --thread=1 --time_based=1 --runtime=30s --ramp_time=5s --bs=4k --size=4G --iodepth=16 --readwrite=randread --rwmixread=75 --randrepeat=1 --ioengine=spdk --direct=1 --gtod_reduce=0 --cpumask=0xFFF0 --rate_iops=10k --rate_process=poisson --filename=trtype=RDMA adrfam=IPv4 traddr=1.1.79.1 trsvcid=4420 ns=1

SPDK allocates the following entities for every work request in receive queue (shared or not): reqs (1024 bytes), recvs (96 bytes), cmds (64 bytes), cpls (16 bytes), in_capsule_buffer. All except the last one are fixed size. In capsule data size is configured to 4096.
Memory consumption calculation (target):
- Multiple SRQ: core_num * ib_devs_num * SRQ_depth * (1200 + in_capsule_data_size)
- Multiple RQ: queue_num * RQ_depth * (1200 + in_capsule_data_size)
We ignore admin queues in calculations for simplicity.

Cases:
1. Multiple SRQ with 1024 entries:
    - Mem = 4 * 1 * 1024 * (1200 + 4096) = 20.7 MiB (Constant number – does not depend on initiators number)
2. RQ with 128 entries for 64 initiators:
    - Mem = 64 * 128 * (1200 + 4096) = 41.4 MiB

Results:
FIO_JOBS       kIOPS       Bandwidth, MiB/s     AvgLatency, us     MaxResidentSize, kiB
           RQ       SRQ      RQ       SRQ       RQ         SRQ        RQ        SRQ
1          8.623    8.623    33.7     33.7      13.89      14.03      144376    155624
2          17.3     17.3     67.4     67.4      14.03      14.1       145776    155700
4          34.5     34.5     135      135       14.15      14.23      146540    156184
8          69.1     69.1     270      270       14.64      14.49      148116    156960
16         138      138      540      540       14.84      15.38      151216    158668
32         276      276      1079     1079      16.5       16.61      157560    161936
64         513      502      2005     1960      1673.31    1612.38    170408    168440
128        535      526      2092     2054      3329.79    3344.03    195796    181524
256        571      571      2232     2233      6854.57    6873.37    246484    207856

We can see the benefit in memory consumption.

The drawback of using SRQ is a risk of RNR errors when multiple clients initiate large number of IOs simultaneously.
In "RQ per QP" this is handled by Submission Queue flow control and RNR is not possible.
This patch does not contain any changes to solve RNR issue but we see at least two options here:
- try to increase the RNR retry count to more than 0 which is now hardcoded and make it a configurable parameter.
- implement some mechanism for dynamic SRQ extension. It may scale with number of IO queues or when it reaches some threshold.

Change-Id: I40c70f6ccbad7754918bcc6cb397e955b09d1033
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
@EugeneKochetov EugeneKochetov deleted the nvmf_rdma_srq_upstream branch November 23, 2018 10:11
aviadye pushed a commit that referenced this pull request Jun 17, 2019
OCF queue list needs to be managed synchronously.
This patch uses our own mutex to achieve that because we cannot rely on
  ocf_mngt_cache_lock() as it may produce deadlocks when using
  cleaner.

Alternative way would be to use trylock, but
  we need to register a poller for it and do locking concurently
  which doesn't seem to be possible in callbacks of io channel.

We agreed with OCF team that we are going to change this part
  when OCF will deliver safe ocf_mngt_cache_lock() function.

This patch fixes a very rare failure on our CI that looked like this:
```
04:33:02 vbdev_ocf.c: 134:stop_vbdev: *NOTICE*: Not stopping cache instance 'Malloc0' because it is referenced by other OCF bdev
04:33:03 MalCache1: Core core2 successfully removed
04:33:03 MalCache1: Stopping cache
04:33:03 MalCache1: Done saving cache state!
04:33:03 src/ocf/mngt/ocf_mngt_cache.c:1576:2: runtime error: pointer index expression with base 0x000000000000 overflowed to 0xffffffffffffffa8
04:33:03     #0 0x7f3c52d54c26 in _ocf_mngt_cache_stop src/ocf/mngt/ocf_mngt_cache.c:1576
04:33:03     #1 0x7f3c52d5579f in ocf_mngt_cache_stop src/ocf/mngt/ocf_mngt_cache.c:1657
04:33:03     #2 0x7f3c52cbe9f4 in stop_vbdev /var/jenkins/workspace/NVMe_tests/nvme_phy_autotest/spdk/lib/bdev/ocf/vbdev_ocf.c:147
04:33:03     #3 0x7f3c52cbf4a0 in vbdev_ocf_destruct /var/jenkins/workspace/NVMe_tests/nvme_phy_autotest/spdk/lib/bdev/ocf/vbdev_ocf.c:216
```

Change-Id: Id6fafb444958f3becdc480e44762074c6c081e1f
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450682
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
AlekseyMarchuk pushed a commit that referenced this pull request Jul 2, 2019
For VMD driver we'll need to introduce some way of
iterating over all spdk pci device objects and we would
like to achieve that with simple spdk_pci_get_first_dev()/get_next_dev()
APIs. To make it thread safe though, we would have to
expose some public pci mutex to be locked around the
iteration and we don't want to do that, so we'll make
PCI APIs usable from only a single thread - this will
prevent any pci devices from being removed inbetween
subsequent get_first/get_next calls.

We currently have the following players accessing pci
device state:
 1) public APIs, obviously (on any thread right now)
 2) VFIO hotremove callback (dpdk interrupt thread)
 3) rte_eal_alarm for detaching rte_pci_devices (dpdk
    interrupt thread)
 4) DPDK hotplug IPC (dpdk interrupt thread)

There is g_pci_mutex providing the thread safety, but
even today it doesn't protect #3 and #4, making the
entire pci layer prone to data corruption.

To make #3 and #4 safe, we would have to lock inside
device init/fini callbacks (spdk_pci_device_init/fini),
but those are called directly inside the public device
attach/detach functions which already lock.

So now, with the decision to drop thread safety from
public pci APIs, we narrow down the locks inside public
functions and introduce locks inside those lower-level
init/fini callbacks.

Change-Id: I5dcbc9cdcbab65ee76cd3c42890f596069ec9a8a
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458930
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
allen-mlnx pushed a commit that referenced this pull request Aug 18, 2020
Not all JSON methods require 'params' field to be supplied.
Verification of the JSON is done on server side in
parse_single_request().

We should not attempt to process garbage values on correct
JSON config file during app start.

Segfault can be observed if following valid JSON config is supplied:
{
	"method": "framework_wait_init"
}
Resulting in:
json_config.c:388:13: runtime error: applying non-zero offset 18446744073709551600 to null pointer
AddressSanitizer:DEADLYSIGNAL
=================================================================
==3386067==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x0000007260ff bp 0x7ffe6ea06890 sp 0x7ffe6ea067e0 T0)
==3386067==The signal is caused by a READ memory access.
==3386067==Hint: this fault was caused by a dereference of a high value address (see register values below).  Dissassemble the provided pc to learn which register was used.
    #0 0x7260ff in app_json_config_load_subsystem_config_entry /home/tzawadzk/spdk/lib/event/json_config.c:391
    #1 0x7cbb13 in msg_queue_run_batch /home/tzawadzk/spdk/lib/thread/thread.c:505
    #2 0x7cd00a in thread_poll /home/tzawadzk/spdk/lib/thread/thread.c:581
    #3 0x7cfe18 in spdk_thread_poll /home/tzawadzk/spdk/lib/thread/thread.c:689
    #4 0x71d6ef in _reactor_run /home/tzawadzk/spdk/lib/event/reactor.c:326
    #5 0x71eb00 in reactor_run /home/tzawadzk/spdk/lib/event/reactor.c:382
    #6 0x71f911 in spdk_reactors_start /home/tzawadzk/spdk/lib/event/reactor.c:477
    #7 0x718237 in spdk_app_start /home/tzawadzk/spdk/lib/event/app.c:691
    #8 0x407e94 in main /home/tzawadzk/spdk/app/spdk_tgt/spdk_tgt.c:120
    #9 0x7f0f2eef2041 in __libc_start_main ../csu/libc-start.c:308
    #10 0x4079ad in _start (/home/tzawadzk/spdk/build/bin/spdk_tgt+0x4079ad)

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I7ef1a764467817ad788fdf5dbe17eaeb99dcc22e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3256
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
EugeneKochetov pushed a commit that referenced this pull request May 29, 2022
The controller data structure may be freed before subsystem resume done
callback, we can take endpoint as the input parameter to avoid this issue.

AddressSanitizer: heap-use-after-free on address 0x625000046100 at pc 0x00000082818f bp 0x7fff7b09bd10 sp 0x7fff7b09bd00
READ of size 8 at 0x625000046100 thread T0 (reactor_0)
    #0 0x82818e in vfio_user_dev_quiesce_resume_done /spdk/lib/nvmf/vfio_user.c:2147
    #1 0x782cc0 in subsystem_state_change_done /spdk/lib/nvmf/subsystem.c:634
    #2 0xad047b in _call_completion /spdk/lib/thread/thread.c:2344
    #3 0xabc48d in msg_queue_run_batch /spdk/lib/thread/thread.c:710
    #4 0xac0670 in thread_poll /spdk/lib/thread/thread.c:926
    #5 0xac0ead in spdk_thread_poll /spdk/lib/thread/thread.c:986
    #6 0x9a5b4f in _reactor_run /spdk/lib/event/reactor.c:920
    #7 0x9a6442 in reactor_run /spdk/lib/event/reactor.c:958
    #8 0x9a717c in spdk_reactors_start /spdk/lib/event/reactor.c:1060
    #9 0x99884a in spdk_app_start /spdk/lib/event/app.c:643
    #10 0x407e82 in main /spdk/app/nvmf_tgt/nvmf_main.c:75
    #11 0x7f822095ff42 in __libc_start_main (/lib64/libc.so.6+0x23f42)
    #12 0x407abd in _start (/spdk/build/bin/nvmf_tgt+0x407abd)

0x625000046100 is located 0 bytes inside of 8320-byte region [0x625000046100,0x625000048180)
freed by thread T0 (reactor_0) here:
    #0 0x7f82219ff91f in __interceptor_free (/lib64/libasan.so.5+0x10d91f)
    #1 0x837059 in _free_ctrlr /spdk/lib/nvmf/vfio_user.c:2976
    #2 0x837327 in free_ctrlr /spdk/lib/nvmf/vfio_user.c:2996
    #3 0x843541 in nvmf_vfio_user_close_qpair /spdk/lib/nvmf/vfio_user.c:3742
    #4 0x7d1d91 in nvmf_transport_qpair_fini /spdk/lib/nvmf/transport.c:604
    #5 0x7ad922 in _nvmf_qpair_destroy /spdk/lib/nvmf/nvmf.c:1055
    #6 0x761362 in nvmf_qpair_request_cleanup /spdk/lib/nvmf/ctrlr.c:4026
    #7 0x761906 in spdk_nvmf_request_free /spdk/lib/nvmf/ctrlr.c:4041
    #8 0x75a931 in nvmf_qpair_free_aer /spdk/lib/nvmf/ctrlr.c:3576
    #9 0x7ae626 in spdk_nvmf_qpair_disconnect /spdk/lib/nvmf/nvmf.c:1127
    #10 0x83db36 in _vfio_user_qpair_disconnect /spdk/lib/nvmf/vfio_user.c:3433
    #11 0xabc48d in msg_queue_run_batch /spdk/lib/thread/thread.c:710
    #12 0xac0670 in thread_poll /spdk/lib/thread/thread.c:926
    #13 0xac0ead in spdk_thread_poll /spdk/lib/thread/thread.c:986
    #14 0x9a5b4f in _reactor_run /spdk/lib/event/reactor.c:920
    #15 0x9a6442 in reactor_run /spdk/lib/event/reactor.c:958
    #16 0x9a717c in spdk_reactors_start /spdk/lib/event/reactor.c:1060
    #17 0x99884a in spdk_app_start /spdk/lib/event/app.c:643
    #18 0x407e82 in main /spdk/app/nvmf_tgt/nvmf_main.c:75
    #19 0x7f822095ff42 in __libc_start_main (/lib64/libc.so.6+0x23f42)

previously allocated by thread T0 (reactor_0) here:
    #0 0x7f82219fff16 in __interceptor_calloc (/lib64/libasan.so.5+0x10df16)
    #1 0x837413 in nvmf_vfio_user_create_ctrlr /spdk/lib/nvmf/vfio_user.c:3010
    #2 0x83bc68 in nvmf_vfio_user_accept /spdk/lib/nvmf/vfio_user.c:3313
    #3 0xabfbd8 in thread_execute_timed_poller /spdk/lib/thread/thread.c:872
    #4 0xac0c75 in thread_poll /spdk/lib/thread/thread.c:960
    #5 0xac0ead in spdk_thread_poll /spdk/lib/thread/thread.c:986
    #6 0x9a5b4f in _reactor_run /spdk/lib/event/reactor.c:920
    #7 0x9a6442 in reactor_run /spdk/lib/event/reactor.c:958
    #8 0x9a717c in spdk_reactors_start /spdk/lib/event/reactor.c:1060
    #9 0x99884a in spdk_app_start /spdk/lib/event/app.c:643
    #10 0x407e82 in main /spdk/app/nvmf_tgt/nvmf_main.c:75
    #11 0x7f822095ff42 in __libc_start_main (/lib64/libc.so.6+0x23f42)

SUMMARY: AddressSanitizer: heap-use-after-free /spdk/lib/nvmf/vfio_user.c:2147 in vfio_user_dev_quiesce_resume_done

Change-Id: Icf5e5b360b9107a3c5eb960ae59b7fe10ace1c66
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11420
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
yshestakov pushed a commit that referenced this pull request Sep 19, 2023
Ubsan with clang complains when using spdk_ioviter with more iters than
declared in the array:

  iov.c:69:9: runtime error: index 3 out of bounds for type 'struct spdk_single_ioviter[2]'
  #0 0x5df709 in spdk_ioviter_firstv /home/vagrant/spdk_repo/spdk/lib/util/iov.c:69:9
  #1 0x53780b in raid5f_xor_stripe /home/vagrant/spdk_repo/spdk/module/bdev/raid/raid5f.c:270:24
  #2 0x531bd8 in raid5f_submit_write_request /home/vagrant/spdk_repo/spdk/module/bdev/raid/raid5f.c:520:2
  #3 0x52a03a in raid5f_submit_rw_request /home/vagrant/spdk_repo/spdk/module/bdev/raid/raid5f.c:596:9
  #4 0x548c17 in test_raid5f_write_request /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:550:2
  #5 0x544e18 in test_raid5f_submit_rw_request /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:714:3
  #6 0x553e61 in __test_raid5f_submit_full_stripe_write_request /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:878:3
  #7 0x543f84 in run_for_each_raid5f_config /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:748:3
  #8 0x527ac1 in test_raid5f_submit_full_stripe_write_request /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:885:2
  #9 0x7f4a71a0960a  (/usr/lib64/libcunit.so.1+0x460a) (BuildId: 9c82dd336cbccd99721651ac0a04435e746e0fc0)
  #10 0x7f4a71a09937  (/usr/lib64/libcunit.so.1+0x4937) (BuildId: 9c82dd336cbccd99721651ac0a04435e746e0fc0)
  #11 0x7f4a71a0a897 in CU_run_all_tests (/usr/lib64/libcunit.so.1+0x5897) (BuildId: 9c82dd336cbccd99721651ac0a04435e746e0fc0)
  #12 0x524fe8 in main /home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut.c:1006:2
  #13 0x7f4a711d750f in __libc_start_call_main (/usr/lib64/libc.so.6+0x2750f) (BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)
  #14 0x7f4a711d75c8 in __libc_start_main@GLIBC_2.2.5 (/usr/lib64/libc.so.6+0x275c8) (BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)
  #15 0x4235b4 in _start (/home/vagrant/spdk_repo/spdk/test/unit/lib/bdev/raid/raid5f.c/raid5f_ut+0x4235b4) (BuildId: 028d075edd1a7cd17881fd678ef076adfdbac13d)

Fix this by making iters a zero-length array and put it in a union with a
two-element array to keep the default size for compatibility.

Change-Id: I8573b015755e9986cdadbfa1705d269d51a7c2b7
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/18402
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
EugeneKochetov pushed a commit that referenced this pull request Jan 18, 2024
String allocated in avahi_string_list_to_string()
was not freed after it was no longer needed.

Fixes spdk#3122

Direct leak of 102 byte(s) in 2 object(s) allocated from:
    #0 0x7fc255cd92ef in malloc (/usr/lib64/libasan.so.8+0xd92ef) (BuildId: 6f17f87dc4c1aa9f9dde7c4856604c3a25ba4872)
    #1 0x7fc255bf5fb4 in avahi_malloc (/usr/lib64/libavahi-common.so.3+0x3fb4) (BuildId: e99b431894489a77a31921493f198df4850090e0)
    #2 0x7fc255bf6e74 in avahi_string_list_to_string (/usr/lib64/libavahi-common.so.3+0x4e74) (BuildId: e99b431894489a77a31921493f198df4850090e0)
    #3 0x4a3c93 in mdns_resolve_callback /home/tzawadzk/spdk/module/bdev/nvme/bdev_mdns_client.c:261
	....

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib0eae278d59152c97572531bef59efef36600f13
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/20849
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
EugeneKochetov pushed a commit that referenced this pull request Jan 18, 2024
first_fused_req is touched on the hot completion path but fused commands
are not considered hot path in general. To workaround that first_fused
flag is introduced in the first cache line of the nvmf request.

Change-Id: Ibd5e9278101cbad6883e937b2830dc8e84b3cb3c
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/21442
Reviewed-by: Jim Harris <jim.harris@samsung.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
EugeneKochetov pushed a commit that referenced this pull request Apr 9, 2024
As per typedef in nvme.h the spdk_nvme_cpl argument should be a
pointer to a const struct.

This fixes runtimer error under clang >= 17.x which now makes the
-fsanitize=function available for C and which on our end is being
enabled via -fsanitize=undefined under UBSAN.

Error in question:

 Test: test_spdk_nvme_detach ...passed
  Test: test_nvme_completion_poll_cb ...passed
  Test: test_nvme_user_copy_cmd_complete
.../root/spdk/lib/nvme/nvme.c:417:2: runtime error: call to function
dummy_cb through pointer to incorrect function type 'void (*)(void *,
const struct spdk_nvme_cpl *)'
/root/spdk/test/unit/lib/nvme/nvme.c/nvme_ut.c:584: note: dummy_cb
defined here
    #0 0x5098e0 in nvme_user_copy_cmd_complete
       /root/spdk/lib/nvme/nvme.c:417:2
    #1 0x532161 in test_nvme_user_copy_cmd_complete
       /root/spdk/test/unit/lib/nvme/nvme.c/nvme_ut.c:604:2
    #2 0x7f08c952266a  (/usr/lib64/libcunit.so.1+0x466a) (BuildId:
       d99e3b60795f2ce01ada820b4b7e3cd84d8150fe)
    #3 0x7f08c95229c7  (/usr/lib64/libcunit.so.1+0x49c7) (BuildId:
       d99e3b60795f2ce01ada820b4b7e3cd84d8150fe)
    #4 0x7f08c9523a9f in CU_run_all_tests
       (/usr/lib64/libcunit.so.1+0x5a9f) (BuildId:
d99e3b60795f2ce01ada820b4b7e3cd84d8150fe)
    #5 0x55555e in run_tests /root/spdk/lib/ut/ut.c:169:3
    #6 0x552aec in spdk_ut_run_tests /root/spdk/lib/ut/ut.c:225:8
    #7 0x522d52 in main
       /root/spdk/test/unit/lib/nvme/nvme.c/nvme_ut.c:1664:17
    #8 0x7f08c8c28149 in __libc_start_call_main
       (/usr/lib64/libc.so.6+0x28149) (BuildId:
7ea8d85df0e89b90c63ac7ed2b3578b2e7728756)
    #9 0x7f08c8c2820a in __libc_start_main@GLIBC_2.2.5
       (/usr/lib64/libc.so.6+0x2820a) (BuildId:
7ea8d85df0e89b90c63ac7ed2b3578b2e7728756)
    #10 0x42b6a4 in _start
	(/root/spdk/test/unit/lib/nvme/nvme.c/nvme_ut+0x42b6a4)
(BuildId: 6fc2caaf777030becad2d0f660ec68443f3380b4)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
/root/spdk/lib/nvme/nvme.c:417:2 in
./test/unit/unittest.sh: line 85: 75349 Aborted                 (core
dumped) $valgrind $testdir/lib/nvme/nvme.c/nvme_ut

Change-Id: Iddbd5fc0dee0ef6a6cc1f032e079f6119e76aed9
Signed-off-by: Michal Berger <michal.berger@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/22025
Reviewed-by: Jim Harris <jim.harris@samsung.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet