QAT compression always fails to take effect #7081

Closed
woquflux opened this issue Jan 26, 2018 · 11 comments

@woquflux

System information

Type             Version/Name
Linux Kernel     3.10.0-327.el7.x86_64 (Red Hat Enterprise Linux 7.2)
Architecture     x86_64
ZFS Version      0.7.5-1
SPL Version      0.7.5-1
QAT Hardware     DH895XCC
QAT Driver       qatmux.l.2.6.0-60

Describe the problem you're observing

I followed the instructions in #5846 to test QAT compression on a ZFS volume, but it never takes effect:

To enable Intel QAT hardware acceleration in ZOL, you need to have the QAT hardware and driver installed:

Enable QAT in autoconf, e.g.:

  • ./configure --with-qat=qat-install-dir/QAT1.6
  • make

Set GZIP compression in ZFS dataset: "compression = gzip"
The data written to ZFS pool will be compressed by QAT hardware automatically with GZIP format.

The following are the QAT statistics while I use dd or cp to write data to the ZFS volume:

cat /proc/spl/kstat/zfs/qat

18 1 0x01 7 336 2365618746922 2805108820657
name                            type data
comp_reqests                    4    0
comp_total_in_bytes             4    0
comp_total_out_bytes            4    0
decomp_reqests                  4    0
decomp_total_in_bytes           4    0
decomp_total_out_bytes          4    0
dc_fails                        4    0
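
For scripted checks, the counters above can be read programmatically. The following is a minimal user-space sketch in C, assuming the kstat text keeps the `name type data` layout shown here; `kstat_lookup` is a hypothetical helper, not part of ZFS or SPL.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up one counter in kstat-style text ("name type data" per line).
 * Returns 0 and stores the value on success, -1 if the name is absent.
 * Header lines that do not match the three-field layout are skipped.
 */
static int
kstat_lookup(const char *text, const char *name, unsigned long long *value)
{
	const char *line = text;

	while (*line != '\0') {
		size_t len = strcspn(line, "\n");
		char buf[256], field[64];
		unsigned int type;
		unsigned long long data;

		/* Copy one line so sscanf cannot read past its newline. */
		if (len < sizeof (buf)) {
			memcpy(buf, line, len);
			buf[len] = '\0';
			if (sscanf(buf, "%63s %u %llu", field, &type,
			    &data) == 3 && strcmp(field, name) == 0) {
				*value = data;
				return (0);
			}
		}
		line += len;
		if (*line == '\n')
			line++;
	}
	return (-1);
}
```

Given the contents of /proc/spl/kstat/zfs/qat, a lookup of `comp_reqests` (spelled as the kstat prints it) yields 0 for the output above, which is consistent with no compression request having been counted.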

So I looked at the qat_compress.c code and found that it always returns from this location:

// qat_compress.c
339 qat_compress(qat_compress_dir_t dir, char *src, int src_len,
340     char *dst, int dst_len, size_t *c_len)
341 {
// ... ...
373         if (!is_vmalloc_addr(src) || !is_vmalloc_addr(src + src_len - 1) ||
374             !is_vmalloc_addr(dst) || !is_vmalloc_addr(dst + dst_len - 1))
375                 return (-1);

I get the same result with ZFS 0.7.0. What causes this problem, and how can I continue my test?
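
The effect of this guard can be modeled outside the kernel. The sketch below is a user-space approximation, not the ZFS code: `model_is_vmalloc_addr` and the `MODEL_VMALLOC_*` bounds are hypothetical stand-ins for the kernel's `is_vmalloc_addr()` and its vmalloc address range. It illustrates that a buffer whose first or last byte lies outside that range makes the function return -1 early, which would be consistent with every counter staying at 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical vmalloc window; the values are illustrative only. */
#define	MODEL_VMALLOC_START	((uintptr_t)0x1000)
#define	MODEL_VMALLOC_END	((uintptr_t)0x2000)

/* User-space model of is_vmalloc_addr(): true if addr is in the window. */
static bool
model_is_vmalloc_addr(uintptr_t addr)
{
	return (addr >= MODEL_VMALLOC_START && addr < MODEL_VMALLOC_END);
}

/*
 * Mirrors the quoted guard: both ends of the source and destination
 * buffers must lie in the vmalloc window, otherwise return -1.
 */
static int
model_qat_precheck(uintptr_t src, int src_len, uintptr_t dst, int dst_len)
{
	if (!model_is_vmalloc_addr(src) ||
	    !model_is_vmalloc_addr(src + src_len - 1) ||
	    !model_is_vmalloc_addr(dst) ||
	    !model_is_vmalloc_addr(dst + dst_len - 1))
		return (-1);
	return (0);
}
```

In this model, a source buffer at an address outside the window is rejected even when the destination is valid, regardless of the data it holds.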


Other information

This is the status of QAT

[root@qbackup zfs-0.7.5]# service qat_service status
There is 2 acceleration device(s) in the system:
 icp_dev0 - type=dh895xcc, inst_id=0, node_id=0,  bdf=07:00:0, #accel=6, #engines=12, state=up
 icp_dev1 - type=dh895xcc, inst_id=1, node_id=0,  bdf=84:00:0, #accel=6, #engines=12, state=up

This is the status of zfs

[root@qbackup zfs-0.7.5]# zfs get all qbackup/hdz1
NAME          PROPERTY              VALUE                   SOURCE
qbackup/hdz1  type                  volume                  -
qbackup/hdz1  creation              Thu Jan 25 14:13 2018   -
qbackup/hdz1  used                  56K                     -
qbackup/hdz1  available             220G                    -
qbackup/hdz1  referenced            56K                     -
qbackup/hdz1  compressratio         1.00x                   -
qbackup/hdz1  reservation           none                    default
qbackup/hdz1  volsize               50G                     local
qbackup/hdz1  volblocksize          8K                      default
qbackup/hdz1  checksum              on                      default
qbackup/hdz1  compression           gzip                    local
qbackup/hdz1  readonly              off                     default
qbackup/hdz1  createtxg             336807                  -
qbackup/hdz1  copies                1                       default
qbackup/hdz1  refreservation        none                    local
qbackup/hdz1  guid                  12709042754560527079    -
qbackup/hdz1  primarycache          all                     local
qbackup/hdz1  secondarycache        all                     default
qbackup/hdz1  usedbysnapshots       0B                      -
qbackup/hdz1  usedbydataset         56K                     -
qbackup/hdz1  usedbychildren        0B                      -
qbackup/hdz1  usedbyrefreservation  0B                      -
qbackup/hdz1  logbias               latency                 default
qbackup/hdz1  dedup                 off                     default
qbackup/hdz1  mlslabel              none                    default
qbackup/hdz1  sync                  standard                default
qbackup/hdz1  refcompressratio      1.00x                   -
qbackup/hdz1  written               56K                     -
qbackup/hdz1  logicalused           26K                     -
qbackup/hdz1  logicalreferenced     26K                     -
qbackup/hdz1  volmode               default                 default
qbackup/hdz1  snapshot_limit        none                    default
qbackup/hdz1  snapshot_count        none                    default
qbackup/hdz1  snapdev               hidden                  default
qbackup/hdz1  context               none                    default
qbackup/hdz1  fscontext             none                    default
qbackup/hdz1  defcontext            none                    default
qbackup/hdz1  rootcontext           none                    default
qbackup/hdz1  redundant_metadata    all                     default
@gmelikov
Member

@wli5 @daweiq can you please look at this issue?

@wli5
Contributor

wli5 commented Jan 26, 2018

@woquflux can you please let us know the steps to reproduce this issue, e.g., the OS info and the file type and size you are using for testing? The code runs well on our system, so we need more info to reproduce it.

@woquflux
Author

woquflux commented Jan 26, 2018

OK! Thank you for answering my question!

@woquflux
Author

woquflux commented Jan 26, 2018

System information

Type             Version/Name
Machine          DELL PowerEdge R730xd
CPU              Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
OS               3.10.0-327.el7.x86_64 (Red Hat Enterprise Linux 7.2)
QAT Hardware     Intel QuickAssist Technology DH895xCC
QAT Driver       version L.2.6.0-60 (https://01.org/sites/default/files/page/qatmux.l.2.6.0-60.tgz)

Here are my detailed steps:

  • Download and extract the driver, then select option 3 to install:
tar xzf qatmux.l.2.6.0-60.tgz
./installer.sh
    <****Acceleration Information****>


Number of DH895xCC devices on the system: 2

BDF=07:00.0
DH895x Stepping A0 detected
device 0 is SKU2

BDF=84:00.0
DH895x Stepping A0 detected
device 1 is SKU2

     Please Select Option :
     ----------------------
     1   Build
     2   Clean Build
     3   Install
     4   Uninstall
     5   Show Accel Info
     6   Change Configuration
     7   Dependency List
     0   Exit


     QAT Devices found: 2 QAT1.6 devices
                        0 QAT1.5 devices
     Configuration:     Build Acceleration and Sample Code
                        for QAT1.6
                        And SR-IOV Disabled
     Exit and re-enter to set defaults

  • Download zfs-0.7.5.tar.gz and spl-0.7.5.tar.gz and install ZFS:
cd spl-0.7.5/
./configure
make;
make install;
cd zfs-0.7.5/
./configure --with-qat=/opt/qat/QAT1.6/ --with-spl=/root/hdzhang/spl-0.7.5/
make;
make install;
modprobe zfs

  • Then I found that /proc/spl/kstat/zfs/qat did not exist, so I reinstalled the QAT driver:

[root@qbackup qat]# ./installer.sh

    =========================================================
    Welcome to Intel(R) QuickAssist Interactive Installer -v2
    =========================================================


     Please Select Option :
     ----------------------
     1   Build
     2   Clean Build
     3   Install
     4   Uninstall
     5   Show Accel Info
     6   Change Configuration
     7   Dependency List
     0   Exit


     QAT Devices found: 2 QAT1.6 devices
                        0 QAT1.5 devices
     Configuration:     Build Acceleration and Sample Code
                        for QAT1.6
                        And SR-IOV Disabled
     Exit and re-enter to set defaults


  • Test ZFS compression:
echo 0 > /sys/module/zfs/parameters/zfs_qat_disable

zpool create qbackup /dev/sdc;
zfs create -V 100G qbackup/hdz1;
zfs set compression=gzip qbackup/hdz1;
[root@qbackup ~]# zpool status
  pool: qbackup
 state: ONLINE
  scan: none requested
config:

    NAME        STATE     READ WRITE CKSUM
    qbackup     ONLINE       0     0     0
      sdc       ONLINE       0     0     0

errors: No known data errors

[root@qbackup ~]# zfs list
NAME           USED  AVAIL  REFER  MOUNTPOINT
qbackup        103G  1.47T    96K  /qbackup
qbackup/hdz1   103G  1.57T  2.08G  -

[root@qbackup ~]# zfs get all qbackup/hdz1
NAME          PROPERTY              VALUE                   SOURCE
qbackup/hdz1  type                  volume                  -
qbackup/hdz1  creation              Fri Jan 26 18:31 2018   -
qbackup/hdz1  used                  103G                    -
qbackup/hdz1  available             1.57T                   -
qbackup/hdz1  referenced            2.89G                   -
qbackup/hdz1  compressratio         2.01x                   -
qbackup/hdz1  reservation           none                    default
qbackup/hdz1  volsize               100G                    local
qbackup/hdz1  volblocksize          8K                      default
qbackup/hdz1  checksum              on                      default
qbackup/hdz1  compression           gzip                    local
qbackup/hdz1  readonly              off                     default
qbackup/hdz1  createtxg             81                      -
qbackup/hdz1  copies                1                       default
qbackup/hdz1  refreservation        103G                    local
qbackup/hdz1  guid                  16717141794397827493    -
qbackup/hdz1  primarycache          all                     default
qbackup/hdz1  secondarycache        all                     default
qbackup/hdz1  usedbysnapshots       0B                      -
qbackup/hdz1  usedbydataset         2.89G                   -
qbackup/hdz1  usedbychildren        0B                      -
qbackup/hdz1  usedbyrefreservation  100G                    -
qbackup/hdz1  logbias               latency                 default
qbackup/hdz1  dedup                 off                     default
qbackup/hdz1  mlslabel              none                    default
qbackup/hdz1  sync                  standard                default
qbackup/hdz1  refcompressratio      2.01x                   -
qbackup/hdz1  written               2.89G                   -
qbackup/hdz1  logicalused           5.75G                   -
qbackup/hdz1  logicalreferenced     5.75G                   -
qbackup/hdz1  volmode               default                 default
qbackup/hdz1  snapshot_limit        none                    default
qbackup/hdz1  snapshot_count        none                    default
qbackup/hdz1  snapdev               hidden                  default
qbackup/hdz1  context               none                    default
qbackup/hdz1  fscontext             none                    default
qbackup/hdz1  defcontext            none                    default
qbackup/hdz1  rootcontext           none                    default
qbackup/hdz1  redundant_metadata    all                     default
[root@qbackup ~]# dd if=/dev/sda of=/dev/qbackup/hdz1
Every 2.0s: cat /proc/spl/kstat/zfs/qat    Fri Jan 26 18:34:09 2018

18 1 0x01 7 336 2334347743541 2603832924844
name                            type data
comp_reqests                    4    0
comp_total_in_bytes             4    0
comp_total_out_bytes            4    0
decomp_reqests                  4    0
decomp_total_in_bytes           4    0
decomp_total_out_bytes          4    0
dc_fails                        4    0

@wli5
Contributor

wli5 commented Jan 29, 2018

@woquflux thanks for the steps; we can reproduce this problem now. At first glance, it seems you are using "dd" to copy a block device to another disk, so the source buffer is probably not allocated by the file system. We need to investigate further to work out a fix. In the meantime, you can change your test method, e.g., copy files from a file-system directory to the zpool, or use FIO for the tests. If you want to test with dd from the block layer, please try removing the check at line 373 of qat_compress.c and rebuilding the ZFS module. This condition check should be removed, but we need more tests to verify that before we submit a new PR.

@woquflux
Author

@wli5 Thank you for this information, but I have already tried copying files to the ZFS volume, and the result is the same as with dd. I also tried removing the check at line 373 of qat_compress.c. It looked normal at first, but after some time I found that the ZFS volume and the system volume had a problem. I will retest later and give you a detailed description of this behavior.

@wli5
Contributor

wli5 commented Jan 31, 2018

@woquflux we've verified that the code works well with this condition check removed. Please provide more info if you still find a problem, thanks!

@woquflux
Author

@wli5 Sorry, I just found that my test method had a small problem, so I will retest with the condition check removed. I believe there should be no problem. Thank you very much!

@woquflux
Author

woquflux commented Feb 1, 2018

@wli5 I found a new problem. When I install the QAT driver and choose SR-IOV (Host), QAT also fails to take effect. I checked the code and found that it exits in qat_compress.c at line 209. This means the system has no available DC instance, so what is the reason?

194 int
195 qat_init(void)
196 {
... ...
207         status = cpaDcGetNumInstances(&num_inst);
208         if (status != CPA_STATUS_SUCCESS || num_inst == 0)
209                 printk("init qat failed! %d %d\n", status, num_inst);
210                 return (-1);
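
One thing worth checking in the lines as quoted: the `if` at line 208 has no braces, so only the `printk` at line 209 is conditional and the `return (-1)` at line 210 runs on every call. If this matches the code that was actually compiled (assuming the `printk` was added locally for debugging), `qat_init` would fail regardless of `num_inst`. The user-space sketch below shows how a C compiler reads that layout and how braces change it; `MODEL_STATUS_SUCCESS` and both function names are hypothetical stand-ins, with `printf` in place of `printk`.

```c
#include <stdio.h>

/* Hypothetical stand-in for CPA_STATUS_SUCCESS. */
#define	MODEL_STATUS_SUCCESS	0

/* How the quoted lines are parsed: the return is not part of the if. */
static int
init_as_quoted(int status, unsigned int num_inst)
{
	if (status != MODEL_STATUS_SUCCESS || num_inst == 0)
		printf("init qat failed! %d %u\n", status, num_inst);
	return (-1);	/* outside the if: executes on every call */
}

/* With braces, the failure path is taken only when the check fails. */
static int
init_with_braces(int status, unsigned int num_inst)
{
	if (status != MODEL_STATUS_SUCCESS || num_inst == 0) {
		printf("init qat failed! %d %u\n", status, num_inst);
		return (-1);
	}
	return (0);
}
```

With a successful status and two instances, `init_as_quoted` still returns -1 without printing anything, while `init_with_braces` returns 0. Whether the message appears in the kernel log is therefore a quick way to tell the two cases apart.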

Are you Chinese? Can you give me your phone number or other ways of contact?

The following are the details


$./installer.sh


    =========================================================
    Welcome to Intel(R) QuickAssist Interactive Installer -v2
    =========================================================


     Please Select Option :
     ----------------------
     1   Build
     2   Clean Build
     3   Install
     4   Uninstall
     5   Show Accel Info
     6   Change Configuration
     7   Dependency List
     0   Exit


     QAT Devices found: 33 QAT1.6 devices
                        0 QAT1.5 devices
     Configuration:     Build Acceleration and Sample Code
                        for QAT1.6
                        And SR-IOV Disabled
     Exit and re-enter to set defaults



6
     Options to change configuration:
     --------------------------------
     a1  Set Build Target as "sample_code only"
     a2  Set Build Target as "DC_ONLY acceleration and sample_code"
     a3  Set Build Target as "acceleration driver only"
     b1  Set Build Location
     c1  QAT1.5 Only
     c2  QAT1.6 Only
     c3  QAT1.6 with Mux
     c4  QAT1.5 and QAT1.6 with Mux
     c5  Enable/Disable Mux based on devices detected now
     d1  Set SRIOV Mode to "Host"
     d2  Set SRIOV Mode to "Guest"
     e1  Toggle GigE Watchdog for Acceleration Install (Disabled)
     z1  Go Back to Main Menu

d1
     Please Select Option :
     ----------------------
     1   Build
     2   Clean Build
     3   Install
     4   Uninstall
     5   Show Accel Info
     6   Change Configuration
     7   Dependency List
     0   Exit


     QAT Devices found: 33 QAT1.6 devices
                        0 QAT1.5 devices
     Configuration:     Build Acceleration and Sample Code
                        for QAT1.6
                        and with SR-IOV(Host)
     Exit and re-enter to set defaults
$service qat_service status
There is 1 acceleration device(s) in the system:
 icp_dev0 - type=dh895xcc, inst_id=0, node_id=0,  bdf=84:00:0, #accel=6, #engines=12, state=up
$lspci |grep QAT
84:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
84:01.0 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:01.1 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:01.2 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:01.3 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:01.4 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:01.5 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:01.6 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:01.7 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:02.0 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:02.1 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:02.2 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:02.3 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:02.4 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:02.5 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:02.6 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:02.7 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:03.0 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:03.1 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:03.2 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:03.3 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:03.4 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:03.5 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:03.6 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:03.7 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:04.0 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:04.1 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:04.2 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:04.3 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:04.4 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:04.5 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:04.6 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
84:04.7 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function

@wli5
Contributor

wli5 commented Feb 2, 2018

@woquflux are you using zfs+qat in a VM or on the host? Have you tested the kernel-space sample code, and can you see an instance in the sample code? Yes, I'm in China, we can talk.

@woquflux
Author

woquflux commented Feb 2, 2018

@wli5 I am not using a virtual machine. I enabled the IOMMU and found that I then need to select SR-IOV (Host) when installing the driver. Can you give me your WeChat? My WeChat is he_dong_zhang; I can elaborate on my test method there.

wli5 added a commit to wli5/zfs that referenced this issue Feb 5, 2018
This fix is for issue openzfs#7081:
Remove the unused vmalloc address check, and function mem_to_page will
handle the non-vmalloc address when map it to a physical address.
@woquflux woquflux closed this as completed Feb 5, 2018
wli5 added a commit to wli5/zfs that referenced this issue Feb 5, 2018
This fix is for issue openzfs#7081:
Remove the unused vmalloc address check, and function mem_to_page
will handle the non-vmalloc address when map it to a physical
address.

Signed-off-by: Weigang Li <weigang.li@intel.com>