zil_itx_needcopy_bytes kstat counter is corrupted #6988

Closed
dechamps opened this issue Dec 20, 2017 · 14 comments

@dechamps
Contributor

dechamps commented Dec 20, 2017

System information

Type                  Version/Name
Distribution Name     Debian
Distribution Version  Unstable
Linux Kernel          4.13.0-1-amd64
Architecture          amd64
ZFS Version           0.7.3-3
SPL Version           0.7.3-1

Describe the problem you're observing

$ cat /proc/spl/kstat/zfs/zil
15 1 0x01 13 624 31503034653 382758011634377
name                            type data
zil_commit_count                4    197902
zil_commit_writer_count         4    197884
zil_itx_count                   4    611431070
zil_itx_indirect_count          4    0
zil_itx_indirect_bytes          4    0
zil_itx_copied_count            4    0
zil_itx_copied_bytes            4    0
zil_itx_needcopy_count          4    611266365
zil_itx_needcopy_bytes          4    18446744072731425348
zil_itx_metaslab_normal_count   4    0
zil_itx_metaslab_normal_bytes   4    0
zil_itx_metaslab_slog_count     4    1169526
zil_itx_metaslab_slog_bytes     4    140983216376

The zil_itx_needcopy_bytes counter is blatantly wrong - I'm pretty sure I did not write 16 exabytes of data in that pool :) Its value is quite close to UINT64_MAX, which suggests some kind of overflow or memory corruption.

Describe how to reproduce the problem

Not sure. However I can tell that it started after I did a system upgrade, which included the following version changes:

  • Kernel: 4.12 → 4.13
  • SPL: 0.6.5 → 0.7.3
  • ZFS: 0.6.5 → 0.7.3

For this reason I suspect this might be a regression introduced between SPL/ZFS 0.6.5 and SPL/ZFS 0.7.3.

This issue might seem benign, but in my case it's really not because it prevents Prometheus Node exporter from exporting ZFS metrics correctly. Here's the log message from the node exporter in an attempt to make this issue easier to search for:

time="2017-12-20T22:32:39Z" level=error msg="ERROR: zfs collector failed after 0.000693s: could not parse expected integer value for \"kstat.zfs.misc.zil.zil_itx_needcopy_bytes\"" source="node_exporter.go:95"
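
One plausible reading of this error, assuming the collector parsed kstat values as signed 64-bit integers at the time (the thread does not confirm this), is that the corrupted value simply does not fit in that range. A minimal C sketch of the difference between signed and unsigned parsing follows; the helper names are hypothetical and this is not node_exporter's actual code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: true if s is a decimal number that fits in a
 * signed long long (at least 64 bits wide). */
static bool parses_as_int64(const char *s)
{
	char *end;

	errno = 0;
	(void)strtoll(s, &end, 10);
	/* ERANGE is set when the value is outside the signed range. */
	return errno == 0 && end != s && *end == '\0';
}

/* Hypothetical helper: same check, but against the unsigned range. */
static bool parses_as_uint64(const char *s)
{
	char *end;

	errno = 0;
	(void)strtoull(s, &end, 10);
	return errno == 0 && end != s && *end == '\0';
}
```

With the counter value from the kstat dump above, the signed parse is rejected while the unsigned parse succeeds.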
@dechamps
Contributor Author

The only place where that counter gets written to in the code is:

https://github.com/zfsonlinux/zfs/blob/1b2b0acab54ad4320e9fab9f46612fdb2a71cf87/module/zfs/zil.c#L1489

It looks like the only way that can happen is if lrw->lr_length itself is corrupted (overflow?), which actually sounds quite scary.

@behlendorf behlendorf added this to the 0.8.0 milestone Dec 21, 2017
@behlendorf
Contributor

That is scary! So it's possible this was fixed in 0.7.4 by commit 4a98780 which is a backport from master. The patch resolves a potential issue where an itx, and thus an lrw->lr_length, might be corrupted under exactly the right circumstances. I've never actually been able to reproduce this specific issue but it's a plausible way to explain the zilstats. And it would only need to have happened once to trash that counter.

If you're able to reproduce the issue I'd suggest updating to 0.7.4 or newer which includes the fix. Locally I haven't been able to reproduce this issue.

@dechamps
Contributor Author

Interesting. I might just try that. I tried to reproduce it by adding a tactical VERIFY3U() before that line and then running ztest, but no luck.

On my live production pool, I know from my Prometheus timeseries that it happens very quickly (< minutes) after the pool comes up, but I have no idea what triggers it specifically. Once I'm back from the holidays I'll try updating to 0.7.4 (or just cherry-picking your patch) and see what happens.

@cwedgwood
Contributor

@dechamps underflow?

18446744072731425348 has its most significant bits set.

Interpreted as a signed 64-bit integer it is -978126268, a magnitude which isn't outrageous.
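
The underflow reading can be checked by reinterpreting the counter's bits as a signed 64-bit integer. The sketch below is only an illustration; `as_signed64` is a hypothetical helper, and it assumes the usual two's complement representation.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of an unsigned 64-bit value as signed.
 * memcpy avoids the implementation-defined result of a direct cast
 * for values above INT64_MAX. */
static int64_t as_signed64(uint64_t v)
{
	int64_t s;

	memcpy(&s, &v, sizeof s);
	return s;
}
```

For the value in the first report, `as_signed64(18446744072731425348ULL)` is -978126268, which is what a counter would show after roughly 978 MB of net negative increments.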

@dechamps
Contributor Author

dechamps commented Jan 7, 2018

I have upgraded to 0.7.4, and that appears to have fixed the issue.

@dechamps dechamps closed this as completed Jan 7, 2018
@dechamps
Contributor Author

Strike my last. It looks like I spoke too fast. The issue is still there in 0.7.4-1. The counter started showing corruption again after about 5 days of uptime.

@dechamps dechamps reopened this Jan 14, 2018
@nightah

nightah commented Jan 23, 2018

It appears that I'm having the same issue.

System information

Type                  Version/Name
Distribution Name     Arch Linux
Distribution Version  n/a
Linux Kernel          4.14.14-1
Architecture          amd64
ZFS Version           0.7.5-1
SPL Version           0.7.5-1

Describe the problem you're observing

# cat /proc/spl/kstat/zfs/zil
15 1 0x01 13 624 1924297609 504195774290
name                            type data
zil_commit_count                4    7137
zil_commit_writer_count         4    6754
zil_itx_count                   4    126774
zil_itx_indirect_count          4    145
zil_itx_indirect_bytes          4    14166783
zil_itx_copied_count            4    0
zil_itx_copied_bytes            4    0
zil_itx_needcopy_count          4    120733
zil_itx_needcopy_bytes          4    18446744073709013048
zil_itx_metaslab_normal_count   4    5581
zil_itx_metaslab_normal_bytes   4    84571112
zil_itx_metaslab_slog_count     4    0
zil_itx_metaslab_slog_bytes     4    0

Similarly I noticed this when my node exporter instance complained about the same issue as the previous report.

@cwedgwood
Contributor

@dechamps what about putting some unused (but exported) guard values before and after it? that way when it's corrupted we could dump those (which by default would be 0) and perhaps get some more insight into this

@cwedgwood
Contributor

@dechamps furthermore, if the guard values are being corrupted we could look for this in common code-paths and WARN
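
The guard-value idea could look roughly like the sketch below. The struct and function names are hypothetical and do not correspond to the real kstat layout; the point is only that a stray write touching memory next to the counter would leave a nonzero guard behind.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the counter of interest sandwiched between two
 * exported guard words that nothing is supposed to write to. */
struct guarded_counter {
	uint64_t guard_before;	/* expected to stay 0 */
	uint64_t value;		/* e.g. the needcopy byte count */
	uint64_t guard_after;	/* expected to stay 0 */
};

/* True while neither guard has been overwritten. A common code path
 * could call this and warn when it returns false. */
static bool guards_intact(const struct guarded_counter *g)
{
	return g->guard_before == 0 && g->guard_after == 0;
}
```

If the guards stay at zero while the counter still goes wrong, memory corruption becomes a less likely explanation than an accounting bug in the code that updates the counter.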

@chrisrd
Contributor

chrisrd commented Feb 13, 2018

Me too... zfs-0.7.6, linux-4.9.76 (also noticed via node-exporter failing)

# grep zil_itx_needcopy_bytes /proc/spl/kstat/zfs/zil
zil_itx_needcopy_bytes          4    18446744073709537686

@cwedgwood
Contributor

@chrisrd @behlendorf note 18446744073709537686 is 0xffffffffffffc996 - again, high bits set as if an underflow (-13930)

@richardelling
Contributor

The logic for the length of the data written is incorrect in zil.c line 1165, allowing a negative increment soon thereafter.

				if (lrwb->lr_length > dnow)
					lrwb->lr_length = dnow;
				lrw->lr_offset += dnow;
				lrw->lr_length -= dnow;
				ZIL_STAT_BUMP(zil_itx_needcopy_count);
				ZIL_STAT_INCR(zil_itx_needcopy_bytes,
				    lrw->lr_length);

I'll take a closer look at this soon.

Regarding node_exporter, the latest version (after 16 Jan 2018) should handle the uint64-to-float64 conversion correctly. node_exporter and Prometheus only work with floats, so we will inevitably lose resolution. If you can reach out to me at Richard.Elling@RichardElling.com with the error message you saw, I think we can make node_exporter smarter about these things.
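
The loss of resolution mentioned here comes from a 64-bit float having a 53-bit significand. A small C sketch (the helper name is hypothetical) shows which counter values survive a round trip through `double`:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if v converts to double and back without changing. */
static bool survives_double_roundtrip(uint64_t v)
{
	double d = (double)v;

	/* Values near UINT64_MAX can round up to 2^64, and converting
	 * that back to uint64_t would be undefined behavior. */
	if (d >= 18446744073709551616.0)
		return false;
	return (uint64_t)d == v;
}
```

Every value up to 2^53 survives; 2^53 + 1 does not, and neither does a wrapped counter such as 18446744073709537686.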

@richardelling
Contributor

I can see no benefit to incrementing zil_itx_needcopy_bytes after lrw->lr_length is decremented. IMHO, the benefit of tracking the bytes is to see the relative weight of WR_COPY, WR_NEED_COPY, and WR_INDIRECT. With the relative weighting, decisions can be made with respect to zfs_immediate_write_sz and logbias.

So I propose moving the lrw->lr_length -= dnow; after the ZIL_STAT_* adjustments. Thoughts?

@chrisrd
Contributor

chrisrd commented Feb 15, 2018

It's all pretty opaque to me, but here goes nothing...

It looks like dnow is how much data we're adding "now" to lwb (via lrcb) and writing out in zil_lwb_add_txg(lwb, txg). And this stuff:

        lrw->lr_offset += dnow; 
        lrw->lr_length -= dnow;

...is recording the offset in the buffer (lr_offset) and how much data (lr_length) we need to output next time we come through the goto cont loop.

So this:

        ZIL_STAT_INCR(zil_itx_needcopy_bytes, lrw->lr_length);

...looks like we're incrementing zil_itx_needcopy_bytes by how much data is remaining, each time through the loop, and lrw->lr_length goes negative when we can fit the whole thing with space left over in the current record (which means we won't come through the loop again).

I suspect the correct thing to do is to increment zil_itx_needcopy_bytes by how much we're writing this time through the loop, i.e.:

        ZIL_STAT_INCR(zil_itx_needcopy_bytes, dnow);

But I've no idea how to test this!
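
One way to reason about the proposal without a kernel build is a small user-space model of the loop. The sketch below is not the zil.c code: the struct, the function names, and the assumption that each chunk is padded up to a multiple of 8 bytes (so that the last chunk can exceed the remaining length) are illustrative only.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kstat counter under both schemes. */
struct counters {
	uint64_t buggy;	/* incremented by the remaining length */
	uint64_t fixed;	/* incremented by dnow */
};

/* Round x up to a multiple of 8 (assumed padding, for illustration). */
static uint64_t roundup8(uint64_t x)
{
	return (x + 7) & ~(uint64_t)7;
}

/* Model one WR_NEED_COPY write of lr_length bytes split across log
 * blocks that each hold lwb_space data bytes. lwb_space must be a
 * nonzero multiple of 8 for this model to terminate. */
static void
simulate_write(struct counters *c, uint64_t lr_length, uint64_t lwb_space)
{
	uint64_t dlen = roundup8(lr_length);

	while (dlen > 0) {
		uint64_t dnow = dlen < lwb_space ? dlen : lwb_space;

		lr_length -= dnow;	/* wraps below zero on a padded last pass */
		c->buggy += lr_length;	/* accounting as in the quoted snippet */
		c->fixed += dnow;	/* accounting as proposed above */
		dlen -= dnow;
	}
}
```

In this model a single 13-byte write into a 64-byte block leaves `buggy` at 2^64 - 3 while `fixed` reads 16, which matches the shape of the values reported in this thread: small negative numbers displayed as huge unsigned ones.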

chrisrd added a commit to chrisrd/zfs that referenced this issue Feb 15, 2018
In zil_lwb_commit() with TX_WRITE, we copy the log write record (lrw)
into the log write block (lwb) and send it off using zil_lwb_add_txg().
If we also have WR_NEED_COPY, we additionally copy the lrw's data into
the lwb to be sent off.  If the lrw + data doesn't fit into the lwb, we
send the lrw and as much data as will fit (dnow bytes), then go back
and do the same with the remaining data.

Each time through this loop we're sending dnow data bytes. I.e.
zil_itx_needcopy_bytes should be incremented by dnow.

Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes: openzfs#6988
tonyhutter pushed a commit to tonyhutter/zfs that referenced this issue Mar 7, 2018
In zil_lwb_commit() with TX_WRITE, we copy the log write record (lrw)
into the log write block (lwb) and send it off using zil_lwb_add_txg().
If we also have WR_NEED_COPY, we additionally copy the lrw's data into
the lwb to be sent off.  If the lrw + data doesn't fit into the lwb, we
send the lrw and as much data as will fit (dnow bytes), then go back
and do the same with the remaining data.

Each time through this loop we're sending dnow data bytes. I.e.
zil_itx_needcopy_bytes should be incremented by dnow.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes openzfs#6988 
Closes openzfs#7176
tonyhutter pushed a commit to tonyhutter/zfs that referenced this issue Mar 7, 2018
This is a squashed patchset for zfs-0.7.7.  The individual commits are
in the tonyhutter:zfs-0.7.7-hutter branch.  I squashed the commits so
that buildbot wouldn't have to run against each one, and because
github/buildbot seem to have a maximum limit of 30 commits they can
test from a PR.

- Linux 4.16 compat: get_disk_and_module() openzfs#7264
- Change checksum & IO delay ratelimit values openzfs#7252
- Increment zil_itx_needcopy_bytes properly openzfs#6988  openzfs#7176
- Fix some typos openzfs#7237
- Fix zpool(8) list example to match actual format openzfs#7244
- Add SMART self-test results to zpool status -c openzfs#7178
- Add scrub after resilver zed script openzfs#4662  openzfs#7086
- Fix free memory calculation on v3.14+ openzfs#7170
- Report duration and error in mmp_history entries openzfs#7190
- Do not initiate MMP writes while pool is suspended openzfs#7182
- Linux 4.16 compat: use correct *_dec_and_test() openzfs#7179  openzfs#7211
- Allow modprobe to fail when called within systemd openzfs#7174
- Add SMART attributes for SSD and NVMe openzfs#7183  openzfs#7193
- Correct count_uberblocks in mmp.kshlib openzfs#7191
- Fix config issues: frame size and headers openzfs#7169
- Clarify zinject(8) explanation of -e openzfs#7172
- OpenZFS 8857 - zio_remove_child() panic due to already destroyed
  parent zio openzfs#7168
- 'zfs receive' fails with "dataset is busy" openzfs#7129  openzfs#7154
- contrib/initramfs: add missing conf.d/zfs openzfs#7158
- mmp should use a fixed tag for spa_config locks openzfs#6530  openzfs#7155
- Handle zap_add() failures in mixed case mode openzfs#7011 openzfs#7054
- Fix zdb -ed on objset for exported pool openzfs#7099 openzfs#6464
- Fix zdb -E segfault openzfs#7099
- Fix zdb -R decompression openzfs#7099  openzfs#4984
- Fix racy assignment of zcb.zcb_haderrors openzfs#7099
- Fix zle_decompress out of bound access openzfs#7099
- Fix zdb -c traverse stop on damaged objset root openzfs#7099
- Linux 4.11 compat: avoid refcount_t name conflict openzfs#7148
- Linux 4.16 compat: inode_set_iversion() openzfs#7148
- OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common
  contains a use after end of the lifetime of a local variable openzfs#7141
- Remove deprecated zfs_arc_p_aggressive_disable openzfs#7135
- Fix default libdir for Debian/Ubuntu openzfs#7083  openzfs#7101
- Bug fix in qat_compress.c for vmalloc addr check openzfs#7125
- Fix systemd_ RPM macros usage on Debian-based distributions openzfs#7074
  openzfs#7100
- Emit an error message before MMP suspends pool openzfs#7048
- ZTS: Fix create-o_ashift test case openzfs#6924  openzfs#6977
- Fix --with-systemd on Debian-based distributions (openzfs#6963) openzfs#6591  openzfs#6963
- Remove vn_rename and vn_remove dependency openzfs/spl#648 openzfs#6753
- Add support for "--enable-code-coverage" option openzfs#6670
- Make "-fno-inline" compile option more accessible openzfs#6605
- Add configure option to enable gcov analysis openzfs#6642
- Implement --enable-debuginfo to force debuginfo openzfs#2734
- Make --enable-debug fail when given bogus args openzfs#2734

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Requires-spl: refs/pull/690/head
tonyhutter pushed a commit to tonyhutter/zfs that referenced this issue Mar 13, 2018
This is a squashed patchset for zfs-0.7.7.  The individual commits are
in the tonyhutter:zfs-0.7.7-hutter branch.  I squashed the commits so
that buildbot wouldn't have to run against each one, and because
github/buildbot seem to have a maximum limit of 30 commits they can
test from a PR.

- Fix MMP write frequency for large pools openzfs#7205 openzfs#7289
- Handle zio_resume and mmp => off openzfs#7286
- Fix zfs-kmod builds when using rpm >= 4.14 openzfs#7284
- zdb and inuse tests don't pass with real disks openzfs#6939 openzfs#7261
- Take user namespaces into account in policy checks openzfs#6800 openzfs#7270
- Detect long config lock acquisition in mmp openzfs#7212
- Linux 4.16 compat: get_disk_and_module() openzfs#7264
- Change checksum & IO delay ratelimit values openzfs#7252
- Increment zil_itx_needcopy_bytes properly openzfs#6988 openzfs#7176
- Fix some typos openzfs#7237
- Fix zpool(8) list example to match actual format openzfs#7244
- Add SMART self-test results to zpool status -c openzfs#7178
- Add scrub after resilver zed script openzfs#4662 openzfs#7086
- Fix free memory calculation on v3.14+ openzfs#7170
- Report duration and error in mmp_history entries openzfs#7190
- Do not initiate MMP writes while pool is suspended openzfs#7182
- Linux 4.16 compat: use correct *_dec_and_test()
- Allow modprobe to fail when called within systemd openzfs#7174
- Add SMART attributes for SSD and NVMe openzfs#7183 openzfs#7193
- Correct count_uberblocks in mmp.kshlib openzfs#7191
- Fix config issues: frame size and headers openzfs#7169
- Clarify zinject(8) explanation of -e openzfs#7172
- OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio openzfs#7168
- 'zfs receive' fails with "dataset is busy" openzfs#7129 openzfs#7154
- contrib/initramfs: add missing conf.d/zfs openzfs#7158
- mmp should use a fixed tag for spa_config locks openzfs#6530 openzfs#7155
- Handle zap_add() failures in mixed case mode openzfs#7011 openzfs#7054
- Fix zdb -ed on objset for exported pool openzfs#7099 openzfs#6464
- Fix zdb -E segfault openzfs#7099
- Fix zdb -R decompression openzfs#7099 openzfs#4984
- Fix racy assignment of zcb.zcb_haderrors openzfs#7099
- Fix zle_decompress out of bound access openzfs#7099
- Fix zdb -c traverse stop on damaged objset root openzfs#7099
- Linux 4.11 compat: avoid refcount_t name conflict openzfs#7148
- Linux 4.16 compat: inode_set_iversion() openzfs#7148
- OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable openzfs#7141
- Remove deprecated zfs_arc_p_aggressive_disable openzfs#7135
- Fix default libdir for Debian/Ubuntu openzfs#7083 openzfs#7101
- Bug fix in qat_compress.c for vmalloc addr check openzfs#7125
- Fix systemd_ RPM macros usage on Debian-based distributions openzfs#7074 openzfs#7100
- Emit an error message before MMP suspends pool openzfs#7048
- ZTS: Fix create-o_ashift test case openzfs#6924 openzfs#6977
- Fix --with-systemd on Debian-based distributions (openzfs#6963) openzfs#6591 openzfs#6963
- Remove vn_rename and vn_remove dependency openzfs/spl#648 openzfs#6753
- Add support for "--enable-code-coverage" option openzfs#6670
- Make "-fno-inline" compile option more accessible openzfs#6605
- Add configure option to enable gcov analysis openzfs#6642
- Implement --enable-debuginfo to force debuginfo openzfs#2734
- Make --enable-debug fail when given bogus args openzfs#2734

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Requires-spl: refs/pull/690/head
tonyhutter pushed a commit that referenced this issue Mar 19, 2018
In zil_lwb_commit() with TX_WRITE, we copy the log write record (lrw)
into the log write block (lwb) and send it off using zil_lwb_add_txg().
If we also have WR_NEED_COPY, we additionally copy the lrw's data into
the lwb to be sent off.  If the lrw + data doesn't fit into the lwb, we
send the lrw and as much data as will fit (dnow bytes), then go back
and do the same with the remaining data.

Each time through this loop we're sending dnow data bytes. I.e.
zil_itx_needcopy_bytes should be incremented by dnow.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes #6988
Closes #7176