Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Redacted Send/Receive #7958

Merged
merged 12 commits into from
Jun 19, 2019
Merged

Conversation

pcd1193182
Copy link
Contributor

This PR contains the implementation of Redacted Send and Receive. This is a new feature for ZFS that allows users to selectively not back up parts of their data for space savings, security/privacy concerns, or other reasons. It also contains a number of related features, including enhancements to zdb and zstreamdump, an improvement to the ASSERT and VERIFY macros, the ability to do send size estimation from bookmarks, a more accurate send size estimation process, type support for featureflags, and new tests for all the various new features. There is also a substantial refactor of the zfs send/recv logic to make it more readable and maintainable. Finally, additions have been made to the documentation to reflect the new features and logic.

Motivation and Context

Redacted send/receive allows users to send subsets of their data to a target system. One possible use case for this feature is to not transmit sensitive information to a data warehousing, test/dev, or analytics environment. Another is to save space by not replicating unimportant data within a given dataset, for example in backup tools like zrepl.

Description

Redacted send/receive is a three-stage process. First, a clone (or clones) is made of the snapshot to be sent to the target. In this clone (or clones), all unnecessary or unwanted data is removed or modified. This clone is then snapshotted to create the "redaction snapshot" (or snapshots). Second, the new zfs redact command is used to create a redaction bookmark. The redaction bookmark stores the list of blocks in a snapshot that were modified by the redaction snapshot(s). Finally, the redaction bookmark is passed as a parameter to zfs send. When sending to the snapshot that was redacted, the redaction bookmark is used to filter out blocks that contain sensitive or unwanted information, and those blocks are not included in the send stream. When sending from the redaction bookmark, the blocks it contains are considered as candidate blocks in addition to those blocks in the destination snapshot that were modified since the creation_txg of the redaction bookmark. This step is necessary to allow the target to rehydrate data in the case where some blocks are accidentally or unnecessarily modified in the redaction snapshot.

The changes to bookmarks to enable fast space estimation involve adding deadlists to bookmarks. There is also logic to manage the life cycles of these deadlists.

The new size estimation process operates in cases where previously an accurate estimate could not be provided. In those cases, a send is performed where no data blocks are read, reducing the runtime significantly and providing a byte-accurate size estimate.

To make part of this diff easier to review, I have created a gist that contains a diff between the bottom half of the old dmu_send.c and the current dmu_recv.c. This should make it easier to compare the content changes that occurred while the refactor was occurring. That gist is located here.

How Has This Been Tested?

Many new tests have been added for all the changes in this pull request. In addition, the changes have been running in production at Delphix for months or years, and are considered reliable.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the gist, that helps with the review, but it is still a huge change. This is going to need to be broken up in to at few PRs in order to integrate it without destabilizing the branch. Let's start by pulling out a few of minor independent changes in to their own PRs all of which should be straight forward to integrate.

  • ASSERT/VERIFY improvements
  • New fnvlist_lookup_* functions.
  • Changes to the zfeature_register() interface
  • Any other chunks which make sense

Some version of the these dfbacf0 changes will also be needed to resolve the build issues.

cmd/zfs/zfs_main.c Outdated Show resolved Hide resolved
module/zfs/dsl_bookmark.c Outdated Show resolved Hide resolved
@ahrens
Copy link
Member

ahrens commented Sep 27, 2018

I think @don-brady's comment is relevant here: add an entry for ZFS_IOC_REDACT to libzfs_input_check.c: https://github.com/delphix/zfs/pull/28#pullrequestreview-159620335

include/spl/sys/debug.h Outdated Show resolved Hide resolved
Copy link
Member

@grwilson grwilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pcd1193182 you should add logic in validate_ioc_values and a test in zfs_ioc_input_tests for the new ioctls

@ahrens ahrens added the Type: Feature Feature request or new feature label Sep 28, 2018
@behlendorf behlendorf added Status: Blocked Depends on another pending change and removed Status: Code Review Needed Ready for review and testing labels Oct 9, 2018
@pcd1193182 pcd1193182 added Status: Code Review Needed Ready for review and testing and removed Status: Blocked Depends on another pending change labels Oct 29, 2018
Copy link
Contributor

@rstrlcpy rstrlcpy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will continue

cmd/zdb/zdb.c Outdated Show resolved Hide resolved
cmd/zdb/zdb.c Outdated Show resolved Hide resolved
cmd/zdb/zdb.c Show resolved Hide resolved
cmd/zdb/zdb.c Show resolved Hide resolved
@pcd1193182
Copy link
Contributor Author

The PR has been rebase on top of master

@behlendorf
Copy link
Contributor

@pcd1193182 while I work my way though reviewing and testing this, it looks like the CI caught at least one memory leak which needs to be investigated. I should be able to get you some feedback by mid next week.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great and works as advertised. Though I think adding a briefly high level description of conceptually how to use the feature would be tremendously helpful. We already have sections in zfs(8) for snapshots, bookmarks, clones, encryption, deduplication, etc.., adding another for redaction wouldn't be unreasonable, Starting with a version of the text from the PR's motivation and description sections would go a long way towards explaining the feature.

Personally, while I read though the documentation for the feature multiple times, and was already familiar with the design, it wasn't until I actually started looking at the test cases that I felt I understood how to use it. Given that, I think we should also strongly consider adding at least one well documented example to the zfs(8) man page. Something as simple as redacting a targeted byte range in a single file would go a long way towards making this feature less intimidating and easier to start using.

Also what do you think about clearly stating in the man page that the granularity of redaction occurs on a per-block basis. Blocks cannot be partially redacted. Unless you're already familiar with the internals of ZFS I don't think this would necessarily be intuitive.

Regarding the CI builders, they are all currently passing using the master branch with the exception of Fedora 30 (where we see one benign ZTS failure which still needs to be resolved). So you should expect to see pretty much everything passing, including the coverage builder which flagged a memory leak in the current version.

man/man5/zpool-features.5 Outdated Show resolved Hide resolved
man/man5/zpool-features.5 Outdated Show resolved Hide resolved
man/man5/zpool-features.5 Outdated Show resolved Hide resolved
man/man5/zpool-features.5 Outdated Show resolved Hide resolved
man/man5/zpool-features.5 Outdated Show resolved Hide resolved
tests/zfs-tests/tests/functional/redacted_send/cleanup.ksh Outdated Show resolved Hide resolved
tests/zfs-tests/cmd/stride_dd/stride_dd.c Outdated Show resolved Hide resolved
@pcd1193182
Copy link
Contributor Author

what exactly does this mean? how on earth can blocks accidentally or unnecessarily be modified?

The phrasing here could probably be clearer. The idea is that when doing redaction, there is some data that you care about hiding; this is the data you are attempting to redact. However, because of the nature of the data or the surroundings of the data or some quirk of the tools being used, other blocks may be changed when you modify that data For example, if the data being masked is stored in a database, automatic operations on the database may result in other blocks of the database being modified before a redaction snapshot is taken. If a later version of the masked data doesn't have that automatic operation occur, then those blocks would not be redacted in the later redaction bookmark. In that case, those blocks would be rehydrated when doing an incremental that redacts with that bookmark.

Or it could simply be an accident; if you delete a password in a config file, every block after it will change because the text gets shifted. If you, in a later redaction operation, overwrite the password with garbage of the same length, only that block would be redacted.

@pcd1193182
Copy link
Contributor Author

@pcd1193182 while I work my way though reviewing and testing this, it looks like the CI caught at least one memory leak which needs to be investigated. I should be able to get you some feedback by mid next week.

Found and fixed the memory leak, will update the PR once I have a passing test run.

@codecov
Copy link

codecov bot commented Jun 14, 2019

Codecov Report

Merging #7958 into master will increase coverage by 0.19%.
The diff coverage is 77.2%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7958      +/-   ##
==========================================
+ Coverage   78.51%    78.7%   +0.19%     
==========================================
  Files         382      388       +6     
  Lines      117840   120221    +2381     
==========================================
+ Hits        92526    94625    +2099     
- Misses      25314    25596     +282
Flag Coverage Δ
#kernel 79.4% <83.75%> (+0.16%) ⬆️
#user 66.32% <32.98%> (-0.71%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b1b4ac2...351acbd. Read the comment docs.

@pcd1193182
Copy link
Contributor Author

Working on addressing these test failures now.

pcd1193182 and others added 11 commits June 14, 2019 10:47
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
When receiving a send stream that has unexpected features enabled (i.e.
a flag is set in `DMU_GET_FEATUREFLAGS(drr_versioninfo)` that is not in
`DMU_BACKUP_FEATURE_MASK`), libzfs will detect this and fail, printing
`cannot receive: stream has unsupported feature, feature flags = ...`.

However, if a process is using libzfs_core directly (i.e.
`lzc_receive*()`), or if libzfs and the kernel module are mismatched,
then the kernel may attempt to receive a stream with unsupported
features.

The problem is that the kernel is not explicitly checking
drr_versioninfo for unsupported features.  Typically this will cause the
receive the fail with EINVAL when it encounters a specific piece of
metadata that is not understood (e.g. a new record type).  However,
other, arbitrarily bad failure modes are also possible.

The solution is to make the kernel explicitly check drr_versioninfo for
unsupported features.  A new zfs_errno_t value was added, which the
kernel will return, and libzfs will print `cannot receive new filesystem
stream: the loaded zfs module does not support this operation. A reboot
may be required to enable this operation.`
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jun 19, 2019
@behlendorf behlendorf merged commit 30af21b into openzfs:master Jun 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested) Type: Feature Feature request or new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants