Wait for txg sync if the last DRR_FREEOBJECTS might result in a hole
If we receive a DRR_FREEOBJECTS as the first entry in an object range,
this might end up producing a hole if the freed objects were the
only existing objects in the block.

If the txg starts syncing before we've processed any following
DRR_OBJECT records, this leads to a possible race where the backing
arc_buf_t gets its psize set to 0 in the arc_write_ready() callback
while still being referenced from a dirty record in the open txg.

To prevent this, we insert a txg_wait_synced call if the first
record in the range was a DRR_FREEOBJECTS that actually
resulted in one or more freed objects.
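Below is a minimal, self-contained sketch of the state machine this change introduces. Only the ORNS_* transitions mirror the diff further down; the struct layout, the on_*() helpers, and wait_for_txg_sync() are simplified stand-ins for the real receive path and should be read as hypothetical.

#include <stdbool.h>
#include <stdio.h>

/* Mirrors the enum added to dmu_recv.c; everything else here is a stub. */
typedef enum {
	ORNS_NO,	/* some other record already started this object range */
	ORNS_YES,	/* first record was DRR_FREEOBJECTS and it freed something */
	ORNS_MAYBE	/* fresh DRR_OBJECT_RANGE, nothing processed yet */
} or_need_sync_t;

struct receive_writer_arg {
	or_need_sync_t or_need_sync;
};

/* Stand-in for txg_wait_synced(dmu_objset_pool(rwa->os), 0). */
static void
wait_for_txg_sync(void)
{
	printf("txg_wait_synced: wait until the hole has synced out\n");
}

/* DRR_OBJECT_RANGE: arm the check for the range that follows. */
static void
on_object_range(struct receive_writer_arg *rwa)
{
	rwa->or_need_sync = ORNS_MAYBE;
}

/* DRR_FREEOBJECTS: remember it if it was first and actually freed objects. */
static void
on_freeobjects(struct receive_writer_arg *rwa, bool freed_something)
{
	if (freed_something && rwa->or_need_sync == ORNS_MAYBE)
		rwa->or_need_sync = ORNS_YES;
}

/* DRR_OBJECT: wait for the sync before dirtying the possibly-holed block. */
static void
on_object(struct receive_writer_arg *rwa)
{
	if (rwa->or_need_sync == ORNS_YES)
		wait_for_txg_sync();
	/* Only the first record in the range matters. */
	rwa->or_need_sync = ORNS_NO;
}

int
main(void)
{
	struct receive_writer_arg rwa = { .or_need_sync = ORNS_NO };

	/* The problematic stream: object range, freeobjects, then object. */
	on_object_range(&rwa);
	on_freeobjects(&rwa, true);
	on_object(&rwa);	/* triggers the wait */
	return (0);
}

In this sketch the wait fires exactly once for the sequence above; any other ordering never reaches ORNS_YES, so the wait is skipped.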

Signed-off-by: David Hedberg <david.hedberg@findity.com>
Sponsored by: Findity AB
Closes: openzfs#11893
dhedberg committed Jan 7, 2023
1 parent a0105f6 commit 70605f4
Showing 1 changed file with 26 additions and 0 deletions.
26 changes: 26 additions & 0 deletions module/zfs/dmu_recv.c
@@ -75,6 +75,12 @@ static int zfs_recv_best_effort_corrective = 0;
static const void *const dmu_recv_tag = "dmu_recv_tag";
const char *const recv_clone_name = "%recv";

typedef enum {
ORNS_NO,
ORNS_YES,
ORNS_MAYBE
} or_need_sync_t;

static int receive_read_payload_and_next_header(dmu_recv_cookie_t *ra, int len,
void *buf);

@@ -128,6 +134,9 @@ struct receive_writer_arg {
uint8_t or_mac[ZIO_DATA_MAC_LEN];
boolean_t or_byteorder;
zio_t *heal_pio;

/* Keep track of DRR_FREEOBJECTS right after DRR_OBJECT_RANGE */
or_need_sync_t or_need_sync;
};

typedef struct dmu_recv_begin_arg {
@@ -1903,10 +1912,22 @@ receive_object(struct receive_writer_arg *rwa, struct drr_object *drro,
/* object was freed and we are about to allocate a new one */
object_to_hold = DMU_NEW_OBJECT;
} else {
/*
* If the only record in this range so far was DRR_FREEOBJECTS
* with at least one actually freed object, it's possible that
* the block will now be converted to a hole. We need to wait
* for the txg to sync to prevent races.
*/
if (rwa->or_need_sync == ORNS_YES)
txg_wait_synced(dmu_objset_pool(rwa->os), 0);

/* object is free and we are about to allocate a new one */
object_to_hold = DMU_NEW_OBJECT;
}

/* Only relevant for the first object in the range */
rwa->or_need_sync = ORNS_NO;

/*
* If this is a multi-slot dnode there is a chance that this
* object will expand into a slot that is already used by
@@ -2100,6 +2121,9 @@ receive_freeobjects(struct receive_writer_arg *rwa,

if (err != 0)
return (err);

if (rwa->or_need_sync == ORNS_MAYBE)
rwa->or_need_sync = ORNS_YES;
}
if (next_err != ESRCH)
return (next_err);
@@ -2593,6 +2617,8 @@ receive_object_range(struct receive_writer_arg *rwa,
memcpy(rwa->or_mac, drror->drr_mac, ZIO_DATA_MAC_LEN);
rwa->or_byteorder = byteorder;

rwa->or_need_sync = ORNS_MAYBE;

return (0);
}

