ocfs2: fix DIO failure due to insufficient transaction credits
commit be346c1 upstream.

The code in ocfs2_dio_end_io_write() estimates the number of necessary
transaction credits using ocfs2_calc_extend_credits().  This, however,
does not take into account that the IO could be arbitrarily large and can
contain an arbitrary number of extents.

Extent tree manipulations often extend the current transaction, but not
in all cases.  For example, if we have only single-block extents in the
tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() every time, so we never extend the current
transaction and eventually exhaust all the transaction credits if the IO
contains many single-block extents.  Once that happens, a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and OCFS2 subsequently aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue, make sure the transaction always has enough credits for
one extent insert before each call to ocfs2_mark_extent_written().
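
The failure mode and the fix can be illustrated with a small userspace
model. This is only a sketch under simplifying assumptions, not kernel
code: toy_handle, toy_assure_credits(), toy_mark_extent_written() and
toy_dio_end_io_write() are hypothetical names, each extent conversion is
assumed to consume a fixed number of credits, and extending the handle is
assumed to always succeed.

```c
#include <assert.h>

/* Toy stand-in for a jbd2 handle: tracks only the remaining credits. */
struct toy_handle {
	int credits;
};

/* Hypothetical analogue of the new helper: top the handle up to 'need'. */
static int toy_assure_credits(struct toy_handle *h, int need)
{
	if (h->credits >= need)
		return 0;
	h->credits = need;	/* models a successful extend by the shortfall */
	return 0;
}

/*
 * Hypothetical analogue of one extent conversion that dirties 'cost'
 * metadata blocks and never extends the handle on its own.
 * Returns -1 once the handle no longer has enough credits.
 */
static int toy_mark_extent_written(struct toy_handle *h, int cost)
{
	if (h->credits < cost)
		return -1;
	h->credits -= cost;
	return 0;
}

/*
 * Convert 'nr_extents' single-block extents starting with 'initial'
 * credits. When 'assure' is nonzero, credits are topped up before each
 * conversion. Returns the number of extents converted before failure.
 */
int toy_dio_end_io_write(int nr_extents, int initial, int cost, int assure)
{
	struct toy_handle h = { .credits = initial };
	int done = 0;

	assert(cost > 0 && initial >= cost);
	for (int i = 0; i < nr_extents; i++) {
		if (assure && toy_assure_credits(&h, initial) < 0)
			break;
		if (toy_mark_extent_written(&h, cost) < 0)
			break;
		done++;
	}
	return done;
}
```

In this model, a fixed up-front reservation covers only initial / cost
conversions, while topping up before every conversion lets the loop
handle any number of extents.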

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
 #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
 #11 dio_complete at ffffffff8c2b9fa7
 #12 do_blockdev_direct_IO at ffffffff8c2bc09f
 #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
 #14 generic_file_direct_write at ffffffff8c1dcf14
 #15 __generic_file_write_iter at ffffffff8c1dd07b
 #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
 #17 aio_write at ffffffff8c2cc72e
 #18 kmem_cache_alloc at ffffffff8c248dde
 #19 do_io_submit at ffffffff8c2ccada
 #20 do_syscall_64 at ffffffff8c004984
 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jankara authored and gregkh committed Jul 5, 2024
1 parent a4f9251 commit 9ea2d1c
Showing 4 changed files with 26 additions and 0 deletions.
5 changes: 5 additions & 0 deletions fs/ocfs2/aops.c
@@ -2370,6 +2370,11 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
 	}

 	list_for_each_entry(ue, &dwc->dw_zero_list, ue_node) {
+		ret = ocfs2_assure_trans_credits(handle, credits);
+		if (ret < 0) {
+			mlog_errno(ret);
+			break;
+		}
 		ret = ocfs2_mark_extent_written(inode, &et, handle,
 						ue->ue_cpos, 1,
 						ue->ue_phys,
17 changes: 17 additions & 0 deletions fs/ocfs2/journal.c
@@ -447,6 +447,23 @@ int ocfs2_extend_trans(handle_t *handle, int nblocks)
 	return status;
 }

+/*
+ * Make sure handle has at least 'nblocks' credits available. If it does not
+ * have that many credits available, we will try to extend the handle to have
+ * enough credits. If that fails, we will restart transaction to have enough
+ * credits. Similar notes regarding data consistency and locking implications
+ * as for ocfs2_extend_trans() apply here.
+ */
+int ocfs2_assure_trans_credits(handle_t *handle, int nblocks)
+{
+	int old_nblks = jbd2_handle_buffer_credits(handle);
+
+	trace_ocfs2_assure_trans_credits(old_nblks);
+	if (old_nblks >= nblocks)
+		return 0;
+	return ocfs2_extend_trans(handle, nblocks - old_nblks);
+}
+
 /*
  * If we have fewer than thresh credits, extend by OCFS2_MAX_TRANS_DATA.
  * If that fails, restart the transaction & regain write access for the
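
ocfs2_assure_trans_credits() extends the handle only by the shortfall
(nblocks - old_nblks) rather than by nblocks on every call. The userspace
sketch below illustrates what that arithmetic means for the total number
of credits requested over a loop. It is an illustration only:
credit_model, model_extend(), model_assure() and model_total_requested()
are hypothetical names, every operation is assumed to consume a fixed
cost, and the transaction restart that ocfs2_extend_trans() may perform
is not modelled.

```c
#include <assert.h>

/* Toy handle: remaining credits plus a running total of credits requested. */
struct credit_model {
	int credits;
	int requested;
};

/* Models an unconditional extend: always asks for 'n' more credits. */
void model_extend(struct credit_model *h, int n)
{
	assert(n >= 0);
	h->credits += n;
	h->requested += n;
}

/* Models the assure pattern: ask only for the shortfall, if there is one. */
void model_assure(struct credit_model *h, int need)
{
	if (h->credits >= need)
		return;
	model_extend(h, need - h->credits);
}

/*
 * Run 'iters' operations, each consuming 'cost' credits, on a handle that
 * starts with 'need' credits. Returns the total credits requested,
 * including the initial reservation.
 */
int model_total_requested(int iters, int need, int cost, int use_assure)
{
	struct credit_model h = { .credits = need, .requested = need };

	for (int i = 0; i < iters; i++) {
		if (use_assure)
			model_assure(&h, need);
		else
			model_extend(&h, need);
		h.credits -= cost;
	}
	return h.requested;
}
```

In this model, extending by the full amount before every operation makes
the request total grow by 'need' per iteration, while the assure pattern
grows it only by what the previous operation actually consumed.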
2 changes: 2 additions & 0 deletions fs/ocfs2/journal.h
@@ -243,6 +243,8 @@ handle_t *ocfs2_start_trans(struct ocfs2_super *osb,
 int ocfs2_commit_trans(struct ocfs2_super *osb,
 		handle_t *handle);
 int ocfs2_extend_trans(handle_t *handle, int nblocks);
+int ocfs2_assure_trans_credits(handle_t *handle,
+		int nblocks);
 int ocfs2_allocate_extend_trans(handle_t *handle,
 		int thresh);

2 changes: 2 additions & 0 deletions fs/ocfs2/ocfs2_trace.h
@@ -2578,6 +2578,8 @@ DEFINE_OCFS2_ULL_UINT_EVENT(ocfs2_commit_cache_end);

 DEFINE_OCFS2_INT_INT_EVENT(ocfs2_extend_trans);

+DEFINE_OCFS2_INT_EVENT(ocfs2_assure_trans_credits);
+
 DEFINE_OCFS2_INT_EVENT(ocfs2_extend_trans_restart);

 DEFINE_OCFS2_INT_INT_EVENT(ocfs2_allocate_extend_trans);