rxe: fix completion queue consumer index overrun #1512

Merged (1 commit) on Nov 12, 2024

dragonJACson

Fix a bug in the CQ polling sequence where the consumer index can
incorrectly advance beyond available completions.

Consider this polling sequence:

```
    ibv_start_poll();
    ibv_next_poll();
    ibv_end_poll();
    ibv_start_poll();
```

With one completion in the queue, the indices would be:

1. After `ibv_start_poll()` (reading first CQE):
      P
 ┌──┬─┴┬──┬──┬──┐
 └─┬┴──┴──┴──┴──┘
   C

2. After `ibv_next_poll()` (returns `ENOENT`):
      P
 ┌──┬─┴┬──┬──┬──┐
 └──┴─┬┴──┴──┴──┘
      C

3. After `ibv_end_poll()`:
      P
 ┌──┬─┴┬──┬──┬──┐
 └──┴──┴─┬┴──┴──┘
         C

The issue occurs because `ibv_end_poll()` advances the consumer index
again even though `ibv_next_poll()` has already advanced it before
returning `ENOENT`. This moves the consumer index past the producer
index, leading to:

- False indication of available completions
- Reading of uninitialized completion entries

Fix this by checking for available completions before advancing the
consumer index in `ibv_next_poll()`.

Note: According to the man page, `ibv_end_poll()` must be called even
when `ibv_next_poll()` returns `ENOENT`, but the consumer index should
be advanced only once in this case.
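
The index arithmetic above can be sketched with a toy ring model. This
is not the rxe provider code: `struct ring`, `next_poll_buggy()`,
`next_poll_fixed()`, `end_poll()` and `run_sequence()` are hypothetical
names used only for illustration, and the model assumes a power-of-two
ring where "empty" means producer == consumer.

```c
#include <errno.h>
#include <stdbool.h>

#define RING_SIZE 8u /* assumed power of two so indices wrap with a mask */

/* Hypothetical stand-in for the CQ ring; not the rxe data structure. */
struct ring {
	unsigned producer; /* next slot the producer will fill */
	unsigned consumer; /* next slot the consumer will read */
};

static unsigned next_index(unsigned i)
{
	return (i + 1) & (RING_SIZE - 1);
}

static bool ring_empty(const struct ring *r)
{
	return r->producer == r->consumer;
}

/* Models ibv_start_poll(): succeeds only if an entry is available. */
static int start_poll(const struct ring *r)
{
	return ring_empty(r) ? ENOENT : 0;
}

/* Buggy variant: advance first, then notice the ring is empty. */
static int next_poll_buggy(struct ring *r)
{
	r->consumer = next_index(r->consumer);
	return ring_empty(r) ? ENOENT : 0;
}

/* Fixed variant: check for another entry before advancing. */
static int next_poll_fixed(struct ring *r)
{
	if (next_index(r->consumer) == r->producer)
		return ENOENT; /* consumer index left untouched */
	r->consumer = next_index(r->consumer);
	return 0;
}

/* Models ibv_end_poll(): releases the entry currently being read. */
static void end_poll(struct ring *r)
{
	r->consumer = next_index(r->consumer);
}

/*
 * Runs start/next/end/start with one completion queued.
 * Returns the result of the second start_poll() and stores the
 * final consumer index in *consumer_out.
 */
static int run_sequence(int (*next_poll)(struct ring *),
			unsigned *consumer_out)
{
	struct ring r = { .producer = 1, .consumer = 0 };

	start_poll(&r);
	next_poll(&r);
	end_poll(&r);

	*consumer_out = r.consumer;
	return start_poll(&r);
}
```

In this model, `run_sequence(next_poll_buggy, &c)` leaves the consumer
at index 2, past the producer at index 1, so the second `start_poll()`
reports a completion that does not exist. With `next_poll_fixed` the
consumer stops at index 1 and the second `start_poll()` returns
`ENOENT`.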

Signed-off-by: Luke Yue <lukedyue@gmail.com>
@rleon merged commit 69ac14f into linux-rdma:master on Nov 12, 2024
14 checks passed