Fix #806: Reduce count of IPI during work_queue processing. #1072
Conversation
The PR design seems robust and good; only a couple of verifications/additions are required.

tempesta_fw/cache.c (outdated)

	set_bit(TFW_QUEUE_B_IPI, &wq->flags);

	smp_mb__after_atomic();
OK, here we set the IPI flag before checking the queue size, and in tfw_wq_push() we check the queue size before the flag - the synchronization seems safe. We need the barrier to make the queue size and the flag dependent. However, it seems the barrier pairing rule is broken here: strictly speaking, there is no guarantee that the queue operation won't be reordered with the flag test in tfw_wq_push(). I see that __tfw_wq_push() uses at least an atomic set, but it seems we still need an _after_atomic() barrier in tfw_wq_push() now.
Corrected.
tempesta_fw/sock.c (outdated)

	if (!ss_close_q_sz())
		return;

	clear_bit(TFW_QUEUE_B_IPI, &wq->flags);
This is a copy of part of tfw_wq_tasklet() - maybe make a macro with the queue size function as a parameter, or a function accepting a queue-size callback?
Done.
@@ -208,13 +211,14 @@ ss_turnstile_push(long ticket, SsWork *sw, int cpu)
 	cn->ticket = ticket;
 	memcpy(&cn->sw, sw, sizeof(*sw));
 	spin_lock_bh(&cb->lock);
-	list_add(&cn->list, &cb->head);
+	list_add_tail(&cn->list, &cb->head);
 	cb->size++;
 	if (cb->turn > ticket)
 		cb->turn = ticket;
 	spin_unlock_bh(&cb->lock);
OK, we don't need a barrier after the spinlock operation, but it would be good to have a comment here in case the spinlock is ever removed.
Done.
There is still some fog with the barrier...
@@ -77,6 +86,11 @@ tfw_wq_push(TfwRBQueue *q, void *ptr, int cpu, struct irq_work *work,
 	long ticket = __tfw_wq_push(q, ptr);
 	if (unlikely(ticket))
 		return ticket;
+	/*
+	 * The atomic operation is 'atomic64_cmpxchg()' in
+	 * '__tfw_wq_push()' above.
Why? atomic64_cmpxchg() is called very rarely, only on the slow path. On the fast path there are just atomic reads and writes: in the good case there is a write barrier, but in the bad case there are no barriers at all.
The fast path takes place only when the queue is full and tfw_wq_push() exits with a ticket - before IPI generation. When a new element is inserted into the queue, atomic64_cmpxchg() must be called.
tempesta_fw/work_queue.h (outdated)

	} TfwRBQueue;

	enum {
		/* Enable IPI generation. */
		TFW_QUEUE_B_IPI = 0
Probably we don't need the _B_ suffix (meaning "bit") here. For other flags we historically used _B_ and _F_ suffixes to distinguish between bits and flags, but now we tend to use the common bitmap API instead of "manual" bit operations. Removing the _B_ suffix will synchronise the code with upcoming patches.
Corrected.