-
Notifications
You must be signed in to change notification settings - Fork 278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle FDB meta data together with FDB data in fdborch to avoid race … #441
Conversation
…condition Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
} | ||
} | ||
|
||
void handle_meta_fdb_event( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where this is called ?
can you give scenario where you get race condition ? |
issue #440 |
you moved some functionality to handle_meta_fdb_event, but this function is not called anywhere, basically you remove meta_ call which is required, and issues in OA are caused because OA is listening on redis notifications for fdb, and not handling them in SAI FDB callback ! https://github.com/Azure/sonic-swss/blob/master/orchagent/notifications.cpp#L8 which is totally wrong !, "concurency" in the DB can be solved other way! OA should use SAI callback to handle fdb notifications, this is only valid way to handle them |
@jipanyang , can you check the comments here? |
handle_meta_fdb_event() is used in sonic-net/sonic-swss#824 |
@kcudnik Not sure if I got your point. The current ntf_thread() is also a thread in OA, both fdborch in the main orchagent thread and ntf_thread operate on fdb meta data, there is race condition between them. |
there always will be race condition on 2 indepentent processes. interprocess synchronization could be used here. you can use lock on ntf_callback from SAI when it will be used, but this does not guarantee anything, since more notifications can be in the redis queue waiting to be consumed by OA and again, you moved some functionality and not using using which opens possible race condition |
@kcudnik I think you are talking about the notification callback in syncd process. We are not touching that part. |
no, im talking about notification callback in OA, take a look here: i already pointed this in my previous comment |
retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please address commments
The comment is correct. While the change in meta_fdb_event_snoop_oid() simply conflicts with what is stated in the comment, in orchangent, now you have two threads handling the FDB related meta data reference. Please check #440 . fdb is asynchronous. Having them processed in one thread in orchagent doesn't prevent possible pending notification in syncd or the db channel, but that should not be the excuse for introducing further race condition at OA module. |
i already mentioned the same issue, it does not matter whether those pending notifications are in syncd or in internal vendors sai, always is issue of having them come ant any time. |
closing this as obsolete, code drifted to cpp objects instead of global functions, if there is need to still address this, please reopen |
…condition
Signed-off-by: Jipan Yang jipan.yang@alibaba-inc.com
Move FDB meta data data processing to fdborch. This a first step try to solve issue #440