Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldp-syncd] Fix unexpected exception in snmp-subagent #64

Merged
merged 2 commits into from
Nov 29, 2023

Conversation

ZhaohuiS
Copy link
Contributor

@ZhaohuiS ZhaohuiS commented Nov 16, 2023

Issue

InterfaceNeighbor table does not show all interfaces information once in a large set of iterations. This table is populated by query lldpRemTable 1.0.8802.1.1.2.1.4.1, when the issue occurs the query to this OID will no return any result and hence the interfaceNeighbor will not be populated at that instance.

Exception message seen in SNMP :
<28>Oct 18 18:32:19 LON60-0101-0504-17T1 snmp#snmp-subagent [sonic_ax_impl] WARNING: Exception when updating lldpRemTable: 'lldp_rem_sys_cap_supported'
and

<28>Oct 18 16:57:19 LON60-0101-0504-17T1 snmp#snmp-subagent [sonic_ax_impl] WARNING: Exception when updating lldpRemTable: 'lldp_rem_index'

Root cause:

SNMP periodically get the lldp remote neighbor information by reading LLDP_ENTRY_TABLE in APP_DB.
SNMP tries to get all information for each key in LLDP_ENTRY_TABLE, SNMP expects 3 keys to be present for each entry in LLDP_ENTRY_TABLE. lldp_rem_sys_cap_supported/lldp_rem_index and lldp_rem_sys_cap_enabled. During some iterations, SNMP does not find the expected keys in the table entry.

This is happening because the LLDP_ENTRY_TABLE can get deleted when there is any change in any of the entry. During the deletion/repopulation if SNMP tries to query the data, some data can be missing.

The constantly changed item is lldp_rem_time_mark when neighbors are stable. Others are quite fixed.
So based on the current code, it will delete all interfaces tables then add each item back one by one for each sync.
Even the operation is very fast, snmp subagent still has chance to read non-added key, since lldp sync interval is 10s.

How to fix it:

Separate deleted and changed scearios

  • deleted interfaces, delete their table key
  • changed interfaces
  1. if lldp_rem_time_mark is only changed key, just use hset its new value. Don't delete interface table key.
  2. If other keys are changed as well, delete interface key and use hmset to set the whole dict into DB, instead of using hset each item one by one. This will avoid interface exists but some of its keys are not added.

work item:

25645641

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
@StormLiangMS
Copy link

hi @ZhaohuiS could you update ADO in the description and run a test on top of 202305 with this PR?

@ZhaohuiS
Copy link
Contributor Author

ZhaohuiS commented Dec 1, 2023

hi @ZhaohuiS could you update ADO in the description and run a test on top of 202305 with this PR?

@StormLiangMS Update the description. I have already tested it on 202305 image. Ran test_snmp_lldp.py and test_lldp.py cases, also verify long time check if the number of lldp neighbors matches the number in snmp.

StormLiangMS pushed a commit that referenced this pull request Dec 5, 2023
Separate deleted and changed scearios
deleted interfaces, delete their table key
changed interfaces
if lldp_rem_time_mark is only changed key, just use hset its new value. Don't delete interface table key.
If other keys are changed as well, delete interface key and use hmset to set the whole dict into DB, instead of using hset each item one by one. This will avoid interface exists but some of its keys are not added.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
yxieca pushed a commit that referenced this pull request Dec 7, 2023
Separate deleted and changed scearios
deleted interfaces, delete their table key
changed interfaces
if lldp_rem_time_mark is only changed key, just use hset its new value. Don't delete interface table key.
If other keys are changed as well, delete interface key and use hmset to set the whole dict into DB, instead of using hset each item one by one. This will avoid interface exists but some of its keys are not added.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
ZhaohuiS added a commit to ZhaohuiS/sonic-dbsyncd that referenced this pull request Aug 1, 2024
Separate deleted and changed scearios
deleted interfaces, delete their table key
changed interfaces
if lldp_rem_time_mark is only changed key, just use hset its new value. Don't delete interface table key.
If other keys are changed as well, delete interface key and use hmset to set the whole dict into DB, instead of using hset each item one by one. This will avoid interface exists but some of its keys are not added.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
mssonicbld pushed a commit to mssonicbld/sonic-dbsyncd that referenced this pull request Aug 1, 2024
Separate deleted and changed scearios
deleted interfaces, delete their table key
changed interfaces
if lldp_rem_time_mark is only changed key, just use hset its new value. Don't delete interface table key.
If other keys are changed as well, delete interface key and use hmset to set the whole dict into DB, instead of using hset each item one by one. This will avoid interface exists but some of its keys are not added.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
@mssonicbld
Copy link

Cherry-pick PR to 202311: #68

mssonicbld pushed a commit that referenced this pull request Aug 1, 2024
Separate deleted and changed scearios
deleted interfaces, delete their table key
changed interfaces
if lldp_rem_time_mark is only changed key, just use hset its new value. Don't delete interface table key.
If other keys are changed as well, delete interface key and use hmset to set the whole dict into DB, instead of using hset each item one by one. This will avoid interface exists but some of its keys are not added.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants