add saiserver docker for mlnx sn2700 platform #12
RUN apt-get update

COPY deps /deps
More specific files in deps/
fixed
Ethernet29 112,113,114,115
Ethernet30 116,117,118,119
Ethernet31 120,121,122,123
Ethernet32 124,125,126,127
Add a newline at the end?
I prefer not to have a newline at the end.
@@ -0,0 +1 @@
SAI_INIT_CONFIG_FILE=/usr/share/sai_2700.xml
Add a newline at the end?
<root>
<platform_info type="2700">

<!-- Device MAC address -->
Replace tabs with spaces?
This file is from mlnx; I do not want to change it.
function clean_up {
    service rsyslog stop
    exit
No need to 'exit'
start_mlnx()
{
    mkdir -p /dev/sxdevs
Move after '-e' test? Or you may create the folder in Dockerfile.
that's what syncd does.
moved
Use single start script for all platforms and remove symbolic links

Change path to system eeprom

Signed-off-by: marian-pritsak <marianp@mellanox.com>
- Merge pull request sonic-net#18 from yxieca/no_buffering
- Revert "Pep 8 compliance, code cleanup (sonic-net#15)" (sonic-net#16)
- Pep 8 compliance, code cleanup (sonic-net#15)
- add detailed comments for get_transceiver_change_event (sonic-net#12)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
[sonic_cli]: Fix bash completion for 'show' command
* [sonic-head.yang]: Minor modification for enumeration of ip-type in ACL yang models.
  [sonic-vlan.yang]: modify vlan table key from vlanid(int) to vlan_name(string).
  [yangModelTesting.py] Fix Test Code and JSON input.
* [sonic-acl.yang]: Present Enumeration similar to config DB.
* [sonic-head.yang]: Minor update in enumeration
Updated the hw-mgmt pointer to include some bugfixes related to power supply voltages.
May 5 21:35:16.852142 sonic ERR snmp#snmp-subagent [sonic_ax_impl] ERROR: Uncaught exception in sonic_ax_impl.main
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/sonic_ax_impl/main.py", line 70, in main
    event_loop.run_until_complete(agent.run_in_event_loop())
  File "/usr/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.7/dist-packages/ax_interface/agent.py", line 37, in run_in_event_loop
    background_task = self.mib_table.start_background_tasks(self.oid_updaters_enabled)
  File "/usr/local/lib/python3.7/dist-packages/ax_interface/mib.py", line 276, in start_background_tasks
    task = event._loop.create_task(fut)
  File "/usr/lib/python3.7/asyncio/base_events.py", line 405, in create_task
    task = tasks.Task(coro, loop=self)
TypeError: a coroutine was expected, got <Task pending coro=<MIBUpdater.start() running at /usr/local/lib/python3.7/dist-packages/ax_interface/mib.py:34> cb=[MIBTable._done_background_task_callback() at /usr/local/lib/python3.7/dist-packages/ax_interface/mib.py:263]>

We really do not need to wrap the future within a task. This also addresses the issue where the subagent would not exit on SIGTERM.
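As an illustration of the point about not wrapping a future in a task, here is a minimal sketch (not the actual ax_interface code). `updater` and `schedule` are hypothetical names. It relies on `asyncio.ensure_future`, which wraps a bare coroutine in a new Task but returns an existing Task unchanged, so nothing gets wrapped twice.

```python
import asyncio


async def updater():
    """Hypothetical stand-in for a background updater coroutine."""
    await asyncio.sleep(0)
    return "done"


def schedule(loop, awaitable):
    """Schedule a coroutine or an already-created Task on `loop`.

    ensure_future() wraps a bare coroutine in a Task and hands back an
    existing Task/Future as-is, unlike loop.create_task(), which only
    accepts coroutines.
    """
    return asyncio.ensure_future(awaitable, loop=loop)


def demo():
    loop = asyncio.new_event_loop()
    try:
        first = schedule(loop, updater())  # bare coroutine: a new Task is created
        second = schedule(loop, first)     # already a Task: returned unchanged
        result = loop.run_until_complete(second)
        return first is second, result
    finally:
        loop.close()
```

Calling `demo()` schedules the same updater twice and still ends up with a single Task.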
**- What I did**
A 'key not found' exception will be raised in bgp4.py if the state for a given neighbor is not found in STATE_DB.
```
ERR snmp#snmp-subagent [ax_interface] ERROR: MIBUpdater.start() caught an unexpected exception during update_data()
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/ax_interface/mib.py", line 43, in start
    self.update_data()
  File "/usr/local/lib/python3.7/dist-packages/sonic_ax_impl/mibs/vendor/cisco/bgp4.py", line 42, in update_data
    state = neigh_info['state']
  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 345, in __getitem__
    return _swsscommon.FieldValueMap___getitem__(self, key)
IndexError: key not found
```
This happens because ```get_all``` returns an empty ```dict``` when nothing is found for the given key, so a check for ```None``` cannot detect the missing entry.

**- How I did it**
This commit addresses the issue by checking for the key ```state```.

**- How to verify it**
Verified on A7260. No exception is observed after the update.

**- Description for the changelog**
This PR fixes the exception caused by a non-existent key.
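A minimal sketch of the check described above, using a plain `dict` as a stand-in for the value returned by `get_all` (an assumption; the real return type comes from swsscommon). `neighbor_state` is a hypothetical helper, not the code in bgp4.py.

```python
def neighbor_state(neigh_info):
    """Return the neighbor state, or None when the entry has no 'state' key.

    An empty mapping is not None, so an `is None` test alone lets it through
    and the later neigh_info['state'] lookup fails. Testing for the key
    covers both the None case and the empty-mapping case.
    """
    if not neigh_info or "state" not in neigh_info:
        return None
    return neigh_info["state"]
```

A caller can then skip neighbors for which `neighbor_state(...)` returns `None` instead of letting the lookup raise.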
DHCP relay enhancements for B+
…onic-net#1215)

route_check.py reports an ERROR in syslog if a route mismatch is found, which is outside the control of the monit config file. This commit adds an option (-s) to control whether the error is reported in syslog.

**- How to verify it**
The update is verified on Arista-7260.
1. Add a static route whose nexthop is not reachable.
```
ip route add 1.1.1.1 via 192.168.1.101
```
2. Run ```route_check.py```; the error message is printed only on stdout. Nothing is written to syslog.
3. Run ```route_check.py -s```; the error message is written to both stdout and syslog.
4. Wait for 15 minutes, and confirm that monit reports the error.
```
Nov 4 09:30:36.917367 str-7260cx3-acs-2 ERR monit[631]: 'routeCheck' status failed (255) -- results: { {
 "missed_ROUTE_TABLE_routes": [
 "1.1.1.1/32"
 ]
} }
 Failed. Look at reported mismatches above
```

Signed-off-by: bingwang <bingwang@microsoft.com>
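The opt-in behaviour of `-s` can be sketched roughly as follows. This is not the route_check.py implementation: the long option name `--log_to_syslog`, the `report` helper, and the injected writer callables are all assumptions made to keep the example self-contained.

```python
import argparse


def parse_args(argv):
    """Parse the command line; -s opts in to syslog reporting."""
    parser = argparse.ArgumentParser(description="Route mismatch check (sketch)")
    # Hypothetical long name; only the short flag -s is taken from the commit.
    parser.add_argument("-s", "--log_to_syslog", action="store_true",
                        help="also report mismatches to syslog")
    return parser.parse_args(argv)


def report(message, to_syslog, stdout_writer, syslog_writer):
    """Always write to stdout; write to syslog only when requested."""
    stdout_writer(message)
    if to_syslog:
        syslog_writer(message)
```

In a real script, `stdout_writer` could be `print` and `syslog_writer` could wrap `syslog.syslog(syslog.LOG_ERR, message)`.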
This commit fixes the exception thrown by struct.pack when attempting to pack a unicode string.
The script ```fast-reboot-dump.py``` throws an exception in python3 because ```struct.pack``` requires ```bytes``` for ```s```.
```
Traceback:
Traceback (most recent call last):
  File "/usr/local/bin/fast-reboot-dump.py", line 299, in <module>
    res = main()
  File "/usr/local/bin/fast-reboot-dump.py", line 292, in main
    send_garp_nd(neighbor_entries, map_mac_ip_per_vlan)
  File "/usr/local/bin/fast-reboot-dump.py", line 221, in send_garp_nd
    src_ip_addrs = {vlan_name:get_iface_ip_addr(vlan_name) for vlan_name,_,_ in neighbor_entries}
  File "/usr/local/bin/fast-reboot-dump.py", line 221, in <dictcomp>
    src_ip_addrs = {vlan_name:get_iface_ip_addr(vlan_name) for vlan_name,_,_ in neighbor_entries}
  File "/usr/local/bin/fast-reboot-dump.py", line 195, in get_iface_ip_addr
    return get_if(iff, SIOCGIFADDR)[20:24]
  File "/usr/local/bin/fast-reboot-dump.py", line 185, in get_if
    ifreq = ioctl(s, cmd, struct.pack("16s16x",iff))
struct.error: argument for 's' must be a bytes object
```

Signed-off-by: bingwang <bingwang@microsoft.com>
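For reference, a small sketch of the bytes requirement. `pack_ifreq` is a hypothetical helper, not a function from fast-reboot-dump.py, and encoding the name as UTF-8 is an assumption; the format string `"16s16x"` is the one shown in the traceback.

```python
import struct


def pack_ifreq(ifname):
    """Pack an interface name into the 32-byte buffer used for the ioctl.

    In Python 3 the 's' format only accepts bytes, so a str name is
    encoded first; bytes input is passed through unchanged.
    """
    if isinstance(ifname, str):
        ifname = ifname.encode("utf-8")
    return struct.pack("16s16x", ifname)
```

The result is always 32 bytes: the name padded with NUL bytes to 16, followed by 16 pad bytes.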
Update cisco-8000.ini to use 202205-v0.1
* Extended build system with flags for the FRR submodule
  INCLUDE_FRR_BGP
  INCLUDE_FRR_BFD
  INCLUDE_FRR_PBR
  INCLUDE_FRR_VRRP
  INCLUDE_FRR_OSPF
* Print FRR options during the build
* Update frr.mk
…nic-net#12)

YGOT:-
To improve the performance of the Unmarshal and EmitJSON methods:
Cache the schema information per path instead of finding the schema for the path every time.
Disabled the validation of the response while marshaling the ygot object.
Added a debug check so that debug logs are generated/printed only when the debug flag is enabled.
Fixes in the Unmarshal method for leaf-list nodes and for leaves containing a union that has an enum as one of its types.
Added changes in the Unmarshal method to throw an error if the request payload has state information.

Request Binder:-
Added the method validateObjectType to check and throw an error if the given request payload contains state information.
Added changes to skip validation of the request payload if the model of the payload is sonic yang, since CVL validates sonic yang model requests.

Package update:-
Updated the ygot package to the version v0.7.1
Updated the goyang package to the version v0.0.0-20200309174518-a00bece872fc
Updated the gnmi package to the version v0.0.0-20200307010808-e7106f7f5493
This patch is a backport from linux 4.13
#### Why I did it
To fix errors that happen when writing to the queue:
```
Jun 5 23:04:41.798613 r-leopard-56 NOTICE healthd: Caught SIGTERM - exiting...
Jun 5 23:04:41.798985 r-leopard-56 NOTICE healthd: Caught SIGTERM - exiting...
Jun 5 23:04:41.799535 r-leopard-56 NOTICE healthd: Caught SIGTERM - exiting...
Jun 5 23:04:41.806010 r-leopard-56 NOTICE healthd: Caught SIGTERM - exiting...
Jun 5 23:04:41.814075 r-leopard-56 ERR healthd: system_service[Errno 104] Connection reset by peer
Jun 5 23:04:41.824135 r-leopard-56 ERR healthd: Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/health_checker/sysmonitor.py", line 484, in system_service
    msg = self.myQ.get(timeout=QUEUE_TIMEOUT)
  File "<string>", line 2, in get
  File "/usr/lib/python3.9/multiprocessing/managers.py", line 809, in _callmethod
    kind, result = conn.recv()
  File "/usr/lib/python3.9/multiprocessing/connection.py", line 255, in recv
    buf = self._recv_bytes()
  File "/usr/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.9/multiprocessing/connection.py", line 384, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
Jun 5 23:04:41.826489 r-leopard-56 INFO healthd[8494]: ERROR:dbus.connection:Exception in handler for D-Bus signal:
Jun 5 23:04:41.826591 r-leopard-56 INFO healthd[8494]: Traceback (most recent call last):
Jun 5 23:04:41.826640 r-leopard-56 INFO healthd[8494]: File "/usr/lib/python3/dist-packages/dbus/connection.py", line 232, in maybe_handle_message
Jun 5 23:04:41.826686 r-leopard-56 INFO healthd[8494]: self._handler(*args, **kwargs)
Jun 5 23:04:41.826738 r-leopard-56 INFO healthd[8494]: File "/usr/local/lib/python3.9/dist-packages/health_checker/sysmonitor.py", line 82, in on_job_removed
Jun 5 23:04:41.826785 r-leopard-56 INFO healthd[8494]: self.task_notify(msg)
Jun 5 23:04:41.826831 r-leopard-56 INFO healthd[8494]: File "/usr/local/lib/python3.9/dist-packages/health_checker/sysmonitor.py", line 110, in task_notify
Jun 5 23:04:41.826877 r-leopard-56 INFO healthd[8494]: self.task_queue.put(msg)
Jun 5 23:04:41.826923 r-leopard-56 INFO healthd[8494]: File "<string>", line 2, in put
Jun 5 23:04:41.826973 r-leopard-56 INFO healthd[8494]: File "/usr/lib/python3.9/multiprocessing/managers.py", line 808, in _callmethod
Jun 5 23:04:41.827018 r-leopard-56 INFO healthd[8494]: conn.send((self._id, methodname, args, kwds))
Jun 5 23:04:41.827065 r-leopard-56 INFO healthd[8494]: File "/usr/lib/python3.9/multiprocessing/connection.py", line 211, in send
Jun 5 23:04:41.827115 r-leopard-56 INFO healthd[8494]: self._send_bytes(_ForkingPickler.dumps(obj))
Jun 5 23:04:41.827158 r-leopard-56 INFO healthd[8494]: File "/usr/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
Jun 5 23:04:41.827199 r-leopard-56 INFO healthd[8494]: self._send(header + buf)
Jun 5 23:04:41.827254 r-leopard-56 INFO healthd[8494]: File "/usr/lib/python3.9/multiprocessing/connection.py", line 373, in _send
Jun 5 23:04:41.827322 r-leopard-56 INFO healthd[8494]: n = write(self._handle, buf)
Jun 5 23:04:41.827368 r-leopard-56 INFO healthd[8494]: BrokenPipeError: [Errno 32] Broken pipe
Jun 5 23:04:42.800216 r-leopard-56 NOTICE healthd: Caught SIGTERM - exiting...
```
When the multiprocessing.Manager is shut down, the queue raises the above errors. This happens during shutdown (fast-reboot, warm-reboot).

With the fix, system-health service does not hang:
```
root@sonic:/home/admin# sudo systemctl start system-health ; sleep 10; echo "$(date): Stopping..."; sudo systemctl stop system-health; echo "$(date): Stopped"
Thu Oct 17 01:07:56 PM IDT 2024: Stopping...
Thu Oct 17 01:07:58 PM IDT 2024: Stopped
root@sonic:/home/admin# sudo systemctl start system-health ; sleep 10; echo "$(date): Stopping..."; sudo systemctl stop system-health; echo "$(date): Stopped"
Thu Oct 17 01:08:13 PM IDT 2024: Stopping...
Thu Oct 17 01:08:14 PM IDT 2024: Stopped
root@sonic:/home/admin# sudo systemctl start system-health ; sleep 10; echo "$(date): Stopping..."; sudo systemctl stop system-health; echo "$(date): Stopped"
Thu Oct 17 01:09:05 PM IDT 2024: Stopping...
Thu Oct 17 01:09:06 PM IDT 2024: Stopped
```

#### How I did it
Remove the call to shutdown; the cleanup will happen automatically when GC runs, as per the documentation - https://docs.python.org/3/library/multiprocessing.html

#### How to verify it
Run warm-reboot and fast-reboot multiple times and verify there are no errors in the log.

#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [x] 202205
- [x] 202311
- [x] 202405
To fix a statistical issue. The original fix was done in FRRouting/frr#17297. However, to accommodate 8.5.4, the patch in the PR was added.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4 0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5 0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6 route_next (node=<optimized out>) at ../lib/table.c:436
#7 route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8 0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020") at ../zebra/interface.c:312
#9 0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
Adding the below fix from FRR: FRRouting/frr#17297

This is to fix the following crash, which is a statistical issue:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4 0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5 0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6 route_next (node=<optimized out>) at ../lib/table.c:436
#7 route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8 0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020") at ../zebra/interface.c:312
#9 0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478