Controller: Handle empty HTTP response and add restart functionality for ns-plug service #977

Merged
gsanchietti merged 4 commits into main from fix-vpn-ns-plug-restart
Jan 10, 2025

@stephdl stephdl commented Dec 13, 2024

Implement handling for empty responses in HTTP code extraction and introduce a restart command that triggers when the ns-plug service goes down. A new script for restarting the ns-plug service has also been added.

When the server does not answer, jq fails with errors like the following, because the response contains no .code field:

Dec 13 11:23:26 NethSec ns-plug[9737]: jq: error (at <stdin>:1): Cannot index number with string "code"
Dec 13 11:23:26 NethSec ns-plug[9737]: parse error: Invalid numeric literal at line 1, column 9
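A minimal sketch of one way to guard the extraction, assuming the response body is already held in a shell variable. The helper name extract_code is hypothetical and is not taken from the ns-plug script; it only illustrates skipping jq on an empty body and indexing .code only when the response is a JSON object:

```shell
# Hypothetical helper: print the "code" field of a JSON response.
# Prints nothing when the body is empty, is not valid JSON, or is not an object.
extract_code() {
    response="$1"
    # Empty body: the server did not answer, so there is nothing to parse.
    [ -z "$response" ] && return 0
    # Index .code only when the top-level value is an object.
    # jq parse errors are discarded and never abort the caller.
    printf '%s' "$response" \
        | jq -r 'if type == "object" then (.code // empty) else empty end' 2>/dev/null \
        || true
}
```

A caller can then treat an empty result as "no answer from the controller" rather than letting the jq error reach the log.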

When the tun-nsplug interface goes down, ns-plug is restarted. It then tries to reconnect to the server for about one minute before giving up.
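The restart-on-failure idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the script added by this PR: the function names, the IFACE and RESTART_CMD variables, and the use of /sys/class/net to detect the interface are all hypothetical. Only the interface name tun-nsplug and the /etc/init.d/ns-plug init script appear in this conversation.

```shell
# Hypothetical watchdog sketch. Names and defaults are assumptions.
IFACE="${IFACE:-tun-nsplug}"
RESTART_CMD="${RESTART_CMD:-/etc/init.d/ns-plug restart}"

iface_exists() {
    # On Linux, /sys/class/net/<name> exists only while the interface is present.
    [ -e "/sys/class/net/$1" ]
}

restart_if_down() {
    if iface_exists "$IFACE"; then
        echo "up"
    else
        echo "restarting"
        # Intentionally unquoted so the command and its argument are split.
        $RESTART_CMD
    fi
}
```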

#978

This PR tries to fix the following scenario:

  • ns8 cluster with one node
  • 100 units connected to the controller
  • the node where the controller runs has a hardware failure
  • new installation of a ns8 cluster
  • restore of the controller with the same FQDN as the old one
  • the units must reconnect automatically to the new restored controller

@stephdl stephdl changed the title Handle empty HTTP response and add restart functionality for ns-plug service Controller: Handle empty HTTP response and add restart functionality for ns-plug service Dec 13, 2024
@stephdl stephdl requested a review from gsanchietti December 13, 2024 14:33
@stephdl stephdl marked this pull request as draft December 16, 2024 11:15
@stephdl stephdl requested a review from gsanchietti December 16, 2024 15:16
@stephdl stephdl marked this pull request as ready for review December 16, 2024 15:16
@stephdl stephdl force-pushed the fix-vpn-ns-plug-restart branch from f6396c5 to 7e2d4bc Compare December 17, 2024 14:40
@stephdl stephdl requested a review from gsanchietti December 17, 2024 14:44

@filippocarletti filippocarletti left a comment

What happens on boot if the controller is down?
What is the output of /etc/init.d/ns-plug status?

Make sure that ns-plug always tries to connect to the controller.
This fix will allow automatic reconnection in case
of disaster recovery of the remote controller.
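The "always tries to connect" behaviour can be sketched as a retry loop with no upper bound. This is a hedged illustration, not the code merged in this PR: wait_for_controller and RETRY_DELAY are hypothetical names, and the 8 second default is only an assumption based on the spacing of the log lines in this conversation. The log message is copied from those logs.

```shell
# Hypothetical sketch: keep retrying until the connection attempt succeeds.
RETRY_DELAY="${RETRY_DELAY:-8}"

wait_for_controller() {
    # "$@" is the command that attempts the connection.
    # The loop has no upper bound, so the unit never gives up.
    until "$@"; do
        echo "Connection failed. Waiting for the controller ..."
        sleep "$RETRY_DELAY"
    done
}
```

With no retry limit, a unit left running during a controller outage reconnects on its own once a restored controller answers at the same FQDN.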
@gsanchietti gsanchietti force-pushed the fix-vpn-ns-plug-restart branch from 3da22c0 to b9a581c Compare December 18, 2024 10:45
stephdl commented Dec 18, 2024

If I turn off the NS8 node and restart NethSecurity, I get:

root@NethSec:~# /etc/init.d/ns-plug status
running
root@NethSec:~# ps aux | grep ns-plug
root      3196  0.0  0.0   2188  1700 ?        S    10:55   0:00 /bin/bash /usr/sbin/ns-plug
root      6136  0.0  0.0   1604   912 pts/0    S+   10:55   0:00 grep ns-plug
root@NethSec:~# less /var/log/messages 
root@NethSec:~# tail -f  /var/log/messages 
Dec 18 10:55:31 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:55:39 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:55:43 NethSec : cannot connect to 172.28.69.1:20026: Connection refused [v8.2110.0 try https://www.rsyslog.com/e/2027 ]
Dec 18 10:55:47 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:55:55 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:56:03 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:56:12 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:56:13 NethSec : cannot connect to 172.28.69.1:20026: Connection refused [v8.2110.0 try https://www.rsyslog.com/e/2027 ]
Dec 18 10:56:20 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:56:28 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:56:36 NethSec ns-plug: Connection failed. Waiting for the controller ...

ns-plug is running.
The log shows that it keeps waiting for a connection indefinitely.
The VPN is not started.

When I turn the NS8 node back on, I get:

root@NethSec:~# tail -f  /var/log/messages 
Dec 18 10:57:17 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:57:25 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:57:33 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:57:41 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:57:43 NethSec : cannot connect to 172.28.69.1:20026: Connection refused [v8.2110.0 try https://www.rsyslog.com/e/2027 ]
Dec 18 10:57:49 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:57:58 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:58:06 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:58:13 NethSec : cannot connect to 172.28.69.1:20026: Connection refused [v8.2110.0 try https://www.rsyslog.com/e/2027 ]
Dec 18 10:58:14 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:58:22 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:58:30 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:58:38 NethSec ns-plug: Connection failed. Waiting for the controller ...
Dec 18 10:58:43 NethSec : cannot connect to 172.28.69.1:20026: Connection refused [v8.2110.0 try https://www.rsyslog.com/e/2027 ]
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 --cipher is not set. Previous OpenVPN version defaulted to BF-CBC as fallback when cipher negotiation failed in this case. If you need this fallback please add '--data-ciphers-fallback BF-CBC' to your configuration and/or add BF-CBC to --data-ciphers.
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 OpenVPN 2.5.8 x86_64-openwrt-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD]
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 library versions: OpenSSL 3.0.15 3 Sep 2024, LZO 2.10
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 WARNING: No server certificate verification method has been enabled.  See http://openvpn.net/howto.html#mitm for more info.
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 TCP/UDP: Preserving recently used remote address: [AF_INET]192.168.100.243:20022
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 Socket Buffers: R=[212992->212992] S=[212992->212992]
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 UDP link local: (not bound)
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 UDP link remote: [AF_INET]192.168.100.243:20022
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 TLS: Initial packet from [AF_INET]192.168.100.243:20022, sid=30ac76d5 8d545250
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 VERIFY OK: depth=1, CN=nethsec
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 VERIFY OK: depth=0, CN=server
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 Control Channel: TLSv1.3, cipher TLSv1.3 TLS_AES_256_GCM_SHA384, peer certificate: 2048 bit RSA, signature: RSA-SHA256
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 [server] Peer Connection Initiated with [AF_INET]192.168.100.243:20022
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 PUSH: Received control message: 'PUSH_REPLY,route 172.28.69.0 255.255.255.0,route-gateway 172.28.69.1,topology subnet,ping 20,ping-restart 120,ifconfig 172.28.69.3 255.255.255.0,peer-id 0,cipher AES-256-GCM'
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 OPTIONS IMPORT: timers and/or timeouts modified
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 OPTIONS IMPORT: --ifconfig/up options modified
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 OPTIONS IMPORT: route options modified
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 OPTIONS IMPORT: route-related options modified
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 OPTIONS IMPORT: peer-id set
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 OPTIONS IMPORT: adjusting link_mtu to 1624
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 OPTIONS IMPORT: data channel crypto options modified
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 Data Channel: using negotiated cipher 'AES-256-GCM'
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 Outgoing Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 net_route_v4_best_gw query: dst 0.0.0.0
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 net_route_v4_best_gw result: via 192.168.61.1 dev br-lan
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 TUN/TAP device tun-nsplug opened
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 net_iface_mtu_set: mtu 1500 for tun-nsplug
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 net_iface_up: set tun-nsplug up
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 net_addr_v4_add: 172.28.69.3/24 dev tun-nsplug
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 net_route_v4_add: 172.28.69.0/24 via 172.28.69.1 dev [NULL] table 0 metric -1
Dec 18 10:58:44 NethSec ns-plug[3196]: 2024-12-18 10:58:44 Initialization Sequence Completed
Dec 18 10:59:07 NethSec nethsecurity-api[4395]: nethsecurity_api 2024/12/18 10:59:07 middleware.go:220: [INFO][AUTH] unauthorized request: signature is invalid
Dec 18 10:59:07 NethSec nethsec nginx: 172.28.69.1 - - [18/Dec/2024:10:59:07 +0000] "POST /api/ubus/call HTTP/1.1" 401 81 "https://controller23.rocky9.org/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"
Dec 18 10:59:07 NethSec nethsecurity-api[4395]: nethsecurity_api 2024/12/18 10:59:07 middleware.go:77: [INFO][AUTH] authentication success for user 7133e7761cb3c6e98d900701 from ::1
Dec 18 10:59:07 NethSec nethsecurity-api[4395]: nethsecurity_api 2024/12/18 10:59:07 middleware.go:185: [INFO][AUTH] login response success for user 7133e7761cb3c6e98d900701
Dec 18 10:59:07 NethSec nethsec nginx: 172.28.69.1 - - [18/Dec/2024:10:59:07 +0000] "POST /api/login HTTP/1.1" 200 266 "-" "Go-http-client/1.1"
Dec 18 10:59:13 NethSec : action 'action-1-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2110.0 try https://www.rsyslog.com/e/2359 ]

@gsanchietti

@filippocarletti could you please review again?

@gsanchietti gsanchietti merged commit 15229b4 into main Jan 10, 2025
1 check passed
@gsanchietti gsanchietti deleted the fix-vpn-ns-plug-restart branch January 10, 2025 14:15