[WIP] gnrc_border_router usability improvements #16840

Draft: wants to merge 1 commit into base: master

@chrysn (Member) commented Sep 11, 2021

Contribution description

This draft PR describes the changes necessary to make gnrc_border_router easy to use without configuration. It roughly contains the changes I had to make to the 6LBR running on my home network, and I expect what is explicitly in here right now to evolve more and more into a tracking issue for non-draft PRs.

Testing procedure

  • Take a board with 6lowpan and USB Ethernet; I'm using nrf52840dongle.
    • Most of this should equally apply to boards with actual Ethernet instead of USB Ethernet -- but do we have them? (The custom mikrobusboard I'm often using RIOT with has ENC28J60 Ethernet, but I don't have a radio there; finding a board that has both would be really great, because those you could just plug into your switch and off you go.)
  • On an OpenWRT router of your choosing, install the kmod-usb-net-cdc-ether package, and plug in the board.
    • In its network config GUI you'll find a USB device; add it to the LAN group.
    • kmod-usb-cdc-acm and socat are great for debugging while CoAP based remote configuration is not yet the default ;-)
    • Note that OpenWRT devices by default pick a ULA and announce that for stable operation when upstream goes away.
  • As an alternative to the OpenWRT router, plug it into your PC and tell your NetworkManager to make this interface "shared to other computers". (Works just as well, but as it involves two legs of prefix delegation, there's more that can go wrong with it).
  • Any other 6LoWPAN device should see a network come up that hands out routable addresses and actually routes them.

Issues/PRs references (including TBD)

  • Changes like setting the BOARD and UPLINK to cdc-ecm are really more placeholders for documentation that will tell users who want to set up something like this to put those on the command line.
    • CDC-ECM would be a good default for the boards, but with the current Makefile based setup it's hard to know that the board can do it, and we would already have to know a few lines down which upstream is used.
  • This enables RPL by default.
  • This enables ICMPv6 errors by default, which IIRC allow traceroutes to be used. Will become a teeny tiny pull request, or lumped together with other better defaults.
  • ULAs are rejected for delegatable prefixes. This is a rough version; a better one would take as many prefixes as are usable, but take the global addresses first. The bad thing is that upstream doesn't give priorities, so we have to prioritize, and worse, we have to pick only as many as the downstream devices can use. Might need more discussion too.
  • Some defaults are tuned (eg. netif choice), which might be an artifact of how long this has been sitting on my git stash; others are possibly obsoleted by the ULA change if that sticks.
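
As an illustration of the ULA rejection described above, here is a minimal sketch of how a prefix could be classified. The function names are hypothetical and are not RIOT API; the only facts relied on are that ULAs live in fc00::/7 (RFC 4193) and that currently allocated global unicast space is 2000::/3. The `take_only_routable` flag mirrors the variable of the same name in this PR, but the surrounding logic is an assumption, not the PR's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not RIOT API): true if the 16-byte IPv6 prefix
 * lies in fc00::/7, the Unique Local Address range. */
bool prefix_is_ula(const uint8_t addr[16])
{
    assert(addr != NULL);
    return (addr[0] & 0xfe) == 0xfc;
}

/* Hypothetical helper: true if the prefix lies in 2000::/3, the
 * currently allocated global unicast range. */
bool prefix_is_global_unicast(const uint8_t addr[16])
{
    assert(addr != NULL);
    return (addr[0] & 0xe0) == 0x20;
}

/* Rough policy sketch: when take_only_routable is set, ULAs are
 * rejected and everything else is accepted. */
bool prefix_is_acceptable(const uint8_t addr[16], bool take_only_routable)
{
    return !(take_only_routable && prefix_is_ula(addr));
}
```

With this sketch, a prefix such as fd5f:6d79:a254::/48 would be rejected while take_only_routable is true, and a 2a02::/16-derived prefix would be accepted.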

This would also benefit greatly from:

because then I think it's something each and every one of us should just have running 24/7 in their homes just to see the practical operation when we're not using the [affe:affe::] network.

CC'ing @haukepetersen on following up on exchanged mails.

@github-actions bot added the Area: examples, Area: network and Area: sys labels Sep 11, 2021
@haukepetersen (Contributor) commented:

Finally had the chance to flash a nrf52840dongle and deploy it on my OpenWRT box. I am using the fixes proposed in this branch, but did rebase everything on the newest master (004b93e). On top I am using IP over BLE instead of the 15.4 mode of the radio - what else :-).

As suggested by @chrysn, I connect to the RIOT shell by running socat on the OpenWRT box; in my case: socat -d -d TCP-LISTEN:12345,reuseaddr FILE:/dev/ttyACM0,b115200,raw,echo=0

I did some quick tests and here are my observations:

  • I can reliably crash the shell by calling ifconfig: when it gets to the details of the wired interface, ifconfig gets stuck and won't output anything. It appears to me that it blocks the shell thread indefinitely, and rebooting the dongle via the reset button or unplugging it is the only way I get it unstuck. However, the rest of the RIOT system keeps running as expected: I can still ping both interfaces and the dongle is still routing traffic for connected RIOT nodes, so it seems that only the shell thread is affected.
  • As discovered before (see border_router: significant packet loss when sending out packets using USB cdc-ecm on nrf52 #16411), I observed packet loss when pinging hosts in my home network or on the Internet from a node behind the BR. When pinging the wireless interface from the same node, however, I do not see any packet loss, so it seems to be related to GNRC or the USB interface?!

@haukepetersen (Contributor) commented:

After some further testing:

  • the shell is not stuck after all, seems that restarting socat on the OpenWRT box gives me access to the shell on the RIOT node again, so maybe I am simply not configuring something correctly on that part...

I collected some more details on the packet loss, again pinging different addresses from a RIOT node that is connected to the BR node via IP over BLE:
Pinging heise.de from the RIOT node (-c 2000 -i 250):

2021-09-24 15:31:55,914 # --- 2a02:2e0:3fe:1001:302:: PING statistics ---
2021-09-24 15:31:55,920 # 2000 packets transmitted, 1993 packets received, 0% packet loss
2021-09-24 15:31:55,924 # round-trip min/avg/max = 51.563/93.403/344.860 ms

-> 7 packets lost

Pinging a Pi in my local network from the same node (-c 2000 -i 250):

2021-09-24 15:43:13,822 # --- fd5f:6d79:a254::affe PING statistics ---
2021-09-24 15:43:13,827 # 2000 packets transmitted, 1988 packets received, 0% packet loss
2021-09-24 15:43:13,832 # round-trip min/avg/max = 51.626/105.328/449.623 ms

-> similar picture, 12 packets lost

Pinging the link local address of the BLE interface on the BR node (-c 2000 -i 250):

2021-09-24 15:52:15,539 # --- FE80::DBE4:F4FF:FEC2:21DA PING statistics ---
2021-09-24 15:52:15,545 # 2000 packets transmitted, 2000 packets received, 0% packet loss
2021-09-24 15:52:15,549 # round-trip min/avg/max = 18.935/103.343/484.210 ms

-> no packet loss

These are not one-off results: I repeated the tests with different parameters and the results are always similar to the ones posted here...
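
The summary lines above print "0% packet loss" although 7 and 12 packets were lost, presumably because the percentage is shown in whole percent. A small sketch of an integer-only calculation with two decimal places makes the difference visible (7 of 2000 is 0.35%, 12 of 2000 is 0.60%). The function names are made up for illustration and are not taken from RIOT's ping implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: packet loss in hundredths of a percent, using
 * integer arithmetic only so it also suits targets without an FPU.
 * A return value of 35 means 0.35 %. */
unsigned loss_basis_points(unsigned transmitted, unsigned received)
{
    assert(transmitted > 0 && received <= transmitted);
    unsigned long long lost = transmitted - received;
    return (unsigned)((lost * 10000u) / transmitted);
}

/* Hypothetical helper: writes a summary such as "7 lost (0.35%)" into
 * buf and returns the result of snprintf(). */
int format_loss(char *buf, size_t len, unsigned transmitted, unsigned received)
{
    unsigned bp = loss_basis_points(transmitted, received);
    return snprintf(buf, len, "%u lost (%u.%02u%%)",
                    transmitted - received, bp / 100, bp % 100);
}
```

Applied to the three runs above, this yields 0.35%, 0.60% and 0.00% respectively.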

@fjmolinas (Contributor) commented:

Nice, I was able to reproduce the setup with the board plugged directly into my laptop! Would be really nice to see this one in!

Comment on lines +836 to +842
// Is taking the routable one(s) really *better* than taking the ULA?
// The ULA might be what is more stable, or what persists through
// renumberings ... but probably picking routable *is* best, and when
// that prefix becomes unavailable we'll just wait for a network
// reconfiguration to bring everyone up on ULAs instead. (If of course
// we had the capacity for multiple prefixes on all devices...).
bool take_only_routable = true;
Contributor:

Can we eventually add a configuration for this?

Member Author:

We could, but I'd rather have good defaults primarily. A usable BR should be provided in RIOT to be set up as easily as setting up a WiFi access point -- and that usually doesn't require knowledge of whether or not particular types of addresses are around.

A better version could express some policy: use as many prefixes as possible, but use globally routable ones first. Or use globally routable ones first, but if there is a ULA, use that too if there is space (because that's expected to be more stable). Maybe better choices can be made for the numbers involved (adjusting CONFIG_DHCPV6_CLIENT_PFX_LEASE_MAX to match the number of prefixes we can use), and maybe then (if we only request one prefix) this will push the problem of selecting the right prefix(es) upstream, to where more data is available and a better decision can be made.

(Hard thing is: I don't know how most of these numbers play together, and if we can assume any of them to be synchronized)
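
To make the "globally routable first, ULA if there is space" idea concrete, here is a rough sketch of such a selection. It is not the code in this PR: `pfx_candidate_t` and `select_prefixes` are hypothetical names, and the `capacity` argument is only assumed to correspond to something like CONFIG_DHCPV6_CLIENT_PFX_LEASE_MAX. Non-ULA prefixes are treated as routable here, which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical candidate record; not a RIOT type. */
typedef struct {
    uint8_t addr[16];   /* prefix bytes in network order */
    uint8_t len;        /* prefix length in bits */
} pfx_candidate_t;

/* ULAs are the fc00::/7 range. */
static bool is_ula(const pfx_candidate_t *p)
{
    return (p->addr[0] & 0xfe) == 0xfc;
}

/* Copies at most `capacity` candidates into `out`: first all non-ULA
 * prefixes, then ULAs if space remains. The upstream order is kept
 * within each class. Returns the number of prefixes selected. */
size_t select_prefixes(const pfx_candidate_t *in, size_t n,
                       pfx_candidate_t *out, size_t capacity)
{
    assert(in != NULL && out != NULL);
    size_t used = 0;
    for (int pass = 0; pass < 2; pass++) {
        bool want_ula = (pass == 1);   /* pass 0: routable, pass 1: ULA */
        for (size_t i = 0; i < n && used < capacity; i++) {
            if (is_ula(&in[i]) == want_ula) {
                out[used++] = in[i];
            }
        }
    }
    return used;
}
```

With a capacity of 1 this degenerates to the current "take only routable" behaviour whenever a routable prefix is offered, and falls back to a ULA otherwise.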

stale bot commented Aug 13, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Aug 13, 2022
USEMODULE += netstats_neighbor_count
USEMODULE += netstats_neighbor_rssi
USEMODULE += netstats_neighbor_lqi
USEMODULE += netstats_neighbor_tx_time
Contributor:

What are you using those for?

Member Author:

IIRC, these are relics from testing some commits that were prerequisites for running MRHOF.

@stale stale bot removed the State: stale State: The issue / PR has no activity for >185 days label Jan 24, 2024