slurm file processing not working if an entry contains a wrong network address #127

DonOtuseGH · 2024-10-01T14:11:40Z

Hello,

we are running some RTRTR instances on Kubernetes clusters using a custom image:

Alpine 3.20.3 base image and build env
RTRTR 0.3.0, built with:
- cargo 1.78.0
- rustc 1.78.0

RTRTR usually runs against our own Routinator instance, but to demonstrate the issue, the following config can be used as well:

rtrtr.conf:

log_level = "debug"
log_target = "stderr"
http-listen = ["0.0.0.0:8323"]
[units.json]
type = "json"
uri = "https://console.rpki-client.org/vrps.json"
refresh = 60
[units.slurm]
type = "slurm"
source = "json"
files = [ "/home/rtrtr/slurm.json" ]
[targets.rtr]
type = "rtr"
listen = [ "0.0.0.0:3323" ]
unit = "slurm"
client-metrics = true
[targets.http]
type = "http"
path = "/json"
format = "json"
unit = "slurm"

We found that RTRTR does not start correctly when the slurm file contains an invalid network address in the prefix value of a prefixAssertions entry: it does not process the slurm file at all and logs no error message.

slurm.json with an invalid entry (10.10.10.164/27 has host bits set; the network address should be 10.10.10.160/27):

{
  "slurmVersion": 1,
  "validationOutputFilters": {
    "prefixFilters": [],
    "bgpsecFilters": []
  },
  "locallyAddedAssertions": {
    "prefixAssertions": [
      {
        "asn": 64546,
        "prefix": "192.168.255.0/24",
        "maxPrefixLength": 24,
        "comment": "RTR Health Check"
      },
      {
        "asn": 65535,
        "prefix": "10.10.10.164/27",
        "maxPrefixLength": 32
      }
    ],
    "bgpsecAssertions": []
  }
}
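
For reference, 10.10.10.164/27 is rejected because a /27 leaves five host bits in the last octet, and 164 has one of them set. The following is a minimal sketch using only Python's standard ipaddress module; the helper names canonical_prefix and has_host_bits are made up for this example and have nothing to do with RTRTR's own parser.

```python
import ipaddress


def canonical_prefix(prefix: str) -> str:
    """Return the prefix with all host bits cleared (hypothetical helper)."""
    # strict=False masks the address down to the network address.
    return str(ipaddress.ip_network(prefix, strict=False))


def has_host_bits(prefix: str) -> bool:
    """True if the address part is not the network address of the prefix."""
    iface = ipaddress.ip_interface(prefix)
    return iface.ip != iface.network.network_address


print(has_host_bits("10.10.10.164/27"), canonical_prefix("10.10.10.164/27"))
print(has_host_bits("192.168.255.0/24"), canonical_prefix("192.168.255.0/24"))
```

The second helper compares addresses rather than strings, so it should also behave sensibly for IPv6 prefixes written in a non-canonical textual form.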

The rtrtr log shows nothing about the problem: no slurm file processing and no target update messages.

[DEBUG] HTTP server listening on 0.0.0.0:8323
[DEBUG] Target http: link status: healthy
[DEBUG] starting new connection: https://console.rpki-client.org/
[DEBUG] RTR: Got reset query.
[DEBUG] Unit json: successfully updated.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] Unit json: successfully updated.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] Unit json: update without changes.
...

The local RTR target does not work, as expected given the missing log entries above:

$ rtrclient -e -t csv -o /dev/stdout tcp 127.0.0.1 3323 2>/dev/null | wc -l
===> times out

Of course, everything works fine once we correct the prefix to a valid network address:

slurm.json with valid entries:

{
  "slurmVersion": 1,
  "validationOutputFilters": {
    "prefixFilters": [],
    "bgpsecFilters": []
  },
  "locallyAddedAssertions": {
    "prefixAssertions": [
      {
        "asn": 64546,
        "prefix": "192.168.255.0/24",
        "maxPrefixLength": 24,
        "comment": "RTR Health Check"
      },
      {
        "asn": 65535,
        "prefix": "10.10.10.160/27",
        "maxPrefixLength": 32
      }
    ],
    "bgpsecAssertions": []
  }
}

rtrtr log looks as expected:

[DEBUG] HTTP server listening on 0.0.0.0:8323
[DEBUG] Target http: link status: healthy
[DEBUG] starting new connection: https://console.rpki-client.org/
[DEBUG] Updated Slurm file /home/rtrtr/slurm.json
[DEBUG] Unit json: successfully updated.
[DEBUG] Unit slurm: file /home/rtrtr/slurm.json: added 2, removed 0.
[DEBUG] Target rtr: Got update (615244 entries)
[DEBUG] Target http: Got update (615244 entries)
[DEBUG] Target http: link status: healthy
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] Unit json: successfully updated.
[DEBUG] Unit slurm: file /home/rtrtr/slurm.json: added 2, removed 0.
[DEBUG] Target rtr: Got update (615246 entries)
[DEBUG] Target http: Got update (615246 entries)
[DEBUG] Target http: link status: healthy
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
...

local target gives the correct count of VRPs:

$ rtrclient -e -t csv -o /dev/stdout tcp 127.0.0.1 3323 2>/dev/null | wc -l
615248


DonOtuseGH · 2024-10-01T14:19:14Z

What we would expect

It would be great to have an error message in the log indicating that something went wrong while processing the slurm file. It would also be helpful to list the invalid entries in the log. This would simplify troubleshooting considerably, especially if the slurm file contains several hundred locallyAddedAssertions ;-)
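
Until such logging exists, a pre-flight check run before deploying the file could point at the offending entries. This is only a sketch under assumptions: find_invalid_prefixes is a hypothetical helper, it relies on Python's ipaddress module rather than on RTRTR's own parsing rules, and it only looks at the prefix field of prefixAssertions and prefixFilters.

```python
import ipaddress


def find_invalid_prefixes(slurm: dict) -> list:
    """Return one message per prefix that is not a valid network address."""
    sections = (
        ("prefixAssertions",
         slurm.get("locallyAddedAssertions", {}).get("prefixAssertions", [])),
        ("prefixFilters",
         slurm.get("validationOutputFilters", {}).get("prefixFilters", [])),
    )
    problems = []
    for name, entries in sections:
        for index, entry in enumerate(entries):
            prefix = entry.get("prefix")
            if prefix is None:
                continue  # missing prefixes are not checked in this sketch
            try:
                # strict=True rejects prefixes with host bits set.
                ipaddress.ip_network(prefix, strict=True)
            except ValueError as err:
                problems.append(f"{name}[{index}]: {prefix!r}: {err}")
    return problems


# Same content as the failing slurm.json from the report.
sample = {
    "slurmVersion": 1,
    "validationOutputFilters": {"prefixFilters": [], "bgpsecFilters": []},
    "locallyAddedAssertions": {
        "prefixAssertions": [
            {"asn": 64546, "prefix": "192.168.255.0/24", "maxPrefixLength": 24},
            {"asn": 65535, "prefix": "10.10.10.164/27", "maxPrefixLength": 32},
        ],
        "bgpsecAssertions": [],
    },
}

problems = find_invalid_prefixes(sample)
for line in problems:
    print(line)
```

For a file on disk, the dictionary would come from json.load() on the slurm file; a non-empty result could then fail the image build or a CI step before the file ever reaches RTRTR.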

partim · 2025-01-03T12:47:52Z

Apologies for the very late response. I had notifications for new PRs turned off during my vacation and forgot to check afterwards.

Currently, the slurm unit delays any processing until the first successful load of the SLURM set and, for some reason, just ignores any error. I was going to release 0.3.1 today, but I am instead going to add error logging and release another RC so this will get into 0.3.1.
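
The general pattern of logging a load failure instead of silently ignoring it can be sketched as follows. This is an illustration in Python, not RTRTR's actual implementation (RTRTR is written in Rust); load_slurm, the logger name, and the log message are all invented for the example.

```python
import json
import logging
from typing import Optional

log = logging.getLogger("slurm-sketch")  # hypothetical logger name


def load_slurm(path: str, last_good: Optional[dict] = None) -> Optional[dict]:
    """Load a SLURM file; on failure, log the error and keep the last good set."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, ValueError) as err:
        # json.JSONDecodeError is a subclass of ValueError.
        # The point of the sketch: report the failure rather than swallow it.
        log.error("Failed to load Slurm file %s: %s", path, err)
        return last_good
```

With this shape, a caller that still has no successfully loaded set gets None back and can keep waiting, but the operator sees why in the log.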
