Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undocumented error field in traceroute hop reply #14

Open
jmeggitt opened this issue Nov 16, 2022 · 0 comments
Open

Undocumented error field in traceroute hop reply #14

jmeggitt opened this issue Nov 16, 2022 · 0 comments

Comments

@jmeggitt
Copy link

jmeggitt commented Nov 16, 2022

Issue

RIPE Atlas probes will occasionally emit hop replies containing an "error" field (Appearing in 0.019% of traceroute measurements I checked). This behavior is not documented for any firmware version on https://atlas.ripe.net/docs/apis/result-format/. Upon some investigation, there are a few places in the measurement code where this field can be produced. According to git blame, this has and continues to be part of the probe behavior for at least 10 years now.

At the moment I am somewhat unsure if this is an issue with the documentation or probe code. I can think of a few arguments for each side.

  • [For bug in code] This field performs a very similar function to the error field on traceroute hops. Errors described by this field appear to always be fatal and will usually (but not always) be the only reply in a given hop. Given that they both have the same name and value type, there is a decent chance that this field was simply added to the wrong object by accident. They should be consolidated to a single field to simplify parsing of the data. Additionally, leaving this error inside of hop replies may give the false impression that this error is recoverable when that does not appear to have been the case.
  • [For documentation] This field has likely been appearing in data for the past 10 years now without issue. It is easily identifiable in the output JSON and has a consistent value type and behavior. Since replies containing an error do not contain any other fields, this closely mirrors the naming and fields of ping result variants having Timeout/Error/Reply (As compared to the current Timeout/Reply variants specified for hop replies). Additionally, marking an entire hop as having errored may be misleading as valid replies could have been recorded for that hop prior to the error.

Effected Measurement Examples

I dumped every measurement from one of the RIPE Atlas hourly traceroute dump files (traceroute-2022-10-14T0400.bz2) that contained this field. This data is stored as newline delimited JSON and can be found in this gist. This appears to be extremely rare and only occurred in 1690 of the 8,912,306 traceroute measurements from that data file (0.019%).

Examples

Here are a couple measurements I arbitrarily chose to show off what they look like in the context of the data.

{
  "af": 6,
  "dst_addr": "2001:500:2::c",
  "dst_name": "2001:500:2::c",
  "endtime": 1665721544,
  "from": "2001:bc8:62c:2545::1",
  "fw": 5020,
  "lts": 11,
  "msm_id": 6011,
  "msm_name": "Traceroute",
  "mver": "2.2.0",
  "paris_id": 2,
  "prb_id": 1000410,
  "proto": "UDP",
  "result": [
    {
      "hop": 1,
      "result": [
        {
          "error": "sendto failed: Operation not permitted"
        }
      ]
    }
  ],
  "size": 40,
  "src_addr": "2001:bc8:62c:2545::1",
  "timestamp": 1665721544,
  "type": "traceroute"
}
{
  "af": 4,
  "dst_addr": "46.101.130.201",
  "dst_name": "46.101.130.201",
  "endtime": 1665722912,
  "from": "170.39.226.151",
  "fw": 5040,
  "group_id": 29556742,
  "lts": 1,
  "msm_id": 29556742,
  "msm_name": "Traceroute",
  "mver": "2.4.1",
  "paris_id": 14,
  "prb_id": 6927,
  "proto": "ICMP",
  "result": [
    {"hop": 1,"result": [{"x": "*"},{"x": "*"},{"x": "*"}]},
    {
      "hop": 2,
      "result": [
        {
          "error": "sendto failed: Network is unreachable"
        }
      ]
    }
  ],
  "size": 48,
  "src_addr": "170.39.226.151",
  "timestamp": 1665722900,
  "type": "traceroute"
}
{
  "af": 6,
  "dst_addr": "2a00:74c0:0:2::20",
  "dst_name": "2a00:74c0:0:2::20",
  "endtime": 1665722173,
  "from": "2a05:f6c7:3853:0:eade:27ff:fe69:dd4e",
  "fw": 5070,
  "group_id": 25639804,
  "lts": 38,
  "msm_id": 25639804,
  "msm_name": "Traceroute",
  "mver": "2.6.1",
  "paris_id": 7,
  "prb_id": 22203,
  "proto": "ICMP",
  "result": [
    {"hop":1,"result":[{"from":"2a05:f6c7:3853:0:1e74:dff:fec3:e2f8","rtt":0.948,"size":96,"ttl":255},{"from":"2a05:f6c7:3853:0:1e74:dff:fec3:e2f8","rtt":0.828,"size":96,"ttl":255},{"from":"2a05:f6c7:3853:0:1e74:dff:fec3:e2f8","rtt":0.759,"size":96,"ttl":255}]},
    {"hop":2,"result":[{"from":"2a05:f6c0:1::18","rtt":3.658,"size":96,"ttl":63},{"from":"2a05:f6c0:1::18","rtt":4.791,"size":96,"ttl":63},{"from":"2a05:f6c0:1::18","rtt":3.415,"size":96,"ttl":63}]},
    {"hop":3,"result":[{"from":"2a05:f6c0:2:23::1","rtt":5.209,"size":96,"ttl":62},{"from":"2a05:f6c0:2:23::1","rtt":4.848,"size":96,"ttl":62},{"from":"2a05:f6c0:2:23::1","rtt":5.137,"size":96,"ttl":62}]},
    {"hop":4,"result":[{"from":"2001:6c8:81:100::1a1","rtt":4.867,"size":96,"ttl":252},{"from":"2001:6c8:81:100::1a1","rtt":18.54,"size":96,"ttl":252},{"from":"2001:6c8:81:100::1a1","rtt":4.106,"size":96,"ttl":252}]}, 
    {"hop":5,"result":[{"from":"2001:6c8:40::1e","rtt":4.761,"size":96,"ttl":250},{"from":"2001:6c8:40::1e","rtt":4.698,"size":96,"ttl":250},{"from":"2001:6c8:40::1e","rtt":5.275,"size":96,"ttl":250}]},
    {
      "hop": 6,
      "result": [
        {
          "x": "*"
        },
        {
          "error": "sendto failed: Network is unreachable"
        }
      ]
    }
  ],
  "size": 48,
  "src_addr": "2a05:f6c7:3853:0:eade:27ff:fe69:dd4e",
  "timestamp": 1665722169,
  "type": "traceroute"
}

Values

As far as I have seen, this field is always a string when it appears. Here are the number of occurrences for all the different values I found when dumping the effected measurements.

count   value
      9 "bind failed: Address already in use"
     11 "bind failed: Address not available"
     47 "bind failed: Cannot assign requested address"
    334 "bind failed: Invalid argument"
    804 "sendto failed: Network is unreachable"
    104 "sendto failed: Network unreachable"
    364 "sendto failed: Operation not permitted"
     17 "sendto failed: Permission denied"

Tests

I found this field when I was in the process of writing some code to parse measurements in Rust with serde. To verify that I had written my tests correctly I had it attempt to parse every measurement in one of the hourly data dumps of traceroute measurements. Thanks to Rust and serde, this code is able to detect and verify the following aspects when parsing the input data. However the error field was the only* notable inconsistency it found with the API documentation.

  • All fields are required to be present in the input data unless explicitly marked optional.
  • No unexpected fields may be present in the input data. (Thanks to #[serde(deny_unknown_fields)]).
  • Field types must be enforced. Implicit conversions between types such as strings, integers, booleans, and floats would result in an error.
  • Mutually exclusive fields**. For example, traceroute replies must contain either rtt or late, but not both.
  • Objects that could be parsed as one of multiple variants must satisfy a single variant. (Ex: The timeout case can not overlap with the reply case)
  • [Partial] Fields such as af and proto that only have a couple possible values must be one of the specified values. I only enforced this check on some fields. Fields such as type and flags were left to read any string since they were not necessary for my use case.

* Excluding the bug where empty objects could be emitted alongside traceroute hop replies.
** When making error and result mutually exclusive in traceroute hops, I decided to give a pass to hop sometimes not appearing during an error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant