Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: remove all traces of unused protocol version #11600

Merged
merged 2 commits into from
Feb 22, 2022

Conversation

schmichael
Copy link
Member

Note to reviewers: Feel free to review now, but given the changes impact Serf tags, I don't want to merge it until 1.3. Marked as draft only to make me think twice before I hit the merge button before main is open for 1.3. 😅

Nomad inherited protocol version numbering configuration from Consul and
Serf, but unlike those projects Nomad has never used it. Nomad's
protocol_version has always been 1.

While the code is effectively unused and therefore poses no runtime
risks to leave, I felt like removing it was best because:

  1. Nomad's RPC subsystem has been able to evolve extensively without
    needing to increment the version number.
  2. Nomad's HTTP API has evolved extensively without increment
    API{Major,Minor}Version. If we want to version the HTTP API in the
    future, I doubt this is the mechanism we would choose.
  3. The presence of the server.protocol_version configuration
    parameter is confusing since server.raft_protocol is an important
    parameter for operators to consider. Even more confusing is that
    there is a distinct Serf protocol version which is included in nomad server members output under the heading Protocol. raft_protocol
    is the only protocol version relevant to Nomad developers and
    operators. The other protocol versions are either deadcode or have
    never changed (Serf).
  4. If we were to need to version the RPC, HTTP API, or Serf protocols, I
    don't think these configuration parameters and variables are the best
    choice. If we come to that point we should choose a versioning scheme
    based on the use case and modern best practices -- not this 6+ year
    old dead code.

Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this field be removed from the /agent/self and /agent/members API as well? With a -dev agent running from this branch, but with raft_version=3, I still see what looks like these fields.

curl outputs
$ curl -s "localhost:4646/v1/agent/self" | jq .member
{
  "Addr": "10.0.2.15",
  "DelegateCur": 4,
  "DelegateMax": 5,
  "DelegateMin": 2,
  "Name": "linux.global",
  "Port": 4648,
  "ProtocolCur": 2,
  "ProtocolMax": 5,
  "ProtocolMin": 1,
  "Status": "alive",
  "Tags": {
    "build": "1.2.3-dev",
    "raft_vsn": "3",
    "id": "986bb5aa-58f6-1d3c-eb3d-8d6dd5b9a071",
    "expect": "1",
    "region": "global",
    "bootstrap": "1",
    "role": "nomad",
    "dc": "dc1",
    "rpc_addr": "10.0.2.15",
    "port": "4647"
  }
}

$ curl -s "localhost:4646/v1/agent/members" | jq .
{
  "Members": [
    {
      "Addr": "10.0.2.15",
      "DelegateCur": 4,
      "DelegateMax": 5,
      "DelegateMin": 2,
      "Name": "linux.global",
      "Port": 4648,
      "ProtocolCur": 2,
      "ProtocolMax": 5,
      "ProtocolMin": 1,
      "Status": "alive",
      "Tags": {
        "rpc_addr": "10.0.2.15",
        "port": "4647",
        "bootstrap": "1",
        "role": "nomad",
        "dc": "dc1",
        "expect": "1",
        "region": "global",
        "build": "1.2.3-dev",
        "raft_vsn": "3",
        "id": "986bb5aa-58f6-1d3c-eb3d-8d6dd5b9a071"
      }
    }
  ],
  "ServerDC": "dc1",
  "ServerName": "linux",
  "ServerRegion": "global"
}

$ nomad server members
Name          Address    Port  Status  Leader  Protocol  Build      Datacenter  Region
linux.global  10.0.2.15  4648  alive   true    2         1.2.3-dev  dc1         global

e2e/terraform/.terraform.lock.hcl Outdated Show resolved Hide resolved
nomad/eval_broker.go Outdated Show resolved Hide resolved
@schmichael
Copy link
Member Author

@tgross The Protocol{Cur,Min,Max} fields in that curl output are the Serf protocol versions which are untouched by this PR (other than to remove a useless mapping of ProtocolVersion/protocol_version -> Serf version now that ProtocolVersion is gone. The mapping merely mapped ProtoclVersion's only valid value (1) to the default Serf version. These Serf version numbers are all the same in 1.1.6, main (@ 71cafc5), and this branch. If all of this seems very confusing, that is exactly why I want this unused overly generic ProtocolVersion gone. 😁

/agent/self was still returning ProtocolVersion: 0, because I botched the json tag, so I fixed that! It is now elided from API output.

Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@DerekStrickland DerekStrickland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it should all be internals and not affect external consumers. So in other words, safe. 🤞

ProtocolVersion int `hcl:"protocol_version"`
//
// Deprecated: This has never been used and will emit a warning if nonzero.
ProtocolVersion int `hcl:"protocol_version" json:"-"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enhancement idea. Should/could we introduce a deprecated tag that can be introspected to auto-generated deprecation warnings like you had to manually add to command/agent.command.go? I've had a lot of success with that kind of AOP approach on APIs in other projects.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, but this happens fairly rarely, and I think we may want to handle each deprecation on a case by case basis. For example nomad job run has a whole special Warnings slice where we include deprecation notifications: https://github.com/hashicorp/nomad/blob/v1.2.2/nomad/structs/structs.go#L6514

@schmichael schmichael marked this pull request as ready for review February 19, 2022 00:10
Nomad inherited protocol version numbering configuration from Consul and
Serf, but unlike those projects Nomad has never used it. Nomad's
`protocol_version` has always been `1`.

While the code is effectively unused and therefore poses no runtime
risks to leave, I felt like removing it was best because:

1. Nomad's RPC subsystem has been able to evolve extensively without
   needing to increment the version number.
2. Nomad's HTTP API has evolved extensively without increment
   `API{Major,Minor}Version`. If we want to version the HTTP API in the
   future, I doubt this is the mechanism we would choose.
3. The presence of the `server.protocol_version` configuration
   parameter is confusing since `server.raft_protocol` *is* an important
   parameter for operators to consider. Even more confusing is that
   there is a distinct Serf protocol version which is included in `nomad
   server members` output under the heading `Protocol`. `raft_protocol`
   is the *only* protocol version relevant to Nomad developers and
   operators. The other protocol versions are either deadcode or have
   never changed (Serf).
4. If we were to need to version the RPC, HTTP API, or Serf protocols, I
   don't think these configuration parameters and variables are the best
   choice. If we come to that point we should choose a versioning scheme
   based on the use case and modern best practices -- not this 6+ year
   old dead code.
@schmichael schmichael deleted the f-remove-unused-version branch February 22, 2022 17:51
schmichael added a commit that referenced this pull request Mar 19, 2022
Revert a small part of #11600 after @lgfa29 discovered it would break
compatibility with Nomad <= v1.2!

Nomad <= v1.2 expects the `vsn` tag to exist in Serf. It has always been
`1`. It has no functional purpose. However it causes a parsing error if
it is not set:

https://github.com/hashicorp/nomad/blob/v1.2.6/nomad/util.go#L103-L108

This means Nomad servers at version v1.2 or older will not allow servers
without this tag to join.

The `mvn` minor version tag is also checked, but soft fails. I'm not
setting that because I want as much of this cruft gone as possible.
lgfa29 pushed a commit that referenced this pull request Mar 24, 2022
Revert a small part of #11600 after @lgfa29 discovered it would break
compatibility with Nomad <= v1.2!

Nomad <= v1.2 expects the `vsn` tag to exist in Serf. It has always been
`1`. It has no functional purpose. However it causes a parsing error if
it is not set:

https://github.com/hashicorp/nomad/blob/v1.2.6/nomad/util.go#L103-L108

This means Nomad servers at version v1.2 or older will not allow servers
without this tag to join.

The `mvn` minor version tag is also checked, but soft fails. I'm not
setting that because I want as much of this cruft gone as possible.
@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants