MDNS integration and NodeGraph structure expansion #537

Closed
5 tasks done
tegefaulkes opened this issue Jul 26, 2023 · 31 comments · Fixed by #687
Labels
development Standard development epic Big issue with multiple subissues r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices

Comments

@tegefaulkes
Contributor

tegefaulkes commented Jul 26, 2023

Specification

With the creation of MDNS we need to integrate its usage into Polykey. With that come some considerations around what connection information needs to be stored in the NodeGraph and what information needs to be shared between nodes.

Right now the NodeGraph is based on the Kademlia spec and is pretty simple. It only stores a single address for each node, there are only 20 nodes per bucket, and only the oldest active nodes are kept.

The NodeGraph logic needs to be expanded to allow more information to be stored for each node. We need to store IPv4 as well as IPv6 addresses. We need to track whether a node is publicly contactable or needs to be hole punched. We also need sticky nodes such as the seed nodes, or user-defined static information.

Step 1 - MDNS Sidecar onto NodeConnectionManager and NodeManager

[Excalidraw diagram: Step 1 architecture]

I've decided that, as the simplest solution, MDNS can be integrated into the discovery-related methods of NodeConnectionManager and NodeManager, like findNode, etc. This essentially allows NodeConnectionManager and NodeManager to use both MDNS and the NodeGraph as discovery mechanisms independently of each other.

Note: MDNS should not be integrated into findNode directly, but as a separate method to be called by the callers of findNode in non-Kademlia contexts.
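
As a rough sketch of what a caller could do (a sketch only: findNodesLocal is the MDNS-backed method name used later in this thread, and all signatures here are assumptions, not the final API):

// Hypothetical caller-side composition: the Kademlia lookup and the MDNS
// lookup run as independent discovery mechanisms and their results are
// merged only at the call site.
async function discoverAddresses(
  ncm: NodeConnectionManager,
  nodeId: NodeId,
): Promise<Array<NodeAddress>> {
  const [graphAddresses, localAddresses] = await Promise.all([
    ncm.findNode(nodeId), // Kademlia lookup backed by the NodeGraph
    ncm.findNodesLocal(nodeId), // MDNS lookup on the local network
  ]);
  // In Step 1 MDNS results are not stored in the NodeGraph, so the caller
  // is responsible for combining the two sources.
  return [...graphAddresses, ...localAddresses];
}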

The NodeGraph-related data structures will still have to be edited like so, to facilitate the separation of MDNS addresses and NodeGraph addresses:

type NodeAddressScope = 'local' | 'external';
type NodeAddress = {
  host: Host | Hostname;
  port: Port;
  scopes: Array<NodeAddressScope>;
};

This means there will be overlapping NodeData that exists on both the NodeGraph and MDNS; this will have to be dealt with.

The NCM methods getConnectionWithAddress and getConnection will need to be amended to support multiple NodeAddresses as input. This isn't the ideal way to do it, but it will have to stay like this until the NodeGraph itself is made to support storing multiple IP addresses in Step 2. Much of the flow will need to be rewritten to support this change.
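
A rough sketch of how the amended flow could look (the fan-out helper and the exact single-address signature of getConnectionWithAddress are assumptions for illustration):

// Hypothetical shape: fan out over all known addresses for a node and
// resolve with whichever connection is established first.
async function getConnectionWithAddresses(
  ncm: NodeConnectionManager,
  nodeId: NodeId,
  addresses: Array<NodeAddress>,
): Promise<NodeConnection> {
  if (addresses.length === 0) {
    throw new Error('No addresses provided');
  }
  return Promise.any(
    addresses.map((address) => ncm.getConnectionWithAddress(nodeId, address)),
  );
}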

Scopes

Currently, in Step 1, scopes will only be used for disabling hole punching on local addresses. An address can be either local, external, or both. Local means that the address is locally routable, so hole punching is not required when connecting to it. External means that the address is globally routable, but will require hole punching to establish a connection.

Scopes aren't just for toggling signalling; they are also used to classify IP addresses into different levels. This means that in future, when scopes are incorporated into the NodeGraph, we can specify that only external addresses are shared on the public mainnet.

Defining scopes as an array also means that an address can have multiple classifications, which would be useful if we ever need to expand it in the future. Addresses that are both local and external will be understood to be shared on the mainnet, but will not require hole punching to be connected to.
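
As a minimal sketch of the Step 1 behaviour described above (the helper is hypothetical, and the thread revisits these semantics later):

// Hypothetical check used when establishing a connection: hole punching
// (signalling) is only attempted for addresses that are external and not
// also local, since a local scope means the address is directly routable.
function requiresHolePunching(address: NodeAddress): boolean {
  return (
    address.scopes.includes('external') && !address.scopes.includes('local')
  );
}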

Custom Group Address and Port

We're going to need to select a port and IPv4 + IPv6 groups for now, so as not to interfere with existing MDNS stacks. However, as native integration of MDNS improves, this may eventually no longer be needed.

I've decided to take inspiration from our slogan fait accompli, to choose these addresses:

  • IPv4: 224.0.0.250 — 250 is the decimal representation of fa
  • IPv6: ff02::fa17 — fa17 looks like fait in fait accompli
  • Port: 64023 — 64023 is the decimal representation of fa17

All of these values have been checked against IANA and do not conflict with any existing reservations.
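
A minimal sketch of how these values could be wired in (the constant names and the start options are illustrative assumptions, not the actual MDNS config keys):

// Illustrative constants for the custom MDNS multicast groups and port.
const MDNS_GROUP_IPV4 = '224.0.0.250'; // 250 is the decimal form of 0xfa
const MDNS_GROUP_IPV6 = 'ff02::fa17'; // fa17 echoes "fait" in fait accompli
const MDNS_PORT = 64023; // 64023 is the decimal form of 0xfa17

// e.g. passed through when the agent starts MDNS (shape assumed):
// await mdns.start({
//   port: MDNS_PORT,
//   groups: [MDNS_GROUP_IPV4, MDNS_GROUP_IPV6],
// });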

Step 2 - Full Integration of MDNS to Supply Data to NodeGraph

[Excalidraw diagram: Step 2 architecture]

The NodeGraph is expanded to support multiple IPv4 and IPv6 addresses.

This will allow us to integrate MDNS into the NodeGraph, so that it is used as the central Node address book.

type NodeData = {
  addresses: Array<NodeAddress>;
  lastUpdated: number;
};

This might require a lot of API changes. We'll see if we can get there in time.
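
A rough sketch of how MDNS-discovered addresses could be folded into this structure (the merge helper is hypothetical):

// Hypothetical merge: MDNS-discovered addresses are added to a node's
// NodeData, deduplicated on host and port, and tagged with the 'local'
// scope; addresses already known via Kademlia gain the extra scope.
function mergeMdnsAddresses(
  data: NodeData,
  discovered: Array<{ host: Host | Hostname; port: Port }>,
): NodeData {
  const addresses = [...data.addresses];
  for (const { host, port } of discovered) {
    const existing = addresses.find((a) => a.host === host && a.port === port);
    if (existing != null) {
      if (!existing.scopes.includes('local')) existing.scopes.push('local');
    } else {
      addresses.push({ host, port, scopes: ['local'] });
    }
  }
  return { addresses, lastUpdated: Date.now() };
}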

Additional context

Tasks

  • 1. Step 1 - Integrate MDNS and its usage into initial network entry and discovery.
    • Integrate MDNS querying into NodeConnectionManager.findNode
  • 2. Step 2 - NodeGraph functionality needs to be expanded to handle connection metadata, support IPv6, and a general policy for garbage collection, data updating and data sharing.
    • The initial sync node graph procedure should be extracted out of NodeManager.start and made a public function. It should be called in PolykeyAgent when starting. We should make it optional for testing.
    • [ ] Integrate MDNS querying into NodeManager.syncNodeGraph
    • [ ] Integrate MDNS querying into NodeManager.getNodeAddress
    • [ ] Integrate MDNS querying into NodeManager.setNode
    • Integrate MDNS into NodeManager.findNode - this was moved out from NCM into NM now.
@tegefaulkes tegefaulkes added the development Standard development label Jul 26, 2023
@tegefaulkes
Contributor Author

@amydevs @CMCDragonkai

@CMCDragonkai
Member

Also note that node addresses can now include both local node addresses and global node addresses. In the mainnet/testnet, local addresses shouldn't be given out, as that leaks the local network environment and could be a security risk.

However when connecting to a PKE portal, local network addresses could be shared, just like in tailscale.

In that case, it might be part of the bootstrapping configuration of PK nodes in a private network, where they can be enabled to share those local addresses.

Even when it is shared, receiving nodes need to intelligently decide and pick out which address to use to connect. In some cases, even in a LAN, they still cannot connect to the LAN address because they are not part of the same LAN. So in some cases there may need to be concurrent/parallel attempts to the addresses; the node graph should then remember which ones worked and which ones didn't, and sort addresses by priority for concurrent attempts.
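
A rough sketch of that bookkeeping (the outcome record and the sorting rule are assumptions):

// Hypothetical priority bookkeeping: remember which addresses worked and
// which didn't, then order candidates so previously successful addresses
// are attempted first in the next round of concurrent connections.
type AddressOutcome = {
  address: NodeAddress;
  lastSuccess?: number; // Unix timestamp of last successful connection
  failures: number; // consecutive failed attempts
};

function sortByPriority(outcomes: Array<AddressOutcome>): Array<NodeAddress> {
  return [...outcomes]
    .sort((a, b) => {
      // Most recently successful first, then the fewest recorded failures.
      if (a.lastSuccess != null && b.lastSuccess != null) {
        return b.lastSuccess - a.lastSuccess;
      }
      if (a.lastSuccess != null) return -1;
      if (b.lastSuccess != null) return 1;
      return a.failures - b.failures;
    })
    .map((outcome) => outcome.address);
}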

@amydevs
Contributor

amydevs commented Jul 27, 2023

Some context regarding MDNS
The responder I've implemented will not re-query for records that have expired. This behaviour is deemed optional by the MDNS RFC. But since Polykey checks for reachability anyway, the MDNS service-removed event should only be treated as a hint, not as confirmation that the node has actually gone offline.

@CMCDragonkai
Member

CMCDragonkai commented Sep 11, 2023

I tried out enabling LAN loopback, and I have an empirical sense of how it works now.

Most routers do not have it enabled, and many routers don't even have this feature at all.

When it is enabled, it may only be enabled on specific port forwarding rules (maybe port triggering rules), and not for the entire network.

The basic idea is that when it is enabled it is possible to use the external IP address while you're in the local network.

For example, if you have your router forwarding 55555, and have nc -l 55555 running on a local computer, then this will succeed:

nc -vz `curl -Ss -4 icanhazip.com` 55555

If you don't have a service listening on 55555, you'd get Connection refused.

If loopback isn't enabled, you'd also get a connection refused or a timeout.

Basically this feature is primarily for convenience. It means that while you're in the local network, you can use the external IP to access internal services. It's convenient because if you are sharing scripts with others, you don't have to remember to swap the IP address from internal to external (or vice versa) depending on whether the scripts are run from an external or local location.

I asked myself why this is not on by default, since it seems like a convenient feature. The only security issue I discovered is mentioned here on SO:

[screenshot of the referenced Stack Overflow answer]

Anyway this means we generally have to assume that this feature does not exist or does not work and not rely on it. It appears most organisations (that need to expose an internal service over NAT) rely on split DNS to deal with this problem, and instead use a hostname that resolves to an external IP or local IP depending on the situation.

What does this mean for the NodeGraph? Basically, for P2P to work between agents on the local network, it is necessary to expand our NodeGraph to contain both local IPs and external IPs. When contacting the seed nodes, the seed nodes will tell the local node what its external IP is, but they cannot tell it what its internal IP is. Only the node itself can find that out by inspecting its local interfaces.

This means MDNS is necessary, using multicast to find other nodes on the same local network. Multicast of course is also subject to the local network rules of the router; if the router disables multicast, then multicast becomes broken too. Finally there's also broadcast... but if multicast is disabled, most likely broadcast is disabled too. The idea is that the NodeGraph will have multiple IPs that are potential IPs of the same node.

To prevent leaking internal network information, one has to add classification rules (categorisation or tagging) to the node graph entries, so that internal address entries are only shared with other nodes that are on the same internal network. This can be done by inspecting and comparing the source IP of any RPC requests. This is a fairly minimal security risk... and for private PK networks, it's fine to tell the root nodes what the internal addresses are. We can observe this behaviour in Tailscale, where all nodes advertise their internal network to Tailscale too. But on our public mainnet and testnet, we would not want to do this.
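
A sketch of that classification rule (the helper and how "same local network" is determined are assumptions):

// Hypothetical filter applied when answering node lookup RPC requests:
// local-scoped addresses are only returned when the requester's source IP
// indicates it is already on the same internal network.
function filterSharableAddresses(
  addresses: Array<NodeAddress>,
  requesterOnSameLocalNetwork: boolean,
): Array<NodeAddress> {
  return addresses.filter((address) => {
    if (address.scopes.includes('local')) {
      // Internal addresses leak local network information, so only share
      // them with peers already on that network.
      return requesterOnSameLocalNetwork;
    }
    return true;
  });
}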

Another thing... we might want to also expand our address types to include non-network addresses. Let's consider the generic idea of a "beacon" or "lighthouse" as a rendezvous point where nodes can discover each other.

There are multiple layers of rendezvous points, each one increasing its scope of operations.

  1. The first rendezvous point can be a filesystem location. This is useful for PK agents running on the same host operating system, and enables easy virtualisation, by simply changing this filesystem location (imagine VMs and docker containers bind-mounts).
  2. The second rendezvous point can be MDNS, this is useful for local networks.
  3. The third rendezvous point can be a PK private network hosted on PKE.
  4. The fourth rendezvous point can be the PK public mainnet and testnet, representing the global network (also hosted on PKE - but presented publicly).

Given our desire to create a decentralised signalling and relaying system, every rendezvous point is just a "starting" point; subsequent signalling and relaying should rely on the network itself to help scale it. We can call this a "self-scaling" technology.

@amydevs
Contributor

amydevs commented Oct 11, 2023

I remember in a conversation we had earlier in the year that we will have two hard-coded groups of addresses for now: public (accessible via hole punching, and shared with other nodes) and private (accessible via LAN, not shared with other nodes, and discovered through MDNS).

I also remember one of us saying that private addresses should be shared with other nodes by default when the node is bootstrapped with a private PKE instance.

@CMCDragonkai
Member

It's like classification tiers.

There may actually be more tiers than that, so I was thinking of adding a classification label into the node graph structure somehow.

Just make a quick solution for the launch, then open a new issue to rearchitect it properly. But try not to build up too much technical debt.

@CMCDragonkai
Member

I remember at the very least we have:

  1. Local Machine
  2. Local Network
  3. Global Network

Right now the goal is for MDNS to cover 1 and 2, since Kademlia can't discover 1 and 2.

Note that 1 should actually also be found via filesystem discovery, using a common directory location like /var/run or /run. However we can do that later.

@CMCDragonkai
Member

The #537 is related to this. @amydevs please review.

@amydevs
Contributor

amydevs commented Oct 12, 2023

The WIP data structures for NodeData and NodeAddress:

type NodeAddressScope = 'local' | 'external';
type NodeAddress = {
  host: Host | Hostname;
  port: Port;
  scopes: Array<NodeAddressScope>;
};
type NodeData = {
  addresses: Array<NodeAddress>;
  lastUpdated: number;
};

@CMCDragonkai
Member

CMCDragonkai commented Oct 19, 2023

@amydevs can improve the spec by expanding this to include each network scenario, and how different parts of the nodes domain and MDNS are involved in achieving it.

  1. Local Discovery on Same Device
  2. Local Area Network Discovery
  3. Global Discovery
  4. (Private network discovery?)
  5. IPv4 to IPv6 and IPv6 to IPv4 (dual stack connectivity)

I'm interested in seeing what exactly is used to deal with the fact that an agent can have multiple valid IPs (and/or hostnames).

@CMCDragonkai
Member

CMCDragonkai commented Oct 23, 2023

This is a key feature of MatrixAI/Polykey-CLI#40; at the very least, new users should be able to discover their local nodes on the same computer or local area network and share a vault between them.

This has some interaction with MatrixAI/Polykey-CLI#30 if it finds out that the other local node is part of another gestalt.

@CMCDragonkai
Member

So node graph expansion is being done separately; we start with just MDNS being its own separate state. And we have to work out and model what happens with an expanded node graph.

@CMCDragonkai
Member

Removing this from the #551 because this will require further work later down the line.

@CMCDragonkai CMCDragonkai added the epic Big issue with multiple subissues label Oct 24, 2023
@CMCDragonkai
Member

Stage 2 isn't part of the CLI beta launch, so we can address it later. I believe we need to test local discovery too, but further work on this can be paused until afterwards.

@CMCDragonkai
Member

#611 fixed up getting all node records; at the moment it does not fetch MDNS records. This is because MDNS is a separate data source of nodes and does not feed into the NodeGraph yet. We will need to re-design the NodeGraph data structure to support a more sophisticated understanding of nodes.

@CMCDragonkai
Member

CMCDragonkai commented Nov 8, 2023

Defining scopes as an array also means that an address can have multiple classifications, which would be useful if we ever need to expand it in the future. Addresses that are both local and external will be understood to be shared on the mainnet, but will not require hole punching to be connected to.

It should be local plus global or internal vs external.

Right now the naming doesn't quite make sense.

Also, if something is both local and external, is it understood that it will be shared but not require hole punching? I'm not sure this is a good way of doing it.

To really understand scopes, we need to think of it either as set intersection or set union.

If an address is ['local', 'global'], do we mean the intersection of local and global, or do we mean the union of local and global?

Being an array is just for convenient storage; it actually needs to be a set to avoid duplicate labels.

I'm biased towards making it a set union. So that if an address is local and global, it just means that the address is locally routable and globally routable.

I don't think it makes sense to add the semantics of holepunching here.

Although @tegefaulkes has mentioned in #365 that it would be useful if it were possible to share information about whether a node requires hole punching or not, I don't have a concrete formulation for how that's supposed to work at the moment, and I dislike the idea of just adding a holepunchable: true/false sort of property.

The real purpose of scopes is not determined by the NodeGraph, it is determined by interpretation of the rest of the nodes system.

If it is set union, then we need to understand that for an address that is local only, there's no point sharing it across the network. But if it possesses the global scope then it should be shared. And if it is global only, and not local, then it shouldn't be shared over MDNS. At the moment, because MDNS data isn't even merged into the node graph, this isn't possible yet.
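
Under the set-union reading, the sharing decision reads directly off the scopes present; a sketch (the helper names and the local/global naming are assumptions following the renaming suggested above):

// Set-union interpretation: each scope independently asserts routability,
// and sharing is decided positively from the scopes that are present.
type Scope = 'local' | 'global';

function shareOverKademlia(scopes: Set<Scope>): boolean {
  // A local-only address is pointless (and a leak) on the public network.
  return scopes.has('global');
}

function shareOverMdns(scopes: Set<Scope>): boolean {
  // A global-only address doesn't need to be advertised on the LAN.
  return scopes.has('local');
}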

Future scopes could expand this further into things like:

  1. private - addresses kept to the own node entirely and never shared at all
  2. local - meaning on the same machine (but what is a machine?, especially with VMs and other abstractions)
  3. internal - meaning the internal network (somehow distinguished from the local network?)
  4. global

And also how this would work in private networks in PKE, as well as how this might work if PK were to be able to connect to multiple networks at the same time, thus creating segregated networks. Imagine org1 and org2 networks, each with their own node graph, then the scope labels could be org1.enterprise.polykey.com and org2.enterprise.polykey.com... etc.

@tegefaulkes
Contributor Author

I don't like having a set as part of the NodeAddress type; it's too much complexity, especially when creating one from scratch. If we want to indicate the scopes of an address without duplicates we can use bitflags.

enum BitFlags {
  FLAG_A = 0b01,
  FLAG_B = 0b10,
}

// They are composed with a bitwise OR.
// Creates a value where both A and B are enabled.
let value = BitFlags.FLAG_A | BitFlags.FLAG_B;

// They can be toggled with a bitwise XOR.
// Switches FLAG_A off.
value ^= BitFlags.FLAG_A;

// And they can be checked with a bitwise AND.
// Would be truthy (FLAG_B is still set).
const result1 = value & BitFlags.FLAG_B;
// Would be falsy (FLAG_A was toggled off).
const result2 = value & BitFlags.FLAG_A;

Since it's just a number it's much easier to store or pass around.

@CMCDragonkai
Member

CMCDragonkai commented Nov 9, 2023

It's already stored as an array of strings. I think it's fine to keep it as is; I don't think storage size is a huge problem here.

If we used a bit vector we would need to use a number (due to JSON encoding). That would give us at most 53 usable labels (the safe integer bits of a JSON number), or 32 if JavaScript's bitwise operators are used.

But I'm going to be potentially using arbitrary dynamic labels, so I don't want to change to a fixed bitvector with a fixed meaning.

@amydevs
Contributor

amydevs commented Nov 9, 2023

@CMCDragonkai I also think a set union would make sense, if an address is both locally and globally routable.

@amydevs
Contributor

amydevs commented Nov 9, 2023

Holepunching notes:

Roger Qiu — Today at 12:51 PM
I feel like due to set union, conditions should not be based on negativity, but instead on positivity
Hole punch, only if external
Not disable if local
But I'm looking at adding provenance too

@amydevs
Contributor

amydevs commented Nov 9, 2023

@CMCDragonkai the renaming of scopes has been addressed in 4f0ae2a; more will be done in PRs.

Holepunching notes:

Roger Qiu — Today at 12:51 PM I feel like due to set union, conditions should not be based on negativity, but instead on positivity Hole punch, only if external Not disable if local But I'm looking at adding provenance too

Holepunching logic has also been adapted to fit this ^

Furthermore, there is a config option I have added to allow a shorter timeout for findNodesLocal: 09b291f

@CMCDragonkai
Member

I'm linking #618 (comment) because I was reviewing the NodeGraph, and the expansion to be able to maintain multiple addresses.

Currently the NodeGraph still isn't capable of storing multiple addresses for the same node. I thought we were meant to have this as per #537? @amydevs

Yes, this is meant to be done by Step 2 of #537, but #584 only completed Step 1. This was done for the sake of time, as it was deemed that expanding the NodeGraph required more consideration and many changes.

Also doesn't that mean no where in the NodeGraph would we have ['local']?
Or wait.. if we successfully connect we would end up putting ['local'] over? And end up overwriting the previous global record?
Would that be correct?

The first is true: the NodeGraph attaches ['global'] for all node addresses it sets.

I think you mean that NodeManager does it. But NodeGraph.setNode allows the caller to set this information.

Expanding to multiple addresses would require an extra sublevel in the DB if we want to do this efficiently. Right now it just becomes a JSON NodeData, and we have scopes applied to the entire thing. Multiple addresses would instead require a more sublevelled structure like:

NodeId/NodeAddress/Port -> { lastUpdated, scopes }

Would we need Port as a sublevel too? It would only make sense if we expect multiple ports to also exist as part of the NodeAddress. I mean like why not...?

But since we see it as a sort of single unit... perhaps this is better:

NodeId/NodeAddress-Port -> { lastUpdated, scopes }

That could work better.

Since NodeAddress could be Host | Hostname it would need to be a string.

But we also need to canonicalise any host that could exist, because IPv4-mapped IPv6 addresses and IPv6 addresses in general have multiple representations of the same address. Should this canonicalisation be done BEFORE setNode or WITHIN setNode?

Hostnames cannot be canonicalised... we never know if www.a.com is the same as a.com until we visit it and even then it could be dynamic.

Port can also just be a string and joined up with NodeAddress.

Canonicalisation should be a network utility.
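
A minimal sketch of such a utility, assuming it only handles bare hosts (the function name is hypothetical; hostnames and IPv4 are passed through untouched):

// Hypothetical canonicalisation utility for the nodes/network utils:
// lowercases IPv6, expands `::`, strips leading zeros per group, and
// unwraps IPv4-mapped IPv6 (`::ffff:a.b.c.d`) into plain IPv4, so that
// equivalent addresses key to the same NodeGraph entry.
function canonicalHost(host: string): string {
  if (!host.includes(':')) return host; // IPv4 or hostname, left as-is
  let ip = host.toLowerCase();
  // Unwrap IPv4-mapped IPv6 such as `::ffff:127.0.0.1`
  const v4Mapped = ip.match(/^::ffff:(\d+\.\d+\.\d+\.\d+)$/);
  if (v4Mapped != null) return v4Mapped[1];
  // Expand `::` into the missing zero groups
  if (ip.includes('::')) {
    const [head, tail] = ip.split('::');
    const headParts = head === '' ? [] : head.split(':');
    const tailParts = tail === '' ? [] : tail.split(':');
    const missing = 8 - headParts.length - tailParts.length;
    ip = [...headParts, ...Array(missing).fill('0'), ...tailParts].join(':');
  }
  // Strip leading zeros in each group
  return ip
    .split(':')
    .map((group) => group.replace(/^0+(?=.)/, ''))
    .join(':');
}

// For example, these all canonicalise to the same keys:
// canonicalHost('::FFFF:127.0.0.1')                        -> '127.0.0.1'
// canonicalHost('2001:0db8:0000:0000:0000:0000:0000:0001') -> '2001:db8:0:0:0:0:0:1'
// canonicalHost('2001:db8::1')                             -> '2001:db8:0:0:0:0:0:1'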

@CMCDragonkai
Member

CMCDragonkai commented Nov 10, 2023

It might even be related to the problem at MatrixAI/Polykey-CLI#47.

@CMCDragonkai
Member

Based on this comment #618 (comment), I believe the initial usage of scopes should be determined by where the data came from / how it was discovered.

If it was discovered from MDNS it should be local, if it was discovered from kademlia, then it should be global.

If it was discovered from both, it could be both.

If the user provided it, then the user may specify the appropriate scope, but if not, it may default to global?

So then an address would default to global and only get more restrictive?

@CMCDragonkai
Member

With #365 and #618, many of the pieces of the puzzle are here. But one still needs to set the local scope and not share that information across the node graph.

@CMCDragonkai
Member

It should not present local scoped addresses to the outside world.

@CMCDragonkai
Member

It seems that to finish this issue, the node should not be presenting local-scoped addresses from its NodeGraph to the outside world.

@tegefaulkes
Contributor Author

We also need a policy for culling old connection information. For example, we keep separate entries for each IP-port combo. With NATs the port is highly ephemeral, so each new connection will have a new port assigned to it (depending on the specific NAT implementation). We've already seen this in manual testing, where we're accumulating new ports in the NodeGraph.

@tegefaulkes
Contributor Author

This seems pretty much done? I think there are some small things to address still but we can make a new issue to address these and reduce the scope.

@tegefaulkes
Contributor Author

Moving this to todo since it's not actively worked on.

@tegefaulkes
Contributor Author

Right, so the main part of this is determining what kind of information to keep in the node graph - essentially working out when and what information is OK to cull from the graph. To answer this we need to start with what kinds of information we have.

The information is structured as a NodeContact which is mapped to a NodeId, i.e. Map<NodeIdString, NodeContact> (not the actual format, but it shows the relation). The NodeContact is structured as...

type NodeContactAddress = Opaque<'NodeContactAddress', string>;
type NodeContact = Record<NodeContactAddress, NodeContactAddressData>;

type NodeContactAddressData = {
  /**
   * Indicates how the contact address was connected on its
   * last connection establishment. The ICE procedure concurrently
   * uses all methods to try to connect, however, whichever one
   * succeeded first should be indicated here. When sharing this
   * information to other nodes, it can hint towards whether a
   * contact does not require signalling or relaying.
   */
  mode: 'direct' | 'signal' | 'relay';
  /**
   * Unix timestamp of when the connection was last active.
   * This property should be set when the connection is first
   * established, but it should also be updated as long as the
   * connection is active.
   */
  connectedTime: number;
  /**
   * Scopes can be used to classify the address.
   * Multiple scopes is understood as set-union.
   */
  scopes: Array<NodeAddressScope>;
};

NodeContactAddressData is a kind of metadata, so we'll ignore that for now. NodeContactAddress is the standard string format for IPv4, IPv6 and hostname, including the port. We'll examine the lifetime of each of these.

IPv4

IPv4 will be our most common kind of address, as most machines still use it. When connecting to a node this is usually what we will get back, and the same applies when a node connects to us. It is formatted as xxx.xxx.xxx.xxx:port as a string.

For a desktop machine the IP can be expected to be static for the full uptime of the node, but for mobile devices such as phones or laptops, and even roaming VPN connections, it's very possible that this IP can change constantly. As for the port, if it's dynamically selected via a NAT then it will usually change on a per-connection basis. If a static port is configured then it will not change.

Also note that, while somewhat rare, a node can be bound to multiple interfaces that have access to the internet. So a node could have multiple concurrent IP addresses that are valid at the same time.

IPv6

IPv6 is here and there; it's pretty well supported nowadays and its usage is growing. It is formatted like [xxxx:xxxx::xx:x]:port. Note that the IP portion is subject to some formatting rules that significantly reduce the size of the address: leading zeros are optionally omitted.

An IPv6 address isn't NATed, but that doesn't mean it isn't subject to firewall rules, and a connection still has a port component. A machine can have multiple IPv6 addresses with varying lifetimes. The port should be static for the duration it is bound, since it isn't transformed by a NAT.

Hostname

These generally don't change over time, but they will be resolved to an IP address when needed. I think hostnames are shared between nodes when making connections, but we'd need to manually test that this is the case. In any case, a hostname should be considered a reasonably static way to refer to a node.


So for IP addresses we can just expect them to change for any number of reasons. So how do we know what information to keep? Previously, Kademlia just kept the newest information, overwriting the old for a NodeId. But since we keep multiple mappings now, we can't overwrite any one of them. What we do know is what type an address is and the last time it was connected to.

I think a simple solution is to just keep the N last-seen addresses for each type; 3 may be a good compromise. So at most we'd only have 3 IPv4 addresses, 3 IPv6 addresses and 3 hostnames for any NodeId. If this proves too much we can have a shared limit across all of them.

Now that I think about it, it's very similar to how the buckets work in the NodeGraph, but an extra layer down for each NodeId. We have a bucket with a hard limit, so how do we decide which ones to cull? Normally I'd say we'd pick the oldest one that doesn't respond, but that might be far too much work at this level. So we can just treat it as a FIFO queue where the oldest one gets pushed out for each type.

I'm also thinking it would be useful to keep a simple counter tracking how often the address has changed in the last calendar day, as a metric for determining whether a port is statically shared or publicly accessible, since a public port wouldn't change at all. But that can be a metric added to the audit domain.
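
A rough sketch of that FIFO culling (the per-type limit of 3 and the typeOf classifier are illustrative, not an agreed API):

// Hypothetical culling policy over the NodeContact record above: group the
// contact addresses by type, sort each group by connectedTime, and push
// out the oldest entries beyond the per-type limit, like a FIFO queue.
type AddressType = 'ipv4' | 'ipv6' | 'hostname';
const MAX_ADDRESSES_PER_TYPE = 3;

function cullNodeContact(
  contact: NodeContact,
  typeOf: (address: NodeContactAddress) => AddressType,
): NodeContact {
  const byType = new Map<AddressType, Array<NodeContactAddress>>();
  for (const address of Object.keys(contact) as Array<NodeContactAddress>) {
    const type = typeOf(address);
    const group = byType.get(type) ?? [];
    group.push(address);
    byType.set(type, group);
  }
  for (const group of byType.values()) {
    // Oldest entries (smallest connectedTime) are culled first.
    group.sort((a, b) => contact[a].connectedTime - contact[b].connectedTime);
    const excess = Math.max(0, group.length - MAX_ADDRESSES_PER_TYPE);
    for (const stale of group.slice(0, excess)) {
      delete contact[stale];
    }
  }
  return contact;
}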
