List the multiple definitions of "node". #683

Closed
johnnymatthews opened this issue Mar 8, 2021 · 8 comments
Labels
dif/easy Someone with a little familiarity can pick up effort/hours Estimated to take one or several hours good first issue Good issue for new contributors help wanted Seeking public contribution on this issue kind/enhancement A net-new feature or an improvement to an existing feature P3 Low: Not priority right now status/ready Ready to be worked topic/docs Documentation

Comments

@johnnymatthews
Contributor

johnnymatthews commented Mar 8, 2021

The word "node" means different things in different contexts. Let's add those definitions to the Glossary.

The following was taken from a HackMD post put together by @Gozala, and referenced in #678:


go-ipfs node

  • runs on servers and user machines with the full set of capabilities
    • tcp and quic transports enabled by default
    • /ws/ transport disabled by default
    • http gateway with subdomain support for origin isolation between content roots

js-ipfs node

  • runs in the browser with a limited set of capabilities
    • can connect to server nodes (go/js-ipfs) only via secure websockets (/wss/ requires manual setup of TLS at the server)
    • can connect to browser nodes via webrtc (with help of centralized ws-webrtc-star signaling service)
    • no http gateway (can't open TCP port)
  • runs on servers and user machines with (in theory) the full set of capabilities
    • DHT not on par with go-ipfs (is this still the case?)
    • http gateway present, but has no subdomain support

Preload node

  • are go-ipfs nodes with their API ports exposed, some HTTP API commands accessible, and a patch applied
  • used by js-ipfs nodes, both in browser and not
  • js-ipfs nodes remain connected to the libp2p swarm ports of all preload nodes by having preload nodes on the bootstrap list
  • when users want to make some UnixFS DAG publicly available, they call ipfs refs -r <CID> on a randomly chosen preload node's HTTP API; this puts the CID in that preload node's wantlist, which causes it to fetch the data from the user
  • Other js-ipfs nodes requesting the content can then resolve it from the preload node via bitswap as the data is now present in the preload node's blockstore
  • Only works with dag-pb CIDs because that's all the refs command understands
    • Q: What are the net effects of this? Bad perf or broken js-ipfs for non-dag-pb CIDs? Are there mitigations?
    • A: Harder to find non-dag-pb content - e.g. you need a connection to the publishing js-ipfs instance or it needs to be put on the DHT by a delegate node. We could do this at the block level and use block stat in the same way as js-delegate-content module
  • Preload nodes garbage collect every hour so preloaded content only survives for that long
    • Q: Is this configurable?
    • A: Yes? Infra would be able to tell you more
  • TODO: Is there anything about pubsub topic bootstrapping here?
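
As a rough illustration of the preload step described above, the sketch below builds the HTTP API request that corresponds to `ipfs refs -r <CID>`. The `/api/v0/refs` endpoint and its `arg` and `recursive` parameters belong to the IPFS HTTP API; the host names, the CID, and both helper functions are placeholders invented for this example, not real infrastructure or js-ipfs code.

```python
import random
from urllib.parse import urlencode

# Hypothetical preload hosts; real deployments use their own addresses.
PRELOAD_NODES = [
    "https://preload-0.example.net",
    "https://preload-1.example.net",
]


def choose_preload_node(nodes, rng=random):
    """Pick one preload node at random, as the description above suggests."""
    return rng.choice(nodes)


def preload_url(api_base, cid):
    """Build the HTTP API equivalent of `ipfs refs -r <CID>`."""
    query = urlencode({"arg": cid, "recursive": "true"})
    return f"{api_base}/api/v0/refs?{query}"


if __name__ == "__main__":
    node = choose_preload_node(PRELOAD_NODES)
    # "QmExampleCid" is a placeholder, not a valid CID.
    print(preload_url(node, "QmExampleCid"))
```

Sending that request is what would put the CID in the preload node's wantlist; the sketch stops at building the URL so that it runs without network access.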

Relay node

  • are go-ipfs nodes
    • Q: or are they custom go-libp2p nodes?
  • can also be js-libp2p nodes properly configured, or the out of the box js relay
  • are used by go-ipfs nodes to serve as relays/VPNs for nodes who deem themselves to be unreachable by the public internet
    • Q: Used by js-ipfs too?
    • A: Yes. They can also be used to overcome a lack of transport compatibility. For instance, a browser node with websockets/webRTC transports can talk with a go-ipfs node that only talks TCP, via a relay that supports both transports. This is not enabled by default and needs to be set up.
  • not configurable in go-ipfs and uses a preset list of relays
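
The transport-compatibility case in the answer above can be sketched as a small model. It is a simplification under stated assumptions: each node is reduced to the set of transports it can dial and the set it listens on, and a relay is assumed to work whenever both peers can dial it. The `Node` class, the helper names, and the transport sets are illustrative, not go-ipfs or js-ipfs defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    dial: frozenset    # transports this node can dial out on
    listen: frozenset  # transports this node accepts connections on


def can_dial(dialer, listener):
    """True if the dialer has at least one transport the listener accepts."""
    return bool(dialer.dial & listener.listen)


def can_connect_via_relay(a, b, relay):
    """Simplifying assumption: both peers dial the relay, which bridges them."""
    return can_dial(a, relay) and can_dial(b, relay)


# Illustrative transport sets only.
browser = Node("browser", frozenset({"wss", "webrtc"}), frozenset({"webrtc"}))
tcp_only = Node("go-ipfs (tcp only)", frozenset({"tcp"}), frozenset({"tcp"}))
relay = Node("relay", frozenset({"tcp", "wss"}), frozenset({"tcp", "wss"}))

print(can_dial(browser, tcp_only))                      # False
print(can_connect_via_relay(browser, tcp_only, relay))  # True
```

The browser and the TCP-only node share no transport, so neither can dial the other directly, but each shares one with the relay.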

Bootstrap node

  • are go-ipfs nodes
  • used by go and js-ipfs nodes to enter the DHT
  • if they go offline a go-ipfs node that restarts will not by default be able to join the public DHT
    • Q: SO MANY QUESTIONS... to start, do you mean if all configured bootstrap nodes go offline this happens?
  • configurable in go and js-ipfs config files
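
A minimal sketch of the bootstrap dependency, assuming the reading that a node is cut off from the public DHT only when every configured bootstrap node is unreachable (the question above leaves this open). The multiaddrs are placeholders and `dial` is a stand-in callable, not a libp2p API.

```python
def reachable_bootstrap_peers(bootstrap_list, dial):
    """Return the bootstrap addresses that `dial` managed to connect to."""
    return [addr for addr in bootstrap_list if dial(addr)]


def can_join_public_dht(bootstrap_list, dial):
    """Assumption: one reachable bootstrap peer is enough to discover others."""
    return len(reachable_bootstrap_peers(bootstrap_list, dial)) > 0


# Placeholder multiaddrs, shaped like entries in a bootstrap list.
BOOTSTRAP = [
    "/dnsaddr/bootstrap-a.example.net/p2p/PeerIdA",
    "/dnsaddr/bootstrap-b.example.net/p2p/PeerIdB",
]

# Pretend only the second bootstrap node is online.
online = {"/dnsaddr/bootstrap-b.example.net/p2p/PeerIdB"}

print(can_join_public_dht(BOOTSTRAP, lambda addr: addr in online))  # True
print(can_join_public_dht(BOOTSTRAP, lambda addr: False))           # False
```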

Delegate routing node

  • are go-ipfs nodes with their API ports exposed and some HTTP API commands accessible
  • used by js-ipfs nodes to query the DHT and also publish content without having to actually run DHT logic on their own
  • publishing works with arbitrary CID codecs as the js-delegate-content module publishes CIDs at the block level rather than the ipld/dag level
  • Delegate nodes garbage collect every hour so provided content only survives for that long - unless the uploading js-ipfs node is still running, in which case it will issue periodic re-provides via the same publishing mechanism, which extends the life of the content on the DHT
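
The effect of re-providing can be approximated with a toy model. It treats the hourly garbage collection as a one-hour lifetime counted from the most recent provide, which is a simplification of how collection is actually scheduled; the 30-minute re-provide interval and the function name are assumptions made for this example.

```python
GC_INTERVAL = 60 * 60  # seconds; "every hour" per the description above


def is_provided(now, provide_times, gc_interval=GC_INTERVAL):
    """True if some provide at or before `now` is younger than the GC interval."""
    past = [t for t in provide_times if t <= now]
    return bool(past) and now - max(past) < gc_interval


one_shot = [0]                       # node provided once, then went offline
reproviding = [0, 1800, 3600, 5400]  # hypothetical 30-minute re-provides

print(is_provided(4000, one_shot))     # False
print(is_provided(4000, reproviding))  # True
```
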
@johnnymatthews johnnymatthews added dif/easy Someone with a little familiarity can pick up help wanted Seeking public contribution on this issue P3 Low: Not priority right now kind/enhancement A net-new feature or an improvement to an existing feature effort/hours Estimated to take one or several hours status/ready Ready to be worked topic/docs Documentation labels Mar 8, 2021
@Gozala

Gozala commented Mar 8, 2021

Thanks @johnnymatthews for consolidating these here. I have a couple of suggestions:

  1. I think it would be really helpful to have a table mapping IPFS node types to transports. So it is easy to see what is available, what is enabled/disabled etc...

    I think it would be even better to distinguish, per transport, between the ability to dial (connect) and the ability to be dialed (listen). Web nodes can only listen on webrtc, but can dial webrtc and websocket (assuming TLS is set up). They can also dial and be dialed over any transport through circuit relays.

  2. I think classifying nodejs and web IPFS nodes into a single js-ipfs group is a mistake. While they share a codebase, the platform constraints are so significant that we cannot expect feature parity. In that regard, I think nodejs nodes are far more comparable to go-ipfs nodes than to web nodes.

  3. I think it would be helpful to disambiguate between nodes and services they provide (or put it differently roles they play in the system). Here specifically I would argue that

    • preload
    • relay
    • bootstrap
    • delegated routing (maybe router would be a better name)
      are kinds of services provided by nodes. While it is true that most of those services are provided by go-ipfs nodes, that is not inherently so; it just happens that nodejs nodes, due to DHT limitations, never reached a state where they would be deployed as part of the infrastructure.

    If we reframe those as services, it is also easier to talk about which nodes provide those services and which nodes consume them.
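
One possible shape for the table proposed in the first suggestion, as a hedged sketch: each cell records dial and listen separately, and `None` marks cells this thread does not settle. The web row follows the statements above; the go-ipfs row assumes that "enabled by default" covers both dialing and listening. The row names and helper functions are made up for the example.

```python
TRANSPORTS = ["tcp", "quic", "ws/wss", "webrtc"]

# (dial, listen) per transport; None means "not settled in this thread".
TABLE = {
    "web (browser)": {
        "tcp": (False, False),
        "quic": None,
        "ws/wss": (True, False),
        "webrtc": (True, True),
    },
    "go-ipfs (defaults)": {
        "tcp": (True, True),    # assumption: enabled means dial and listen
        "quic": (True, True),   # assumption: enabled means dial and listen
        "ws/wss": None,         # disabled by default
        "webrtc": None,
    },
}


def cell(entry):
    """Format one (dial, listen) pair, or '?' when unknown."""
    if entry is None:
        return "?"
    dial, listen = entry
    return ("dial" if dial else "-") + "/" + ("listen" if listen else "-")


def render(table):
    """Render the table as aligned plain-text columns."""
    header = ["node type"] + TRANSPORTS
    rows = [[name] + [cell(row.get(t)) for t in TRANSPORTS]
            for name, row in table.items()]
    lines = [header] + rows
    widths = [max(len(r[i]) for r in lines) for i in range(len(header))]
    return "\n".join(
        "  ".join(v.ljust(w) for v, w in zip(r, widths)).rstrip()
        for r in lines
    )


print(render(TABLE))
```

Rows for nodejs and Electron nodes are left out because the thread does not state their transport support cell by cell.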

@Gozala

Gozala commented Mar 8, 2021

go-ipfs node

* runs on servers and user machines with the full set of capabilities

I think I would differentiate between capabilities and supported transports, where tcp, quic, ws, and webrtc are transports while DHT and pubsub are capabilities.

HTTP Gateway and IPFS HTTP API seem like services that a node may provide, just like relay or delegated routing.

  * tcp and quic transports enabled by default
  * /ws/ transport disabled by default
  * http gateway with subdomain support for origin isolation between content roots

js-ipfs node

* runs in the browser with a limited set of capabilities
  
  * can connect to server nodes (go/js-ipfs) only via secure websockets (/wss/ requires manual setup of TLS at the server)

This is generally true, but a little more nuanced. Host nodes (the nodes it's connecting to) must be open to web connections from that specific web origin. In practice, that implies that if the request is coming from an HTTPS origin, the host needs to have a wss address and a TLS certificate set up accordingly.

  * can connect to browser nodes via webrtc (with help of centralized ws-webrtc-star signaling service)
  * no http gateway (can't open TCP port)

This is correct, although technically we could have something like a gateway through service workers.

* runs on servers and user machines with (in theory) the full set of capabilities

Not sure if it's worth calling out Electron specifically here.

  * DHT not on par with go-ipfs (is this still the case?)

Yes, still the case.

  * http gateway present, but has no subdomain support

Do we have an issue on file? I did not realize subdomains were not implemented in js.

Preload node

* are go-ipfs nodes with their API ports exposed, some HTTP API commands accessible, and a [patch applied](https://github.com/ipfs/go-ipfs/tree/feat/stabilize-dht)

There is a bit of documentation on the preload service here:
https://github.com/protocol/bifrost-infra/blob/master/docs/preload.md

I think it would be really good to have more in-depth documentation of what it is. In practice it is:

  • An HTTP API endpoint exposing the /api/v0/refs functionality of the IPFS HTTP API.
  • It is assumed that the web node is connected to a node providing the preload service, which enables the service provider to obtain the underlying content from the web node. That is also why nodes providing the preload service are on the bootstrap list.
* used by js-ipfs nodes, both in browser and not

* js-ipfs nodes remain connected to the libp2p swarm ports of all preload nodes by having preload nodes on the bootstrap list

* when users want to make some UnixFS DAG publicly available they call `ipfs refs -r <CID>` on a randomly chosen preload node's HTTP API which puts the CID in the preload nodes' wantlist which then causes it to fetch the data from the user

* Other js-ipfs nodes requesting the content can then resolve it from the preload node via bitswap as the data is now present in the preload node's blockstore

* Only works with dag-pb CIDs because that's all the refs command understands
  
  * Q: What are the net effects of this? Bad perf or broken js-ipfs for non-dag-pb CIDs? Are there mitigations?
  * A: Harder to find non-dag-pb content - e.g. you need a connection to the publishing js-ipfs instance or it needs to be put on the DHT by a delegate node.  We could do this at the block level and use block stat in the same way as [js-delegate-content module](https://github.com/libp2p/js-libp2p-delegated-content-routing/blob/master/src/index.js#L127-L128)

* Preload nodes garbage collect every hour so preloaded content only survives for that long
  
  * Q: Is this configurable?
  * A: Yes? Infra would be able to tell you more

* TODO: Is there anything about pubsub topic bootstrapping here?

Relay node

* are go-ipfs nodes
  
  * Q: or are they custom go-libp2p nodes?

I'm not sure we have any such nodes deployed, but technically they don't need to be ipfs nodes. Again, I think it's better to think about it as a service a node provides (or a role a node assumes), which I think removes these questions.

* can also be js-libp2p nodes properly configured, or the out of the box [js relay](https://github.com/libp2p/js-libp2p-relay-server)

In fact, nodejs nodes could potentially enable connections between web and go nodes over WebRTC without having to, e.g., implement a WebRTC transport in go.

* are used by go-ipfs nodes to serve as relays/VPNs for nodes who deem themselves to be unreachable by the public internet
  
  * Q: Used by js-ipfs too?
  * A: Yes. They can also be used to overcome a lack of transport compatibility. For instance, a browser node with websockets/webRTC transports can talk with a go-ipfs node that only talks TCP, via a relay that supports both transports. This is not enabled by default and needs to be set up.

* not configurable in go-ipfs and uses a preset list of relays

Bootstrap node

* are go-ipfs nodes

Yes in practice, but I think we should aim for diversity there to improve resilience, and work towards a proposal to deploy nodejs IPFS bootstrap nodes (it's blocked on DHT though).

* used by go and js-ipfs nodes to enter the DHT

* if they go offline a go-ipfs node that restarts will not by default be able to join the public DHT
  
  * Q: SO MANY QUESTIONS... to start, do you mean if _all_ configured bootstrap nodes go offline this happens?

@vasco-santos would have an accurate answer here, but as far as I know that is currently true. When a node starts, it tries to establish connections with the nodes on its bootstrap list, from which it can discover other nodes, content, etc. If the node is unable to dial any of the bootstrap nodes, it won't be able to discover nodes or content over the internet, although it would still be able to do so on the local area network (go and nodejs only).

That said, there is (or was) an ongoing effort to improve the connection manager so that a node would be able to remember the addresses of nodes that could provide bootstrap service and dial them if the ones on the preconfigured list are unreachable.

* configurable in go and js-ipfs config files

Delegate routing node

* are go-ipfs nodes with their API ports exposed and some HTTP API commands accessible

This document provides a bit more detail on the specific HTTP endpoints:
https://github.com/libp2p/js-libp2p-delegated-content-routing

Nodes providing this service are go-ipfs, but that is because nodejs nodes do not have a usable DHT implementation. I think these comments provide a good overview of the assumptions at play:
https://github.com/libp2p/js-libp2p-delegated-content-routing/blob/99a64914cc5aa535852c67cf159134c347aaa547/src/index.js#L112-L115

* used by js-ipfs nodes to query the DHT and also publish content without having to actually run DHT logic on their own

* publishing works with arbitrary CID codecs as the [js-delegate-content module](https://github.com/libp2p/js-libp2p-delegated-content-routing/blob/master/src/index.js#L127-L128) publishes CIDs at the block level rather than the ipld/dag level

* Delegate nodes garbage collect every hour so provided content only survives for that long - unless the uploading js-ipfs node is still running, in which case it will issue periodic re-provides via the same publishing mechanism, which extends the life of the content on the DHT

@lidel
Member

lidel commented Mar 9, 2021

Not sure if it's worth calling out Electron specifically here.

I'd say yes, mentioning Electron is important, especially given that it can be an odd mix of the capabilities present in the web and node versions (depending on execution context).

  • http gateway present, but has no subdomain support

Do we have an issue on file? I did not realize subdomains were not implemented in js.

@Gozala AFAIK js-ipfs does not support the Host header at all (at least that was the case in the past): ipfs/js-ipfs#2248

  • if they go offline a go-ipfs node that restarts will not by default be able to join the public DHT
  • Q: SO MANY QUESTIONS... to start, do you mean if all configured bootstrap nodes go offline this happens?

[..] .. If node is unable to dial any of the bootstrap nodes it won't be able to discover nodes/content over the internet, although it would still be able to do on local area network (only go and nodejs).

@johnnymatthews FYSA, related issues are ipfs/kubo#3926, ipfs/kubo#3908, and ipfs/js-ipfs#1505, but I believe the best TL;DR is libp2p/go-libp2p-kad-dht#254.

@Annamarie2019
Contributor

I'll take this one (and also 245 and 334, to really get into nodes)

@Annamarie2019
Contributor

In progress....

@Annamarie2019
Contributor

@johnnymatthews Because this feature required me to update pages that I just updated for #245, I committed the changes to the same branch as 245. If I understand correctly, that doesn't require a new PR, because you can ostensibly review them together. Sorry, it's a big chunk at once. Let me know if I need to make changes to my understanding of the PR workflow.

@Annamarie2019
Contributor

Annamarie2019 commented Feb 18, 2022

@johnnymatthews I addressed the initial request in this issue by making sure that all node "types" have a concise entry in the glossary and are linked to a more detailed discussion on the nodes page.

Starting with Gozala's comments (node graph for transport services/roles vs. types), it seems to me that we should have a separate issue. I'll read about and consider Gozala's changes, but I'm hesitant to add any more to PR 1031, which is already chock full.

@Annamarie2019
Copy link
Contributor

This issue was resolved with PR #1031. I included these changed files in that PR before I knew how to properly separate them.

@johnnymatthews johnnymatthews moved this from Done to Archive of closed issues in Protocol Docs Apr 12, 2022

4 participants