Qri needs a long-term plan for operating on IPFS, libp2p protocols out-of-process #7196

Open
b5 opened this issue Apr 22, 2020 · 3 comments
Labels
kind/discussion Topical discussion; usually not changes to codebase kind/enhancement A net-new feature or improvement to an existing feature

Comments

@b5
Contributor

b5 commented Apr 22, 2020

Most of our users come from the world of data science, and don't (yet) have IPFS installed independently of our binary. Then @momack2 gave Qri Desktop a spin, and it turns out we don't play very nice with other IPFS installations 😞:

qri-io/qri#1296
qri-io/desktop#535

Right now Qri wants the repo lock so it can run go-ipfs in-process. Long term, we really shouldn't need to do that, and it should instead be possible to run IPFS somewhere else. Needing to choose between Desktop applications isn't ok.

That said, this problem is not small. There are two immediate blockers to us operating on IPFS out-of-process:

  1. our dsync algorithm needs to know whether blocks are local or not. I've filed #6726 (canonical method for checking locality of a block via coreAPI) for this one, but have no idea how this would work over a Unix socket/HTTP API.
  2. we register at least 3 custom libp2p protocols, and tune the connection manager to prioritize connections to qri peers.

On the upside, we do have cursory support for operating on IPFS over the HTTP API, and use this exact setup in our cloud backend services. But because of the two issues above, we've had to carefully carve up our backend architecture to work around these deficits, with some services operating on in-process repos and others operating over HTTP with a lot of configuration and reduced functionality.

Finally, performance is an obvious concern here. Much of Qri's performance depends on fast access to the local blockstore (SQL JOINs will access blocks once per row at the moment).

I'd love to get a discussion going to figure out what the right course of action is, so we can start working it into a good roadmap. I'm not sure what the right end state for a project like ours even is, given that there's no way to register a libp2p protocol without access to the host itself and, by extension, the repo lock (unless I'm missing something crucial; I'd love to be missing something crucial).

@b5 b5 added the kind/enhancement A net-new feature or improvement to an existing feature label Apr 22, 2020
@jbenet
Member

jbenet commented Apr 24, 2020

(quick, short note)

  • I think longer term path should be able to use/reuse nodes external to the Qri app.
  • I think short term path should just isolate installations:
    • move Qri's IPFS repo from ~/.ipfs to ~/.qri/ipfs
    • move to use different ports (qri probably should pick random ports when the default ones are taken -- go-ipfs is not the only thing that uses :8080, :4001, :5001)
    • running two ipfs nodes will be a bit more resource intensive, but lets us solve the immediate problem and deal with the resource utilization issue later on.
    • moving content between two local ipfs nodes should be pretty fast. there may be some content duplication, that we should explore how to avoid (shared blockstores?)
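
The "pick random ports when the default ones are taken" idea above can be sketched roughly as follows. This is a minimal Python illustration, not Qri or go-ipfs code: `port_is_free`, `pick_port`, and `listen_multiaddr` are hypothetical helper names, and binding to `127.0.0.1` is an assumption. Probing a port and binding it later is inherently racy, so a real node would want to bind once and keep the listener.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Address in use, or we lack permission for this port.
            return False
    return True


def pick_port(preferred: int, host: str = "127.0.0.1") -> int:
    """Keep the preferred port if it is free, otherwise let the OS pick one."""
    if port_is_free(preferred, host):
        return preferred
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as fallback:
        fallback.bind((host, 0))  # port 0 asks the OS for any free port
        return fallback.getsockname()[1]


def listen_multiaddr(port: int, host: str = "127.0.0.1") -> str:
    """Format a TCP listen address as an ip4 multiaddr string."""
    return f"/ip4/{host}/tcp/{port}"
```

For example, `listen_multiaddr(pick_port(5001))` would keep the default API port when nothing else holds it and fall back to an OS-assigned port otherwise.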

@hsanjuan
Contributor

1. our dsync algorithm needs to know if blocks are local or not. I've filed #6726 for this one, but have no idea how this would work over a Unix Socket/HTTP API.

I think POST /api/v0/block/stat?offline=true&arg=CID works for this.
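
A minimal sketch of how a client might use that endpoint as a locality check, written in Python with only the standard library. It assumes the daemon answers with an HTTP error status when the block is not in the local store, which is worth verifying against a real node. The names `block_stat_url` and `block_is_local`, the default API address, and the `opener` parameter (a seam for testing) are all hypothetical.

```python
import urllib.error
import urllib.parse
import urllib.request


def block_stat_url(api_base: str, cid: str) -> str:
    """Build the block/stat URL with the offline flag set."""
    query = urllib.parse.urlencode({"offline": "true", "arg": cid})
    return f"{api_base.rstrip('/')}/api/v0/block/stat?{query}"


def block_is_local(cid, api_base="http://127.0.0.1:5001", timeout=5.0,
                   opener=urllib.request.urlopen):
    """Return True if the daemon can stat the block without using the network."""
    request = urllib.request.Request(
        block_stat_url(api_base, cid), data=b"", method="POST"
    )
    try:
        with opener(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # Assumption: any HTTP error status here means "not in the local store".
        # A stricter client would inspect the status code and error body.
        return False
```

Connection failures (daemon not running, wrong address) are deliberately left to propagate so they are not mistaken for "block not local".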


Regarding re-using the datastore, we could have a go-datastore implementation-wrapper that either:

  1. Exposes the go-datastore API over Unix sockets (using grpc or something) when no one else is doing it
  2. Uses the already existing API when someone else is doing it.

This could allow multiple go-ipfs processes to re-use the same underlying datastore.
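
The role selection in that wrapper could look roughly like the Python sketch below. It only shows the "serve if nobody else is, otherwise attach" decision over a Unix socket; the datastore API that would travel over the socket (grpc or otherwise) is left out entirely. `attach_or_serve` and the socket path are hypothetical, and two processes starting at the same instant could still race, so a real implementation would need a lock around this step.

```python
import os
import socket


def attach_or_serve(socket_path: str):
    """Return ("client", sock) if a server is listening, else ("server", sock)."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(socket_path)
        return "client", client
    except (FileNotFoundError, ConnectionRefusedError):
        # No socket file, or a stale file with nobody listening behind it.
        client.close()

    # Nobody is serving: clear any stale socket file and take the server role.
    if os.path.exists(socket_path):
        os.unlink(socket_path)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(socket_path)
    server.listen()
    return "server", server
```

The first process to call `attach_or_serve` on a given path becomes the server; later callers get a connected client socket until that server goes away.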

@b5
Contributor Author

b5 commented Apr 28, 2020

I think longer term path should be able to use/reuse nodes external to the Qri app

That'll be a fun world to live in, and we'd be happy to help tire-kick. We're getting great mileage out of the core-libp2p event bus. Feels like a great place to start the conversation with small wins.

I think short term path should just isolate installations

Agree with all points here. We've got a repo migration ahead of us with the switch to v0.5.0, which makes this a good time to discuss moving the repo location with the Qri community. Seems this would address both issues. After a quick test I'm delighted to see /tcp/0 multiaddrs do random port selection within IPFS config.
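
As a rough illustration of the /tcp/0 approach, the Python sketch below rewrites the `Addresses` section of an IPFS config (loaded as a dict) so its listeners ask for OS-assigned ports. `use_random_ports` is a hypothetical helper, and applying /tcp/0 to the Swarm, API, and Gateway listeners alike is an assumption; the quick test mentioned above doesn't say which listeners were covered.

```python
import copy


def use_random_ports(config: dict) -> dict:
    """Return a copy of an IPFS config dict whose listeners use /tcp/0."""
    patched = copy.deepcopy(config)  # leave the caller's config untouched
    addresses = patched.setdefault("Addresses", {})
    addresses["Swarm"] = ["/ip4/0.0.0.0/tcp/0"]
    addresses["API"] = "/ip4/127.0.0.1/tcp/0"
    addresses["Gateway"] = "/ip4/127.0.0.1/tcp/0"
    return patched
```

With a random API port, whatever talks to the node then needs some way to learn which port was actually picked.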

@hsanjuan

I think POST /api/v0/block/stat?offline=true&arg=CID works for this.

You've just made my day 😄.

shared blockstores?

No rush here, given that Qri has opinions about DAG shape, which makes overlap with content created by other systems kinda trivial. On end-user machines the value isn't enough compared to the advantage of being able to blow away ~/.qri and have all associated data go with it, while leaving other IPFS-enabled things untouched. I'm sure some users will want to deduplicate; we can focus on getting there through qri-side configuration to use IPFS over the HTTP API.

Feels like the blockstore problem is well handled by the HTTP API for our use cases.

@BigLep BigLep added the kind/discussion Topical discussion; usually not changes to codebase label Mar 29, 2021

4 participants