RFP for cluster test environment #4

Closed
bigs opened this issue Jun 5, 2018 · 11 comments

Comments

@bigs

bigs commented Jun 5, 2018

Wanted to start a conversation to make sure @jbenet, @mgoelzer, and I are all on the same page (all encouraged to contribute!). It's still very early days for the specification/wish list definition.

Rough Wish List

  • Cluster management (perhaps w/ Kubernetes?)
  • Ability to launch many daemons (IPFS or otherwise) running various versions of software
  • Centralized telemetry collection
  • Declarative network topology definitions (e.g. A connects to B but not to C)
  • Log processing?
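
As a rough illustration of the declarative topology item, here is a minimal sketch in Python. The `Topology` class, the `SPEC` dictionary format, and the `allows` method are all hypothetical names invented for this example; nothing here is an existing libp2p or IPFS API. It only shows the idea of "A connects to B but not to C" expressed as data that a test harness could check.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Topology:
    """Hypothetical declarative description of which node pairs may connect."""

    nodes: set[str]
    links: set[frozenset[str]] = field(default_factory=set)

    @classmethod
    def from_spec(cls, spec: dict[str, list[str]]) -> Topology:
        """Build a topology from a mapping of node name -> allowed peers."""
        links: set[frozenset[str]] = set()
        for src, peers in spec.items():
            for dst in peers:
                if dst not in spec:
                    raise ValueError(f"unknown node {dst!r} referenced by {src!r}")
                # Links are treated as undirected in this sketch.
                links.add(frozenset((src, dst)))
        return cls(set(spec), links)

    def allows(self, a: str, b: str) -> bool:
        """Return True if the spec permits a connection between a and b."""
        return frozenset((a, b)) in self.links


# Assumed example spec: A connects to B, B connects to C, A never connects to C.
SPEC = {"A": ["B"], "B": ["C"], "C": []}
topology = Topology.from_spec(SPEC)
```

A harness could consult `topology.allows("A", "C")` before dialing, or use it afterwards to assert that no forbidden connection was observed.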

@jbenet would love some additional thoughts!

@whyrusleeping
Member

cc @frrist @phritz @laser as they have a keen interest in this too.

@ghost

ghost commented Jun 13, 2018

We should specify what we want to test on the cluster. I'm imagining a Kubernetes-based system that will spin up 1,000 (for example) libp2p processes[*], but then what does it do with them next? Ideas would be:

  • test the ability to transmit a predefined block of data between two random nodes
  • simulate network partitions and random failures (ChaosMonkey?) and test the same
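
To make the two ideas above concrete, here is a toy in-memory sketch in Python. It does not use libp2p or Kubernetes at all; `SimNetwork`, `transfer_test`, and the partition model (one isolated group versus everyone else) are assumptions made purely for illustration of what "success" could mean for a block transfer with and without a partition.

```python
import random


class SimNetwork:
    """Toy stand-in for a cluster: each node has a block store and a partition group."""

    def __init__(self, node_ids):
        self.stores = {n: {} for n in node_ids}  # per-node block store
        self.group = {n: 0 for n in node_ids}    # partition group per node

    def partition(self, isolated):
        """Cut the given nodes off from the rest of the network."""
        for n in isolated:
            self.group[n] = 1

    def heal(self):
        """Remove all partitions."""
        for n in self.group:
            self.group[n] = 0

    def send_block(self, src, dst, key, data):
        """Deliver a block unless src and dst are in different partitions."""
        if self.group[src] != self.group[dst]:
            return False
        self.stores[dst][key] = data
        return True


def transfer_test(net, rng, block=b"predefined block"):
    """Pick two distinct random nodes, send the block, verify it arrived intact."""
    src, dst = rng.sample(sorted(net.stores), 2)
    sent = net.send_block(src, dst, "blk", block)
    return src, dst, sent and net.stores[dst].get("blk") == block
```

In a real cluster the `send_block` call would be replaced by whatever transfer mechanism is under test, and the partition would be injected by the orchestration layer rather than by a flag.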

The author of the RFP can recommend a specific testing framework, but it's on us to say what we want tested and how we define "success."

[*] Is this RFP blocked on daemonization of libp2p?

@ghost

ghost commented Jun 13, 2018

We should incorporate a plan for testing js-libp2p. This could involve spinning up some VMs with Firefox in them, or it could be a test harness that wraps js-libp2p as a daemon, etc.

@ghost

ghost commented Jun 13, 2018

Ideally Rust would be tested also, but I don't see that as an MVP goal. Thoughts?

@bigs
Author

bigs commented Jun 13, 2018

For the JS side, we could probably get away with a headless browser. I'm kind of envisioning some sort of testing daemon that receives a protobuf message dictating the scenario it should carry out (e.g. attempt to connect to a peer at a given address), then executes it and logs the result.
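
A minimal sketch of that receive, execute, log loop, in Python. To keep it self-contained, JSON stands in for protobuf here, and the `Scenario`/`Result` fields, the `"connect"` action name, and `fake_connect` are all hypothetical; a real daemon would decode a generated protobuf type and dial through the implementation under test.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Scenario:
    action: str  # what to do, e.g. "connect" (hypothetical action name)
    target: str  # argument for the action, e.g. a peer address


@dataclass
class Result:
    action: str
    target: str
    success: bool
    detail: str


def decode(raw: bytes) -> Scenario:
    """Decode a scenario message (JSON here, as a stand-in for protobuf)."""
    fields = json.loads(raw)
    return Scenario(action=fields["action"], target=fields["target"])


def run_scenario(scenario: Scenario, handlers) -> Result:
    """Dispatch the scenario to its handler and capture the outcome."""
    handler = handlers.get(scenario.action)
    if handler is None:
        return Result(scenario.action, scenario.target, False, "unknown action")
    try:
        detail = handler(scenario.target)
        return Result(scenario.action, scenario.target, True, detail)
    except Exception as exc:  # any failure becomes a logged, failed result
        return Result(scenario.action, scenario.target, False, str(exc))


def fake_connect(addr: str) -> str:
    """Placeholder for a real dial; only checks the address prefix."""
    if not addr.startswith("/ip4/"):
        raise ValueError("unsupported address")
    return "connected"


HANDLERS = {"connect": fake_connect}

message = b'{"action": "connect", "target": "/ip4/127.0.0.1/tcp/4001"}'
result = run_scenario(decode(message), HANDLERS)
print(json.dumps(asdict(result)))  # one structured log line per scenario
```

Keeping the result as structured data makes it easy to ship to whatever centralized collection the cluster ends up using.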

@ghost

ghost commented Jun 15, 2018

@florianlenz Saw your note about a testing framework, from which @Stebalien linked to this issue.

Any interest in helping us design the testing framework? I think there are three parts:

  • figure out what we want to test, what metrics we'd collect
  • specify some general parameters about the testing infrastructure like how tests are written, how results are logged and viewed, what other tools are used (eg, Kubernetes)
  • PL and/or other interested organizations can provide grants to people who will build this system

There is opportunity to help in all three areas.

@florianlenz

florianlenz commented Jun 16, 2018

Hi.

> Any interest in helping us design the testing framework?

Yes!

> figure out what we want to test, what metrics we'd collect

That's a good question. I guess the standard things (bandwidth used, etc.), but also protocol-specific things. For example, I am implementing Quasar, which is a pub/sub protocol, and I would like to see how it behaves in a network of 30k, 60k, 90k people (and even more). I would also like to test how useful the message routing is if a portion of the network is not behaving as it's supposed to. Especially the ability to model a network that doesn't behave as it is supposed to is of value from my point of view.
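
One way such a metric could be prototyped before any cluster exists is a small simulation. The Python sketch below uses plain flooding over a random graph, which is not Quasar and is not how Quasar routes; `build_ring_with_shortcuts`, `delivery_ratio`, and the fault model (misbehaving nodes receive but never forward) are assumptions chosen only to show the shape of a "delivery ratio under misbehaving peers" measurement.

```python
import random
from collections import deque


def build_ring_with_shortcuts(n, extra, rng):
    """Connected test graph: a ring of n nodes plus `extra` random shortcut links."""
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        j = (i + 1) % n
        neighbors[i].add(j)
        neighbors[j].add(i)
    for _ in range(extra):
        a, b = rng.sample(range(n), 2)
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def delivery_ratio(neighbors, source, faulty):
    """Fraction of honest nodes (excluding the source) reached by flooding.

    Faulty nodes receive the message but never forward it.
    The source is assumed to be honest.
    """
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node in faulty and node != source:
            continue  # misbehaving node drops the message
        for peer in neighbors[node]:
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    honest = set(neighbors) - set(faulty) - {source}
    if not honest:
        return 1.0
    return len(seen & honest) / len(honest)
```

Sweeping the size of `faulty` and plotting the resulting ratio would give a first picture of how gracefully a routing scheme degrades; the real test environment would measure the same quantity from node logs.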

> specify some general parameters about the testing infrastructure like how tests are written, how results are logged and viewed, what other tools are used (eg, Kubernetes)

Kubernetes sounds good. I guess a visual result of the test would be very nice. I am not entirely sure what that would look like or how it could be built. I have to think about it.

> PL and/or other interested organizations can provide grants to people who will build this system

I could work on this 4-10 hours per week. I can also free up resources from the project I am working on to work on this as soon as we have the specs.

@frrist

frrist commented Jun 18, 2018

> Cluster management (perhaps w/ Kubernetes?)

A potential problem with Kubernetes is the lack of OSX support. Generally we want to test across all supported operating systems (Linux, Windows, and OSX), but Kubernetes will only work with Windows and Linux. I believe the same is true for Docker Swarm.
MacStadium could be a potential solution to the OSX problem, but would require something like IPTB or SaltStack to manage the instances. @whyrusleeping has written some notes on IPTB and what a test lab might look like with it.

> Centralized telemetry collection
> Log processing?

The OpenTracing API (specifically opentracing-go) was recently added to go-log. This allows all packages that use go-log (go-ipfs & go-libp2p-*) to produce tracing data that is compliant with the OpenTracing spec. The OpenTracing API also has a JavaScript implementation, meaning that if js-ipfs-* were instrumented with it, we could view traces of events that span both implementations.

@jsoares
Contributor

jsoares commented Feb 22, 2019

@bigs @mgoelzer Is this still an open issue / something that we're looking to do?

@bigs
Author

bigs commented Feb 22, 2019

@jsoares i'm actually working on this in-house with vasco and jacob, so i think we can close this perhaps

@jsoares
Contributor

jsoares commented Feb 22, 2019

@bigs Thought so, thanks for the feedback! We can always re-open at any point if you feel the need to outsource it.

@jsoares jsoares closed this as completed Feb 22, 2019