Sprint Prep Call Notes #5

whyrusleeping · 2017-02-24T20:13:07Z

InterPlanetary Test Lab Planning Call

Participants

Goal

Jenkins works reliably for standard go-ipfs CI
Infrastructure + code in place for large scale network tests
- need to be able to run tests on many machines
Kubernetes deployed and able to run tests based on commits
- Jenkins able to build docker images for go-ipfs on every commit
- Per commit builds pushed somewhere accessible (and garbage collected?)

Actionables:
1. Jenkins working as day to day go-ipfs CI
2. System to build docker images for each commit
3. Make it easy to trigger a run of kubernetes tests given a commit hash
4. Use google cloud as initial deployment pool for kubernetes nodes
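
A minimal sketch of how actionables 2 and 3 could meet, assuming each per-commit docker image is tagged with the full commit hash so a test run can find it from the hash alone. The registry host and the `go-ipfs:<commit>` tag layout are hypothetical, not something decided on the call.

```go
package main

import (
	"fmt"
	"regexp"
)

// commitRe matches a full 40-character lowercase hexadecimal Git commit hash.
var commitRe = regexp.MustCompile(`^[0-9a-f]{40}$`)

// imageRef returns the docker image reference for a commit.
// The "<registry>/go-ipfs:<commit>" layout is an assumption for illustration.
func imageRef(registry, commit string) (string, error) {
	if !commitRe.MatchString(commit) {
		return "", fmt.Errorf("not a full commit hash: %q", commit)
	}
	return fmt.Sprintf("%s/go-ipfs:%s", registry, commit), nil
}

func main() {
	// registry.example.com is a placeholder host.
	ref, err := imageRef("registry.example.com/ipfs", "0123456789abcdef0123456789abcdef01234567")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ref)
}
```

Rejecting abbreviated hashes keeps the mapping from commit to image unambiguous, which also makes garbage collection of old images easier to reason about.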

jbenet · 2017-02-27T12:32:16Z

Re: the Planning Call

Hey Everyone,

I'm sorry for missing the planning call. I should've been there. I've been overwhelmed with other work and failed to wake up -- I'm sorry. That said, I don't think my absence is the reason for the issues here. Faults like that will happen, and we need to be fault-tolerant and proceed through them.

The bigger issue here -- and why the call went the way it went -- is that, collectively, we had dropped this thing, and we don't really know what we're doing. We have not taken the time to craft the product, nor to synthesize all the available comments into a coherent product (Product Development).

The PMing role -- which often involves that Product Development, or at least surfacing and synthesizing others' work -- fell to @whyrusleeping late in the game, who has also been extremely busy with other things and has not had the time required to do it well. The result is that the call ended up fishing for vaguely related things to do, and the action plan is very far away from the original goals set for this sprint in early January. We also need @whyrusleeping tech-leading here, which is a hard and time-consuming role. I voiced this a few days ago: I doubt that Tech-leading, being one of the lead Implementors, Project Managing, AND Product Developing is feasible for one person if we want this sprint to be successful. As amazing as @whyrusleeping is, there are just not enough hours in a day. (thoughts @whyrusleeping?)

Product Development

As the person who called this sprint into being, I was the de facto product owner. But this was not quite explicit, and I did not have the time to do the product development necessary. Regardless, I should've either found someone else or done it myself. So I dropped that ball, which I wasn't even aware I was dropping until a couple of days ago.

Here are a bunch of threads relevant to this product (thanks to everyone who surfaced these, or contributed thoughts):

Leveling up testing infrastructure ipfs/notes#202 -- the issue that surfaced the need for the test lab
Real world IPFS scenarios we want to test ipfs/notes#211 -- a set of tests we want to run
Interplanetary Test Lab ipfs/notes#191 -- a description of a tool to help test ipfs
Create a Roadmap with Steps for Building the Interplanetary Test Lab #1 (comment) -- a description of user desires / user stories
Create a Roadmap with Steps for Building the Interplanetary Test Lab #1 (comment) -- same as above
Investigate golang/build ipfs/infra#99
CI Infrastructure ipfs/infra#100
Inspiration: https://build.golang.org/

And then there is the vision & goals for the "InterPlanetary Lab" (what this sprint is here for), which I've discussed with many people in this sprint, but which has not yet been written up. (For those familiar with Planet Lab, imagine: Planet Lab for IPFS, with all sorts of hardware around the world.)

The goals I had for this sprint were:

Make the first working version of IPLab or IPTL (let's go with IPTL henceforth...)
This would involve:
- a cluster / process orchestrator that can deal with hundreds of machines coming and going
- a container (or similar isolation + deployment tech) engine to deploy arbitrary programs
- a volunteer hardware base (like https://build.golang.org)
  - yes, we can surely involve cloud machines and simulated networks too
- a way to issue "interesting workload" tests to the network:
  - that get run, parameters then get permuted
  - and when they finish, all the data is gathered and returned to me automatically.
  - lots of metrics, viz, and access to logs would tell us all the things we need to fix
Make dashboards / UIs that let us interact with the lab to make the most of it
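
One way to read "parameters then get permuted" is as a Cartesian product over the values each parameter may take, with one test run per combination. The sketch below shows that reading only; the parameter names (`impl`, `nodes`) and the `permute` helper are hypothetical, not part of any agreed design.

```go
package main

import (
	"fmt"
	"sort"
)

// permute returns every combination of the given parameter values
// (the Cartesian product). Keys are visited in sorted order so the
// order of the result is deterministic.
func permute(params map[string][]string) []map[string]string {
	keys := make([]string, 0, len(params))
	for k := range params {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	// Start with a single empty combination and extend it one key at a time.
	combos := []map[string]string{{}}
	for _, k := range keys {
		var next []map[string]string
		for _, base := range combos {
			for _, v := range params[k] {
				c := make(map[string]string, len(base)+1)
				for bk, bv := range base {
					c[bk] = bv
				}
				c[k] = v
				next = append(next, c)
			}
		}
		combos = next
	}
	return combos
}

func main() {
	// Hypothetical parameters for one workload.
	runs := permute(map[string][]string{
		"impl":  {"go-ipfs", "js-ipfs"},
		"nodes": {"10", "100"},
	})
	for _, r := range runs {
		fmt.Println(r)
	}
}
```

The number of runs grows multiplicatively with each parameter, so a real lab would likely need a way to cap or sample the combinations.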

Product Development, Actually

So I went ahead and finally did some of the product development required for this sprint. I made a design-doc-like thing, and a video discussing it. It's here:

https://www.youtube.com/watch?v=giQfhypeo7g <--- watch at 1.5x -- it's 1hr. and I was rambly
Slides also here: https://ipfs.io/ipfs/QmWHHppDw9jLcdr4T1jEfH91zt8QjUxS7EoviKe7HAihvK/

This is not everything we need, of course. We still need tighter connection to user stories. Still, I would like people in this sprint to:

watch this and consider what building this would take.
think whether this would satisfy the user stories you've been thinking about
think about the roadmap at the end of the video
join a discussion today (as the last call of the day) to figure this out. (planning redux, sorry)

IF we decide to go for it, we can:

flesh out what's on that video into github issues / user stories
figure out what we can accomplish in 2 weeks (I think we can get to the "usable" milestone at least)
allocate our teams such that we can parallelize a lot of those (it divides well):
- LabManager (Kubernetes, infra, some go work) -- @lgierth @FrankPetrilli @victorbjelkholm
- Lablet/Miner (Kubernetes client, golang, making VMs) -- @whyrusleeping @FrankPetrilli @lgierth @Kubuxu
- Client (issuing jobs, format and tooling work) -- @victorbjelkholm @jbenet @whyrusleeping @Kubuxu
- Dashboards (frontend UI on top of simple APIs) -- @haadcode @victorbjelkholm @dignifiedquire
- Documentation (writing guides for users) -- @flyingzumwalt @SidHarder leading @everyone ?

Other options

Taking stock of our options, two others appear:

(1) Finish the CI/CD work. This work is vaguely related, so it's already invading this sprint. To be clear, if we do this, this IS NOT the test-lab. What's written here so far does not satisfy the original goals for the test lab: Sprint Prep Call Notes #5 (comment)
(2) We could punt the test-lab to Q2 and work closely with OpenBazaar to build a number of improvements we and they want. (e.g. IPNS improvements, pubsub routing -- also great for all the VR use cases, libp2p improvements, relay, teach bitswap about graphs, etc...)

I may be OK with doing (2) if we decide not to do the proper test lab.

But I think we should go for the test lab. It's incredibly important for the continued success of everything we're doing, and it's going to level up our tech dramatically. If we do it, I can commit to a daily standup (except Thursday) and to guiding the Product Development, if @whyrusleeping can handle a big part of the Lablet implementation & Tech Leading (making sure everyone can do what they need to do, and guiding people through the implementation). Someone else (@victorbjelkholm maybe, since we had discussed it as a possibility?) could handle the Project Management.

jbenet · 2017-02-28T10:04:20Z

Own Notes

Orchestration of processes

Setup & Deploy a Kubernetes master to orchestrate the test lab.
Setup & Deploy 2 Kubelets on the cloud that speak to the running master.
Setup & Deploy 2 Kubelets on our own machines that speak to the running master.
Create a lablet VM we can use to image cloud machines
Create a lablet VM we can use to run on our machines (ideally the same VM)
Test: issue a test to all the kubelets in the network
- example: downloads go-ipfs, runs it, adds something, and then counts the peers.
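
As a rough sketch of the last step of that example test ("counts the peers"), the helper below counts peers from the text printed by `ipfs swarm peers`, assuming one peer address per line. Downloading and starting go-ipfs is left out; in a real job the text would come from running the CLI on each kubelet. The addresses and peer IDs in the sample are placeholders.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countPeers counts the non-empty lines in the output of `ipfs swarm peers`,
// assuming the command prints one peer address per line.
func countPeers(output string) int {
	n := 0
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		if strings.TrimSpace(sc.Text()) != "" {
			n++
		}
	}
	return n
}

func main() {
	// Placeholder output; QmPeerA and QmPeerB are not real peer IDs.
	sample := "/ip4/10.0.0.1/tcp/4001/ipfs/QmPeerA\n/ip4/10.0.0.2/tcp/4001/ipfs/QmPeerB\n"
	fmt.Println("peers:", countPeers(sample))
}
```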

Experiment with testing

Create 5 different network tests:
- all go-ipfs, add + cat 1000 files
- all js-ipfs, add + cat 1000 files
- half go-ipfs and half js-ipfs, add + cat 1000 files
- orbit nodes chatting (all on go-ipfs)
- orbit nodes chatting + web (on go-ipfs, js-ipfs and js-ipfs-browser)
Revive the tests from data.gov, and run them on the cluster
Make some test that produces static grafana output
Make some test that produces some simple trace, gather it afterward from all the nodes

Testing Setup -- Job bundle and Results bundle

Spec out a format for the "job bundle"
Spec out a format for the "results bundle"
Implement Job Bundle
Implement Results Bundle
Choose & Implement network test config DSL
CLI: Dev should be able to start a job from CLI with 1 command
WEB: Dev should be able to start a job from WEB with 1 pageload + 1 click
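
To make the "job bundle" and "results bundle" items more concrete, here is one possible shape for them, sketched as JSON-serializable Go structs. Nothing here is specified yet: every type, field name, and helper below is hypothetical and only illustrates the kind of information each bundle might carry.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobBundle is a hypothetical description of one test job sent to the lab.
type JobBundle struct {
	Name   string            `json:"name"`   // human-readable job name
	Image  string            `json:"image"`  // container image to run on each lablet
	Nodes  int               `json:"nodes"`  // number of lablets to use
	Params map[string]string `json:"params"` // test parameters for this run
}

// ResultsBundle is a hypothetical container for what a finished job returns.
type ResultsBundle struct {
	Job     string             `json:"job"`     // name of the job that produced it
	Passed  bool               `json:"passed"`  // overall outcome
	Metrics map[string]float64 `json:"metrics"` // gathered metrics
	Logs    []string           `json:"logs"`    // references to collected logs
}

// encodeJob serializes a job bundle to indented JSON.
func encodeJob(job JobBundle) ([]byte, error) {
	return json.MarshalIndent(job, "", "  ")
}

// decodeJob parses a job bundle from JSON.
func decodeJob(data []byte) (JobBundle, error) {
	var job JobBundle
	err := json.Unmarshal(data, &job)
	return job, err
}

func main() {
	job := JobBundle{
		Name:   "add-cat-1000",
		Image:  "go-ipfs:example", // placeholder image name
		Nodes:  4,
		Params: map[string]string{"files": "1000"},
	}
	data, err := encodeJob(job)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(data))
}
```

A plain serializable format like this would let the CLI and the web UI submit the same bundle, which fits the "1 command" and "1 click" goals above.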

jbenet · 2017-02-28T10:09:17Z

Documentation

Create a guide that lets someone deploy the Kubernetes master with our config. (ideally this is very short)
Create a guide that lets someone set up a lablet (VM and native, covered separately)
Create a guide for issuing jobs to the lab, and other random things (CLIs).
Create a guide for writing good tests for the lab

SidHarder · 2017-02-28T15:58:31Z

@jbenet I have copied relevant parts of your notes to a sprint objective document here: https://github.com/ipfs/test-lab/blob/master/sprint-objectives-2-27-2017.md.

Tentatively, I think we could create the following issues to be completed during this sprint:

Setting up the Kubernetes Master.
Setting up the VM.
Write Specifications for Job Bundles
Implement Job Bundles
Choose and implement the test config DSL
Documentation guides
Experimentation with testing as stated above.

I believe these could happen to some degree in parallel. The actual deliverable seems a bit vague, however.

ghost added the in progress label Mar 1, 2017

brainstorm mentioned this issue Mar 10, 2017

Add docker and dist.ipfs.io install methods to README ipfs-cluster/ipfs-cluster#61

Merged

jbenet closed this as completed Jan 3, 2018

ghost removed the in progress label Jan 3, 2018
