This repository has been archived by the owner on Apr 29, 2020. It is now read-only.

Sprint Prep Call Notes #5

Closed
whyrusleeping opened this issue Feb 24, 2017 · 4 comments

@whyrusleeping

InterPlanetary Test Lab Planning Call

Video recording of the call

Participants

Goal

  • Jenkins works reliably for standard go-ipfs CI

  • Infrastructure + code in place for large scale network tests

    • need to be able to run tests on many machines
  • Kubernetes deployed and able to run tests based on commits

    • Jenkins able to build docker images for go-ipfs on every commit
    • Per commit builds pushed somewhere accessible (and garbage collected?)

Actionables:
1. Jenkins working as day to day go-ipfs CI
2. System to build docker images for each commit
3. Make it easy to trigger run of kubernetes tests given a commit hash
4. Use google cloud as initial deployment pool for kubernetes nodes
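
Actionables 2 and 3 could be sketched roughly as below: a per-commit image tag plus a Kubernetes `batch/v1` Job that runs a command in that image. This is only an illustration under assumptions. The registry name `registry.example.com/go-ipfs`, the `ipfs-test-` job prefix, and both helper functions are hypothetical and not something decided on the call.

```python
from __future__ import annotations

import json


def image_tag(commit: str, registry: str = "registry.example.com/go-ipfs") -> str:
    """Return a per-commit image reference (the registry name is hypothetical)."""
    if len(commit) < 7 or any(c not in "0123456789abcdef" for c in commit):
        raise ValueError(f"not a hex commit hash: {commit!r}")
    return f"{registry}:{commit[:12]}"


def job_manifest(commit: str, command: list[str]) -> dict:
    """Build a Kubernetes batch/v1 Job that runs `command` in the commit's image."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"ipfs-test-{commit[:12]}"},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed test run
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "test",
                            "image": image_tag(commit),
                            "command": command,
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # The JSON output could be piped to `kubectl apply -f -`.
    print(json.dumps(job_manifest("3f2a9c1d7e5b4a6f", ["ipfs", "version"]), indent=2))
```

Keying both the image tag and the job name on the commit hash is what would make "trigger a run given a commit hash" a single lookup.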

@jbenet
Contributor

jbenet commented Feb 27, 2017

Re: the Planning Call

Hey Everyone,

I'm sorry for missing the planning call. I should've been there. I've been overwhelmed with other work and failed to wake up; I'm sorry. That said, I don't think the lack of my presence is the reason for the issues here. Faults like that will happen and we need to be fault-tolerant and proceed through them.

The bigger issue here -- and why the call went the way it went -- is that, collectively, we had dropped this thing, and we don't really know what we're doing. We have not taken the time to craft the product, nor to synthesize all the available comments into a coherent product (Product Development).

The PMing role -- which often involves that Product Development or at least surfacing and synthesizing others' work -- fell to @whyrusleeping late in the game, who has also been extremely busy with other things and has not had the time required to do it well. The result is that the call ended up fishing for vaguely related things to do, and the action plan is very far away from the original goals set for this sprint in early January. We also need @whyrusleeping tech-leading here, which is a hard and time-consuming role. I voiced this a few days ago: I doubt that Tech-leading, being one of the lead Implementors, Project Managing, AND Product Developing is feasible for one person if we want this sprint to be successful. As amazing as @whyrusleeping is, there are just not enough hours in a day. (thoughts @whyrusleeping?)

Product Development

As the person who called this sprint into being, I was the de facto product owner. But this was not quite explicit, and I did not have the time to do the product development necessary. Regardless, I should've either found someone else or done it myself. So I dropped that ball, which I wasn't even aware I was dropping until a couple of days ago.

Here are a bunch of threads relevant to this product (thanks to everyone who surfaced these, or contributed thoughts):

And then there are the vision & goals for the "InterPlanetary Lab" (what this sprint is here for), which I've discussed with many people in this sprint, but which have not yet been written up. (For those familiar with Planet Lab, imagine: Planet Lab for IPFS, with all sorts of hardware around the world.)

The goals I had for this sprint were:

  • Make the first working version of IPLab or IPTL (let's go with IPTL henceforth...)
  • This would involve:
    • a cluster / process orchestrator that can deal with hundreds of machines coming and going
    • a container (or similar isolation + deployment tech) engine to deploy arbitrary programs
    • a volunteer hardware base (like https://build.golang.org)
      • yes, we can surely involve cloud machines and simulated networks too
    • a way to issue "interesting workload" tests to the network:
      • that get run, parameters then get permuted
      • and when they finish, all the data is gathered and returned to me automatically.
      • lots of metrics, viz, and access to logs would tell us all the things we need to fix
  • Make dashboards / UIs that let us interact with the lab to make the most of it
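
The "parameters then get permuted" idea above can be illustrated with a small sketch that expands a parameter grid into one concrete run per combination. The parameter names (`nodes`, `file_size_kb`, `impl`) and the `permute` helper are hypothetical, chosen only for illustration.

```python
from itertools import product


def permute(params: dict) -> list:
    """Expand {name: [values...]} into a list of one dict per combination."""
    keys = sorted(params)  # fixed key order keeps the run order reproducible
    return [dict(zip(keys, combo)) for combo in product(*(params[k] for k in keys))]


# Hypothetical workload parameters for a single test.
grid = {
    "nodes": [10, 100],
    "file_size_kb": [4, 1024],
    "impl": ["go-ipfs", "js-ipfs"],
}

runs = permute(grid)

if __name__ == "__main__":
    for run in runs:
        print(run)
```

Each resulting dict would describe one run of the same workload, so the results gathered afterwards can be compared along any single parameter.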

Product Development, Actually

So I went ahead and finally did some of the product development required for this sprint. I made a design-doc-like thing, and a video discussing it. It's here:

This is not everything we need, of course. We still need tighter connection to user stories. Still, I would like people in this sprint to:

  • watch this and consider what building this would take.
  • think whether this would satisfy the user stories you've been thinking about
  • think about the roadmap at the end of the video
  • join a discussion today (as the last call of the day) to figure this out. (planning redux, sorry)

IF we decide to go for it, we can:

Other options

Taking stock of our options, two others appear:

  • (1) Finish the CI/CD work. This work is vaguely related, so it's already invading this sprint. To be clear, if we do this, this IS NOT the test-lab. What's written here so far does not satisfy the original goals for the test lab: Sprint Prep Call Notes #5 (comment)
  • (2) We could punt the test-lab to Q2 and work closely with OpenBazaar to build a number of improvements we and they want. (eg IPNS improvements, pubsub routing -- also great for all the VR use cases, libp2p improvements, relay, teach bitswap about graphs, etc...)

I may be ok to do (2) if we decide not to do the proper test lab.

But I think we should go for the test lab. It's incredibly important for the continued success of everything we're doing, and it's going to level up our tech dramatically. If we do it, I can commit to a daily standup (except Thursday) and to guiding the Product Development, provided @whyrusleeping can handle a big part of the Lablet implementation & Tech Leading (making sure everyone can do what they need to do, and guiding people through the implementation). Someone else (@victorbjelkholm maybe, since we had discussed it as a possibility?) could handle the Project Management.

@jbenet
Contributor

jbenet commented Feb 28, 2017

Own Notes

Orchestration of processes

  • Setup & Deploy a Kubernetes master to orchestrate the test lab.
  • Setup & Deploy 2 Kubelets on the cloud that speak to the running master.
  • Setup & Deploy 2 Kubelets on our own machines that speak to the running master.
  • Create a lablet VM we can use to image cloud machines
  • Create a lablet VM we can use to run on our machines (ideally the same VM)
  • Test: issue a test to all the kubelets in the network
    • example: downloads go-ipfs, runs it, adds something, and then counts the peers.
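
The example test above (run go-ipfs, add something, count the peers) might look roughly like this on each node. The step list uses standard `ipfs` CLI subcommands, but the file name, the sample output, and the `count_peers` helper are assumptions made for illustration. The sketch also assumes `ipfs swarm peers` prints one multiaddr per line.

```python
# Commands a node would run, in order. `ipfs daemon` has to be started in the
# background before `ipfs swarm peers` can return anything.
SMOKE_TEST_STEPS = [
    ["ipfs", "init"],
    ["ipfs", "daemon"],            # run in the background
    ["ipfs", "add", "hello.txt"],  # hypothetical test file
    ["ipfs", "swarm", "peers"],
]


def count_peers(swarm_peers_output: str) -> int:
    """Count connected peers from `ipfs swarm peers` output.

    Assumes one multiaddr per line; blank lines and anything not starting
    with "/" are ignored.
    """
    return sum(
        1
        for line in swarm_peers_output.splitlines()
        if line.strip().startswith("/")
    )


if __name__ == "__main__":
    # Placeholder output; the peer IDs are made up.
    sample = (
        "/ip4/10.0.0.2/tcp/4001/ipfs/QmPeerA\n"
        "/ip4/10.0.0.3/tcp/4001/ipfs/QmPeerB\n"
    )
    print(count_peers(sample))
```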

Experiment with testing

  • Create 5 different network tests:
    • all go-ipfs, add + cat 1000 files
    • all js-ipfs, add + cat 1000 files
    • half go-ipfs and half js-ipfs, add + cat 1000 files
    • orbit nodes chatting (all on go-ipfs)
    • orbit nodes chatting + web (on go-ipfs, js-ipfs and js-ipfs-browser)
  • Revive the tests from data.gov, and run them on the cluster
  • Make some test that produces static grafana output
  • Make some test that produces some simple trace, gather it afterward from all the nodes
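
As a rough sketch of how the first three network tests could share one harness, the snippet below assigns an implementation to each node from a per-scenario list. The scenario names, node names, and the round-robin rule are assumptions, not something settled in these notes.

```python
from collections import Counter

# Hypothetical scenario table: scenario name -> implementations to spread over nodes.
SCENARIOS = {
    "all-go": ["go-ipfs"],
    "all-js": ["js-ipfs"],
    "half-go-half-js": ["go-ipfs", "js-ipfs"],
}


def assign(nodes: list, impls: list) -> dict:
    """Assign implementations to nodes round-robin, so mixes stay balanced."""
    if not impls:
        raise ValueError("at least one implementation is required")
    return {node: impls[i % len(impls)] for i, node in enumerate(nodes)}


if __name__ == "__main__":
    nodes = [f"node-{i}" for i in range(6)]
    plan = assign(nodes, SCENARIOS["half-go-half-js"])
    print(Counter(plan.values()))
```

With an even number of nodes, the round-robin rule gives exactly half go-ipfs and half js-ipfs; with an odd number, the first implementation in the list gets the extra node.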

Testing Setup -- Job bundle and Results bundle

  • Spec out a format for the "job bundle"
  • Spec out a format for the "results bundle"
  • Implement Job Bundle
  • Implement Results Bundle
  • Choose & Implement network test config DSL
  • CLI: Dev should be able to start a job from CLI with 1 command
  • WEB: Dev should be able to start a job from WEB with 1 pageload + 1 click
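
Since neither bundle format has been specced yet, the following is only a strawman to make the discussion concrete. Every field name here (`commit`, `image`, `command`, `params`, `exit_codes`, `logs`) is hypothetical. It shows a job bundle that round-trips through JSON and a results bundle that reports overall success.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class JobBundle:
    """What a dev submits: which build to run, what to run, and with which parameters."""
    commit: str
    image: str
    command: list
    params: dict = field(default_factory=dict)

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "JobBundle":
        return cls(**json.loads(text))


@dataclass
class ResultsBundle:
    """What comes back: the job itself plus per-node exit codes and logs."""
    job: JobBundle
    exit_codes: dict = field(default_factory=dict)  # node name -> exit code
    logs: dict = field(default_factory=dict)        # node name -> log text

    def succeeded(self) -> bool:
        # A run with no reporting nodes is not counted as a success.
        return bool(self.exit_codes) and all(c == 0 for c in self.exit_codes.values())


if __name__ == "__main__":
    job = JobBundle("3f2a9c1d7e5b", "go-ipfs:3f2a9c1d7e5b", ["ipfs", "version"], {"nodes": 4})
    assert JobBundle.from_json(job.to_json()) == job
    print(ResultsBundle(job, {"n0": 0, "n1": 0}).succeeded())
```

Embedding the job in the results bundle keeps each result self-describing, which would help when comparing runs across commits.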

@jbenet
Contributor

jbenet commented Feb 28, 2017

Documentation

  • Create a guide that lets someone deploy the Kubernetes master with our config. (ideally this is very short)
  • Create a guide that lets someone setup a lablet (VM and native, covered separately)
  • Create a guide for issuing jobs to the lab, and other random things (CLIs).
  • Create a guide for writing good tests for the lab

@SidHarder
Collaborator

SidHarder commented Feb 28, 2017

@jbenet I have copied relevant parts of your notes to a sprint objective document here: https://github.com/ipfs/test-lab/blob/master/sprint-objectives-2-27-2017.md.

Tentatively, I think we could create the following issues to be completed during this sprint:

  1. Setting up the Kubernetes Master.
  2. Setting up the VM.
  3. Write Specifications for Job Bundles
  4. Implement Job Bundles
  5. Choose and implement the test config DSL
  6. Documentation guides
  7. Experimentation with testing as stated above.

I believe these could happen to some degree in parallel. The actual deliverable still seems a bit vague, however.
