Testing Improvement Plan #6483

Open · 4 of 43 tasks
Stebalien opened this issue Jul 3, 2019 · 8 comments
Labels: topic/meta

Comments

@Stebalien (Member) commented Jul 3, 2019:

  • CLI/API
    • Coverage
    • Language Interop
    • Backwards Compatibility
      • Run sharness tests with the previous go-ipfs binary as a client. Need to mark the required client version on each sharness test.
      • Interop tests: These may be sufficient.
  • Performance
  • Reliability/Stability
    • Nightly Mirror Gateway: All requests to a specific gateway get mirrored to this gateway with nginx.
      • Track and compare errors/latencies versus the gateway serving the response.
        • Auto-file Issues.
      • Auto deploy a nightly build.
      • Auto-file issues on crash.
      • Automatically pull goroutine dumps, cpu profiles, and memory profiles (see the profile-pulling sketch after this list).
      • Automatically analyze with stackparse and auto-file issues when a specific goroutine count increases by 100x (usually indicates an issue). Note: this'll have a bunch of false positives but it's a good signal.
      • Same with the heap: File an issue if the "top memory user in a steady state" changes.
      • Same with the CPU profile.
    • Nightly Bootstrapper
      • Auto deploy a nightly build.
      • Auto-file issues on crash.
      • Automatically pull goroutine dumps, cpu profiles, and memory profiles.
      • Automatically analyze with stackparse and auto-file issues when a specific goroutine count increases by 100x (usually indicates an issue). Note: this'll have a bunch of false positives but it's a good signal.
      • Same with the heap: File an issue if the "top memory user in a steady state" changes.
      • Same with the CPU profile.
    • Network Simulation: tests run on a simulated network so we can predict how our changes will affect the network. We should be able to run each test with X% of the network running the current release and the rest running the old release.
      • DHT Tests - run the test node as both a client and a server.
        • FindPeer
        • Provide/FindProviders
        • IPNS
      • End-to-end Bitswap (including finding providers, multiple providers, etc.).
    • Network Canary - Continuous background tests against the main network.
      • DHT Tests - run the test node as both a client and a server.
        • FindPeer
        • Provide/FindProviders
        • IPNS
      • End-to-end Bitswap (including finding providers, multiple providers, etc.).
      • Gateway Test - Add a file to the test node and fetch it on the ipfs.io gateway, measuring latency (a rough sketch follows this list).
    • Release Candidates
      • Opt-out telemetry is built into RC binaries to collect network operation and performance data (required to be able to confirm, in the production network, any hypotheses formed from testlab tests).
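
For the "automatically pull goroutine dumps, cpu profiles, and memory profiles" items above, here is a minimal sketch, assuming the nightly node exposes Go's standard net/http/pprof handlers on its API address (127.0.0.1:5001 is an assumption about the deployment):

```go
// Sketch: pull a goroutine dump, a heap profile, and a 30s CPU profile from
// a running node, assuming the standard net/http/pprof handlers are served
// under /debug/pprof/ on the (assumed) API address below.
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

const apiAddr = "http://127.0.0.1:5001" // assumed daemon API address

func pull(path, out string) error {
	resp, err := http.Get(apiAddr + path)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	f, err := os.Create(out)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	stamp := time.Now().Format("20060102-150405")
	// Full goroutine stacks (text), heap profile, and a 30-second CPU profile.
	targets := map[string]string{
		"/debug/pprof/goroutine?debug=2":  "goroutines-" + stamp + ".txt",
		"/debug/pprof/heap":               "heap-" + stamp + ".pb.gz",
		"/debug/pprof/profile?seconds=30": "cpu-" + stamp + ".pb.gz",
	}
	for path, out := range targets {
		if err := pull(path, out); err != nil {
			fmt.Fprintf(os.Stderr, "pull %s: %v\n", path, err)
		}
	}
}
```

The saved dumps are what the stackparse/heap/CPU comparisons above would consume.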
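
For the canary's Gateway Test, a rough sketch of the add-then-fetch latency measurement; `ipfs add -Q` on the test node and `https://ipfs.io/ipfs/<cid>` as the fetch URL are assumptions about how the harness is wired:

```go
// Rough sketch of the gateway canary: add a unique payload on the test node,
// then time how long the public gateway takes to serve it back.
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"os/exec"
	"strings"
	"time"
)

func main() {
	// Unique content each run so the gateway can't serve it from cache.
	tmp, err := os.CreateTemp("", "canary-*.txt")
	if err != nil {
		panic(err)
	}
	defer os.Remove(tmp.Name())
	fmt.Fprintf(tmp, "gateway canary %d\n", time.Now().UnixNano())
	tmp.Close()

	// -Q prints only the final CID (assumed available on the test node's CLI).
	out, err := exec.Command("ipfs", "add", "-Q", tmp.Name()).Output()
	if err != nil {
		panic(err)
	}
	cid := strings.TrimSpace(string(out))

	start := time.Now()
	resp, err := http.Get("https://ipfs.io/ipfs/" + cid)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body) // drain so the latency covers the full transfer
	fmt.Printf("fetched %s: status=%d latency=%s\n", cid, resp.StatusCode, time.Since(start))
}
```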

In both network tests, we should track and compare metrics on:

  • Number of dials.
  • Total network bandwidth.
  • Latencies/times for each step.
  • CPU usage.
  • Memory usage.
  • Goroutine counts.
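
A minimal sketch of sampling two of these metrics (goroutine count and memory usage), assuming the node under test runs in-process with the harness; dial counts and bandwidth would come from libp2p's bandwidth/metrics reporting, which isn't shown here:

```go
// Periodically sample goroutine count and heap usage from inside the harness.
package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		var m runtime.MemStats
		runtime.ReadMemStats(&m)
		fmt.Printf("goroutines=%d heap_alloc=%d bytes\n",
			runtime.NumGoroutine(), m.HeapAlloc)
	}
}
```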

The Plan: https://docs.google.com/spreadsheets/d/1xyqyGUF-oe3x9ln88YonVeOMWWdknik74lVgL_3dBY8

@Stebalien added the topic/meta label on Jul 3, 2019
@Stebalien assigned jbenet and whyrusleeping and unassigned jbenet on Jul 3, 2019
@Stebalien (Member, Author) commented:

cc @ipfs/wg-infrastructure

@lanzafame (Contributor) commented:

As part of the end-to-end benchmarking, can we look at some tooling around storing the benchmark results, rather than just comparing on a PR-by-PR basis, e.g. https://github.com/golang/perf or https://github.com/influxdata/grade?

@Stebalien (Member, Author) commented:

@lanzafame I've added an item to the todo list.

@lanzafame (Contributor) commented:

@Stebalien I have updated the issue with the point about RC telemetry.

@yiannisbot (Member) commented:

On the issue of Network Simulation for the libp2p TestLab, I've come across this: https://chepeftw.github.io/NS3DockerEmulator/ - a Docker-based emulator that plugs into NS-3 (https://nsnam.org). The production IPFS/libp2p/DHT code is packaged in a Docker container, which represents a node in the network; through NS-3 we then do the network setup and run the simulation (it's more emulation than simulation, TBH). This can produce realistic results using production code. It might not be useful for all purposes, but it surely can be for many cases.

@Stebalien mentioned this issue on Jul 12, 2019 (51 tasks)
@Jorropo (Contributor) commented Nov 4, 2019:

I don't think Dial Latency (testing the dialer, transports, etc.) is an IPFS concern; I think this test is libp2p-related, because it's impossible to fix any problem in that area in IPFS without changing the libp2p codebase. Also, that test and its results affect all libp2p users, such as Filecoin, IPFS, Ethereum 2.0...

@Stebalien (Member, Author) commented:

@Jorropo these are tests needed by go-ipfs, not necessarily tests of go-ipfs. That is, we want to test these libp2p features before we cut a release.
