Skip to content
This repository has been archived by the owner on Nov 9, 2020. It is now read-only.

Jepsen Testing

Touseef Liaqat edited this page May 1, 2017 · 1 revision

Jepsen Testing

Jepsen is a framework for distributed systems verification, with fault injection, written by Kyle Kingsbury. It fuzzes the system with random operations while injecting network partitions. The results of operation history is analyzed to see if the system violates any of the consistency properties it claims to have. It generates graphs of performance and availability, helping user characterize how a system responds to different faults.

We ran a Jepsen test, as part of our Xenon testing, to determine if any consistency issues could be uncovered. In our testing of Read, Write and CAS (Compare-And-Set) operations, we found them to be linearizable, and Xenon gracefully recovered from partitions without introducing any consistency issues.

Setup:

Service Options

We ran Jepsen tests against ExampleService that comes along with Xenon library. Apart from the default service options of PERSISTENCE, REPLICATION and OWNER_SELECTION, for Jepsen testing we included IDEMPOTENT_POST option. IDEMPOTENT_POST was added for making it easy to write a test for Jepsen which creates and updates with idempotent assumptions.

    public ExampleService() {
        super(ExampleServiceState.class);
        toggleOption(ServiceOption.PERSISTENCE, true);
        toggleOption(ServiceOption.REPLICATION, true);
        toggleOption(ServiceOption.INSTRUMENTATION, true);
        toggleOption(ServiceOption.OWNER_SELECTION, true);
        toggleOption(ServiceOption.IDEMPOTENT_POST, true);
    }

Operations

We choose to test common Read, Write and CAS (Compare-And-Set) operations. Read is a HTTP GET request to service, Write is implemented as an Idempotent HTTP POST request, and CAS is a custom operation implemented in ExampleService. Xenon has a feature that lets service implementors to write custom routing logic. Using this feature ExampleService has custom router that accepts a PATCH request with StrictUpdateRequest body that lets caller set expected documentVersion and new value. If current state has the expected documentVersion then new value is applied to the document state.

    /**
     * Request for strict update version check of a service.  This shows an example of implementing
     * compareAndSet operation in a service.  We implement handler that will check the supplied documentVersion
     * with documentVersion in current state and only apply the change of name if the two versions are equal.
     * See {@code getOperationProcessingChain} and {@code handlePatchForStrictUpdate} for implementation details.
     */
    public static class StrictUpdateRequest {
        public static final String KIND = Utils.buildKind(StrictUpdateRequest.class);

        // Field used by RequestRouter to route this 'kind' of request to
        // special handler. See {@code getOperationProcessingChain}.
        public String kind;

        // Field to be updated after version check.
        public String name;

        // Version to match before update.
        public long documentVersion;
    }

Tests

Out test uses mix of Read, Write and CAS operations, sent to the Xenon by concurrent clients with injections of network partitions (using nemesis).We use linearizable checker provided by Jepsen to verify the operation history for consistency. A Xenon Clojure client library is written to easy the development of Jepsen tests.

(defn xenon-test
  "Given an options map from the command line runner (e.g. :nodes, :ssh,
  :concurrency, ...), constructs a test map."
  [opts]
  (merge tests/noop-test
    {:name "xenon"
     :os debian/os
     :db (db "1.5.0-SNAPSHOT")
     :client (client nil nil)
     :nemesis (nemesis/partition-random-halves)
     :model (model/cas-register)
     :generator (->> (independent/concurrent-generator
                       10
                       (range)
                       (fn [k]
                         (->> (gen/mix [r w cas])
                           (gen/stagger 1/10)
                           (gen/limit 100))))
                  (gen/nemesis
                    (gen/seq (cycle [(gen/sleep 5)
                                     {:type :info, :f :start}
                                     (gen/sleep 5)
                                     {:type :info, :f :stop}])))
                  (gen/time-limit (:time-limit opts)))
     :checker (checker/compose
                {:perf (checker/perf)
                 :indep (independent/checker
                          (checker/compose
                            {:timeline (timeline/html)
                             :linear checker/linearizable}))})}
    opts))

Xenon Jepson clojure code and xenon client libarary written in clojure is available at xenon-utils repo.

Results

The multiple execution of the test show that the operations under test are linearizable in Xenon. Following is the sample output of a test run.

> lein run test --time-limit 500 --concurrency 20

Here are partial outout of one test execution.

2017-05-01 05:40:30,176{GMT}    INFO    [jepsen node n4] jepsen.os.debian: :n4 setting up debian
2017-05-01 05:40:30,176{GMT}    INFO    [jepsen node n2] jepsen.os.debian: :n2 setting up debian
2017-05-01 05:40:30,176{GMT}    INFO    [jepsen node n5] jepsen.os.debian: :n5 setting up debian
2017-05-01 05:40:30,176{GMT}    INFO    [jepsen node n1] jepsen.os.debian: :n1 setting up debian
2017-05-01 05:40:30,176{GMT}    INFO    [jepsen node n3] jepsen.os.debian: :n3 setting up debian
2017-05-01 05:40:59,792{GMT}    INFO    [jepsen node n4] jepsen.xenon: :n4 tearing down xenon
2017-05-01 05:40:59,792{GMT}    INFO    [jepsen node n5] jepsen.xenon: :n5 tearing down xenon
2017-05-01 05:40:59,792{GMT}    INFO    [jepsen node n1] jepsen.xenon: :n1 tearing down xenon
2017-05-01 05:40:59,792{GMT}    INFO    [jepsen node n2] jepsen.xenon: :n2 tearing down xenon
2017-05-01 05:40:59,792{GMT}    INFO    [jepsen node n3] jepsen.xenon: :n3 tearing down xenon
2017-05-01 05:40:59,794{GMT}    INFO    [jepsen node n4] jepsen.control.util: Stopping /usr/bin/java
2017-05-01 05:40:59,794{GMT}    INFO    [jepsen node n5] jepsen.control.util: Stopping /usr/bin/java
2017-05-01 05:40:59,794{GMT}    INFO    [jepsen node n1] jepsen.control.util: Stopping /usr/bin/java
2017-05-01 05:40:59,795{GMT}    INFO    [jepsen node n2] jepsen.control.util: Stopping /usr/bin/java
2017-05-01 05:40:59,795{GMT}    INFO    [jepsen node n3] jepsen.control.util: Stopping /usr/bin/java
2017-05-01 05:41:02,027{GMT}    INFO    [jepsen node n5] jepsen.xenon: :n5 installing xenon 1.5.0-SNAPSHOT
2017-05-01 05:41:02,028{GMT}    INFO    [jepsen node n3] jepsen.xenon: :n3 installing xenon 1.5.0-SNAPSHOT
2017-05-01 05:41:02,028{GMT}    INFO    [jepsen node n1] jepsen.xenon: :n1 installing xenon 1.5.0-SNAPSHOT
2017-05-01 05:41:02,031{GMT}    INFO    [jepsen node n4] jepsen.xenon: :n4 installing xenon 1.5.0-SNAPSHOT
2017-05-01 05:41:02,042{GMT}    INFO    [jepsen node n2] jepsen.xenon: :n2 installing xenon 1.5.0-SNAPSHOT
2017-05-01 05:41:03,445{GMT}    INFO    [jepsen node n5] jepsen.control.util: starting java
2017-05-01 05:41:03,547{GMT}    INFO    [jepsen node n3] jepsen.control.util: starting java
2017-05-01 05:41:03,553{GMT}    INFO    [jepsen node n4] jepsen.control.util: starting java
2017-05-01 05:41:03,559{GMT}    INFO    [jepsen node n1] jepsen.control.util: starting java
2017-05-01 05:41:03,563{GMT}    INFO    [jepsen node n2] jepsen.control.util: starting java
2017-05-01 05:41:13,902{GMT}    INFO    [jepsen nemesis] jepsen.core: Nemesis starting
2017-05-01 05:41:13,905{GMT}    INFO    [jepsen worker 0] jepsen.core: Worker 0 starting
2017-05-01 05:41:13,906{GMT}    INFO    [jepsen worker 1] jepsen.core: Worker 1 starting
2017-05-01 05:41:13,907{GMT}    INFO    [jepsen worker 4] jepsen.core: Worker 4 starting
2017-05-01 05:41:13,907{GMT}    INFO    [jepsen worker 5] jepsen.core: Worker 5 starting
2017-05-01 05:41:13,907{GMT}    INFO    [jepsen worker 6] jepsen.core: Worker 6 starting
2017-05-01 05:41:13,907{GMT}    INFO    [jepsen worker 3] jepsen.core: Worker 3 starting
2017-05-01 05:41:13,907{GMT}    INFO    [jepsen worker 2] jepsen.core: Worker 2 starting
2017-05-01 05:41:13,909{GMT}    INFO    [jepsen worker 7] jepsen.core: Worker 7 starting
2017-05-01 05:41:13,910{GMT}    INFO    [jepsen worker 9] jepsen.core: Worker 9 starting
2017-05-01 05:41:13,911{GMT}    INFO    [jepsen worker 10] jepsen.core: Worker 10 starting
2017-05-01 05:41:13,913{GMT}    INFO    [jepsen worker 8] jepsen.core: Worker 8 starting
2017-05-01 05:41:13,913{GMT}    INFO    [jepsen worker 11] jepsen.core: Worker 11 starting
2017-05-01 05:41:13,913{GMT}    INFO    [jepsen worker 12] jepsen.core: Worker 12 starting
2017-05-01 05:41:13,915{GMT}    INFO    [jepsen worker 14] jepsen.core: Worker 14 starting
2017-05-01 05:41:13,915{GMT}    INFO    [jepsen worker 13] jepsen.core: Worker 13 starting
2017-05-01 05:41:13,915{GMT}    INFO    [jepsen worker 16] jepsen.core: Worker 16 starting
2017-05-01 05:41:13,915{GMT}    INFO    [jepsen worker 15] jepsen.core: Worker 15 starting
2017-05-01 05:41:13,915{GMT}    INFO    [jepsen worker 17] jepsen.core: Worker 17 starting
2017-05-01 05:41:13,916{GMT}    INFO    [jepsen worker 18] jepsen.core: Worker 18 starting
2017-05-01 05:41:13,919{GMT}    INFO    [jepsen worker 19] jepsen.core: Worker 19 starting
2017-05-01 05:41:13,963{GMT}    INFO    [jepsen worker 16] jepsen.util: 16    :invoke    :read    [1 nil]
2017-05-01 05:41:13,974{GMT}    INFO    [jepsen worker 1] jepsen.util: 1    :invoke    :cas    [0 [0 4]]
2017-05-01 05:41:13,974{GMT}    INFO    [jepsen worker 3] jepsen.util: 3    :invoke    :cas    [0 [0 2]]
2017-05-01 05:41:13,974{GMT}    INFO    [jepsen worker 17] jepsen.util: 17    :invoke    :cas    [1 [0 2]]
2017-05-01 05:41:13,974{GMT}    INFO    [jepsen worker 15] jepsen.util: 15    :invoke    :cas    [1 [4 3]]
2017-05-01 05:41:13,975{GMT}    INFO    [jepsen worker 5] jepsen.util: 5    :invoke    :cas    [0 [2 3]]
2017-05-01 05:41:13,975{GMT}    INFO    [jepsen worker 9] jepsen.util: 9    :invoke    :cas    [0 [3 0]]
2017-05-01 05:41:13,975{GMT}    INFO    [jepsen worker 2] jepsen.util: 2    :invoke    :cas    [0 [2 1]]
2017-05-01 05:41:13,975{GMT}    INFO    [jepsen worker 18] jepsen.util: 18    :invoke    :cas    [1 [4 3]]
2017-05-01 05:41:13,977{GMT}    INFO    [jepsen worker 4] jepsen.util: 4    :invoke    :cas    [0 [3 0]]
2017-05-01 05:41:13,977{GMT}    INFO    [jepsen worker 14] jepsen.util: 14    :invoke    :cas    [1 [1 0]]
2017-05-01 05:41:13,977{GMT}    INFO    [jepsen worker 12] jepsen.util: 12    :invoke    :cas    [1 [3 2]]
2017-05-01 05:41:13,977{GMT}    INFO    [jepsen worker 6] jepsen.util: 6    :invoke    :write    [0 4]
2017-05-01 05:41:13,977{GMT}    INFO    [jepsen worker 7] jepsen.util: 7    :invoke    :cas    [0 [0 2]]
2017-05-01 05:41:13,978{GMT}    INFO    [jepsen worker 0] jepsen.util: 0    :invoke    :write    [0 3]
2017-05-01 05:41:13,977{GMT}    INFO    [jepsen worker 19] jepsen.util: 19    :invoke    :cas    [1 [0 2]]
2017-05-01 05:41:13,979{GMT}    INFO    [jepsen worker 11] jepsen.util: 11    :invoke    :cas    [1 [1 3]]
2017-05-01 05:41:13,979{GMT}    INFO    [jepsen worker 8] jepsen.util: 8    :invoke    :cas    [0 [4 0]]
2017-05-01 05:41:13,979{GMT}    INFO    [jepsen worker 13] jepsen.util: 13    :invoke    :cas    [1 [1 3]]
2017-05-01 05:41:13,979{GMT}    INFO    [jepsen worker 10] jepsen.util: 10    :invoke    :cas    [1 [2 3]]
2017-05-01 05:41:15,442{GMT}    INFO    [jepsen worker 3] jepsen.util: 3    :fail    :cas    [0 [0 2]]
2017-05-01 05:41:15,443{GMT}    INFO    [jepsen worker 9] jepsen.util: 9    :fail    :cas    [0 [3 0]]
2017-05-01 05:41:15,444{GMT}    INFO    [jepsen worker 2] jepsen.util: 2    :fail    :cas    [0 [2 1]]
2017-05-01 05:41:15,444{GMT}    INFO    [jepsen worker 1] jepsen.util: 1    :fail    :cas    [0 [0 4]]
2017-05-01 05:41:15,445{GMT}    INFO    [jepsen worker 6] jepsen.util: 6    :ok    :write    [0 4]
2017-05-01 05:41:15,446{GMT}    INFO    [jepsen worker 7] jepsen.util: 7    :fail    :cas    [0 [0 2]]
2017-05-01 05:41:15,447{GMT}    INFO    [jepsen worker 4] jepsen.util: 4    :fail    :cas    [0 [3 0]]
2017-05-01 05:41:15,449{GMT}    INFO    [jepsen worker 7] jepsen.util: 7    :invoke    :cas    [0 [4 0]]
2017-05-01 05:41:15,450{GMT}    INFO    [jepsen worker 1] jepsen.util: 1    :invoke    :read    [0 nil]
2017-05-01 05:41:15,456{GMT}    INFO    [jepsen worker 3] jepsen.util: 3    :invoke    :cas    [0 [0 2]]
2017-05-01 05:41:15,453{GMT}    INFO    [jepsen worker 5] jepsen.util: 5    :fail    :cas    [0 [2 3]]
2017-05-01 05:41:15,457{GMT}    INFO    [jepsen worker 4] jepsen.util: 4    :invoke    :read    [0 nil]
2017-05-01 05:41:15,463{GMT}    INFO    [jepsen worker 9] jepsen.util: 9    :invoke    :cas    [0 [3 2]]
2017-05-01 05:41:15,464{GMT}    INFO    [jepsen worker 2] jepsen.util: 2    :invoke    :cas    [0 [3 2]]
2017-05-01 05:41:15,464{GMT}    INFO    [jepsen worker 6] jepsen.util: 6    :invoke    :write    [0 2]
2017-05-01 05:41:15,466{GMT}    INFO    [jepsen worker 1] jepsen.util: 1    :ok    :read    [0 4]
2017-05-01 05:41:15,480{GMT}    INFO    [jepsen worker 5] jepsen.util: 5    :invoke    :cas    [0 [4 4]]
2017-05-01 05:41:15,492{GMT}    INFO    [jepsen worker 1] jepsen.util: 1    :invoke    :write    [0 4]
2017-05-01 05:41:15,503{GMT}    INFO    [jepsen worker 2] jepsen.util: 2    :fail    :cas    [0 [3 2]]
2017-05-01 05:41:15,520{GMT}    INFO    [jepsen worker 2] jepsen.util: 2    :invoke    :cas    [0 [0 3]]
2017-05-01 05:41:15,524{GMT}    INFO    [jepsen worker 3] jepsen.util: 3    :fail    :cas    [0 [0 2]]
2017-05-01 05:41:15,535{GMT}    INFO    [jepsen worker 3] jepsen.util: 3    :invoke    :write    [0 2]
2017-05-01 05:41:15,580{GMT}    INFO    [jepsen worker 5] jepsen.util: 5    :fail    :cas    [0 [4 4]]
2017-05-01 05:41:15,585{GMT}    INFO    [jepsen worker 8] jepsen.util: 8    :ok    :cas    [0 [4 0]]
2017-05-01 05:41:15,596{GMT}    INFO    [jepsen worker 5] jepsen.util: 5    :invoke    :write    [0 3]
2017-05-01 05:41:15,599{GMT}    INFO    [jepsen worker 8] jepsen.util: 8    :invoke    :read    [0 nil]
2017-05-01 05:41:15,627{GMT}    INFO    [jepsen worker 8] jepsen.util: 8    :ok    :read    [0 0]
2017-05-01 05:41:15,645{GMT}    INFO    [jepsen worker 8] jepsen.util: 8    :invoke    :cas    [0 [1 0]]
2017-05-01 05:41:15,659{GMT}    INFO    [jepsen worker 3] jepsen.util: 3    :ok    :write    [0 2]
2017-05-01 05:41:15,663{GMT}    INFO    [jepsen worker 3] jepsen.util: 3    :invoke    :cas    [0 [2 1]]
2017-05-01 05:41:15,682{GMT}    INFO    [jepsen worker 8] jepsen.util: 8    :fail    :cas    [0 [1 0]]
2017-05-01 05:41:15,691{GMT}    INFO    [jepsen worker 8] jepsen.util: 8    :invoke    :write    [0 0]
2017-05-01 05:41:15,747{GMT}    INFO    [jepsen worker 7] jepsen.util: 7    :fail    :cas    [0 [4 0]]
2017-05-01 05:41:15,753{GMT}    INFO    [jepsen worker 7] jepsen.util: 7    :invoke    :write    [0 2]
2017-05-01 05:41:15,803{GMT}    INFO    [jepsen worker 1] jepsen.util: 1    :ok    :write    [0 4]
2017-05-01 05:41:15,821{GMT}    INFO    [jepsen worker 1] jepsen.util: 1    :invoke    :write    [0 0]
2017-05-01 05:41:15,882{GMT}    INFO    [jepsen worker 15] jepsen.util: 15    :fail    :cas    [1 [4 3]]    :internal
2017-05-01 05:41:15,901{GMT}    INFO    [jepsen worker 6] jepsen.util: 6    :ok    :write    [0 2]
2017-05-01 05:41:15,902{GMT}    INFO    [jepsen worker 15] jepsen.util: 15    :invoke    :write    [1 3]
2017-05-01 05:41:15,921{GMT}    INFO    [jepsen worker 6] jepsen.util: 6    :invoke    :write    [0 1]
2017-05-01 05:41:15,954{GMT}    INFO    [jepsen worker 5] jepsen.util: 5    :ok    :write    [0 3]
2017-05-01 05:41:15,970{GMT}    INFO    [jepsen worker 5] jepsen.util: 5    :invoke    :cas    [0 [2 1]]
2017-05-01 05:41:15,988{GMT}    INFO    [jepsen worker 2] jepsen.util: 2    :fail    :cas    [0 [0 3]]
.
.
.
   84
   {:timeline {:valid? true},
    :linear
    {:valid? true,
     :configs
     ({:model {:value 0},
       :last-op
       {:type :ok,
        :f :write,
        :value 0,
        :process 9,
        :time 236100934696,
        :index 270},
       :pending []}
      {:model {:value 4},
       :last-op
       {:type :ok,
        :f :write,
        :value 4,
        :process 5,
        :time 236142473153,
        :index 278},
       :pending []}),
     :final-paths ()},
    :valid? true}},
  :failures []},
 :valid? true}


Everything looks good!
Clone this wiki locally