2013 Strange Loop Unsession
Jason Wolfe edited this page Jan 13, 2016
(ns strangeloop-schema-unsession
  "An informal introduction to prismatic/schema
   (http://github.com/plumatic/schema)"
  (:require
   [schema.core :as s]))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Thanks for coming!
;; I'm Jason Wolfe.
;; I work at a small company called Prismatic.
;; We make real-time, personally ranked newsfeeds.
;; See my teammate Jenny Finkel's keynote first thing Wednesday for details.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Clojure @ Prismatic
;; We're a 99% Clojure(Script) shop.
;; A handful of open-source projects
;; - Plumbing/Graph, which I talked about at Strange Loop 2012
;; - Dommy for ClojureScript dom manipulation
;; - Hiphip (array!) for arrays
;; - And now Schema
;; Beyond public GitHub, about 100klocc supporting tens of services
;; - web and social network crawlers
;; - document analysis
;; - distributed indices
;; - api for real-time newsfeeds
;; - web server with client-side cljs
;; Possible with 6 engineers partially because of Clojure, and in particular:
;; - (almost) everything can be manipulated as data
;; - ubiquitous sequence and map abstractions
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; With great power ...
;; One of the biggest difficulties we've encountered with our growing Clojure
;; codebase and team is the overhead of understanding the kind of data
;; being passed around and manipulated.
(defn update-share-counts [share-counts updates]
  (reduce
   (fn [result {:keys [user-id share-type delta]}]
     (update-in
      result
      [share-type (str user-id)]
      (fnil + 0)
      delta))
   share-counts
   updates))
;; What is a share-counts? How about updates?
;; Maybe a docstring will help?
(defn update-share-counts
  "Increment share-counts according to the share actions in updates:
     share-counts: map from user-id to map of share-type
                   (must be one of :twitter, :facebook, or :email) to
                   count of number of shares (a long)
     updates: sequence of maps {:user-id :share-type :delta}, where delta is
              amount to increment share-type by
   returns updated share-counts reflecting shares in updates"
  [share-counts updates]
  ;; etc
  )
;; This is undoubtedly an improvement. But:
;;  - not machine readable
;;    - can't help us find bugs
;;    - prone to bit-rot
;;  - not (very) human readable either...
;;    - many ways to phrase this content
;;    - hard to adopt uniform standards and get good at easily reading such
;;      descriptions
;;  - no abstraction -- if data types are reused, becomes very repetitive
;; Enter Schema
(def ShareType (s/enum :twitter :facebook :email))
(def ShareCounts {(s/named Long 'user-id) {ShareType (s/named Long 'count)}})
(s/defn update-share-counts :- ShareCounts
  [share-counts :- ShareCounts
   updates :- [{:user-id Long :share-type ShareType :delta Long}]]
  ;; etc
  )
;; By default, functionally identical to the normal defn ...
;; but with crisp documentation that is:
;; - easy to read (IMO)
;; - precise
;; - reusable
;; - composable
;; - TOOWTDI
;; ...
;;; Moreover, Schemas are composable data, and print like their definitions:
ShareCounts
;; ==> {java.lang.Long {(enum :facebook :email :twitter) java.lang.Long}}
;;; Schemas can be used for validation, and provide nice error messages
;; s/check returns nil for success, or something that looks like the
;; bad parts of your data
(s/check ShareCounts
         {12 {:twitter 10 :facebook 15}})
;; ==> nil
(s/check ShareCounts
         {12 {:twitter 10}
          13 {:twitter 1.4}
          "fred" {:facebook 10}})
;; ==> {13 {:twitter (not (instance? java.lang.Long 1.4))}
;;      (not (instance? java.lang.Long "fred")) invalid-key}
;; s/validate is like (assert (not (s/check ...)))
(try
  (s/validate ShareCounts
              {12 {:twitter 10}
               13 {:twitter 1.4}
               "fred" {:facebook 10}})
  (catch Exception e e))
;; ==> #<RuntimeException: Value does not match schema:
;;      {(not (instance? java.lang.Long "fred")) invalid-key
;;       13 {:twitter (not (instance? java.lang.Long 1.4))}}>
;;; And if you attach schemas to your functions with s/defn or s/fn, you can
;; optionally turn on fn schema validation at runtime (e.g., in tests):
(try
  (s/with-fn-validation (update-share-counts
                         {12 {:fakeblock 10}}
                         [{:user-id 10 :share-type :twitter :delta 12.0}]))
  (catch Exception e e))
;; ==> #<RuntimeException: Input to update-share-counts does not match schema:
;; [(named {12 {(not (#{:facebook :email :twitter} :fakeblock)) invalid-key}} share-counts)
;; (named [{:delta (not (instance? java.lang.Long 12.0))}] updates)]>
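;;; A runnable end-to-end sketch of the s/defn idea above.
;; The `;; etc` bodies are elided in these notes, so the body below is an
;; assumption: it uses the [user-id share-type] key order that the docstring
;; and ShareCounts describe. `bump-share-counts` and `ShareUpdate` are
;; hypothetical names introduced only for this illustration.

```clojure
(require '[schema.core :as s])

(def ShareType (s/enum :twitter :facebook :email))
(def ShareCounts {Long {ShareType Long}})
;; Hypothetical name for the shape of one element of `updates`.
(def ShareUpdate {:user-id Long :share-type ShareType :delta Long})

(s/defn bump-share-counts :- ShareCounts
  [share-counts :- ShareCounts
   updates :- [ShareUpdate]]
  (reduce
   (fn [result {:keys [user-id share-type delta]}]
     ;; Assumed key order: user-id first, then share-type.
     (update-in result [user-id share-type] (fnil + 0) delta))
   share-counts
   updates))

;; With validation on, both the inputs and the returned map are checked.
(s/with-fn-validation
  (bump-share-counts {12 {:twitter 10}}
                     [{:user-id 12 :share-type :twitter :delta 2}
                      {:user-id 13 :share-type :email :delta 1}]))
;; ==> {12 {:twitter 12}, 13 {:email 1}}
```

;; Because the return value is annotated too, a body that built the map with
;; a different key order would be reported as an output mismatch while
;; validation is on, rather than silently producing oddly shaped data.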
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Outline
(def ThisUnsession
  {0 (s/both Short (s/eq :tour))
   1 (s/enum :why? :how?)
   2 (s/pred next)
   3 (s/named s/Any 'discussion)
   double [(s/either (s/eq :comments) (s/eq :questions))]})
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Meet Schema
;;; Schema is a lightweight Clojure(Script) library for declarative data shape
;; description and validation.
;; We love the simplicity, flexibility, and pithiness of Clojure functions.
;; But as our codebase and team grew, we increasingly found the need for equally
;; crisp declarations of key data shapes, for readability and maintainability.
;;; Start simple: all type hints are valid Schemas:
(s/check String "good")
;; ==> nil
(s/check String :bad)
;; ==> (not (instance? java.lang.String :bad))
(s/check long 12)
;; ==> nil
;;; A few fancier 'leaf' schemas:
(s/check s/Any :whatever)
;; ==> nil
(s/check (s/enum :a :b :c) :a)
;; ==> nil
(s/check (s/enum :a :b :c) :Z)
;; ==> (not (#{:a :c :b} :Z))
(s/check (s/pred odd?) 3)
;; ==> nil
;;; Higher-order schemas
(s/check (s/maybe String) "asdf")
(s/check (s/maybe String) nil)
;; ==> nil
(s/check (s/either String long) "a")
(s/check (s/either String long) 1)
;; ==> nil
(s/check (s/both long (s/pred odd?)) 11)
;; ==> nil
(s/check (s/both long (s/pred odd?)) 12)
;; ==> (not (#<core$odd_QMARK_ clojure.core$odd_QMARK_@4c6ed9d1> 12))
(s/check (s/named long 'UserID) "bob")
;; ==> (named (not (instance? java.lang.Long "bob")) UserID)
;;;;; Data Structures
;;; Sequences
;; [schema] is a uniform sequence
(s/check [String] ["abc" "foo" "123"])
;; ==> nil
(s/check [String] ["abc" :foo "123"])
;; ==> [nil (not (instance? java.lang.String :foo)) nil]
;; positional constraints can be expressed with s/one (and s/optional):
(def ScoredLabel
  [(s/one String "label") (s/one Double "score")])
(s/check ScoredLabel ["a" 1.0])
;; ==> nil
(s/check ScoredLabel [:foo])
;; ==> [(named (not (instance? java.lang.String :foo)) "label")
;; (not (present? "score"))]
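;; s/optional is mentioned above but not shown. A minimal sketch, assuming
;; the same (schema name) argument order as s/one; `MaybeScoredLabel` is a
;; hypothetical name used only for this example.

```clojure
(require '[schema.core :as s])

;; A label, optionally followed by a single score.
(def MaybeScoredLabel
  [(s/one String "label") (s/optional Double "score")])

(s/check MaybeScoredLabel ["a"])
;; ==> nil
(s/check MaybeScoredLabel ["a" 1.0])
;; ==> nil

;; If the optional element is present, it still has to match its schema,
;; so a wrongly typed score yields a non-nil error description.
(some? (s/check MaybeScoredLabel ["a" :high]))
;; ==> true
```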
;;; Maps
;; {key-schema val-schema} is a uniform map
(s/check {Long [String]} {1 ["a" "b"] 2 []})
;; ==> nil
;; specific key constraints can be expressed with s/required-key and
;; s/optional-key.
(def ScoredLabelMap
  {(s/required-key :label) String
   (s/optional-key :score) Double})
(s/check ScoredLabelMap {:label "a"})
(s/check ScoredLabelMap {:label "a" :score 1.0})
;; ==> nil
(s/check ScoredLabelMap {:label "a" :another-key :another-val})
;; ==> {:another-key disallowed-key}
;; the two can also be combined, and
;; s/required-key can be omitted for keywords
(s/check {:foo String Long Long} {:foo "a" 1 2 3 4})
;; ==> nil
;;; Complex data shapes can be built up from components, and checked
;; with helpful error messages
(def StampedNames
  {:date Long
   :names [String]})
(def ScoredLabel
  [(s/one String "label") (s/one Double "score")])
(defn OddNumber [number-type] ;; like generics, kinda
  (s/both number-type (s/pred odd?)))
(def FooBar
  {:stamped-strings (s/maybe StampedNames)
   (s/optional-key :scored-labels) [ScoredLabel]
   String (OddNumber Long)})
(s/check FooBar
         {:stamped-strings nil})
;; ==> nil
(s/check FooBar
         {:stamped-strings {:date 123 :names ["foo" "bar"]}
          :scored-labels [["label1" 1.0] ["label2" 2.0]]
          "another-key" 11})
;; ==> nil
(s/check FooBar
         {:stamped-strings {:date 123 :names ["foo" :bar]}
          :scored-labels [["label1" 1.0] ["label2"]]
          "another-key" 12})
;; ==> {:stamped-strings {:names [nil (not (instance? java.lang.String :bar))]}
;;      :scored-labels [nil [nil (not (present? "score"))]]
;;      "another-key" (not (odd? 12))}
;;; Schema also provides a natural way to attach and validate schemas on
;; defrecord fields:
(s/defrecord RStampedNames
    [date :- s/Int
     names :- [s/String]])
(s/check RStampedNames
         (->RStampedNames 10 ["a" :b]))
;; ==> {:names [nil (not (instance? java.lang.String :b))]}
;;; As well as a way to provide schemas for function inputs and outputs
(s/defn stamped-names :- RStampedNames
  [names :- [s/String]]
  (->RStampedNames (System/currentTimeMillis) names))
(s/explain (s/fn-schema stamped-names))
;; ==> (=> (record strangeloop_schema_unsession.RStampedNames
;;                 {:date Int, :names [java.lang.String]})
;;         [java.lang.String])
(stamped-names ["a" :b])
;; ==> {:date 1379537133893, :names ["a" :b]}
(comment
  (s/with-fn-validation
    (stamped-names ["a" :b]))
  ;; java.lang.RuntimeException: Input to stamped-names does not match schema:
  ;; [(named [nil (not (instance? java.lang.String :b))] names)]
  )
;;; A powerful subset of Schemas (json+) can be shared between Clojure
;; and ClojureScript
;; - enables schema sharing of API inputs and outputs
;; - less code duplication between client and server, better team communication
(def StampedNames
  {:date s/Int
   :names [s/String]})
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Design Goals
;;; #1: Schemas should be simple data.
;; Schemas are composable
;; Schemas are inspectable and transformable
;; - and so are validation errors
;; Validation is based on a simple, open protocol, so you can make your
;; own schema types and combinations
(extend-protocol s/Schema
  ;; #+cljs cljs.core.PersistentHashSet #+clj ;; if this were a cljx file...
  clojure.lang.APersistentSet
  (check [this x]
    (or (when-not (set? x)
          (schema.macros/validation-error this x (list 'set? (schema.utils/value-name x))))
        (when-let [out (seq (keep #(s/check (first this) %) x))]
          (set out))))
  (explain [this] (set [(s/explain (first this))])))
;; Only ~600 LOC for Clojure(Script) schemas
;; (plus a few hundred more for s/defrecord and s/defn)
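;; A small sketch of what "schemas are simple data" (goal #1) can look like in
;; practice: map schemas are ordinary maps, so core functions such as assoc
;; and dissoc can inspect them and derive new ones. `AuthoredNames`,
;; `DateOnly`, and `err` are hypothetical names for this example only.

```clojure
(require '[schema.core :as s])

(def StampedNames {:date Long :names [String]})

;; Inspect: look up the schema for a key like any other map value.
(:names StampedNames)
;; ==> [java.lang.String]

;; Transform: ordinary map functions yield new schemas.
(def AuthoredNames (assoc StampedNames (s/optional-key :author) String))
(def DateOnly (dissoc StampedNames :names))

(s/check AuthoredNames {:date 1 :names ["a"]})
;; ==> nil
(s/check AuthoredNames {:date 1 :names ["a"] :author "jw"})
;; ==> nil
(s/check DateOnly {:date 1})
;; ==> nil

;; Validation errors are data too: here, a map holding only the bad keys.
(def err (s/check StampedNames {:date "today" :names ["a"]}))
(keys err)
;; ==> (:date)
```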
;;; #2: Compared to a docstring or comment with the same information,
;; a schema should always be less hassle to write, and easier to read.
;; Schemas should gracefully extend built-in type hints
;; - provided next to arguments and return types, on defrecords and defn
;; - type hints are valid schemas, and simple schemas act as type hints
;; - more complex schemas can be defined inline
;; Schemas should be optional, and never constrain what you can write
;; Schemas should be able to express arbitrary constraints
;; Validation should be off by default, so you don't have to think twice
;; about performance
;;; #3: Schemas should be incrementally useful.
;; First, define schemas for key data types in your namespace
;; - documentation
;; - manual data validation entering/exiting a system, with nice error msgs
;; Next, annotate key public functions in your namespace with schemas
;; - better documentation
;; - turn on validation at test-time to catch 'type-like' bugs
;; (or run-time, if you like)
;; You could go on to annotate everything...
;; - but if you want a full-blown type system, you want core.typed
;; - core.typed can do static validation based on consistency of annotations
;; throughout your program, with no runtime cost
;; - Schema is designed to make the most of a few annotations in critical places
;; - But there may be interesting synergies...
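;; One way to "turn on validation at test-time" (goal #3) is a clojure.test
;; fixture that wraps every test in s/with-fn-validation. This is a sketch
;; under assumptions, not necessarily how Prismatic does it: the namespace,
;; `Event`, and `apply-delta` are hypothetical names.

```clojure
(ns example.validation-test
  (:require [clojure.test :refer [deftest is use-fixtures]]
            [schema.core :as s]))

;; Hypothetical data shape and annotated function under test.
(def Event {:user-id Long :delta Long})

(s/defn apply-delta :- Long
  [total :- Long
   event :- Event]
  (+ total (:delta event)))

;; Run all tests in this namespace with fn schema validation switched on.
(use-fixtures :once
  (fn [run-tests]
    (s/with-fn-validation (run-tests))))

(deftest apply-delta-test
  ;; Well-formed input passes validation and behaves normally.
  (is (= 5 (apply-delta 2 {:user-id 1 :delta 3})))
  ;; A double :delta violates Event, so the call throws while validation is
  ;; on instead of quietly returning 5.0.
  (is (thrown? Exception (apply-delta 2 {:user-id 1 :delta 3.0}))))
```

;; Outside the fixture, validation stays off, so production calls to
;; apply-delta pay no validation cost.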
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; What's next?
;;; Generate core.typed annotations on functions?
;;; Generate test data from schemas?
;;; Generate model classes for clients and even full client APIs (WIP!)
(def Interest {:type (s/enum "topic" "publisher" "user") :key s/String})
(require '[plumbing.core :as plumbing])
(plumbing/defnk interests$update
  {:methods [:post]
   :query-params {}
   :body {:updates [{:op (s/enum "add" "remove")
                     :interest Interest}]}
   :description "Update the interests of the current user"
   :returns {:interests [Interest]}}
  [env user-store [:request user [:body updates]]]
  ;; ...
  )
;; On the server, we can automatically transform namespaces of such functions
;; into a versioned web API, with automatic schema validation of inputs and
;; outputs (and automatic plumbing of resources, using Graph)
;; In ClojureScript, we can directly use these same schemas to test our code and
;; validate our requests and responses.
;; On other clients, when we're forced to leave Clojure behind, we can
;; automatically generate model classes that know how to populate themselves
;; from server responses
;; (Interest.h, Interest.m, InterestUpdateResponse.m, ...)
;; and even a full API class that handles the plumbing to the server.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Your questions and feedback
;; TODO(audience): fill this in