Check for differences between two JSON documents, YAML documents, or Ruby data structures parsed from either of those.
- check_please
- Installation
- Terminology
- Usage
- TODO (maybe)
- Development
- Contributing
- License
- Code of Conduct
Add this line to your application's Gemfile:
gem 'check_please'
And then execute:
$ bundle install
Or install it yourself as:
$ gem install check_please
I know, you just want to see how to use this thing. Feel free to scroll down, but be aware that CheckPlease uses a few words in a jargony way:
- Reference is always used to refer to the "target" or "source of truth." We assume you're comparing two things because you want one of them to be like the other; the reference is what you're aiming for.
- Candidate is always used to refer to some JSON you'd like to compare against the reference. (We could've also used "sample," but it turns out that "reference" and "candidate" are the same length, which makes code line up neatly in a monospaced font...)
- A diff is what CheckPlease calls an individual discrepancy between the reference and the candidate. More on this in "Understanding the Output", below.
Also, even though this gem was born from a need to compare JSON documents, I'll be talking about "hashes" instead of "objects", because I assume this will mostly be used by Ruby developers. Feel free to substitute "object" wherever you see "hash" if that's easier for you. :)
Use the bin/check_please
executable. (To get started, run it with the '-h' flag.)
Note that the executable assumes you've saved your reference to a file.
Once that's done, you can either save the candidate to a file as well if
that fits your workflow, or you can pipe it to bin/check_please
in lieu
of giving it a second filename as the argument. (This is especially useful if
you're copying an XHR response out of a web browser's dev tools and have a tool
like MacOS's pbpaste
utility.)
See check_please_rspec_matcher.
If you'd like more control over the output formatting, and especially if you'd like to provide custom logic for diffing your own classes, you might be better served by the super_diff gem. Check it out!
See also: ./usage_examples.rb.
Create two strings, each containing a JSON or YAML document, and pass them to
CheckPlease.render_diff
. You'll get back a third string containing a report
of all the differences CheckPlease found in the two JSON strings.
Or, if you'd like to inspect the diffs in your own way, use CheckPlease.diff
instead. You'll get back a CheckPlease::Diffs
custom collection that
contains CheckPlease::Diff
instances.
CheckPlease follows the Unix philosophy of "no news is good news". If your candidate matches your reference, you'll get an empty message.
But let's be honest: how often is that going to happen? No, you're using this tool because you want a human-friendly summary of all the places that your candidate fell short.
When CheckPlease compares your two samples, it generates a list of diffs to describe any discrepancies it encounters.
An example would probably help here.
(NOTE: these examples may fall out of date with the code. They're swiped from the CLI integration spec, so please consider that more authoritative than this README. If you do spot a difference, please feel free to open an issue!)
Given the following reference JSON:
{
"id": 42,
"name": "The Answer",
"words": [ "what", "do", "you", "get", "when", "you", "multiply", "six", "by", "nine" ],
"meta": { "foo": "spam", "bar": "eggs", "yak": "bacon" }
}
And the following candidate JSON:
{
"id": 42,
"name": [ "I am large, and contain multitudes." ],
"words": [ "what", "do", "we", "get", "when", "I", "multiply", "six", "by", "nine", "dude" ],
"meta": { "foo": "foo", "yak": "bacon" }
}
CheckPlease should produce the following output:
TYPE | PATH | REFERENCE | CANDIDATE
--------------|-----------|------------|-------------------------------
type_mismatch | /name | The Answer | ["I am large, and contain m...
mismatch | /words/3 | you | we
mismatch | /words/6 | you | I
extra | /words/11 | | dude
missing | /meta/bar | eggs |
mismatch | /meta/foo | spam | foo
Let's start with the leftmost column...
The above example is intended to illustrate every possible type of diff that CheckPlease defines:
- type_mismatch means that both the reference and the candidate had a value at the given path, but one value was an Array or a Hash and the other was not. When CheckPlease encounters a type mismatch, it does not compare anything "below" the given path. (Technical note: CheckPlease uses a "recursive descent" strategy to traverse the reference data structure, and it stops when it encounters a type mismatch in order to avoid producing a lot of "garbage" diff output.)
- mismatch means that both the reference and the candidate had a value at the given path, and neither value was an Array or a Hash, and the two values were not equal.
- extra means that, inside an Array or a Hash, the candidate contained elements that were not found in the reference.
- missing is the opposite of extra: inside an Array or a Hash, the reference contained elements that were not found in the candidate.
The second column contains a path expression. This is extremely lo-fi:
- The root of the data structure is defined as "/".
- If an element in the data structure is an array, its child elements will have a one-based index appended to their parent's path.
- If an element in the data structure is an object ("Hash" in Ruby), the key for each element will be appended to their parent's path, and the values will be compared.
Being primarily a Ruby developer, I'm quite ignorant of conventions in the JS community; if there's an existing convention for paths, please open an issue!
CheckPlease produces tabular output by default. (It leans heavily on the amazing table_print gem for this.)
If you want to incorporate CheckPlease into some other toolchain, it can also print diffs as JSON to facilitate parsing. How you do this depends on whether you're using CheckPlease from the command line or in Ruby, which is a good time to talk about...
CheckPlease has several flags that control its behavior.
For quick help on which flags are available, as well as some terse help text,
you can run the check_please
executable with no arguments (or the -h
or
--help
flags if that makes you feel better).
While of course we aspire to keep this README up to date, it's probably best to believe things in the following priority order:
- observed behavior
- the code (start from
./lib/check_please.rb
and search forFlags.define
, then trace through as needed) - the tests (
spec/check_please/flags_spec.rb
describes how the flags work; from there, you'll have to search on the flag's name to see how it shows up in code) - the output of
check_please --help
- this README :)
All flags have exactly one "Ruby name" and one or more "CLI names". When the
CLI runs, it parses the values in ARGV
(using Ruby's native OptionParser
)
and uses that information to build a CheckPlease::Flags
instance. After that
point, a flag will be referred to within the CheckPlease code exclusively by
its "Ruby name".
For example, the flag that controls the format in which diffs are displayed has
a Ruby name of format
, and CLI names of -f
and --format
.
This should behave more or less as an experienced Unix CLI user might expect.
As such, you can specify, e.g., that you want output in JSON format using
either --format json
or -f json
.
(I might expand this section some day. In the meantime, if you are not yet an
experienced Unix CLI user, feel free to ask for help! You can either open an
issue or look for emails in the .gemspec
file...)
All external API entry points allow you to specify flags using their Ruby names in the idiomatic "options Hash at the end of the argument list" that should be familiar to most Rubyists. (Again, I assume that, if you're using this tool, I don't need to explain this further, but feel free to ask for help if you need it.)
(Internally, CheckPlease immediately converts that options hash into a
CheckPlease::Flags
object, but that should be considered an implementation
detail unless you're interested in hacking on CheckPlease itself.)
For example, to get back a String containing the diffs between two data structures in JSON format, you might do:
reference = { "foo" => "wibble" }
candidate = { "bar" => "wibble" }
puts CheckPlease.render_diff(
reference,
candidate,
format: :json # <--- flags
)
Several flags may be specified more than once when invoking the CLI. I've tried to make both the CLI and the Ruby API follow their respective environment's conventions.
For example, if you want to specify a path to ignore using the
--reject-paths
flag, you'd invoke the CLI like this:
[bundle exec] check_please reference.json candidate.json --select-paths /foo
And if you want to specify more than one path, that would look like:
[bundle exec] check_please reference.json candidate.json --select-paths /foo --select-paths /bar
In Ruby, you can specify this in the options hash as a single key with an Array value:
CheckPlease.render_diff(reference, candidate, select_paths: [ "/foo", "/bar" ])
(NOTE TO MAINTAINERS: internally, the way CheckPlease::CLI::Parser
uses
Ruby's OptionParser
leads to some less than obvious behavior. Search
./spec/check_please/flags_spec.rb for the
word "surprising" for details.)
I know this looks like a LOT of information, but it's really not that bad! This feature just requires specific examples to describe, and talking about it in English (rather than code) is hard. Take a moment for some deep breaths if you need it. :)
If you're comfortable reading RSpec and/or want to check out all the edge cases, go look in
./spec/check_please/comparison_spec.rb
and check out thedescribe
block labeled"comparing arrays by keys"
.
The match_by_key
flag allows you to match up arrays of hashes using the value
of a single key that is treated as the identifier for each hash.
There's a lot going on in that sentence, so let's unpack it a bit.
Imagine you're comparing two documents that contain the same data, but in different orders. To use a contrived example, let's say that both documents consist of a single array of two simple hashes, but the reference array and the candidate array are reversed:
# REFERENCE
[ { "id" => 1, "foo" => "bar" }, { "id" => 2, "foo" => "spam" } ]
# CANDIDATE
[ { "id" => 2, "foo" => "spam" }, { "id" => 1, "foo" => "bar" } ]
By default, CheckPlease will match up array elements by their position in the array, resulting in a diff report like this:
TYPE | PATH | REFERENCE | CANDIDATE
---------|--------|-----------|----------
mismatch | /1/id | 1 | 2
mismatch | /1/foo | "bar" | "bat"
mismatch | /2/id | 2 | 1
mismatch | /2/foo | "bat" | "bar"
To solve this problem, CheckPlease adds a key expression to its (very simple) path syntax that lets you specify a key to use to match up elements in both lists, rather than simply comparing elements by position.
Continuing with the above example, if we give match_by_key
a value of
["/:id"]
, it will use the "id" value in both hashes (remember, A's id
is
1
and B's id
is 2
) to identify every element in both the reference array
and the candidate array, and correctly match A and B, giving you an empty list
of diffs.
Please note that the CLI and Ruby implementations of these are a bit different
(see "Setting Flags in the CLI" versus "Setting Flags in Ruby"), so if you're
doing this from the command line, it'll look like: --match-by-key /:id
Here, have another example. If you want to specify a match_by_key expression
below the root of the document, you can put the key expression further down
the path: /books/:isbn
This would correctly match up the following documents:
# REFERENCE
{
"books" => [
{ "isbn" => "12345", "title" => "Who Am I, Really?" },
{ "isbn" => "67890", "title" => "Who Are Any Of Us, Really?" },
]
}
# CANDIDATE
{
"books" => [
{ "isbn" => "67890", "title" => "Who Are Any Of Us, Really?" },
{ "isbn" => "12345", "title" => "Who Am I, Really?" },
]
}
Finally, if you have deeply nested data with arrays of hashes at multiple
levels, you can specify more than one key expression in a single path,
like: /authors/:id/books/:isbn
This would correctly match up the following documents:
# REFERENCE
{
"authors" => [
{
"id" => 1,
"name" => "Anne Onymous",
"books" => [
{ "isbn" => "12345", "title" => "Who Am I, Really?" },
{ "isbn" => "67890", "title" => "Who Are Any Of Us, Really?" },
]
},
]
}
# CANDIDATE
{
"authors" => [
{
"id" => 1,
"name" => "Anne Onymous",
"books" => [
{ "isbn" => "67890", "title" => "Who Are Any Of Us, Really?" },
{ "isbn" => "12345", "title" => "Who Am I, Really?" },
]
},
]
}
Finally, if there are any diffs to report, CheckPlease uses a key/value expression to report mismatches.
Using the last example above (the one with /authors/:id/books/:isbn
), if the
reference had Anne Onymous' book title as "Who Am I, Really?" and the candidate
listed it as "Who The Heck Am I?", CheckPlease would show the mismatch using
the following path expression: /authors/id=1/books/isbn=12345
This syntax is intended to be readable by humans first. If you need to build tooling that consumes it... well, I'm open to suggestions. :)
NOTE: This flag is only accessible via the Ruby API. (I have no idea how to reasonably express it in a CLI flag.)
Before comparing values at specified paths, normalize both values using the provided message or Proc.
To use an example from the tests, the following reference/candidate pair would normally create three "mismatch" diffs:
ref = { list: [ "foo", "bar", "yak" ] }
can = { list: [ :foo, :bar, :yak ] }
However, the values can be converted to String before comparison via any of the following:
CheckPlease.diff(ref, can, normalize_values: { "/list/*" => ->(v) { v.to_s } })
CheckPlease.diff(ref, can, normalize_values: { "/list/*" => :to_s })
CheckPlease.diff(ref, can, normalize_values: { "/list/*" => "to_s" })
Note that the value of the flag is a Hash.
- Its keys must be strings representing path expressions.
- If the value associated with a given path is a lambda/proc, it will be called with both the reference value and the candidate value.
- If the value is a String or Symbol, it will be sent as a message to both the reference and candidate values using Object#send.
- document flags for rspec matcher
- command line flags for :allthethings:!
- change display width for table format (for example, "2020-07-16T19:42:41.312978" gets cut off)
- sort by path?
- detect timestamps and compare after parsing?
- ignore sub-second precision (option / CLI flag)?
- possibly support plugins for other folks to add custom coercions?
- display filters? (e.g., { a: 1, b: 2 } ==> "Hash#3")
- shorter descriptions of values with different classes (but maybe just the existing :type_mismatch diffs?)
- another "possibly support plugins" expansion point here
- more output formats, maybe?
- [0xABAD1DEA] support wildcards in --select-paths and --reject-paths?
#
for indexes,**
to match one or more path segments? (This could get ugly fast.)
- [0xABAD1DEA] look for a config file in ./.check_please_config or ~/.check_please_config,
combine flags found there with those in ARGV in order of precedence:
- ARGV
- ./.check_please
- ~/.check_please
- but this may not actually be worth the time and complexity to implement, so think about this first...
After checking out the repo, run bin/setup
to install dependencies. Then, run
rake spec
to run the tests. You can also run bin/console
for an interactive
prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install
. To
release a new version, update the version number in version.rb
, and then run
bundle exec rake release
, which will create a git tag for the version, push
git commits and tags, and push the .gem
file to
rubygems.org.
Bug reports and pull requests are welcome on GitHub at https://github.com/RealGeeks/check_please. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.
The gem is available as open source under the terms of the MIT License.
Everyone interacting in the CheckPlease project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.