Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"when freedom patches fight, nobody wins" -@jeremy #138

Closed
wants to merge 1 commit into from

Conversation

hone
Copy link

@hone hone commented Sep 13, 2013

Use JSON.generate instead of Object#to_json

The reasoning behind this is that Object#to_json is part of the [freedom]((https://github.com/rails/rails/blob/master/activesupport/lib/active_support/core_ext/object/to_json.rb#L7-L19) patching wars with ActiveSupport. This means that in Rails out of the box for Ruby 1.9+ (since json is part of stdlib), the asset pipeline will use
the pure ruby #to_json method provided by ActiveSupport. This results in ~40% performance decrease.

Use `JSON.generate` instead of `Object#to_json`

The reasoning behind this is that `Object#to_json` is part of the
[freedom]((https://github.com/rails/rails/blob/master/activesupport/lib/active_support/core_ext/object/to_json.rb#L7-L19)
[patching](https://github.com/rails/rails/blob/master/activesupport/lib/active_support/json/encoding.rb#L305-L312)
wars with `ActiveSupport`. This means that in Rails out of the box for
Ruby 1.9+ (since json is part of stdlib), the asset pipeline will use
the pure ruby `#to_json` method provided by `ActiveSupport`. This
results in ~40% performance decrease.
@guilleiguaran
Copy link

great work tackling the issue ❤️ 👍

@chancancode
Copy link

👍

1 similar comment
@friism
Copy link

friism commented Sep 13, 2013

👍

@guilleiguaran
Copy link

btw, the commit message is great 😄

@rwz
Copy link
Member

rwz commented Sep 14, 2013

Guys, I'm all for this, but we had this before and ended up reverting it back.

Here's the related issue: #86

@rwz
Copy link
Member

rwz commented Sep 14, 2013

So, could you all try this patch and run your real-life rails apps test suites against it?

@guilleiguaran
Copy link

/cc @jeremy wdyt of this and #86 ??

@weilu
Copy link

weilu commented Sep 14, 2013

👍

@chancancode chancancode mentioned this pull request Sep 14, 2013
21 tasks
@chancancode
Copy link

The situation is actually quite complex, but I'll attempt to explain.

TL;DR

  1. @hone's patch itself, in isolation, is just fine.
  2. Currently, you cannot use the JSON gem together with Rails. More accurately, you cannot use the JSON gem's generator if you have required, directly or indirectly, active_support/json/encoding. JSON gem's parser should work just fine.
  3. If you are currently using the JSON gem adapter with Rails (AS::JSON::Encoding), and you have been calling object#to_json or MultiJson.dump(object), you are actually using Rail's pure Ruby JSON encoder :trollface:.
  4. If you are currently calling JSON.generate or JSON.dump in your Rails app, you should stop doing that. If you try, your app might break unexpectedly 💥.
  5. Scroll to bottom for "So what should MultiJson do?"

❗ ❗ ❗ ❗ ❗

By the way, don't EVER require json/add/core in your Rails (or ActiveSupport-powered) app. Things would break spectacularly and would likely introduce some very dangerous security vulnerabilities to your app.

Here is why:

  1. ActiveSupport's JSON encoding API provide three methods/hooks: #as_json(options = {}), #to_json(options = {}) (the default is actually nil, but the point is they options are expected to be a Hash) and #encode_json(encoder)
  2. JSON gem's encoding API consists of a single method/hook: #to_json(state = nil, *)

As you can see, they both define #to_json, but the signature is different. In fact, they do very different things. Rail's #to_json is an entry point for its encoder. JSON gem's #to_json is the encoder.

Here is what (currently) happens when you call ActiveSupport's some_object.to_json(some: "options"):

  1. It create an instance of AS::JSON::Encoding::Encoder to hold the options hash
  2. It calls encoder.encode(some_object), which in turns calls some_object.as_json(@options).encode_json(self)

Here is what happens when you call JSON gem's some_object.to_json(some: "options"):

  1. It turns the first argument into a JSON::State:
    1. If it's already a JSON::State object, do nothing
    2. If it's the object can be converted into a Hash,
      1. Convert it into a Hash
      2. Copy all the values stored in the Hash as instance variables (!)
      3. Pretend to be a Hash (JSON::State does not inherit from Hash) by implementing #to_h, #to_hash, #[], #[]=, and #merge :trollface:
    3. Otherwise, raise an error
  2. Convert self into a JSON string, by recursively calling #to_json(state) (for objects with children, i.e. Array and Hash) and joining the returned strings with commas and stuff

And this is what happens when you call JSON.generate(some_obj, quirks_mode: true, other: "json gem's options"):

  1. It converts the second parameter into a JSON::State
  2. It calls state.generate(some_obj), which calls...
  3. some_obj.to_json(self)

If you connect the dots, there are two major problems that prevents you from mixing JSON gem with ActiveSupport's JSON encoder:

  1. JSON gem's JSON.generate doesn't actually do the encoding - it just delegates to the #to_json method on that object. But Rails also defines the same method and pipes that through it's own encoder! So, when you have active_support/json/encoding loaded, JSON.generate actually uses ActiveSupport's encoder instead.**
  2. If you call JSON.generate with ActiveSupport's encoder loaded, it will pass a JSON::State object into your #as_json methods. Because JSON::State pretends to be a Hash, things would actually work okay for the most part, until you try to call the Hash methods that it doesn't define, such as #key?

There is one minor detail I haven't mentioned yet, the JSON gem comes in two versions – the pure Ruby version and the C extension, which is what MRI actually uses. The stuff I wrote above applies to the pure Ruby version. The C extension does more or less the same thing, except:

  1. When you can actually use it, it is faster (obviously)
  2. For built in classes that it recognizes (specificallyHash, Array, String, nil, false, true, Fixnum, Bignum, Float – but not their subclasses) it doesn't call #to_json on it, and use the hardcoded (in C) algorithm

Which is why, on a good day, when you're calling JSON.generate in Rails on some "clean" data structure, it will actually work, and it'll be quite fast. However, as soon as you started passing in objects that the C extension doesn't recognize, it'll call your #to_json method with its JSON::State argument.

The future

In Rails 4.1, MultiJson will be removed. Currently, the parsing side hardcoded to use the JSON gem. The encoding side is hardcoded to use its pure Ruby encoder.

I'm working on a refactor of the encoding side to solve the aforementioned problems and use the JSON gem for generating the JSON string. You can track my progress at rails/rails#12183. It's still a WIP, and obviously we don't 100% know if it would land in Rails 4.1, but I'm quite determined to get this done and I have made some significant progress already.

By the way, if you have a Rails 4 app and would like to help test the new JSON encoder when it's ready, please shoot me an email at godfrey@caliper.io. It would be really helpful to be able to test this with real-world apps to tune compatibility and performance.

So what should MultiJson do?

Honestly, I'm not quite sure. IMO the ideal thing to do would be...

  1. Merge this PR
  2. Add an :active_support adapter, and prefer it over :json_gem adapter if ActiveSupport::JSON is defined (yes it's slower, but it avoids surprises)
  3. When someone try to set MultiJson's adapter to :json_gem, and ActiveSupport::JSON is defined, then raise a n error and die 💥

But that won't work, because ActiveSupport <= 4.0 still uses MultiJson for its decoding, so there would be a chicken an egg problem for decoding.

So realistically, the only option for MultiJson at this point is:

  1. Don't merge this PR for maximum compatibility
  2. When someone try to call MultiJson::Adapters::JsonCommon.dump, check if ActiveSupport::JSON::Encoding is defined, if so emit then a warning that they are not using what they think are using

To close off...

  1. "when freedom patches fight, nobody wins" -@jeremy
  2. If you have a chance to write a library of this sort, don't expose your internal states through your public hooks, and keep your entry point and your engine internals seperate

@stve
Copy link

stve commented Sep 18, 2013

thanks to @chancancode for taking on this hornets nest. 👏

@rwz rwz closed this Sep 28, 2013
chancancode added a commit to chancancode/rails that referenced this pull request Nov 6, 2013
See [1] for why this is not a good idea.

As part of this refactor, circular reference protection in as_json has
been removed and the corresponding error class has been deprecated.

As discussed with @jeremy, circular reference error is considered
programmer errors and protecting against it is out of scope for
the encoder.

This is again based on the excellent work by @sergiocampama in rails#11728.

[1]: intridea/multi_json#138 (comment)
chancancode added a commit to chancancode/rails that referenced this pull request Nov 6, 2013
JSON.{dump,generate} offered by the JSON gem is not compatiable with
Rails at the moment and can cause a lot of subtle bugs when passed
certain data structures. This changed all direct usage of the JSON gem
in internal Rails code to always go through AS::JSON.{decode,encode}.

We also shouldn't be implementing `to_json` most of the time, and
these occurances are replaced with an equivilent `as_json`
implementation to avoid problems down the road.

See [1] for all the juicy details.

[1]: intridea/multi_json#138 (comment)
chancancode added a commit to chancancode/rails that referenced this pull request Nov 7, 2013
See [1] for why this is not a good idea.

As part of this refactor, circular reference protection in as_json has
been removed and the corresponding error class has been deprecated.

As discussed with @jeremy, circular reference error is considered
programmer errors and protecting against it is out of scope for
the encoder.

This is again based on the excellent work by @sergiocampama in rails#11728.

[1]: intridea/multi_json#138 (comment)
tomclegg added a commit to arvados/arvados that referenced this pull request Jan 21, 2015
* We use the Oj and multi_json gems, which makes Oj the default JSON
  parser. However, Rails' ActiveRecord::Base overrides this and uses
  the native JSON parser, which is slow. In our case we have two
  render() calls that represent nearly all cases where we ask
  ActiveRecord to serialize for us. In both cases we already have a
  hash (not a model object), and we always want JSON responses. So we
  can fix the performance problem simply by calling Oj.dump()
  ourselves, and passing the resulting JSON (instead of the hash) to
  render().

More gory details:

* "ActiveRecord::Base.extend kills JSON performance":
  rails/rails#9212

* "when freedom patches fight, nobody wins":
  intridea/multi_json#138 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants