Digest::Base cannot be directly inherited in Ruby #525

yorickpeterse opened this issue Apr 30, 2014 · 18 comments

After upgrading some of our aws-sdk based systems to MRI 2.1.1 we started seeing the following errors:

RuntimeError: Digest::Base cannot be directly inherited in Ruby

File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/signers/version_4.rb" line 195 in new
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/signers/version_4.rb" line 195 in hexdigest
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/signers/version_4.rb" line 188 in body_digest
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/signers/version_4.rb" line 61 in sign_request
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/client.rb" line 705 in block in signature_version
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/client.rb" line 491 in block (3 levels) in client_request
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/response.rb" line 171 in call
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/response.rb" line 171 in build_request
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/response.rb" line 111 in initialize
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/client.rb" line 203 in new
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/client.rb" line 203 in new_response
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/client.rb" line 489 in block (2 levels) in client_request
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/client.rb" line 390 in log_client_request
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/client.rb" line 476 in block in client_request
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/client.rb" line 372 in return_or_raise
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/core/client.rb" line 475 in client_request
File "(eval)" line 3 in get_queue_url
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/sqs/queue_collection.rb" line 165 in url_for
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/aws-sdk-1.36.2/lib/aws/sqs/queue_collection.rb" line 143 in named
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/vendor/bundle/ruby/2.1.0/gems/oni-3.1.0/lib/oni/daemons/sqs.rb" line 56 in queue
File "/var/www/review_collector/deploy-2014-04-30_01_23_15/lib/review_collector/daemon.rb", line 72 in receive

This is on MRI ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-linux] using aws-sdk 1.36.2. The application itself is multi-threaded but uses individual AWS::SQS instances per thread. No data is shared between threads on application level.

It's not yet clear to me what exactly is causing this problem. It only seems to occur sometimes: less than 1% of our workload triggers this error. This leads me to believe it's a race condition, which will be fun to debug. I'll see if I can piece together some extra info.

[screenshot: Digest errors grouped per hour]

Looking at this graph (showing these Digest errors grouped per hour), the errors don't seem to happen at specific times; instead they appear to occur fairly randomly.

Contents of the application's Gemfile (with trimmed Git URLs):

source ''

gem 'rollbar'
gem 'daemon-kit'
gem 'aws-sdk', '~> 1.0'
gem 'logstash-file'
gem 'oni', '~> 3.0'
gem 'json', ['>= 1.8.1']

gem 'dalli'
gem 'nokogiri', ['~> 1.6', '>= 1.6.1']
gem 'httpclient'
gem 'countries'

gem 'mongoid', '~> 3.1'
gem 'activerecord', '~> 3.0'

# Git dependencies.

gem 'holidaycheck_api',
  :git    => '...',
  :branch => 'master'

gem 'hotels_nl',
  :git    => '...',
  :branch => 'master'

gem 'tripadvisor', :git => '...'

group :development, :test do
  gem 'pry'
  gem 'pry-doc'
  gem 'pry-theme'
  gem 'bond'
  gem 'webmock'
  gem 'ansi'

  gem 'rspec'
  gem 'ci_reporter'
  gem 'simplecov'
  gem 'rake'
  gem 'rubocop'
end

group :yard do
  gem 'yard'
  gem 'kramdown'
end

Contents of the Gemfile.lock:

  remote: ...
  revision: 20dd108366b7d6fdaa7e6629fb42cab4db1e5ea8
  branch: master
    holidaycheck_api (0.3.1)
      nori (~> 2.3)

  remote: ...
  revision: b0be19e9a649cfb02801abb550f93f14540f9810
  branch: master
    hotels_nl (0.2.0)
      nori (~> 2.3)

  remote: ...
  revision: 02ab7a98e4915e99c790245ff7cbfec7bc6e4cfe
    tripadvisor (1.0.1)

    activemodel (3.2.17)
      activesupport (= 3.2.17)
      builder (~> 3.0.0)
    activerecord (3.2.17)
      activemodel (= 3.2.17)
      activesupport (= 3.2.17)
      arel (~> 3.0.2)
      tzinfo (~> 0.3.29)
    activesupport (3.2.17)
      i18n (~> 0.6, >= 0.6.4)
      multi_json (~> 1.0)
    addressable (2.3.5)
    ansi (1.4.3)
    arel (3.0.3)
    ast (1.1.0)
    aws-sdk (1.36.2)
      json (~> 1.4)
      nokogiri (>= 1.4.4)
      uuidtools (~> 2.1)
    bond (0.5.1)
    builder (3.0.4)
    ci_reporter (1.9.1)
      builder (>= 2.1.2)
    coderay (1.1.0)
    countries (0.9.3)
      currencies (~> 0.4.2)
    crack (0.4.2)
      safe_yaml (~> 1.0.0)
    currencies (0.4.2)
    daemon-kit (0.2.3)
      eventmachine (>= 0.12.10)
      safely (>= 0.3.1)
    dalli (2.7.0)
    diff-lcs (1.2.5)
    docile (1.1.3)
    eventmachine (1.0.3)
    faraday (0.8.9)
      multipart-post (~> 1.2.0)
    faraday_middleware (0.9.0)
      faraday (>= 0.7.4, < 0.9)
    httpclient (
    i18n (0.6.9)
    json (1.8.1)
    kramdown (1.3.3)
    logstash-file (0.2.0)
    method_source (0.8.2)
    mini_portile (0.5.3)
    mongoid (3.1.6)
      activemodel (~> 3.2)
      moped (~> 1.4)
      origin (~> 1.0)
      tzinfo (~> 0.3.29)
    moped (1.5.2)
    multi_json (1.9.2)
    multipart-post (1.2.0)
    nokogiri (1.6.1)
      mini_portile (~> 0.5.0)
    nori (2.3.0)
    oni (3.1.0)
    origin (1.1.0)
    parser (2.1.7)
      ast (~> 1.1)
      slop (~> 3.4, >= 3.4.5)
    powerpack (0.0.9)
    pry (
      coderay (~> 1.0)
      method_source (~> 0.8)
      slop (~> 3.4)
    pry-doc (0.6.0)
      pry (~> 0.9)
      yard (~> 0.8)
    pry-theme (1.0.2)
      coderay (~> 1.1)
      json (~> 1.8)
    rainbow (2.0.0)
    rake (10.1.1)
    rollbar (0.12.15)
      multi_json (~> 1.3)
    rspec (2.14.1)
      rspec-core (~> 2.14.0)
      rspec-expectations (~> 2.14.0)
      rspec-mocks (~> 2.14.0)
    rspec-core (2.14.8)
    rspec-expectations (2.14.5)
      diff-lcs (>= 1.1.3, < 2.0)
    rspec-mocks (2.14.6)
    rubigen (1.5.7)
      activesupport (>= 2.3.5)
    rubocop (0.19.1)
      json (>= 1.7.7, < 2)
      parser (~> 2.1.7)
      powerpack (~> 0.0.6)
      rainbow (>= 1.99.1, < 3.0)
      ruby-progressbar (~> 1.4)
    ruby-progressbar (1.4.1)
    safe_yaml (1.0.1)
    safely (0.3.2)
    simplecov (0.8.2)
      docile (~> 1.1.0)
      simplecov-html (~> 0.8.0)
    simplecov-html (0.8.0)
    slop (3.5.0)
    thor (0.18.1)
    tzinfo (0.3.39)
    uuidtools (2.1.4)
    webmock (1.17.4)
      addressable (>= 2.2.7)
      crack (>= 0.3.2)
    yard (


  activerecord (~> 3.0)
  aws-sdk (~> 1.0)
  json (>= 1.8.1)
  mongoid (~> 3.1)
  nokogiri (~> 1.6, >= 1.6.1)
  oni (~> 3.0)

The offending code is the following:

To make things even weirder:

Digest::SHA256.superclass # => Digest::Base

Perhaps MRI pulls off some magic tricks but it's a bit odd that the SHA256 class inherits something that can't be inherited.
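A minimal sketch of how this error message can be provoked on purpose. It rests on an assumption drawn from my reading of ext/digest/digest.c, not from anything confirmed in this thread: allocation appears to walk up the superclass chain looking for algorithm metadata that the C extensions attach to classes such as Digest::SHA256, and raises when none is found. The anonymous `plain_subclass` below is hypothetical and exists only for illustration.

```ruby
require 'digest'
require 'digest/sha2'

# Hypothetical class for illustration: a pure-Ruby subclass of Digest::Base
# that has no C-level algorithm metadata attached to it.
plain_subclass = Class.new(Digest::Base)

error = begin
  plain_subclass.new
  nil
rescue RuntimeError => e
  e
end

# Print whatever was raised, plus the superclass of the real SHA256 class.
puts error.message if error
puts Digest::SHA256.superclass
```

If that reading is right, a Digest::SHA256 class that is visible to one thread before its metadata has been attached by another would fail in the same way.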

Seems this was reported in the past as well in amazon-archives/aws-sdk-core-ruby#43

Looking at the C code, seems super racy. If some other thread is modifying the same data structures it could potentially be the case that it ends up raising the error as the ancestor tree is still being set up.
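A rough stress harness for probing the suspected race without aws-sdk. This is only a sketch: the thread and iteration counts are arbitrary assumptions, and since the failure is timing-dependent (and may not occur at all on Ruby versions newer than the one in this report), a clean run proves nothing.

```ruby
require 'digest'

THREADS = 8         # assumed value, tune for your workload
ITERATIONS = 1_000  # assumed value

errors = Queue.new

threads = Array.new(THREADS) do |i|
  Thread.new do
    ITERATIONS.times do |n|
      begin
        # Digest::SHA256 is deliberately NOT preloaded, so the first
        # references from several threads may race on loading it.
        Digest::SHA256.hexdigest("payload-#{i}-#{n}")
      rescue RuntimeError => e
        errors << e
      end
    end
  end
end
threads.each(&:join)

puts "#{errors.size} errors out of #{THREADS * ITERATIONS} digests"
```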

@knu any comments on the above? Is the Digest module supposed to be thread-safe?

The rate at which this error occurs seems to vary depending on how fast data is being processed. For example, one service with a much smaller workload per job triggers this error much faster than a slower service.

You could try patching with if you are using 2.1.1.

OpenSSL had a thread safety issue that was resolved in July for 2.1.1, but does not appear to have made it to 2.0 much less 1.9.

We are running on MRI 2.1.1 so I don't think those changes resolve this particular problem.

Oh derp, I misinterpreted that as patching Digest, not OpenSSL::Digest. I was considering that or slapping a big mutex around the signing part but I haven't gotten to trying either method yet.

Yeah, if you're correct and plain Ruby Digest is not thread-safe, I was wondering how the guys might go about fixing it. The best-performing fix might be ugly, but a simple mutex around hexdigest would be burdensome for MRI 2.1.1 folks, or for anyone on JRuby, assuming it's not an issue for them.
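For reference, the "big mutex" idea could look roughly like the sketch below. `SerializedDigest` is a hypothetical helper, not part of aws-sdk, and the cost mentioned above applies: every digest computation in the process is serialized through one lock.

```ruby
require 'digest'

# Hypothetical helper for illustration; not part of aws-sdk.
module SerializedDigest
  LOCK = Mutex.new

  # Only one thread at a time may reference and use Digest::SHA256.
  def self.sha256_hexdigest(value)
    LOCK.synchronize { Digest::SHA256.hexdigest(value) }
  end
end

results = Array.new(4) do |i|
  Thread.new { SerializedDigest.sha256_hexdigest("body-#{i}") }
end.map(&:value)

puts results
```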

knu commented Apr 30, 2014

Preloading Digest::SHA256, i.e. require 'digest/sha2' at the top level, may work as a cure in the meantime.
I'll investigate it when I get the time, maybe this weekend.
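A small sketch of the preloading suggestion, assuming the application controls its own boot sequence: the concrete digest class is loaded on the main thread before any worker thread exists, so no thread has to trigger the on-demand load. The worker setup is invented for illustration.

```ruby
# Load the concrete implementation up front, on the main thread,
# before any worker threads are started.
require 'digest'
require 'digest/sha2'

workers = Array.new(4) do |i|
  Thread.new { Digest::SHA256.hexdigest("message-#{i}") }
end

digests = workers.map(&:value)
puts digests
```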

I deployed the following hack to one of our applications:

AWS::Core::Signers::Version4::Digest = OpenSSL::Digest

I've not seen the error pop up in said application since adding the hack. I'll give it a try with some other applications as well.
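To illustrate why a one-line constant assignment is enough, here is a sketch using a hypothetical stand-in module (`ExampleSigner` is not the real AWS::Core::Signers::Version4, and its method body is an assumption). Defining a `Digest` constant inside the namespace shadows the top-level `Digest` for code written lexically inside that namespace, so `Digest::SHA256` there resolves to `OpenSSL::Digest::SHA256`.

```ruby
require 'openssl'

# Hypothetical stand-in for the signer class; not the real aws-sdk code.
module ExampleSigner
  # Shadows the top-level Digest constant inside this namespace only.
  Digest = OpenSSL::Digest

  def self.hexdigest(value)
    # Resolves to OpenSSL::Digest::SHA256 because of the constant above.
    Digest::SHA256.new.update(value).hexdigest
  end
end

puts ExampleSigner.hexdigest('abc')
```

Code outside the namespace keeps seeing the regular top-level `Digest`, which is what makes the hack relatively contained.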

Version 1.40.3 has been released with this fix.

Closing as fixed. Please re-open if the issue persists.

Is there a reproducible test case? I'd like to get this reported to ruby-core, if this is an issue with 2.1.1.

@findchris No, I haven't been able to set up a test case that doesn't use aws-sdk.

@yorickpeterse Do you have a reproducible test case that uses aws-sdk?

@findchris I vaguely recall having had one around but I can't seem to find it. You might be able to reproduce this particular error by using the code here #455 (comment). Having said that, since deploying the above fix (which is included in current aws-sdk releases) I have not experienced this particular problem.

knu added a commit to ruby/ruby that referenced this issue Oct 31, 2014
* ext/digest/lib/digest.rb (Digest()): This function should now be
  thread-safe.  If you have a problem with regard to on-demand
  loading under a multi-threaded environment, preload "digest/*"
  modules on boot or use this method instead of directly
  referencing Digest::*. [Bug #9494]
  cf. aws/aws-sdk-ruby#525

git-svn-id: svn+ssh:// b2dd03c8-39d4-4d8f-98ff-823fe69b080e
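A short usage sketch of the `Digest()` method the commit message refers to, assuming a Ruby version that includes this change. It looks the class up by name instead of referencing the `Digest::SHA256` constant directly.

```ruby
require 'digest'

# Look up the digest class through the Digest() method rather than
# through a direct Digest::SHA256 constant reference.
sha256 = Digest(:SHA256)

puts sha256.hexdigest('abc')
```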
ayumin pushed a commit to ayumin/ruby that referenced this issue Jan 4, 2015
hsbt pushed a commit to ruby/digest that referenced this issue Jul 28, 2017
alexdunae added a commit to rrn/acts_as_replaceable that referenced this issue Apr 14, 2021
Digest is apparently not threadsafe and was causing errors in the specs:

`Digest::Base cannot be directly inherited in Ruby`

