Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS::SQS::Errors::SignatureDoesNotMatch errors on Rubinius #455

Closed
yorickpeterse opened this issue Jan 28, 2014 · 65 comments
Closed

AWS::SQS::Errors::SignatureDoesNotMatch errors on Rubinius #455

yorickpeterse opened this issue Jan 28, 2014 · 65 comments

Comments

@yorickpeterse
Copy link

Since switching a daemon over to Rubinius we've been getting quite a few of the following errors:

AWS::SQS::Errors::SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.

The Canonical String for this request should have been
'POST
/

content-length:99
content-type:application/x-www-form-urlencoded; charset=utf-8
host:sqs.eu-west-1.amazonaws.com
user-agent:aws-sdk-ruby/1.32.0 rbx/2.1.0 x86_64-linux-gnu
x-amz-content-sha256:5e204b2282fa47a4ecee2d1d985468c5a342980c025cad72584efcc69eb2f938
x-amz-date:20140128T112115Z

content-length;content-type;host;user-agent;x-amz-content-sha256;x-amz-date
5e204b2282fa47a4ecee2d1d985468c5a342980c025cad72584efcc69eb2f938'

The String-to-Sign should have been
'AWS4-HMAC-SHA256
20140128T112115Z
20140128/eu-west-1/sqs/aws4_request
b511d5ea84ff04bd3445ec8f15fdf21b458a41d7b452a4057018dd66a2e3cfab'

The corresponding stack trace:

File "/var/www/review_persister/deploy-2014-01-28_11_20_30/vendor/bundle/rbx/2.1/gems/aws-sdk-1.32.0/lib/aws/core/client.rb" line 368 in return_or_raise
File "/var/www/review_persister/deploy-2014-01-28_11_20_30/vendor/bundle/rbx/2.1/gems/aws-sdk-1.32.0/lib/aws/core/client.rb" line 469 in client_request
File "(eval)" line 3 in get_queue_url
File "/var/www/review_persister/deploy-2014-01-28_11_20_30/vendor/bundle/rbx/2.1/gems/aws-sdk-1.32.0/lib/aws/sqs/queue_collection.rb" line 161 in url_for
File "/var/www/review_persister/deploy-2014-01-28_11_20_30/vendor/bundle/rbx/2.1/gems/aws-sdk-1.32.0/lib/aws/sqs/queue_collection.rb" line 139 in named

The root frame of the error is our own code which creates an instance of AWS::SQS and then tries to retrieve a queue by its name. This happens in a threaded environment (10 threads to be exact) under Rubinius. MRI does not (so far) seem to be affected.

It's not exactly clear to me what's causing it. The individual AWS::SQS and related instances are not shared between threads, nor are any of our own AWS related operations.

As a temporary fix I disabled checksum validation for SQS operations but I'd rather not keep that disabled in the long run.

Setup info:

  • Ruby: rubinius 2.2.3 (2.1.0 4792e746 2013-12-29 JI) [x86_64-linux-gnu]
  • AWS SDK version: 1.32.0
  • Region: eu-west-1 (properly configured using AWS.config)
@yorickpeterse
Copy link
Author

To clarify, we've been running the same code base under MRI for a good 2 months now without any of these errors popping up. Last Friday I switched things over to Rubinius and they've since started to occur. It could be a bug in Rubinius but the stack trace doesn't show any clear signs of this sadly.

@yorickpeterse
Copy link
Author

And it seems disabling checksum validation doesn't do anything, the error just re-occurred.

@yorickpeterse
Copy link
Author

Extra info: the error seems to occur when a thread encounters an error, kicks off the error handling process and is then re-started (using a retry). I'll see if I can set up a way to reproduce this.

@yorickpeterse
Copy link
Author

Managed to reproduce it using the following script on Rubinius:

require 'aws'
require 'thread'

AWS.config(
  :access_key_id     => '...SNIP...',
  :secret_access_key => '...SNIP...',
  :sqs_region        => 'eu-west-1',
  :sqs_endpoint      => 'sqs.eu-west-1.amazonaws.com'
)

AWS.eager_autoload!(AWS::SQS)
AWS.eager_autoload!(AWS::Core)

def do_work
  sqs   = AWS::SQS.new
  queue = sqs.queues.named('test')

  queue.poll do |message|
    sleep 2 # Fake some work
    raise 'Fake error to showcase restarting of threads'
  end
end

threads = []
mutex   = Mutex.new

10.times do
  threads << Thread.new do
    begin
      do_work
    rescue => error
      # Don't mess up console output
      mutex.synchronize { puts "Error: #{error.message}" }

      sleep 2
      retry
    end
  end
end

threads.each(&:join)

@yorickpeterse
Copy link
Author

It's worth noting that the behaviour is rather random. In some runs I can't reproduce it, some other runs it happens basically right away.

@trevorrowe
Copy link
Member

Its curious that it is generating a signature error. Thank you for all of the details. I'll see what I can do about reproducing on my end.

@yorickpeterse
Copy link
Author

Any news on this? I'll be performing some extra tests today on the latest version of Rbx but I believe the issue still persists. Not yet sure yet if it's an issue with the AWS SDK or Rbx though.

@yorickpeterse
Copy link
Author

Extra note: this issue still persists on Rbx 2.2.6. Not sure yet if it's an Rbx or aws-sdk issue.

@yorickpeterse
Copy link
Author

Any news on this? I'd love to somehow fix this in whatever way we can in Rbx. However, I'm not familiar with the inner workings of the aws-sdk and have never seen anything like this outside of it.

@trevorrowe
Copy link
Member

Sorry for the slow response. I spent quite a bit of time reproducing and trying to resolve the issue from my end. Here are some of my findings:

  • I am unable to reproduce this using MRI 2.0.0p353
  • I can reproduce this issue locally "fairly" consistently using rubinius 2.2.1
  • The service responds with the error "The request signature we calculated does not match the signature you provided."
    • The error contains a string to sign and canonical request as computed by the service
    • I have patched my SDK to append the string to sign and canonical request to the error
    • These values always match up 100% with those computed by the service

The authorization header is contains a signature that is computed roughly like so:

OpenSSL::HMAC.hexdigest( OpenSSL::Digest.new('sha256'), derived_key, string_to_sign)

I have verified I am working with the correct string to sign. That leaves the possible sources of discrepancy to:

  • the derived key
  • the computed digest

The derived key is is calculated like so:

def derive_key(datetime)
  k_secret = credentials.secret_access_key
  k_date = hmac("AWS4" + k_secret, datetime[0,8])
  k_region = hmac(k_date, region)
  k_service = hmac(k_region, service_name)
  k_credentials = hmac(k_service, 'aws4_request')
end

Given the intermittent failure modes, this leeds me to strongly suspect a thread-safety issue. The v4 signer that computes the signature does not share state between calls to sign (a new digest is always created), except the read only credentials.

Is it possible that the OpenSSL::HMAC or OpenSSL::Digest modules are not thread safe? I'm open to suggestions.

@yorickpeterse
Copy link
Author

I'm not aware of any thread-safety issues with OpenSSL::Digest. The code used by Rubinius for this is the same code as used in MRI with some changes to make it compatible with Rbx. Assuming this were thread-unsafe it should also affect MRI, though the GIL probably prevents it from occuring. Having said that, it could be that indeed OpenSSL is somehow not thread-safe.

Is there a way this can be tested without the whole aws-sdk stack? Since you mentioned that the various signatures match it could be that something else other than OpenSSL is interfering along the line.

@JoshMcKin
Copy link

Hate to barge in with some possibly not very useful information. In the early development of EmAws I saw this error when I hadn't patched the Mutex with the Em.synchrony's fiber safe mutex. I 90% sure there use to be a mutex around fetching request credentials or a least a portion like the session_token but cannot be sure. I didn't create an issue in EmAws at the time because it was before it was ever released.

This was so long ago it might not be related but thought I'd mention it.

EmAws Mutex patch: https://github.com/JoshMcKin/em_aws/blob/master/lib/em_aws/patches.rb

@trevorrowe
Copy link
Member

@yorickpeterse Yes, we should be able to test and isolate this further. I simply ran out of time yesterday.

Some more background on the error message:

The two strings the service echos back in the error message are the:

  • canonical request
  • string to sign.

The canonical request is a string compiled from the request uri, headers and a full sha256 checksum of the http request body.

After generating the canonical request, you generate a "string to sign" which is comprised of a fixed string prefix, the current time, the credential scope (date/region/service name/fixed suffix) and finally a hexdigest of the canonical request string.

In my local test script (I will try to share this later), I am comparing the canonical request and string to sign that I generate with those returned by the service. They match perfectly in every error. Given the canonical request contains a sha256 checksum of the body, and the service checksum matches ours, I am positive the request is not getting corrupted over the wire.

This points to a bad signature. The final signature is created by computing a hex digest of the string to sign using a derived key (the derived key is your secret access key signed recursively with each portion of the credential scope).

The service validates the signature by computing and comparing signatures. In the case of failure, it echos back the computed fingerprints (canonical request and string to sign). It does not echo back the computed signature, but that is what fails to match. This is why I feel like the error is happening while generating the signature.

@JoshMcKin Not unhelpful, but I don't think that is the cause. In my test example, I am creating a single sqs client (with a single credentials object). That object is used to get my queue by name outside the concurrent sections of code. This should guarantee my static credentials are ready to read/reuse. You are correct, the non-static credential providers use a mutex to allow them to update.

I won't much have time to look at this today, I'm heading out of town this afternoon. I'll try to pick this back up on Monday though.

@trevorrowe
Copy link
Member

@yorickpeterse I have not been able to reproduce this error locally single switching over to OpenSSL::Digest. Can you confirm?

@yorickpeterse
Copy link
Author

@trevorrowe I haven't tried this yet with using OpenSSL::Digest instead of plain Digest. I'll give this a try tomorrow.

@yorickpeterse
Copy link
Author

Sorry for the late reply. I gave this a shot using aws-sdk 1.43.2 and have not yet experienced the error. I'll do some further testing in the coming days to be absolutely sure it's gone.

trevorrowe added a commit to amazon-archives/aws-sdk-core-ruby that referenced this issue Jun 18, 2014
The Digest class has some thread safety issues.

See aws/aws-sdk-ruby#455
See aws/aws-sdk-ruby#529
trevorrowe added a commit to amazon-archives/aws-sdk-core-ruby that referenced this issue Jun 18, 2014
@yorickpeterse
Copy link
Author

Upon further testing this problem does seem to still occur on Rubinius 2.2.10 using aws-sdk 1.46.0. In this case I'm getting errors such as the following:

AWS::SQS::Errors::SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.

The Canonical String for this request should have been
'POST
/345153512707/shepherd

content-length:171
content-type:application/x-www-form-urlencoded; charset=utf-8
host:sqs.eu-west-1.amazonaws.com
user-agent:aws-sdk-ruby/1.46.0 rbx/2.1.0 x86_64-linux-gnu
x-amz-content-sha256:0edb693803f329fa2976405b32df33f1bd0600f5aeea7b5d3cd15a3bec7ab9ab
x-amz-date:20140703T151611Z

content-length;content-type;host;user-agent;x-amz-content-sha256;x-amz-date
0edb693803f329fa2976405b32df33f1bd0600f5aeea7b5d3cd15a3bec7ab9ab'

The String-to-Sign should have been
'AWS4-HMAC-SHA256
20140703T151611Z
20140703/eu-west-1/sqs/aws4_request
7447fe83fa48a596773ebecb9503237a2e4614988f6bb696bbe26fed8b6394ef

@yorickpeterse
Copy link
Author

Contents of the Gemfile:

source 'https://rubygems.org'

gem 'activerecord', '~> 4.0'
gem 'mysql2'

gem 'rollbar'
gem 'daemon-kit'
gem 'aws-sdk', '~> 1.0'
gem 'logstash-file'
gem 'oni', '~> 3.0'
gem 'dotenv'
gem 'json', '>= 1.8.1'

group :development, :test do
  gem 'pry'
  gem 'pry-doc'
  gem 'pry-theme'

  gem 'rspec', '~> 3.0'
  gem 'simplecov'
  gem 'rake'
end

group :yard do
  gem 'yard'
  gem 'kramdown'
end

And Gemfile.lock:

GEM
  remote: https://rubygems.org/
  specs:
    activemodel (4.1.4)
      activesupport (= 4.1.4)
      builder (~> 3.1)
    activerecord (4.1.4)
      activemodel (= 4.1.4)
      activesupport (= 4.1.4)
      arel (~> 5.0.0)
    activesupport (4.1.4)
      i18n (~> 0.6, >= 0.6.9)
      json (~> 1.7, >= 1.7.7)
      minitest (~> 5.1)
      thread_safe (~> 0.1)
      tzinfo (~> 1.1)
    arel (5.0.1.20140414130214)
    aws-sdk (1.46.0)
      json (~> 1.4)
      nokogiri (>= 1.4.4)
    builder (3.2.2)
    coderay (1.1.0)
    daemon-kit (0.3.0)
      eventmachine (>= 0.12.10)
      thor
    diff-lcs (1.2.5)
    docile (1.1.5)
    dotenv (0.11.1)
      dotenv-deployment (~> 0.0.2)
    dotenv-deployment (0.0.2)
    eventmachine (1.0.3)
    i18n (0.6.9)
    json (1.8.1)
    kramdown (1.4.0)
    logstash-file (0.2.0)
      json
    method_source (0.8.2)
    mini_portile (0.6.0)
    minitest (5.3.5)
    multi_json (1.10.1)
    mysql2 (0.3.16)
    nokogiri (1.6.2.1)
      mini_portile (= 0.6.0)
    oni (3.1.0)
    pry (0.10.0)
      coderay (~> 1.1.0)
      method_source (~> 0.8.1)
      slop (~> 3.4)
    pry-doc (0.6.0)
      pry (~> 0.9)
      yard (~> 0.8)
    pry-theme (1.1.2)
      coderay (~> 1.1)
      json (~> 1.8)
    rake (10.3.2)
    rollbar (0.13.1)
      multi_json (~> 1.3)
    rspec (3.0.0)
      rspec-core (~> 3.0.0)
      rspec-expectations (~> 3.0.0)
      rspec-mocks (~> 3.0.0)
    rspec-core (3.0.2)
      rspec-support (~> 3.0.0)
    rspec-expectations (3.0.2)
      diff-lcs (>= 1.2.0, < 2.0)
      rspec-support (~> 3.0.0)
    rspec-mocks (3.0.2)
      rspec-support (~> 3.0.0)
    rspec-support (3.0.2)
    simplecov (0.8.2)
      docile (~> 1.1.0)
      multi_json
      simplecov-html (~> 0.8.0)
    simplecov-html (0.8.0)
    slop (3.5.0)
    thor (0.19.1)
    thread_safe (0.3.4)
    tzinfo (1.2.1)
      thread_safe (~> 0.1)
    yard (0.8.7.4)

PLATFORMS
  ruby

DEPENDENCIES
  activerecord (~> 4.0)
  aws-sdk (~> 1.0)
  daemon-kit
  dotenv
  json (>= 1.8.1)
  kramdown
  logstash-file
  mysql2
  oni (~> 3.0)
  pry
  pry-doc
  pry-theme
  rake
  rollbar
  rspec (~> 3.0)
  simplecov
  yard

@yorickpeterse
Copy link
Author

Ok I did some testing and both Digest and OpenSSL::Digest instances are not safe to share across threads. On MRI this seems to magically work, probably due to the lack of proper threading. On Rubinius doing so will rather quickly result in either a Ruby exception or an entire VM crash when running the following:

require 'thread'
require 'digest'
require 'digest/sha1'

Thread.abort_on_exception = true

expected = '0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33'
threads  = []
digest   = Digest::SHA1.new

100.times do
  threads << Thread.new do
    loop do
      got = digest.hexdigest('foo')

      unless got == expected
        raise "#{got} is not #{expected}"
      end
    end
  end
end

threads.each(&:join)

Using https://github.com/aws/aws-sdk-ruby/search?q=%22OpenSSL+Digest%22&type=Code there seem to be some places where Digest instances are kept around. For example,

OpenSSL::HMAC.hexdigest(@digest, @key, value)
stores an instance variable with a Digest instance.

I'll do some digging to see if those particular bits are shared between threads.

@trevorrowe
Copy link
Member

The chunk signed stream class you linked to is currently not used by the SDK. Each client constructs a single signer and reuses that across requests. However, I don't believe any of the signer classes cache digest objects (except the un-used chunk signer).

It could be fairly easy to test. The base client class defines a #sign_request method. This method memoizes the request signer. If this memoization were removed, the #sign_request method would construct a new signer for every request.

#sign_request is defined here:

https://github.com/aws/aws-sdk-ruby/blob/master/lib/aws/core/client.rb#L701

@yorickpeterse
Copy link
Author

@trevorrowe The code in particular is a tad confusing and I'm not sure what method exactly to override. That is, AWS::Core::Client does not seem to have the method signature_version nor sign_request as either an instance or class method (at least according to Pry). The API documentation (http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/Core/Client.html) also makes no mention of said method.

Basically my idea is to override the method so that it doesn't memoize the signer, then test if that works without throwing any signature errors.

@trevorrowe
Copy link
Member

Each service can choose a different signature version / signer. For this reason, there is a macro .signature_version that defines the #sign_request method.

As a quick monkey patch, the following should work:

class AWS::SQS::Client
  def sign_request(req)
    args = [credential_provider, 'sqs', req.region]
    signer = AWS::Core::Signers::Version4.new(*args)
    signer.sign_request(req)
    req
  end
end

The code this is replacing looks like so:

def signature_version(version, service_signing_name = nil)
  define_method(:sign_request) do |req|
    @signer ||= begin
      signer_class = AWS::Core::Signers.const_get(version)
      signer_args = (version == :Version4) ?
        [credential_provider, service_signing_name, req.region] :
        [credential_provider]
      signer_class.new(*signer_args)
    end
    @signer.sign_request(req)
    req
  end
end

Notice, the default implantation caches the signer to the client instance. The monkey patched version creates a new signer for each request.

@yorickpeterse
Copy link
Author

After testing said patch it appears that removing the caching of the signer instance doesn't prevent the errors from occurring. A global lock however seems to do the trick so far:

class AWS::SQS::Client
  GLOBAL_LOCK = Mutex.new

  def sign_request(req)
    GLOBAL_LOCK.synchronize do
      args = [credential_provider, 'sqs', req.region]
      signer = AWS::Core::Signers::Version4.new(*args)
      signer.sign_request(req)
      req
    end
  end
end

This might suggest that the actual AWS client instance is shared between threads. This would be odd considering the application in question itself shares nothing AWS related between threads. Instead each SQS connection runs in its own, isolated thread with its own client. I'll try to see if the client is indeed accessed from multiple threads.

@yorickpeterse
Copy link
Author

I applied the following patch to our codebase:

class AWS::SQS::Client
  GLOBAL_LOCK = Mutex.new

  def sign_request(req)
    id = Thread.current.instance_variable_get(:@thread_id)
    puts "Thread ##{id} accessing #{self.class} ##{object_id}"

    GLOBAL_LOCK.synchronize do
      args = [credential_provider, 'sqs', req.region]
      signer = AWS::Core::Signers::Version4.new(*args)
      signer.sign_request(req)
      req
    end
  end
end

Running the application results in the following output:

Thread #12 accessing AWS::SQS::Client::V20121105 #165844
Thread #16 accessing AWS::SQS::Client::V20121105 #165844
Thread #11 accessing AWS::SQS::Client::V20121105 #165844
Thread #14 accessing AWS::SQS::Client::V20121105 #165844
Thread #10 accessing AWS::SQS::Client::V20121105 #165844
Thread #13 accessing AWS::SQS::Client::V20121105 #165844
Thread #12 accessing AWS::SQS::Client::V20121105 #165844
Thread #16 accessing AWS::SQS::Client::V20121105 #165844
Thread #11 accessing AWS::SQS::Client::V20121105 #165844
Thread #17 accessing AWS::SQS::Client::V20121105 #165844
Thread #13 accessing AWS::SQS::Client::V20121105 #165844
Thread #9 accessing AWS::SQS::Client::V20121105 #165844
Thread #8 accessing AWS::SQS::Client::V20121105 #165844
Thread #15 accessing AWS::SQS::Client::V20121105 #165844
Thread #10 accessing AWS::SQS::Client::V20121105 #165844
Thread #14 accessing AWS::SQS::Client::V20121105 #165844
Thread #12 accessing AWS::SQS::Client::V20121105 #165844
Thread #16 accessing AWS::SQS::Client::V20121105 #165844
Thread #13 accessing AWS::SQS::Client::V20121105 #165844
Thread #17 accessing AWS::SQS::Client::V20121105 #165844
Thread #11 accessing AWS::SQS::Client::V20121105 #165844
Thread #10 accessing AWS::SQS::Client::V20121105 #165844
Thread #8 accessing AWS::SQS::Client::V20121105 #165844
Thread #14 accessing AWS::SQS::Client::V20121105 #165844
Thread #15 accessing AWS::SQS::Client::V20121105 #165844
Thread #16 accessing AWS::SQS::Client::V20121105 #165844
Thread #12 accessing AWS::SQS::Client::V20121105 #165844
Thread #9 accessing AWS::SQS::Client::V20121105 #165844
Thread #11 accessing AWS::SQS::Client::V20121105 #165844
Thread #13 accessing AWS::SQS::Client::V20121105 #165844
Thread #14 accessing AWS::SQS::Client::V20121105 #165844
Thread #8 accessing AWS::SQS::Client::V20121105 #165844
Thread #15 accessing AWS::SQS::Client::V20121105 #165844
Thread #16 accessing AWS::SQS::Client::V20121105 #165844
Thread #12 accessing AWS::SQS::Client::V20121105 #165844
Thread #17 accessing AWS::SQS::Client::V20121105 #165844
Thread #9 accessing AWS::SQS::Client::V20121105 #165844
Thread #11 accessing AWS::SQS::Client::V20121105 #165844
Thread #15 accessing AWS::SQS::Client::V20121105 #165844
Thread #16 accessing AWS::SQS::Client::V20121105 #165844
Thread #13 accessing AWS::SQS::Client::V20121105 #165844
Thread #12 accessing AWS::SQS::Client::V20121105 #165844
Thread #8 accessing AWS::SQS::Client::V20121105 #165844
Thread #14 accessing AWS::SQS::Client::V20121105 #165844

The last number in each line is the object ID of the client in question. Perhaps I'm missing something but this would suggest that there's a single AWS::SQS::Client instance being shared across all threads.

@yorickpeterse
Copy link
Author

Worth noting: the @thread_id variable is Rubinius specific and its a hack to get it out this way. As far as I know you can't get thread IDs reliably on MRI.

@trevorrowe
Copy link
Member

Thats interesting.

Given the global mutex around the signing code seems to resolve the problem, that implies one of a few possible causes:

  • The credential_provider object shared by each signer
  • Code within the signer, e.g. OpenSSL

The v4 signer has 3 instance variables, which are the credential, the service name string and the region string. What type of credentials object are you testing with? Can you dump out the credentials class name?

@JoshMcKin
Copy link

This may not matter but rubinius 2.2.9 is set to ruby version 2.1.0 but the open ssl threadsafe patch is in MRI Ruby 2.1.1+

@yorickpeterse
Copy link
Author

If it helps, while I haven't seen these particular errors on JRuby I did just get the following error:

NameError: uninitialized constant AWS::Core::Signers::S3::Base
org/jruby/RubyModule.java line 2723 in const_missing
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/signers/s3.rb line 59 in signature
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/signers/s3.rb line 53 in authorization
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/signers/s3.rb line 47 in sign_request
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/s3/client.rb line 58 in sign_request
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/client.rb line 492 in client_request
org/jruby/RubyProc.java line 271 in call
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/response.rb line 175 in build_request
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/response.rb line 114 in initialize
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/client.rb line 203 in new_response
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/client.rb line 490 in client_request
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/client.rb line 391 in log_client_request
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/client.rb line 477 in client_request
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/client.rb line 373 in return_or_raise
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/core/client.rb line 476 in client_request
(eval) line 3 in put_object
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/s3/s3_object.rb line 1760 in write_with_put_object
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/s3/s3_object.rb line 612 in write
/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.61.0/lib/aws/s3/object_collection.rb line 85 in create
/usr/local/rvm/gems/jruby-1.7.16.1/gems/opener-daemons-2.3.2/lib/opener/daemons/uploader.rb line 32 in create
/usr/local/rvm/gems/jruby-1.7.16.1/gems/opener-daemons-2.3.2/lib/opener/daemons/uploader.rb line 18 in upload
/usr/local/rvm/gems/jruby-1.7.16.1/gems/opener-daemons-2.3.2/lib/opener/daemons/worker.rb line 73 in upload_output
/usr/local/rvm/gems/jruby-1.7.16.1/gems/opener-daemons-2.3.2/lib/opener/daemons/worker.rb line 46 in process

While this is in S3 and not SQS I suspect this code is suffering from the same problem.

@quinn
Copy link

quinn commented Mar 16, 2015

i see the label "Version 1" on here, but I am seeing this with v2.0.30. I am using MRI 2.1.3p242. error happens after awhile using threads.

@yorickpeterse
Copy link
Author

@quinn Can you reproduce it using the script mentioned in #455 (comment)? You probably have to adjust the script a bit for V2 of the SDK though.

@quinn
Copy link

quinn commented Mar 16, 2015

@yorickpeterse I think i was mistaken, i'm trying w/o threads and still seeing. sorry for the confusion

@yorickpeterse
Copy link
Author

@quinn If possible, could you at least share the error output if you still have it?

@quinn
Copy link

quinn commented Mar 16, 2015

sure:

/Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-core-2.0.30/lib/seahorse/client/plugins/raise_response_errors.rb:15:in `call': The request signature we calculated does not match the signature you provided. Check your key and signing method. (Aws::S3::Errors::SignatureDoesNotMatch)
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-core-2.0.30/lib/aws-sdk-core/plugins/s3_sse_cpk.rb:18:in `call'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-core-2.0.30/lib/seahorse/client/plugins/param_conversion.rb:22:in `call'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-core-2.0.30/lib/aws-sdk-core/plugins/response_paging.rb:10:in `call'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-core-2.0.30/lib/seahorse/client/plugins/response_target.rb:18:in `call'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-core-2.0.30/lib/seahorse/client/request.rb:70:in `send_request'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-core-2.0.30/lib/seahorse/client/base.rb:216:in `block (2 levels) in define_operation_methods'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-resources-2.0.30/lib/aws-sdk-resources/request.rb:24:in `call'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-resources-2.0.30/lib/aws-sdk-resources/operations.rb:41:in `call'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-resources-2.0.30/lib/aws-sdk-resources/operation_methods.rb:19:in `block in add_operation'
    from copy_assets.rb:8:in `block in <main>'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-resources-2.0.30/lib/aws-sdk-resources/batch.rb:75:in `each'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-resources-2.0.30/lib/aws-sdk-resources/batch.rb:75:in `each'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-resources-2.0.30/lib/aws-sdk-resources/batch.rb:75:in `each'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-resources-2.0.30/lib/aws-sdk-resources/collection.rb:18:in `block in each'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-resources-2.0.30/lib/aws-sdk-resources/operations.rb:128:in `block in all_batches'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-core-2.0.30/lib/aws-sdk-core/pageable_response.rb:77:in `each'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-resources-2.0.30/lib/aws-sdk-resources/operations.rb:127:in `all_batches'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-resources-2.0.30/lib/aws-sdk-resources/collection.rb:18:in `each'
    from /Users/lsu/.rbenv/versions/2.1.3/lib/ruby/gems/2.1.0/gems/aws-sdk-resources-2.0.30/lib/aws-sdk-resources/collection.rb:18:in `each'
    from copy_assets.rb:5:in `<main>'

It happens after awhile but always on the same object. I am batch copying from one bucket to another using Object#copy_from.

@quinn
Copy link

quinn commented Mar 16, 2015

I was not url encoding the copy source param, a bit embarrassing. Still, it's a cryptic error message for this type of error.

@yorickpeterse
Copy link
Author

@trevorrowe This particular problem also seems to occur when using V2 of the SDK and Rubinius 2.5.2:

Aws::S3::Errors::SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your key and signing method.
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-core-2.0.40/lib/seahorse/client/plugins/raise_response_errors.rb" line 15 in call
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-core-2.0.40/lib/aws-sdk-core/plugins/s3_sse_cpk.rb" line 18 in call
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-core-2.0.40/lib/seahorse/client/plugins/param_conversion.rb" line 19 in call
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-core-2.0.40/lib/aws-sdk-core/plugins/response_paging.rb" line 9 in call
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-core-2.0.40/lib/seahorse/client/plugins/response_target.rb" line 15 in call
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-core-2.0.40/lib/seahorse/client/request.rb" line 70 in send_request
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-core-2.0.40/lib/seahorse/client/base.rb" line 216 in define_operation_methods
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-resources-2.0.40/lib/aws-sdk-resources/request.rb" line 24 in call
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-resources-2.0.40/lib/aws-sdk-resources/operations.rb" line 41 in call
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/lib/kaf_processor/s3_uploader.rb" line 41 in upload_json
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/lib/kaf_processor/processor/text_nodes.rb" line 24 in process_without_trace_Custom_KafProcessor_Processor_TextNodes_process (process)
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/newrelic_rpm-3.10.0.279/lib/new_relic/agent/method_tracer.rb" line 343 in process_with_trace_Custom_KafProcessor_Processor_TextNodes_process
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/newrelic_rpm-3.10.0.279/lib/new_relic/agent/method_tracer_helpers.rb" line 81 in trace_execution_scoped
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/newrelic_rpm-3.10.0.279/lib/new_relic/agent/method_tracer.rb" line 341 in process (process_with_trace_Custom_KafProcessor_Processor_TextNodes_process)
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/lib/kaf_processor/worker.rb" line 62 in run_processor
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/lib/kaf_processor/worker.rb" line 40 in process
kernel/bootstrap/array.rb" line 76 in each
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/lib/kaf_processor/worker.rb" line 39 in _process (process)
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/config/initializers/newrelic_tracers.rb" line 16 in process_without_newrelic_transaction_trace (process)
(eval)" line 3 in process_with_newrelic_transaction_trace
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/newrelic_rpm-3.10.0.279/lib/new_relic/agent/instrumentation/controller_instrumentation.rb" line 353 in perform_action_with_newrelic_trace
(eval)" line 2 in process (process_with_newrelic_transaction_trace)
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/oni-4.0.0/lib/oni/daemon.rb" line 101 in run_worker
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/oni-4.0.0/lib/oni/daemon.rb" line 85 in process
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/oni-4.0.0/lib/oni/daemon.rb" line 174 in run_thread
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/oni-4.0.0/lib/oni/daemons/sqs.rb" line 35 in receive
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-resources-2.0.40/lib/aws-sdk-resources/services/sqs/queue_poller.rb" line 413 in yield_messages
kernel/bootstrap/array.rb" line 76 in each
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-resources-2.0.40/lib/aws-sdk-resources/services/sqs/queue_poller.rb" line 412 in yield_messages
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-resources-2.0.40/lib/aws-sdk-resources/services/sqs/queue_poller.rb" line 405 in process_messages
kernel/bootstrap/proc.rb" line 20 in call
kernel/common/throw_catch.rb" line 30 in catch
kernel/common/throw_catch.rb" line 7 in register
kernel/common/throw_catch.rb" line 29 in catch
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-resources-2.0.40/lib/aws-sdk-resources/services/sqs/queue_poller.rb" line 404 in process_messages
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-resources-2.0.40/lib/aws-sdk-resources/services/sqs/queue_poller.rb" line 336 in poll
kernel/common/kernel.rb" line 510 in loop
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-resources-2.0.40/lib/aws-sdk-resources/services/sqs/queue_poller.rb" line 331 in poll
kernel/bootstrap/proc.rb" line 20 in call
kernel/common/throw_catch.rb" line 30 in catch
kernel/common/throw_catch.rb" line 7 in register
kernel/common/throw_catch.rb" line 29 in catch
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/aws-sdk-resources-2.0.40/lib/aws-sdk-resources/services/sqs/queue_poller.rb" line 330 in poll
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/oni-4.0.0/lib/oni/daemons/sqs.rb" line 34 in receive
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/oni-4.0.0/lib/oni/daemon.rb" line 174 in run_thread
/var/www/kaf_processor/deploy-2015-04-24_15_08_47/vendor/bundle/rbx/2.1/gems/oni-4.0.0/lib/oni/daemon.rb" line 158 in spawn_thread
kernel/bootstrap/proc.rb" line 20 in call
kernel/bootstrap/thread.rb" line 396 in __run__

screenshot_2015-04-24_17-57-04

There is also a huge amount of autoloading problems occurring when using V2. On V1 we also had these problems but they could be solved by using AWS.eager_autoload!. Because such a method doesn't exist for V2 I had to hack it together myself, basically it would recursively refer to all constants, fixing the problem. I'll be looking into seeing if this is a problem with Rbx's autoload not being thread-safe.

@trevorrowe
Copy link
Member

I did not implement eager_auto! load in v2 as my understanding is that autoload in now thread-safe. We should explore this more.

As for the signature errors, the only fix I understand that we've tried that resolves the issue is to put a global mutex around the OpenSSL digest methods. This seems like it shouldn't be necessary. Thoughts?

@yorickpeterse
Copy link
Author

@trevorrowe At least on JRuby 1.7 autoload is not thread-safe as far as I'm aware of. I believe it wasn't until JRuby 9000 that it was made thread-safe, but @headius can probably fill you in on that. By the looks of it autoload isn't thread-safe on Rubinius either, I'll do some digging to see if there's anything we can do on our end to fix that.

As for the signature errors, the only fix I understand that we've tried that resolves the issue is to put a global mutex around the OpenSSL digest methods. This seems like it shouldn't be necessary. Thoughts?

In comment #455 (comment) I discussed this and set up a standalone script that couldn't reproduce anything close to the problem discussed in this issue. The code we use for OpenSSL is pulled directly from MRI commit ruby/ruby@5a58165. I still have to look into updating to the current version, I vaguely remember having problems last time I tried.

Either way, in V1 the problem was that the client/signature signing instances were shared between threads without any synchronisation in place. Is such a pattern still the case with V2?

@headius
Copy link

headius commented Apr 24, 2015

What is not thread-safe about autoload in JRuby 1.7?

@yorickpeterse
Copy link
Author

@headius see #455 (comment).

@headius
Copy link

headius commented Apr 24, 2015

Those errors do not necessarily indicate that JRuby's autoload is unsafe. Constants can end up missing during any concurrent require logic, regardless of whether autoload is involved, if code attempts to access classes while they're still booting. This can happen if, for example, defined? is used to determine the existence of a class. This is because class definition is not atomic.

Of course it could also indicate a thread-safety problem in JRuby's autoload, but it's definitely not the first place I'd look.

@headius
Copy link

headius commented Apr 24, 2015

Ok, looking at the code, there's an access of Base there, and Base is defined as an autoload...so there's a better chance this is a bug in autoload in JRuby. I'll see if I can come up with a case that reproduces it.

@yorickpeterse
Copy link
Author

At least this script https://gist.github.com/YorickPeterse/2efb97451fd27c34aec7 fails on Rubinius with constant missing errors, haven't managed to get it to fail on JRuby yet.

@yorickpeterse
Copy link
Author

In rubinius/rubinius@b57399f I took care of the autoloading problems (as far as I can tell). I finally managed to get my hands on a signature error using the scripts discussed in https://gist.github.com/YorickPeterse/2efb97451fd27c34aec7:

An exception occurred running repro.rb:

 (Aws::SQS::Errors::SignatureDoesNotMatch)

Backtrace:

  Seahorse::Client::Plugins::RaiseResponseErrors::Handler#call at /home/yorickpeterse/.gem/rbx/2.1.0/gems/aws-sdk-core-2.0.40/lib/seahorse/client/plugins/raise_response_errors.rb:15
      Seahorse::Client::Plugins::ParamConversion::Handler#call at /home/yorickpeterse/.gem/rbx/2.1.0/gems/aws-sdk-core-2.0.40/lib/seahorse/client/plugins/param_conversion.rb:22
                    Aws::Plugins::ResponsePaging::Handler#call at /home/yorickpeterse/.gem/rbx/2.1.0/gems/aws-sdk-core-2.0.40/lib/aws-sdk-core/plugins/response_paging.rb:10
       Seahorse::Client::Plugins::ResponseTarget::Handler#call at /home/yorickpeterse/.gem/rbx/2.1.0/gems/aws-sdk-core-2.0.40/lib/seahorse/client/plugins/response_target.rb:18
                        Seahorse::Client::Request#send_request at /home/yorickpeterse/.gem/rbx/2.1.0/gems/aws-sdk-core-2.0.40/lib/seahorse/client/request.rb:70
              { } in Aws::SQS::Client#define_operation_methods at /home/yorickpeterse/.gem/rbx/2.1.0/gems/aws-sdk-core-2.0.40/lib/seahorse/client/base.rb:216
                                      { } in Object#__script__ at repro.rb:11
                                                     Proc#call at kernel/bootstrap/proc.rb:20
                                                Thread#__run__ at kernel/bootstrap/thread.rb:356

This indicates that the script does eventually reproduce the problem given enough time and luck.

@yorickpeterse
Copy link
Author

In a production application the autoloading problems are indeed resolved, sadly the signature errors remain (when using V2). I'll be looking into this today to see if I can solve this problem for good.

@trevorrowe
Copy link
Member

@yorickpeterse In production, are you using static credentials, or are you using refreshing credentials, such as an IAM instance profile, assume role credentials, STS session credentials, etc?

@yorickpeterse
Copy link
Author

@trevorrowe I've tested this both on an EC2 instance (which uses IAM credentials) and locally (which uses static credentials). I'm currently looking into rubysl-openssl as I suspect it has some problems that might trigger this particular problem. Running my repro script mentioned above with -Xcapi.lock (preventing concurrent C API calls) seems to prevent any errors from occurring, hence my suspicion.

@yorickpeterse
Copy link
Author

We recently updated the OpenSSL code of Rubinius to match the code of the latest MRI version (rubysl-openssl is basically a 1:1 copy of MRI's OpenSSL code). Having looked at the code of V2 of the AWS SDK I couldn't find a point where mutable state was shared between threads, leading me to believe it might be OpenSSL that's broken.

To confirm/deny this I set up a new standalone script to try and trigger the problem:

require 'openssl'

Thread.abort_on_exception = true

input  = 'the cake is possibly a lie'
digest = OpenSSL::Digest::SHA256.new

digest.update(input)

expected = digest.hexdigest

puts 'Starting threads...'

threads = 10.times.map do
  Thread.new do
    loop do
      digest = OpenSSL::Digest::SHA256.new
      digest.update(input)

      got = digest.hexdigest

      if got != expected
        raise "Expected digest #{got.inspect} to equal #{expected.inspect}"
      end
    end
  end
end

threads.each(&:join)

On MRI this will run fine for all eternity. On Rubinius on the other hand this pretty much instantly crashes with the following output:

Starting threads...
An exception occurred running /tmp/openssl_thread.rb:

Expected digest "70b9d40465992fa71a861e050481fb1a4bed5c8aa127272e7bf5838a3b0ab240" to equal "83103cff21a7a50e45eb90c29ca9d24204d95fe7369565ec1d5339750da386ab" (RuntimeError)

Backtrace:

  { } in Object#__script__ at /tmp/openssl_thread.rb:23
       Kernel(Object)#loop at kernel/common/kernel.rb:511
  { } in Object#__script__ at /tmp/openssl_thread.rb:16
                 Proc#call at kernel/bootstrap/proc.rb:20
            Thread#__run__ at kernel/bootstrap/thread.rb:356

Interesting enough the digest appears to always be "70b9d40465992fa71a861e050481fb1a4bed5c8aa127272e7bf5838a3b0ab240". However, when running Rubinius with a CAPI lock (using -Xcapi.lock) the produced digest is "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855", though the VM sometimes segfault in the OpenSSL code.

To cut a long story short, it seems the OpenSSL extension is beyond broken in multi-threaded environments or Rubinius does some weird things to it. I'll continue digging until I'm sure what's to blame. If it turns out to be unrelated to the AWS SDK itself I'll close this issue.

@yorickpeterse
Copy link
Author

Per "rideliner" from the Rubinius Gitter channel, the above snippet is flawed. The local variable digest in the Thread.new block also overwrites the outer variable, leading to the race condition. Somehow I expected locals to at least be thread-local (even in this case). I shall get myself a dunce cap.

Either way, the actual signature problem still persists. I'll continue my investigation, but at least it seems that both rubysl-digest and rubysl-openssl are not the cause for these problems.

@yorickpeterse
Copy link
Author

@trevorrowe I literally just got this error on MRI 2.2 as well (though this is using V1), which seems to rule our Rubinius itself being the problem. The error/backtrace is as following (as taken from Rollbar):

AWS::SQS::Errors::SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.

The Canonical String for this request should have been
'POST
/345153512707/qr_generator

content-length:175
content-type:application/x-www-form-urlencoded; charset=utf-8
host:sqs.eu-west-1.amazonaws.com
user-agent:aws-sdk-ruby/1.63.0 ruby/2.2.2 x86_64-linux
x-amz-content-sha256:47236ab18c30db436023205b5b09bd472e527b24ade58bc3acc21ed4a0d6ad4c
x-amz-date:20150716T181615Z
x-amz-security-token:[REMOVED]

content-length;content-type;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-security-token
[REMOVED]'

The String-to-Sign should have been
'AWS4-HMAC-SHA256
20150716T181615Z
20150716/eu-west-1/sqs/aws4_request
8b1031f744460308b9295139a5b1c3992e3bf83e66b70b699fd3479fe2d25c19'

Backtrace:

File "/var/www/barqr/deploy-2015-06-17_13_58_37/vendor/bundle/ruby/2.2.0/gems/aws-sdk-v1-1.63.0/lib/aws/core/client.rb" line 375 in return_or_raise
File "/var/www/barqr/deploy-2015-06-17_13_58_37/vendor/bundle/ruby/2.2.0/gems/aws-sdk-v1-1.63.0/lib/aws/core/client.rb" line 476 in client_request
File "(eval)" line 3 in receive_message
File "/var/www/barqr/deploy-2015-06-17_13_58_37/vendor/bundle/ruby/2.2.0/gems/aws-sdk-v1-1.63.0/lib/aws/sqs/queue.rb" line 201 in receive_message
File "/var/www/barqr/deploy-2015-06-17_13_58_37/vendor/bundle/ruby/2.2.0/gems/aws-sdk-v1-1.63.0/lib/aws/sqs/queue.rb" line 303 in block in poll
File "/var/www/barqr/deploy-2015-06-17_13_58_37/vendor/bundle/ruby/2.2.0/gems/aws-sdk-v1-1.63.0/lib/aws/sqs/queue.rb" line 301 in loop
File "/var/www/barqr/deploy-2015-06-17_13_58_37/vendor/bundle/ruby/2.2.0/gems/aws-sdk-v1-1.63.0/lib/aws/sqs/queue.rb" line 301 in poll
File "/var/www/barqr/deploy-2015-06-17_13_58_37/vendor/bundle/ruby/2.2.0/gems/oni-3.1.0/lib/oni/daemons/sqs.rb" line 34 in receive
File "/var/www/barqr/deploy-2015-06-17_13_58_37/vendor/bundle/ruby/2.2.0/gems/oni-3.1.0/lib/oni/daemon.rb" line 180 in run_thread
File "/var/www/barqr/deploy-2015-06-17_13_58_37/vendor/bundle/ruby/2.2.0/gems/oni-3.1.0/lib/oni/daemon.rb" line 164 in block in spawn_thread

@yorickpeterse
Copy link
Author

For the first time ever I also experienced this issue on JRuby 1.7:

AWS::SQS::Errors::SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.

The Canonical String for this request should have been
'POST
/345153512707/opener-opinion-detector-basic

content-length:192
content-type:application/x-www-form-urlencoded; charset=utf-8
host:sqs.eu-west-1.amazonaws.com
user-agent:aws-sdk-ruby/1.64.0 jruby/1.9.3 java
x-amz-content-sha256:90638f76a52ee75c5210c44a615ba9e75278e3e54da88972f50cb680501308d6
x-amz-date:20150820T230324Z
x-amz-security-token:[REMOVED]

content-length;content-type;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-security-token
[REMOVED]'

The String-to-Sign should have been
'AWS4-HMAC-SHA256
20150820T230324Z
20150820/eu-west-1/sqs/aws4_request
25907c2c04c518ad21ab5a3c66f1a31fa3eaacc87c62cc6aecf740cb44710c14'

Backtrace:

File "/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.64.0/lib/aws/core/client.rb" line 375 in return_or_raise
File "/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.64.0/lib/aws/core/client.rb" line 476 in client_request
File "(eval)" line 3 in receive_message
File "/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.64.0/lib/aws/sqs/queue.rb" line 201 in receive_message
File "/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.64.0/lib/aws/sqs/queue.rb" line 303 in poll
File "org/jruby/RubyKernel.java" line 1501 in loop
File "/usr/local/rvm/gems/jruby-1.7.16.1/gems/aws-sdk-v1-1.64.0/lib/aws/sqs/queue.rb" line 301 in poll
File "/usr/local/rvm/gems/jruby-1.7.16.1/gems/oni-3.1.1/lib/oni/daemons/sqs.rb" line 34 in receive
File "/usr/local/rvm/gems/jruby-1.7.16.1/gems/oni-3.1.1/lib/oni/daemon.rb" line 180 in run_thread
File "/usr/local/rvm/gems/jruby-1.7.16.1/gems/oni-3.1.1/lib/oni/daemon.rb" line 164 in spawn_thread

Ruby info:

jruby 1.7.16.1 (1.9.3p392) 2014-10-28 4e93f31 on OpenJDK 64-Bit Server VM 1.7.0_85-mockbuild_2015_07_20_19_47-b00 +jit [linux-amd64]

@awood45
Copy link
Member

awood45 commented Dec 9, 2016

This issue has been inactive for over a year, and V1 is end of life. Happy to continue to look at this, given all the effort in place so far, if we get more information. For now, closing. Again, feel free to reopen if more information comes in.

@awood45 awood45 closed this as completed Dec 9, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants