Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Process hostname encoding error in sqlite #143

Merged
merged 1 commit into from
Feb 7, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion lib/solid_queue/processes/registrable.rb
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ def kind
end

def hostname
@hostname ||= Socket.gethostname
@hostname ||= Socket.gethostname.force_encoding(Encoding::UTF_8)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just have a question: why not try to convert the string to UTF-8 via encode(Encoding::UTF_8) rather than changing the encoding alone?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm thanks for a good question! I think it's just a habit that i've typically reached for force_encoding in the past so I did it here. I guess most issues i've encountered have been encoding misinterpretations, vs needing to be re-encoded.

But I think encode should work just as well, might work for more use-cases, and also doesn't mutate anything which is a nice side effect (or lack of a side effect 🙃 ). I'll change it

Copy link
Contributor Author

@jpcamara jpcamara Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, it actually doesn't work.

hostname = "Basecamp’s-Computer"
 => "Basecamp’s-Computer"
hostname.force_encoding("ASCII-8BIT")
 => "Basecamp\xE2\x80\x99s-Computer"
hostname.force_encoding("ASCII-8BIT").encode(Encoding::UTF_8)
(irb):3:in `encode': "\xE2" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)

It also fails the same way if I use the direct result of Socket.gethostname on my machine:

Socket.gethostname.force_encoding("ASCII-8BIT") # works
Socket.gethostname.encode(Encoding::UTF_8) # fails
# `encode': "\xE2" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)

The return value of gethostname seems to always come back in binary format, but the underlying bytes, at least in my case, are already valid UTF-8 (\xE2\x80\x99 is the byte sequence for "right single quotation mark"). So by forcing the encoding to be UTF-8 with force_encoding, it reinterprets the string correctly.

When I use encode it seems to actually convert as if it was binary data to utf-8 and fails.

I don't know for certain forcing encoding couldn't cause an issue on some platform somewhere - but the same format has been in use by the newrelic ruby agent for the past 5 years (newrelic/newrelic-ruby-agent@e1f65a8), so there's a good chance it's ok to keep as force_encoding.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, how odd! The first case, I'd expect it to fail, but it's very strange the string comes from Socket.gethostname with the wrong encoding, I wonder why that's it...

In any case, ok to keep force_encoding 👍

end

def process_pid
Expand Down
12 changes: 12 additions & 0 deletions test/models/solid_queue/process_test.rb
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
require "test_helper"
require "minitest/mock"

class SolidQueue::ProcessTest < ActiveSupport::TestCase
test "prune processes with expired heartbeats" do
Expand All @@ -13,4 +14,15 @@ class SolidQueue::ProcessTest < ActiveSupport::TestCase
SolidQueue::Process.prune
end
end

test "hostname's with special characters are properly loaded" do
worker = SolidQueue::Worker.new(queues: "*", threads: 3, polling_interval: 0.2)
hostname = "Basecamp’s-Computer"

Socket.stub :gethostname, hostname.force_encoding("ASCII-8BIT") do
worker.start
wait_for_registered_processes(1, timeout: 1.second)
assert_equal hostname, SolidQueue::Process.last.hostname
end
end
end