Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySQL duplicate key being reported as an error, even though it's expected #3256

Closed
dextorer opened this issue Nov 14, 2023 · 3 comments · Fixed by #3316
Closed

MySQL duplicate key being reported as an error, even though it's expected #3256

dextorer opened this issue Nov 14, 2023 · 3 comments · Fixed by #3316
Assignees
Labels
bug Involves a bug community Was opened by a community member

Comments

@dextorer
Copy link

dextorer commented Nov 14, 2023

Current behaviour
We're seeing MySQL errors being reported by ddtrace when using Rails' create_or_find_by! method. Quoting from the documentation:

Attempts to create a record with the given attributes in a table that has a unique constraint on one or several of its columns. If a row already exists with one or several of these unique constraints, the exception such an insertion would normally raise is caught, and the existing record with those attributes is found using #find_by!.

The exception is caught at the Rails level, as we can observe from the source code:

# File activerecord/lib/active_record/relation.rb, line 209
    def create_or_find_by(attributes, &block)
      transaction(requires_new: true) { create(attributes, &block) }
    rescue ActiveRecord::RecordNotUnique
      find_by!(attributes)
    end

Yet, ddtrace reports this as an error.

image

image

Expected behaviour
ddtrace should not report the Mysql2::Error:Duplicate entry error, because that exception is caught at the Rails level (or, more generally speaking, because it's an expected behavior).

Steps to reproduce

  1. Add a specific entry to the DB using create_or_find_by!
  2. Try to add the same exact record again using the same create_or_find_by! function

Additional information
I checked the MySQL output when running the service locally:

TRANSACTION (0.7ms)  BEGIN
	↳ app/services/create_in_app_purchase_service.rb:13:in `call'
Buyer Create (4.7ms)  INSERT INTO `buyers` (`user_id`, `created_at`, `updated_at`) VALUES ('1', '2023-11-10 09:34:34', '2023-11-10 09:34:34')
	↳ app/services/create_in_app_purchase_service.rb:13:in `call'
TRANSACTION (2.4ms)  ROLLBACK
	↳ app/services/create_in_app_purchase_service.rb:13:in `call'
Buyer Load (1.7ms)  SELECT `buyers`.* FROM `buyers` WHERE `buyers`.`user_id` = '1' LIMIT 1
	↳ app/services/create_in_app_purchase_service.rb:13:in `call'
InAppPurchase Load (1.8ms)  SELECT `in_app_purchases`.* FROM `in_app_purchases` WHERE `in_app_purchases`.`receipt_id` = '10' LIMIT 1
	↳ app/services/create_in_app_purchase_service.rb:19:in `call'
TRANSACTION (0.5ms)  BEGIN
	↳ app/services/create_in_app_purchase_service.rb:27:in `call'
Buyer Load (1.4ms)  SELECT `buyers`.* FROM `buyers` WHERE `buyers`.`id` = 1 LIMIT 1
	↳ app/services/create_in_app_purchase_service.rb:27:in `call'
TRANSACTION (0.4ms)  COMMIT
	↳ app/services/create_in_app_purchase_service.rb:27:in `call'
Completed 202 Accepted in 58ms (Views: 0.3ms | ActiveRecord: 17.2ms | Allocations: 13076)

As you can see, the behavior is precisely what we expect from create_or_find_by!: it tries to create a record (INSERT INTO), fails (ROLLBACK), then goes on to finding it (SELECT).

Environment

  • ddtrace version::
    1.11.1

  • Configuration block (Datadog.configure ...):

Datadog.configure do |c|
  c.env = "my_test_env"
  c.service = SERVICE_NAME

  c.tracing.enabled = true
  c.tracing.instrument :rails
  c.tracing.instrument :mysql2, service_name: "#{SERVICE_NAME}-mysql2"
  c.tracing.instrument :shoryuken, service_name: "#{SERVICE_NAME}-shoryuken"

  # for debugging purposes
  c.agent.host = 'dd_agent'
  c.agent.port = 8126
end
  • Ruby version:
    3.1.4

  • Operating system:
    MacOS, Linux (irrelevant)

  • Relevant library versions:
    N/A

@dextorer dextorer added bug Involves a bug community Was opened by a community member labels Nov 14, 2023
@TonyCTHsu
Copy link
Contributor

TonyCTHsu commented Nov 16, 2023

👋 @dextorer , I believe this behaviour is expected.

From the graph, the blue span is generated by ActiveRecord while the purple span is generated by mysql2.
The exception is raised by the mysql2 but rescued by ORM.

@TonyCTHsu TonyCTHsu self-assigned this Nov 16, 2023
@dextorer
Copy link
Author

Thank you for confirming this behavior. I think we had the wrong assumption that instrumentations would be "aware" of each other and, in this specific case, not log an error if a higher-layer instrumentation is catching that exception; but as you confirmed, they are completely independent.

This makes sense, depending on the angle you look at it. From a user perspective, it is perfectly reasonable to choose which instrumentations to add, without dependencies between them. Want to monitor MySQL? Add the mysql instrumentation. Want to monitor Rails? Add the rails instrumentation. Mix, match, it just works.

At the same time, this approach has the potential of generating noise, since we usually rescue from exceptions on the framework level (Rails/Grape). In this case, the framework itself rescues from the MySQL exception, which seems to be an even stronger indicator that it is reasonable to expect a MySQL exception not to show up.

I don't have a solution to suggest, I just want to offer the two perspectives. If we don't see a clear next action (aside from maybe documenting this somewhere), I'm happy for this issue to be closed as informative.

@TonyCTHsu
Copy link
Contributor

👋 @dextorer , thanks for sharing your perspectives.

We are well aware of such kind of dependent behaviour between instrumentations. This is not limited to ORM (ActiveRecord) and its DB driver (mysql2). Another example would be Elasticsearch gem and its http client(Faraday or Typhoeus).

I totally agree with your perspective and I believe this is a tricky balance. It is important for us to provide the fine granular visibility for each instrumentation but that could potentially lead to bad signal/noise ratio.

Right now, it tells the truth about what is actually happening for your application or how it truly behaves. I supposed the noise is still manageable and engineers can make sense of the graph after understanding the dependent behaviour between instrumentations.

I would love to figure out ways to improve the S/N ratio for such scenario, but my honest answer would be that it is not trivia and the task itself is not prioritized at the moment.

@TonyCTHsu TonyCTHsu linked a pull request Dec 11, 2023 that will close this issue
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Involves a bug community Was opened by a community member
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants