
perf: poor performance on small table without index, likely due to contention #16252

Closed
tbg opened this issue May 31, 2017 · 8 comments

Labels
A-kv-transactions Relating to MVCC and the transactional model. C-performance Perf of queries or internals. Solution not expected to change functional behavior.

tbg commented May 31, 2017

Not sure how actionable this is, but user @Freeaqingme reported very poor behavior in production with the following schema and queries. The main issue seemed to be contention from the full table scan associated with the first SELECT statement below, but the actual behavior was extremely bad. Adding an index on webhostingServerFqdn appears to have completely resolved the situation. When unhappy, reads would routinely have to wait ~2s for single overlapping requests, and would still take ~300ms after having cleared the command queue. The leader node for the processing table would be constantly maxed out on CPU. Since many operations in this setting take >500ms, SendNextTimeout kicks in and makes an even bigger mess.

I can understand that an insertion-heavy schema with full table scans performs poorly due to contention, but I don't understand why individual operations should take ~1s; it seems that the contention somehow causes a more fundamental badness that shouldn't be there. I think one would be able to strip this schema down to the bare necessities and to repro the bad behavior with load similar to the one explained below.

CREATE TABLE request (
  id SERIAL PRIMARY KEY,
  command STRING NOT NULL,
  arguments STRING NOT NULL,
  webhostingServerFqdn STRING NOT NULL,
  workflowId INT NOT NULL,
  workflowRequest STRING NOT NULL,
  version INT NOT NULL,
  requestedAt TIMESTAMP NOT NULL DEFAULT NOW(),
  FAMILY requestFamily (command, arguments, webhostingServerFqdn, version, requestedAt),
  FAMILY responseFamily (workflowId, workflowRequest)
);

CREATE TABLE processing (
  id SERIAL PRIMARY KEY,
  command STRING NOT NULL,
  arguments STRING NOT NULL,
  webhostingServerFqdn STRING NOT NULL,
  workflowId INT NOT NULL,
  workflowRequest STRING NOT NULL,
  version INT NOT NULL,
  requestedAt TIMESTAMP NOT NULL,
  processingAt TIMESTAMP NOT NULL DEFAULT NOW(),
  FAMILY requestFamily (command, arguments, webhostingServerFqdn, version, requestedAt, processingAt),
  FAMILY responseFamily (workflowId, workflowRequest)
);

CREATE TABLE response (
  id SERIAL PRIMARY KEY,
  command STRING NOT NULL,
  arguments STRING NOT NULL,
  webhostingServerFqdn STRING NOT NULL,
  workflowId INT NOT NULL,
  workflowRequest STRING NOT NULL,
  version INT NOT NULL,
  requestedAt TIMESTAMP NOT NULL,
  processingAt TIMESTAMP NOT NULL,
  responseAt TIMESTAMP NOT NULL DEFAULT NOW(),
  responseOutput STRING NOT NULL,
  responseError STRING NOT NULL,
  exitCode INT NOT NULL,
  FAMILY requestFamily (command, arguments, webhostingServerFqdn, version, requestedAt, processingAt),
  FAMILY responseFamily (workflowId, workflowRequest, responseAt, responseOutput, responseError, exitCode)
);
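For reference, the fix the reporter applied was an index on webhostingServerFqdn. The exact definition isn't in the report, so the following is a sketch (the index name is made up; listing id explicitly is optional, since CockroachDB secondary indexes implicitly store the primary key columns):

```sql
-- Hypothetical index; the report only says an index was added on
-- webhostingServerFqdn. With it, the dequeue query below can scan just
-- the matching fqdn prefix in id order instead of the whole table.
CREATE INDEX request_fqdn_idx ON request (webhostingServerFqdn, id);
```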

grep 'SELECT\|UPDATE\|INSERT\|DELETE' QueueProcessor/* | grep -v _test.go  | cut -d '"' -f'2-' | sed 's/)$//g' | sed 's/"$//g'

# SELECT query is executed every 5 seconds from ~100 machines
SELECT id, webhostingServerFqdn, version, command, arguments, workflowId, workflowRequest, requestedAt FROM " + tableRequest + " WHERE webhostingServerFqdn IN (" + webhostingServerFqdnIn + ") ORDER BY id ASC LIMIT 1

# If the query above has a result (which typically only applies to a few machines at a time), the following:
DELETE FROM " + tableRequest + " WHERE id=$1
SELECT id, webhostingServerFqdn, version, command, arguments, workflowId, workflowRequest, requestedAt, processingAt FROM " + tableProcessing + " WHERE id=$1 ORDER BY id ASC LIMIT 1
INSERT INTO " + tableProcessing + " (id, webhostingServerFqdn, version, command, arguments, workflowId, workflowRequest, requestedAt) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
DELETE FROM " + tableProcessing + " WHERE id=$1
INSERT INTO " + tableResponse + " (id, webhostingServerFqdn, version, command, arguments, workflowId, workflowRequest, requestedAt, processingAt, responseOutput, responseError, exitCode) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12)

# Logic performed by a handful of nodes every few seconds
SELECT id, webhostingServerFqdn, version, command, arguments, workflowId, workflowRequest, requestedAt, processingAt, responseAt, responseOutput, responseError, exitCode FROM " + tableResponse + " ORDER BY id ASC LIMIT 1
DELETE FROM " + tableResponse + " WHERE id = $1

@petermattis

@spencerkimball Can you perform a bit of initial triage given your recent work on high contention scenarios?

bdarnell commented Jun 1, 2017

SELECT query is executed every 5 seconds from ~100 machines

Without the index, that's 20 full scans of the table per second. This table is being used as a queue, so there are probably more tombstones than there is actual data. These add to the cost of the full table scan but not to the properly-indexed query (because the deleted tombstones generally have lower IDs, adding the index lets us skip them). I'm not sure that we need contention-related effects to explain what's going on here.

@petermattis

Hmm, that's a good point. We might be scanning over a ton of deleted keys to find the first non-deleted key. Should be straightforward to verify that with a simple test. I don't have any good thoughts on how to fix this, though.

bdarnell commented Jun 1, 2017

We're not even looking for the first non-deleted key. We're looking for the first non-deleted key whose webhostingServerFqdn matches the query. I don't think there's anything we could reasonably do here that would improve the performance of this query without an index. Ultimately it's up to the application to use the right indexes for their queries.

We should, on the other hand, think about how we could make it more apparent to the user that this index is necessary. A LIMIT 1 query that the planner can't figure out how to do without a full table scan is usually a bad sign. The postgres protocol allows us to send "notice" messages back with the response to give warnings to the user. The visibility of these warnings depends on the tool and is generally poor, but it's better than nothing (especially if we can show them in our sql shell). We could also let the user pass in some sort of "max tombstones" query hint to have the query abort rather than become expensive if something like this happens.
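In the meantime, the full scan is already visible by hand. A sketch against the schema above (the fqdn values are placeholders; the exact output shape varies across CockroachDB versions):

```sql
-- EXPLAIN shows the access path. Without an index on
-- webhostingServerFqdn, the plan scans request@primary (a full table
-- scan) and then filters, sorts, and limits.
EXPLAIN SELECT id, webhostingServerFqdn, version, command, arguments,
               workflowId, workflowRequest, requestedAt
        FROM request
        WHERE webhostingServerFqdn IN ('web1.example.com', 'web2.example.com')
        ORDER BY id ASC
        LIMIT 1;
```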

@Freeaqingme

Although of course ideally you'd find some sort of optimization that fixes this (besides users adding sane indexes, something we clearly weren't doing), there are two metrics you could perhaps add to the admin console:

  • The number of queries that exceed a certain rows/tombstones scanned ratio
  • The number of queries that required a full scan

Once you drill down to a specific query, the warnings in the sql shell may be of interest (as well as the TRACE and EXPLAIN output, perhaps). But when all you know is "the cluster seems slow and pulling a lot of cpu", some metrics in the admin console would be a first step toward determining what's going on.
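For what it's worth, later CockroachDB releases grew statistics along these lines. A sketch, assuming a modern version whose statement statistics expose a full_scan flag (this virtual table and column did not exist in 2017):

```sql
-- Hypothetical usage on a modern release: list statement fingerprints
-- that performed full scans, most frequent first.
SELECT key, count
FROM crdb_internal.node_statement_statistics
WHERE full_scan
ORDER BY count DESC;
```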

@petermattis petermattis modified the milestone: 1.1 Jun 1, 2017
@petermattis


I wonder if there is some way to keep track of the non-deleted span of keys for a range. Seems a bit challenging to do at the MVCC/Replica level, but perhaps this could be added to the sstable metadata as part of compactions and iterators could indicate that they are interested in non-deleted keys.

@petermattis


Any improvement here will have to wait until 1.2.

@petermattis petermattis modified the milestones: 1.1, 1.2 Sep 26, 2017
@bdarnell bdarnell modified the milestones: 2.0, Later Feb 8, 2018
@petermattis petermattis added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-kv-transactions Relating to MVCC and the transactional model. labels Jul 21, 2018
tbg commented Jul 31, 2018

Closing this for #17229.

@tbg tbg closed this as completed Jul 31, 2018