
perf: regression since 20190805 release #41206

Closed

maddyblue opened this issue Sep 30, 2019 · 20 comments

Labels: regression (Regression from a release.)

Comments

@maddyblue (Contributor)

While testing the 20190930 release in #41128, the test clusters diverged pretty heavily in their performance:

[screenshot: SQL latency on the 20190930 release]

vs

[screenshot: SQL latency on the 20190805 release]

The top image (the newer release) shows heavily increasing time and variance for SQL latency compared to the 0805 release, which shows only a slight increase over time. The various transaction, SQL, and storage metrics are similar. This looks like a performance regression introduced since then.

maddyblue added the regression label on Sep 30, 2019
@andreimatei (Contributor)

Just to eliminate one thing, can you please disable the periodic generation of the new replication reports by running set cluster setting kv.replication_reports.interval = 0? I think the chances of this helping are not great, though.

@maddyblue (Contributor, Author)

Sure. I executed set cluster setting kv.replication_reports.interval = '0s' just now on mjibson-release-250f4c36de2b88eff443cf9be9cd5d2759312c88. We can check back tomorrow and see what changes.
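For reference, a minimal sketch of applying and then verifying a cluster setting like this from Go over the SQL wire protocol. The connection string and the lib/pq driver choice are assumptions; only the setting name comes from the thread:

```go
// Sketch only: apply and verify a cluster setting over a SQL connection.
// The node address and credentials below are placeholders.
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // CockroachDB speaks the PostgreSQL wire protocol.
)

func main() {
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Disable periodic generation of replication reports.
	if _, err := db.Exec("SET CLUSTER SETTING kv.replication_reports.interval = '0s'"); err != nil {
		log.Fatal(err)
	}

	// Read the setting back to confirm it took effect.
	var interval string
	if err := db.QueryRow("SHOW CLUSTER SETTING kv.replication_reports.interval").Scan(&interval); err != nil {
		log.Fatal(err)
	}
	fmt.Println("kv.replication_reports.interval =", interval)
}
```

The same SHOW CLUSTER SETTING check works from any SQL shell if you just want to confirm the value on the cluster.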

@bdarnell (Contributor)

My first guess was that the uninitialized variable that Andrew K just discovered was causing a lot of unnecessary compactions (and for some obscure reason this uninitialized variable happened to be zero most of the time in the old build). But it looks like the number of rocksdb compactions is the same between the two builds. (We don't know the size of the compactions, so there could be a change in this area, but it's unlikely that the count of compactions would match so closely with a substantially different size).

Is this regression visible in roachperf, or does that not run long enough to see the performance degrade like this? It would be nice to narrow this change down to a smaller window than two months.

@nvanbenschoten (Member)

The high latencies appear at very regular intervals, with a period of almost exactly 30 minutes.

@nvanbenschoten (Member)

CPU utilization doesn't seem to be correlated with the spikes, but Go allocated memory usage does. We should wait for the next spike and grab a heap profile.
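A sketch of how heap profiles could be captured on a schedule so a spike can be bracketed. It assumes the node exposes Go's pprof handlers at http://localhost:8080/debug/pprof/heap, which is an assumption about the deployment rather than something stated in the thread:

```go
// Sketch: periodically snapshot the heap profile from a node's pprof
// endpoint so profiles from before and during a latency spike can be diffed.
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"time"
)

// snapshot downloads one heap profile and writes it to a timestamped file.
func snapshot(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	name := fmt.Sprintf("heap-%s.pb.gz", time.Now().Format("20060102-150405"))
	f, err := os.Create(name)
	if err != nil {
		return "", err
	}
	defer f.Close()

	_, err = io.Copy(f, resp.Body)
	return name, err
}

func main() {
	const heapURL = "http://localhost:8080/debug/pprof/heap" // assumed endpoint
	for {
		name, err := snapshot(heapURL)
		if err != nil {
			log.Println("snapshot failed:", err)
		} else {
			log.Println("wrote", name)
		}
		time.Sleep(5 * time.Minute)
	}
}
```

Two snapshots, one from a quiet period and one from a spike, can then be compared with go tool pprof -diff_base=<low>.pb.gz <high>.pb.gz, which is roughly the kind of high-vs-low diff shown a few comments below.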

@jordanlewis (Member)

Could this be statistics collection? I didn't think runs were so regularly scheduled, but collecting statistics does use a good amount of Go memory on each run. There's nothing in the jobs table though, and I think statistics runs do get jobs, so I think this theory is wrong.

@nvanbenschoten (Member)

nvanbenschoten commented Sep 30, 2019

I'm seeing large fluctuations in Go allocated memory in the following sources:

  457.52MB 29.25% 29.25%   551.03MB 35.23%  github.com/cockroachdb/cockroach/pkg/sql.(*optTableStat).init
   93.51MB  5.98% 35.23%    93.51MB  5.98%  github.com/cockroachdb/cockroach/pkg/sql/sqlbase.(*DatumAlloc).NewDInt
   41.74MB  2.67% 37.89%   593.77MB 37.96%  github.com/cockroachdb/cockroach/pkg/sql.newOptTable

EDIT: this is a diff between the high and the low profiles. (*optTableStat).init is responsible for more than this.

We also saw this in #40566 (comment) and we addressed it by reducing the SQL connection limit. Here we have 2k connections.

cc @RaduBerinde: is it possible that something changed here in the past two months? It is suspicious that we suddenly needed #40924 after never needing it before.

@nvanbenschoten (Member)

nvanbenschoten commented Sep 30, 2019

We do have very regular stats recomputation on the warehouse and district tables. Perhaps this is invalidating all of the connections' optimizer metadata and forcing recreation of query plans.

@nvanbenschoten (Member)

I'm going to turn off auto-stats collection (sql.stats.automatic_collection.enabled) and see if that has an effect.

@jordanlewis (Member)

#39743 turned on histograms by default, which was merged 8/20. I wonder if that's related.

@nvanbenschoten (Member)

It's possible. There is a sql.stats.histogram_collection.enabled cluster setting, so we can play with that once we see whether sql.stats.automatic_collection.enabled has any effect.
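Continuing the database/sql pattern from the sketch above, the two settings discussed here could be toggled as follows; only the setting names come from the thread, the helper names are made up:

```go
// Sketch: toggling the two stats-related cluster settings under discussion.
package settings

import "database/sql"

// disableAutoStats turns off automatic statistics recomputation cluster-wide.
func disableAutoStats(db *sql.DB) error {
	_, err := db.Exec("SET CLUSTER SETTING sql.stats.automatic_collection.enabled = false")
	return err
}

// disableHistograms keeps auto stats but skips histogram collection, to
// isolate the histogram memory cost mentioned above.
func disableHistograms(db *sql.DB) error {
	_, err := db.Exec("SET CLUSTER SETTING sql.stats.histogram_collection.enabled = false")
	return err
}
```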

@RaduBerinde (Member)

I believe the delta of memory usage in optTableStats is due to the histograms. @rytaft might know better how much of a difference we should expect. I do agree that having separate instances of these objects across connections is silly, so I plan to fix that for 20.1 (tracked by #40922). We can also try a smaller fix to make the stats cache itself store these objects.

@RaduBerinde (Member)

RaduBerinde commented Sep 30, 2019

I started working on a change to push the decoding of the histograms to the stats cache. That will reuse the histogram memory across all connections (and make things more efficient when a new statistic becomes available).

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Sep 30, 2019
Currently the stats cache exposes the encoded histogram data, and the
opt catalog decodes it. The problem is that opt catalog objects are
not shared across connections so this is inefficient (especially in
memory usage).

This change moves the decoding into the stats cache. The opt catalog
objects now simply point to the data computed by the cache.

There are still inefficiencies to improve on in the future: the opt
catalog can hold on to multiple versions of tables, so we will keep
many versions of histograms "alive".

Informs cockroachdb#41206.
Informs cockroachdb#40922.

Release justification: fix for new functionality.

Release note: None
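A minimal illustration of the ownership change the commit above describes, with hypothetical type names rather than the real stats-cache or opt-catalog code: histograms are decoded once in the shared cache, and per-connection objects hold pointers instead of their own decoded copies.

```go
// Illustrative only; these types and methods are not the actual CockroachDB
// code. The point is the ownership change: decoded histograms live in the
// node-wide cache, and per-connection catalog objects just point at them.
package statsdemo

import "sync"

// Histogram stands in for the decoded (memory-heavy) representation.
type Histogram struct {
	Buckets []struct{ NumRange, NumEq float64 }
}

// TableStatistic is what the shared cache hands out.
type TableStatistic struct {
	RowCount  uint64
	Histogram *Histogram // decoded once, shared by all connections
}

// StatsCache is shared by every connection on the node.
type StatsCache struct {
	mu    sync.Mutex
	stats map[int64][]*TableStatistic // keyed by table ID
}

// optTableStat is the per-connection view: it references the cache's decoded
// histogram rather than decoding the raw bytes itself.
type optTableStat struct {
	stat *TableStatistic
}

// GetTableStats returns shared statistics without any per-connection
// decoding or copying.
func (c *StatsCache) GetTableStats(tableID int64) []*TableStatistic {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.stats[tableID]
}
```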
@nvanbenschoten (Member)

> I'm going to turn off auto-stats collection (sql.stats.automatic_collection.enabled) and see if that has an effect.

Unfortunately, it looks like that didn't resolve the issue. We still see the oscillations.

Regardless, @RaduBerinde thanks for getting #41220 out so fast!

@nvanbenschoten (Member)

That said, it does seem like the memory fluctuations are from optTableStats. Here's a series of heap profiles that show its growing contribution.

[screenshots: four successive heap profiles from Oct 1, 2019 (11:28–11:29 AM) showing optTableStat's growing share of allocations]

@RaduBerinde do you have any idea why this memory would be getting flushed every 30 minutes even without auto stats collection?

@nvanbenschoten (Member)

#41230 was also contributing to memory usage in the cluster, but that memory was stable so it's not responsible for this issue.

craig bot pushed a commit that referenced this issue Oct 1, 2019
41220: sql: move histogram decoding into the stats cache r=RaduBerinde a=RaduBerinde

(PR description identical to RaduBerinde's commit message quoted above.)

41231: util/log: ensure that secondary loggers do not leak memory r=petermattis,nvanbenschoten a=knz

Fixes #41230.

Note: the primary cause of this issue is removed by #40993, but that PR is blocked until 19.2 is out. I'm cherry-picking the subset of those changes sufficient to solve issue #41230 here.

Prior to this patch, logging via a secondary logger would allocate a
buffer, then add it to the buffer free list of the secondary logger.

This was causing a memory leak: only the main logger's free list is ever used to allocate buffers (even for secondary loggers), so buffers returned to a secondary logger's free list were never reused. They simply accumulated, hidden from Go's GC because they remained referenced.

Release justification: bug fix

Release note (bug fix): A memory leak was fixed that affected
secondary logging (SQL audit logs, statement execution, and RocksDB
logging).

Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
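The free-list bug described in the #41231 message above follows a pattern that can be sketched with hypothetical logger types (this is not the real util/log code): buffers are always taken from the main logger's list but handed back to the secondary logger's own list, which nothing ever reads.

```go
// Hypothetical sketch of the leak pattern, not the actual util/log code.
// Allocation always draws from mainLogger's free list, but secondary loggers
// return buffers to their own list, which is never drained, so those buffers
// stay reachable (the GC keeps them) and grow without bound.
package logdemo

import "sync"

type buffer struct {
	data []byte
	next *buffer
}

type logger struct {
	mu       sync.Mutex
	freeList *buffer
}

var mainLogger = &logger{}

// getBuffer is only ever called on mainLogger in this pattern.
func (l *logger) getBuffer() *buffer {
	l.mu.Lock()
	b := l.freeList
	if b != nil {
		l.freeList = b.next
	}
	l.mu.Unlock()
	if b == nil {
		b = &buffer{data: make([]byte, 0, 256)}
	}
	return b
}

// putBuffer prepends the buffer to this logger's free list.
func (l *logger) putBuffer(b *buffer) {
	l.mu.Lock()
	b.next = l.freeList
	l.freeList = b
	l.mu.Unlock()
}

// logTo shows the buggy flow: allocate from the main free list, but return
// to the secondary logger's own list, which nothing ever reads from.
func logTo(secondary *logger, msg string) {
	b := mainLogger.getBuffer()
	b.data = append(b.data[:0], msg...)
	// ... write b.data to the secondary log's output ...
	secondary.putBuffer(b) // bug: should be mainLogger.putBuffer(b)
}
```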
@RaduBerinde (Member)

The optCatalog (one per connection) has a cache of dataSources. When this reaches 100 entries, it gets cleared. Very simplistic, but these were meant to be small objects.

With auto stats off, this should only happen if something else is causing the table descriptor versions to keep changing.

We should see if #41220 fixes it.
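A rough sketch of the cache policy Radu describes, with hypothetical names rather than the actual optCatalog code: once the per-connection map holds 100 entries it is wiped, and every table touched afterwards is rebuilt from scratch, statistics included, which is what makes churn expensive with thousands of connections.

```go
// Hypothetical sketch of the per-connection cache policy, not the real code.
package catdemo

const maxDataSources = 100

type dataSource struct {
	// table metadata, statistics, histograms, ...
}

type optCatalog struct {
	dataSources map[string]*dataSource // one catalog (and map) per connection
}

// resolve returns a cached dataSource or rebuilds it, clearing the whole
// cache ("very simplistic") once it reaches the size limit.
func (oc *optCatalog) resolve(name string, build func() *dataSource) *dataSource {
	if ds, ok := oc.dataSources[name]; ok {
		return ds
	}
	if oc.dataSources == nil || len(oc.dataSources) >= maxDataSources {
		oc.dataSources = make(map[string]*dataSource)
	}
	ds := build() // expensive: re-reads descriptors and re-materializes stats
	oc.dataSources[name] = ds
	return ds
}
```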

@knz (Contributor)

knz commented Oct 2, 2019

Comment from @nvanbenschoten

the v19.2-beta cluster looks much better running cbbf40a, which contains #41220 and #41231

[screenshot: v19.2-beta cluster metrics after cbbf40a, captured Oct 1, 2019, 11:52 PM]

we should continue to monitor the cluster, but as long as things are still looking ok tomorrow morning, we should consider #41206 sufficiently resolved to where it’s no longer a blocker

knz mentioned this issue on Oct 2, 2019
@nvanbenschoten (Member)

It looks like this is fixed. Thanks for the help @knz and @RaduBerinde.

[screenshots: cluster metrics captured Oct 2, 2019, 11:19 AM]
