Add sql.defaults.experimental_vectorized setting to experimental features list #4953

jseldess · 2019-06-20T10:01:34Z

sql.defaults.experimental_vectorize | off | e | default experimental_vectorize mode [off = 0, on = 1, always = 2]

A customer asked about the dangers of using this in production, and the answer is not clear. We should add this setting to https://www.cockroachlabs.com/docs/dev/experimental-features.html with some details.

I answered:

In the meantime, we just posted a blog post on vectorizing the merge joiner that might add some clarity: https://www.cockroachlabs.com/blog/vectorizing-the-merge-joiner-in-cockroachdb/

I think the experimental status pertains to the underlying vectorization engine more generally. You can track the work left to be done in this GitHub issue, if you like: cockroachdb/cockroach#36507

I see this at the end:

the bottleneck becomes not the joiner itself but the speed at which data can be read from the disk. This is still important as a more optimized merge joiner can free up CPU cycles to be used elsewhere, on another query perhaps.

jseldess · 2019-06-20T14:50:29Z

The experimental status is mostly due to the fact that vectorization is still under-tested on our end in general.

@asubiotto also called out that we don’t yet monitor memory usage or spill to disk for large joins, so there is a possibility of crashing a server when running them, and that we don’t yet distribute joins, which increases the chances of this problem.

asubiotto · 2019-06-20T18:36:50Z

Note that distribution of joins will make it in within the next couple of days (cockroachdb/cockroach#38233), so it shouldn't be called out as a limitation.

yuzefovich · 2019-08-03T03:43:33Z

A quick note: we have just merged a PR (cockroachdb/cockroach#38777) that renamed experimental_vectorize to just vectorize. It also added a new possible mode of vectorization, and now all modes are:

off - the vectorized engine is disabled; all queries go through the row execution engine.
auto (now the default) - queries consisting only of streaming operators (a streaming operator is one that doesn't require any buffering) are executed through the vectorized engine, whereas all others run through the row execution engine.
experimental_on - all queries that are supported by the vectorized engine (both streaming and non-streaming) run through it. It is "experimental" because we still don't have disk spilling, which means that large queries can hit an out-of-memory error and crash the node.
experimental_always - absolutely all queries are forced to run through the vectorized engine. If the engine doesn't support the query, it errors out. The only exception is SET queries, so that the vectorize setting can still be changed.
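
For illustration only, the four modes above can be read as a small decision function. This is a sketch of the behavior as described in this comment, not the actual CockroachDB code: the names `VectorizeMode`, `choose_engine`, `supported`, `streaming_only`, and `is_set_query` are hypothetical. It also assumes that `auto` only picks the vectorized engine when the query is supported by it, and that SET queries under `experimental_always` simply fall back to the row engine.

```python
from enum import Enum


class VectorizeMode(Enum):
    """The four modes listed above (hypothetical Python model)."""
    OFF = "off"
    AUTO = "auto"
    EXPERIMENTAL_ON = "experimental_on"
    EXPERIMENTAL_ALWAYS = "experimental_always"


def choose_engine(mode, *, supported, streaming_only, is_set_query=False):
    """Return "vectorized" or "row" for a query under the given mode.

    supported:      the vectorized engine can run every operator in the query
    streaming_only: the query uses only streaming (non-buffering) operators
    is_set_query:   the statement is a SET query
    Raises ValueError when experimental_always cannot run the query.
    """
    if mode is VectorizeMode.OFF:
        return "row"
    if mode is VectorizeMode.AUTO:
        # Assumption: auto also requires the query to be supported.
        return "vectorized" if supported and streaming_only else "row"
    if mode is VectorizeMode.EXPERIMENTAL_ON:
        return "vectorized" if supported else "row"
    # EXPERIMENTAL_ALWAYS: no fallback, except for SET queries.
    if is_set_query:
        # Assumption: SET is exempt so the setting can still be changed.
        return "row"
    if not supported:
        raise ValueError("query is not supported by the vectorized engine")
    return "vectorized"
```

For example, a supported query with a buffering operator would run on the row engine under `auto` but on the vectorized engine under `experimental_on`, which is the case where the missing disk spilling matters.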

rmloveland · 2019-08-05T17:35:36Z

Thanks! Just created #5141 to address this.

jseldess added the A-sql, P-1 (High priority; must be done this release), O-support (Internal source: Support), and C-doc-improvement labels Jun 20, 2019

jseldess added this to the 19.2 milestone Jun 20, 2019

jseldess assigned rmloveland Jun 20, 2019

rmloveland added a commit that referenced this issue Jul 8, 2019

Add vectorization to experimental features list

a9867da

... along with some context about what the feature is, how it works (by linking to George's blog post), and a list of current limitations. Fixes #4953.

rmloveland mentioned this issue Jul 8, 2019

Add vectorization to experimental features list #5026

Merged

rmloveland added a commit that referenced this issue Jul 9, 2019

Add vectorization to experimental features list

4444f02

rmloveland added a commit that referenced this issue Jul 18, 2019

Add vectorization to experimental features list

5f86a5e

rmloveland closed this as completed in #5026 Jul 18, 2019

rmloveland mentioned this issue Aug 5, 2019

Update vectorized setting with new name and defaults #5141

Closed
