Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coltypes: refactor colexec types #43559

Closed
yuzefovich opened this issue Dec 26, 2019 · 4 comments · Fixed by #48042
Closed

coltypes: refactor colexec types #43559

yuzefovich opened this issue Dec 26, 2019 · 4 comments · Fixed by #48042
Assignees
Labels
C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.

Comments

@yuzefovich
Copy link
Member

Our current type system used in the vectorized engine has a big disadvantage:

  • it is "lossful" - doing the conversion of types.T -> coltypes.T -> types.T will not always return the same type that it is started with.

This will very likely (if not already) cause issues and additional burden when we need to specialize the behavior of an operation based on the "actual" types.T type of a column, but we might not have that information anymore.

We should fix it. One idea that Jordan mentioned (which I agree with) is that coltypes should be an extension of normal types that will also hold a "physical type" - the way we represent it in Go.

@yuzefovich yuzefovich added the C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. label Dec 26, 2019
@yuzefovich yuzefovich self-assigned this Jan 2, 2020
@yuzefovich
Copy link
Member Author

I prototyped two approaches of resolving this issue.

At first I started implementing different "vectorized execution" type system in which a type would be coltypes.T ("physical" part) but also would store the reference to the original "logical" type types.T. The idea was that everything in colexec package would be using the new type system. I reached a few hiccups along the way:

  • not all types are exported from sql/types package, so the conversion from types.T to colexectypes.T seemed pretty painful to write correctly
  • vectorized execution planning operates on the specs that contain []types.T
  • we still would need to do a lot of plumbing to put the new types into the operators because coltypes should not care about those (and I appreciate Dan's move of that package outside of colexec, otherwise it would have been tempting to put the "logical" sql type into the "physical" type)
  • we already have two type systems that fully satisfy our needs, and the root of the problem is actually missing necessary plumbing.

The last point made me realize that we do not, in fact, want a new type system - we simply need to put the logical type information during the execution planning, and that's what I did in #43723.

@yuzefovich
Copy link
Member Author

I spent some more time thinking about this, and now I'm leaning towards us using SQL logical types pretty much everywhere in the vectorized engine (i.e. in the operators, during planning, during serialization) and performing types.T -> coltypes.T conversion "in the last moment possible" (whatever that might be). We will still be generating templated code based on coltypes.T though.

Such change will allow us to implement coltypes.Datum for all unoptimized logical types and will solve the problem of lossful conversion since original logical types will be always available. One of the main arguments in favor of this decision is that we need to come up with a way to serialize Datums of any logical type, so we will need to have access to the logical type "deep down, in the heart of the vectorized engine" anyway.

@yuzefovich
Copy link
Member Author

After discussing with Archer and Alfonso, we agreed that plumbing logical types down everywhere is definitely required to get coltypes.Datum for all unoptimized types working. I'll take a stab at it by replacing the usage of coltypes.T by using the logical types wherever possible (the exception being the places where we're touching the physical representation - probably in Allocator object and in overload resolution). I'll also try changing the way we template binary and comparison expressions (currently we copy-paste the code from eval.go - which makes sense to do for the optimized types, but not for all others).

@yuzefovich yuzefovich self-assigned this Apr 15, 2020
craig bot pushed a commit that referenced this issue Apr 21, 2020
45865: backupccl: remove unused username param in WriteTableDescs r=dt a=pbardea

Release note: None

46857: ui: Sortable loading state r=dhartunian a=elkmaster

Added global loading state to sort tables
Added loading state to database tables

![loading-database](https://user-images.githubusercontent.com/12850886/78561401-4e875500-7820-11ea-8462-53bd7310adf0.png)

Resolves: #46568

Release justification: low risk, high benefit changes to existing functionality

Release note (ui): database loading state design updates

47582:  backupccl: fix flake in TestProtectedTimestampDuringBackup  r=pbardea a=pbardea

TestProtectedTimestampDuringBackup would sometimes flake as the GC queue
would assign it a low priority that would be below the threshold to turn
shouldQueue to true. However, the priority is non-zero indicating that
the timestamp was indeed not protected. This change aims to remove the
flake by checking for a non-zero priority rather than if shouldQueue is
true.

This PR also makes the same change to the ImportInto variant of the test since
they share the same test structure.

Fixes #47522.

Release note: None

47583: col*: use logical types throughout the vectorized engine r=yuzefovich a=yuzefovich

**col...: create new package and move some code**

This commit introduces `colbase` package which currently contains the
following (that are extracted from `colexec` package):
- `allocator.go`
- `random_testutils.go`
- `Operator` interface
- `CopyBatch` method.

The reason for extracting these things is that they are used by multiple
col* packages, so I think it's a good hygiene to separate them out.

This commit also renames `execerror` package to `vecerror` and then
moves it inside of `colbase` directory. It also changes the panic
matching so that now we catch all panics coming from a package that has
`pkg/sql/col` in its path (this will make sure that we don't forget to
register newly added packages that use panic-catching mechanism of the
vectorized engine with the panic-catcher). Additionally, this commit
renames the methods of `execerror` package.

Next, it moves `colexec/typeconv` package into `colbase` as well as
`colexec/testutils.go` file.

Finally, it removes `vecerror.NonVectorizedPanic` in favor of
`vecerror.ExpectedError`. The guidance on whether `InternalError` or
`ExpectedError` method should be used has been updated: the distinction
is whether the vectorized engine ends up in an unexpected - invalid -
state (for example, we expect that the vectorized engine might be
performing division by zero when evaluating a binary expression).

Release note: None

**col...: use logical types throughout the vectorized engine**

This commit transitions the vectorized engine to use SQL logical types
wherever possible, only converting to its physical type equivalent when
necessary (for example, when choosing which instance of projection
operator to use). This will allow us to have access to the actual type
whever we need and will allow us implement `coltypes.Datum` for all
unoptimized types.

This commit also moves some a few things around to clean up the
dependency graph (namely, `BytesEncodeFormat` is moved from
`sql/sessiondata` to `sql/lex`). Additionally, it replaces a single call
to `util/log.Infof` with a print statement in `util/protoutil` package
to remove the dependency on `util/log` (which depends on a bunch of
other things) so that `pkg/workload` would need to import less things.

Another thing worth calling out is the creation of copies of types in
`execplan` file to make sure that we don't override the input sync
specs.

Addresses: #43559.

Release note: None

**col...: introduce new package and more code movement**

Move `coldata/random_testutils.go` into newly created package
`coldatatestutils`.

Move contents of `colbase/random_testutils.go` into `coldatatestutils`
package.

Rename `colbase` to `colexecbase`. Also remove templated comments from
import sections of `_tmpl` files in favor of adding vars that remove
unused warnings (those templated comments work poorly when
moving/renaming the dependencies).

Rename `vecerror` to `colexecerror`.

Move `CopyBatch` from `colexecbase` into `coldatatestutils`. Also remove
memory accounting from `CopyBatch`.

Move `typeconv` from `colexecbase` into `coltypes` folder.

Move `coldata/vec_test.go` into `coldata_test` package to prevent an
import cycle. Also move one unit test from `coldata/vec_test.go` into
`coldata/bytes_test.go`.

Move `colexecbase/allocator.go` into newly created `colmem` package.

Release note: None

47629: sql: use Clock.PhysicalTime in beginTransactionTimestampsAndReadMode r=nvanbenschoten a=nvanbenschoten

Synchronizing with the HLC clock doesn't look to be necessary. I'm confused
about this though. The comment on beginTransactionTimestampsAndReadMode says
that "txnSQLTimestamp propagates to become the TxnTimestamp". Is this trying to
say that the timestamp makes it way into the kv.Txn? Because that's not true.

Regardless, the one reason not to make this change is that PhysicalTime is not
guaranteed to be monotonic on some systems and can generally diverge from the
HLC's clock. If we're worried about that though, we should use the HLC here and
feed that directly into the kv.Txn. We shouldn't need to grab two timestamps
from the HLC per txn.

47756: cli,base: surface stored critical errors at the right moment r=tbg a=knz

Fixes #44041

If/when storage detects an important error, the text for this error is
stored in a file named `_CRITICAL_ERROR.txt` in the auxiliary
directory. The intention is to block subsequent server restarts until
the error is investigated and the file (manually) removed.

Prior to this patch, this check was done in the startup sequence 1)
before logging was fully initialized 2) using a `log.Fatal` to
announce the critical error.

The first aspect is problematic because it logs before logging flags
are applied. The second is problematic because it makes the failure
super-verbose and buries the lede.

This patch simplifies the code and makes the error reported at the
right place.

Example, before:
```
kena@kenax ....com/cockroachdb/cockroach % ./cockroach start-single-node
F200421 14:24:02.303675 1 cli/start.go:478  From .../auxiliary/_CRITICAL_ALERT.txt:

boom

goroutine 1 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0x6a7ca00, 0xed630f902, 0x0, 0x1000)
        ...//src/github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0xb8
github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0x6a79800, 0xc000000004, 0x33eb7b9, 0xc, 0x1de, 0xc000ba8180, 0x76)
        ...//src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:210 +0xa92
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x15e3420, 0xc000078168, 0x4, 0x2, 0x0, 0x0, 0xc00063f7e8, 0x1, 0x1)
        ...//src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:66 +0x2c9
...
[147/245]
****************************************************************************

This node experienced a fatal error (printed above), and as a result the
process is terminating.

Fatal errors can occur due to faulty hardware (disks, memory, clocks) or a
...
    support@cockroachlabs.com

The Cockroach Labs team appreciates your feedback.
```

Example, after:

```
kena@kenax ....com/cockroachdb/cockroach % ./cockroach start-single-node
*
* ERROR: startup forbidden by prior critical alert
* DETAIL: From /data/home/kena/src/go/src/github.com/cockroachdb/cockroach/cockroach-data/auxiliary/_CRITICAL_ALERT.txt:
* boom
*
```

Release note (cli change): The error message displayed upon `cockroach
start` / `cockroach start-single-node` when manual intervention is
needed in the store directory is now clearer.

Co-authored-by: Paul Bardea <pbardea@gmail.com>
Co-authored-by: Vlad Los <carrott9@gmail.com>
Co-authored-by: Paul Bardea <paul@pbardea.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
@yuzefovich
Copy link
Member Author

The current status of this issue is that we have plumbed logical types into the operators, and the next step is to remove coltypes.T entirely - this requires changing the way we generate code from templates, and I'm currently looking into that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant