workload: allow columnar data generation (alloc -73%, MB/s +60%) #35349
Conversation
Force-pushed from 83327fc to cf433bd
Force-pushed from cf433bd to e6b04e9
Okay @jordanlewis, this is as far as I got on Friday. There's a bit of a performance regression, so I haven't bothered updating the commit message or PR description yet. This is a bummer, and there are definitely a bunch of places I can optimize, but I'm torn between how huge and scary this already is and leaving a window between landing this and the follow-up perf work before we really get all the benefits. Note that even though the CSV writing (which we still use for a little while longer) is slower, fixtures import tpcc already uses drastically fewer allocs and is overall a bit faster.
Any initial guesses why it might be slower? I'll take a look soon.
The fixtures import is almost certainly slower because CSV generation is slower. After 19.1.0 goes out the door, David is going to merge #36250, which switches fixtures import to bypass the CSV by default, and that path is definitely faster with this PR. So even if we can't get the CSV performance back, this whole problem goes away anyway. As for why it's slower, I've tried digging into before/after profiles. My best guess so far is that it's because most workload.Generators have a batch size of 1, which is the worst case for seeing columnar overhead.
This is awesome work! I have a bunch of comments, but in general I'm definitely eager to see this happen.
@@ -56,7 +66,10 @@ func NewMemBatch(types []types.T) Batch {
// NewMemBatchWithSize allocates a new in-memory Batch with the given column
// size. Use for operators that have a precisely-sized output batch.
func NewMemBatchWithSize(types []types.T, size int) Batch {
	b := &memBatch{}
	if max := math.MaxUint16; size > max {
		panic(fmt.Sprintf(`batches cannot have length larger than %d; requested %d`, max, size))
👍
pkg/sql/exec/coldata/vec.go
Outdated
@@ -29,6 +29,9 @@ type column interface{}
type Vec interface {
	Nulls

	// Type returns the type of datums stored in this Vec.
nit: we haven't used datums yet in this code - could you just say data, to avoid eventual terminology clash with tree.Datum?
// Allocate all the []interface{} row slices in one go.
datums := make([]interface{}, numRows*numCols)
for colIdx, col := range cb.ColVecs() {
	switch col.Type() {
Like, here, you should really have the type information passed in, rather than having to check the ColVec, in my opinion. ColBatchToRows should also take a []types.T.
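For context on the helper being discussed, here is a hedged sketch of what a ColBatchToRows-style conversion can look like. It assumes the coldata.Vec accessors Int64()/Bytes() (with Bytes() backed by [][]byte at this point in time) and a uint16 batch length; the handled types and the single-allocation row-slice trick mirror the diff above, but the exact cases are illustrative rather than copied from the PR.

```go
// Sketch only (assumed imports: "fmt", coldata, and the exec types package).
// Materializes a coldata.Batch into row-oriented [][]interface{}; only two
// column types are shown.
func colBatchToRows(cb coldata.Batch) [][]interface{} {
	numRows, numCols := int(cb.Length()), len(cb.ColVecs())
	rows := make([][]interface{}, numRows)
	// Allocate all the []interface{} row slices in one go, then carve them up.
	datums := make([]interface{}, numRows*numCols)
	for i := range rows {
		rows[i] = datums[i*numCols : (i+1)*numCols]
	}
	for colIdx, col := range cb.ColVecs() {
		switch col.Type() {
		case types.Int64:
			for rowIdx, d := range col.Int64()[:numRows] {
				rows[rowIdx][colIdx] = d
			}
		case types.Bytes:
			for rowIdx, d := range col.Bytes()[:numRows] {
				rows[rowIdx][colIdx] = d
			}
		default:
			panic(fmt.Sprintf(`unhandled type %s`, col.Type()))
		}
	}
	return rows
}
```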
}
table := workload.Table{
	InitialRows: workload.BatchedTuples{
		Batch: func(rowIdx int) [][]interface{} { return rows[rowIdx] },
		FillBatch: func(batchIdx int, cb coldata.Batch, _ *bufalloc.ByteAllocator) {
			*cb.(*coldata.MemBatch) = *batches[batchIdx].(*coldata.MemBatch)
Is this the reason you had to export MemBatch? This seems kind of unpleasant - could you maybe change the FillBatch interface to return a coldata.Batch instead of modifying one in place? Or does that have other problems?
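To make the in-place contract concrete, here is a hedged sketch of what a FillBatch implementation could look like for a made-up two-column table. The column layout, batch size, and payload contents are invented, and it assumes coldata.Batch's ColVec/SetLength accessors and a [][]byte-backed Bytes() vector; the real generators in this PR also reuse the ByteAllocator, which is elided here.

```go
// Hypothetical FillBatch for an invented (int64 key, bytes payload) table.
// The coldata.Batch passed in is mutated rather than returned, so its Vecs
// (and, in the real code, the ByteAllocator) can be reused across calls.
func fillBatch(batchIdx int, cb coldata.Batch, _ *bufalloc.ByteAllocator) {
	const rowsPerBatch = 1000
	keyCol := cb.ColVec(0).Int64()
	payloadCol := cb.ColVec(1).Bytes()
	for i := 0; i < rowsPerBatch; i++ {
		rowIdx := batchIdx*rowsPerBatch + i
		keyCol[i] = int64(rowIdx)
		payloadCol[i] = []byte(fmt.Sprintf("payload-%d", rowIdx))
	}
	cb.SetLength(rowsPerBatch)
}
```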
pkg/workload/workload.go
Outdated
// FillBatch is a function to deterministically compute a columnar-batch of
// tuples given its index.
//
// To save allocations, the `Vec`s in the passed `Batch` are reused when
nit: is it just me or do we not normally enclose types in backticks in comments? obviously take it or leave it
pkg/workload/workload.go
Outdated
for i, datum := range row {
	if datum == nil {
		// WIP what do we do here
		colTypes[i] = types.Bytes
Shouldn't we know the types up front somehow? I think the Tuples interface should return its column types as well.
That would also solve the allocations issue right?
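For readers following along, here is a hedged sketch of the kind of type sniffing being discussed. The handled Go types and the fallback are illustrative, not the PR's actual code; the nil case is the one flagged as WIP in the diff above.

```go
// Sketch only: infer coldata column types from one row of []interface{}
// tuples. A nil datum carries no type information, which is the gap the
// review comments above are pointing at.
func sniffColTypes(row []interface{}) []types.T {
	colTypes := make([]types.T, len(row))
	for i, datum := range row {
		switch datum.(type) {
		case int, int64:
			colTypes[i] = types.Int64
		case float64:
			colTypes[i] = types.Float64
		case string, []byte:
			colTypes[i] = types.Bytes
		case nil:
			// Can't tell from the data alone; the PR later lets the table
			// declare its types for exactly this case.
			colTypes[i] = types.Bytes
		default:
			panic(fmt.Sprintf(`unhandled datum type %T`, datum))
		}
	}
	return colTypes
}
```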
}

// ColBatchToRows materializes the columnar data in a coldata.Batch into rows.
func ColBatchToRows(cb coldata.Batch) [][]interface{} {
Could you use the materializer operator instead of this function? It does the same thing.
Force-pushed from 654d73e to 3a1a58b
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @danhhz and @jordanlewis)
pkg/sql/exec/coldata/vec.go, line 32 at r2 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
nit: we haven't used datums yet in this code - could you just say data, to avoid eventual terminology clash with tree.Datum?
Done
pkg/workload/csv_test.go, line 93 at r2 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Is this the reason you had to export MemBatch? This seems kind of unpleasant - could you maybe change the FillBatch interface to return a coldata.Batch instead of modifying one in place? Or does that have other problems?
It has to take it in to get the memory reuse. Also returning it seems awkward to me, especially since (I think) we're both sort of convinced at this point that MemBatch is the new Batch
pkg/workload/workload.go, line 173 at r2 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
nit: is it just me or do we not normally enclose types in backticks in comments? obviously take it or leave it
I certainly do it, though admittedly inconsistently, so there are definitely areas of the codebase that have them. There aren't any other examples in this file, so happy to revert
pkg/workload/workload.go, line 197 at r2 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
That would also solve the allocations issue right?
FWIW, this Tuples helper is specifically for making it easier to write a Generator when you don't need to eke out every last bit of performance, so the allocation really isn't a big deal. Added a comment since that info wasn't in the code.
As for the types, I agree that we want to plumb some of this information. It'll be super helpful, for example in workloadReader. But I like how easy it is currently to write a workload.Generator if you don't need the super performant version. Basically, there are some design decisions here that I'd like to consider carefully before committing to something new. How do you feel about separating this question out and focusing a followup on how workload type info gets plumbed?
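As a concrete illustration of the "easy path" being defended here, a sketch of how a simple Generator might define its initial rows with the row-oriented convenience helper. The table name, schema, row contents, and the numRows variable are invented, and it assumes a Tuples(count, fn) signature; the columnar conversion and type sniffing would happen inside the helper, so the Generator author never touches coldata directly.

```go
// Hypothetical table definition using the row-oriented convenience helper.
table := workload.Table{
	Name:   `example`,
	Schema: `(id INT PRIMARY KEY, payload BYTES)`,
	InitialRows: workload.Tuples(numRows, func(rowIdx int) []interface{} {
		return []interface{}{rowIdx, []byte(fmt.Sprintf("payload-%d", rowIdx))}
	}),
}
```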
pkg/workload/workload.go, line 245 at r2 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Could you use the materializer operator instead of this function? It does the same thing.
Definitely don't want that dep, plus this way we get to control the behavior for what workload needs without worrying about breaking distsql if we change something. Also this is pretty simple, so I'm comfortable with the duplication.
pkg/workload/workload.go, line 250 at r2 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Like, here, you should really have the type information passed in, rather than having to check the ColVec, in my opinion. ColBatchToRows should also take a []types.T.
This actually seems cleaner to me as is than if it got some types passed in that may not even match. I don't understand the objection, to be honest, can you expand on it?
Force-pushed from 3a1a58b to aacfb45
Okay, we're past the 19.1.0 push, so I'd like to get this merged. Pulled and resolved conflicts. Also resolved the one outstanding CR issue (assuming you're happy with it). Finally, added a test that runs fixtures import on all registered workloads, which caught a couple missing switch cases. PTAL
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @danhhz and @jordanlewis)
pkg/workload/workload.go, line 197 at r2 (raw file):
Previously, danhhz (Daniel Harrison) wrote…
FWIW, this Tuples helper is specifically for making it easier to write a Generator when you don't need to eke out every last bit of performance, so the allocation really isn't a big deal. Added a comment since that info wasn't in the code. As for the types, I agree that we want to plumb some of this information. It'll be super helpful, for example in workloadReader. But I like how easy it is currently to write a workload.Generator if you don't need the super performant version. Basically, there are some design decisions here that I'd like to consider carefully before committing to something new. How do you feel about separating this question out and focusing a followup on how workload type info gets plumbed?
I think I came up with a good solution here to both the nil and allocation problems. You can now declare the types if you're going to return a nil, but you don't have to (only two tables needed this, both in ledger). It was also pretty easy to use a sync.Once to do the col type sniffing, so now we don't pay the allocations.
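A hedged sketch of the sync.Once approach described here; the struct, field, and method names are invented, and it reuses the hypothetical sniffColTypes sketch from earlier in the thread rather than the PR's actual code.

```go
// Hypothetical: lazily sniff column types from the first generated row,
// exactly once, so repeated batch generation doesn't re-pay the sniffing
// allocations.
type tuples struct {
	numRows  int
	genRow   func(rowIdx int) []interface{}
	once     sync.Once
	colTypes []types.T
}

func (t *tuples) types() []types.T {
	t.once.Do(func() {
		// sniffColTypes as sketched above; declared types would override any
		// column whose sampled value is nil.
		t.colTypes = sniffColTypes(t.genRow(0))
	})
	return t.colTypes
}
```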
Force-pushed from 4dafabf to ed4d390
A workload.Generator previously could specify its initial table data using one of two methods. One outputs a single row as `[]interface{}` and the other is a batch of `[][]interface{}`. This commit switches the latter to be a columnar batch, allowing selected performance-critical workload generators (tpcc, bank) to generate initial table data with dramatically reduced allocations.

The single row option still exists and will continue to be used by most workload.Generators, as it's much simpler. An adaptor from the columnar batch to the old row-oriented `[][]interface{}` batch is included for ease of use in non-performance-sensitive consumers of initial table data. IMPORT, on the other hand, is switched to using the columnar batches directly.

Some of the savings here is reusing batches, but the majority comes from `[]interface{}` requiring that everything assigned to it be on the heap, while something like `[]int` doesn't. We also get some amount of speedup in initial table data consumers from fewer type switches.

This could also be accomplished without being columnar, but doing the minimum non-columnar thing to get the same results would require reinventing most of the work that the columnar format does. Plus it will almost certainly be convenient for columnar exec benchmarking to have a common format.

Benchmark results:

name                              old time/op    new time/op    delta
InitialData/tpcc/warehouses=1-8    552ms ± 1%     454ms ± 0%   -17.81%  (p=0.008 n=5+5)
InitialData/bank/rows=1000-8       461µs ± 0%     332µs ± 1%   -28.10%  (p=0.008 n=5+5)
WriteCSVRows-8                    15.0µs ± 1%    16.8µs ± 1%   +12.44%  (p=0.008 n=5+5)
CSVRowsReader-8                   17.8µs ± 1%    19.3µs ± 1%    +8.41%  (p=0.008 n=5+5)

name                              old speed      new speed      delta
InitialData/tpcc/warehouses=1-8  128MB/s ± 1%   209MB/s ± 0%   +63.22%  (p=0.008 n=5+5)
InitialData/bank/rows=1000-8     223MB/s ± 0%   350MB/s ± 1%   +56.83%  (p=0.008 n=5+5)
WriteCSVRows-8                   118MB/s ± 2%   107MB/s ± 3%    -9.43%  (p=0.008 n=5+5)
CSVRowsReader-8                  101MB/s ± 4%    92MB/s ± 5%    -8.31%  (p=0.016 n=5+5)

name                              old alloc/op   new alloc/op   delta
InitialData/tpcc/warehouses=1-8    246MB ± 0%      75MB ± 0%   -69.61%  (p=0.008 n=5+5)
InitialData/bank/rows=1000-8       232kB ± 0%      19kB ± 0%   -91.75%  (p=0.008 n=5+5)
WriteCSVRows-8                    5.60kB ± 0%    5.70kB ± 0%    +1.71%  (p=0.016 n=4+5)
CSVRowsReader-8                   7.35kB ± 1%    7.39kB ± 0%      ~     (p=0.667 n=5+4)

name                              old allocs/op  new allocs/op  delta
InitialData/tpcc/warehouses=1-8    5.60M ± 0%     1.48M ± 0%   -73.56%  (p=0.008 n=5+5)
InitialData/bank/rows=1000-8       6.00k ± 0%     1.02k ± 0%   -83.01%  (p=0.008 n=5+5)
WriteCSVRows-8                      48.0 ± 0%      50.0 ± 0%    +4.17%  (p=0.008 n=5+5)
CSVRowsReader-8                     54.0 ± 0%      55.0 ± 0%    +1.85%  (p=0.008 n=5+5)

We could probably speed the CSV stuff back up with some followup work, but (a) the overall `fixtures import` benchmark is still faster and (b) we're about to switch to dt's import magic that skips the CSV roundtrip.

Current `fixtures import`:

name                                old time/op    new time/op    delta
ImportFixture/tpcc/warehouses=1-8     3.35s ± 2%     3.29s ± 3%      ~     (p=0.310 n=5+5)

name                                old speed      new speed      delta
ImportFixture/tpcc/warehouses=1-8  26.4MB/s ± 2%  26.9MB/s ± 3%     ~     (p=0.310 n=5+5)

name                                old alloc/op   new alloc/op   delta
ImportFixture/tpcc/warehouses=1-8    3.09GB ± 0%    2.92GB ± 0%   -5.60%  (p=0.008 n=5+5)

name                                old allocs/op  new allocs/op  delta
ImportFixture/tpcc/warehouses=1-8     32.1M ± 0%     28.0M ± 0%  -12.86%  (p=0.008 n=5+5)

Magic `fixtures import` that skips CSV roundtrip:

name                                old time/op    new time/op    delta
ImportFixture/tpcc/warehouses=1-8     2.88s ± 3%     2.86s ± 4%      ~     (p=0.548 n=5+5)

name                                old speed      new speed      delta
ImportFixture/tpcc/warehouses=1-8  30.7MB/s ± 3%  30.9MB/s ± 4%     ~     (p=0.548 n=5+5)

name                                old alloc/op   new alloc/op   delta
ImportFixture/tpcc/warehouses=1-8    2.87GB ± 0%    2.73GB ± 0%   -4.97%  (p=0.008 n=5+5)

name                                old allocs/op  new allocs/op  delta
ImportFixture/tpcc/warehouses=1-8     26.1M ± 0%     21.5M ± 0%  -17.75%  (p=0.008 n=5+5)

Touches cockroachdb#34809

Release note: None
Super pumped about this! You don't get a 90% allocation reduction every day. Thanks for pushing to get it in.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis)
Thanks for all the reviews! bors r=jordanlewis
Build failed
TestStoreRangeMergeRaftSnapshot was just modified by #37350, so I'm inclined to guess flake. Dunno what the story is with that TestEncodeBitArray panic, but I doubt it's this PR. bors r=jordanlewis
35349: workload: allow columnar data generation (alloc -73%, MB/s +60%) r=jordanlewis a=danhhz

Co-authored-by: Daniel Harrison <daniel.harrison@gmail.com>
Build succeeded
37401: workload: fix --splits regression introduced in #35349 r=tbg a=danhhz

workload's Table schemas are SQL schemas, but #35349 switched the initial data to be returned as a coldata.Batch, which has a more limited set of types. (Or, in the case of simple workloads that return a []interface{}, it's roundtripped through coldata.Batch by the `Tuples` helper.) Notably, this means a SQL STRING column is represented the same as a BYTES column (ditto UUID, etc).

This caused a regression in splits, which received some []byte data for a column and tried to hand it to SPLIT as a SQL BYTES datum. This didn't work for the UUID column in tpcc's history table nor the VARCHAR in ycsb's usertable. Happily, a STRING works for both of these. It also seems to work for BYTES columns, so it seems like the ambiguity is fine in this case. When/if someone wants to add a workload that splits a BYTES primary key column containing non-utf8 data, we may need to revisit.

A more principled fix would be to get the fidelity back by parsing the SQL schema, which in fact we do in `importccl.makeDatumFromColOffset`. However, at the moment, this hack works and avoids the complexity and the undesirable pkg/sql/parser dep.

Closes #37383
Closes #37382
Closes #37381
Closes #37380
Closes #37379
Closes #37378
Closes #37377
Closes #37393

Release note: None

Co-authored-by: Daniel Harrison <daniel.harrison@gmail.com>
54713: workload: add back injection of stats for TPC-C tables at start of workload r=rytaft a=rytaft

This commit adds back the injection of statistics for some tables at the start of the TPC-C workload that had been removed by mistake in #35349. The removal caused a regression in TPC-C, since it caused some plans to change due to the lack of statistics at the beginning of the benchmark. Adding the stats back should fix the regression.

Fixes #54702

Release note: None

Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>