Filter pushdown over layouts #1124

Merged
merged 51 commits into develop from rk/filterpushdown2
Nov 1, 2024

@robert3005 (Member) commented Oct 23, 2024

Draw the rest of the f***ing owl

@robert3005 robert3005 marked this pull request as draft October 23, 2024 17:22

@robert3005 (Member, Author) commented Oct 23, 2024

The benchmarks should assert row counts; something is still wrong with q21.
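
A minimal sketch of the kind of row-count check meant here, assuming each benchmark run can report how many rows it returned per query and per format. The function name `row_count_mismatches` and the shape of the `counts` mapping are hypothetical and not taken from the benchmark code in this repository.

```python
from typing import Dict, List


def row_count_mismatches(counts: Dict[str, Dict[str, int]]) -> List[str]:
    """Report queries whose row count differs between formats.

    `counts` maps a query name (e.g. "tpch_q21") to a mapping of
    format name (e.g. "arrow", "vortex-file-compressed") to row count.
    """
    problems = []
    for query, per_format in sorted(counts.items()):
        # All formats run the same query, so they should agree on the count.
        if len(set(per_format.values())) > 1:
            problems.append(f"{query}: {dict(sorted(per_format.items()))}")
    return problems


if __name__ == "__main__":
    # Illustrative numbers only, not measured results.
    example = {
        "tpch_q1": {"arrow": 4, "vortex-file-compressed": 4},
        "tpch_q21": {"arrow": 100, "vortex-file-compressed": 37},
    }
    print(row_count_mismatches(example))
```

A benchmark harness could fail the run when the returned list is non-empty, so that a query that became faster only because it returns fewer rows does not go unnoticed.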

@robert3005 robert3005 added the benchmark Run benchmarks on this branch label Oct 23, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Oct 23, 2024

@github-actions bot (Contributor) left a comment

Vortex bytes_at

Benchmark suite Current: 401f855 Previous: 3277e18 Ratio
bytes_at/array_data 713.688958258889 ns (0.9385133640842582) 697.1932301138808 ns (1.900478592199022) 1.02
bytes_at/array_view 489.97083101999846 ns (0.7687406656665132) 489.87167684366335 ns (1.2714639080581946) 1.00

This comment was automatically generated by workflow using github-action-benchmark.
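
For reading the tables in these benchmark comments: the Ratio column appears to be the Current time divided by the Previous time, rounded to two decimals, so values below 1 mean the current commit is faster. This is inferred from the numbers shown, not from the workflow's documentation, and the helper name `ratio` is hypothetical.

```python
def ratio(current_ns: float, previous_ns: float) -> float:
    """Current / Previous, rounded to two decimals as in the tables."""
    return round(current_ns / previous_ns, 2)


if __name__ == "__main__":
    # bytes_at/array_data row from the table above.
    print(ratio(713.688958258889, 697.1932301138808))
```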

@github-actions bot (Contributor) left a comment

Random Access

Benchmark suite Current: 401f855 Previous: 3277e18 Ratio
random-access/vortex-tokio-local-disk 993165.641015356 ns (6223.78871750721) 991917.8809199578 ns (4305.285285678867) 1.00
random-access/vortex-local-fs 1151011.0913910884 ns (11051.714122282574) 1120387.1149837445 ns (6089.661919815) 1.03
random-access/parquet-tokio-local-disk 249385817.3333333 ns (6267040.516666651) 224078783.6 ns (2764673.7695833296) 1.11

@github-actions bot (Contributor) left a comment

DataFusion

Benchmark suite Current: 401f855 Previous: 3277e18 Ratio
arrow/planning 780537.3253414959 ns (2645.210110048647) 784873.78964399 ns (1585.4675600654446) 0.99
arrow/exec 1375758.9799735523 ns (3829.2770714033395) 1409846.0399014957 ns (5031.298920472269) 0.98
vortex-pushdown-compressed/planning 484435.5495539746 ns (1530.2420453905943) 489286.13268452033 ns (2209.6471550615097) 0.99
vortex-pushdown-compressed/exec 2696228.1678947373 ns (22980.729328947607) 2742886.5277777775 ns (25607.84083333332) 0.98
vortex-pushdown-uncompressed/planning 483389.38816159114 ns (819.0679351813451) 484853.53765889734 ns (1431.9005410389218) 1.00
vortex-pushdown-uncompressed/exec 2981360.877058825 ns (7410.507301470963) 2651567.372105264 ns (6654.316559210885) 1.12
vortex-nopushdown-compressed/planning 800661.2589305517 ns (2278.38412449538) 805467.1448396859 ns (1725.1544119613827) 0.99
vortex-nopushdown-compressed/exec 3831810.6415384603 ns (47672.893471154384) 4455820.533636366 ns (72993.99320454616) 0.86
vortex-nopushdown-uncompressed/planning 790998.5368629418 ns (1437.500284052221) 799919.1150438058 ns (1204.8202391152736) 0.99
vortex-nopushdown-uncompressed/exec 5200854.335000001 ns (17436.35002500005) 5668173.8933333345 ns (51030.160305555444) 0.92

@github-actions bot (Contributor) left a comment

TPC-H

Benchmark suite Current: 401f855 Previous: 3277e18 Ratio
tpch_q1/vortex-in-memory-no-pushdown 605852762.5 ns (1899061.71875) 563558755.3 ns (1651815.8500000238) 1.08
tpch_q1/vortex-in-memory-pushdown 486790432.1 ns (4794948.125) 453930524.4 ns (1262486.481249988) 1.07
tpch_q1/arrow 586437829.7 ns (3016663.2437499166) 550138905.7 ns (2918642.557500005) 1.07
tpch_q1/parquet 723863253.4 ns (2457066.6887500286) 698800178.9 ns (1480571.75) 1.04
tpch_q1/vortex-file-compressed 568617106.3 ns (2400483.8837500215) 507373815.3 ns (2243968.1462500095) 1.12
tpch_q1/vortex-file-uncompressed 578160995.4 ns (6057697.79124999) 528670709 ns (1999679.731249988) 1.09
tpch_q2/vortex-in-memory-no-pushdown 127734226.87412696 ns (1051027.8983462378) 119509963.06630953 ns (685316.8454791754) 1.07
tpch_q2/vortex-in-memory-pushdown 126265788.5501984 ns (800863.8354166597) 116652218.12206349 ns (421108.0549900979) 1.08
tpch_q2/arrow 126191642.9936111 ns (873470.20737499) 115661607.15531747 ns (796704.8145634905) 1.09
tpch_q2/parquet 152477232.579246 ns (1238935.4538100064) 146542766.53265876 ns (682473.2094310522) 1.04
tpch_q2/vortex-file-compressed 183499323.3005159 ns (656780.1491086334) 159978045.6681746 ns (1440371.858968243) 1.15
tpch_q2/vortex-file-uncompressed 190374736.06666666 ns (1407527.399999991) 163111479.4040873 ns (1012347.1080505997) 1.17
tpch_q3/vortex-in-memory-no-pushdown 167661219.35460317 ns (1502708.2960436344) 157247565.9175 ns (659488.508031249) 1.07
tpch_q3/vortex-in-memory-pushdown 199238393.03333333 ns (1231516.4504166543) 174795280.84857142 ns (1006194.0236785561) 1.14
tpch_q3/arrow 166379896.1288492 ns (2310663.4613789767) 146078964.25928572 ns (926774.5843452513) 1.14
tpch_q3/parquet 361942085.75 ns (2685094.7493749857) 332278903.3 ns (1510815.3806249797) 1.09
tpch_q3/vortex-file-compressed 289119134.45 ns (4634744.439999998) 290571849.15 ns (857713.3162499666) 1.00
tpch_q3/vortex-file-uncompressed 269517761.25 ns (2423098.850000009) 244143049.50000006 ns (1320185.4720833153) 1.10
tpch_q4/vortex-in-memory-no-pushdown 117558660.19079363 ns (1062220.8002758026) 109857307.23400795 ns (579971.9810238108) 1.07
tpch_q4/vortex-in-memory-pushdown 139899133.5403968 ns (1356014.397579357) 128390480.78027777 ns (514980.07854861766) 1.09
tpch_q4/arrow 197635528.49999997 ns (1484497.7258333266) 182944912.09285715 ns (757876.7019226402) 1.08
tpch_q4/parquet 211721636.86666667 ns (1088170.9133333266) 201853150.3333333 ns (636550.0133333057) 1.05
tpch_q4/vortex-file-compressed 111006353.14063492 ns (543793.1947648823) 247841284.06666666 ns (970786.7666666806) 0.45
tpch_q4/vortex-file-uncompressed 120250154.96373017 ns (382951.24831747264) 200997902.8 ns (1174551.2333333343) 0.60
tpch_q5/vortex-in-memory-no-pushdown 292152492.05 ns (1182077.5618750155) 275348690.5 ns (803071.2481250167) 1.06
tpch_q5/vortex-in-memory-pushdown 296400904.85 ns (2047109.3649999797) 280985085.9 ns (1080020.7999999821) 1.05
tpch_q5/arrow 285331334.35 ns (1650329.0193749964) 263973648 ns (1635479.9118750095) 1.08
tpch_q5/parquet 459230211.55 ns (4281774.324999988) 431934917.7 ns (2330111.834374994) 1.06
tpch_q5/vortex-file-compressed 397605915.2 ns (3207647.450000018) 352222524.6 ns (4054751.1212500036) 1.13
tpch_q5/vortex-file-uncompressed 365650174.3 ns (3290448.39062503) 337707647.55 ns (2979031.7856250107) 1.08
tpch_q6/vortex-in-memory-no-pushdown 37701197.716719575 ns (221569.06991270185) 34074677.47695768 ns (178582.30412070453) 1.11
tpch_q6/vortex-in-memory-pushdown 74087121.7705754 ns (209080.6839858666) 73018903.49426588 ns (158467.02186509222) 1.01
tpch_q6/arrow 29323351.990843255 ns (442770.9603177067) 24700216.2627381 ns (56884.5137686003) 1.19
tpch_q6/parquet 141748730.79900792 ns (381751.6989662647) 136488358.6250397 ns (283905.06493996084) 1.04
tpch_q6/vortex-file-compressed 88126795.28416666 ns (11073101.556677073) 70628678.00486112 ns (208984.17524306476) 1.25
tpch_q6/vortex-file-uncompressed 258918799.7 ns (2606450.6287499964) 173789392.92841274 ns (726818.4117499888) 1.49
tpch_q7/vortex-in-memory-no-pushdown 620201487 ns (5512207.327499926) 541797344.1 ns (3425996.2200000286) 1.14
tpch_q7/vortex-in-memory-pushdown 645101917.1 ns (7745041.3549999595) 566056554.7 ns (3729533.600000024) 1.14
tpch_q7/arrow 635627604.9 ns (8320880.001250029) 524963454.8 ns (3771153.349999994) 1.21
tpch_q7/parquet 726181421.6 ns (7057019.183749974) 659755113.5 ns (4761023.449999988) 1.10
tpch_q7/vortex-file-compressed 708174856.2 ns (8917979.449999988) 678668088.2 ns (5032510.411249995) 1.04
tpch_q7/vortex-file-uncompressed 688932452.3 ns (2900784.714999974) 661427204.6 ns (2420240.7349999547) 1.04
tpch_q8/vortex-in-memory-no-pushdown 225562379.3666667 ns (1655886.9120833278) 220785897.96666664 ns (1736727.228333339) 1.02
tpch_q8/vortex-in-memory-pushdown 230620130.0333333 ns (1225646.6558333486) 225961478.8 ns (1031678.3312499821) 1.02
tpch_q8/arrow 214149847.36666667 ns (1305418.9929167032) 205666029.76666668 ns (1391699.7804166824) 1.04
tpch_q8/parquet 494412986.2 ns (2591183.3237499893) 478011881.05 ns (1991853.9581249952) 1.03
tpch_q8/vortex-file-compressed 355594442.05 ns (4643752.770624995) 299454800.9 ns (1018539.0143749714) 1.19
tpch_q8/vortex-file-uncompressed 320774370.5 ns (2499963.856250018) 295143235.2 ns (3132311.3743750155) 1.09
tpch_q9/vortex-in-memory-no-pushdown 458302637.6 ns (4757567.115624994) 407017457.45 ns (2290956.2693749964) 1.13
tpch_q9/vortex-in-memory-pushdown 455608589.15 ns (6586062.418749958) 406037650.55 ns (3066240.25) 1.12
tpch_q9/arrow 447004441.55 ns (5066784.383125007) 393693669.8 ns (3379257.275000006) 1.14
tpch_q9/parquet 745694280.6 ns (5847777.648749948) 687402936.6 ns (4017765.151249945) 1.08
tpch_q9/vortex-file-compressed 596647389.1 ns (7153904.446249962) 483583983.8 ns (3098605.5) 1.23
tpch_q9/vortex-file-uncompressed 531405769.2 ns (5509327.366249979) 473479291.2 ns (4650026.752499998) 1.12
tpch_q10/vortex-in-memory-no-pushdown 290379891.6 ns (3728593.596249968) 274026014.8 ns (1279460.4012500346) 1.06
tpch_q10/vortex-in-memory-pushdown 331615594.35 ns (1203996.7112499774) 312143737.75 ns (1737417.0087499917) 1.06
tpch_q10/arrow 286584279.1 ns (3479580.479375005) 262193059.85 ns (1518121.450000003) 1.09
tpch_q10/parquet 525898584.8 ns (3725695.787499994) 493506128 ns (820000.6225000024) 1.07
tpch_q10/vortex-file-compressed 438596247.2 ns (4514864.524375021) 409373987.6 ns (2132267.4287500083) 1.07
tpch_q10/vortex-file-uncompressed 448814384.45 ns (2335905.8493750095) 402096319.95 ns (2196698.265625) 1.12
tpch_q11/vortex-in-memory-no-pushdown 196438468.13333336 ns (2740202.150000006) 175670027.59174603 ns (1053029.5849603266) 1.12
tpch_q11/vortex-in-memory-pushdown 189768174.5666667 ns (1347695.765833363) 173875845.45142856 ns (788168.1084136963) 1.09
tpch_q11/arrow 186977352.7 ns (888775.8320832998) 173926658.54186505 ns (1860507.3526671678) 1.08
tpch_q11/parquet 196670294.76666668 ns (1251080.7512500286) 185402973.26666665 ns (1329030.1833333075) 1.06
tpch_q11/vortex-file-compressed 290193708.45 ns (2379999.501249999) 268352541.9 ns (3032555.8737499863) 1.08
tpch_q11/vortex-file-uncompressed 283027749.2 ns (1712428.3868750036) 263858083.55 ns (2967639.75) 1.07
tpch_q12/vortex-in-memory-no-pushdown 232265297.8666667 ns (659989.34041664) 229341944.43333334 ns (385787.56166669726) 1.01
tpch_q12/vortex-in-memory-pushdown 279732881.3 ns (819948.4756249785) 277235129.9 ns (826349.625) 1.01
tpch_q12/arrow 188843625.73333335 ns (295433.06291668117) 185181343.99999997 ns (272066.334583357) 1.02
tpch_q12/parquet 336866216.5 ns (881104.4012500048) 336144511.85 ns (1078174.916874975) 1.00
tpch_q12/vortex-file-compressed 401475972.1 ns (4358498.25) 400401782.9 ns (1663641.175000012) 1.00
tpch_q12/vortex-file-uncompressed 420917168.55 ns (4096679.385625005) 419468173.95 ns (1775695.2618750334) 1.00
tpch_q13/vortex-in-memory-no-pushdown 191976339.96666664 ns (2556782.349999994) 167168696.9273016 ns (904039.5819811523) 1.15
tpch_q13/vortex-in-memory-pushdown 198680714.1 ns (3074084.572083339) 167433341.8951984 ns (1019682.1280555576) 1.19
tpch_q13/arrow 185700974.6 ns (2586119.8833333254) 162418628.86793652 ns (1311243.4978670627) 1.14
tpch_q13/parquet 349216773.6 ns (2071042.8093750179) 326372375.6 ns (3417840.130000025) 1.07
tpch_q13/vortex-file-compressed 231436575.23333335 ns (2237791.346666679) 208980351.96666664 ns (2884317.4479166716) 1.11
tpch_q13/vortex-file-uncompressed 222516623.9666667 ns (1109971.3487500101) 204663548.5 ns (2262915.2666666657) 1.09
tpch_q14/vortex-in-memory-no-pushdown 48767312.61595238 ns (607675.6118080355) 42970748.49285714 ns (330082.23798016086) 1.13
tpch_q14/vortex-in-memory-pushdown 82836154.10017857 ns (516548.59794642776) 72851328.9996627 ns (287995.65296031535) 1.14
tpch_q14/arrow 42251094.142420635 ns (627705.670473706) 33281071.366851855 ns (202982.198370371) 1.27
tpch_q14/parquet 231916724.7666667 ns (1639137.524999991) 224754496.13333336 ns (488439.15166668594) 1.03
tpch_q14/vortex-file-compressed 122408873.13484128 ns (502184.9398144707) 119073726.96265872 ns (429542.6285292655) 1.03
tpch_q14/vortex-file-uncompressed 137504194.77444443 ns (817821.8230555356) 140368770.5390873 ns (1311432.8870530576) 0.98
tpch_q15/vortex-in-memory-no-pushdown 76057683.96547619 ns (457045.33184523135) 71790750.68968253 ns (248424.23982143402) 1.06
tpch_q15/vortex-in-memory-pushdown 112654611.13027778 ns (486776.3351388946) 105067046.21920635 ns (315527.33880355954) 1.07
tpch_q15/arrow 66042454.11694445 ns (724068.6147222146) 57344249.058769844 ns (432021.1953501925) 1.15
tpch_q15/parquet 315169485.4 ns (1125494.7493749857) 302103084.1 ns (1394860.2649999857) 1.04
tpch_q15/vortex-file-compressed 244377109.2 ns (1149005.75) 223268756.1666667 ns (1006886.5154166669) 1.09
tpch_q15/vortex-file-uncompressed 277246684.1 ns (1687144.0868749619) 257350647.05 ns (2293255.225000009) 1.08
tpch_q16/vortex-in-memory-no-pushdown 114412716.02825396 ns (710931.8006101102) 106418099.92424604 ns (538416.7916443422) 1.08
tpch_q16/vortex-in-memory-pushdown 126829265.3920238 ns (1230920.588821426) 118590997.87972221 ns (604168.2922222242) 1.07
tpch_q16/arrow 113338958.0211508 ns (1242785.813203387) 105359412.33194444 ns (384372.3280451372) 1.08
tpch_q16/parquet 123707059.58956349 ns (1305622.2368055433) 114581758.44123015 ns (395706.06611109525) 1.08
tpch_q16/vortex-file-compressed 142373587.85075396 ns (932222.9411180764) 132663093.68722221 ns (447697.38984028995) 1.07
tpch_q16/vortex-file-uncompressed 139659079.99234128 ns (726081.6791865081) 133423908.46908729 ns (566767.2521329448) 1.05
tpch_q17/vortex-in-memory-no-pushdown 633760997.2 ns (10521284.887499988) 502786586.9 ns (7338251.254999995) 1.26
tpch_q17/vortex-in-memory-pushdown 690550600.8 ns (5547452.28125) 584441219.9 ns (5980089.721249998) 1.18
tpch_q17/arrow 607177027.6 ns (5472584.252499998) 495474525.9 ns (3967364.032499999) 1.23
tpch_q17/parquet 664312840.8 ns (4334669.38499999) 630346776.4 ns (2794100.289999962) 1.05
tpch_q17/vortex-file-compressed 678641894.5 ns (11208262.174999952) 593117961.6 ns (4990772.024999976) 1.14
tpch_q17/vortex-file-uncompressed 661817535.4 ns (6936608.199999988) 607737416.2 ns (6044423.356250048) 1.09
tpch_q18/vortex-in-memory-no-pushdown 1191325318.2 ns (17854550.693749905) 1055464736.8 ns (6186682.693750024) 1.13
tpch_q18/vortex-in-memory-pushdown 1165806688.7 ns (15237292.748749971) 1049565933 ns (3875237.7975000143) 1.11
tpch_q18/arrow 1157170124.5 ns (14854757.586249948) 1041486306.9 ns (6043088.167500079) 1.11
tpch_q18/parquet 1325949179.7 ns (14597791.27124989) 1200986862.3 ns (7970737.286249995) 1.10
tpch_q18/vortex-file-compressed 1261316022.6 ns (6348699.25) 1090918400.4 ns (5187849.399999976) 1.16
tpch_q18/vortex-file-uncompressed 1186415442.4 ns (6818638.797500014) 1091636317.5 ns (12325983.911249995) 1.09
tpch_q19/vortex-in-memory-no-pushdown 185704373.6 ns (381997.4850000143) 179497140.12515873 ns (351437.663464278) 1.03
tpch_q19/vortex-in-memory-pushdown 297423363.25 ns (1570619.6943750083) 296490734.05 ns (479395.46062502265) 1.00
tpch_q19/arrow 171777932.7475 ns (626657.204562515) 162686060.82611108 ns (320632.15082637966) 1.06
tpch_q19/parquet 460441341.25 ns (1613166.453125) 451888442.65 ns (1322696.4643749893) 1.02
tpch_q19/vortex-file-compressed 456898809.5 ns (3026074.135625005) 337250538.9 ns (1179243.8556250036) 1.35
tpch_q19/vortex-file-uncompressed 492681598.6 ns (1985342.1431250274) 401877433.35 ns (900959.0749999881) 1.23
tpch_q20/vortex-in-memory-no-pushdown 271968005.15 ns (2786375.357499987) 237112517.4333333 ns (1110753.573333323) 1.15
tpch_q20/vortex-in-memory-pushdown 296483103.8 ns (3901202) 252487081.6 ns (1786143.546875) 1.17
tpch_q20/arrow 265001033.1 ns (2797432.399999991) 228550209.1666667 ns (2110736.6875) 1.16
tpch_q20/parquet 382417762.1 ns (2408453.125) 349335637.75 ns (1483596.7543750405) 1.09
tpch_q20/vortex-file-compressed 385140581.05 ns (2738380.4181250036) 339504223.9 ns (1567080.0549999774) 1.13
tpch_q20/vortex-file-uncompressed 397646511.9 ns (1674278.0693749785) 361213199.3 ns (2037610.887499988) 1.10
tpch_q21/vortex-in-memory-no-pushdown 939464605.2 ns (3509452.313749969) 876699758.1 ns (3148857.412500024) 1.07
tpch_q21/vortex-in-memory-pushdown 993153446.2 ns (15713660.586249948) 898738061.1 ns (6534987.466250002) 1.11
tpch_q21/arrow 926524683.9 ns (9165995.850000024) 852099586.7 ns (6856553.196250021) 1.09
tpch_q21/parquet 1047534499.6 ns (8502314.501249969) 956127854.5 ns (5562546.994999945) 1.10
tpch_q21/vortex-file-compressed 355506665.3 ns (3715980.5724999905) 1180026120.9 ns (5417144.650000095) 0.30
tpch_q21/vortex-file-uncompressed 344064123.85 ns (1867301.9456250072) 1069489692.7 ns (4757015.63499999) 0.32
tpch_q22/vortex-in-memory-no-pushdown 77092055.74222222 ns (462019.19905555993) 75806490.51601191 ns (388645.9278273806) 1.02
tpch_q22/vortex-in-memory-pushdown 77515114.3866865 ns (240406.77481871098) 75233551.44293651 ns (313823.21928472817) 1.03
tpch_q22/arrow 76442960.8715873 ns (269182.98918403685) 73650682.17509922 ns (119282.2514243573) 1.04
tpch_q22/parquet 94675621.34369048 ns (707262.5675208345) 92327697.93174604 ns (454039.8261071518) 1.03
tpch_q22/vortex-file-compressed 127729460.85011907 ns (637308.6874791682) 118339212.50873017 ns (336489.2850525826) 1.08
tpch_q22/vortex-file-uncompressed 122895619.02583332 ns (607386.9148333371) 117053664.8313889 ns (413378.25551041216) 1.05

@github-actions bot (Contributor) left a comment

Vortex Compression

Benchmark suite Current: 401f855 Previous: 3277e18 Ratio
compress time/taxi 1258278708.8 ns (6731527.900000095) 1199816302.8 ns (5722503.049999952) 1.05
compress time/taxi throughput 470808924 bytes 470808924 bytes 1
parquet_rs-zstd compress time/taxi 1755102093.8 ns (4290965.653750062) 1778997730.5 ns (3683799.6437499523) 0.99
parquet_rs-zstd compress time/taxi throughput 470808924 bytes 470808924 bytes 1
decompress time/taxi 531824037.4 ns (3506343.1475000083) 423433096.2 ns (3823650.799999982) 1.26
decompress time/taxi throughput 470808924 bytes 470808924 bytes 1
parquet_rs-zstd decompress time/taxi 306350742.25 ns (878664.1450000107) 311296972.05 ns (1176399.5412499905) 0.98
parquet_rs-zstd decompress time/taxi throughput 470808924 bytes 470808924 bytes 1
vortex:parquet-zstd size/taxi 0.9451798857093027 ratio 0.9371571283546467 ratio 1.01
vortex:raw size/taxi 0.11234527916467446 ratio 0.11139141449238969 ratio 1.01
vortex size/taxi 52893160 bytes 52444072 bytes 1.01
compress time/AirlineSentiment 692656.6663510102 ns (1727.141652777791) 685859.9294626446 ns (1218.1701421708567) 1.01
compress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
parquet_rs-zstd compress time/AirlineSentiment 56081.881914284735 ns (91.87957853519765) 56520.17104530158 ns (200.96131697812234) 0.99
parquet_rs-zstd compress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
decompress time/AirlineSentiment 77503.16159851402 ns (136.3729915974327) 39409.17819261413 ns (109.90872363405651) 1.97
decompress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
parquet_rs-zstd decompress time/AirlineSentiment 32969.29712027767 ns (45.36404967109047) 32325.869463956078 ns (140.126427641786) 1.02
parquet_rs-zstd decompress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
vortex:parquet-zstd size/AirlineSentiment 6.196483971044468 ratio 6.196483971044468 ratio 1
vortex:raw size/AirlineSentiment 2.9663366336633663 ratio 2.9663366336633663 ratio 1
vortex size/AirlineSentiment 5992 bytes 5992 bytes 1
compress time/Arade 2211462850.8 ns (2810205.638749838) 2265286373.7 ns (13868689.903749943) 0.98
compress time/Arade throughput 787023760 bytes 787023760 bytes 1
parquet_rs-zstd compress time/Arade 2932424377.3 ns (4590952.299999952) 3008320736.3 ns (9277296.88499999) 0.97
parquet_rs-zstd compress time/Arade throughput 787023760 bytes 787023760 bytes 1
decompress time/Arade 1182932376.9 ns (3567632.2999999523) 800402832.8 ns (12131436.810000002) 1.48
decompress time/Arade throughput 787023760 bytes 787023760 bytes 1
parquet_rs-zstd decompress time/Arade 643348677.5 ns (2607907.2999999523) 657387207.6 ns (4389854.350000024) 0.98
parquet_rs-zstd decompress time/Arade throughput 787023760 bytes 787023760 bytes 1
vortex:parquet-zstd size/Arade 0.4789069103731621 ratio 0.4789069103731621 ratio 1
vortex:raw size/Arade 0.18583271234403392 ratio 0.18583271234403392 ratio 1
vortex size/Arade 146254760 bytes 146254760 bytes 1
compress time/Bimbo 10532313238.8 ns (8753069.399999619) 9546050628.8 ns (22447441.511249542) 1.10
compress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
parquet_rs-zstd compress time/Bimbo 21559613285.8 ns (78053853.92625237) 21000198558.3 ns (55499542.11249924) 1.03
parquet_rs-zstd compress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
decompress time/Bimbo 6158732701 ns (43366186.15250015) 4571181199.6 ns (71870951.05875015) 1.35
decompress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
parquet_rs-zstd decompress time/Bimbo 2618450872 ns (5359471.776249886) 2626485980.6 ns (5640188.648750067) 1.00
parquet_rs-zstd decompress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
vortex:parquet-zstd size/Bimbo 1.237048831798617 ratio 1.1815068966593076 ratio 1.05
vortex:raw size/Bimbo 0.06742581241533095 ratio 0.06439847832501656 ratio 1.05
vortex size/Bimbo 480161704 bytes 458603048 bytes 1.05
compress time/CMSprovider 11446667314.6 ns (25195523.03125) 12205124085.2 ns (59046762.10000038) 0.94
compress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
parquet_rs-zstd compress time/CMSprovider 18698216778.7 ns (24489710.66625023) 18920549638.4 ns (99835018.52875137) 0.99
parquet_rs-zstd compress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
decompress time/CMSprovider 7491455607.6 ns (151501053.08999968) 7446892436.3 ns (77265632.45749998) 1.01
decompress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
parquet_rs-zstd decompress time/CMSprovider 5140026444.5 ns (98599830.3499999) 5014108288.6 ns (39858506.69999981) 1.03
parquet_rs-zstd decompress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
vortex:parquet-zstd size/CMSprovider 1.2143074714199444 ratio 1.2015251768731632 ratio 1.01
vortex:raw size/CMSprovider 0.18147339985074012 ratio 0.17956299177574045 ratio 1.01
vortex size/CMSprovider 934429032 bytes 924592104 bytes 1.01
compress time/Euro2016 2660717266.5 ns (3068819.587499857) 2668723686.7 ns (3997447.672499895) 1.00
compress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
parquet_rs-zstd compress time/Euro2016 1550945664.3 ns (3095595.832499981) 1585550199.8 ns (6202071.157499909) 0.98
parquet_rs-zstd compress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
decompress time/Euro2016 416480787.05 ns (2136586.8056250215) 297746397.8 ns (2823391.1168750226) 1.40
decompress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
parquet_rs-zstd decompress time/Euro2016 492192003.1 ns (3000707.4981250167) 496344965.1 ns (4941529.356874973) 0.99
parquet_rs-zstd decompress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
vortex:parquet-zstd size/Euro2016 1.4383522286595696 ratio 1.4383522286595696 ratio 1
vortex:raw size/Euro2016 0.4348484255644533 ratio 0.4348484255644533 ratio 1
vortex size/Euro2016 171005544 bytes 171005544 bytes 1
compress time/Food 1147227022.5 ns (5826070.358750105) 1127347657.2 ns (8226763.6712498665) 1.02
compress time/Food throughput 332718229 bytes 332718229 bytes 1
parquet_rs-zstd compress time/Food 1084510910.2 ns (3681184.4500000477) 1067724763.5 ns (4175386.2425000668) 1.02
parquet_rs-zstd compress time/Food throughput 332718229 bytes 332718229 bytes 1
decompress time/Food 253954285.65 ns (1666382.4068749994) 205389161.63333333 ns (1506009.0666666627) 1.24
decompress time/Food throughput 332718229 bytes 332718229 bytes 1
parquet_rs-zstd decompress time/Food 215438826.53333336 ns (637830.150000006) 217049395.16666666 ns (962805.8475000262) 0.99
parquet_rs-zstd decompress time/Food throughput 332718229 bytes 332718229 bytes 1
vortex:parquet-zstd size/Food 1.2386506182043049 ratio 1.238933256800101 ratio 1.00
vortex:raw size/Food 0.13487787589780662 ratio 0.1349090374005327 ratio 1.00
vortex size/Food 44876328 bytes 44886696 bytes 1.00
compress time/HashTags 2616791563.3 ns (2067589.9000000954) 2594094550.1 ns (10638444) 1.01
compress time/HashTags throughput 804495592 bytes 804495592 bytes 1
parquet_rs-zstd compress time/HashTags 2491819246.2 ns (3830624.5724999905) 2486439545.9 ns (9046463.120000124) 1.00
parquet_rs-zstd compress time/HashTags throughput 804495592 bytes 804495592 bytes 1
decompress time/HashTags 825408410.9 ns (5047659.055000007) 613006842.4 ns (3863965.4462500215) 1.35
decompress time/HashTags throughput 804495592 bytes 804495592 bytes 1
parquet_rs-zstd decompress time/HashTags 793843690.4 ns (6729594.62499994) 798286790 ns (12325588.789999962) 0.99
parquet_rs-zstd decompress time/HashTags throughput 804495592 bytes 804495592 bytes 1
vortex:parquet-zstd size/HashTags 1.7048629043277994 ratio 1.6584175128434935 ratio 1.03
vortex:raw size/HashTags 0.2838956611710061 ratio 0.2761615230826523 ratio 1.03
vortex size/HashTags 228392808 bytes 222170728 bytes 1.03
compress time/TPC-H l_comment chunked without fsst 3918088569 ns (38853785.70499992) 3989315351.9 ns (55086752.43499994) 0.98
compress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 249197090 bytes 1
parquet_rs-zstd compress time/TPC-H l_comment chunked without fsst 913223923 ns (3727834.1262500286) 925472299.3 ns (2078326.4462499619) 0.99
parquet_rs-zstd compress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 249197090 bytes 1
decompress time/TPC-H l_comment chunked without fsst 120853483.47305556 ns (998794.0191840231) 117013804.05523808 ns (1851055.195029758) 1.03
decompress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 249197090 bytes 1
parquet_rs-zstd decompress time/TPC-H l_comment chunked without fsst 252776749.75 ns (778863.9362500012) 254826817.35 ns (2192562.6368749887) 0.99
parquet_rs-zstd decompress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 249197090 bytes 1
vortex:parquet-zstd size/TPC-H l_comment chunked without fsst 4.607739931587605 ratio 4.607492100051195 ratio 1.00
vortex:raw size/TPC-H l_comment chunked without fsst 1.0527099814849363 ratio 1.0527061291125028 ratio 1.00
vortex size/TPC-H l_comment chunked without fsst 262332264 bytes 262331304 bytes 1.00
compress time/TPC-H l_comment chunked 940915980.7 ns (5045854.5512500405) 934628417.9 ns (4951446.75999999) 1.01
compress time/TPC-H l_comment chunked throughput 249197090 bytes 249197090 bytes 1
parquet_rs-zstd compress time/TPC-H l_comment chunked 917558704 ns (4515114.649999976) 931884612.4 ns (2752188.0950000286) 0.98
parquet_rs-zstd compress time/TPC-H l_comment chunked throughput 249197090 bytes 249197090 bytes 1
decompress time/TPC-H l_comment chunked 159076399.77519843 ns (421213.40318898857) 135275310.3498413 ns (731511.7177142948) 1.18
decompress time/TPC-H l_comment chunked throughput 249197090 bytes 249197090 bytes 1
parquet_rs-zstd decompress time/TPC-H l_comment chunked 253082092.6 ns (596091.075000003) 253436685.6 ns (1090212.6306250095) 1.00
parquet_rs-zstd decompress time/TPC-H l_comment chunked throughput 249197090 bytes 249197090 bytes 1
vortex:parquet-zstd size/TPC-H l_comment chunked 1.3480103387566829 ratio 1.347943891623079 ratio 1.00
vortex:raw size/TPC-H l_comment chunked 0.3079739655065796 ratio 0.3079742223314084 ratio 1.00
vortex size/TPC-H l_comment chunked 76746216 bytes 76746280 bytes 1.00
compress time/TPC-H l_comment canonical 921916506.05 ns (2443485.208124995) 934557517.7 ns (3373342.2681249976) 0.99
compress time/TPC-H l_comment canonical throughput 249197106 bytes 249197106 bytes 1
parquet_rs-zstd compress time/TPC-H l_comment canonical 920059253.4 ns (3310766.6162499785) 927496165.3 ns (2423076.0250000358) 0.99
parquet_rs-zstd compress time/TPC-H l_comment canonical throughput 249197106 bytes 249197106 bytes 1
decompress time/TPC-H l_comment canonical 159179436.6436905 ns (402262.679689005) 134060904.4868783 ns (585881.5040095896) 1.19
decompress time/TPC-H l_comment canonical throughput 249197106 bytes 249197106 bytes 1
parquet_rs-zstd decompress time/TPC-H l_comment canonical 253502326.75767857 ns (857150.2099590749) 251131908.93484125 ns (881966.8269980699) 1.01
parquet_rs-zstd decompress time/TPC-H l_comment canonical throughput 249197106 bytes 249197106 bytes 1
vortex:parquet-zstd size/TPC-H l_comment canonical 1.3480091312229783 ratio 1.3479486502744302 ratio 1.00
vortex:raw size/TPC-H l_comment canonical 0.30797394573274056 ratio 0.307974202557553 ratio 1.00
vortex size/TPC-H l_comment canonical 76746216 bytes 76746280 bytes 1.00

@robert3005 robert3005 added the benchmark Run benchmarks on this branch label Oct 23, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Oct 23, 2024
@robert3005 robert3005 force-pushed the rk/filterpushdown2 branch 2 times, most recently from 87ca4b9 to 4f951c9 Compare October 24, 2024 13:50
@robert3005 robert3005 marked this pull request as ready for review October 25, 2024 00:10
@danking danking mentioned this pull request Oct 25, 2024
32 tasks

@danking (Member) left a comment

Is there now no way to enforce a maximum batch size on read?

pyvortex/python/vortex/dataset.py (resolved)
pyvortex/python/vortex/dataset.py (resolved)
pyvortex/python/vortex/dataset.py (resolved)
vortex-serde/src/layouts/read/buffered.rs (outdated, resolved)

@robert3005 (Member, Author) commented

For now I have removed batch sizes; they are effectively determined by how the data was written. We can add batching if we want, but the current reader splits as little as possible so that it hands you the data as it was written.
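
As a rough illustration of what a maximum batch size layered on top of such a reader could look like, here is a minimal sketch. It assumes the reader yields chunks as plain Python sequences in the sizes they were written; `cap_batch_size` and `max_rows` are hypothetical names, not part of pyvortex or vortex-serde.

```python
from typing import Iterable, Iterator, Sequence


def cap_batch_size(chunks: Iterable[Sequence], max_rows: int) -> Iterator[Sequence]:
    """Split each incoming chunk so that no yielded batch exceeds max_rows.

    Chunks at or below the limit pass through unchanged, and chunks are
    never merged, so the written layout is preserved wherever possible.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    for chunk in chunks:
        for start in range(0, len(chunk), max_rows):
            yield chunk[start:start + max_rows]


if __name__ == "__main__":
    written = [[1, 2, 3, 4, 5], [6]]
    print(list(cap_batch_size(written, max_rows=2)))
```

The sketch only enforces an upper bound; coalescing small chunks into larger batches would be a separate decision.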

@robert3005 (Member, Author) commented

@danking I filed #1138; it should be easy to implement. I will leave some comments in the code.

@robert3005 robert3005 dismissed danking’s stale review October 25, 2024 15:14

Addressed comments

vortex-serde/src/layouts/read/expr_project.rs (outdated, resolved)
vortex-serde/src/layouts/read/expr_project.rs (outdated, resolved)
vortex-serde/src/layouts/read/expr_project.rs (resolved)
vortex-serde/src/layouts/read/stream.rs (outdated, resolved)
vortex-serde/src/layouts/read/stream.rs (resolved)
vortex-serde/src/layouts/read/stream.rs (outdated, resolved)

@gatesn (Contributor) left a comment

Some small comments. I don't think this is a great implementation, but it is a reasonable base to build on.

vortex-serde/src/layouts/read/stream.rs (outdated, resolved)
vortex-serde/src/layouts/read/mod.rs (outdated, resolved)
vortex-serde/src/layouts/read/stream.rs (outdated, resolved)
vortex-serde/src/layouts/read/stream.rs (resolved)
vortex-serde/src/layouts/read/stream.rs (outdated, resolved)
vortex-serde/src/layouts/read/cache.rs (resolved)
vortex-serde/src/layouts/read/batch.rs (resolved)
@robert3005 robert3005 enabled auto-merge (squash) November 1, 2024 17:54
@robert3005 robert3005 merged commit 00cd875 into develop Nov 1, 2024
5 checks passed
@robert3005 robert3005 deleted the rk/filterpushdown2 branch November 1, 2024 20:31
@robert3005 robert3005 mentioned this pull request Nov 5, 2024
15 tasks
6 participants