Allow to run CI on push to ci-branch #5790

albertvillanova · 2023-04-25T13:57:26Z

This PR allows running the CI on pushes to any branch whose name matches "ci-*", without needing to open a PR.

This makes it possible to run the CI tests without opening a PR, e.g., for upcoming huggingface-hub releases or upcoming releases of other dependencies (like fsspec, pandas, ...).

Note that we already build the documentation on pushes to branches named "doc-builder*".
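
To illustrate which pushes a "ci-*" filter would pick up, here is a minimal Python sketch. It only approximates the glob semantics documented for GitHub Actions branch filters, where a single `*` matches any run of characters except `/`. The helper names and the `PUSH_PATTERNS` list (including the `main` entry) are hypothetical and are not taken from this PR's diff.

```python
import re

# Assumed push filters, for illustration only; the real workflow file may differ.
PUSH_PATTERNS = ["main", "ci-*"]


def branch_matches(pattern: str, branch: str) -> bool:
    """Approximate a branch filter in which '*' matches any characters except '/'.

    Only the single '*' wildcard is handled; other glob features
    ('**', '?', '!', character classes) are out of scope for this sketch.
    """
    regex = "".join("[^/]*" if ch == "*" else re.escape(ch) for ch in pattern)
    return re.fullmatch(regex, branch) is not None


def runs_ci_on_push(branch: str) -> bool:
    """Return True if a push to `branch` matches any of the assumed patterns."""
    return any(branch_matches(p, branch) for p in PUSH_PATTERNS)


if __name__ == "__main__":
    for name in ["main", "ci-hfh-0.14", "ci-deps/pandas", "doc-builder-test"]:
        print(name, runs_ci_on_push(name))
```

Under these assumptions, a branch such as `ci-hfh-0.14` would trigger the CI on push, while a name containing a slash after the prefix (`ci-deps/pandas`) would not, because a single `*` does not cross `/`.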

See:

Prepare tests for hfh 0.14 #5788

CC: @Wauplin

HuggingFaceDocBuilderDev · 2023-04-25T14:01:22Z

The documentation is not available anymore as the PR was closed or merged.

Wauplin

Good idea! Better not to be specific to hfh :)

github-actions · 2023-04-26T13:43:08Z

PyArrow==8.0.0

Benchmark: benchmark_array_xd.json

metric	read_batch_formatted_as_numpy after write_array2d	read_batch_formatted_as_numpy after write_flattened_sequence	read_batch_formatted_as_numpy after write_nested_sequence	read_batch_unformated after write_array2d	read_batch_unformated after write_flattened_sequence	read_batch_unformated after write_nested_sequence	read_col_formatted_as_numpy after write_array2d	read_col_formatted_as_numpy after write_flattened_sequence	read_col_formatted_as_numpy after write_nested_sequence	read_col_unformated after write_array2d	read_col_unformated after write_flattened_sequence	read_col_unformated after write_nested_sequence	read_formatted_as_numpy after write_array2d	read_formatted_as_numpy after write_flattened_sequence	read_formatted_as_numpy after write_nested_sequence	read_unformated after write_array2d	read_unformated after write_flattened_sequence	read_unformated after write_nested_sequence	write_array2d	write_flattened_sequence	write_nested_sequence
new / old (diff)	0.007852 / 0.011353 (-0.003500)	0.005804 / 0.011008 (-0.005204)	0.098268 / 0.038508 (0.059760)	0.036440 / 0.023109 (0.013331)	0.299952 / 0.275898 (0.024054)	0.335590 / 0.323480 (0.012111)	0.006332 / 0.007986 (-0.001653)	0.004218 / 0.004328 (-0.000110)	0.074733 / 0.004250 (0.070483)	0.055252 / 0.037052 (0.018200)	0.300854 / 0.258489 (0.042365)	0.353442 / 0.293841 (0.059601)	0.036447 / 0.128546 (-0.092099)	0.012638 / 0.075646 (-0.063009)	0.336680 / 0.419271 (-0.082591)	0.052436 / 0.043533 (0.008903)	0.292606 / 0.255139 (0.037467)	0.319676 / 0.283200 (0.036476)	0.111137 / 0.141683 (-0.030546)	1.449569 / 1.452155 (-0.002586)	1.558110 / 1.492716 (0.065394)

Benchmark: benchmark_getitem_100B.json

metric	get_batch_of_1024_random_rows	get_batch_of_1024_rows	get_first_row	get_last_row
new / old (diff)	0.306043 / 0.018006 (0.288037)	0.563174 / 0.000490 (0.562684)	0.032227 / 0.000200 (0.032027)	0.000491 / 0.000054 (0.000436)

Benchmark: benchmark_indices_mapping.json

metric	select	shard	shuffle	sort	train_test_split
new / old (diff)	0.029874 / 0.037411 (-0.007537)	0.109330 / 0.014526 (0.094805)	0.122579 / 0.176557 (-0.053978)	0.181398 / 0.737135 (-0.555737)	0.127124 / 0.296338 (-0.169215)

Benchmark: benchmark_iterating.json

metric	read 5000	read 50000	read_batch 50000 10	read_batch 50000 100	read_batch 50000 1000	read_formatted numpy 5000	read_formatted pandas 5000	read_formatted tensorflow 5000	read_formatted torch 5000	read_formatted_batch numpy 5000 10	read_formatted_batch numpy 5000 1000	shuffled read 5000	shuffled read 50000	shuffled read_batch 50000 10	shuffled read_batch 50000 100	shuffled read_batch 50000 1000	shuffled read_formatted numpy 5000	shuffled read_formatted_batch numpy 5000 10	shuffled read_formatted_batch numpy 5000 1000
new / old (diff)	0.417950 / 0.215209 (0.202741)	4.163883 / 2.077655 (2.086228)	1.985209 / 1.504120 (0.481089)	1.793660 / 1.541195 (0.252465)	1.895193 / 1.468490 (0.426703)	0.694331 / 4.584777 (-3.890446)	3.820170 / 3.745712 (0.074458)	2.180556 / 5.269862 (-3.089305)	1.490671 / 4.565676 (-3.075006)	0.086132 / 0.424275 (-0.338143)	0.012289 / 0.007607 (0.004682)	0.511182 / 0.226044 (0.285137)	5.117855 / 2.268929 (2.848927)	2.403914 / 55.444624 (-53.040710)	2.071107 / 6.876477 (-4.805369)	2.184108 / 2.142072 (0.042036)	0.835028 / 4.805227 (-3.970199)	0.167707 / 6.500664 (-6.332957)	0.066724 / 0.075469 (-0.008746)

Benchmark: benchmark_map_filter.json

metric	filter	map fast-tokenizer batched	map identity	map identity batched	map no-op batched	map no-op batched numpy	map no-op batched pandas	map no-op batched pytorch	map no-op batched tensorflow
new / old (diff)	1.203921 / 1.841788 (-0.637867)	15.214676 / 8.074308 (7.140368)	14.971337 / 10.191392 (4.779945)	0.170225 / 0.680424 (-0.510199)	0.017924 / 0.534201 (-0.516277)	0.428532 / 0.579283 (-0.150751)	0.449157 / 0.434364 (0.014793)	0.507723 / 0.540337 (-0.032614)	0.615331 / 1.386936 (-0.771605)

PyArrow==latest

Benchmark: benchmark_array_xd.json

metric	read_batch_formatted_as_numpy after write_array2d	read_batch_formatted_as_numpy after write_flattened_sequence	read_batch_formatted_as_numpy after write_nested_sequence	read_batch_unformated after write_array2d	read_batch_unformated after write_flattened_sequence	read_batch_unformated after write_nested_sequence	read_col_formatted_as_numpy after write_array2d	read_col_formatted_as_numpy after write_flattened_sequence	read_col_formatted_as_numpy after write_nested_sequence	read_col_unformated after write_array2d	read_col_unformated after write_flattened_sequence	read_col_unformated after write_nested_sequence	read_formatted_as_numpy after write_array2d	read_formatted_as_numpy after write_flattened_sequence	read_formatted_as_numpy after write_nested_sequence	read_unformated after write_array2d	read_unformated after write_flattened_sequence	read_unformated after write_nested_sequence	write_array2d	write_flattened_sequence	write_nested_sequence
new / old (diff)	0.008172 / 0.011353 (-0.003181)	0.005405 / 0.011008 (-0.005603)	0.074684 / 0.038508 (0.036176)	0.039133 / 0.023109 (0.016024)	0.342598 / 0.275898 (0.066700)	0.377752 / 0.323480 (0.054272)	0.006655 / 0.007986 (-0.001331)	0.005788 / 0.004328 (0.001459)	0.074014 / 0.004250 (0.069763)	0.056225 / 0.037052 (0.019173)	0.342330 / 0.258489 (0.083841)	0.381052 / 0.293841 (0.087211)	0.036574 / 0.128546 (-0.091973)	0.012472 / 0.075646 (-0.063174)	0.087574 / 0.419271 (-0.331698)	0.050178 / 0.043533 (0.006646)	0.351116 / 0.255139 (0.095977)	0.363772 / 0.283200 (0.080572)	0.118313 / 0.141683 (-0.023370)	1.436691 / 1.452155 (-0.015463)	1.551397 / 1.492716 (0.058680)

Benchmark: benchmark_getitem_100B.json

metric	get_batch_of_1024_random_rows	get_batch_of_1024_rows	get_first_row	get_last_row
new / old (diff)	0.265201 / 0.018006 (0.247195)	0.561855 / 0.000490 (0.561366)	0.000463 / 0.000200 (0.000263)	0.000058 / 0.000054 (0.000004)

Benchmark: benchmark_indices_mapping.json

metric	select	shard	shuffle	sort	train_test_split
new / old (diff)	0.030540 / 0.037411 (-0.006871)	0.118815 / 0.014526 (0.104289)	0.127689 / 0.176557 (-0.048868)	0.176211 / 0.737135 (-0.560924)	0.133130 / 0.296338 (-0.163208)

Benchmark: benchmark_iterating.json

metric	read 5000	read 50000	read_batch 50000 10	read_batch 50000 100	read_batch 50000 1000	read_formatted numpy 5000	read_formatted pandas 5000	read_formatted tensorflow 5000	read_formatted torch 5000	read_formatted_batch numpy 5000 10	read_formatted_batch numpy 5000 1000	shuffled read 5000	shuffled read 50000	shuffled read_batch 50000 10	shuffled read_batch 50000 100	shuffled read_batch 50000 1000	shuffled read_formatted numpy 5000	shuffled read_formatted_batch numpy 5000 10	shuffled read_formatted_batch numpy 5000 1000
new / old (diff)	0.416318 / 0.215209 (0.201109)	4.146806 / 2.077655 (2.069151)	1.983437 / 1.504120 (0.479317)	1.799733 / 1.541195 (0.258539)	1.889026 / 1.468490 (0.420536)	0.723330 / 4.584777 (-3.861447)	3.817795 / 3.745712 (0.072083)	2.158449 / 5.269862 (-3.111413)	1.377348 / 4.565676 (-3.188328)	0.088504 / 0.424275 (-0.335771)	0.012560 / 0.007607 (0.004953)	0.530382 / 0.226044 (0.304337)	5.308529 / 2.268929 (3.039600)	2.469655 / 55.444624 (-52.974970)	2.136209 / 6.876477 (-4.740267)	2.322997 / 2.142072 (0.180924)	0.861396 / 4.805227 (-3.943831)	0.172747 / 6.500664 (-6.327917)	0.067617 / 0.075469 (-0.007852)

Benchmark: benchmark_map_filter.json

metric	filter	map fast-tokenizer batched	map identity	map identity batched	map no-op batched	map no-op batched numpy	map no-op batched pandas	map no-op batched pytorch	map no-op batched tensorflow
new / old (diff)	1.263225 / 1.841788 (-0.578563)	15.878025 / 8.074308 (7.803717)	14.815627 / 10.191392 (4.624235)	0.148722 / 0.680424 (-0.531702)	0.018071 / 0.534201 (-0.516130)	0.428389 / 0.579283 (-0.150894)	0.428635 / 0.434364 (-0.005729)	0.496953 / 0.540337 (-0.043385)	0.592783 / 1.386936 (-0.794153)

Allow run CI on push to ci-branch

7792d52

Wauplin approved these changes Apr 25, 2023

albertvillanova mentioned this pull request Apr 25, 2023

Prepare tests for hfh 0.14 #5788

Merged

Merge remote-tracking branch 'upstream/main' into ci-branch-on-push

5ec1bf0

albertvillanova merged commit d2e5568 into huggingface:main Apr 26, 2023

albertvillanova deleted the ci-branch-on-push branch April 26, 2023 13:35
