Fix docstring
albertvillanova committed Nov 16, 2021
1 parent e178cc7 commit 4c4a687
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/datasets/features/audio.py
@@ -48,7 +48,7 @@ def encode_example(self, value):
             value (:obj:`str` or :obj:`dict`): Data passed as input to Audio feature.
         Returns:
-            :obj:`dict`
+            :obj:`str` or :obj:`dict`
         """
         if isinstance(value, dict):
             self._storage_dtype = "struct"
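The docstring change documents that `encode_example` may return either a `str` or a `dict`, matching the types it accepts. A minimal sketch of that behavior follows. `AudioSketch` is a hypothetical stand-in, not the library's `Audio` class. The `"string"` default and the unchanged `return value` are assumptions, since the hunk does not show them.

```python
class AudioSketch:
    """Hypothetical stand-in for the Audio feature; not the library class."""

    def __init__(self):
        # Assumed default; the hunk does not show how this attribute is initialized.
        self._storage_dtype = "string"

    def encode_example(self, value):
        """Return the value to store: a str or a dict, per the fixed docstring."""
        # These two lines mirror the context lines visible in the hunk.
        if isinstance(value, dict):
            self._storage_dtype = "struct"
        # Assumption: the input is passed through unchanged.
        return value


feature = AudioSketch()
encoded_path = feature.encode_example("clip.wav")            # str in, str out
encoded_dict = feature.encode_example({"path": "clip.wav"})  # dict in, dict out
```

Under these assumptions, a plain path stays a `str`, while a `dict` input stays a `dict` and switches the storage dtype to `"struct"`.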

1 comment on commit 4c4a687

@github-actions

Show benchmarks

PyArrow==3.0.0

Show updated benchmarks!

Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.085786 / 0.011353 (0.074433) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.004512 / 0.011008 (-0.006496) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.037264 / 0.038508 (-0.001244) |
| read_batch_unformated after write_array2d | 0.044576 / 0.023109 (0.021467) |
| read_batch_unformated after write_flattened_sequence | 0.345323 / 0.275898 (0.069425) |
| read_batch_unformated after write_nested_sequence | 0.387029 / 0.323480 (0.063549) |
| read_col_formatted_as_numpy after write_array2d | 0.096453 / 0.007986 (0.088467) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.004937 / 0.004328 (0.000608) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.010747 / 0.004250 (0.006497) |
| read_col_unformated after write_array2d | 0.048360 / 0.037052 (0.011308) |
| read_col_unformated after write_flattened_sequence | 0.357028 / 0.258489 (0.098539) |
| read_col_unformated after write_nested_sequence | 0.387758 / 0.293841 (0.093917) |
| read_formatted_as_numpy after write_array2d | 0.101385 / 0.128546 (-0.027162) |
| read_formatted_as_numpy after write_flattened_sequence | 0.010225 / 0.075646 (-0.065421) |
| read_formatted_as_numpy after write_nested_sequence | 0.301827 / 0.419271 (-0.117444) |
| read_unformated after write_array2d | 0.054526 / 0.043533 (0.010993) |
| read_unformated after write_flattened_sequence | 0.350524 / 0.255139 (0.095385) |
| read_unformated after write_nested_sequence | 0.378292 / 0.283200 (0.095093) |
| write_array2d | 0.099009 / 0.141683 (-0.042674) |
| write_flattened_sequence | 2.158204 / 1.452155 (0.706049) |
| write_nested_sequence | 2.216930 / 1.492716 (0.724214) |

Benchmark: benchmark_getitem_100B.json

| metric | new / old (diff) |
|---|---|
| get_batch_of_1024_random_rows | 0.262687 / 0.018006 (0.244681) |
| get_batch_of_1024_rows | 0.486441 / 0.000490 (0.485952) |
| get_first_row | 0.004278 / 0.000200 (0.004078) |
| get_last_row | 0.000128 / 0.000054 (0.000074) |
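Each value in these tables has the form `new / old (diff)`, and the figures shown are consistent with `diff = new - old` up to rounding in the last digit. The sketch below parses one such cell and checks that relationship. The helper names `parse_cell` and `is_consistent` and the tolerance are illustrative assumptions, not part of the benchmark tooling.

```python
import re

# Matches a cell such as "0.262687 / 0.018006 (0.244681)".
CELL_PATTERN = re.compile(r"(-?\d+\.\d+) / (-?\d+\.\d+) \((-?\d+\.\d+)\)")


def parse_cell(cell):
    """Split a 'new / old (diff)' cell into three floats."""
    match = CELL_PATTERN.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized cell format: {cell!r}")
    new, old, diff = (float(group) for group in match.groups())
    return new, old, diff


def is_consistent(cell, tolerance=2e-6):
    """Check that the reported diff equals new - old within a rounding tolerance."""
    new, old, diff = parse_cell(cell)
    return abs((new - old) - diff) <= tolerance


# Cells copied from the benchmark_getitem_100B.json table above.
first = parse_cell("0.262687 / 0.018006 (0.244681)")
rounded_ok = is_consistent("0.486441 / 0.000490 (0.485952)")
```

The tolerance allows for cells such as `get_batch_of_1024_rows`, where the printed diff differs from the difference of the printed values by one unit in the last decimal place, presumably because the diff was computed before rounding.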

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.044925 / 0.037411 (0.007514) |
| shard | 0.027533 / 0.014526 (0.013007) |
| shuffle | 0.033342 / 0.176557 (-0.143215) |
| sort | 0.235669 / 0.737135 (-0.501466) |
| train_test_split | 0.035469 / 0.296338 (-0.260870) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.497994 / 0.215209 (0.282785) |
| read 50000 | 4.999268 / 2.077655 (2.921613) |
| read_batch 50000 10 | 2.162066 / 1.504120 (0.657947) |
| read_batch 50000 100 | 1.928265 / 1.541195 (0.387070) |
| read_batch 50000 1000 | 2.036541 / 1.468490 (0.568051) |
| read_formatted numpy 5000 | 0.494348 / 4.584777 (-4.090429) |
| read_formatted pandas 5000 | 5.974410 / 3.745712 (2.228698) |
| read_formatted tensorflow 5000 | 4.326375 / 5.269862 (-0.943486) |
| read_formatted torch 5000 | 1.040727 / 4.565676 (-3.524950) |
| read_formatted_batch numpy 5000 10 | 0.059619 / 0.424275 (-0.364656) |
| read_formatted_batch numpy 5000 1000 | 0.013306 / 0.007607 (0.005699) |
| shuffled read 5000 | 0.622525 / 0.226044 (0.396480) |
| shuffled read 50000 | 6.243713 / 2.268929 (3.974785) |
| shuffled read_batch 50000 10 | 2.725766 / 55.444624 (-52.718859) |
| shuffled read_batch 50000 100 | 2.241680 / 6.876477 (-4.634797) |
| shuffled read_batch 50000 1000 | 2.399805 / 2.142072 (0.257733) |
| shuffled read_formatted numpy 5000 | 0.642234 / 4.805227 (-4.162993) |
| shuffled read_formatted_batch numpy 5000 10 | 0.137364 / 6.500664 (-6.363300) |
| shuffled read_formatted_batch numpy 5000 1000 | 0.068001 / 0.075469 (-0.007468) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 1.843656 / 1.841788 (0.001868) |
| map fast-tokenizer batched | 14.836662 / 8.074308 (6.762354) |
| map identity | 31.323072 / 10.191392 (21.131680) |
| map identity batched | 0.866263 / 0.680424 (0.185840) |
| map no-op batched | 0.634006 / 0.534201 (0.099805) |
| map no-op batched numpy | 0.441210 / 0.579283 (-0.138073) |
| map no-op batched pandas | 0.633586 / 0.434364 (0.199222) |
| map no-op batched pytorch | 0.303459 / 0.540337 (-0.236879) |
| map no-op batched tensorflow | 0.336230 / 1.386936 (-1.050706) |

PyArrow==latest

Show updated benchmarks!

Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.088235 / 0.011353 (0.076882) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.004579 / 0.011008 (-0.006429) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.035233 / 0.038508 (-0.003275) |
| read_batch_unformated after write_array2d | 0.042776 / 0.023109 (0.019667) |
| read_batch_unformated after write_flattened_sequence | 0.403755 / 0.275898 (0.127857) |
| read_batch_unformated after write_nested_sequence | 0.440978 / 0.323480 (0.117498) |
| read_col_formatted_as_numpy after write_array2d | 0.103206 / 0.007986 (0.095220) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.005587 / 0.004328 (0.001258) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.008505 / 0.004250 (0.004254) |
| read_col_unformated after write_array2d | 0.044334 / 0.037052 (0.007282) |
| read_col_unformated after write_flattened_sequence | 0.401252 / 0.258489 (0.142763) |
| read_col_unformated after write_nested_sequence | 0.447585 / 0.293841 (0.153744) |
| read_formatted_as_numpy after write_array2d | 0.101995 / 0.128546 (-0.026552) |
| read_formatted_as_numpy after write_flattened_sequence | 0.010430 / 0.075646 (-0.065216) |
| read_formatted_as_numpy after write_nested_sequence | 0.300249 / 0.419271 (-0.119023) |
| read_unformated after write_array2d | 0.058916 / 0.043533 (0.015383) |
| read_unformated after write_flattened_sequence | 0.408124 / 0.255139 (0.152985) |
| read_unformated after write_nested_sequence | 0.434051 / 0.283200 (0.150852) |
| write_array2d | 0.105779 / 0.141683 (-0.035904) |
| write_flattened_sequence | 2.027013 / 1.452155 (0.574858) |
| write_nested_sequence | 2.088516 / 1.492716 (0.595800) |

Benchmark: benchmark_getitem_100B.json

| metric | new / old (diff) |
|---|---|
| get_batch_of_1024_random_rows | 0.273639 / 0.018006 (0.255633) |
| get_batch_of_1024_rows | 0.489720 / 0.000490 (0.489230) |
| get_first_row | 0.003598 / 0.000200 (0.003398) |
| get_last_row | 0.000099 / 0.000054 (0.000045) |

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.040237 / 0.037411 (0.002826) |
| shard | 0.025726 / 0.014526 (0.011201) |
| shuffle | 0.031682 / 0.176557 (-0.144875) |
| sort | 0.233081 / 0.737135 (-0.504054) |
| train_test_split | 0.033048 / 0.296338 (-0.263290) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.497001 / 0.215209 (0.281792) |
| read 50000 | 5.008874 / 2.077655 (2.931219) |
| read_batch 50000 10 | 2.146654 / 1.504120 (0.642534) |
| read_batch 50000 100 | 1.907614 / 1.541195 (0.366419) |
| read_batch 50000 1000 | 2.004663 / 1.468490 (0.536172) |
| read_formatted numpy 5000 | 0.495755 / 4.584777 (-4.089022) |
| read_formatted pandas 5000 | 5.678990 / 3.745712 (1.933278) |
| read_formatted tensorflow 5000 | 2.472732 / 5.269862 (-2.797129) |
| read_formatted torch 5000 | 1.022838 / 4.565676 (-3.542839) |
| read_formatted_batch numpy 5000 10 | 0.058844 / 0.424275 (-0.365431) |
| read_formatted_batch numpy 5000 1000 | 0.013345 / 0.007607 (0.005738) |
| shuffled read 5000 | 0.624328 / 0.226044 (0.398284) |
| shuffled read 50000 | 6.274043 / 2.268929 (4.005115) |
| shuffled read_batch 50000 10 | 2.724758 / 55.444624 (-52.719866) |
| shuffled read_batch 50000 100 | 2.307219 / 6.876477 (-4.569258) |
| shuffled read_batch 50000 1000 | 2.416340 / 2.142072 (0.274268) |
| shuffled read_formatted numpy 5000 | 0.638376 / 4.805227 (-4.166852) |
| shuffled read_formatted_batch numpy 5000 10 | 0.136171 / 6.500664 (-6.364493) |
| shuffled read_formatted_batch numpy 5000 1000 | 0.067240 / 0.075469 (-0.008229) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 1.816055 / 1.841788 (-0.025733) |
| map fast-tokenizer batched | 14.549836 / 8.074308 (6.475528) |
| map identity | 30.904540 / 10.191392 (20.713148) |
| map identity batched | 0.906045 / 0.680424 (0.225621) |
| map no-op batched | 0.624240 / 0.534201 (0.090039) |
| map no-op batched numpy | 0.436403 / 0.579283 (-0.142880) |
| map no-op batched pandas | 0.638311 / 0.434364 (0.203947) |
| map no-op batched pytorch | 0.324873 / 0.540337 (-0.215464) |
| map no-op batched tensorflow | 0.326025 / 1.386936 (-1.060911) |
