
Commit

Update paper link in medmcqa dataset card (#4290)
* Update README.md

* Replace paper link to abstract page

Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
monk1337 and albertvillanova authored Sep 30, 2022
1 parent db2e5b5 commit 0869a3c
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion datasets/medmcqa/README.md
@@ -52,7 +52,7 @@ pretty_name: MedMCQA

- **Homepage:** https://medmcqa.github.io
- **Repository:** https://github.com/medmcqa/medmcqa
-- **Paper:** https://arxiv.org/abs/2203.14371
+- **Paper:** [MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering](https://proceedings.mlr.press/v174/pal22a)
- **Leaderboard:** https://paperswithcode.com/dataset/medmcqa
- **Point of Contact:** [Aaditya Ura](mailto:aadityaura@gmail.com)


1 comment on commit 0869a3c

@github-actions


PyArrow==6.0.0


Benchmark: benchmark_array_xd.json

| metric | read_batch_formatted_as_numpy after write_array2d | read_batch_formatted_as_numpy after write_flattened_sequence | read_batch_formatted_as_numpy after write_nested_sequence | read_batch_unformated after write_array2d | read_batch_unformated after write_flattened_sequence | read_batch_unformated after write_nested_sequence | read_col_formatted_as_numpy after write_array2d | read_col_formatted_as_numpy after write_flattened_sequence | read_col_formatted_as_numpy after write_nested_sequence | read_col_unformated after write_array2d | read_col_unformated after write_flattened_sequence | read_col_unformated after write_nested_sequence | read_formatted_as_numpy after write_array2d | read_formatted_as_numpy after write_flattened_sequence | read_formatted_as_numpy after write_nested_sequence | read_unformated after write_array2d | read_unformated after write_flattened_sequence | read_unformated after write_nested_sequence | write_array2d | write_flattened_sequence | write_nested_sequence |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| new / old (diff) | 0.011947 / 0.011353 (0.000594) | 0.004966 / 0.011008 (-0.006043) | 0.035546 / 0.038508 (-0.002962) | 0.039134 / 0.023109 (0.016025) | 0.366502 / 0.275898 (0.090604) | 0.435716 / 0.323480 (0.112236) | 0.007368 / 0.007986 (-0.000617) | 0.005540 / 0.004328 (0.001212) | 0.008204 / 0.004250 (0.003954) | 0.044961 / 0.037052 (0.007909) | 0.391745 / 0.258489 (0.133256) | 0.453662 / 0.293841 (0.159821) | 0.054651 / 0.128546 (-0.073895) | 0.014780 / 0.075646 (-0.060866) | 0.347247 / 0.419271 (-0.072024) | 0.070263 / 0.043533 (0.026730) | 0.374007 / 0.255139 (0.118868) | 0.454460 / 0.283200 (0.171260) | 0.111887 / 0.141683 (-0.029795) | 1.774525 / 1.452155 (0.322371) | 1.790146 / 1.492716 (0.297430) |

Benchmark: benchmark_getitem_100B.json

| metric | get_batch_of_1024_random_rows | get_batch_of_1024_rows | get_first_row | get_last_row |
|---|---|---|---|---|
| new / old (diff) | 0.226561 / 0.018006 (0.208555) | 0.642592 / 0.000490 (0.642102) | 0.003043 / 0.000200 (0.002843) | 0.000176 / 0.000054 (0.000122) |

Benchmark: benchmark_indices_mapping.json

| metric | select | shard | shuffle | sort | train_test_split |
|---|---|---|---|---|---|
| new / old (diff) | 0.026789 / 0.037411 (-0.010622) | 0.118780 / 0.014526 (0.104254) | 0.129254 / 0.176557 (-0.047303) | 0.180442 / 0.737135 (-0.556693) | 0.134086 / 0.296338 (-0.162252) |

Benchmark: benchmark_iterating.json

| metric | read 5000 | read 50000 | read_batch 50000 10 | read_batch 50000 100 | read_batch 50000 1000 | read_formatted numpy 5000 | read_formatted pandas 5000 | read_formatted tensorflow 5000 | read_formatted torch 5000 | read_formatted_batch numpy 5000 10 | read_formatted_batch numpy 5000 1000 | shuffled read 5000 | shuffled read 50000 | shuffled read_batch 50000 10 | shuffled read_batch 50000 100 | shuffled read_batch 50000 1000 | shuffled read_formatted numpy 5000 | shuffled read_formatted_batch numpy 5000 10 | shuffled read_formatted_batch numpy 5000 1000 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| new / old (diff) | 0.655055 / 0.215209 (0.439845) | 6.148132 / 2.077655 (4.070477) | 2.448080 / 1.504120 (0.943960) | 2.138412 / 1.541195 (0.597217) | 2.057410 / 1.468490 (0.588920) | 0.756717 / 4.584777 (-3.828060) | 5.314829 / 3.745712 (1.569117) | 2.986993 / 5.269862 (-2.282869) | 1.945056 / 4.565676 (-2.620620) | 0.098632 / 0.424275 (-0.325643) | 0.014053 / 0.007607 (0.006446) | 0.828259 / 0.226044 (0.602215) | 8.047426 / 2.268929 (5.778498) | 3.157042 / 55.444624 (-52.287583) | 2.470998 / 6.876477 (-4.405479) | 2.581889 / 2.142072 (0.439817) | 0.959448 / 4.805227 (-3.845780) | 0.223033 / 6.500664 (-6.277631) | 0.089160 / 0.075469 (0.013691) |

Benchmark: benchmark_map_filter.json

| metric | filter | map fast-tokenizer batched | map identity | map identity batched | map no-op batched | map no-op batched numpy | map no-op batched pandas | map no-op batched pytorch | map no-op batched tensorflow |
|---|---|---|---|---|---|---|---|---|---|
| new / old (diff) | 1.897759 / 1.841788 (0.055972) | 16.568141 / 8.074308 (8.493833) | 43.703151 / 10.191392 (33.511759) | 1.240364 / 0.680424 (0.559940) | 0.744592 / 0.534201 (0.210391) | 0.469855 / 0.579283 (-0.109428) | 0.643331 / 0.434364 (0.208967) | 0.356341 / 0.540337 (-0.183997) | 0.367844 / 1.386936 (-1.019092) |

PyArrow==latest

Benchmark: benchmark_array_xd.json

| metric | read_batch_formatted_as_numpy after write_array2d | read_batch_formatted_as_numpy after write_flattened_sequence | read_batch_formatted_as_numpy after write_nested_sequence | read_batch_unformated after write_array2d | read_batch_unformated after write_flattened_sequence | read_batch_unformated after write_nested_sequence | read_col_formatted_as_numpy after write_array2d | read_col_formatted_as_numpy after write_flattened_sequence | read_col_formatted_as_numpy after write_nested_sequence | read_col_unformated after write_array2d | read_col_unformated after write_flattened_sequence | read_col_unformated after write_nested_sequence | read_formatted_as_numpy after write_array2d | read_formatted_as_numpy after write_flattened_sequence | read_formatted_as_numpy after write_nested_sequence | read_unformated after write_array2d | read_unformated after write_flattened_sequence | read_unformated after write_nested_sequence | write_array2d | write_flattened_sequence | write_nested_sequence |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| new / old (diff) | 0.008416 / 0.011353 (-0.002937) | 0.004783 / 0.011008 (-0.006225) | 0.032095 / 0.038508 (-0.006413) | 0.034275 / 0.023109 (0.011166) | 0.402438 / 0.275898 (0.126540) | 0.456290 / 0.323480 (0.132810) | 0.004336 / 0.007986 (-0.003650) | 0.003901 / 0.004328 (-0.000427) | 0.005634 / 0.004250 (0.001383) | 0.040553 / 0.037052 (0.003501) | 0.403452 / 0.258489 (0.144963) | 0.478396 / 0.293841 (0.184555) | 0.039541 / 0.128546 (-0.089006) | 0.012077 / 0.075646 (-0.063569) | 0.315784 / 0.419271 (-0.103487) | 0.059821 / 0.043533 (0.016289) | 0.391237 / 0.255139 (0.136098) | 0.420684 / 0.283200 (0.137484) | 0.107854 / 0.141683 (-0.033829) | 1.823674 / 1.452155 (0.371519) | 1.827233 / 1.492716 (0.334516) |

Benchmark: benchmark_getitem_100B.json

| metric | get_batch_of_1024_random_rows | get_batch_of_1024_rows | get_first_row | get_last_row |
|---|---|---|---|---|
| new / old (diff) | 0.315143 / 0.018006 (0.297136) | 0.536694 / 0.000490 (0.536204) | 0.027901 / 0.000200 (0.027701) | 0.000451 / 0.000054 (0.000396) |

Benchmark: benchmark_indices_mapping.json

| metric | select | shard | shuffle | sort | train_test_split |
|---|---|---|---|---|---|
| new / old (diff) | 0.025335 / 0.037411 (-0.012076) | 0.118426 / 0.014526 (0.103901) | 0.126070 / 0.176557 (-0.050487) | 0.169851 / 0.737135 (-0.567284) | 0.129541 / 0.296338 (-0.166798) |

Benchmark: benchmark_iterating.json

| metric | read 5000 | read 50000 | read_batch 50000 10 | read_batch 50000 100 | read_batch 50000 1000 | read_formatted numpy 5000 | read_formatted pandas 5000 | read_formatted tensorflow 5000 | read_formatted torch 5000 | read_formatted_batch numpy 5000 10 | read_formatted_batch numpy 5000 1000 | shuffled read 5000 | shuffled read 50000 | shuffled read_batch 50000 10 | shuffled read_batch 50000 100 | shuffled read_batch 50000 1000 | shuffled read_formatted numpy 5000 | shuffled read_formatted_batch numpy 5000 10 | shuffled read_formatted_batch numpy 5000 1000 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| new / old (diff) | 0.634007 / 0.215209 (0.418798) | 6.477585 / 2.077655 (4.399930) | 2.516419 / 1.504120 (1.012299) | 2.194768 / 1.541195 (0.653573) | 2.169880 / 1.468490 (0.701390) | 0.760305 / 4.584777 (-3.824472) | 5.467882 / 3.745712 (1.722169) | 5.461289 / 5.269862 (0.191427) | 2.676912 / 4.565676 (-1.888765) | 0.105961 / 0.424275 (-0.318315) | 0.019452 / 0.007607 (0.011845) | 0.854526 / 0.226044 (0.628482) | 7.963010 / 2.268929 (5.694082) | 3.384027 / 55.444624 (-52.060598) | 2.636414 / 6.876477 (-4.240063) | 2.715414 / 2.142072 (0.573342) | 0.972615 / 4.805227 (-3.832613) | 0.206418 / 6.500664 (-6.294246) | 0.093807 / 0.075469 (0.018338) |

Benchmark: benchmark_map_filter.json

| metric | filter | map fast-tokenizer batched | map identity | map identity batched | map no-op batched | map no-op batched numpy | map no-op batched pandas | map no-op batched pytorch | map no-op batched tensorflow |
|---|---|---|---|---|---|---|---|---|---|
| new / old (diff) | 1.894166 / 1.841788 (0.052378) | 16.144658 / 8.074308 (8.070350) | 22.907185 / 10.191392 (12.715793) | 1.222991 / 0.680424 (0.542567) | 0.724475 / 0.534201 (0.190275) | 0.464621 / 0.579283 (-0.114662) | 0.575330 / 0.434364 (0.140967) | 0.340694 / 0.540337 (-0.199644) | 0.343095 / 1.386936 (-1.043841) |
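
Each benchmark cell above has the form `new / old (diff)`. For readers who want to post-process these numbers, here is a minimal sketch of a parser for one cell. It assumes that `diff` is `new - old` (the report itself does not say so, but the printed values are consistent with it) and that each number is rounded independently, so the last digit can be off by one. The names `parse_cell` and `is_consistent` are hypothetical helpers, not part of the benchmark tooling.

```python
import re

# One benchmark cell, e.g. "0.011947 / 0.011353 (0.000594)".
CELL = re.compile(r"(-?\d+\.\d+) / (-?\d+\.\d+) \((-?\d+\.\d+)\)")


def parse_cell(cell):
    """Return (new, old, diff) as floats for one 'new / old (diff)' cell."""
    match = CELL.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    new, old, diff = (float(group) for group in match.groups())
    return new, old, diff


def is_consistent(cell, tol=2e-6):
    """Check the assumed relation diff == new - old, up to rounding."""
    new, old, diff = parse_cell(cell)
    return abs((new - old) - diff) <= tol
```

For example, `parse_cell("0.226561 / 0.018006 (0.208555)")` yields the three values of the `get_batch_of_1024_random_rows` cell from the PyArrow==6.0.0 run.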

