make `read_table()` behave like `read.table()` #717

kevinushey · 2017-09-25T18:38:54Z

Right now, read_table() is (more or less) read_fwf() and read_table2() is read_delimited(); however, most R users expect read_table() to behave like R's own read.table() and to accept a whitespace-delimited file.

IMHO read_table() should just read whitespace-delimited files, and read_table2() shouldn't exist. Users who actually want to read fixed-width files should use read_fwf().

This likely wouldn't break most usages since files readable by read_table() should also be accepted by read_table2().
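
For reference, here is a minimal sketch of the base R behavior described above. It uses only `read.table()` from base R, so it assumes nothing about any particular readr version, and the sample data is made up for illustration:

```r
# Made-up whitespace-delimited input; the number of spaces between
# fields varies from row to row, so the columns do not line up.
txt <- "x   y  label
1 2.5    a
10    3 b"

# With the default sep = "", read.table() splits on any run of whitespace.
df <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

str(df)
```

The fields are split on runs of whitespace rather than on column positions, which is the behavior this issue proposes for read_table().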

jimhester · 2017-12-07T21:11:05Z

I agree this is confusing; however, this change may break existing behavior silently by allowing inputs which previously failed to be read. Maybe we will decide this is a worthwhile trade-off for simplicity's sake in the future.

alistaire47 · 2018-01-16T21:03:01Z

I'd suggest this change is worth consideration. In personal use, I find read_table is far too strict for what I have to throw at it; for the worse stuff, read_table2 is still too strict. In many cases read_fwf is actually easier, when usable. To put some data behind my experience, searching GitHub returns

for a total of 774 files with read_table and 34,543 with read_csv, for a ratio of read_table/read_csv of 0.022406

For R as a whole,

56,517 R and 5,970 Rmd files for "read.table" (which due to inability to limit the search includes read_table as well)
373,612 R and 101,226 Rmd files for "read.csv" (with the same caveat)

for a total of 62,487 files with read.table and 474,838 with read.csv, for a ratio of read.table/read.csv of 0.131596.
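
As a quick check of the arithmetic, the totals and ratios can be recomputed from the counts quoted above. This is only a sketch: the variable names are made up for illustration, and the numbers are copied from this comment rather than re-queried from GitHub:

```r
# Counts copied from the search results quoted above.
readr_table <- 774
readr_csv   <- 34543
base_table  <- 56517 + 5970      # R files + Rmd files matching "read.table"
base_csv    <- 373612 + 101226   # R files + Rmd files matching "read.csv"

readr_ratio <- readr_table / readr_csv
base_ratio  <- base_table / base_csv

# Show both ratios side by side, then how many times larger the
# R-wide ratio is than the readr one.
print(c(readr = readr_ratio, base = base_ratio))
print(base_ratio / readr_ratio)
```

The R-wide ratio comes out just under six times the readr one.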

Those ratios are non-negligibly different, which means that

readr users are more likely to have data in CSV form,
when it comes to whitespace-delimited files, they're using something else, or
nothing at all, because there's too much error in the data (the numbers do bounce around).

On an absolute scale, the read_table numbers are still relatively small, so while changes may break some code (though likely most would continue to work identically), for now it's not so much that everybody would freak out. Probably.

jennybc mentioned this issue Oct 26, 2017

Implicit vs explicit printing of problems #726

Closed

jimhester added the feature label (a feature request or enhancement) Dec 7, 2017

jimhester added this to the backlog milestone Nov 15, 2018

jimhester closed this as completed in 6e96a43 May 7, 2021

jennybc mentioned this issue Feb 5, 2022

issue with whitespace parsing in read_table #1118

Closed
