BUG extsort CSV MODE issues #2391
Comments
Happy New Year and thanks for the detailed report, @datatraveller1. However, I can't seem to reproduce your issue given the commands above:
As to the error in your Additional context:
That is because you sorted on the second column.
in connection with #2391 [skip ci]
Hi @jqnatividad Thank you very much and a happy new year, too! However, don't you get the io error: invalid record index message?
If you don't get the error, maybe it is an MS Windows issue with the
Hi @jqnatividad, I have now found out:
Hi @datatraveller1, Also, I'd be interested to know what generates the CSV that's causing
The csv crate is supposed to handle this transparently: https://docs.rs/csv/latest/csv/enum.Terminator.html
Hi @jqnatividad, I have attached the file with "Attach files": If this doesn't work, I think you can also simply use
I'm not sure what fails where. All that said, now I use for what I wanted to achieve (without the need of index and extsort):
Thanks for the sample file, @datatraveller1. I can now reproduce it and confirm it's an underflow bug. The large number should have tipped me off...
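For readers wondering why the number is a giveaway: 18446744073709551615 is `u64::MAX` (2^64 - 1), which is exactly what an unsigned 64-bit `0 - 1` wraps to. The sketch below only illustrates that arithmetic. The function names are hypothetical and this is not the actual qsv code; where the subtraction happens inside extsort is not stated in this thread.

```rust
/// Hypothetical helper: the "previous" position computed with wrapping
/// arithmetic, which is how an unchecked `position - 1` on a u64 behaves
/// when overflow checks are disabled (the default for release builds).
fn wrapped_previous(position: u64) -> u64 {
    position.wrapping_sub(1)
}

/// Hypothetical guarded variant: returns None instead of wrapping around.
fn checked_previous(position: u64) -> Option<u64> {
    position.checked_sub(1)
}

fn main() {
    // 0 - 1 wraps to u64::MAX, the index reported in the error message.
    println!("{}", wrapped_previous(0));
    // The checked form makes the underflow explicit.
    println!("{:?}", checked_previous(0));
    // A non-zero position simply yields the preceding index.
    println!("{:?}", checked_previous(16));
}
```

Using `checked_sub` (or `saturating_sub`) is one common way to guard against this class of bug; whether the eventual fix takes that form is not something this thread confirms.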
Describe the bug
I want to sort a CSV file with extsort in CSV MODE, but sometimes I either get the message
"io error: invalid record index 18446744073709551615 (there are 16 records)"
or the file is sorted incorrectly.
To Reproduce
Steps to reproduce the behavior:
Input file test_ids.csv:
Commands:
=>
io error: invalid record index 18446744073709551615 (there are 16 records)
Expected behavior
No error.
Desktop (please complete the following information):
Additional context
In other cases with big files the extsort command works, but
qsv dedup --select tc_id --sorted sorted.csv | qsv select tc_id -o out.csv
shows an error:
Aborting! Input not sorted! ByteRecord(["138" ... is greater than ByteRecord([" ...
=> extsort seems to sort incorrectly in these cases.
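As the comments note, the "Input not sorted!" message can also appear when the file was sorted on a different column than the one dedup selects. The following is a minimal sketch of such an adjacent-record check, assuming byte-wise comparison of the key column. The helper name and the sample rows are made up for illustration, and this is not qsv's implementation.

```rust
/// Hypothetical helper: returns the index of the first row whose key in
/// column `col` is smaller (byte-wise) than the key of the row before it,
/// or None if the rows are in non-decreasing order on that column.
fn first_unsorted_row(rows: &[Vec<&str>], col: usize) -> Option<usize> {
    rows.windows(2)
        .position(|pair| pair[0][col].as_bytes() > pair[1][col].as_bytes())
        .map(|i| i + 1)
}

fn main() {
    // Made-up rows (tc_id, name), sorted on the second column only.
    let rows = vec![
        vec!["138", "alpha"],
        vec!["12", "beta"],
        vec!["97", "gamma"],
    ];
    // Sorted on column 1, so no violation is found there.
    println!("{:?}", first_unsorted_row(&rows, 1));
    // Not sorted on column 0: "138" compares greater than "12" byte-wise.
    println!("{:?}", first_unsorted_row(&rows, 0));
}
```

A check like this passes only when the sort key and the dedup key are the same column, so it is worth confirming which column was sorted before concluding that the sort itself is wrong.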