Feature request: Handle multiple, different delimiters in a file #956

Open
patrickboehnke opened this issue Dec 29, 2021 · 2 comments


@patrickboehnke

I frequently work with data that mixes several different delimiters. I would like to be able to store this data as a CSV file without first having to replace every delimiter with a single one. The enhancements I would like to see for CSV.File are:

  1. The 'delim' argument could also accept an Array of Char or String entries, each of which is used as a delimiter.
  2. The 'ignorerepeated' argument could be extended to a three-state option: 1) all repeated delimiters are ignored, 2) only repeated delimiters that are identical are ignored, and 3) every delimiter is treated as unique.
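
To make the three proposed states concrete, here is a minimal Python sketch of the splitting semantics, using only the standard `re` module. It is not CSV.jl code and it ignores quoting and escaping; the function name `split_multi` and the mode names `"all"`, `"same"`, and `"none"` are hypothetical labels for options 1 to 3 above.

```python
import re
from typing import List, Sequence


def split_multi(line: str, delims: Sequence[str], repeated: str = "none") -> List[str]:
    """Split `line` on any of `delims` (hypothetical helper, not a CSV.jl API).

    repeated:
      "all"  - any run of delimiters, mixed or not, counts as one separator
      "same" - only a run of one identical delimiter collapses
      "none" - every delimiter occurrence separates two fields
    """
    # Try longer delimiters first so a multi-character delimiter
    # is not shadowed by a shorter one that is its prefix.
    alts = [re.escape(d) for d in sorted(delims, key=len, reverse=True)]
    if repeated == "all":
        pattern = "(?:" + "|".join(alts) + ")+"
    elif repeated == "same":
        pattern = "|".join("(?:" + a + ")+" for a in alts)
    elif repeated == "none":
        pattern = "|".join(alts)
    else:
        raise ValueError(f"unknown mode: {repeated!r}")
    # Only non-capturing groups are used, so re.split returns just the fields.
    return re.split(pattern, line)


line = "a,,b;;c,;d"
print(split_multi(line, [",", ";"], "all"))   # ['a', 'b', 'c', 'd']
print(split_multi(line, [",", ";"], "same"))  # ['a', 'b', 'c', '', 'd']
print(split_multi(line, [",", ";"], "none"))  # ['a', '', 'b', '', 'c', '', 'd']
```

The difference between the first two modes shows up at `,;`: with "all" the mixed pair is one separator, while with "same" it is two separators with an empty field between them.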

Thank you for your consideration!

@PallHaraldsson

Pandas allows regular expressions as delimiters, so this is something to consider for feature parity with it.

@ryofurue

As mentioned in the thread linked at the end, readdlm() is "deprecated" (actually, I don't know what that means in practice) and CSV.jl is recommended as an alternative.

Currently, however, CSV.jl isn't able to treat a run of consecutive space characters as a single delimiter, whereas readdlm() is. Because of this, you very often have to preprocess text input files before you can use them with CSV.jl.
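
For illustration, the behavior being asked for (a run of spaces acting as one delimiter) can be sketched in a few lines of Python. This only shows the intended semantics and makes no claim about how readdlm() or CSV.jl work internally; the function name `read_whitespace_table` is hypothetical, and note that `str.split()` with no argument also treats tabs as whitespace.

```python
from typing import List


def read_whitespace_table(text: str) -> List[List[str]]:
    """Parse column-aligned text where runs of whitespace separate fields."""
    rows = []
    for line in text.splitlines():
        # split() with no argument collapses consecutive whitespace
        # and ignores leading and trailing whitespace.
        fields = line.split()
        if fields:  # skip blank lines
            rows.append(fields)
    return rows


sample = """x     y      z
1.0   20.5   3
10.25 2      300
"""
rows = read_whitespace_table(sample)
for row in rows:
    print(row)
```

Each row comes back with exactly three fields even though the number of spaces between columns varies from line to line.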

JuliaData/DelimitedFiles.jl#1
