Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CSV::TSV class for tab-separated values #319

Merged
merged 6 commits into from
Nov 25, 2024
Merged

Conversation

jsxs0
Copy link
Contributor

@jsxs0 jsxs0 commented Oct 28, 2024

Add TSV class for tab-separated files support

This PR adds a lightweight TSV class that provides first-class support for tab-separated files through a simple inheritance mechanism from the CSV class.

Implementation

The implementation has been simplified to only override the initialize method, which sets the default column separator to tab (\t). This minimalist approach maintains full compatibility with CSV while providing convenient TSV handling:

class TSV < CSV
  def initialize(data, **options)
    super(data, **({col_sep: "\t"}.merge(options)))
  end
end

Features

  • Inherits all functionality from CSV class
  • Uses tab as the default separator
  • Allows override of separator if needed
  • Maintains full interface compatibility with CSV
  • Zero additional maintenance overhead due to minimal implementation

Example Usage

# Read TSV file with default tab separator
TSV.read("data.tsv")

# Parse TSV string
TSV.parse("a\tb\tc")

# Generate TSV content
TSV.generate do |tsv|
  tsv << ["a", "b", "c"]
  tsv << [1, 2, 3]
end

# Override separator if needed (though you might want to use CSV directly in this case)
TSV.read("data.csv", col_sep: ",")

Motivation

This change is motivated by:

  • Providing better support for TSV files without adding complexity
  • Making tab-separated file handling more convenient through a dedicated class
  • Following the principle of least surprise - users expect TSV to use tabs by default

(#272)

Changes from Previous Version

The implementation has been significantly simplified:

  • Removed unnecessary method overrides
  • Now only overrides initialize to set default separator
  • Maintains full CSV functionality through inheritance
  • Added comprehensive test coverage

@kou
Copy link
Member

kou commented Oct 30, 2024

Hmm. Do we need to override all class methods? Can we just override TSV#initialize?

diff --git a/lib/csv.rb b/lib/csv.rb
index f6eb32a..8c71b4d 100644
--- a/lib/csv.rb
+++ b/lib/csv.rb
@@ -2979,6 +2979,12 @@ class CSV
   end
 end
 
+class TSV < CSV
+  def initialize(data, **options)
+    super(data, **({col_sep: "\t"}.merge(options)))
+  end
+end
+
 # Passes +args+ to CSV::instance.
 #
 #   CSV("CSV,data").read

BTW, is CSV::TSV intentional? (::TSV may be easy to use but it may break other library's API that defines ::TSV...)

@jsxs0
Copy link
Contributor Author

jsxs0 commented Oct 30, 2024

@kou Thanks for the feedback! Updated and pushed a new commit.

@kou
Copy link
Member

kou commented Nov 14, 2024

Could you add some tests for this change?
Could you update the PR description for the latest changes?

@jsxs0
Copy link
Contributor Author

jsxs0 commented Nov 14, 2024

Could you add some tests for this change? Could you update the PR description for the latest changes?

@kou Thanks, added tests and updated the description.

run-test.rb Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to change this file?
I think that we don't need to change this file but if you think that we need to change this file, could you open a separated PR?

@jsxs0 jsxs0 requested a review from kou November 14, 2024 06:40
test/csv/test_tsv.rb Outdated Show resolved Hide resolved
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
@jsxs0 jsxs0 requested a review from kou November 14, 2024 07:09
@kou
Copy link
Member

kou commented Nov 14, 2024

Did you run added tests on your local machine? Or did you enable GitHub Actions on your fork?

We need to add CSV:: to all TSV references.

@jsxs0
Copy link
Contributor Author

jsxs0 commented Nov 25, 2024

Did you run added tests on your local machine? Or did you enable GitHub Actions on your fork?

We need to add CSV:: to all TSV references.

Thanks. The fix was a straightforward namespace correction that aligns the test code with the actual implementation in lib/csv.rb.

@kou kou changed the title Add TSV class for tab-separated values Add CSV::TSV class for tab-separated values Nov 25, 2024
@kou kou merged commit 7b8c3ca into ruby:master Nov 25, 2024
46 checks passed
@kou
Copy link
Member

kou commented Nov 25, 2024

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants