General --line-buffered support #336

Merged
merged 8 commits into eBay:master from line-buffered-part4
Mar 8, 2021

Conversation

jondegenhardt
Contributor

This PR completes basic support for line buffering in the toolkit. It is a follow-up to PRs #333, #334, and #335.

By default, the tools use block buffering: data is read and written in large blocks, which is significantly faster than reading and writing line by line. However, handling each line as soon as it becomes available is desirable when reading from a live input stream that produces input only occasionally.
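The difference between the two modes can be illustrated with a minimal Python sketch. This is not the toolkit's implementation; `CountingSink` and `copy_lines` are hypothetical names introduced only for illustration. The sketch shows that line buffering flushes the output after every line, while the default mode leaves flushing to the end.

```python
class CountingSink:
    """Hypothetical in-memory output that records how often it is flushed."""

    def __init__(self):
        self.chunks = []
        self.flush_count = 0

    def write(self, text):
        self.chunks.append(text)
        return len(text)

    def flush(self):
        self.flush_count += 1


def copy_lines(source, sink, line_buffered=False):
    """Copy lines from source to sink.

    With line_buffered=True, flush after every line so a downstream
    consumer sees each line as soon as it is produced. Otherwise,
    flush only once at the end.
    """
    for line in source:
        sink.write(line)
        if line_buffered:
            sink.flush()
    sink.flush()  # final flush in both modes
```

In this sketch both modes produce identical output; only the timing of the flushes differs, which is what matters for a live stream with occasional input.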

Most tools now support a --line-buffered option that switches to line buffering mode. Tools supporting this are: number-lines, tsv-append, tsv-filter, tsv-join, tsv-sample, tsv-select, tsv-uniq.

This PR also cleaned up some code related to header line processing and stdout flushing. This results in better error message handling in a few cases: error messages are more timely in Unix pipelines, and they are written only after all processed output has been flushed.
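The ordering described here (flush processed output first, then report the error) can be sketched as follows. This is an illustration under assumptions, not the toolkit's code; `BufferedSink` and `report_error` are hypothetical names.

```python
class BufferedSink:
    """Hypothetical buffered stream: text reaches `terminal` only on flush."""

    def __init__(self, terminal):
        self.terminal = terminal
        self.pending = []

    def write(self, text):
        self.pending.append(text)
        return len(text)

    def flush(self):
        if self.pending:
            self.terminal.append("".join(self.pending))
            self.pending.clear()


def report_error(out, err, message):
    """Flush all processed output before writing the error message."""
    out.flush()
    err.write(f"Error: {message}\n")
    err.flush()
```

Without the `out.flush()` call, output still sitting in the buffer could appear after the error message, or not at all if the program exits early.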

@jondegenhardt jondegenhardt merged commit 41ed15a into eBay:master Mar 8, 2021
@jondegenhardt jondegenhardt deleted the line-buffered-part4 branch March 8, 2021 05:02