Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trim some path interfaces #2

Open
wants to merge 69 commits into
base: lstree-with-history
Choose a base branch
from

Conversation

gsoltis
Copy link
Collaborator

@gsoltis gsoltis commented Jun 6, 2022

This is not comprehensive, but I didn't find code using the things I commented out, and it also let me update AbsoluteUnixPath.ToSystemPath to return a concrete type rather than an interface, which I think is more useful to callers.

bormanp and others added 30 commits June 6, 2022 20:42
csv reader/writer based on RFC 4180

R=rsc, mattn.jp, r, dchest
CC=golang-dev
https://golang.org/cl/4629085
manual changes in src/pkg/go/printer, src/cmd/gofix/signal_test.go
(cd src/cmd/gofix/testdata; gofmt -w *.in *.out)
(cd src/pkg/go/printer; gotest -update)
gofmt -w misc src

runs all tests

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4715041
Address the issue coalescing two records together when TrimLeadingSpace
is set to true.

The input

        a,b,
        c,d,e

Would result with a singled a,b,c,d,e record.
With TrailingComma set to true it should give two records.
With TrailingComma set to false it should be an error.

Fixes vercel#2366.

R=golang-dev, go.peter.90, r
CC=golang-dev
https://golang.org/cl/5284046
Nothing terribly interesting here.

R=golang-dev, r, borman
CC=golang-dev
https://golang.org/cl/5315043
…ackage renaming CL #1. This one merely moves the source; the import strings will be changed after the next weekly release.

The only edits are in Makefiles.

R=r, rsc
CC=golang-dev
https://golang.org/cl/5331060
Take advantage of the new terminating statement rule.

R=golang-dev, r, gri
CC=golang-dev
https://golang.org/cl/7712044
Original CL by rsc (11916045):

The motivation for disallowing them was RFC 4180 saying
"The last field in the record must not be followed by a comma."
I believe this is an admonition to CSV generators, not readers.
When reading, anything followed by a comma is not the last field.

Fixes vercel#5892.

R=golang-dev, rsc, r
CC=golang-dev
https://golang.org/cl/12294043
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
Currently parseRecord will always start with a nil
slice and then resize the slice on append. For input
with a fixed number of fields per record we can preallocate
the slice to avoid having to resize the slice.

This change implements this optimization by using
FieldsPerRecord as capacity if it's > 0 and also adds a
benchmark to better show the differences.

benchmark         old ns/op     new ns/op     delta
BenchmarkRead     19741         17909         -9.28%

benchmark         old allocs     new allocs     delta
BenchmarkRead     59             41             -30.51%

benchmark         old bytes     new bytes     delta
BenchmarkRead     6276          5844          -6.88%

Change-Id: I7c2abc9c80a23571369bcfcc99a8ffc474eae7ab
Reviewed-on: https://go-review.googlesource.com/8880
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Change-Id: I82edd9364e1b4634006f5e043202a69f383dcdbe
Reviewed-on: https://go-review.googlesource.com/10826
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Fixes #11050.

Change-Id: Ie5d16960a1f829af947d82a63fe414924cd02ff6
Reviewed-on: https://go-review.googlesource.com/10666
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
The documentation listing err == EOF can be confusing to newcomers
to the language who are looking for the relevant documentation for
that error.

Change-Id: I301885950d0e1d0fbdf3a1892fca86eac7a0c616
Reviewed-on: https://go-review.googlesource.com/15806
Reviewed-by: Andrew Gerrand <adg@golang.org>
Fixes #14464

Change-Id: Iafc21641cca7d35b7a5631cfc94742ee8e7d5042
Reviewed-on: https://go-review.googlesource.com/19861
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.

This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:

$ perl -i -npe 's,^(\s*// .+[a-z]\.)  +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.)  +([A-Z])')
$ go test go/doc -update

Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Change-Id: Ib0683f27b44e2f107cca7a8dcc01d230cbcd5700
Reviewed-on: https://go-review.googlesource.com/23404
Reviewed-by: Alan Donovan <adonovan@google.com>
The intent of this comment is to reduce the number of issues opened
against the package to add support for new kinds of CSV formats, such as
issues vercel#3150, vercel#8458, #12372, #12755.

Change-Id: I452c0b748e4ca9ebde3e6cea188bf7774372148e
Reviewed-on: https://go-review.googlesource.com/23401
Reviewed-by: Andrew Gerrand <adg@golang.org>
This patch updates the doc about comments whitespace for the
encoding/csv package to reflect that leading whitespace before
the hash will treat the line as not a comment.

Fixes #13775.

Change-Id: Ia468c75b242a487b4b2b4cd3d342bfb8e07720ba
Reviewed-on: https://go-review.googlesource.com/23302
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Benchmarks broken off from https://golang.org/cl/24723 and modified to
allocate less in the places we're not trying to measure.

Updates #16791

Change-Id: I508e4cfeac60322d56f1d71ff1912f6a6f183a63
Reviewed-on: https://go-review.googlesource.com/30357
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This commit changes parseRecord to allocate a single string per record,
instead of per field, by using indexes into the raw record.

Benchstat (done with f69991c17)

name                          old time/op    new time/op    delta
Read-8                          3.17µs ± 0%    2.78µs ± 1%  -12.35%  (p=0.016 n=4+5)
ReadWithFieldsPerRecord-8       3.18µs ± 1%    2.79µs ± 1%  -12.23%  (p=0.008 n=5+5)
ReadWithoutFieldsPerRecord-8    4.59µs ± 0%    2.77µs ± 0%  -39.58%  (p=0.016 n=4+5)
ReadLargeFields-8               57.0µs ± 0%    55.7µs ± 0%   -2.18%  (p=0.008 n=5+5)

name                          old alloc/op   new alloc/op   delta
Read-8                            660B ± 0%      664B ± 0%   +0.61%  (p=0.008 n=5+5)
ReadWithFieldsPerRecord-8         660B ± 0%      664B ± 0%   +0.61%  (p=0.008 n=5+5)
ReadWithoutFieldsPerRecord-8    1.14kB ± 0%    0.66kB ± 0%  -41.75%  (p=0.008 n=5+5)
ReadLargeFields-8               3.86kB ± 0%    3.94kB ± 0%   +1.86%  (p=0.008 n=5+5)

name                          old allocs/op  new allocs/op  delta
Read-8                            30.0 ± 0%      18.0 ± 0%  -40.00%  (p=0.008 n=5+5)
ReadWithFieldsPerRecord-8         30.0 ± 0%      18.0 ± 0%  -40.00%  (p=0.008 n=5+5)
ReadWithoutFieldsPerRecord-8      50.0 ± 0%      18.0 ± 0%  -64.00%  (p=0.008 n=5+5)
ReadLargeFields-8                 66.0 ± 0%      24.0 ± 0%  -63.64%  (p=0.008 n=5+5)

For a simple application that I wrote, which reads in a CSV file (via
ReadAll) and outputs the number of rows read (15857625 rows), this change
reduces the total time on my notebook from ~58 seconds to ~48 seconds.

This reduces time and allocations (bytes) each by ~6% for a real world
CSV file at work (~230000 rows, 13 colums).

Updates #16791

Change-Id: Ia07177c94624e55cdd3064a7d2751fb69322d3e4
Reviewed-on: https://go-review.googlesource.com/24723
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Fixes #17342.

Change-Id: I76af756d7aff464554c5564d444962a468d0eccc
Reviewed-on: https://go-review.googlesource.com/32172
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Quentin Smith <quentin@golang.org>
In many cases the records returned by Reader.Read will only be used between calls
to Read and become garbage once a new record is read. In this case, instead of
allocating a new slice on each call to Read, we can reuse the last allocated slice
for successive calls to avoid unnecessary allocations.

This change adds a new field ReuseRecord to the Reader struct to enable this reuse.

ReuseRecord is false by default to avoid breaking existing code which dependss on
the current behaviour.

I also added 4 new benchmarks, corresponding to the existing Read benchmarks, which
set ReuseRecord to true.

Benchstat on my local machine (old is ReuseRecord = false, new is ReuseRecord = true)

name                          old time/op    new time/op    delta
Read-8                          2.75µs ± 2%    1.88µs ± 1%  -31.52%  (p=0.000 n=14+15)
ReadWithFieldsPerRecord-8       2.75µs ± 0%    1.89µs ± 1%  -31.43%  (p=0.000 n=13+13)
ReadWithoutFieldsPerRecord-8    2.77µs ± 1%    1.88µs ± 1%  -32.06%  (p=0.000 n=15+15)
ReadLargeFields-8               55.4µs ± 1%    54.2µs ± 0%   -2.07%  (p=0.000 n=15+14)

name                          old alloc/op   new alloc/op   delta
Read-8                            664B ± 0%       24B ± 0%  -96.39%  (p=0.000 n=15+15)
ReadWithFieldsPerRecord-8         664B ± 0%       24B ± 0%  -96.39%  (p=0.000 n=15+15)
ReadWithoutFieldsPerRecord-8      664B ± 0%       24B ± 0%  -96.39%  (p=0.000 n=15+15)
ReadLargeFields-8               3.94kB ± 0%    2.98kB ± 0%  -24.39%  (p=0.000 n=15+15)

name                          old allocs/op  new allocs/op  delta
Read-8                            18.0 ± 0%       8.0 ± 0%  -55.56%  (p=0.000 n=15+15)
ReadWithFieldsPerRecord-8         18.0 ± 0%       8.0 ± 0%  -55.56%  (p=0.000 n=15+15)
ReadWithoutFieldsPerRecord-8      18.0 ± 0%       8.0 ± 0%  -55.56%  (p=0.000 n=15+15)
ReadLargeFields-8                 24.0 ± 0%      12.0 ± 0%  -50.00%  (p=0.000 n=15+15)

Fixes #19721

Change-Id: I79b14128bb9bb3465f53f40f93b1b528a9da6f58
Reviewed-on: https://go-review.googlesource.com/41730
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Errors returned by Reader contain the line where the Reader originally
encountered the error. This can be suboptimal since that line does not
always correspond with the line the current record/field started at.

This can easily happen with LazyQuotes as seen in #19019, but also
happens for example when a quoted fields has no closing quote and
the parser hits EOF before it finds another quote.

When this happens finding the erroneous field can be somewhat
complicated and time consuming, and in most cases it would be better to
report the line where the record started.

This change updates Reader to keep track of the line on which a record
begins and uses it for errors instead of the current line, making it
easier to find errors.

Although a user-visible change, this should have no impact on existing
code, since most users don't explicitly work with the line in the error
and probably already expect the new behaviour.

Updates #19019

Change-Id: Ic9bc70fad2651c69435d614d537e7a9266819b05
Reviewed-on: https://go-review.googlesource.com/52830
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The parser mistakenly assumed it could always fold \r\n into \n, which
is not true since a \r\n inside a quoted fields has no special meaning
and should be kept as is.

Fix this by not folding \r\n to \n inside quotes fields.

Fixes #21201

Change-Id: Ifebc302e49cf63e0a027ee90f088dbc050a2b7a6
Reviewed-on: https://go-review.googlesource.com/52810
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The Reader implementation is slow because it operates on a rune-by-rune
basis via bufio.Reader.ReadRune. We speed this up by operating on entire
lines that we read from bufio.Reader.ReadSlice.

In order to ensure that we read the full line, we augment ReadSlice
in our Reader.readLine method to automatically expand the slice if
bufio.ErrBufferFull is every hit.

This change happens to fix #19410 because it no longer relies on
rune-by-rune parsing and only searches for the relevant delimiter rune.

In order to keep column accounting simple and consistent, this change
reverts parts of CL 52830.

This CL is an alternative to CL 36270 and builds on some of the ideas
from that change by Diogo Pinela.

name                                     old time/op    new time/op    delta
Read-8                                   3.12µs ± 1%    2.54µs ± 2%  -18.76%   (p=0.000 n=10+9)
ReadWithFieldsPerRecord-8                3.12µs ± 1%    2.53µs ± 1%  -18.91%    (p=0.000 n=9+9)
ReadWithoutFieldsPerRecord-8             3.13µs ± 0%    2.57µs ± 3%  -18.07%  (p=0.000 n=10+10)
ReadLargeFields-8                        52.3µs ± 1%     5.3µs ± 2%  -89.93%   (p=0.000 n=10+9)
ReadReuseRecord-8                        2.05µs ± 1%    1.40µs ± 1%  -31.48%   (p=0.000 n=10+9)
ReadReuseRecordWithFieldsPerRecord-8     2.05µs ± 1%    1.41µs ± 0%  -31.03%   (p=0.000 n=10+9)
ReadReuseRecordWithoutFieldsPerRecord-8  2.06µs ± 1%    1.40µs ± 1%  -31.70%   (p=0.000 n=9+10)
ReadReuseRecordLargeFields-8             50.9µs ± 0%     4.1µs ± 3%  -92.01%  (p=0.000 n=10+10)

name                                     old alloc/op   new alloc/op
Read-8                                       664B ± 0%      664B ± 0%
ReadWithFieldsPerRecord-8                    664B ± 0%      664B ± 0%
ReadWithoutFieldsPerRecord-8                 664B ± 0%      664B ± 0%
ReadLargeFields-8                          3.94kB ± 0%    3.94kB ± 0%
ReadReuseRecord-8                           24.0B ± 0%     24.0B ± 0%
ReadReuseRecordWithFieldsPerRecord-8        24.0B ± 0%     24.0B ± 0%
ReadReuseRecordWithoutFieldsPerRecord-8     24.0B ± 0%     24.0B ± 0%
ReadReuseRecordLargeFields-8               2.98kB ± 0%    2.98kB ± 0%

name                                     old allocs/op  new allocs/op
Read-8                                       18.0 ± 0%      18.0 ± 0%
ReadWithFieldsPerRecord-8                    18.0 ± 0%      18.0 ± 0%
ReadWithoutFieldsPerRecord-8                 18.0 ± 0%      18.0 ± 0%
ReadLargeFields-8                            24.0 ± 0%      24.0 ± 0%
ReadReuseRecord-8                            8.00 ± 0%      8.00 ± 0%
ReadReuseRecordWithFieldsPerRecord-8         8.00 ± 0%      8.00 ± 0%
ReadReuseRecordWithoutFieldsPerRecord-8      8.00 ± 0%      8.00 ± 0%
ReadReuseRecordLargeFields-8                 12.0 ± 0%      12.0 ± 0%

Updates #22352
Updates #19019
Fixes #16791
Fixes #19410

Change-Id: I31c27cfcc56880e6abac262f36c947179b550bbf
Reviewed-on: https://go-review.googlesource.com/72150
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@nathanhammond
Copy link
Owner

I'll go through and see if, by removing things, I can get down to a concrete implementation at any call site in our codebase.

Worth noting: AbsoluteUnixPath.ToSystemPath is unsound and should not exist. Good catch. 😅

@nathanhammond nathanhammond force-pushed the lstree-with-history branch 4 times, most recently from 394bb1e to 24ee9e5 Compare June 23, 2022 16:19
nathanhammond pushed a commit that referenced this pull request Oct 25, 2022
nathanhammond pushed a commit that referenced this pull request Mar 8, 2023
### Description

Flag on builds with the Go sandwich

### Testing Instructions

The team manually verified builds from
https://github.com/arlyon/turbo/actions/runs/4355600953 which includes
this flag
nathanhammond pushed a commit that referenced this pull request Mar 13, 2023
Reverts vercel#4115

Found issues w/ alpine linux, vercel build container, and windows
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.