Skip to content
This repository has been archived by the owner on Sep 9, 2020. It is now read-only.

vendor directory writes: add counts to verbose logging, limits writers, abort on error #1043

Merged
merged 3 commits into from
Aug 30, 2017
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions internal/gps/identifier.go
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,10 @@ func (i ProjectIdentifier) errString() string {
return fmt.Sprintf("%s (from %s)", i.ProjectRoot, i.Source)
}

func (i ProjectIdentifier) String() string {
return i.errString()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any objections to this errString format becoming the canonical exported String method format? fmt.Sprintf("%s (from %s)", i.ProjectRoot, i.Source)

errString() has 36 usages, all of which are passed to formatting functions with %s, so I'd like to do a follow up PR to just absorb that method into this one and let them all implicitly call this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm yes, i like this. the individual properties are accessible already - callers can easily construct their own output if they so choose. 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(separate PR, though)

}

func (i ProjectIdentifier) normalize() ProjectIdentifier {
if i.Source == "" {
i.Source = string(i.ProjectRoot)
Expand Down
99 changes: 77 additions & 22 deletions internal/gps/result.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ import (
"log"
"os"
"path/filepath"
"runtime"
"sync"

"github.com/pkg/errors"
Expand Down Expand Up @@ -65,37 +66,91 @@ func WriteDepTree(basedir string, l Lock, sm SourceManager, sv bool, logger *log
return err
}

var wg sync.WaitGroup
errCh := make(chan error, len(l.Projects()))
lps := l.Projects()

type resp struct {
i int
err error
}
respCh := make(chan resp, len(lps))
writeCh := make(chan int, len(lps))
cancel := make(chan struct{})

for _, p := range l.Projects() {
wg.Add(1)
go func(p LockedProject) {
// Queue work.
for i := range lps {
writeCh <- i
}
close(writeCh)
// Launch writers.
writers := runtime.GOMAXPROCS(-1)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps an optimal number is some factor of GOMAXPROCS, or ultimately something configurable.
Related discussion on how this might be configurable: #1028 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure that GOMAXPROCS is the right thing, here - the goroutines themselves aren't actually doing work, but end up mostly in a sleep mode waiting for their spawned subprocesses. limiting it to GOMAXPROCS seems likely to be lower than we want.

i think i'd rather just pick an arbitrary number to begin with - let's say 16 - and see how we do.

if len(lps) < writers {
writers = len(lps)
}
var wg sync.WaitGroup
wg.Add(writers)
for i := 0; i < writers; i++ {
go func() {
defer wg.Done()
to := filepath.FromSlash(filepath.Join(basedir, string(p.Ident().ProjectRoot)))
logger.Printf("Writing out %s@%s", p.Ident().errString(), p.Version())

if err := sm.ExportProject(p.Ident(), p.Version(), to); err != nil {
errCh <- errors.Wrapf(err, "failed to export %s", p.Ident().ProjectRoot)
return
}
for i := range writeCh {
select {
case <-cancel:
return
default:
}

if sv {
err := filepath.Walk(to, stripVendor)
if err != nil {
errCh <- errors.Wrapf(err, "failed to strip vendor from %s", p.Ident().ProjectRoot)
p := lps[i]
to := filepath.FromSlash(filepath.Join(basedir, string(p.Ident().ProjectRoot)))

if err := sm.ExportProject(p.Ident(), p.Version(), to); err != nil {
respCh <- resp{i, errors.Wrapf(err, "failed to export %s", p.Ident().ProjectRoot)}
continue
}

if sv {
select {
case <-cancel:
return
default:
}

if err := filepath.Walk(to, stripVendor); err != nil {
respCh <- resp{i, errors.Wrapf(err, "failed to strip vendor from %s", p.Ident().ProjectRoot)}
continue
}
}

respCh <- resp{i, nil}
}
}(p)
}()
}
// Monitor writers
go func() {
wg.Wait()
close(respCh)
}()

// Log results and collect errors
var errs []error
var cnt int
for resp := range respCh {
cnt++
msg := "Wrote"
if resp.err != nil {
if len(errs) == 0 {
close(cancel)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to know to break out of the loop explicitly here. i believe there's a possibility of a double-close on the cancel channel:

  1. X and Y are all sent into the writeCh, and are all picked up by their own goroutine
  2. X finishes first and fails, indicates as much to the respCh; it terminates, decrementing the wg
  3. main goroutine receives the fail and closes cancel, logs result, returns to loop and waits for new value
  4. meanwhile, Y finishes and also fails, sends to respCh, and terminates, closing respCh from the waiter goroutine
  5. however, main goroutine is still waiting on the respCh and is already working before it receives second fail, panics on double-close

note that because the range loop will not terminate on a buffered channel until the chan is both closed AND empty, i believe this is guaranteed to happen if two workers simultaneously encounter an error.

Copy link
Collaborator Author

@jmank88 jmank88 Aug 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The close(cancel) is protected by len(errs) == 0 so that the channel is only closed on the first error. We could break and abort on the first error (and only report that single error), but since we want to gracefully block/wait/shutdown anyways, it seemed cleaner to inspect all responses and collect all errors (e.g. might be relevant to investigating the original error; we don't want to block/wait in silence then not log anything; etc...).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gah, you're right, i just breezed right by that check. ok 👍

}
errs = append(errs, resp.err)
msg = "Failed to write"
}
p := lps[resp.i]
logger.Printf("(%d/%d) %s %s@%s\n", cnt, len(lps), msg, p.Ident(), p.Version())
}

wg.Wait()
close(errCh)

if len(errCh) > 0 {
if len(errs) > 0 {
logger.Println("Failed to write dep tree. The following errors occurred:")
for err := range errCh {
logger.Println(" * ", err)
for i, err := range errs {
logger.Printf("(%d/%d) %s\n", i+1, len(errs), err)
}

removeAll(basedir)
Expand Down
5 changes: 3 additions & 2 deletions txn_writer.go
Original file line number Diff line number Diff line change
Expand Up @@ -455,8 +455,9 @@ func (sw *SafeWriter) PrintPreparedActions(output *log.Logger, verbose bool) err
if sw.writeVendor {
if verbose {
output.Printf("Would have written the following %d projects to the vendor directory:\n", len(sw.lock.Projects()))
for _, project := range sw.lock.Projects() {
output.Println(project)
lps := sw.lock.Projects()
for i, p := range lps {
output.Printf("(%d/%d) %s@%s\n", i+1, len(lps), p.Ident(), p.Version())
}
} else {
output.Printf("Would have written %d projects to the vendor directory.\n", len(sw.lock.Projects()))
Expand Down