Implement SkipDir in powerwalk #4

Open
wants to merge 7 commits into
base: master
13 changes: 11 additions & 2 deletions README.md
@@ -2,12 +2,14 @@

Go package for walking files and concurrently calling user code to handle each file. This package walks the file system in the same way `filepath.Walk` does, except instead of calling the `walkFn` inline, it uses goroutines to allow the files to be handled concurrently.

Powerwalk works by walking many files concurrently. On Go versions lower than 1.5, you must tell the runtime to use multiple CPUs to realize any benefit from this approach. For example:

```
runtime.GOMAXPROCS(runtime.NumCPU())
```

In Go 1.5 and above this call isn't needed, because `GOMAXPROCS` defaults to the number of available CPUs.
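
Whether the manual call is still needed can be checked at run time. The following is a minimal sketch that relies on the documented behaviour of `runtime.GOMAXPROCS`: an argument below 1 reports the current setting without changing it. The helper name `currentProcs` is an assumption made for illustration and is not part of powerwalk.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentProcs reports the current GOMAXPROCS value.
// Hypothetical helper: passing 0 queries the setting without changing it.
func currentProcs() int {
	return runtime.GOMAXPROCS(0)
}

func main() {
	// On Go 1.5 and above the two numbers normally match unless the
	// GOMAXPROCS environment variable or an earlier call changed the setting.
	fmt.Printf("GOMAXPROCS=%d, NumCPU=%d\n", currentProcs(), runtime.NumCPU())
}
```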

## Usage

Powerwalk is a drop-in replacement for the `filepath.Walk` function ([read about that for more details](http://golang.org/pkg/path/filepath/#Walk)), so it has the same signature and uses the same `filepath.WalkFunc` type.
@@ -24,5 +26,12 @@ powerwalk.WalkLimit(root string, walkFn filepath.WalkFunc, limit int) error

The `WalkLimit` function does the same as `Walk`, except that it allows you to specify the number of files to walk concurrently using the `limit` argument. The `limit` argument must be one or higher (i.e. `>0`). Specifying a limit that's too high causes unnecessary overhead, so sensible numbers are encouraged but not enforced.
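
To illustrate what a concurrency limit means in practice, here is a self-contained sketch that uses only the standard library. It is not powerwalk's implementation: the helper `walkBounded` and its simplified callback type are assumptions made for this example. A buffered channel acts as a counting semaphore, so at most `limit` callbacks run at once.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// walkBounded walks root with filepath.Walk and runs fn for every entry in
// its own goroutine, allowing at most limit calls to run at the same time.
// Hypothetical helper for illustration; not part of powerwalk.
func walkBounded(root string, limit int, fn func(path string)) error {
	if limit < 1 {
		return fmt.Errorf("limit must be greater than zero, got %d", limit)
	}
	sem := make(chan struct{}, limit) // counting semaphore with limit slots
	var wg sync.WaitGroup
	err := filepath.Walk(root, func(p string, info os.FileInfo, err error) error {
		if err != nil {
			return err // stop the walk on the first error
		}
		sem <- struct{}{} // blocks while limit calls are already in flight
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when fn returns
			fn(p)
		}()
		return nil
	})
	wg.Wait() // wait for the callbacks that are still running
	return err
}

func main() {
	var mu sync.Mutex
	count := 0
	err := walkBounded(".", 4, func(p string) {
		mu.Lock()
		count++
		mu.Unlock()
	})
	fmt.Println("entries visited:", count, "error:", err)
}
```

A larger limit only helps while the callbacks spend time waiting on I/O or there are idle CPUs; beyond that point each extra goroutine adds scheduling overhead without adding throughput.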

### Omitting directories

To skip nested directories, call the `SkipDir` function with the paths of the directories to omit:

```
powerwalk.SkipDir(dirs ...string)
```
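
One thing to watch when deciding whether a path belongs to a skipped directory: a plain string-prefix comparison also matches sibling directories whose names merely start with the skipped path. The sketch below compares the two approaches. The helper `isUnder` is hypothetical and not part of powerwalk, and it does not special-case a filesystem root such as `/`.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isUnder reports whether path p is dir itself or lies beneath it.
// Hypothetical helper for illustration; not part of powerwalk.
func isUnder(p, dir string) bool {
	p, dir = filepath.Clean(p), filepath.Clean(dir)
	// Requiring a separator after dir keeps "dirToSkip2" from matching "dirToSkip".
	return p == dir || strings.HasPrefix(p, dir+string(filepath.Separator))
}

func main() {
	skip := "test_files/dirToSkip"
	paths := []string{
		"test_files/dirToSkip",
		"test_files/dirToSkip/browserHistory.txt",
		"test_files/dirToSkip2/file.txt",
		"test_files/other/file.txt",
	}
	for _, p := range paths {
		// Print both results side by side to show where they differ.
		fmt.Printf("%-42s prefix=%-5v isUnder=%v\n", p, strings.HasPrefix(p, skip), isUnder(p, skip))
	}
}
```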

See the [godoc documentation](http://godoc.org/github.com/stretchr/powerwalk) for more information.
13 changes: 13 additions & 0 deletions walker.go
@@ -4,6 +4,7 @@ import (
	"errors"
	"os"
	"path/filepath"
	"strings"
	"sync"
)

@@ -12,6 +13,13 @@ import (
// To use a value other than this one, use the WalkLimit function.
const DefaultConcurrentWalks int = 100

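// dirsToSkip holds the directory paths that are omitted from walks.
// It is package-level state shared by every walk in the process.
var dirsToSkip []string

// SkipDir sets the directories to omit from subsequent walks. Each call
// replaces the previous list; calling it with no arguments clears the list.
// Access is not synchronized, so do not call it while a walk is in progress.
func SkipDir(dirs ...string) {
	dirsToSkip = dirs
}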

// Walk walks the file tree rooted at root, calling walkFn for each file or
// directory in the tree, including root. All errors that arise visiting files
// and directories are filtered by walkFn. The output is non-deterministic.
@@ -92,6 +100,11 @@ func WalkLimit(root string, walkFn filepath.WalkFunc, limit int) error {
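	close(files)
	return errors.New("kill received while walking")
default:
	for _, d := range dirsToSkip {
		d = filepath.Clean(d)
		// Skip d itself and everything beneath it, but not a sibling
		// whose name merely starts with d (for example d + "2").
		if p == d || strings.HasPrefix(p, d+string(filepath.Separator)) {
			return nil
		}
	}
	filesWg.Add(1)
	select {
	case files <- &walkArgs{path: p, info: info, err: err}: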
37 changes: 37 additions & 0 deletions walker_test.go
@@ -278,5 +278,42 @@ func TestPowerWalkError(t *testing.T) {
assert.True(t, v, k)
}
}
}

func TestSkipDir(t *testing.T) {

	// max concurrency out
	runtime.GOMAXPROCS(runtime.NumCPU())

	dirToSkip := "test_files/dirToSkip"

	// make the usual test dirs plus one additional directory to skip
	makeTestFiles(5, 10)
	defer deleteTestFiles()
	if err := os.MkdirAll(dirToSkip, 0777); err != nil {
		t.Fatal(err)
	}
	if err := ioutil.WriteFile(fmt.Sprintf("%s/browserHistory.txt", dirToSkip), []byte("example.com"), 0644); err != nil {
		t.Fatal(err)
	}

	// declare directories to skip, and clear the package-level list
	// afterwards so that other tests are unaffected
	SkipDir(dirToSkip)
	defer SkipDir()

	var seenLock sync.Mutex
	seen := make(map[string]bool)
	walkFunc := func(p string, info os.FileInfo, err error) error {
		if err != nil {
			// info may be nil when err is set
			return err
		}
		if !info.IsDir() {
			filename := path.Base(p)
			seenLock.Lock()
			defer seenLock.Unlock()
			seen[filename] = true
		}
		return nil
	}

	assert.NoError(t, Walk(testFiles, walkFunc))

	// check that the file inside dirToSkip was omitted
	assert.False(t, seen["browserHistory.txt"])
}