Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUGS] data race in connection pool management #2876

Closed
Aden-Q opened this issue Jan 26, 2024 · 0 comments · Fixed by #2885
Closed

[BUGS] data race in connection pool management #2876

Aden-Q opened this issue Jan 26, 2024 · 0 comments · Fixed by #2885

Comments

@Aden-Q
Copy link

Aden-Q commented Jan 26, 2024

There is a data race issue in this newConn function. This function is not thread-safe.

func (p *ConnPool) newConn(ctx context.Context, pooled bool) (*Conn, error) {
if p.closed() {
return nil, ErrClosed
}
if p.cfg.MaxActiveConns > 0 && p.poolSize >= p.cfg.MaxActiveConns {
return nil, ErrPoolExhausted
}
cn, err := p.dialConn(ctx, pooled)
if err != nil {
return nil, err
}
p.connsMu.Lock()
defer p.connsMu.Unlock()
p.conns = append(p.conns, cn)
if pooled {
// If pool is full remove the cn on next Put.
if p.poolSize >= p.cfg.PoolSize {
cn.pooled = false
} else {
p.poolSize++
}
}
return cn, nil
}

There's a shared variable p.poolSize.

There is a read access to p.poolSize on line 171:

if p.cfg.MaxActiveConns > 0 && p.poolSize >= p.cfg.MaxActiveConns {

There is a write access to p.poolSize on line 189:

p.poolSize++

While the mutex enforced on line 180 can only make writes mutually exclusive:

p.connsMu.Lock()
defer p.connsMu.Unlock()

In such a situation, there's a data race:

goroutine #1 is reading p.poolSize (on line 171, it does not need to acquire the mutex lock)

goroutine #2 is writing to p.poolSize (on line 189, it needs to acquire the mutex lock in order to write)

This shared variable p.poolSize is concurrently accessed by two goroutines, one read and one write, resulting in a data race issue

Expected Behavior

There should not be a data race issue when the newConn function is called

Current Behavior

There is a data race issue

Possible Solution

There is a data race because the mutex does not protect the full scope of the critical section, but only part of it. We should run p.connsMu.Lock() and defer p.connsMu.Unlock() earlier than it is, or use a RLock to protect read to the shared variable p.poolSize while RWLock to protect write

Steps to Reproduce

It's a data race and concurrency issue and a little hard to reproduce. While I can provide some output from the CI pipeline:

WARNING: DATA RACE
Write at 0x00c0001143c0 by goroutine 110:
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).newConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool.go:189 +0x3b6
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).Get()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool.go:285 +0x1bb
  github.com/redis/go-redis/v9/internal/pool.(*StickyConnPool).Get()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool_sticky.go:73 +0x129
  github.com/redis/go-redis/v9.(*baseClient)._getConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:247 +0x74
  github.com/redis/go-redis/v9.(*baseClient).getConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:235 +0xc4
  github.com/redis/go-redis/v9.(*baseClient).withConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:358 +0x6b
  github.com/redis/go-redis/v9.(*baseClient)._process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:400 +0x1dd
  github.com/redis/go-redis/v9.(*baseClient).process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:382 +0xbb
  github.com/redis/go-redis/v9.(*baseClient).process-fm()
      <autogenerated>:1 +0x64
  github.com/redis/go-redis/v9.(*hooksMixin).processHook()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:172 +0x71
  github.com/redis/go-redis/v9.(*Conn).Process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:803 +0x3a
  github.com/redis/go-redis/v9.(*Conn).Process-fm()
      <autogenerated>:1 +0x64
  github.com/redis/go-redis/v9.statefulCmdable.Select()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/commands.go:287 +0x1d7

Previous read at 0x00c0001143c0 by goroutine 102:
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).newConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool.go:171 +0xce
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).Get()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool.go:285 +0x1bb
  github.com/redis/go-redis/v9/internal/pool.(*StickyConnPool).Get()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool_sticky.go:73 +0x129
  github.com/redis/go-redis/v9.(*baseClient)._getConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:247 +0x74
  github.com/redis/go-redis/v9.(*baseClient).getConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:235 +0xc4
  github.com/redis/go-redis/v9.(*baseClient).withConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:358 +0x6b
  github.com/redis/go-redis/v9.(*baseClient)._process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:400 +0x1dd
  github.com/redis/go-redis/v9.(*baseClient).process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:382 +0xbb
  github.com/redis/go-redis/v9.(*baseClient).process-fm()
      <autogenerated>:1 +0x64
  github.com/redis/go-redis/v9.(*hooksMixin).processHook()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:172 +0x71
  github.com/redis/go-redis/v9.(*Conn).Process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:803 +0x3a
  github.com/redis/go-redis/v9.(*Conn).Process-fm()
      <autogenerated>:1 +0x64
  github.com/redis/go-redis/v9.statefulCmdable.Select()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/commands.go:287 +0x1d7

Context (Environment)

We see a data race warning when we are running test with the -race option

Detailed Description

Please check the Possible Solution section

Possible Implementation

Please check the Possible Solution section

@Aden-Q Aden-Q changed the title Data race in connection pool management [BUGS] data race in connection pool management Feb 6, 2024
@ofekshenawa ofekshenawa linked a pull request Feb 14, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant