Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data Race in Hooks #1233

Closed
Tiernan-Stapleton opened this issue Feb 17, 2021 · 14 comments
Closed

Data Race in Hooks #1233

Tiernan-Stapleton opened this issue Feb 17, 2021 · 14 comments

Comments

@Tiernan-Stapleton
Copy link

Tiernan-Stapleton commented Feb 17, 2021

I work with a service that uses logrus extensively.
Our test suite has encountered a race condition in the latest version of logrus (v1.7.1)

The test output is as follows:

$ go test -race ./...
==================
WARNING: DATA RACE
Write at 0x00c000012240 by goroutine 11:
  runtime.mapassign_fast32()
      /usr/local/go/src/runtime/map_fast32.go:92 +0x0
  github.com/sirupsen/logrus.LevelHooks.Add()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/hooks.go:20 +0x195
  github.com/sirupsen/logrus.(*Logger).AddHook()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/logger.go:369 +0xbe
  github.com/sirupsen/logrus.AddHook()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/exported.go:51 +0xaa
  github.com/sirupsen/logrus/hooks/test.NewGlobal()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/hooks/test/test.go:25 +0x74
  github.com/hpe-hcss/go-logrus-hook/pkg/main_test.TestLog.func1()
      /home/tiernan/code/go-logrus-hook/pkg/main/log_output_test.go:19 +0x59
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb

Previous read at 0x00c000012240 by goroutine 10:
  runtime.mapaccess1_fast32()
      /usr/local/go/src/runtime/map_fast32.go:12 +0x0
  github.com/sirupsen/logrus.LevelHooks.Fire()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/hooks.go:27 +0x50
  github.com/sirupsen/logrus.(*Entry).fireHooks()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/entry.go:270 +0xb6
  github.com/sirupsen/logrus.(*Entry).log()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/entry.go:247 +0x24a
  github.com/sirupsen/logrus.(*Entry).Log()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/entry.go:293 +0xba
  github.com/sirupsen/logrus.(*Logger).Log()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/logger.go:198 +0xa2
  github.com/sirupsen/logrus.(*Logger).Info()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/logger.go:220 +0x124
  github.com/sirupsen/logrus.Info()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/exported.go:109 +0xe1
  github.com/hpe-hcss/go-logrus-hook/pkg/main_test.TestLog.func1()
      /home/tiernan/code/go-logrus-hook/pkg/main/log_output_test.go:20 +0xab
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb

Goroutine 11 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1105 +0x700
  github.com/hpe-hcss/go-logrus-hook/pkg/main_test.TestLog()
      /home/tiernan/code/go-logrus-hook/pkg/main/log_output_test.go:16 +0x63
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb

Goroutine 10 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1105 +0x700
  github.com/hpe-hcss/go-logrus-hook/pkg/main_test.TestLog()
      /home/tiernan/code/go-logrus-hook/pkg/main/log_output_test.go:16 +0x63
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb
==================
==================
WARNING: DATA RACE
Write at 0x00c00019c088 by goroutine 11:
  github.com/sirupsen/logrus.LevelHooks.Add()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/hooks.go:20 +0x1aa
  github.com/sirupsen/logrus.(*Logger).AddHook()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/logger.go:369 +0xbe
  github.com/sirupsen/logrus.AddHook()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/exported.go:51 +0xaa
  github.com/sirupsen/logrus/hooks/test.NewGlobal()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/hooks/test/test.go:25 +0x74
  github.com/hpe-hcss/go-logrus-hook/pkg/main_test.TestLog.func1()
      /home/tiernan/code/go-logrus-hook/pkg/main/log_output_test.go:19 +0x59
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb

Previous read at 0x00c00019c088 by goroutine 10:
  github.com/sirupsen/logrus.LevelHooks.Fire()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/hooks.go:27 +0x63
  github.com/sirupsen/logrus.(*Entry).fireHooks()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/entry.go:270 +0xb6
  github.com/sirupsen/logrus.(*Entry).log()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/entry.go:247 +0x24a
  github.com/sirupsen/logrus.(*Entry).Log()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/entry.go:293 +0xba
  github.com/sirupsen/logrus.(*Logger).Log()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/logger.go:198 +0xa2
  github.com/sirupsen/logrus.(*Logger).Info()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/logger.go:220 +0x124
  github.com/sirupsen/logrus.Info()
      /home/tiernan/code/go-logrus-hook/vendor/github.com/sirupsen/logrus/exported.go:109 +0xe1
  github.com/hpe-hcss/go-logrus-hook/pkg/main_test.TestLog.func1()
      /home/tiernan/code/go-logrus-hook/pkg/main/log_output_test.go:20 +0xab
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb

Goroutine 11 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1105 +0x700
  github.com/hpe-hcss/go-logrus-hook/pkg/main_test.TestLog()
      /home/tiernan/code/go-logrus-hook/pkg/main/log_output_test.go:16 +0x63
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb

Goroutine 10 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1105 +0x700
  github.com/hpe-hcss/go-logrus-hook/pkg/main_test.TestLog()
      /home/tiernan/code/go-logrus-hook/pkg/main/log_output_test.go:16 +0x63
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb
==================
time="2021-02-17T12:27:53Z" level=info msg="Test 1"
time="2021-02-17T12:27:53Z" level=info msg="Test 0"
time="2021-02-17T12:27:53Z" level=info msg="Test 2"
--- FAIL: TestLog (0.00s)
    --- FAIL: TestLog/Test_1 (0.00s)
        testing.go:969: race detected during execution of test
    --- FAIL: TestLog/Test_0 (0.00s)
        testing.go:969: race detected during execution of test
    --- FAIL: TestLog/Test_2 (0.00s)
        testing.go:969: race detected during execution of test
FAIL
FAIL    github.com/hpe-hcss/go-logrus-hook/pkg/main     0.019s
FAIL

I've isolated a minimum test that reproduces the race condition

package main_test

import (
	"fmt"
	"testing"

	"github.com/sirupsen/logrus"
	"github.com/sirupsen/logrus/hooks/test"
)

func TestLog(t *testing.T) {
	t.Parallel()

	for i := 0; i < 3; i++ {
		testname := fmt.Sprintf("Test %d", i)
		t.Run(testname, func(t *testing.T) {
			t.Parallel()

			_ = test.NewGlobal()
			logrus.Info(testname)
		})
	}
}

The race condition appears to be arising from the removal of thread locking here https://github.com/sirupsen/logrus/pull/1229/files#diff-47281dd13b9d34dcb3eb865613c38b837aeef5e4c29287654d1989fa5e84ed98L258

@dgsb
Copy link
Collaborator

dgsb commented Feb 17, 2021

Indeed, adding hook concurrently while using the logger can led to a race.
We have a tradeoff here to not have a performance impact on regular race vs unusual usage.
We were globally locking the full logger and its all derivative entries for each traces.
What is your exact use case ?
Do you really need to concurrently configure your logger while using it ?

@dgsb
Copy link
Collaborator

dgsb commented Feb 17, 2021

We can find a non-performance impacting fix though.
But I would like to know a bit more about your use case anyway.

@Tiernan-Stapleton
Copy link
Author

We add a hook for each concurrent test, if a test fails, we read AllEntries() of that log hook and log them to error log.

@dgsb
Copy link
Collaborator

dgsb commented Feb 17, 2021

What I'm arguing about is the need to add concurrently the same hook on the global logger while firing at it at the same time from different threads.

We can do a fix which will both meets the performance needs and your use case, but I wonder how your test case is significant regarding a real usage of the logger.

Most of the time the pattern is:

  • initialize the logger from the main thread at the program startup
  • then use the logger to fire traces from several threads

@Tiernan-Stapleton
Copy link
Author

Tiernan-Stapleton commented Feb 17, 2021

I'll try to clarify our use case a bit more:

We run each test concurrently, on these test threads we add a new hook and do some logic to check for a failed test at the test cleanup stage. If the test failed, we get all the entries that were logged to that test thread's hook and log them to the testing framework's log.

This way we can easily see the debug log output of failed tests, while suppressing the logs of passed tests.

I appreciate that this usage pattern is out of the norm, but it greatly improves the test verbosity on our large test suite.

@dgsb
Copy link
Collaborator

dgsb commented Feb 18, 2021

@Tiernan-Stapleton you may want to try the last version on master, let me know if that fixes your issue.

@Tiernan-Stapleton
Copy link
Author

Tiernan-Stapleton commented Feb 18, 2021

Unfortunately still getting a data race on master:

$ go clean -testcache && go test -race race2_test.go 
==================
WARNING: DATA RACE
Write at 0x00c0001c8000 by goroutine 11:
  github.com/sirupsen/logrus.(*Logger).SetOutput()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/logger.go:388 +0x8e
  github.com/sirupsen/logrus.SetOutput()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/exported.go:20 +0xec
  command-line-arguments_test.RegisterOutputLogsOnFailure()
      /home/tiernan/go/src/github.com/hpe-hcss/logrus-data-race/race2_test.go:32 +0x91
  command-line-arguments_test.TestLog2.func1()
      /home/tiernan/go/src/github.com/hpe-hcss/logrus-data-race/race2_test.go:53 +0x66
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb

Previous read at 0x00c0001c8000 by goroutine 10:
  github.com/sirupsen/logrus.(*TextFormatter).init()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/text_formatter.go:103 +0x1ee
  github.com/sirupsen/logrus.(*TextFormatter).Format.func1()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/text_formatter.go:193 +0x4a
  sync.(*Once).doSlow()
      /usr/local/go/src/sync/once.go:66 +0x103
  sync.(*Once).Do()
      /usr/local/go/src/sync/once.go:57 +0x68
  github.com/sirupsen/logrus.(*TextFormatter).Format()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/text_formatter.go:193 +0x9b0
  github.com/sirupsen/logrus.(*Entry).write()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/entry.go:279 +0x9b
  github.com/sirupsen/logrus.(*Entry).log()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/entry.go:251 +0x371
  github.com/sirupsen/logrus.(*Entry).Log()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/entry.go:293 +0xba
  github.com/sirupsen/logrus.(*Entry).Logln()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/entry.go:383 +0x101
  github.com/sirupsen/logrus.(*Logger).Logln()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/logger.go:292 +0xa2
  github.com/sirupsen/logrus.(*Logger).Debugln()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/logger.go:302 +0xe0
  github.com/sirupsen/logrus.Debugln()
      /home/tiernan/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1-0.20210218074952-6cff360233dc/exported.go:234 +0x9d
  command-line-arguments_test.TestLog2.func1()
      /home/tiernan/go/src/github.com/hpe-hcss/logrus-data-race/race2_test.go:54 +0x67
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb

Goroutine 11 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1105 +0x700
  command-line-arguments_test.TestLog2()
      /home/tiernan/go/src/github.com/hpe-hcss/logrus-data-race/race2_test.go:50 +0x63
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb

Goroutine 10 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1105 +0x700
  command-line-arguments_test.TestLog2()
      /home/tiernan/go/src/github.com/hpe-hcss/logrus-data-race/race2_test.go:50 +0x63
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1054 +0x1eb
==================
--- FAIL: TestLog2 (0.00s)
    --- FAIL: TestLog2/Test_1 (0.00s)
        testing.go:846: time="2021-02-18T11:50:12Z" level=debug msg="Test 1"
             <nil>
        testing.go:846: time="2021-02-18T11:50:12Z" level=debug msg="Test 2"
             <nil>
        testing.go:969: race detected during execution of test
    --- FAIL: TestLog2/Test_2 (0.00s)
        testing.go:846: time="2021-02-18T11:50:12Z" level=debug msg="Test 2"
             <nil>
        testing.go:846: time="2021-02-18T11:50:12Z" level=debug msg="Test 0"
             <nil>
        testing.go:969: race detected during execution of test
    --- FAIL: TestLog2/Test_0 (0.00s)
        testing.go:846: time="2021-02-18T11:50:12Z" level=debug msg="Test 1"
             <nil>
        testing.go:846: time="2021-02-18T11:50:12Z" level=debug msg="Test 2"
             <nil>
        testing.go:846: time="2021-02-18T11:50:12Z" level=debug msg="Test 0"
             <nil>
        testing.go:969: race detected during execution of test
FAIL
FAIL    command-line-arguments  0.019s
FAIL

I've included a small code snippet of our usage in another test that reproduces the data race, it'll help to highlight our use case

package main_test

import (
	"fmt"
	"io/ioutil"
	"os"
	"testing"

	"github.com/sirupsen/logrus"
	"github.com/sirupsen/logrus/hooks/test"
)

func OutputLogsIfFailed(t *testing.T, logHook *test.Hook, previousLevel logrus.Level) {
	t.Helper()

	if t.Failed() {
		for _, entry := range logHook.AllEntries() {
			t.Log(entry.String())
		}
	}

	logHook.Reset()
	logrus.SetLevel(previousLevel)
	logrus.SetOutput(os.Stdout)
}

func RegisterOutputLogsOnFailure(t *testing.T) *test.Hook {
	t.Helper()

	logHook := test.NewGlobal()

	logrus.SetOutput(ioutil.Discard)

	previousLevel := logrus.GetLevel()
	logrus.SetLevel(logrus.DebugLevel)

	t.Cleanup(func() {
		t.Helper()
		OutputLogsIfFailed(t, logHook, previousLevel)
	})

	return logHook
}

func TestLog2(t *testing.T) {
	t.Parallel()

	for i := 0; i < 3; i++ {
		testname := fmt.Sprintf("Test %d", i)
		t.Run(testname, func(t *testing.T) {
			t.Parallel()

			RegisterOutputLogsOnFailure(t)
                        // rest of the test code...
			logrus.Debugln(testname)
		})
	}
}

@dgsb
Copy link
Collaborator

dgsb commented Feb 18, 2021

Event if we add a mutex to protect the Logger field, the outcome will not be predictable because you are modifying in different thread a global logger.

You should really use a different logger per thread to be consistent in your run.

@dgsb
Copy link
Collaborator

dgsb commented Feb 18, 2021

as a workaround you can force a trace a single time to ensure the text formatter is completely initialized before starting other test.

@dgsb
Copy link
Collaborator

dgsb commented Feb 18, 2021

the fix is quite easy for this one, the logger level mutex should be used there

f.isTerminal = checkIfTerminal(entry.Logger.Out)

@Tiernan-Stapleton
Copy link
Author

I'll take a look at changing to use a logger per thread as suggested, might save the two of us a bit of a headache

@Tiernan-Stapleton
Copy link
Author

@dgsb We've made changes on our end that work with your merged PR #1234. Would it be possible for you to make a new release with that change? Cheers

@Tiernan-Stapleton
Copy link
Author

@dgsb Any chance of a new release?

@dgsb
Copy link
Collaborator

dgsb commented Mar 9, 2021

@Tiernan-Stapleton new release available https://github.com/sirupsen/logrus/releases/tag/v1.8.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants