
DelegatingLogger reports incorrect source lines when zap.AddCaller() option is used #1737

Closed · dobsonj opened this issue Dec 3, 2021 · 5 comments · Fixed by #1975
Labels: lifecycle/stale (Denotes an issue or PR has remained open with no activity and has become stale.)

@dobsonj (Member) commented on Dec 3, 2021

I noticed an issue with logging in controller-runtime when source line logging is enabled via the zap.AddCaller() option.

For context, I was debugging local-storage-operator, with the following options passed into SetLogger:

https://github.com/openshift/local-storage-operator/blob/8fc42cc8b990907c88a6da551dc85b55c2dc4417/main.go#L79-L87

	opts := zap.Options{
		Development: true,
		ZapOpts:     []zaplog.Option{zaplog.AddCaller()},
	}

	opts.BindFlags(flag.CommandLine)
	flag.Parse()

	ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts)))

During startup, it prints log messages like this, attributing them to controller/controller.go:221

2021-12-02T22:17:24.408Z        INFO    controller-runtime.manager.controller.localvolume       controller/controller.go:221    Starting EventSource    {"reconciler group": "local.storage.openshift.io", "reconciler kind": "LocalVolume", "source": "kind source: /, Kind="}
2021-12-02T22:17:24.408Z        INFO    controller-runtime.manager.controller.localvolume       controller/controller.go:221    Starting EventSource    {"reconciler group": "local.storage.openshift.io", "reconciler kind": "LocalVolume", "source": "kind source: /, Kind="}

The problem is that line 221 is not the line where the log call is actually made:

https://github.com/openshift/local-storage-operator/blob/8fc42cc8b990907c88a6da551dc85b55c2dc4417/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go#L221

Line 221 is in the parent function of the code that actually emits the message; the real call site is here:

https://github.com/openshift/local-storage-operator/blob/8fc42cc8b990907c88a6da551dc85b55c2dc4417/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go#L165

So I would expect the log messages to report controller/controller.go:165 instead of controller/controller.go:221 when the AddCaller option is present. This is one example, but the problem applies to other log messages as well.

The logr.Logger object ends up with callerSkip set to 2, so I added a debug.PrintStack() call to zap's AddCallerSkip() to find out where it is given a non-zero argument.
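The temporary instrumentation looked roughly like this (a sketch only: it assumes zap's optionFunc helper and needs runtime/debug imported in that file; the exact upstream body may differ):

	// vendor/go.uber.org/zap/options.go -- temporary debugging change only.
	// Requires "runtime/debug" to be imported in that file.
	func AddCallerSkip(skip int) Option {
		if skip != 0 {
			debug.PrintStack() // dump the stack of whoever adds caller skip
		}
		return optionFunc(func(log *Logger) {
			log.callerSkip += skip
		})
	}

It fired at two call sites that pass a non-zero skip: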

https://github.com/openshift/local-storage-operator/blob/8fc42cc8b990907c88a6da551dc85b55c2dc4417/vendor/sigs.k8s.io/controller-runtime/pkg/log/zap/zap.go#L234

goroutine 1 [running]:
runtime/debug.Stack(0x1ed08f0, 0xc0002fddd0, 0xc000590d00)
        /usr/lib/golang/src/runtime/debug/stack.go:24 +0x9f
runtime/debug.PrintStack()
        /usr/lib/golang/src/runtime/debug/stack.go:16 +0x25
go.uber.org/zap.AddCallerSkip(0x1, 0x1ec45e0, 0xc000010020)
        /go/src/github.com/openshift/local-storage-operator/vendor/go.uber.org/zap/options.go:110 +0x25
sigs.k8s.io/controller-runtime/pkg/log/zap.NewRaw(0xc00020bcf8, 0x1, 0x1, 0xc0004a8320)
        /go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/log/zap/zap.go:234 +0x106
sigs.k8s.io/controller-runtime/pkg/log/zap.New(0xc00020bcf8, 0x1, 0x1, 0x1, 0x0)
        /go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/log/zap/zap.go:43 +0x3f
main.main()
        /go/src/github.com/openshift/local-storage-operator/main.go:93 +0x3f4

https://github.com/openshift/local-storage-operator/blob/8fc42cc8b990907c88a6da551dc85b55c2dc4417/vendor/github.com/go-logr/zapr/zapr.go#L182

goroutine 1 [running]:
runtime/debug.Stack(0x6, 0x8, 0xc000339920)
        /usr/lib/golang/src/runtime/debug/stack.go:24 +0x9f
runtime/debug.PrintStack()
        /usr/lib/golang/src/runtime/debug/stack.go:16 +0x25
go.uber.org/zap.AddCallerSkip(0x1, 0xc00000ff50, 0x1ec5120)
        /go/src/github.com/openshift/local-storage-operator/vendor/go.uber.org/zap/options.go:110 +0x25
github.com/go-logr/zapr.newLoggerWithExtraSkip(0xc000339920, 0x1, 0x1, 0xc000339920)
        /go/src/github.com/openshift/local-storage-operator/vendor/github.com/go-logr/zapr/zapr.go:172 +0x2f
github.com/go-logr/zapr.NewLogger(...)
        /go/src/github.com/openshift/local-storage-operator/vendor/github.com/go-logr/zapr/zapr.go:182
sigs.k8s.io/controller-runtime/pkg/log/zap.New(0xc00020bcf8, 0x1, 0x1, 0x1, 0x0)
        /go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/log/zap/zap.go:43 +0x56
main.main()
        /go/src/github.com/openshift/local-storage-operator/main.go:93 +0x3f4

As a quick test, I tried reducing callerSkip to 1 by passing in zaplog.AddCallerSkip(-1):

	ZapOpts:     []zaplog.Option{zaplog.AddCaller(), zaplog.AddCallerSkip(-1)},

And that resulted in the correct line being logged:

2021-12-02T23:47:31.676Z        INFO    controller-runtime.manager.controller.localvolume  controller/controller.go:165     Starting EventSource    {"reconciler group": "local.storage.openshift.io", "reconciler kind": "LocalVolume", "source": "kind source: /, Kind="}
2021-12-02T23:47:31.676Z        INFO    controller-runtime.manager.controller.localvolume  controller/controller.go:165     Starting EventSource    {"reconciler group": "local.storage.openshift.io", "reconciler kind": "LocalVolume", "source": "kind source: /, Kind="}

As I understand it, callerSkip is being set to 2 on the assumption that we're using the wrapper functions here:

https://github.com/openshift/local-storage-operator/blob/8fc42cc8b990907c88a6da551dc85b55c2dc4417/vendor/sigs.k8s.io/controller-runtime/pkg/log/deleg.go#L127

If we use the DelegatingLogger functions, then callerSkip = 2 is correct.
But any time WithName() or WithValues() is called, it returns the underlying logr.Logger object:

https://github.com/openshift/local-storage-operator/blob/8fc42cc8b990907c88a6da551dc85b55c2dc4417/vendor/sigs.k8s.io/controller-runtime/pkg/log/deleg.go#L183

That means we are no longer going through the DelegatingLogger wrapper functions; we are calling the go-logr methods directly, without the extra wrapper frame. In that case callerSkip = 2 is one frame too many, so the parent function gets logged instead of the line containing the log statement.
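To make the two call paths concrete, here is a minimal reproducer sketch (my own example, not controller-runtime code; ctrl.Log is controller-runtime's package-level DelegatingLogger, and the comments simply restate the behaviour described above):

	package main

	import (
		"flag"

		zaplog "go.uber.org/zap"
		ctrl "sigs.k8s.io/controller-runtime"
		"sigs.k8s.io/controller-runtime/pkg/log/zap"
	)

	func main() {
		opts := zap.Options{
			Development: true,
			ZapOpts:     []zaplog.Option{zaplog.AddCaller()},
		}
		opts.BindFlags(flag.CommandLine)
		flag.Parse()
		ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts)))

		// Path 1: this call goes through the DelegatingLogger wrapper, so the
		// extra wrapper frame matches callerSkip = 2 and this line is reported.
		ctrl.Log.Info("via the DelegatingLogger wrapper")

		// Path 2: WithName() returns the underlying logr.Logger, the wrapper
		// frame disappears, and callerSkip = 2 points one frame too high.
		named := ctrl.Log.WithName("example")
		named.Info("via the underlying logr.Logger")
	}

Note that with the zaplog.AddCallerSkip(-1) workaround above, the second path reports the right line, but calls that do go through the DelegatingLogger wrapper would presumably then be attributed one frame too low (inside deleg.go).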

@wu0407 commented on Dec 31, 2021

Same issue with kubebuilder 3.2.0.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 31, 2022
@dobsonj (Member, Author) commented on Mar 31, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 31, 2022
@alculquicondor (Contributor)

cc @camilamacedo86

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 8, 2022