
fatal error: concurrent map iteration and map write #2456

Closed
zouwen opened this issue Jul 31, 2020 · 8 comments · Fixed by #2464
Comments

@zouwen

zouwen commented Jul 31, 2020

Describe the bug
When one promtail pushes log entries to another promtail configured with loki_push_api, it fails with an error like this:

fatal error: concurrent map iteration and map write

goroutine 252 [running]:
runtime.throw(0x3457643, 0x26)
        /usr/local/Cellar/go/1.14.2_1/libexec/src/runtime/panic.go:1116 +0x72 fp=0xc0005459f8 sp=0xc0005459c8 pc=0x10362d2
runtime.mapiternext(0xc000545b40)
        /usr/local/Cellar/go/1.14.2_1/libexec/src/runtime/map.go:853 +0x552 fp=0xc000545a78 sp=0xc0005459f8 pc=0x1011a22
github.com/prometheus/common/model.LabelSet.String(0xc000add380, 0xc000545cb2, 0xc000682d80)
        /Users/zouwen/go/src/github.com/grafana/loki/vendor/github.com/prometheus/common/model/labelset.go:134 +0xe4 fp=0xc000545bb0 sp=0xc000545a78 pc=0x123e8a4
github.com/grafana/loki/pkg/promtail/client.(*batch).add(0xc000aff830, 0x0, 0x0, 0xc000add380, 0xbfc1144fb2f07760, 0x16c8fade1, 0x4e96940, 0xc00024e0e0, 0xd2)
        /Users/zouwen/go/src/github.com/grafana/loki/pkg/promtail/client/batch.go:42 +0x58 fp=0xc000545c50 sp=0xc000545bb0 pc=0x187a108
github.com/grafana/loki/pkg/promtail/client.(*client).run(0xc00046c480)
        /Users/zouwen/go/src/github.com/grafana/loki/pkg/promtail/client/client.go:201 +0x595 fp=0xc000545fd8 sp=0xc000545c50 pc=0x187b635
runtime.goexit()
        /usr/local/Cellar/go/1.14.2_1/libexec/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc000545fe0 sp=0xc000545fd8 pc=0x1069051
created by github.com/grafana/loki/pkg/promtail/client.New
        /Users/zouwen/go/src/github.com/grafana/loki/pkg/promtail/client/client.go:147 +0x411
...
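
For reference, this class of fatal error is not specific to promtail: the Go runtime aborts whenever one goroutine iterates a map while another goroutine writes to it. A minimal, self-contained program (an illustration of the error class only, nothing to do with promtail's code) that usually produces the same message:

package main

// Illustration only: one goroutine ranges over a map while another writes to it.
// Running this typically aborts with
// "fatal error: concurrent map iteration and map write".
func main() {
	m := map[string]string{"job": "loki_push"}

	go func() {
		for {
			m["label"] = "value" // concurrent write
		}
	}()

	for {
		for range m { // concurrent iteration
		}
	}
}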

Environment:

  • go version go1.14.2 darwin/amd64

Screenshots, Promtail config, or terminal output
promtail1 config

server:
  http_listen_port: 9191
positions:
  filename: /tmp/positions/positions.yaml

clients:
  - url: http://127.0.0.1:3100/loki/api/v1/push

scrape_configs:
  - job_name: loki_push
    loki_push_api:
      server:
        http_listen_port: 9101

promtail2 config

server:
  http_listen_port: 9192
  grpc_listen_port: 9292
positions:
  filename: /tmp/positions/positions2.yaml

clients:
  - url: http://127.0.0.1:9101/loki/api/v1/push
    tenant_id: 111

scrape_configs:
  - job_name: loki_push
    static_configs:
      - targets:
        labels:
          __path__: /foo/bar.log

I think the problem may lie in the code shown below, from pkg/promtail/targets/lokipush/pushtarget.go:

		// Apply relabeling
		processed := relabel.Process(lb.Labels(), t.relabelConfig...)
		if processed == nil || len(processed) == 0 {
			w.WriteHeader(http.StatusNoContent)
			return
		}

		// Convert to model.LabelSet
		// todo: bug, fatal error: concurrent map iteration and map write
		filtered := model.LabelSet{}
		for i := range processed {
			if strings.HasPrefix(processed[i].Name, "__") {
				continue
			}
			filtered[model.LabelName(processed[i].Name)] = model.LabelValue(processed[i].Value)
		}

		for _, entry := range stream.Entries {
			var err error
			if t.config.KeepTimestamp {
				err = t.handler.Handle(filtered, entry.Timestamp, entry.Line)
			} else {
				err = t.handler.Handle(filtered, time.Now(), entry.Line)
			}

			if err != nil {
				lastErr = err
				continue
			}
		}
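
One possible mitigation (an illustration only, not necessarily the right fix) would be to give each consumer its own copy of the label set, so a write on the push-target side can never race with the iteration inside LabelSet.String on the client side. cloneLabelSet below is a hypothetical helper, not promtail code:

package main

import (
	"fmt"
	"sync"

	"github.com/prometheus/common/model"
)

// cloneLabelSet is a hypothetical helper: it copies a LabelSet so the copy can
// be iterated by one goroutine while the original keeps being written to.
func cloneLabelSet(ls model.LabelSet) model.LabelSet {
	out := make(model.LabelSet, len(ls))
	for name, value := range ls {
		out[name] = value
	}
	return out
}

func main() {
	filtered := model.LabelSet{"job": "loki_push"}

	var wg sync.WaitGroup
	wg.Add(1)
	// Consumer iterates its private copy, much like batch.add does via LabelSet.String().
	go func(ls model.LabelSet) {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = ls.String()
		}
	}(cloneLabelSet(filtered))

	// Producer keeps writing to the original map without racing the consumer.
	filtered["manufacturer"] = "acme"
	wg.Wait()
	fmt.Println(filtered)
}

The copy costs one small allocation per stream, but it removes any shared mutable map between the HTTP handler and the client goroutine.
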
@owen-d
Member

owen-d commented Aug 3, 2020

Thanks for the writeup. I'm guessing you're building off master, because I don't think we've yet made an official release with the promtail push api -- would you mind telling us which build (release or commit) you're using (loki --version)?

/cc @slim-bean

@slim-bean
Collaborator

Are you by any chance setting external labels? I don't see any in your config, but are you setting any from the command line?

@zouwen
Author

zouwen commented Aug 4, 2020

@owen-d Yes, I tested on the master branch; the last commit id is 03e9059ecb29f39b05dbba9cf278b2c95e64a612.

@zouwen
Author

zouwen commented Aug 4, 2020

@slim-bean I set the labels in the configuration file, not on the command line. The complete configuration is as follows:

server:
  http_listen_port: 9191
  grpc_listen_port: 0
positions:
  filename: /tmp/positions/positions.yaml

clients:
  - url: http://127.0.0.1:3100/loki/api/v1/push

scrape_configs:
  - job_name: loki_push
    pipeline_stages:
      - regex:
          expression: "\\[(?P<manufacturer>.*)\\]\\[(?P<AlarmObjName>.*)\\]\\[(?P<AlarmName>.*)\\]\\[(?P<AlarmReason>.*)\\]\\[(?P<AlarmTime>.*)\\]"
      - labels:
          manufacturer:
          AlarmObjName:
          AlarmName:
          AlarmReason:
          AlarmTime:
    loki_push_api:
      server:
        http_listen_port: 9101

@slim-bean
Collaborator

Is this easy to reproduce? Do you have multiple promtails sending to the promtail configured with loki_push_api?

@slim-bean
Collaborator

Also, I'm pretty sure I see the problem now. Seeing that you are doing label extraction on the promtail that is running loki_push_api (which should work just fine) made the bug a lot more obvious to me.

@slim-bean
Collaborator

@zouwen thanks again for reporting this and providing helpful details!

Would you be able to test again with the latest master and see if the issue is now gone?

@zouwen
Author

zouwen commented Aug 5, 2020

@slim-bean The code is running well now, thank you.
