All ingesters die with a panic message immediately after big query #787

Closed
HuippuJanne opened this issue Jul 19, 2019 · 1 comment · Fixed by #790

HuippuJanne commented Jul 19, 2019

Describe the bug
After querying the Loki datasource from Grafana (6.2.5) with a large query, all ingester pods die with a panic message. The querier used by Grafana (only one of the three) also panics, but it does not die.

To Reproduce
Steps to reproduce the behavior:

  1. Started Loki (consul 3 pods, ingester 5 pods, querier 3 pods, distributor 3 pods) (sha256@ 4540e21bed15d2a4fa4e7dadb3eb94da64d2236c5e49f1c80871c7e41b09d4ae, git commit da6a133)
  2. Started the fluentd plugin from the Loki repository (v0.1.0) as a DaemonSet, with a write rate of 3.4 TB a week.
  3. Query: {env="dev3"}

Expected behavior
Grafana would show part of the result (1000 lines). Instead, it fails with the message "Unknown error during query transaction. Please check JS console logs." and all ingester pods restart.

Environment:

  • Infrastructure: Kubernetes v1.12.10
  • Deployment tool: Helm v2.12.3

Screenshots, promtail config, or terminal output
Error message as shown by kubetail (all ingesters):

panic: runtime error: slice bounds out of range

goroutine 4535 [running]:
github.com/grafana/loki/pkg/chunkenc.(*bufferedIterator).moveNext(0xc0018e9200, 0xc005db8000, 0x127, 0xfe9e81, 0xc0018e9200, 0x224d8650)
	/go/src/github.com/grafana/loki/pkg/chunkenc/gzip.go:537 +0x4df
github.com/grafana/loki/pkg/chunkenc.(*bufferedIterator).Next(0xc0018e9200, 0xfe9e81)
	/go/src/github.com/grafana/loki/pkg/chunkenc/gzip.go:501 +0x2f
github.com/grafana/loki/pkg/iter.(*nonOverlappingIterator).Next(0xc000685e80, 0xc005db7680)
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:345 +0x119
github.com/grafana/loki/pkg/iter.(*timeRangedIterator).Next(0xc000685ec0, 0x224d8650)
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:404 +0x48
github.com/grafana/loki/pkg/iter.(*entryIteratorBackward).load(0xc004318840)
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:440 +0x4a
github.com/grafana/loki/pkg/iter.(*entryIteratorBackward).Next(0xc004318840, 0xc000162aa0)
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:449 +0x2f
github.com/grafana/loki/pkg/iter.(*nonOverlappingIterator).Next(0xc000685f00, 0xc000162aa0)
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:345 +0x119
github.com/grafana/loki/pkg/iter.(*heapIterator).requeue(0xc001c0c780, 0x154b020, 0xc000685f00, 0x0)
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:143 +0xbf
github.com/grafana/loki/pkg/iter.(*heapIterator).Next(0xc001c0c780, 0xc000d373a8)
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:174 +0xa3
github.com/grafana/loki/pkg/querier.ReadBatch(0x7f01f752f760, 0xc001c0c780, 0xc000000080, 0x42b0b2, 0xc0016a8670, 0xc000d375d8, 0x18)
	/go/src/github.com/grafana/loki/pkg/querier/querier.go:184 +0xc3
github.com/grafana/loki/pkg/ingester.sendBatches(0x7f01f752f760, 0xc001c0c780, 0x154f840, 0xc00008d140, 0xc0000003e8, 0x0, 0x0)
	/go/src/github.com/grafana/loki/pkg/ingester/instance.go:241 +0xa7
github.com/grafana/loki/pkg/ingester.(*instance).Query(0xc0001f2900, 0xc00009bb60, 0x154f840, 0xc00008d140, 0x0, 0x0)
	/go/src/github.com/grafana/loki/pkg/ingester/instance.go:130 +0x1d5
github.com/grafana/loki/pkg/ingester.(*Ingester).Query(0xc000182b60, 0xc00009bb60, 0x154f840, 0xc00008d140, 0xc000182b60, 0xc005249708)
	/go/src/github.com/grafana/loki/pkg/ingester/ingester.go:187 +0xbe
github.com/grafana/loki/pkg/logproto._Querier_Query_Handler(0x136aa20, 0xc000182b60, 0x154e8e0, 0xc000162700, 0xc000162700, 0xc005249788)
	/go/src/github.com/grafana/loki/pkg/logproto/logproto.pb.go:1404 +0x109
github.com/grafana/loki/vendor/github.com/grpc-ecosystem/go-grpc-middleware.ChainStreamServer.func1.1(0x136aa20, 0xc000182b60, 0x154e8e0, 0xc000162700, 0x1548d20, 0xc000bc1ec0)
	/go/src/github.com/grafana/loki/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:71 +0xf5
github.com/grafana/loki/pkg/loki.glob..func3(0x136aa20, 0xc000182b60, 0x154e2e0, 0xc0001626e0, 0xc000162620, 0xc0016a87d0, 0x12d7ac0, 0x1548d01)
	/go/src/github.com/grafana/loki/pkg/loki/fake_auth.go:29 +0xde
github.com/grafana/loki/vendor/github.com/grpc-ecosystem/go-grpc-middleware.ChainStreamServer.func1.1(0x136aa20, 0xc000182b60, 0x154e2e0, 0xc0001626e0, 0x1548d20, 0xc000bc1e90)
	/go/src/github.com/grafana/loki/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:74 +0x9c
github.com/grafana/loki/vendor/github.com/opentracing-contrib/go-grpc.OpenTracingStreamServerInterceptor.func1(0x136aa20, 0xc000182b60, 0x154e4c0, 0xc0000c0780, 0xc000162620, 0xc0016a87d0, 0x0, 0x0)
	/go/src/github.com/grafana/loki/vendor/github.com/opentracing-contrib/go-grpc/server.go:114 +0x336
github.com/grafana/loki/vendor/github.com/grpc-ecosystem/go-grpc-middleware.ChainStreamServer.func1.1(0x136aa20, 0xc000182b60, 0x154e4c0, 0xc0000c0780, 0x0, 0x1)
	/go/src/github.com/grafana/loki/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:74 +0x9c
github.com/grafana/loki/vendor/github.com/weaveworks/common/middleware.StreamServerInstrumentInterceptor.func1(0x136aa20, 0xc000182b60, 0x154e4c0, 0xc0000c0780, 0xc000162620, 0xc0016a87d0, 0x5d31bd22, 0x161606ce)
	/go/src/github.com/grafana/loki/vendor/github.com/weaveworks/common/middleware/grpc_instrumentation.go:36 +0x9d
github.com/grafana/loki/vendor/github.com/grpc-ecosystem/go-grpc-middleware.ChainStreamServer.func1.1(0x136aa20, 0xc000182b60, 0x154e4c0, 0xc0000c0780, 0xc001bda000, 0xc004f22b80)
	/go/src/github.com/grafana/loki/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:74 +0x9c
github.com/grafana/loki/vendor/github.com/weaveworks/common/middleware.GRPCServerLog.StreamServerInterceptor(0x1552ac0, 0xc000169400, 0x0, 0x136aa20, 0xc000182b60, 0x154e4c0, 0xc0000c0780, 0xc000162620, 0xc0016a87d0, 0xc000162601, ...)
	/go/src/github.com/grafana/loki/vendor/github.com/weaveworks/common/middleware/grpc_logging.go:48 +0x98
github.com/grafana/loki/vendor/github.com/weaveworks/common/middleware.GRPCServerLog.StreamServerInterceptor-fm(0x136aa20, 0xc000182b60, 0x154e4c0, 0xc0000c0780, 0xc000162620, 0xc0016a87d0, 0xc004f22c30, 0x40c4a8)
	/go/src/github.com/grafana/loki/vendor/github.com/weaveworks/common/server/server.go:135 +0x8a
github.com/grafana/loki/vendor/github.com/grpc-ecosystem/go-grpc-middleware.ChainStreamServer.func1(0x136aa20, 0xc000182b60, 0x154e4c0, 0xc0000c0780, 0xc000162620, 0x13ef520, 0x2, 0x770629)
	/go/src/github.com/grafana/loki/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:79 +0x14d
github.com/grafana/loki/vendor/google.golang.org/grpc.(*Server).processStreamingRPC(0xc00014f080, 0x1552360, 0xc00062ed80, 0xc000380200, 0xc00016b920, 0x20726a0, 0x0, 0x0, 0x0)
	/go/src/github.com/grafana/loki/vendor/google.golang.org/grpc/server.go:1209 +0x488
github.com/grafana/loki/vendor/google.golang.org/grpc.(*Server).handleStream(0xc00014f080, 0x1552360, 0xc00062ed80, 0xc000380200, 0x0)
	/go/src/github.com/grafana/loki/vendor/google.golang.org/grpc/server.go:1282 +0xd9b
github.com/grafana/loki/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc000ac5530, 0xc00014f080, 0x1552360, 0xc00062ed80, 0xc000380200)
	/go/src/github.com/grafana/loki/vendor/google.golang.org/grpc/server.go:717 +0x9f
created by github.com/grafana/loki/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
	/go/src/github.com/grafana/loki/vendor/google.golang.org/grpc/server.go:715 +0xa1
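
The trace above runs from the gRPC stream goroutine down to `moveNext` with no `recover` frame in between. grpc-go does not recover handler panics on its own, so an unrecovered panic like this one ends the whole process, which would explain the ingester restarts. A minimal sketch of the usual mitigation, turning a panic into an error at the handler boundary. `streamHandler` and `withRecovery` are hypothetical names for illustration, not Loki or gRPC APIs, and this says nothing about how #790 actually fixed the issue:

```go
package main

import "fmt"

// streamHandler is a hypothetical stand-in for a streaming RPC handler.
type streamHandler func() error

// withRecovery wraps a handler so that a panic inside it is returned
// as an error instead of crashing the process.
func withRecovery(next streamHandler) streamHandler {
	return func() (err error) {
		defer func() {
			if r := recover(); r != nil {
				err = fmt.Errorf("handler panicked: %v", r)
			}
		}()
		return next()
	}
}

func main() {
	h := withRecovery(func() error {
		var buf []byte
		n := 4
		_ = buf[:n] // out-of-range slice on an empty buffer, panics at runtime
		return nil
	})
	fmt.Println(h())
}
```

This only contains the damage; the out-of-range slice itself still has to be fixed where it happens.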

Loki ingester pods are started with the command:

/bin/loki -config.file=/etc/loki/loki.yaml -consul.hostname=loki-consul-server.logging.svc.cluster.local:8500 -log.level=debug -target=ingester 

loki.yaml as seen inside the pods:

auth_enabled: false
chunk_store_config:
  max_look_back_period: 0
ingester:
  chunk_block_size: 262144
  chunk_idle_period: 5m
  chunk_retain_period: 30s
  lifecycler:
    ring:
      kvstore:
        store: consul
      replication_factor: 1
limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
schema_config:
  configs:
  - from: 2018-04-15
    index:
      period: 0
      prefix: 'loki-dev3.our-domain.org'
    object_store: s3
    schema: v9
    store: dynamo
server:
  http_listen_port: 3100
storage_config:
  aws:
    dynamodbconfig:
      dynamodb: 'dynamodb://<hidden>:<hidden>@us-east-1'
    s3: 's3://<hidden>:<hidden>@us-east-1/loki-dev3.our-domain.org'
  boltdb:
    directory: /data/loki/index
  filesystem:
    directory: /data/loki/chunks
table_manager:
  retention_deletes_enabled: false
  retention_period: 0

Ingester resources in K8s:

        resources:
          limits:
            cpu: "2"
            memory: 15Gi
          requests:
            cpu: "1"
            memory: 10Gi

Querier component panic (kubectl logs):

goroutine 21917 [running]: 
net/http.(*conn).serve.func1(0xc000e9a000) 
	/usr/local/go/src/net/http/server.go:1746 +0xd0 
panic(0x11c81c0, 0x206a6f0) 
	/usr/local/go/src/runtime/panic.go:513 +0x1b9 
github.com/grafana/loki/pkg/chunkenc.(*bufferedIterator).moveNext(0xc0007bccf0, 0xc015e90a00, 0x11b, 0xfe9e81, 0xc0007bccf0, 0x11eac2f0) 
	/go/src/github.com/grafana/loki/pkg/chunkenc/gzip.go:537 +0x4df 
github.com/grafana/loki/pkg/chunkenc.(*bufferedIterator).Next(0xc0007bccf0, 0xc015e78fc0) 
	/go/src/github.com/grafana/loki/pkg/chunkenc/gzip.go:501 +0x2f 
github.com/grafana/loki/pkg/iter.(*nonOverlappingIterator).Next(0xc0010c7e40, 0xed4c3c9d2) 
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:345 +0x119 
github.com/grafana/loki/pkg/iter.(*timeRangedIterator).Next(0xc0010c7e80, 0x10e) 
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:411 +0xd1 
github.com/grafana/loki/pkg/iter.(*entryIteratorBackward).load(0xc001285c80) 
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:440 +0x4a 
github.com/grafana/loki/pkg/iter.(*entryIteratorBackward).Next(0xc001285c80, 0xc0070637e0) 
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:449 +0x2f 
github.com/grafana/loki/pkg/iter.(*nonOverlappingIterator).Next(0xc0010c7ec0, 0xc0070637e0) 
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:345 +0x119 
github.com/grafana/loki/pkg/iter.(*heapIterator).requeue(0xc001337040, 0x154b020, 0xc0010c7ec0, 0x100) 
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:143 +0xbf 
github.com/grafana/loki/pkg/iter.(*heapIterator).Next(0xc001337040, 0xc00956ab80) 
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:174 +0xa3 
github.com/grafana/loki/pkg/iter.(*heapIterator).requeue(0xc00bf94460, 0x7fac293cd088, 0xc001337040, 0x0) 
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:143 +0xbf 
github.com/grafana/loki/pkg/iter.(*heapIterator).Next(0xc00bf94460, 0x10) 
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:174 +0xa3 
github.com/grafana/loki/pkg/iter.(*heapIterator).requeue(0xc00bf94780, 0x7fac293cd088, 0xc00bf94460, 0x457300) 
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:143 +0xbf 
github.com/grafana/loki/pkg/iter.(*heapIterator).Next(0xc00bf94780, 0x104) 
	/go/src/github.com/grafana/loki/pkg/iter/iterator.go:174 +0xa3 
github.com/grafana/loki/pkg/querier.ReadBatch(0x7fac293cd088, 0xc00bf94780, 0xc0000003e8, 0x7fac293cd088, 0xc00bf94780, 0x6, 0x5) 
	/go/src/github.com/grafana/loki/pkg/querier/querier.go:184 +0xc3 
github.com/grafana/loki/pkg/querier.(*Querier).Query(0xc000694060, 0x1548d20, 0xc00116dc80, 0xc00399d860, 0x0, 0x0, 0x0) 
	/go/src/github.com/grafana/loki/pkg/querier/querier.go:127 +0x322 
github.com/grafana/loki/pkg/querier.(*Querier).QueryHandler(0xc000694060, 0x15446e0, 0xc000b91280, 0xc00036fc00) 
	/go/src/github.com/grafana/loki/pkg/querier/http.go:131 +0x1cf 
github.com/grafana/loki/pkg/querier.(*Querier).QueryHandler-fm(0x15446e0, 0xc000b91280, 0xc00036fc00) 
	/go/src/github.com/grafana/loki/pkg/loki/modules.go:145 +0x48 
net/http.HandlerFunc.ServeHTTP(0xc0006f4000, 0x15446e0, 0xc000b91280, 0xc00036fc00) 
	/usr/local/go/src/net/http/server.go:1964 +0x44 
github.com/grafana/loki/pkg/loki.glob..func1.1(0x15446e0, 0xc000b91280, 0xc00036fb00) 
	/go/src/github.com/grafana/loki/pkg/loki/fake_auth.go:18 +0x13b 
net/http.HandlerFunc.ServeHTTP(0xc0006f6000, 0x15446e0, 0xc000b91280, 0xc00036fb00) 
	/usr/local/go/src/net/http/server.go:1964 +0x44 
github.com/grafana/loki/vendor/github.com/gorilla/mux.(*Router).ServeHTTP(0xc0003fccb0, 0x15446e0, 0xc000b91280, 0xc00036fb00) 
	/go/src/github.com/grafana/loki/vendor/github.com/gorilla/mux/mux.go:162 +0xf1 
github.com/grafana/loki/vendor/github.com/weaveworks/common/middleware.Instrument.Wrap.func1(0x15446a0, 0xc008505b00, 0xc00036f900) 
	/go/src/github.com/grafana/loki/vendor/github.com/weaveworks/common/middleware/instrument.go:49 +0x177 
net/http.HandlerFunc.ServeHTTP(0xc0002a74d0, 0x15446a0, 0xc008505b00, 0xc00036f900) 
	/usr/local/go/src/net/http/server.go:1964 +0x44 
github.com/grafana/loki/vendor/github.com/weaveworks/common/middleware.Log.Wrap.func1(0x15484a0, 0xc00340e2d0, 0xc00036f900) 
	/go/src/github.com/grafana/loki/vendor/github.com/weaveworks/common/middleware/logging.go:41 +0x1a2 
net/http.HandlerFunc.ServeHTTP(0xc0002a7500, 0x15484a0, 0xc00340e2d0, 0xc00036f900) 
	/usr/local/go/src/net/http/server.go:1964 +0x44 
net/http.Handler.ServeHTTP-fm(0x15484a0, 0xc00340e2d0, 0xc00036f900) 
	/go/src/github.com/grafana/loki/vendor/github.com/prometheus/client_golang/prometheus/http.go:132 +0x4d 
github.com/grafana/loki/vendor/github.com/opentracing-contrib/go-stdlib/nethttp.MiddlewareFunc.func4(0x1547060, 0xc0002100e0, 0xc00036f700) 
	/go/src/github.com/grafana/loki/vendor/github.com/opentracing-contrib/go-stdlib/nethttp/server.go:118 +0x4af 
net/http.HandlerFunc.ServeHTTP(0xc0002a7560, 0x1547060, 0xc0002100e0, 0xc00036f700) 
	/usr/local/go/src/net/http/server.go:1964 +0x44 
github.com/grafana/loki/vendor/github.com/weaveworks/common/middleware.Tracer.Wrap.func2(0x1547060, 0xc0002100e0, 0xc00036f700) 
	/go/src/github.com/grafana/loki/vendor/github.com/weaveworks/common/middleware/http_tracing.go:39 +0x8f 
net/http.HandlerFunc.ServeHTTP(0xc0002a7590, 0x1547060, 0xc0002100e0, 0xc00036f700) 
	/usr/local/go/src/net/http/server.go:1964 +0x44 
net/http.serverHandler.ServeHTTP(0xc0000f1450, 0x1547060, 0xc0002100e0, 0xc00036f700) 
	/usr/local/go/src/net/http/server.go:2741 +0xab 
net/http.(*conn).serve(0xc000e9a000, 0x1548c60, 0xc001e40000) 
	/usr/local/go/src/net/http/server.go:1847 +0x646  
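
The first frame of this trace, `net/http.(*conn).serve.func1`, is the deferred function in which Go's HTTP server recovers a panicking handler: it logs the stack and closes that connection while the server keeps running. That matches the querier panicking without dying. A small self-contained sketch of that behaviour using `net/http/httptest`; the `/panic` and `/ok` paths and the function name are made up for the demonstration and have nothing to do with Loki's routes:

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
)

// handlerPanicDemo starts a test server, hits a handler that panics,
// then checks that the same server still answers a second request.
func handlerPanicDemo() (error, int, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/panic", func(w http.ResponseWriter, r *http.Request) {
		var buf []byte
		n := 8
		_ = buf[:n] // slice bounds out of range, recovered by net/http
	})
	mux.HandleFunc("/ok", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "still serving")
	})

	srv := httptest.NewUnstartedServer(mux)
	srv.Config.ErrorLog = log.New(io.Discard, "", 0) // silence the logged stack
	srv.Start()
	defer srv.Close()

	// The panicking request fails on the client side (connection closed).
	var panicErr error
	if resp, err := http.Get(srv.URL + "/panic"); err != nil {
		panicErr = err
	} else {
		resp.Body.Close()
	}

	// The server itself is still up and serves the next request.
	resp, err := http.Get(srv.URL + "/ok")
	if err != nil {
		return panicErr, 0, err
	}
	defer resp.Body.Close()
	return panicErr, resp.StatusCode, nil
}

func main() {
	panicErr, status, err := handlerPanicDemo()
	fmt.Println("panicking request error:", panicErr)
	fmt.Println("follow-up status:", status, "error:", err)
}
```
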
@cyriltovena cyriltovena self-assigned this Jul 19, 2019
cyriltovena commented

After investigation, I believe the bytes.Buffer does not cope well with large variations in log line length.
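
To illustrate the kind of failure this hypothesis points at, here is a minimal sketch. It is not Loki's code (`gzip.go` is not shown in this issue), and the length-prefixed layout, `encodeLines`, and `readLines` are assumptions made up for the example. A scratch buffer is reused across lines; if it is sliced to the next line's length without first checking its capacity, a line much longer than the previous ones triggers `slice bounds out of range`:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeLines writes each line as a uvarint length prefix followed by its bytes.
func encodeLines(lines ...string) []byte {
	var out []byte
	var tmp [binary.MaxVarintLen64]byte
	for _, l := range lines {
		n := binary.PutUvarint(tmp[:], uint64(len(l)))
		out = append(out, tmp[:n]...)
		out = append(out, l...)
	}
	return out
}

// readLines decodes the block, reusing one scratch buffer for every line.
// With grow=false the scratch buffer is sliced without a capacity check.
// A panic is recovered and reported as an error so the demo can continue.
func readLines(block, scratch []byte, grow bool) (lines []string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	for len(block) > 0 {
		size, n := binary.Uvarint(block)
		if n <= 0 {
			return lines, fmt.Errorf("bad length prefix")
		}
		block = block[n:]
		if uint64(len(block)) < size {
			return lines, fmt.Errorf("truncated line")
		}
		if grow && uint64(cap(scratch)) < size {
			scratch = make([]byte, size) // grow before slicing
		}
		line := scratch[:size] // panics when size > cap(scratch)
		copy(line, block[:size])
		block = block[size:]
		lines = append(lines, string(line))
	}
	return lines, nil
}

func main() {
	block := encodeLines("short", "a much longer log line than the scratch buffer")

	_, err := readLines(block, make([]byte, 8), false)
	fmt.Println("without growth:", err)

	lines, err := readLines(block, make([]byte, 8), true)
	fmt.Println("with growth:", len(lines), "lines, err:", err)
}
```

The capacity check before slicing is the whole difference between the two calls; workloads with uniform line lengths would never hit the unchecked path.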
