Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-persistent behavior on logging bblfsh parse errors #955

Closed
lwsanty opened this issue Aug 23, 2019 · 10 comments
Closed

Non-persistent behavior on logging bblfsh parse errors #955

lwsanty opened this issue Aug 23, 2019 · 10 comments
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@lwsanty
Copy link
Contributor

lwsanty commented Aug 23, 2019

Having test in bblfsh-mockups regression suite
One of it's sub-cases uses bblfsh mockup that returns such response

&v2.ParseResponse{Language: "go", Uast: bufData, Errors: nil},

where bufData = getUASTBytes("a")

func getUASTBytes(text string) ([]byte, error) {
	node := nodes.String(text)
	buf := bytes.NewBuffer(nil)
	if err := nodesproto.WriteTo(buf, node); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

expected gitbase output

time="2019-08-23T11:44:03+03:00" level=info msg="server started and listening on localhost:3306"
time="2019-08-23T11:44:04+03:00" level=info msg="NewConnection: client 1"
time="2019-08-23T11:44:04+03:00" level=info msg="audit trail" action=authentication address="127.0.0.1:58320" success=true system=audit user=root
time="2019-08-23T11:44:04+03:00" level=info msg="audit trail" action=authorization address="127.0.0.1:58320" connection_id=1 permission=read pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-08-23T11:44:04+03:00" level=warning msg="unable to parse the given blob using bblfsh: rpc error: code = Canceled desc = context canceled"
time="2019-08-23T11:44:04+03:00" level=info msg="audit trail" action=query address="127.0.0.1:58320" connection_id=1 duration=40.864078ms pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-08-23T11:44:04+03:00" level=info msg="ConnectionClosed: client 1"

the most crucial line to assert is

time="2019-08-23T11:44:04+03:00" level=warning msg="unable to parse the given blob using bblfsh: rpc error: code = Canceled desc = context canceled"

observed in 20% of this test runs the output does not contain parse error line

time="2019-08-23T11:44:11+03:00" level=info msg="server started and listening on localhost:3306"
time="2019-08-23T11:44:12+03:00" level=info msg="NewConnection: client 1"
time="2019-08-23T11:44:12+03:00" level=info msg="audit trail" action=authentication address="127.0.0.1:58326" success=true system=audit user=root
time="2019-08-23T11:44:12+03:00" level=info msg="audit trail" action=authorization address="127.0.0.1:58326" connection_id=1 permission=read pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-08-23T11:44:12+03:00" level=info msg="audit trail" action=query address="127.0.0.1:58326" connection_id=1 duration=48.694585ms pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-08-23T11:44:12+03:00" level=info msg="ConnectionClosed: client 1"

note test uses stdout+stderr capturer and at first it may be a delusion that capturer does not have time to capture this part of log, but it's not true, because:

  1. I have checked this test with different pre-delays(0.5-5 seconds) before capture
  2. imo this warning should appear during the query execution and afaik if ConnectionClosed: client 1 has appeared no additional query-related logs will be observed
@ajnavarro ajnavarro added the bug Something isn't working label Aug 23, 2019
@erizocosmico
Copy link
Contributor

Warn should be printed if it's a blob parse error:

logrus.Warn(err)

@erizocosmico
Copy link
Contributor

I've managed to reproduce this, but after much investigation I have no idea why this is happening, since the log is there. If anyone else wants to give this a try, feel free.

@agarciamontoro
Copy link
Contributor

I spent the day yesterday with this and no clue either. Could it be possible that the error comes from some weird race condition when capturing the output, @lwsanty? I'm just throwing ideas here...

@erizocosmico erizocosmico added the help wanted Extra attention is needed label Sep 27, 2019
@lwsanty
Copy link
Contributor Author

lwsanty commented Sep 27, 2019

@agarciamontoro
I did a couple additional checks for the main suspect of std capturer

  1. I ran tests with race detector flag several times
    result no races detected and still able to reproduce

  2. I removed capturer from the code and still could reproduce the issue

@lwsanty
Copy link
Contributor Author

lwsanty commented Sep 27, 2019

Update gitbase option --parallelism=1 heals data races, but bug is still reproduced and is reproduced constantly now


On the other hand, if gitbase is compiled with race detector then test is reproduced with race

[2019-09-27T17:15:01.168485901+03:00]  INFO =====> TestParseErrorWarnings
[2019-09-27T17:15:01.169418541+03:00]  INFO Preparing run
[2019-09-27T17:15:01.169521102+03:00]  INFO Downloading repositories
[2019-09-27T17:15:01.321752243+03:00]  INFO Preparing gitbase binaries
[2019-09-27T17:15:02.153010817+03:00]  INFO Binary for remote:debug-955 (6679fb0e3b54058dbf4d809b7b65067b90ef3104) already built
[2019-09-27T17:15:02.158484857+03:00]  INFO Running query query={ID: Name: Statements:[select file_path, uast(blob_content) name 
from refs natural 
join commit_files natural 
join blobs 
where LANGUAGE(file_path) = 'Go' 
limit 10]} version=remote:debug-955
[2019-09-27T17:15:02.159184868+03:00]  INFO Executing gitbase test
[2019-09-27T17:15:03.293164969+03:00]  INFO runQueryCtx -> defer
[2019-09-27T17:15:04.304478201+03:00]  INFO out: time="2019-09-27T17:15:02+03:00" level=info msg="squash tables rule is enabled"
time="2019-09-27T17:15:02+03:00" level=info msg="server started and listening on localhost:3306"
time="2019-09-27T17:15:03+03:00" level=info msg="NewConnection: client 1"
time="2019-09-27T17:15:03+03:00" level=info msg="audit trail" action=authentication address="127.0.0.1:43160" success=true system=audit user=root
time="2019-09-27T17:15:03+03:00" level=info msg="audit trail" action=authorization address="127.0.0.1:43160" connection_id=1 permission=read pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
==================
WARNING: DATA RACE
Write at 0x00c0000d7198 by goroutine 21:
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:269 +0x105
  github.com/src-d/go-mysql-server/sql/plan.(*limitIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/limit.go:77 +0x55
  github.com/src-d/go-mysql-server/sql.(*spanIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/session.go:416 +0x71
  github.com/src-d/go-mysql-server/sql/plan.(*trackedRowIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/process.go:142 +0x73
  github.com/src-d/go-mysql-server/server.(*Handler).ComQuery()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/server/handler.go:280 +0xf15
  vitess.io/vitess/go/mysql.(*Conn).execQuery()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/conn.go:819 +0x20b
  vitess.io/vitess/go/mysql.(*Conn).handleNextCommand()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/conn.go:749 +0x1705
  vitess.io/vitess/go/mysql.(*Listener).handle()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/server.go:441 +0x13f8

Previous read at 0x00c0000d7198 by goroutine 27:
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).iterPartitions()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:182 +0x106

Goroutine 21 (running) created at:
  vitess.io/vitess/go/mysql.(*Listener).Accept()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/server.go:243 +0x18a
  github.com/src-d/gitbase/cmd/gitbase/command.(*Server).Execute()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/server/server.go:79 +0xe43
  github.com/jessevdk/go-flags.(*Parser).ParseArgs()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/jessevdk/go-flags@v1.4.0/parser.go:316 +0x110a
  main.main()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/jessevdk/go-flags@v1.4.0/parser.go:186 +0x8a6

Goroutine 27 (finished) created at:
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).start()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:135 +0xa0
==================
==================
WARNING: DATA RACE
Read at 0x00c000d34d40 by goroutine 21:
  github.com/src-d/go-borges/libraries.(*mergedRepoIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-borges@v0.1.3/libraries/iterator.go:57 +0x3e
  github.com/src-d/gitbase.(*repositoryPartitionIter).Close()
      /home/lwsanty/goproj/lwsanty/gitbase/partition.go:96 +0x7b
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:273 +0xcb
  github.com/src-d/go-mysql-server/sql/plan.(*limitIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/limit.go:77 +0x55
  github.com/src-d/go-mysql-server/sql.(*spanIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/session.go:416 +0x71
  github.com/src-d/go-mysql-server/sql/plan.(*trackedRowIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/process.go:142 +0x73
  github.com/src-d/go-mysql-server/server.(*Handler).ComQuery()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/server/handler.go:280 +0xf15
  vitess.io/vitess/go/mysql.(*Conn).execQuery()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/conn.go:819 +0x20b
  vitess.io/vitess/go/mysql.(*Conn).handleNextCommand()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/conn.go:749 +0x1705
  vitess.io/vitess/go/mysql.(*Listener).handle()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/server.go:441 +0x13f8

Previous write at 0x00c000d34d40 by goroutine 27:
  github.com/src-d/go-borges/libraries.(*mergedRepoIter).Next()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-borges@v0.1.3/libraries/iterator.go:37 +0x1ae
  github.com/src-d/gitbase.(*repositoryPartitionIter).Next()
      /home/lwsanty/goproj/lwsanty/gitbase/partition.go:79 +0x61
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).iterPartitions()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:187 +0x26f

Goroutine 21 (running) created at:
  vitess.io/vitess/go/mysql.(*Listener).Accept()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/server.go:243 +0x18a
  github.com/src-d/gitbase/cmd/gitbase/command.(*Server).Execute()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/server/server.go:79 +0xe43
  github.com/jessevdk/go-flags.(*Parser).ParseArgs()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/jessevdk/go-flags@v1.4.0/parser.go:316 +0x110a
  main.main()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/jessevdk/go-flags@v1.4.0/parser.go:186 +0x8a6

Goroutine 27 (finished) created at:
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).start()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:135 +0xa0
==================
time="2019-09-27T17:15:03+03:00" level=info msg="audit trail" action=query address="127.0.0.1:43160" connection_id=1 duration=128.906846ms pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="ConnectionClosed: client 1"

[2019-09-27T17:15:04.304984522+03:00]  INFO error occurred: repeats of "level=warning": exp: 1 act: 0
[2019-09-27T17:15:04.305168782+03:00]  INFO =====> done
[2019-09-27T17:15:04.305304968+03:00]  INFO 1 tests have failed:
[TestParseErrorWarnings]
exit status 1

lwsanty added a commit that referenced this issue Oct 8, 2019
currently issue #955 forces pipeline to fail and jenkins plot is not built
this PR replaces stages so plot should be displayed before test has failed

also related to src-d/regression-gitbase#25

Signed-off-by: lwsanty <lwsanty@gmail.com>
@lwsanty
Copy link
Contributor Author

lwsanty commented Oct 11, 2019

have one more case, certainly related to this one, but having a slightly different behavior

the other test sub-case that returns mocked response

&v2.ParseResponse{Language: "go", Uast: bufData, Errors: []*v2.ParseError{{"only one parse error"}}},

is still dynamically failing in the same way to the previous test, however if gitbase runs with parallelism=1 flag this test is PASSed in 100 runs out of 100, contrasting to the previous case that fails in 100% of cases when this flag is set

@lwsanty
Copy link
Contributor Author

lwsanty commented Oct 11, 2019

couple of updates here

  1. Non-persistent behavior on logging bblfsh parse errors #955 (comment) is completely fixed with the new version of go-mysql-server
  2. I had a talk with @dennwc and figured out that mocked response from Non-persistent behavior on logging bblfsh parse errors #955 (comment) is valid one so no errors should be returned, thus this test works as expected during parallelism=1 and fails with context canceled with default parameter.

I suspect that for concurrent case some of the requests cancels the context that forces other requests to fail

lwsanty added a commit that referenced this issue Oct 11, 2019
require to fix race cases #955 (comment)
from issue #955

Signed-off-by: lwsanty <lwsanty@gmail.com>
lwsanty added a commit to src-d/regression-gitbase that referenced this issue Oct 11, 2019
as discussed in src-d/gitbase#955 (comment)
nil slice of parse errors should not cause any error warnings

note, don't merge until src-d/gitbase#955 is fixed

Signed-off-by: lwsanty <lwsanty@gmail.com>
@lwsanty
Copy link
Contributor Author

lwsanty commented Oct 11, 2019

OK, I think I can minimize and change the bug's description
if parallelism != 1 and some query performed, for example

select file_path, uast(blob_content) name 
from refs natural 
join commit_files natural 
join blobs 
where LANGUAGE(file_path) = 'Go' 
limit 10

it causes context cancel, but it should not

@lwsanty
Copy link
Contributor Author

lwsanty commented Oct 15, 2019

I have done a small investigation according to my previous comment and wanted to mention on more version here.

IMO the problem of canceled context lies in the specification of go-mysql-server in parallelism mode with the queries that have a limit. Let's consider the cases:

  • during the debug with parallelism = 1 I figured out that getUAST was triggered 10 times(equal to the value of limit in a query) and all the evaluations completed without errors
  • during the run with parallelism > 1 and query's limit removed I figured out that getUAST was triggered in the amount of time that is equal to the number of files, all the evaluations completed without errors
  • during the run with parallelism > 1 and query's limit = 10 getUAST was triggered 11 times, in 10 calls it has been completed, on the 11th one - errored with context canceled, the cancel was invoked here ... So, if I understand the logic correctly if parallelism > 1 and the limit is set then more executions then rows may be started but as soon as the required amount is processed then context gets canceled.

@lwsanty
Copy link
Contributor Author

lwsanty commented Oct 17, 2019

As discussed with @erizocosmico exchange nodes do not know parent nodes which limit nodes + they work unsynced. At the same time this extra exchange routine does not affect anything harmfully as it fails with context cancel with time. Fixing it is probably time consuming.

Taking everything into account we agreed to keep this behavior as a feature. From my side I will update the test suite to ignore context canceled error.

@lwsanty lwsanty closed this as completed Oct 17, 2019
lwsanty added a commit to src-d/regression-gitbase that referenced this issue Oct 17, 2019
lwsanty added a commit to src-d/regression-gitbase that referenced this issue Oct 17, 2019
as discussed in src-d/gitbase#955 (comment)

Signed-off-by: lwsanty <lwsanty@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

4 participants