
MongoDB Adaptor won't stop #265

Closed
cognusion opened this issue Feb 10, 2017 · 15 comments · Fixed by #280

@cognusion

Bug report

Using Go to create a transporter pipeline, with mongodb as the source and stdout as the sink, the mongo adaptor doesn't stop, even if pipeline.Stop() is explicitly called. The adaptor maintains its connections to the MongoDB servers, periodically emits ping and sync messages, etc. For a short-lived program this isn't a big deal, but these leaks are terminal for long-running processes.

System info:

  • Transporter version: latest Git
  • OS: Linux (latest)
  • DB version(s): 2.6, 3.2

What did you expect to happen?

The adaptor would stop, connections to mongo would be closed, and the program could carry on.

What actually happened?

The adaptor claimed to have stopped, but sessions persisted, goroutines leaked, etc. The "Goros: N" line is emitted every second and reports runtime.NumGoroutine() (the count is 7 before the transporter pipeline is built and run).

INFO[0008] iterating complete collection=thecollection
INFO[0008] Read completed db=thedb
INFO[0008] adaptor Start finished... path=source
INFO[0008] adaptor Stopping... path=source
INFO[0008] adaptor Stopped path=source
Goros: 21
Goros: 21
Goros: 21
Goros: 21
Goros: 21
Goros: 21
Goros: 21
Goros: 21
Goros: 21
Goros: 21
Goros: 21
Goros: 21
Goros: 21
Goros: 21
INFO[0023] Ping for mongodev1:27017 is 23 ms
Goros: 21
INFO[0023] Ping for mongodev2:27017 is 23 ms
Goros: 21
Goros: 21
Goros: 21
Goros: 21
Goros: 19
Goros: 19
Goros: 19
Goros: 19
Goros: 19
Goros: 19
Goros: 19
Goros: 19
Goros: 19
Goros: 19
INFO[0038] Ping for mongodev1:27017 is 23 ms
Goros: 19
INFO[0038] Ping for mongodev2:27017 is 24 ms
INFO[0038] SYNC Starting full topology synchronization...
INFO[0038] SYNC Processing mongodev1:27017...
INFO[0038] SYNC Processing mongodev2:27017...
INFO[0038] SYNC Synchronization was complete (got data from primary).
INFO[0038] SYNC Synchronization completed: 1 master(s) and 1 slave(s) alive.
Goros: 19
... etc etc...
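
A counter like the one in the log above is easy to reproduce with only the Go standard library. A minimal sketch of sampling runtime.NumGoroutine() on a one-second ticker, the way the report describes:

package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	// Print the live goroutine count once per second; a figure that never
	// returns to its pre-pipeline baseline suggests leaked goroutines.
	for range time.Tick(time.Second) {
		fmt.Printf("Goros: %d\n", runtime.NumGoroutine())
	}
}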

@jipperinbham
Contributor

@cognusion thanks for the detailed report. Would it be possible for you to test a mongo -> mongo pipeline? This would help me confirm whether it's an issue with the file adaptor.

@cognusion
Author

cognusion commented Feb 10, 2017

Yeah, it keeps both mongo sides open indefinitely:

INFO[0361] Ping for srcmongodev1:27017 is 24 ms
INFO[0361] Ping for srcmongodev2:27017 is 23 ms
INFO[0361] Ping for sinkmongodev1:27017 is 23 ms
INFO[0361] Ping for sinkmongodev2:27017 is 24 ms

I've looked through the adaptor, but I'm getting lost tracing where the mgo.Session is and isn't available. It seems like the client never closes the Session itself, only the pipe.

@jipperinbham
Contributor

OK, I'm curious whether you are using this as a library or a binary? The code that creates the initial session, as well as the code that copies it, is all here. Since mgo takes care of session re-use under the hood, it is a performance hit to constantly create and close new sessions, so we just create an initial one and then copy it.

I'll try and reproduce this locally.
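
For readers unfamiliar with the re-use pattern being described, here is a minimal sketch using gopkg.in/mgo.v2. The mgo calls (Dial, Copy, Ping, Close) are the library's real API; the URL and the loop are illustrative:

package main

import (
	"log"

	mgo "gopkg.in/mgo.v2"
)

func main() {
	// One initial session owns the connection pool and its background
	// ping/sync goroutines.
	root, err := mgo.Dial("mongodb://mongodev1:27017")
	if err != nil {
		log.Fatal(err)
	}
	defer root.Close() // skipping this leaves the pool (and its goroutines) alive

	for i := 0; i < 3; i++ {
		// Copies share the pool, which is much cheaper than re-dialing.
		s := root.Copy()
		if err := s.Ping(); err != nil {
			log.Println("ping failed:", err)
		}
		s.Close() // releases the copy's socket back to the pool
	}
}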

@cognusion
Author

Library: it's a daemonized queue listener that takes parameters from passed messages and then, based on the parameters, adds nodes, builds a pipeline, and does the right thing. It all works great, it just leaks awfully on the mongo side. I've only used the mongo, file, and postgresql adaptors so far.

@jipperinbham
Contributor

OK, this all makes sense then: we never call Close() on the initial session created, so the connection remains open when using this as a library right now.

@trinchan and I have been discussing some changes that should resolve this issue in the v0.2.0 release which is due this month.
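
The shape of the fix being described is for whatever owns the initial session to close it when the adaptor stops. A hypothetical sketch of that idea; client and Stop are illustrative names, not transporter's actual types:

package adaptor

import (
	mgo "gopkg.in/mgo.v2"
)

// client is an illustrative stand-in for an adaptor that owns the
// initial session created at connect time.
type client struct {
	session *mgo.Session
}

// Stop closes the initial session. Closing the last session that
// references the cluster also shuts down mgo's background ping/sync
// goroutines, which is what the logs above show outliving the pipeline.
func (c *client) Stop() {
	if c.session != nil {
		c.session.Close()
		c.session = nil
	}
}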

@cognusion
Author

Excellent! Even if it didn't happen on pipeline.Stop(), having it exposed somehow... pipeline.ReallyStopAllTheThings() or something would be awesome.

@jipperinbham
Contributor

@cognusion can you do a fresh pull from master? I just merged #267, which should take care of cleaning up those goroutines and the mgo session.

@cognusion
Author

At first glance, there's no change. I haven't dug into it at all, but will do that tonight/tomorrow. Thanks for the quick turnaround.

@jipperinbham
Contributor

OK, I've been able to confirm there's still a leaky connection in the code on master, and I have a fix locally that I will try to get pushed up soon.

@jipperinbham
Contributor

@cognusion can you test again? I merged in some changes earlier today that I'm hoping solved this issue.

@cognusion
Author

@jipperinbham FTW! The sessions definitely close when the pipeline is complete, and the goroutine count drops a little thereafter, and a little more over time, almost back to baseline. Only two ever-present goroutines from compose/transporter appear orphaned thereafter:

goroutine 116 [chan receive]:
github.com/compose/transporter/pkg/transporter.(*Pipeline).startErrorListener(0xc4201455c0, 0xc420145680)
	/home/m/gocode/src/github.com/compose/transporter/pkg/transporter/pipeline.go:133 +0x97
created by github.com/compose/transporter/pkg/transporter.NewPipeline
	/home/m/gocode/src/github.com/compose/transporter/pkg/transporter/pipeline.go:77 +0x209

goroutine 117 [chan receive]:
github.com/compose/transporter/pkg/transporter.(*Pipeline).startMetricsGatherer(0xc4201455c0)
	/home/m/gocode/src/github.com/compose/transporter/pkg/transporter/pipeline.go:149 +0x6b
created by github.com/compose/transporter/pkg/transporter.NewPipeline
	/home/m/gocode/src/github.com/compose/transporter/pkg/transporter/pipeline.go:78 +0x22b
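
Dumps like the two above can be produced with the standard library's goroutine profile; with debug=2 it prints a full stack for every live goroutine. A minimal sketch:

package main

import (
	"os"
	"runtime/pprof"
)

func main() {
	// Write every goroutine's stack trace to stderr. Calling this a little
	// while after pipeline.Stop() shows what is still running.
	pprof.Lookup("goroutine").WriteTo(os.Stderr, 2)
}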

@jipperinbham
Contributor

@cognusion thanks for testing and providing another detailed report, we'll take a look at those two and get them cleaned up soon.

@jipperinbham
Contributor

@cognusion this issue was autoclosed after I merged #280, but if you can test the latest from master, that would be great; I'm really hoping we've taken care of all the leaky goroutines.

@jipperinbham reopened this Feb 27, 2017
@cognusion
Author

@jipperinbham confirmed: after running a mongo-sourced pipeline, the mongo connections are closed, and dumping goroutines one minute later yields nothing directly from compose/transporter. Thanks again for the quick fixes on these!

@jipperinbham
Contributor

👍 thanks for helping us get this issue flushed out and tested.
