Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Application can leave a zombie process on shutdown #678

Closed
Athanasius opened this issue Aug 28, 2020 · 21 comments
Closed

Application can leave a zombie process on shutdown #678

Athanasius opened this issue Aug 28, 2020 · 21 comments

Comments

@Athanasius
Copy link
Contributor

A user has observed this with 4.0.6.

Sometimes the application process hangs during shutdown. There's no indication of why in the log (i.e. no exception on shutdown).

4.0.6 does have timeouts set on a bunch of network calls. But perhaps we missed something in terms of that.

Preferably we need to catch this in a debugger to see what's actually happened.

@A-UNDERSCORE-D
Copy link
Contributor

As a start here I think we should add a log to all threads on exit, so we can narrow down which threads have not yet closed

@Athanasius
Copy link
Contributor Author

It would be useful to log the OS level thread IDs, so that if a user experiences this we can direct them to check Process Explorer for what OS level threads are still running.

Unfortunately the return from threading.get_ident() is not the OS level thread ID.

Support for getting the OS level thread ID is in Python 3.8: python/cpython#11993

We have blockers on moving to Python 3.8, but I don't want to put something hacky/fragile in to match the functionality, so this part of addressing this issue will have to wait for https://github.com/EDCD/EDMarketConnector/projects/6 to be completed.

@alterNERDtive
Copy link
Contributor

Since I’m randomly here right now and coming across this again … it’s still happening. It reproducibly happened every single time I closed EDMC (4.1.4) during the recent EDSM outage.

But no, ofc I did not save the logs :)

@Athanasius
Copy link
Contributor Author

Since I’m randomly here right now and coming across this again … it’s still happening. It reproducibly happened every single time I closed EDMC (4.1.4) during the recent EDSM outage.

But no, ofc I did not save the logs :)

Well, it's at least a pointer to one cause of it. Although it might be expected that a call initiated in the EDSM worker thread would block if the site isn't responding well, so this doesn't necessarily point to the 'normal' cause. Unless, of course, this is what it is all along. If this happens again give it 2 minutes if you can and see if the hung process is still there.

@bensode
Copy link

bensode commented Dec 15, 2020

As instructed via the release notes in 4.1.5, I am leaving a comment in this thread as the process was hung and wouldn't start (gave a popup that it was still running). What logs/info can I provide to help troubleshoot?

@Athanasius
Copy link
Contributor Author

Wow, that was quick !

The usual log files please, as per (from the bug report template):

  • Please attach BOTH log files as follows:
    1. %TEMP%\EDMarketConnector.log from immediately after the bug occurs (re-running the application overwrites this file).
    2. %TEMP%\EDMarketConnector\EDMarketConnector-debug.log. See Debug Log File. NB: If you don't have this log file then you're not running the latest version of the application and should update first to see if we already fixed the bug you're reporting.

Hopefully your log can at least point us to which part of the shutdown procedure is hanging. Thanks!

@bensode
Copy link

bensode commented Dec 15, 2020

Heh well I usually just leave it running all the time and I had the update notification popup just installed. Here are the raw logs. Hope that they help.
EDMarketConnector.log
EDMarketConnector.zip

@Athanasius
Copy link
Contributor Author

My surprise was more that someone had experienced the shutdown hang this soon, not that many people will have updated yet :) .

Indeed it looks like you didn't have the bug with 4.1.5. It needs to be with that version in order to have the extra logging that will point fingers. So do let us know if you have it again using 4.1.5.

@bensode
Copy link

bensode commented Dec 15, 2020

Indeed it looks like you didn't have the bug with 4.1.5. It needs to be with that version in order to have the extra logging that will point fingers. So do let us know if you have it again using 4.1.5.

Interesting ... did I misread that the persistent window was only in 4.1.5? Because once I started it, I was greeted with a button stating that it couldn't start since it was already running. No biggy though after I killed the process it fired up normally.

@Athanasius
Copy link
Contributor Author

Ah, but was that the 4.1.4 process still lingering ?

@bensode
Copy link

bensode commented Dec 15, 2020

Ah, but was that the 4.1.4 process still lingering ?

DOH! I will crawl back under my rock now ... :)

@Athanasius
Copy link
Contributor Author

To be clear, a log will only be relevant if it's from 4.1.5 and contains a line like:

2020-12-15 21:34:00.804 - INFO - EDMarketConnector.AppWindow.onexit:1066: Starting shutdown procedures...

as well as possibly a whole bunch of other lines with EDMarketConnector.AppWindow.onexit in that part of the line.

@alterNERDtive
Copy link
Contributor

alterNERDtive commented Jan 8, 2021

So I just got this on … whatever is current, I’ll edit when I’ve posted the log and killed+restarted it :)

I literally started EDMC to check sth, tried closing it … and it hangs.

EDMarketConnector-debug.log

Edit:

a) It’s 4.1.5.

b) Just happened AGAIN.

c) It’s nice to be able to click the X again, be notified that it hangs and able to kill it right there instead of having to go through the task manager :)

@alterNERDtive
Copy link
Contributor

Log with --trace:

EDMarketConnector-debug.log

Next up, no plugins.

@Athanasius
Copy link
Contributor Author

So, in your case, it looks like it's a plugin not responding properly to the stop notification.

However, I've had instances of this shutdown hang myself and I don't normally use any third-party plugins. So this could be a red herring related to one of your plugins (but we should still try to find out which one), rather than the overall cause.

@Athanasius
Copy link
Contributor Author

https://github.com/EDCD/EDMarketConnector/releases/tag/Release%2F4.1.6-rc2 has some changes to the shutdown code.

@alterNERDtive
Copy link
Contributor

So, in your case, it looks like it's a plugin not responding properly to the stop notification.

image

@inorton
Copy link
Contributor

inorton commented Jan 9, 2021

This could easily be someone who has written a plugin that creates a thread that doesn't have daemon=True set, python will keep running as long as there is at least one thread that has daemon=False which is the default value

@Athanasius
Copy link
Contributor Author

We might have finally found the cause of this, and it's not directly to do with threads (but the whole use of Tk event_generate() is because of using threads and needing to send messages in a thread-safe manner).

It seems that if you're in a Tk-bound function call (i.e. you specified the function as the target of a 'bind' for a button or event) and try to make a new Tk call in that function then it can deadlock in the Tk DLL.

Specifically the EDSM downtime meant that plugins/edsm.py couldn't send data. When its plugin_stop() is called it makes one last attempt to send queued data. When it fails it wants to set a status line message (that bottom-most line of the UI). This is done by generating a Tk event. That call causes the program to hang when it hits the self.tk.call() in Lib\tkinter\__init__.py:event_generate(). I couldn't manage to find any documentation about tk/tkinter bind and an admonition to not make new tk calls from withing the bound function, but it sure feels like this is a no-no.

So 4.1.6-rc3 now has a flag set when shutdown is initiated and anything that tries to call a Tk event_generate() will check this before proceding.

@Craig8944
Copy link

Only in the past week or so did I begin to experience this shutdown issue with version 4.1.5, no plugins installed, application hanging. 4.1.6-rc3 running and shut down dozen times today and all good. Awesome.

@Athanasius
Copy link
Contributor Author

As we've had no further reports about this since 4.1.6 was released I'm going to very quietly (ssssh, don't wake the sleeping bugs) close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants