Retry on openApplication error (darwin); More error logging #4
We could merge this. But we still don't have any theories for:
Any ideas on those?
tryOpen := func() error {
	out, err := exec.Command("/usr/bin/open", applicationPath).CombinedOutput()
	if err != nil {
		return fmt.Errorf("%s; %s", err, string(out))
Can the error message also say that open caused this problem too?
I dug a little, trying to find the source code for /usr/bin/open to see under what scenarios an exit status of 1 would result. I pulled all the source from https://opensource.apple.com/ and grepped for strings present in the man page, but was unable to find it. Just trying stuff out, I was able to get /usr/bin/open to return exit code 1 if:
I'm assuming there are other scenarios where open might exit with this code. Maybe it always returns 1 on error, or maybe it uses this code for file-related issues. If the application is already open, open is a no-op and returns exit code 0, so that doesn't seem to be the issue.

We do use SIGKILL on the app process. This feels like a race condition, or possibly flakiness arising from the OS (maybe Gatekeeper?) after a SIGKILL. I have seen Gatekeeper act weird in the past. Perhaps SIGKILL is too abrupt and we shouldn't be using it, especially for something like Electron, which has other helper processes. I don't know why this would appear now, other than that there have been lots of Electron patch releases lately, so maybe some timing changed.

I looked at Sparkle and it uses NSWorkspace openFile. Squirrel uses NSWorkspace launchApplicationAtURL. We could make our own open tool (or possibly use cgo) that uses these library calls, but I'm not sure it would give us anything.
We are also assuming that (Go's) os.Rename is atomic, but maybe we shouldn't be. A non-atomic rename could cause /usr/bin/open to fail with exit code 1 (file not found). "There are systems where it cannot be done atomically, therefore it is not atomic." "os: make Rename atomic on Windows"
Is a directory or a file being swapped? If a file, we can use
It's an (app bundle) directory.
I think it's saying that if it's backed by
I think the next step should be to run dtruss while reproing. It'll test our current hypothesis as well as letting us see the error returned from
I got dtruss going. It isn't capturing everything going on, but this looks like what we were looking for:
Good news: I'm now able to repro with a service that I built myself that includes this PR. Bad news: The three second delay doesn't help.
I tried sending p.Signal(syscall.SIGTERM) instead of p.Kill (SIGKILL). It changed the output but didn't help:
I'd guess that error -600 is because something's still held by the exiting old process.
I tried waiting 9 seconds instead of 3 seconds, same result.
We tried open()ing directly from the downloaded tempdir rather than moving the download into Applications and that worked.
I tried killing the kbfs process during the 9 second delay between open() attempts, in case its open process from the "old" app bundle was preventing our relaunch, but it didn't help.
Retrying open up to 10 times, with a one-second delay between attempts, worked after 6 retries. But I think launchd is the parent process, not the service? That's strange. @chrisnojima suggests trying with Gatekeeper turned off.
It might be a bug in Electron... we could try downgrading to an earlier version?
This is the diff that works for me:
I'm running my service outside of launchd (as far as launchd is concerned keybase.service is not running), so that might affect launchd's decisions.
Oh, I think I know the bug. It looks like the executable name changed from /Applications/Keybase.app/Contents/MacOS/Electron to /Applications/Keybase.app/Contents/MacOS/Keybase. This causes Gatekeeper to get confused.
I ran into this a few months ago... I'm checking now to see if an electron-packager change did this.
Agree, the filename changed between good build and bad build. Electron changed from v0.37.2 to v0.37.3. |
… name in app bundle See keybase/go-updater#4
…2583)
* Symlink Electron to resolve Gatekeeper issue with changing executable name in app bundle. See keybase/go-updater#4
* Vendoring updater changes from keybase/go-updater#4
Forgot to update the name of the launch command in slackbot
On OS X, retries open after 3 seconds if it fails. Adds more error logging.