Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time drift in Moby VM and containers caused by system sleep #17

Closed
alanbrent opened this issue Jul 29, 2016 · 52 comments
Closed

Time drift in Moby VM and containers caused by system sleep #17

alanbrent opened this issue Jul 29, 2016 · 52 comments

Comments

@alanbrent
Copy link

alanbrent commented Jul 29, 2016

Expected behavior

When the Mac wakes from sleep, the Moby VM should resync its local system time with the Mac's system time.

Actual behavior

Moby VM time simply resumes counting from the time the Mac went to sleep. This results in strange and many times difficult to troubleshoot issues with software running in containers.

Information

Docker for Mac: version: win-beta-v1.12.0-rc5-beta21-73-g9cd1dc2
OS X: version 10.11.6 (build: 15G31)
logs: /tmp/1D6BCF46-49C8-45BC-86AC-EF873AB86DFB/20160730-113336.tar.gz
[OK]     docker-cli
[OK]     app
[OK]     moby-syslog
[OK]     disk
[OK]     virtualization
[OK]     system
[OK]     menubar
[OK]     osxfs
[OK]     db
[OK]     slirp
[OK]     moby-console
[OK]     logs
[OK]     vmnetd
[OK]     env
[OK]     moby
[OK]     driver.amd64-linux

Steps to reproduce the behavior

  1. put Mac to sleep, wait n seconds/minutes/hours
  2. wake Mac from sleep
  3. Issue date in container or Moby VM and see that it outputs $(date) - n
@alanbrent
Copy link
Author

I was able to reproduce this in the GA release, but it looks like it then caught up. I'll check on this when I leave work and return home, and if it's truly fixed then I'll close the issue.

@alanbrent
Copy link
Author

alanbrent commented Jul 30, 2016

The issue has cropped back up on my Moby VM. I put my Mac to sleep when I left for work, and it's currently running ~47min behind, and the same is the case for any containers I run.

(Mac Shell)->date
Fri Jul 29 19:35:25 PDT 2016
---
moby login: root
Welcome to the Moby alpha, based on Alpine Linux.
moby:~# date
Sat Jul 30 01:48:52 UTC 2016
---
(Mac Shell)->docker run --rm -it ruby bash
root@c9e297c5bef0:/# date
Sat Jul 30 01:49:48 UTC 2016

UPDATE: Moby VM is now 16h behind after the Mac was in sleep mode all night. This is definitely still a problem.

@ijc
Copy link
Contributor

ijc commented Aug 1, 2016

Hi @alanbrent,

The time should resync after a matter of minutes (if not seconds). For how long is it staying out of sync for you?

Do you have NTP on the OSX host? What does ntpq -p show before and after sleep?

It'd also be useful to see the equivalent (chronyc sources) from within the VM, to get a shell use docker run -it --privileged --pid=host alpine:3.4 nsenter -t 1 -m -n sh

There was supposed to be a section in the troubleshooting guide about NTP issues and possible workarounds, but it looks like it somehow got undone by a rebase somewhere along the line. The diff was:

 +* If system does not have access to an NTP server then after a hibernate the time seen by Docker for Mac may be considerably out of sync with the host. Furthermore the time may slowly drift out of sync during use. The time can be manually reset after hibernation by running:
 +
 +        docker run --rm --privileged alpine hwclock -s
 +
 +  Or alternatively to resolve both issues you may like to add the local clock as a low priority (high stratum) fallback NTP time source for the host by editing the host's `/etc/ntp-restrict.conf` to add:
 +
 +        server 127.127.1.1              # LCL, local clock
 +        fudge  127.127.1.1 stratum 12   # increase stratum
 +
 +  And then restarting the NTP service with:
 +
 +        sudo launchctl unload /System/Library/LaunchDaemons/org.ntp.ntpd.plist
 +        sudo launchctl load /System/Library/LaunchDaemons/org.ntp.ntpd.plist

I'll look at getting this section of the docs reinstated.

@alanbrent
Copy link
Author

alanbrent commented Aug 1, 2016

@ijc25

The time should resync after a matter of minutes (if not seconds). For how long is it staying out of sync for you?

It stays out of sync until I manually restart the Moby VM.

Do you have NTP on the OSX host? What does ntpq -p show before and after sleep?

This is what it shows now, which would be "after" sleep. I'd have to reboot to give you a true reading of what it looks like "before" sleep, and I can't stop to do that at the moment.

>ntpq -p localhost: timed out, nothing received

There was supposed to be a section in the troubleshooting guide about NTP issues and possible workarounds, but it looks like it somehow got undone by a rebase somewhere along the line. The diff was:

I ran hwclock -s directly on the Moby VM* (rather than from within a privileged container) and that immediately cleared up the time drift.

* via screen ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty

@ketan
Copy link

ketan commented Aug 5, 2016

I can confirm this is still a problem with the latest beta 1.12.0-beta21

@alanbrent
Copy link
Author

That's strange @ketan, because I actually came here to report that it appears to be fixed for me in the 1.12.0-beta21 (build: 11019).

@alanbrent
Copy link
Author

alanbrent commented Aug 5, 2016

Update: i had to reboot my machine the other day and currently I am getting a response from ntpq, so perhaps that really is the center point of this issue:

->ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*usdal2-ntp-002. .GPSs.           1 u  254  256  377   50.561  620.549   2.355

@alanbrent
Copy link
Author

alanbrent commented Aug 9, 2016

Update

So I ran into nasty issues with the beta channel, tried reverting to the stable channel but it didn't fix it. I ended up completely uninstalling Docker and reinstalling stable from scratch. At this point, the time drift issue has not recurred. I'm not sure if this should be considered related, because I've yet to see a no-response to ntpq -p since my last reboot.

@dave-tucker
Copy link
Contributor

@alanbrent @ketan are either of you still seeing this issue?

If not, I'm tempted to close as we have now updated the troubleshooting documentation

@ketan
Copy link

ketan commented Sep 1, 2016

@dave-tucker — I'm still able to reproduce this. Notice how the created shows 24 hours ago for a container that I'd created a few seconds ago.

If you need any more information, I'm happy to provide it.

This is causing a lot of our API clients to be confused about when the instances were actually created.

$ docker run -it alpine:latest hostname
74220015682f
$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                      PORTS               NAMES
74220015682f        alpine:latest       "hostname"          24 hours ago        Exited (0) 42 seconds ago                       stupefied_heisenberg

After a docker restart —

$ docker run -it alpine:latest hostname
fa942918f6e8
$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                     PORTS               NAMES
fa942918f6e8        alpine:latest       "hostname"          5 seconds ago       Exited (0) 2 seconds ago                       stoic_tesla

Here's the diagnostic information (Diagnostic ID: 68306E84-E35C-4FCC-B1A8-44197FE38F9A)

Docker for Mac: version: 1.12.1-beta24 (0afa5e5)
OS X: version 10.11.6 (build: 15G31)
logs: /tmp/68306E84-E35C-4FCC-B1A8-44197FE38F9A/20160901-193324.tar.gz
[OK]     docker-cli
[OK]     virtualization kern.hv_support
[OK]     menubar
[OK]     moby-syslog
[OK]     dns
[OK]     disk
[OK]     system
[OK]     app
[OK]     osxfs
[OK]     virtualization VT-X
[OK]     db
[OK]     slirp
[OK]     logs
[OK]     env
[OK]     vmnetd
[OK]     moby-console
[OK]     moby
[OK]     driver.amd64-linux

@dsheets dsheets added kind/bug and removed kind/docs labels Sep 1, 2016
@justincormack
Copy link
Member

@ketan what does ntpq -p localhost say on the Mac when this is a problem?

@ketan
Copy link

ketan commented Sep 2, 2016

@justincormack

I just woke up my mac after a good night of sleep. There's about an approx 11 hour drift between the time on my mac and the docker instance. That looks like approximately the time that the lid on my mac was closed.

$ date --utc && docker run -it alpine:latest date --utc && ntpq -p localhost
Fri Sep  2 04:41:17 UTC 2016
Thu Sep  1 17:16:23 UTC 2016
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 hkhkg1-ntp-001. .XFAC.          16 u  22h 1024    0    0.000    0.000   0.000

@djs55
Copy link
Contributor

djs55 commented Sep 5, 2016

With beta 24 I put my Mac to sleep on battery for about 1.5hrs and when it woke up the clock was originally out by 8s, but the gap closed after about 10s. I'll try on another laptop overnight to see if I can reproduce this.

@djs55
Copy link
Contributor

djs55 commented Sep 7, 2016

Another reporter has version

$ docker version
Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 17:32:24 2016
 OS/Arch:      darwin/amd64
 Experimental: true

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 17:32:24 2016
 OS/Arch:      linux/amd64
 Experimental: true

and got into this state:

$ date
Tue Sep  6 09:49:30 PDT 2016              <--------------- Time known by Mac
$ docker run --rm -ti --entrypoint sh busybox
/ # date
Sun Sep  4 16:33:44 UTC 2016          <------------- time observed inside a container
/ #

and then said

It looks like the container clock is moving forward, it just has a negative offset relative to the mac's system clock.
the clock seen by the docker container did eventually sync back up with the correct time as read from the Mac's internal clock.
It took about 2 hours for NTP to converge after the Mac had been asleep for several days.

@simonplend
Copy link

I am also seeing this issue and it's causing problems with the applications I have running in containers e.g. AWS APIs rejecting requests:

The difference between the request time and the current time is too large.

I can see the workarounds in the troubleshooting docs for Docker for Mac (https://docs.docker.com/docker-for-mac/troubleshoot/#/known-issues), but is there any timeline for this issue to be fixed?

@schmunk42
Copy link

Related #693

Very annoying bug, which also existed in docker-machine AFAIR.
This causes i.e. errors with AWS credential validation.

@stephenrwalli
Copy link

It's back for me.

Expected behavior

Last login: Wed Oct 5 22:53:10 on ttys001
.profile start
.profile end
$ docker run --rm centos:7 date -u && date -u
Mon Oct 10 17:18:18 UTC 2016
Mon Oct 10 17:18:19 UTC 2016

This has been working correctly for a few days.

Actual behavior this morning when I happened to check.

$ docker run --rm centos:7 date -u && date -u
Wed Oct 12 15:32:20 UTC 2016
Wed Oct 12 15:53:15 UTC 2016
$
$ ntpq -p localhost

remote refid st t when poll reach delay offset jitter

uslax1-ntp-002. .GPSs. 1 u 71m 1024 16 39.530 -0.200 241.736

Information

  • Full output of the diagnostics from "Diagnose & Feedback" below

Steps to reproduce the behavior

  1. Still exploring time drift. I was unable to repeat the drift I've seen in previous weeks last week while we were all in Berlin and someone could look.
  2. Since I've been home, the machine has been closed a few times, and has gone through normal sleep most nights. Seemed to be working. I happened to check this morning and the problem is back, admittedly with a small 20 min drift.

Ran the diagnostic
Docker for Mac: version: 1.12.1-beta26.1 (12c3e63)
OS X: version 10.11.6 (build: 15G1004)
logs: /tmp/49B53B43-4DD2-440A-A072-3FDB160F24E3/20161012-085437.tar.gz
[OK] docker-cli
[OK] virtualization kern.hv_support
[OK] menubar
[OK] moby-syslog
[OK] dns
[OK] disk
[OK] system
[OK] app
[OK] osxfs
[OK] virtualization VT-X
[OK] db
[OK] slirp
[OK] logs
[OK] env
[OK] vmnetd
[OK] moby-console
[OK] moby
[OK] driver.amd64-linux

@rickpeters
Copy link

Just to confirm that this is still a problem (I'm on the beta channel). Since my mac is living 'the future' from container and moby vm point of view, the problem is that timestamps of files changed on the mac (through a volume) do not exist yet in the container.
This gives unexpected behaviour...

@theElephant
Copy link

Also happening on 1.13.1 (15353). Skew is proportional to the amount of time that mac goes to sleep. Restarting the docker engine fixes the skew.

@Miserlou
Copy link

A lot of our users have been complaining about this, since our application uses AWS services calls which are very picky about clocks being in sync.

@AdriVanHoudt
Copy link

I also had issues where my MySQL instance got out of date. docker-time-sync-agent did the trick though, although not ideal...

@immae1
Copy link

immae1 commented Feb 27, 2017

very simple way to fix the clockdrift bug:

brew install sleepwatcher
brew services start sleepwatcher
echo /usr/local/bin/docker run --rm --privileged alpine hwclock -s > ~/.wakeup
chmod +x ~/.wakeup

@venth
Copy link

venth commented Mar 2, 2017

I rather to prefer a cron job on alpine that do hwclock -s. It's simpler and I already did this on my docker server. Nothing to install. It just works.

@rogaha
Copy link

rogaha commented Mar 2, 2017

thanks for your feedback @venth. We are working on fixing it for the next release.

@jlutzhbo
Copy link

jlutzhbo commented Mar 8, 2017

This happens after a fresh reboot as well. No need to enter sleep mode.

@theElephant
Copy link

theElephant commented Mar 10, 2017

FYI
I was using the --privileged flag to start the container then issuing hwclock -s. This worked for a few days, then stopped. This time, there was time spring ahead (built in daylight savings...just kidding) inside the docker container. I found the following solution to work for now:

yum install -y ntp # in container

sudo ntpdate <ntp server> # when container first 'boots up.' I used my company's internal ntp server.

@bacoboy
Copy link

bacoboy commented Mar 15, 2017

Confirmed that this is still happpening in stable channel version: 17.03.0-ce-mac2 (15654)

Makes it hard to do crypto stuff when the clock is off. Posting my workaround from bits gathered elsewhere is to force clock sync in the busybox image running the docker containers. First you connect to the moby image:

% screen ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty
<press return to get prompt at blank screen>

Then force a clock sync from the system:

/ # date -u
Wed Mar 15 07:47:33 UTC 2017
/ # hwclock -s
/ # date -u
Wed Mar 15 13:31:11 UTC 2017
<CTRL-a>d   <— to disconnect from screen session

Now the clock is more or less in sync:

% docker run --rm -it --privileged --pid=host alpine:3.4 date && date -u
Wed Mar 15 13:39:11 UTC 2017
Wed Mar 15 13:39:12 UTC 2017

Running ntpdate in the entrypoint script will also work as @theElephant suggested, but neither of these is a long term fix. At some point the moby image will drift from the mac's clock. All clocks drift, this is why there is ntpd in the first place :)

Hoping for a better long term solution comes out of this thread that is many many months old. :(

@copumpkin
Copy link

@djs55 any updates from the Docker side here? This is causing lots of pain all over the place.

@alfredolopezzz
Copy link

this is quite of an issue for our dev/test environments that we run in docker :(

@avsm
Copy link

avsm commented Mar 31, 2017

Apologies for the delay merging a solution to this into Docker for Mac. A number of the solutions that we've tried have interacted poorly with the power management behaviour (i.e. running something regularly on cron from within the VM), or otherwise interfered with the Docker setup (running hwclock in a container).

Our current work involves building a lightweight RPC service into Hyperkit that can execute a whitelisted set of commands such as hwclock directly from the host, and then using Apple's sleep/wakeup notifications to run the time sync.

This is similar to the workaround with sleepwatcher above, which should work while we integrate this directly into Docker for Mac:

brew install sleepwatcher
brew services start sleepwatcher
echo /usr/local/bin/docker run --rm --privileged alpine hwclock -s > ~/.wakeup
chmod +x ~/.wakeup

Apologies again for the delay -- this is very much on our radar to fix.

@tuanmh
Copy link

tuanmh commented May 9, 2017

We're having the same problem here.

@samudurand
Copy link

There is an improvement in the latest Docker for mac release it seems, I don't know yet to what extent it "solves" the issue

@MagnusS
Copy link
Contributor

MagnusS commented May 11, 2017

In the latest edge release of Docker for Mac (17.05.0-ce-mac9) time is synchronised automatically after sleep - this fix will also be in stable 17.06 when it is released next month.

The HyperKit driver running on the host will now detect a wakeup from sleep and signal a new daemon in the guest (host-timesync-daemon) which will then synchronise the time with the host using hwclock.

As Anil mentioned in #17 (comment), this is similar to some of the workarounds posted here, but the command is run directly in the guest to not interfere with the regular Docker setup. To compensate for time drift when not sleeping chronyd is also still running in the guest VM.

Although this should greatly improve the current situation a proper fix would be to implement a clock that was pinned to the host so we didn't have to synchronise the time. Currently we're unable to do that, see moby/hyperkit #110 (comment).

I'm closing this issue as the reported problem should now be fixed. Please open a new issue if you are running a version newer than 17.05.0-ce-mac9 and still see this or similar issues

@MagnusS MagnusS closed this as completed May 11, 2017
@docker docker locked and limited conversation to collaborators May 11, 2017
@docker-robott
Copy link
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle locked

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests