Why do we prefer CLOCK_MONOTONIC over CLOCK_REALTIME #289
As the name suggests, CLOCK_MONOTONIC is a monotonically increasing clock source. It never makes large jumps; instead, its phase is adjusted gradually to correct the time as the local clock drifts. CLOCK_MONOTONIC_RAW is the more extreme example of this: it does not even adjust the phase, but provides a semblance of a clock by scaling the value of a CPU counter. CLOCK_REALTIME is kept as close to the real time in your clock domain as possible.

This means that if you have two systems, the only clock that makes sense to compare across them is CLOCK_REALTIME. You should not compare CLOCK_MONOTONIC between two machines! However, when you start setting timeouts or timers that need to fire very close to your target, you must be a bit careful with CLOCK_REALTIME, as this clock is free to move backward in time.

In netchan we used CLOCK_MONOTONIC for this reason and read the PTP time directly from the NIC. However, this opens you up to a lot of extra complexity. At the same time, if you have microsecond accuracy as a target, you should disable frequency scaling and avoid moving between C-states (see https://github.com/henrikau/net_chan/blob/main/src/netchan.c#L1012 for an example).

If you have a PTP-synchronized domain and the clocks are reasonably stable, using CLOCK_REALTIME will save you a lot of complexity (which translates into fewer sources of errors), at the cost of occasionally overshooting a timer when the base is yanked backward from under you. |
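To make the timer concern concrete, here is a minimal sketch of a sleep that uses an absolute deadline on CLOCK_MONOTONIC, so a backward step of CLOCK_REALTIME cannot stretch the wait. The function name sleep_monotonic_ns is hypothetical and is not taken from net_chan or LF; it assumes a non-negative delay and a POSIX system that provides clock_nanosleep.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper: sleep for delay_ns (assumed >= 0) against an
 * absolute CLOCK_MONOTONIC deadline. Returns the wakeup overshoot in
 * nanoseconds, or -1 on error. */
int64_t sleep_monotonic_ns(int64_t delay_ns)
{
    struct timespec deadline, now;

    if (clock_gettime(CLOCK_MONOTONIC, &deadline) != 0)
        return -1;

    /* Build the absolute deadline and normalize tv_nsec. */
    deadline.tv_sec += delay_ns / NSEC_PER_SEC;
    deadline.tv_nsec += delay_ns % NSEC_PER_SEC;
    if (deadline.tv_nsec >= NSEC_PER_SEC) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= NSEC_PER_SEC;
    }

    /* clock_nanosleep returns the error number directly (not via errno).
     * With TIMER_ABSTIME we can simply retry after a signal. */
    int rc;
    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
    } while (rc == EINTR);
    if (rc != 0)
        return -1;

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;

    return (int64_t)(now.tv_sec - deadline.tv_sec) * NSEC_PER_SEC
           + (now.tv_nsec - deadline.tv_nsec);
}
```

Calling sleep_monotonic_ns(1000000) sleeps for about one millisecond and reports how late the wakeup was. The same pattern with CLOCK_REALTIME would track wall-clock time, but the deadline would then move whenever the clock is stepped.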
Thanks for this input Henrik. Very insightful.
|
I am curious to hear what @edwardalee has to say here, since the choice of CLOCK_MONOTONIC seems deliberate. |
The choice of clock in LF is determined by the
Regardless of what clock you choose, LF uses two safeguards:
If indeed |
It should be possible to set which underlying clock should be used by pthread: https://linux.die.net/man/3/pthread_condattr_setclock |
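As a rough illustration of the pthread_condattr_setclock approach, the sketch below binds a condition variable to CLOCK_MONOTONIC and then waits on it with a timeout. The function name timed_wait_monotonic is made up for this example, and nothing here reflects how the LF runtime actually sets up its condition variables.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: wait up to timeout_ms on a condition variable
 * whose timeouts are measured against CLOCK_MONOTONIC. Nobody signals
 * the condition here, so the expected result is ETIMEDOUT. */
int timed_wait_monotonic(long timeout_ms)
{
    pthread_condattr_t attr;
    pthread_cond_t cond;
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    struct timespec deadline;
    int rc;

    rc = pthread_condattr_init(&attr);
    if (rc != 0)
        return rc;

    /* Select the clock used by pthread_cond_timedwait for this condvar. */
    rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    if (rc == 0)
        rc = pthread_cond_init(&cond, &attr);
    pthread_condattr_destroy(&attr);
    if (rc != 0)
        return rc;

    /* The deadline must be read from the same clock as the condvar. */
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&mutex);
    do {
        /* A return of 0 without a signal is a spurious wakeup; wait again. */
        rc = pthread_cond_timedwait(&cond, &mutex, &deadline);
    } while (rc == 0);
    pthread_mutex_unlock(&mutex);

    pthread_cond_destroy(&cond);
    return rc;
}
```

The important detail is that the absolute deadline and the condition variable use the same clock. Mixing a CLOCK_REALTIME deadline with a CLOCK_MONOTONIC condition variable would give a timeout that is wildly off.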
Correct. The reason was that I had some plans of using netchan to bridge multiple networks where each network did not necessarily live in the same clock domain. This was a mistake and I am currently moving net_chan away from this approach and towards expecting phc2sys to sync the clocks.
Be aware that if you do this at startup, the offset is not necessarily constant. CLOCK_MONOTONIC will have the phase adjusted so if the clocks become better synchronized after a while, then the original offset may not be accurate any more.
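One way to act on this warning is to re-measure the offset periodically instead of caching a single startup value. The sketch below is only an illustration under that assumption: clock_offset_ns is a hypothetical helper that samples clock b between two reads of clock a and uses the midpoint, which limits the error introduced by the time spent reading the clocks.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

static int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Hypothetical helper: estimate (a - b) in nanoseconds at this instant.
 * Returns 0 on success and -1 if any clock read fails. */
int clock_offset_ns(clockid_t a, clockid_t b, int64_t *offset_ns)
{
    struct timespec b_before, a_now, b_after;

    if (clock_gettime(b, &b_before) != 0 ||
        clock_gettime(a, &a_now) != 0 ||
        clock_gettime(b, &b_after) != 0)
        return -1;

    /* Midpoint of the two reads of b approximates b at the time a was read. */
    int64_t b_before_ns = ts_to_ns(&b_before);
    int64_t b_mid_ns = b_before_ns + (ts_to_ns(&b_after) - b_before_ns) / 2;

    *offset_ns = ts_to_ns(&a_now) - b_mid_ns;
    return 0;
}
```

Calling clock_offset_ns(CLOCK_REALTIME, CLOCK_MONOTONIC, &offset) at intervals and comparing successive results shows whether the relationship has moved since the last measurement.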
Yes, it does not make sense for phc2sys to adjust CLOCK_MONOTONIC. By writing 1 to /dev/cpu_dma_latency and keeping the file open (i.e. do NOT close it), C-state transitions are disabled. You can pass a flag to net_chan to turn this behavior off; wakeup accuracy then goes from 18us to above 500us. |
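The /dev/cpu_dma_latency trick can be sketched as follows. This is not the net_chan code; the function names are invented for the example, and it assumes the Linux PM QoS interface, which to my understanding accepts a 32-bit latency bound in microseconds and keeps the request active only while the file descriptor stays open. Opening the device normally requires root.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: request a maximum CPU wakeup latency (in us).
 * Returns an open file descriptor that must be kept open for as long as
 * the request should stay in effect, or -1 on failure (for example when
 * the device is missing or the process lacks permission). */
int cpu_dma_latency_hold(int32_t max_latency_us)
{
    int fd = open("/dev/cpu_dma_latency", O_WRONLY);
    if (fd < 0)
        return -1;

    /* The value is written as a raw 32-bit integer, not as text. */
    ssize_t written = write(fd, &max_latency_us, sizeof(max_latency_us));
    if (written != (ssize_t)sizeof(max_latency_us)) {
        close(fd);
        return -1;
    }
    return fd; /* Deliberately left open. */
}

/* Closing the descriptor drops the latency request again. */
void cpu_dma_latency_release(int fd)
{
    if (fd >= 0)
        close(fd);
}
```

A program would call cpu_dma_latency_hold(1) once at startup, keep the returned descriptor for its whole lifetime, and call cpu_dma_latency_release on shutdown.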
After a discussion with Henrik yesterday I now understand the clock and timer subsystem of Linux a lot better. I have made the following observation:
|
This makes sense. For all of these clocks, our physical time will still be monotonic because this is enforced in lf_time_physical(). |
Should we then update our code generator to pick |
If you have safeguards in place for detecting if the clock jumps backward, I would consider CLOCK_REALTIME purely because it tracks global time. If all you need is a monotonically increasing value, wouldn't the CPU counter be just as effective? It would also be transferable to platforms other than x86. |
I think we should have a testbed for distributed LF programs. This way we can actually experiment with different underlying clocks and how they interact with NTP and PTP or hard clock resets from the user. I think we should add a PR with the following changes:
|
Is the mechanism by which we enforce monotonicity thread-safe? To me it looks like it isn't because it uses the global variable |
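For reference, one possible thread-safe way to enforce monotonicity on top of CLOCK_REALTIME is an atomic "maximum so far" updated with a compare-and-swap loop. The sketch below uses C11 atomics; the names last_reported_ns, monotonic_clamp_ns, and physical_time_ns are hypothetical and do not correspond to the LF runtime's actual variables or to lf_time_physical().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Largest timestamp handed out so far, shared by all threads. */
static _Atomic int64_t last_reported_ns = 0;

/* Return raw_ns if it advances the shared maximum, otherwise return the
 * current maximum. Safe to call concurrently from several threads. */
int64_t monotonic_clamp_ns(int64_t raw_ns)
{
    int64_t prev = atomic_load(&last_reported_ns);
    while (raw_ns > prev) {
        /* On failure, prev is reloaded with the value another thread
         * stored, and the loop condition is re-evaluated. */
        if (atomic_compare_exchange_weak(&last_reported_ns, &prev, raw_ns))
            return raw_ns;
    }
    return prev; /* The clock went backward or another thread is ahead. */
}

/* Read CLOCK_REALTIME and clamp it so callers never see time decrease. */
int64_t physical_time_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return monotonic_clamp_ns((int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec);
}
```

With this approach, a backward step of the underlying clock makes reported time stand still until the clock catches up, rather than letting it run backward. Whether a frozen clock is acceptable for the runtime is a separate design question.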
That is a good point, Peter. I think you are right. By the way, what would be the consequence of a non-monotonic underlying clock? Might our runtime crash because of some assertions on the current physical time and the time of the last handled event? I am asking to understand what could happen if we remove this monotonicity-guaranteeing logic. I think the best for us is to let NTP and PTP keep our physical clocks in sync. But they might step the clock forward or backward. It is possible to configure NTP/PTP so that they don't step the clock, or only step the clock when booting the clock sync client. However, it will take these clients a long time to correct even small clock offsets, so stepping the clock might be what you want in many cases. In other words, we shouldn't rely on the underlying physical clock always being monotonic unless we can guarantee that NTP/PTP are configured in a certain way. |
I have been thinking more about this problem and my conclusion now is that we should just stick with CLOCK_REALTIME and forget about CLOCK_MONOTONIC. This means that we need to ensure monotonicity on top of the clock. Also, we must make it thread-safe, as Peter observes above. My logic is as follows: since we want to support using PTP and NTP to synchronize federates, we need to support CLOCK_REALTIME, so we will need the monotonicity enforcement anyway. The main argument for CLOCK_MONOTONIC is that if we have federates with clock sync enabled, we don't want NTP/PTP to mess up our clock sync. But is it a realistic scenario that we have NTP/PTP enabled and still want LF clock sync? I think it is reasonable to require either: If this is an unreasonable requirement for the system on which our federates are running, then we must consider CLOCK_MONOTONIC. The advantages of removing CLOCK_MONOTONIC are the following:
I think that moving to just using CLOCK_REALTIME removes some complexity and puts a little more responsibility on the user, which I think is OK. What do you think? |
I agree. Let's go with CLOCK_REALTIME. I think we already ensure monotonicity... |
According to the man pages (man clock_gettime):
CLOCK_MONOTONIC: Counts nanoseconds since boot and is not affected by discontinuous jumps in system time. But it does not count time when the system is suspended (what does that mean?)
CLOCK_REALTIME: Gives the system POSIX time since 1970. Affected by discontinuous jumps.
While discontinuous jumps could be disadvantageous, I do believe that the linuxPTP tools use CLOCK_REALTIME. (I will ask Henrik about this.) Also, it seems dangerous that the LF clock should remain unchanged during system suspension.
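On the question of what "does not count time when the system is suspended" means: on Linux, CLOCK_BOOTTIME behaves like CLOCK_MONOTONIC but keeps counting while the machine is suspended, so the difference between the two approximates the total time spent in suspend. The sketch below illustrates this; suspended_time_ns is a hypothetical helper, CLOCK_BOOTTIME is Linux-specific, and inside containers that use time namespaces the result may not reflect real suspend time.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: estimate how long the system has been suspended
 * since boot as CLOCK_BOOTTIME minus CLOCK_MONOTONIC (Linux only).
 * Returns 0 on success and -1 if either clock cannot be read. */
int suspended_time_ns(int64_t *suspended_ns)
{
    struct timespec mono, boot;

    if (clock_gettime(CLOCK_MONOTONIC, &mono) != 0 ||
        clock_gettime(CLOCK_BOOTTIME, &boot) != 0)
        return -1;

    *suspended_ns = (int64_t)(boot.tv_sec - mono.tv_sec) * 1000000000LL
                    + (boot.tv_nsec - mono.tv_nsec);
    return 0;
}
```

On a machine that has never been suspended the result is close to zero. After a laptop has slept for an hour it is roughly one hour, which is exactly the interval CLOCK_MONOTONIC skipped.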