UTC and ISO 8601 as default timestamps #135

Closed
bsvingen wants to merge 1 commit

@bsvingen bsvingen commented Nov 4, 2015

Using UTC and ISO 8601 as the default timestamp format, since that tends to lead to the least confusion and surprise.

@ptaoussanis
Member

Hi Børge, I'm on board with that - thanks. Any reason you've killed the default :locale key though? I'd also prefer if you left (java.util.TimeZone/getDefault) in, commented out. (Some folks may prefer to use their non-UTC default and this could be a handy example).

@bsvingen
Author

bsvingen commented Nov 5, 2015

My thinking was that it would then use the default for the JVM, instead of assuming en - I would be happy to put it back, though.

@ptaoussanis
Member

Switching to the JVM default would be okay, but we'd need to do that by adding :locale (java.util.Locale/getDefault) instead of just removing the locale key (there's no automatic fallback).

Gave this some more thought, and think I'd prefer to stick to the current timestamp pattern; I find iso8601 quite verbose for logging purposes. I have changed the default timezone to UTC and the default locale to the JVM default though.

Have also added some shorthands to make changing to iso8601 easier if you prefer that; you can now use :pattern :iso8601 for example. We can maybe add some other common shorthand options in future.

How do you feel about this approach?
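To make the timezone and locale settings discussed above concrete, here is a minimal sketch at the java.text.SimpleDateFormat level. It is not Timbre's implementation or configuration; the class name TimestampDefaults, the helper format, and the time-of-day part of the pattern are assumptions made for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, for illustration only (not part of Timbre).
class TimestampDefaults {

    // Formats an instant with an explicit pattern, locale and time zone.
    // Nothing falls back silently: each setting is passed in by the caller.
    public static String format(Date instant, String pattern, Locale locale, TimeZone zone) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, locale);
        fmt.setTimeZone(zone);
        return fmt.format(instant);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        // Assumed pattern: only the "yy-MMM-dd" date part is taken from the discussion.
        String pattern = "yy-MMM-dd HH:mm:ss";

        // UTC time zone with the JVM default locale.
        System.out.println(format(epoch, pattern, Locale.getDefault(), TimeZone.getTimeZone("UTC")));

        // JVM default time zone: the same instant may print differently per machine.
        System.out.println(format(epoch, pattern, Locale.getDefault(), TimeZone.getDefault()));
    }
}
```

The month name produced by MMM depends on the locale, and the hour depends on the time zone, so two machines with different JVM defaults can print the same instant differently unless both settings are pinned.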

@bsvingen
Author

bsvingen commented Nov 6, 2015

The main problem with timestamp formats like the one you are using is that they cannot be sorted without actually parsing the timestamps, which makes it harder to align log files. Having good defaults is important, since I don't believe most people are going to change them.

This is your decision, obviously.

https://xkcd.com/1179/

@ptaoussanis
Member

> Having good defaults is important, since I don't believe most people are going to change them.

Agreed, which is why I'm suggesting we give this some thought before making a decision.

The XKCD is referring to the date-only version of ISO8601 which is fine (if a bit verbose on the year). The full-spec version which you're suggesting might be suboptimal for our purposes:

  1. It uses "T" as a time separator which makes it difficult to visually parse timestamps IMO. (Do you disagree?)
  2. It prints the timezone for every entry, which remains consistent and so is redundant. This is a particular pain for inclusion with email logging since the subject line has a limited amount of space. (Do you disagree?)

I'm not sure how you find the current format "confusing" - can you maybe give an example of how it may be ambiguous?

Not being sortable is a different story and not something I'd thought of. We could make it sortable by switching to a "yy-MM-dd" date-part format though without picking up the other properties of the full-spec ISO8601 which are less attractive.

What are your thoughts?
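The sortability point can be checked with a short sketch. It formats the same instants with a "yy-MMM-dd" and a "yy-MM-dd" date part and tests whether plain string sorting reproduces chronological order. The class name, the sample dates, the " HH:mm:ss" suffix, and the fixed English locale are assumptions chosen for illustration.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, for illustration only.
class SortableTimestamps {

    // Renders each instant with the given pattern, in UTC and English.
    public static List<String> render(List<Date> instants, String pattern) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ENGLISH);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        List<String> out = new ArrayList<>();
        for (Date d : instants) {
            out.add(fmt.format(d));
        }
        return out;
    }

    // True if sorting the rendered strings lexicographically
    // yields the same order as sorting the instants themselves.
    public static boolean sortsChronologically(List<Date> instants, String pattern) {
        List<Date> chronological = new ArrayList<>(instants);
        Collections.sort(chronological);
        List<String> expected = render(chronological, pattern);
        List<String> lexical = new ArrayList<>(expected);
        Collections.sort(lexical);
        return lexical.equals(expected);
    }

    // Made-up sample dates in February, April and November 2015.
    public static List<Date> sampleInstants() {
        return Arrays.asList(
            Date.from(Instant.parse("2015-02-03T10:00:00Z")),
            Date.from(Instant.parse("2015-04-05T10:00:00Z")),
            Date.from(Instant.parse("2015-11-04T10:00:00Z")));
    }

    public static void main(String[] args) {
        List<Date> sample = sampleInstants();
        System.out.println(sortsChronologically(sample, "yy-MMM-dd HH:mm:ss"));
        System.out.println(sortsChronologically(sample, "yy-MM-dd HH:mm:ss"));
    }
}
```

With month names, "Apr" sorts before "Feb" as text, so the textual order no longer matches the chronological one; with numeric months it does for these samples. A two-digit year still breaks text sorting across a century boundary, which a four-digit year avoids.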

@bsvingen
Author

bsvingen commented Nov 7, 2015

I believe we are coming at this from slightly different angles.

My main concern is to have logs that are machine readable. I want to be able to easily combine logs, grep them, etc., without the complexity of parsing multiple formats. I find it's a fairly rare case that I manually read a log without first grepping for the relevant parts (unless I'm tailing in real time). All of these things become easier if everyone sticks to 8601, and having it as the standard would be a good step in that direction.

There are clearly different use cases here, with different requirements.

> It uses "T" as a time separator which makes it difficult to visually parse timestamps IMO. (Do you disagree?)

I do, but that's probably just because I'm used to the format. I can see how it can seem a bit dense, certainly.

The good thing about the "T" is that it makes it very easy to incrementally search (or grep) for times by typing "T" followed by the actual digits.

> It prints the timezone for every entry, which remains consistent and so is redundant. This is a particular pain for inclusion with email logging since the subject line has a limited amount of space. (Do you disagree?)

It is only redundant when you know what it is, and when it's the same for every log line (across files). I prefer every line to be self-contained, so that I don't have to maintain this piece of information on the side.
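As a rough sketch of what self-contained lines buy you: when every line carries its own UTC offset, two logs written in different zones can be merged by instant with no side information. The class name, the made-up log lines, and the assumption that each line starts with an ISO 8601 timestamp followed by a space are illustrative only.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class, for illustration only.
class MergeByInstant {

    // Parses the leading ISO 8601 timestamp (with offset) of a log line.
    // Assumes the timestamp is the first space-delimited token.
    public static Instant instantOf(String line) {
        String stamp = line.substring(0, line.indexOf(' '));
        return OffsetDateTime.parse(stamp).toInstant();
    }

    // Merges two logs into one list ordered by the actual instant.
    public static List<String> merge(List<String> a, List<String> b) {
        List<String> all = new ArrayList<>(a);
        all.addAll(b);
        all.sort(Comparator.comparing(MergeByInstant::instantOf));
        return all;
    }

    public static void main(String[] args) {
        // Made-up sample lines: one log in UTC, one at +01:00.
        List<String> utcLog = Arrays.asList(
            "2015-11-06T09:30:00Z started",
            "2015-11-06T09:45:00Z stopped");
        List<String> cetLog = Arrays.asList(
            "2015-11-06T10:15:00+01:00 request",
            "2015-11-06T10:40:00+01:00 response");
        for (String line : merge(utcLog, cetLog)) {
            System.out.println(line);
        }
    }
}
```

Note that plain text sorting would misorder these particular lines, because the offsets differ. Text sorting of ISO 8601 timestamps matches chronological order only when every line uses the same offset, which is one argument for logging in UTC.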

I'm not familiar with the email case, so I can't really comment on that. If the idea is to email every log line, maybe you could make the timestamp be the actual timestamp for the email, instead of putting it in the subject line?

> I'm not sure how you find the current format "confusing" - can you maybe give an example of how it may be ambiguous?

The "confusion" comes from having to deal with lots of different log formats - it is not ambiguous on its own.

> Not being sortable is a different story and not something I'd thought of. We could make it sortable by switching to a "yy-MM-dd" date-part format though without picking up the other properties of the full-spec ISO8601 which are less attractive.

That is certainly preferable over the MMM format, yes.

@ptaoussanis
Member

Great, appreciate the detailed rationale!

Let me think about this for a little longer. It's looking like this may be a case of deciding whether we optimize for human-readability or scriptability out of the box.

My initial leaning is: people who are going to need/want log merging, etc. are likely to be more sophisticated users. Those are relatively rarer in my experience. And the more sophisticated sort are also less likely to mind about adding a {:pattern :iso8601} option to get the behaviour they want. Does that make sense?

We should definitely at least switch from "yy-MMM-dd" to "yy-MM-dd", and add an :iso8601 shorthand.

Again, thanks for your well-reasoned input on this! Cheers :-)

@ptaoussanis
Member

Okay, ready to close this. Think the current approach (:iso8601 shorthand, etc.) seems reasonable. Choosing defaults will necessarily always involve making tradeoffs that favour one set of users or another.

In the case of Timbre, think it makes sense to optimise default config for unsophisticated users. Folks who want standard iso8601 and who are prepared to accept the downsides can now get that with a trivial one-liner, which was definitely a welcome change. Have also adopted UTC and sortable (yy-MM-dd) timestamp defaults.

Thanks again for the input :-)

@bsvingen
Author

Thank you.
