Bug in parse_from_rfc2822 with -0000 timezone #102

Closed
nicklan opened this issue Oct 12, 2016 · 9 comments · Fixed by #368

Comments

@nicklan

nicklan commented Oct 12, 2016

The following program:

extern crate chrono;

use chrono::datetime::DateTime;

fn main() {
    let minus = "Fri, 1 Feb 2013 13:51:20 -0000";
    let plus  = "Fri, 1 Feb 2013 13:51:20 +0000";
    println!("{} -> {:?}",minus,DateTime::parse_from_rfc2822(minus));
    println!("{} -> {:?}",plus,DateTime::parse_from_rfc2822(plus));
}

outputs:

Fri, 1 Feb 2013 13:51:20 -0000 -> Err(ParseError(NotEnough))
Fri, 1 Feb 2013 13:51:20 +0000 -> Ok(2013-02-01T13:51:20+00:00)

This seems wrong. There should be no difference between -0000 and +0000, right?

@lifthrasiir
Contributor

lifthrasiir commented Oct 13, 2016

My understanding is that -0000 indicates the absence of useful time zone information. I'm less sure about what it actually means (especially in practice), however: is it a local time, a UTC time with no local time offset (possible with a different interpretation of RFC 2822), or a completely ambiguous timestamp? Chrono currently takes the third option, i.e. such timestamps are not safe to read in any time zone, but I would like to change the default if other options are more widespread. (For example, Python's email package seems to use the local time option, but only as a last resort.)

@nicklan
Author

nicklan commented Oct 13, 2016

Right. The spec is a little hard to read on this. I did notice that Python's email.utils parsedate_tz treats +0000 the same as -0000, and GNU date seems to as well.

I guess it's fine either way, but some email clients certainly put -0000 in the date field, meaning that if you're parsing those you need to do extra work to transform them to +0000 (or something else).
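The "extra work" mentioned above could look roughly like the following minimal sketch. It uses only the standard library, and `normalize_minus_zero_zone` is a hypothetical helper name, not part of chrono. It assumes the zone is the last token of the string, so dates with a trailing comment such as `(UTC)` are left unchanged.

```rust
use std::borrow::Cow;

/// Rewrites a trailing "-0000" zone to "+0000" so that a parser which rejects
/// "-0000" can accept the string. Any other input is returned unchanged.
/// Hypothetical helper for illustration; not part of chrono.
fn normalize_minus_zero_zone(input: &str) -> Cow<'_, str> {
    let trimmed = input.trim_end();
    match trimmed.strip_suffix("-0000") {
        // Only rewrite when "-0000" is a separate, whitespace-delimited token.
        Some(head) if head.ends_with(char::is_whitespace) => {
            Cow::Owned(format!("{}+0000", head))
        }
        // Not a "-0000" zone: hand back the original string without allocating.
        _ => Cow::Borrowed(input),
    }
}
```

The returned string could then be passed to `DateTime::parse_from_rfc2822` as in the program at the top of this issue. Note that this discards the distinction the RFC draws between the two forms.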

@Eijebong

Any update on this?

@Riduidel

About a year has passed with no progress on this. Is it possible to help?

@jcranmer

As I read RFC 5322 (not that the text changed from RFC2822), -0000 still indicates that the timestamp is to be semantically interpreted as UTC. Local time is discussed as something that clients should express, but the offset described is the offset of the time-of-day, not the offset of local time from UTC. When it's again discussed at the end of the paragraph, it mentions that -0000 "also" indicates UTC but also indicates that the system's time zone may not be (i.e., is not necessarily) in UTC, and then clarifies more succinctly that the date-time contains no information about the local time zone.

In other words, +0000 means that you should return a DateTimeFixedOffset::east(0), while -0000 means that you really want to return a DateTime instead. In the current implementation, there's no way to disambiguate these two interpretations, but if you allowed DateTime<Option>, it could be done.
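To make the distinction in this comment concrete, here is a minimal standard-library sketch of a zone parser that keeps "-0000" separate from "+0000". The `ZoneInfo` enum and `parse_numeric_zone` function are hypothetical names chosen for illustration; they are not chrono types, and this is not how chrono represents offsets.

```rust
/// Result of reading an RFC 2822 numeric zone such as "+0900" or "-0000".
/// Hypothetical type for illustration; not part of chrono.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum ZoneInfo {
    /// The sender stated an offset, in seconds east of UTC.
    Known(i32),
    /// "-0000": the time is UTC, but nothing is known about the local zone.
    UtcUnknownLocal,
}

/// Parses a "+hhmm" / "-hhmm" zone. Returns None for anything else.
fn parse_numeric_zone(zone: &str) -> Option<ZoneInfo> {
    let bytes = zone.as_bytes();
    // Expect exactly a sign followed by four ASCII digits.
    if bytes.len() != 5 || !bytes[1..].iter().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let sign = match bytes[0] {
        b'+' => 1,
        b'-' => -1,
        _ => return None,
    };
    let digit = |i: usize| i32::from(bytes[i] - b'0');
    let hours = digit(1) * 10 + digit(2);
    let minutes = digit(3) * 10 + digit(4);
    if minutes > 59 {
        return None;
    }
    // "-0000" is the one spelling that carries "no local zone information".
    if sign == -1 && hours == 0 && minutes == 0 {
        return Some(ZoneInfo::UtcUnknownLocal);
    }
    Some(ZoneInfo::Known(sign * (hours * 3600 + minutes * 60)))
}
```

A caller that does not care about the distinction can map both `Known(0)` and `UtcUnknownLocal` to UTC, while a caller that does care still has the information available.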

@kanekv

kanekv commented Nov 17, 2019

For all practical purposes, I would like -0000 to parse the same as +0000. If there are no arguments against it, I will provide a PR.

@kanekv

kanekv commented Nov 22, 2019

@lifthrasiir Please let us know what we can do here. It's not only Python that does this; time 0.1 also generates RFC 822 times with -0000: https://docs.diesel.rs/time/struct.Tm.html#method.rfc822z.

@quodlibetor
Contributor

I agree that the following sentences from RFC 5322 all say that -0000 should be interpreted the same as +0000, but may mean that the client system wasn't necessarily also set to GMT, which, like, who cares?

The form "+0000" SHOULD be used to indicate a time zone at Universal Time. Though "-0000" also indicates Universal Time, it is used to indicate that the time was generated on a system that may be in a local time zone other than Universal Time and that the date-time contains no information about the local time zone.

and this in particular seems to suggest that -0000 is meant to be used as a placeholder for "Use UTC but mostly because somebody messed up":

The 1 character military time zones were defined in a non-standard way in [RFC0822] and are therefore unpredictable in their meaning. The original definitions of the military zones "A" through "I" are equivalent to "+0100" through "+0900", respectively; "K", "L", and "M" are equivalent to "+1000", "+1100", and "+1200", respectively; "N" through "Y" are equivalent to "-0100" through "-1200". respectively; and "Z" is equivalent to "+0000". However, because of the error in [RFC0822], they SHOULD all be considered equivalent to "-0000" unless there is out-of-band information confirming their meaning.

And:

Other multi-character (usually between 3 and 5) alphabetic time zones have been used in Internet messages. Any such time zone whose meaning is not known SHOULD be considered equivalent to "-0000" unless there is out-of-band information confirming their meaning.

And from the appendix:

The following are the changes made from [RFC0822] and [RFC1123] to [RFC2822] that remain in this document:
1...snip....
6. Specifically allow and give meaning to "-0000" time zone.

@quodlibetor
Contributor

which is all to say I am happy to take a PR that interprets -0000 as UTC.
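The military-zone rules in the RFC 5322 text quoted above can be sketched as follows. This is an illustration using only the standard library; both function names are hypothetical, and accepting lowercase letters is an assumption made here rather than something the quoted passage states.

```rust
/// Original definition of a one-letter military zone, in hours east of UTC,
/// following the mapping in the quoted RFC 5322 text.
/// Returns None for letters the quoted text does not list (such as "J")
/// and for non-letters. Hypothetical helper; not part of chrono.
fn military_zone_original_hours(letter: char) -> Option<i32> {
    match letter.to_ascii_uppercase() {
        // "A" through "I" are +0100 through +0900.
        c @ 'A'..='I' => Some(i32::from(c as u8 - b'A') + 1),
        // "K", "L", and "M" are +1000, +1100, and +1200.
        'K' => Some(10),
        'L' => Some(11),
        'M' => Some(12),
        // "N" through "Y" are -0100 through -1200.
        c @ 'N'..='Y' => Some(-(i32::from(c as u8 - b'N') + 1)),
        // "Z" is +0000.
        'Z' => Some(0),
        _ => None,
    }
}

/// What the quoted text recommends without out-of-band information:
/// treat every military zone like "-0000", i.e. read the time as UTC.
fn military_zone_recommended_hours(letter: char) -> Option<i32> {
    military_zone_original_hours(letter).map(|_| 0)
}
```

The second function is deliberately trivial: per the quoted passage, the original per-letter offsets are unreliable, so the recommended reading collapses all of them to UTC.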

quodlibetor added a commit to quodlibetor/rust-chrono that referenced this issue Nov 30, 2019
This is a time that is commonly set in some environments, and RFC 5322
explicitly clarifies that we should treat -0000 as UTC[1][2] when interpreting
rfc2822.

Fixes chronotope#102

[1]: chronotope#102 (comment)
[2]: https://tools.ietf.org/html/rfc5322#section-3.3