Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: consider removing nanoseconds from intervals #32143

Closed
maddyblue opened this issue Nov 4, 2018 · 8 comments · Fixed by #34202
Closed

sql: consider removing nanoseconds from intervals #32143

maddyblue opened this issue Nov 4, 2018 · 8 comments · Fixed by #34202
Labels
A-sql-datatypes SQL column types usable in table descriptors. A-sql-pgcompat Semantic compatibility with PostgreSQL C-investigation Further steps needed to qualify. C-label will change.

Comments

@maddyblue
Copy link
Contributor

Did some research. The binary encoding for intervals sends the number of microseconds as an int64. The text encoding uses 6 digits after the .. For the text encoding we currently do weirdo things like "1s2ms3ns" which isn't at all what postgres does, and I'd honestly be surprised if that format is understandable by anything except lib/pq, where it happens to work because of Go's duration parsing.

We found exactly the same problems while exploring timestamps which is why we removed their nanoseconds and truncated to micros: nanos just aren't in the spec and things that support them only happen to by accident (usually in text decoding) but the binary spec doesn't support them at all.

So today if you ask for an interval that has nanoseconds you will get a different result in text or binary. This feels quite wrong to me and most definitely must be fixed. However the fix is not obvious and all of the options are bad options.

I think we should bite the bullet (as we should have done years ago just like timestamps):

  1. Change the text encoding to match postgres (this means removing its nanosecond output).
  2. Truncate all intervals to microsecond precision (just like timestamps).

The second point there is very hard because if we do it, then any existing on-disk intervals with nanoseconds are all of a sudden inaccessible by any equality queries in SQL, since there's now no way for a user to specify a time with nanoseconds, even though that's what's on disk.

One option is to teach extract_duration about nanoseconds to allow users to get them out and do whatever they need to, including resaving with microsecond precision.

I'm against a session setting that would increase the precision to 9 or whatever because then we have to keep it around forever, and it just doesn't work due to the binary encoding limitations.

@maddyblue maddyblue added C-investigation Further steps needed to qualify. C-label will change. A-sql-pgcompat Semantic compatibility with PostgreSQL labels Nov 4, 2018
@knz knz added the A-sql-datatypes SQL column types usable in table descriptors. label Nov 6, 2018
@knz
Copy link
Contributor

knz commented Nov 6, 2018

We can truncate all the existing intervals by ensuring that the rowfetcher truncates the nanosecond part when loading interval values.

To ensure a seamless transition we need to create a new KV encoding for intervals that only stores microseconds. The 2.2 executable would be able to read the old encoding (and truncate the nanosecond part) but would only write using the new encoding. We'd backport the decoder for the new encoding into 2.1 at the same time.

@knz
Copy link
Contributor

knz commented Nov 22, 2018

I just realized postgres also supports configuring the precision on intervals, see #32564.

@bobvawter
Copy link
Contributor

Based on a conversation with @mjibson:

We think it's unlikely that nanosecond-precision interval data exists on-disk in the real world, given that most (all?) PG client drivers would assume micro-precision intervals. In order to insert the data, one would have to have used a text-mode client, performed a string cast or parameter binding to interval, or performed some mathematical operation.

As such, we propose the following course of action:

  • Disk and in-memory encoding remains as 64-bit, nanosecond values. This doesn't reclaim any currently-wasted bits, but this isn't precluded from being changed in the future.
  • Always truncate intervals to microsecond precision when encoding intervals to datums.
  • Add and backport telemetry to detect truncation of nanos to micros when encoding datums (e.g. x % 1000 != x)
  • Add support for nanoseconds to the EXTRACT syntax as an escape hatch to allow any extant NS values to be retrieved.

Known risks to highlight in release notes:

  • Point lookups of nano-precision interval values would not find data if intervals are being used as a key. This seems unlikely.
  • A UNIQUE constraint on existing nano-precision data may appear to have duplicates when that data is rendered with microsecond precision.
  • Range lookups might suffer from fencepost issues.
  • Jitter in computations between new, micro-precision nodes and old, nano-precision nodes.
  • Mathematical operations involving intervals are still performed with nanosecond resolution.

@knz
Copy link
Contributor

knz commented Jan 8, 2019

Good discussion!

A couple of comments:

Always truncate intervals to microsecond precision when encoding intervals to datums.

Can you clarify this? Do you mean when converting a string to a dinterval? I thought the in-memory repr did not change?
Did you perhaps intend to mean "When encoding intervals to the KV representation"?

I would perhaps recommend to also truncate upon decoding, what do you think?

Add support for nanoseconds to the EXTRACT syntax as an escape hatch to allow any extant NS values to be retrieved.

Note: we don't currently support EXTRACT over intervals. I tried to do so in #27502 but that PR was not completed.

Range lookups might suffer from fencepost issues.

What is a fencepost issue?

Jitter in computations between new, micro-precision nodes and old, nano-precision nodes.

This sounds more like a real problem. Maybe this change could be gated by a cluster version bump to prevent this.

@bobvawter
Copy link
Contributor

Good discussion!

Thanks.

Always truncate intervals to microsecond precision when encoding intervals to datums.

Can you clarify this? Do you mean when converting a string to a dinterval? I thought the in-memory repr did not change?
Did you perhaps intend to mean "When encoding intervals to the KV representation"?

That's a better way to express it.

I would perhaps recommend to also truncate upon decoding, what do you think?

If we truncate on decoding, wouldn't that cause breakage when trying to update indexes? Actually, truncating when we encode the datum would also be problematic, right? If you had an index on an interval column with nanosecond precision, would we even be able to delete existing, nano-precision entries?

It seems like the behavior that we're after is that we want to truncate only those interval values received from a SQL client or new values that are getting sent to KV. It seems like we do need to be able to round-trip nano-precision intervals in and out of KV.

Maybe we do need a TInterval { micros: true } type. Ugh.

Range lookups might suffer from fencepost issues.

What is a fencepost issue?

It's shorthand for "a boundary-condition problem". The canonical example is to ask: If you're building a fence 10 meters long using 1-meter horizontal rails, how many vertical posts do you need?

Jitter in computations between new, micro-precision nodes and old, nano-precision nodes.

This sounds more like a real problem. Maybe this change could be gated by a cluster version bump to prevent this.

We thought about this, but it seems like the code would have to check that cluster-version any time an interval datum is processed; do you think this would this be unnecessarily expensive?

@knz
Copy link
Contributor

knz commented Jan 8, 2019

I thought about this a little more. The solution is to create a new encoding tag for interval values, i.e. a plain new encoding format.

Any value using the "old" (<=v2.1) format would be decoded with truncation; all new values (including all new index encodes/recodes) would use the new, truncated format. The new format would only be available upon bumping the cluster version to not confuse old nodes in mixed-version clusters.

@maddyblue
Copy link
Contributor Author

Did some searching in our archives and they are really interesting. #6604 was the PR that removed nanos from timestamps. #8864 was the PR that removed the nanosecond workaround functions. (Sadly, I also noted there that intervals should be handled, but never followed up. That would have saved a lot of future angst.) For example:

We changed cockroach to round to micros by default and added 3 functions that could create or display those nanoseconds

This choice ended up causing problems. Indeed extract USED to have a nanosecond option but it had problems so we chose to

remove the nanosecond extract functions because they now won't do what users expect

Maybe not a good idea to add it back. Here's what was in the release notes for that version:

TIMESTAMP values are now stored with microsecond precision instead of nanoseconds. All nanosecond-related functions have been removed. An existing table t with nanoseconds in timestamps of column s can round them to microseconds with UPDATE t SET s = s + '0s'. Note that this could potentially cause uniqueness problems if the timestamp is a primary key

So! We didn't change any encoding or decoding stuff. We just made a wrapper function that made it so that any time a new timestamp was created it would go through a wrapper function that would truncate the nanos. Any already-on-disk timestamps would be unchanged. We told users how to fix them if they wanted. I think we might be able to do the same thing here. That is:

  1. Limit any new intervals to micros (sql: support optional INTERVAL precision #32564)
  2. Tell users how to remove nanos from their old intervals.
  3. Change the pgwire text encoding to truncate nanos to micros (recall that this is what the original timestamp nano -> micro PR did). This will open up the possibility for bugs like cli: dump doesn't preserve nanoseconds in timestamps #8804 to recur. However this time we will learn from our past and instead of working around it with extra SQL functionality, we will just tell users to fix it with an UPDATE.

If users need to get their original nanos out they can convert them to a string and do whatever they need in their application, but we won't have anything that can maintain nanos.

@knz
Copy link
Contributor

knz commented Jan 14, 2019

ok sounds good

craig bot pushed a commit that referenced this issue Jan 31, 2019
34202: sql: remove nanoseconds from INTERVAL r=mjibson a=mjibson

Nanoseconds were not representable over pgwire binary mode and were being
truncated. We previously encountered this problem with timestamps (#6597)
and removed nanoseconds from timestamps at that time. We should have
done the same for intervals, since they have the same kind of problem,
but did not.

It is no longer possible to create intervals with nanosecond
precision. Parsing from string or converting from float or decimal will
round to the nearest microsecond. Similarly any arithmetic operation
(add, sub, mul, div) on intervals will also round to nearest micro. We
round instead of truncate because that's what Postgres does.

Existing on-disk intervals that contain nanoseconds will retain their
underlying value when doing encode/decode operations (so that indexes can
be correctly maintained. However there is no longer any way to retrieve
the nanosecond part. Converting to string, float, or decimal will first
round and then convert.

The reasoning for this restriction on existing on-disk nanoseconds is
related to the original bug, where we were truncating nanos to micros over
binary pgwire. The problem there was that depending on how you queried
the data (text or binary mode), you would get a different result, and
one of them was wrong. Similarly, it would be wrong to have the results
of an interval -> string conversion return a different result than just
querying the interval.

It is unfortunate that upgrading from 2.1 -> 2.2 will completely remove
the ability for users to continue accessing their nanoseconds. Due to
that, we must describe in the major release notes this change. Users who
require nanoseconds to be present will have to modify their application
to use a different data type before upgrading. Further, applications
that do comparisons on intervals may have some edge cases errors due to
rounding and seeming equality. That is, some intervals with nanos will
be rounded up to the next microsecond, possibly changing the results of
an existing query. Also, it is not possible to compare equality to any
existing interval with on-disk nanos. We believe the number of users
affected by this will be very small, and that it is still a necessary
change because of the unavoidable pgwire binary mode bug above, which
may already have been unknowningly affecting them.

Other implementations were worked on, like one where the user could
specify the desired precision of each operation (similary to how
timestamps work). This ended up being very tedious since there
are many operations and they all required the same microsecond
precision. Timestamps are different since there are some operations
that actually do need nanosecond precision, but intervals have no such
need. Thus, it was better to remove the precision argument and hard
code rounding. Another attempt was made to replace Nanos with Micros,
with an additional nanos field to hold on-disk nanoseconds. This had
difficult problems since all of our encoding infra uses nanoseconds
on disk. Converting the Micros field to nanos increased the possibilty
of overflow due multiplying by 1000. Handling the possibility of this
overflow in all possibly locations would require many large and risky
changes.

The implementation changes here are a bit odd and surprising at
first. This change leaves the duration.Nanos field, but (excepting the
Decode func) automatically rounds Nanos to nearist micro. This does leave
open the possible misuse of the Nanos field, since durations are created
directly instead of via a constructor. However, I think this problem is
less of a risk as the other attempts listed above.

See #6604 and #8864 for previous PRs and discussion about this problem
when we fixed it for timestamps.

Fixes #32143

Release note (sql change): INTERVAL values are now stored with microsecond
precision instead of nanoseconds. Existing intervals with nanoseconds
are no longer able to return their nanosecond part. An existing table t
with nanoseconds in intervals of column s can round them to the nearest
microsecond with `UPDATE t SET s = s + '0s'`. Note that this could
potentially cause uniqueness problems if the interval is a primary key.

Co-authored-by: Matt Jibson <matt.jibson@gmail.com>
@craig craig bot closed this as completed in #34202 Jan 31, 2019
craig bot pushed a commit that referenced this issue Mar 20, 2019
35980: sql: Reject INTERVAL nanoseconds r=bobvawter a=bobvawter

This change is a follow-up to #32143 and #34202.  This changes the parser to
reject the "nanoseconds" token when parsing an INTERVAL value.

Fixes: #35872

Release note: None

Co-authored-by: Bob Vawter <bob@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Mar 21, 2019
36001: backport-19.1: sql: Reject INTERVAL nanoseconds r=bobvawter a=bobvawter

This change is a follow-up to #32143 and #34202.  This changes the parser to
reject the "nanoseconds" token when parsing an INTERVAL value.

Fixes: #35872

Release note: None

Co-authored-by: Bob Vawter <bob@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-datatypes SQL column types usable in table descriptors. A-sql-pgcompat Semantic compatibility with PostgreSQL C-investigation Further steps needed to qualify. C-label will change.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants