Feature request: support for nanosecond timestamps #10822

Closed
bmerry opened this issue May 27, 2019 · 15 comments

Comments

@bmerry

bmerry commented May 27, 2019

Now that Elasticsearch 7.x supports nanosecond timestamps, it would be nice if they could pass through Logstash as well (in my case I want them just for better sorting). It looks like the Timestamp class is still based on Joda rather than Java 8 time, so it presumably only supports milliseconds.

@TomGudman
This is a feature we all need when dealing with a high throughput of logs.

Often we end up checking logs directly on the box because of the interleaving effect in Kibana that is inherent to millisecond-only precision.

I've also discovered that Logstash is now the bottleneck for nanoseconds, probably because of Joda Time. Since the ES 7.x announcement, I guess the whole community believes it's a done deal, but it's far from it.

Current state:

  • Elasticsearch 7.x : supports nanoseconds (date_nanos)
  • Logstash : status unknown :(
  • Filebeat : work in progress (PR)
  • Kibana : supports nanoseconds

@TomGudman

TomGudman commented Jan 10, 2020

@bmerry : Can you edit your issue title to add the word nanoseconds?

Google did not find this issue; I only found it via GitHub issue search.

If more people 👍 this request then it may get more traction.

In the meantime, we can only wait for some skilled programmers to improve the situation ;-)

@bmerry bmerry changed the title Feature request: support for high-precision timestamps Feature request: support for nanosecond timestamps Jan 10, 2020
@TomGudman

Thanks @bmerry. Now we just have to wait...

@weizhu-us

We would love to have this.

@th0ger

th0ger commented Jul 29, 2020

Any known workarounds (in Logstash, but without the date filter)?

@bmerry
Author

bmerry commented Jul 29, 2020

I've worked around it by adding an extra field called timestamp_precise to the GELF log records (in our Python logging framework, before they reach Logstash), and then using an Elasticsearch pipeline to use that field to overwrite @timestamp when available. I couldn't find a way to do this without modifying the source of the logs, because Logstash gobbles up the standard GELF timestamp before any filters have a chance to extract it. You might have better luck if you're extracting timestamps by grokking flat text logs.

Here's the Elasticsearch pipeline:

{
  "description": "transfers timestamp_precise to @timestamp if present",
  "processors": [
    {
      "set": {
        "field": "timestamp_coarse",
        "value": "{{@timestamp}}"
      }
    },
    {
      "remove": {
        "field": "@timestamp",
        "if": "ctx.containsKey('timestamp_precise')"
      }
    },
    {
      "rename": {
        "field": "timestamp_precise",
        "target_field": "@timestamp",
        "ignore_missing": true
      }
    }
  ]
}
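
For the extra precision to actually survive in Elasticsearch, @timestamp also needs a date_nanos mapping in the target index. Roughly, the wiring looks like this (a sketch; the pipeline id timestamp-precise and the index name my-logs are placeholders):

PUT _ingest/pipeline/timestamp-precise
{
  "description": "transfers timestamp_precise to @timestamp if present",
  "processors": [
    { "set": { "field": "timestamp_coarse", "value": "{{@timestamp}}" } },
    { "remove": { "field": "@timestamp", "if": "ctx.containsKey('timestamp_precise')" } },
    { "rename": { "field": "timestamp_precise", "target_field": "@timestamp", "ignore_missing": true } }
  ]
}

PUT my-logs
{
  "settings": { "index.default_pipeline": "timestamp-precise" },
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date_nanos" }
    }
  }
}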

@juan-domenech

I've got nanoseconds working on logs with this precision, with @timestamp and @timestamp_nanos fields coexisting and Kibana told to use the nanos field as the primary timestamp:
https://discuss.elastic.co/t/how-to-index-nanoseconds-precision-events-with-logstash-7-10-and-type-date-nanos/262029
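
The gist of it (a rough sketch, not the full config from that post; it assumes the raw nano-precision ISO8601 string arrives in a field named log_time, and that @timestamp_nanos is mapped as date_nanos in the index template):

filter {
  # Keep the original full-precision string alongside the millisecond @timestamp;
  # Elasticsearch parses it into a date_nanos field, and the Kibana index
  # pattern can then use @timestamp_nanos as its primary time field.
  mutate {
    copy => { "log_time" => "@timestamp_nanos" }
  }
}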

@yaauie
Member

yaauie commented Oct 28, 2021

Logstash 8.0 will have nanosecond precision internally, keeping nano precision on an inbound @timestamp field from upstream, or creating new events with the best available precision (on most JVMs and hardware, this is micros) (see: #12797).

As mentioned in this thread, the missing piece is still Logstash's ability to parse a nano-precise timestamp and keep its granularity, which is a limitation of the date filter because it is powered by the similarly-limited Joda Time.

I had hopes of adding precision to the date filter, but the subtle differences between Joda Time's format strings and those provided by java.time.format are enough that automatic translation is not feasible (see: Elasticsearch's guide for migrating to Java time, which outlines the differences).

To address this, I think our best course is to introduce a new nano-precise filter that only works with Java time format strings, so that a user who is looking to add precision to their pipeline approaches it in a manner that doesn't assume Joda patterns will magically work. I plan to create the specification for this new plugin in the coming week or two, and will add a link to it here.


In the meantime, on Logstash 8, it is possible to use the Ruby internals to either parse a strict ISO8601 timestamp or to convert a Ruby Time object into the LogStash::Timestamp object that we expect to be in the @timestamp field.
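
For example, a minimal inline sketch (assuming the nano-precision ISO8601 string arrives in a field named timestamp_precise):

filter {
  ruby {
    code => '
      require "time"
      raw = event.get("timestamp_precise")
      if raw
        # Time.iso8601 keeps the full fractional-second precision, and
        # LogStash::Timestamp can be built from a Ruby Time without losing it.
        event.set("@timestamp", LogStash::Timestamp.new(Time.iso8601(raw)))
      end
    '
  }
}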

I have created an unoptimized proof-of-concept precision-timestamp-parse-logstash-filter-ruby.rb that accepts one or more format strings:

  • ISO8601: a well-formatted ISO8601 string
  • UNIX: the number of seconds since the unix epoch
  • UNIX_MS: the number of milliseconds since the unix epoch
  • a valid Ruby Time#strptime format string, including %N nanos

For example, if the value in the source field were the separator-free string 20211028194357867631621, corresponding to the Ruby format string %Y%m%d%H%M%S%N, the following could parse it, as well as well-formatted ISO8601 inputs like 2021-10-28T19:43:57.867631621Z:

filter {
  ruby {
    path => "${RUBY_FILTERS}/precision-timestamp-parse.logstash-filter-ruby.rb"
    script_params => {
      source => "timestamp_precise"
      format => ["ISO8601", "%Y%m%d%H%M%S%N"]
    }
  }
}

@bmerry
Author

bmerry commented Oct 31, 2021

Great to hear that there is now some forward progress here. Any idea whether the workaround can be used with the gelf input plugin, or will the plugin need to be updated?

@yaauie
Member

yaauie commented Nov 9, 2021

Great to hear that there is now some forward progress here. Any idea whether the workaround can be used with the gelf input plugin, or will the plugin need to be updated?
-- #10822 (comment)

My reading of the code is that on Logstash 8, events created by the Gelf input will maintain the precision of what is available, since it uses the LogStash::Timestamp::at API to generate the timestamp from a numeric epoch, and LogStash::Timestamp maintains precision up to nanos. This means that no "workaround" will be needed.

@bmerry
Author

bmerry commented Nov 10, 2021

Thanks, that's great news.

@roaksoax
Contributor

roaksoax commented Dec 1, 2021

Given that Logstash 8.0 already supports nanosecond timestamps, I'll close this issue.

@roaksoax roaksoax closed this as completed Dec 1, 2021
@alfonsoveneziano

I've worked around it by adding an extra field called timestamp_precise to the GELF log records […] Here's the Elasticsearch pipeline: […]
-- #10822 (comment)

Hi, sorry for reopening the discussion. I've tried to use the pipeline with a timestamp_precise field that contains the date with nanosecond precision in text format (e.g. "timestamp_precise": "2022-10-28T19:43:57.867631621Z"), but it does not work for me.
Could you please help me figure out what is wrong?
Thanks
BRs, alfonso

@alfonsoveneziano

I've tried to use the pipeline with a timestamp_precise field that contains the date with nanosecond precision in text format (e.g. "timestamp_precise": "2022-10-28T19:43:57.867631621Z"), but it does not work for me.
-- #10822 (comment)

Hi,
I got the same result by adding a pipeline in Elasticsearch:

{
  "description": "mypipeline",
  "processors": [
    {
      "date": {
        "field": "timestamp_nano",
        "formats": ["ISO8601"],
        "output_format": "yyyy-MM-dd'T'HH:mm:ss.SSSSSSSSSZ"
      }
    }
  ]
}

The timestamp_nano field contains a timestamp with nanosecond precision.
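
The _simulate API is handy for checking what a pipeline actually produces (a sketch, combining the pipeline above with an illustrative document):

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "mypipeline",
    "processors": [
      {
        "date": {
          "field": "timestamp_nano",
          "formats": ["ISO8601"],
          "output_format": "yyyy-MM-dd'T'HH:mm:ss.SSSSSSSSSZ"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "timestamp_nano": "2022-10-28T19:43:57.867631621Z" } }
  ]
}

BRs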

@juan-domenech

For those waiting (like me) for the Date filter to handle nanosecond precision in 8.x, there are a couple of workarounds:

I hope it helps!
