Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The strptime datetime parser doesn't support %s #3556

Closed
magnalite opened this issue Jun 15, 2023 · 4 comments
Closed

The strptime datetime parser doesn't support %s #3556

magnalite opened this issue Jun 15, 2023 · 4 comments
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@magnalite
Copy link
Collaborator

magnalite commented Jun 15, 2023

Data I want to ingest has ended up with timestamps formatted as seconds.milliseconds eg 1686764678.9197. I could transform these into ints however for large volumes of data that can be painful. I would instead like to parse it using the strptime parser via "%s.%f" however it doesn't recognise %s (a unix timestamp in seconds).

@magnalite magnalite added the enhancement New feature or request label Jun 15, 2023
@fulmicoton
Copy link
Contributor

Also, I'd love to have a sandbox in the UI for people to copy and paste their stuff and get visual feedback of how it is interpreted by quickwit.

@fulmicoton
Copy link
Contributor

fulmicoton commented Jun 16, 2023

Ok so after investigating, ideally this means adding support for "%s" in time-fmt.

Just to be sure, can you confirm that your data is fed as a JsonString (as opposed to JsonNumber)

@fulmicoton fulmicoton added good first issue Good for newcomers help wanted Extra attention is needed labels Jun 16, 2023
@magnalite
Copy link
Collaborator Author

magnalite commented Jun 16, 2023

Just to be sure, can you confirm that your data is fed as a JsonString (as opposed to JsonNumber)

I believe it is a JsonNumber. My data looks like this in its raw form
{ "timestamp":1686764741.2561002, ... }

I now realise that this is actually also beyond millisecond resolution too which is unclear to me if %f is happy to deal with or if it can only be milliseconds, eg needs to be truncated to 1686764741.256.

I am wondering if perhaps it would be more appropriate to add a further datetime type unix_timestamp_f64 which accepts unix time in seconds as a f64 (but then internally converted/used by quickwit as i64 ofcourse), maybe in addition to support for %s. However it would be good to add a warning in documentation to avoid using floats for timestamps as float precision is not guaranteed in json and can lead to data inaccuracies. For example if they have a json library which for whatever reason used floats instead of a doubles it would be wildly inaccurate but valid as per the json spec.

This is not something I had truly considered the repercussions of before now as I thought every json library would respect them being doubles, and in practice most do, but this is where I am at. As quickwit sees wider adoption I can see more people ending up in this scenario and instead of saying "you should just use ints" I believe it would be better to give guidance along the lines of "you really shouldn't use floats because xyz but if you really want to you can use unix_timestamp_f64".

@guilload guilload self-assigned this Jun 20, 2023
@guilload
Copy link
Member

Closed via #3639.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants