"Query log" for a pool #4034

Open
philrz opened this issue Jul 26, 2022 · 2 comments · Fixed by #4385

Comments

Contributor

philrz commented Jul 26, 2022

A community user recently asked:

Do you know how we could easily keep a log of queries issued on a given lake in some sort of 'query log' pool?

Indeed, the architecture easily supports this, and the Dev team has also recognized the need for it. Beyond convenience for users who want to see history, there is an internal motivation: zync is the first tool other than the Brim/Zui app that generates many queries that are opaque to the user. For example, different YAML files might cause different queries to be generated, and these will perform differently. A "query log" would help here by saving such things as:

  • Original query text
  • The DAG
  • Query start/end time (so we know how long the query ran)
  • How much of the query could be served from cache
  • The size of the query result

This data could then be fed into a Pareto chart that would help us understand which queries most need help from enhancements in the query optimizer.
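
As a rough illustration of the idea above, here is a minimal Python sketch of what a query log record and a Pareto-style ranking over such records might look like. Nothing here reflects an actual Zed schema: the `QueryLogEntry` class, its field names, and the `pareto` helper are all hypothetical, and ranking by total elapsed time is just one plausible choice of cost measure.

```python
from dataclasses import dataclass


@dataclass
class QueryLogEntry:
    """One hypothetical query log record; all field names are illustrative."""
    query: str                   # original query text
    start: float                 # query start time, seconds since the epoch
    end: float                   # query end time, seconds since the epoch
    dag: str = ""                # serialized DAG (format is an assumption)
    cache_fraction: float = 0.0  # share of the query served from cache
    result_bytes: int = 0        # size of the query result

    @property
    def elapsed(self) -> float:
        """How long the query ran, in seconds."""
        return self.end - self.start


def pareto(entries):
    """Return (query, total_seconds, cumulative_share) rows, costliest first."""
    totals = {}
    for e in entries:
        totals[e.query] = totals.get(e.query, 0.0) + e.elapsed
    grand_total = sum(totals.values())
    rows, running = [], 0.0
    for query, seconds in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        running += seconds
        share = running / grand_total if grand_total else 0.0
        rows.append((query, seconds, share))
    return rows
```

The rows at the top of the result, where the cumulative share climbs fastest, would be the queries most likely to benefit from optimizer work.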

philrz changed the title from "Query log" pool to "Query log" for a pool on Jul 26, 2022
Contributor Author

philrz commented Oct 2, 2022

The user who originally requested this feature thought of it again after we showed them a demo of the Remote Queries feature. In that case, the queries are stored in a pool as a result of an explicit user action. But they recognized that it would not be difficult to extend this by storing all executed queries in such a pool, along with the additional metadata described above.

A SQL user, they also thought of https://www.postgresql.org/docs/current/sql-explain.html as something to mimic.

philrz linked a pull request on Feb 17, 2023 that will close this issue
Contributor Author

philrz commented Feb 17, 2023

Short of the deluxe "explain"-style treatment described above, linked PR #4385 provides an initial baby step by including the Zed query in the lake log at debug level. The following shows this at Zed commit bbc9c4d.

Starting the lake:

$ zed -version
Version: v1.5.0-33-gbbc9c4d4

$ zed -lake lake serve -log.level=debug
{"level":"info","ts":1676652934.2631412,"logger":"core","msg":"Started"}
{"level":"info","ts":1676652934.263395,"logger":"httpd","msg":"Listening","addr":"[::]:9867"}
...

Running a query from another shell:

$ zed create foo
pool created: foo 2LsHc5NaGzyegpm2qtctiBjy6eo

$ zed -use foo load sample.zng 
(2/1) 4910B/4910B 4910B/s 100.00%
2LsHcmtEY8eO6Fy9ORBbpscr2vd committed

$ zed query 'from foo | count()'
{count:31(uint64)}

How it looks in the Zed lake log:

...
{"level":"debug","ts":1676653003.791039,"logger":"http.access","msg":"Request started","request_id":"2LsHjIEL8bpoaCSykJO7o91mNrR","host":"localhost:9867","method":"POST","proto":"HTTP/1.1","remote_addr":"127.0.0.1:52946","request_content_length":-1,"url":"/query?ctrl=T"}
{"level":"debug","ts":1676653003.791219,"logger":"core","msg":"Running Query","request_id":"2LsHjIEL8bpoaCSykJO7o91mNrR","query":"from foo | count()"}
{"level":"info","ts":1676653003.7959218,"logger":"http.access","msg":"Request completed","request_id":"2LsHjIEL8bpoaCSykJO7o91mNrR","host":"localhost:9867","method":"POST","proto":"HTTP/1.1","remote_addr":"127.0.0.1:52946","request_content_length":-1,"url":"/query?ctrl=T","elapsed":0.004865553,"response_content_length":239,"status_code":200}
...
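
Until a real query log exists, the debug output above can be mined for a crude one. The following Python sketch joins each "Running Query" record with its "Request completed" record on `request_id`, yielding the query text and elapsed time. It relies only on the field names visible in the log lines above (`msg`, `request_id`, `query`, `elapsed`); the function name `queries_from_log` is made up for illustration, and it assumes the server output has been captured one JSON record per line.

```python
import json


def queries_from_log(lines):
    """Yield (query, elapsed_seconds) pairs by joining records on request_id."""
    pending = {}  # request_id -> query text from a "Running Query" record
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and non-JSON lines such as "..."
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed records
        rid = rec.get("request_id")
        if rec.get("msg") == "Running Query":
            pending[rid] = rec.get("query")
        elif rec.get("msg") == "Request completed" and rid in pending:
            yield pending.pop(rid), rec.get("elapsed")
```

Passing an open file of captured log output to `queries_from_log` and iterating over the result gives one pair per completed query. Requests that never logged a "Running Query" record are ignored.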
