Add a view for HTTP headers #25
This commit adds a migration that creates a view of the HTTP headers in the response table. Once the view is in place you can run a query like this without requiring JSON parsing:

```sql
SELECT warc_record_id, name, value FROM http_headers;
```

It can be helpful for identifying things like:

```sql
SELECT value, COUNT(*) AS count
FROM http_header
WHERE name = 'content-type'
GROUP BY value
ORDER BY count DESC;

value                              count
---------------------------------  -----
application/javascript             57
image/png                          11
text/css                           7
text/html; charset=utf-8           6
image/jpeg                         4
image/gif                          4
text/fragment+html; charset=utf-8  3
image/svg+xml                      3
text/plain                         2
text/html; charset=UTF-8           1
```

Closes Florents-Tselai#24
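For readers who want to see how a view like this can replace per-query JSON parsing, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the migration from this PR. The `response` table layout, the `http_headers` column holding a JSON array of `[name, value]` pairs, the lowercasing of header names, and the view name `v_response_http_header` are all assumptions made for illustration. It also assumes the SQLite build has the JSON functions (`json_each`, `json_extract`) available.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: the real response table and its columns may differ.
conn.execute(
    "CREATE TABLE response (warc_record_id TEXT PRIMARY KEY, http_headers TEXT)"
)

# Assumption: headers are stored as a JSON array of [name, value] pairs.
rows = [
    ("rec-1", json.dumps([["Content-Type", "text/html; charset=utf-8"],
                          ["Server", "nginx"]])),
    ("rec-2", json.dumps([["Content-Type", "image/png"]])),
    ("rec-3", json.dumps([["Content-Type", "image/png"]])),
]
conn.executemany("INSERT INTO response VALUES (?, ?)", rows)

# The view expands each JSON header pair into its own row, so later
# queries can filter on name/value without touching JSON functions.
conn.execute("""
    CREATE VIEW v_response_http_header AS
    SELECT
        response.warc_record_id AS warc_record_id,
        LOWER(json_extract(header.value, '$[0]')) AS name,
        json_extract(header.value, '$[1]') AS value
    FROM response, json_each(response.http_headers) AS header
""")

# Same shape as the content-type query in the description above.
counts = conn.execute("""
    SELECT value, COUNT(*) AS count
    FROM v_response_http_header
    WHERE name = 'content-type'
    GROUP BY value
    ORDER BY count DESC, value
""").fetchall()

print(counts)  # [('image/png', 2), ('text/html; charset=utf-8', 1)]
```

Lowercasing the name inside the view is one way to make `WHERE name = 'content-type'` match regardless of how the server capitalized the header. If names are already normalized when records are stored, the `LOWER` call would be unnecessary.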
Two comments
Ok, I can adjust the view name and the docs. Our other table names are singular, not plural. I think we should be consistent, and don't have a strong preference either way. Do you?
Add a similar view for HTTP requests. Prefix the view names with `v_` to distinguish them in the schema from actual tables. Also add a description of each view with a table that defines its columns.
That was a good idea to treat requests the same, since they have HTTP headers as well. I've updated this PR to create a view for request records as well, renamed both views to use the Let me know if you have a preference for singular or plural table/view names.
Looks beautiful! No strong preference for singular / plural; let's keep it singular. No problem.
CI let me know I need to reformat :-)