302 response not archived? #137

Closed
edsu opened this issue Jan 31, 2023 · 1 comment · Fixed by #141
Labels
bug Something isn't working

Comments

Collaborator

edsu commented Jan 31, 2023

Over in the forum there is an example of an archived web page that links to an MP4, which in turn redirects to a new location. The redirect itself is not archived, but the content at the redirect target is. You can reproduce this problem by:

  1. Using ArchiveWebPage to start archiving https://www.govinfo.gov/collection/january-6th-committee-final-report
  2. Click on Supporting Materials - Video Exhibits
  3. Click on the MP4 button next to the first video EXH 103 - Behind the Scenes on Election Night House Select January 6th Committee Final Report...
  4. Once archiving is idle, stop recording and open the archive in ReplayWebPage
  5. Try to view the video that you archived, and notice that you get an Archived Page Not Found error.
  6. Look for https://www.govinfo.gov/content/pkg/GPO-J6-VIDEO-EXH-103/video/GPO-J6-VIDEO-EXH-103.mp4 in the CDX index and WARC data (they don't appear to be there).
  7. Notice that the resource that was redirected to is available: https://customer-uh7tqhki3bpanql6.cloudflarestream.com/815f9de14a4a3efc7cecaa21a955b060/watch
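
Steps 6 and 7 amount to checking which URLs have an entry in the index. A minimal sketch of that check in Python follows, assuming the index is in CDXJ form (one line per capture: a searchable key, a timestamp, and a JSON block with a `url` field). The helper names and the sample line are hand-written for illustration and are not real output from ArchiveWebPage.

```python
import json


def cdxj_urls(lines):
    """Yield the `url` field of each CDXJ line that carries a JSON block."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumed CDXJ layout: "<searchable key> <timestamp> <json>"
        parts = line.split(" ", 2)
        if len(parts) < 3:
            continue
        try:
            fields = json.loads(parts[2])
        except json.JSONDecodeError:
            continue
        if isinstance(fields, dict) and fields.get("url"):
            yield fields["url"]


def is_indexed(lines, target_url):
    """Return True if target_url has at least one entry in the index lines."""
    return any(url == target_url for url in cdxj_urls(lines))


# Hand-written sample mirroring what the issue describes: the redirect
# target has an entry, the redirecting MP4 URL does not.
REDIRECT_SOURCE = (
    "https://www.govinfo.gov/content/pkg/GPO-J6-VIDEO-EXH-103/"
    "video/GPO-J6-VIDEO-EXH-103.mp4"
)
REDIRECT_TARGET = (
    "https://customer-uh7tqhki3bpanql6.cloudflarestream.com/"
    "815f9de14a4a3efc7cecaa21a955b060/watch"
)
SAMPLE_INDEX = [
    "com,cloudflarestream,customer-uh7tqhki3bpanql6)/815f9de14a4a3efc7cecaa21a955b060/watch "
    "20230131000000 "
    + json.dumps({"url": REDIRECT_TARGET, "status": 200}),
]

if __name__ == "__main__":
    for url in (REDIRECT_SOURCE, REDIRECT_TARGET):
        print(url, "indexed" if is_indexed(SAMPLE_INDEX, url) else "missing")
```

Against a real archive, the same functions could be fed the lines of the index file extracted from the archive, assuming it uses this layout.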

You will notice that a new tab is opened when you click on the MP4 button, and that the server sends a redirect to a new location. Perhaps there is a hitch in how HTTP 302 responses are recorded when they result from a new tab opening? Or perhaps recording is confused by an HTTP 302 response that (erroneously) has a Content-Type of video/mp4?

$ curl -I https://www.govinfo.gov/content/pkg/GPO-J6-VIDEO-EXH-103/video/GPO-J6-VIDEO-EXH-103.mp4
HTTP/2 302
date: Tue, 31 Jan 2023 14:49:39 GMT
content-type: video/mp4
location: https://customer-uh7tqhki3bpanql6.cloudflarestream.com/815f9de14a4a3efc7cecaa21a955b060/watch
strict-transport-security: max-age=31536000; preload
x-frame-options: SAMEORIGIN
via: 1.1 www.govinfo.gov
x-content-type-options: nosniff
x-permitted-cross-domain-policies: none
content-security-policy: default-src 'self'  https://analytics.govinfo.gov https://stackpath.bootstrapcdn.com https://maxcdn.bootstrapcdn.com; script-src 'unsafe-inline' 'unsafe-eval' 'self'   https://stackpath.bootstrapcdn.com https://api.data.gov https://maxcdn.bootstrapcdn.com; object-src 'unsafe-inline' 'self' ; style-src 'unsafe-inline' 'self' https://maxcdn.bootstrapcdn.com https://api.data.gov  https://stackpath.bootstrapcdn.com https://fonts.googleapis.com; img-src 'unsafe-inline' 'self' http://insideanalytics.gpo.gov https://maxcdn.bootstrapcdn.com https://stackpath.bootstrapcdn.com  https://analytics.govinfo.gov data:; font-src 'unsafe-inline' 'self'  https://stackpath.bootstrapcdn.com https://api.data.gov https://maxcdn.bootstrapcdn.com https://fonts.gstatic.com https://fonts.googleapis.com; connect-src 'self' https://api.data.gov; frame-src 'self'  https://stackpath.bootstrapcdn.com https://maxcdn.bootstrapcdn.com;
cf-cache-status: DYNAMIC
server: cloudflare
cf-ray: 79234156ce919c2b-IAD
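
For anyone who wants to check other URLs for the same pattern, here is a small Python sketch that parses `curl -I` style output and reports whether the response is a redirect that also declares a media Content-Type. The function names are hypothetical, and the sample text is a trimmed copy of the headers above.

```python
REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def parse_head_response(text):
    """Split `curl -I` style output into (status_code, headers dict)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    # Status line looks like "HTTP/2 302" or "HTTP/1.1 302 Found".
    status_code = int(lines[0].split()[1])
    headers = {}
    for ln in lines[1:]:
        name, _, value = ln.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def redirect_summary(text):
    """Summarize the redirect-related parts of a HEAD response."""
    status, headers = parse_head_response(text)
    return {
        "status": status,
        "is_redirect": status in REDIRECT_STATUSES and "location" in headers,
        "location": headers.get("location"),
        "content_type": headers.get("content-type"),
    }


# Trimmed copy of the response headers quoted in this issue.
SAMPLE = """\
HTTP/2 302
date: Tue, 31 Jan 2023 14:49:39 GMT
content-type: video/mp4
location: https://customer-uh7tqhki3bpanql6.cloudflarestream.com/815f9de14a4a3efc7cecaa21a955b060/watch
server: cloudflare
"""

if __name__ == "__main__":
    print(redirect_summary(SAMPLE))
```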

I have an example here.

edsu added the bug label Jan 31, 2023
ikreymer added a commit that referenced this issue Feb 4, 2023
… a redirect, create an implicit redirect record instead of re-fetching in browser

async fetch: allow adding synthetic redirect response instead of trying in browser, if all that's needed is the redirect record itself
should fix #137
Member

ikreymer commented Feb 4, 2023

Fixed in 0.9.7 by adding the redirect record directly instead of re-fetching in the browser.
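
To illustrate what a synthetic redirect record could look like, here is a rough Python sketch that builds a minimal WARC `response` record for a 302 by hand. This is not the code from the referenced commit (the project itself is not written in Python); the function name, the choice of WARC/1.1, and the minimal header set are assumptions made for illustration.

```python
import uuid
from datetime import datetime, timezone


def synthetic_redirect_record(url, location, status=302, reason="Found"):
    """Build a minimal WARC response record describing a redirect.

    Hypothetical helper: only the headers needed to express the
    redirect are included.
    """
    # The record block is the HTTP response itself: status line,
    # headers, blank line, and an empty body.
    http_block = (
        f"HTTP/1.1 {status} {reason}\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    ).encode("utf-8")

    warc_date = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    warc_headers = (
        "WARC/1.1\r\n"
        "WARC-Type: response\r\n"
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>\r\n"
        f"WARC-Date: {warc_date}\r\n"
        f"WARC-Target-URI: {url}\r\n"
        "Content-Type: application/http; msgtype=response\r\n"
        f"Content-Length: {len(http_block)}\r\n"
        "\r\n"
    ).encode("utf-8")

    # A WARC record ends with two CRLFs after the block.
    return warc_headers + http_block + b"\r\n\r\n"


if __name__ == "__main__":
    record = synthetic_redirect_record(
        "https://www.govinfo.gov/content/pkg/GPO-J6-VIDEO-EXH-103/video/GPO-J6-VIDEO-EXH-103.mp4",
        "https://customer-uh7tqhki3bpanql6.cloudflarestream.com/815f9de14a4a3efc7cecaa21a955b060/watch",
    )
    print(record.decode("utf-8"))
```

With a record like this in the archive, a replay lookup for the original MP4 URL would find the 302 and could follow its `Location` to the content that was already captured.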
