fix: try to serve assets from unencoded and encoded paths #6728

Merged
merged 7 commits into from
Sep 23, 2024
757 changes: 10 additions & 747 deletions fixtures/asset-config/html-handling.test.ts
GregBrimble marked this conversation as resolved.

Large diffs are not rendered by default.

631 changes: 631 additions & 0 deletions fixtures/asset-config/test-cases/encoding-test-cases.ts

Large diffs are not rendered by default.

788 changes: 788 additions & 0 deletions fixtures/asset-config/test-cases/html-handling-test-cases.ts

Large diffs are not rendered by default.

9 changes: 4 additions & 5 deletions packages/miniflare/src/plugins/assets/index.ts
Expand Up @@ -3,12 +3,12 @@ import fs from "node:fs/promises";
import path from "node:path";
import {
CONTENT_HASH_OFFSET,
encodeFilePath,
ENTRY_SIZE,
getContentType,
HEADER_SIZE,
MAX_ASSET_COUNT,
MAX_ASSET_SIZE,
normalizeFilePath,
PATH_HASH_OFFSET,
PATH_HASH_SIZE,
} from "@cloudflare/workers-shared";
Expand Down Expand Up @@ -239,10 +239,9 @@ const walk = async (dir: string) => {
*/

const [pathHash, contentHash] = await Promise.all([
hashPath(encodeFilePath(relativeFilepath, path.sep)),
hashPath(
encodeFilePath(filepath, path.sep) + filestat.mtimeMs.toString()
),
hashPath(normalizeFilePath(relativeFilepath)),
// used absolute filepath here so that changes to the enclosing asset folder will be registered
hashPath(filepath + filestat.mtimeMs.toString()),
Review comment from the PR author:

Note that what goes into contentHash doesn't actually matter, as long as it is uniquely associated with that filepath and changes whenever the file changes. I removed the filepath encoding because it was unnecessary and because normalizeFilePath() now expects a relative filepath.

]);
manifest.push({
pathHash,
Expand Down
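
The review comment above says the input to the content hash only has to be unique per file and change when the file changes. A minimal sketch of that property follows. It assumes SHA-256 from `node:crypto` as a stand-in for miniflare's `hashPath` (whose implementation is not shown in this diff), and `changeKey` is a hypothetical name used only for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derives a key from the absolute path plus the mtime.
// SHA-256 is an assumption here, standing in for the real hashPath().
const changeKey = (absolutePath: string, mtimeMs: number): string =>
	createHash("sha256")
		.update(absolutePath + mtimeMs.toString())
		.digest("hex")
		.slice(0, 32);

const before = changeKey("/site/public/index.html", 1727000000000);
const sameFile = changeKey("/site/public/index.html", 1727000000000);
const afterEdit = changeKey("/site/public/index.html", 1727000005000);
const movedFolder = changeKey("/site/dist/index.html", 1727000000000);

console.log(before === sameFile); // untouched file: same key
console.log(before !== afterEdit); // new mtime: new key
console.log(before !== movedFolder); // different enclosing folder: new key
```

Because the absolute path is part of the input, pointing the same relative file at a different asset folder also produces a new key, which matches the inline code comment in the diff.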
150 changes: 129 additions & 21 deletions packages/workers-shared/asset-worker/src/handler.ts
Expand Up @@ -18,7 +18,9 @@ export const handleRequest = async (
) => {
const { pathname, search } = new URL(request.url);

const intent = await getIntent(pathname, configuration, exists);
const decodedPathname = decodePath(pathname);
const intent = await getIntent(decodedPathname, configuration, exists);

if (!intent) {
return new NotFoundResponse();
}
Expand All @@ -30,9 +32,19 @@ export const handleRequest = async (
if (!["GET", "HEAD"].includes(method)) {
return new MethodNotAllowedResponse();
}
if (intent.redirect) {
return new TemporaryRedirectResponse(intent.redirect + search);

const decodedDestination = intent.redirect ?? decodedPathname;
const encodedDestination = encodePath(decodedDestination);

/**
* The canonical path we serve an asset at is the decoded and re-encoded version.
* Thus we need to redirect if that is different from the path that was originally requested.
* We combine this with other redirects (e.g. for html_handling) to avoid multiple redirects.
*/
if (encodedDestination !== pathname || intent.redirect) {
return new TemporaryRedirectResponse(encodedDestination + search);
}

if (!intent.asset) {
return new InternalServerErrorResponse(new Error("Unknown action"));
}
Expand Down Expand Up @@ -114,7 +126,10 @@ const htmlHandlingAutoTrailingSlash = async (
if (pathname.endsWith("/index")) {
if (exactETag) {
// there's a binary /index file
return { asset: { eTag: exactETag, status: 200 }, redirect: null };
return {
asset: { eTag: exactETag, status: 200 },
redirect: null,
};
} else {
if (
(redirectResult = await safeRedirect(
Expand Down Expand Up @@ -167,7 +182,10 @@ const htmlHandlingAutoTrailingSlash = async (
} else if (pathname.endsWith("/")) {
if ((eTagResult = await exists(`${pathname}index.html`))) {
// /foo/index.html exists so serve at /foo/
return { asset: { eTag: eTagResult, status: 200 }, redirect: null };
return {
asset: { eTag: eTagResult, status: 200 },
redirect: null,
};
} else if (
(redirectResult = await safeRedirect(
`${pathname.slice(0, -"/".length)}.html`,
Expand Down Expand Up @@ -208,10 +226,16 @@ const htmlHandlingAutoTrailingSlash = async (

if (exactETag) {
// there's a binary /foo file
return { asset: { eTag: exactETag, status: 200 }, redirect: null };
return {
asset: { eTag: exactETag, status: 200 },
redirect: null,
};
} else if ((eTagResult = await exists(`${pathname}.html`))) {
// foo.html exists so serve at /foo
return { asset: { eTag: eTagResult, status: 200 }, redirect: null };
return {
asset: { eTag: eTagResult, status: 200 },
redirect: null,
};
} else if (
(redirectResult = await safeRedirect(
`${pathname}/index.html`,
Expand Down Expand Up @@ -240,7 +264,10 @@ const htmlHandlingForceTrailingSlash = async (
if (pathname.endsWith("/index")) {
if (exactETag) {
// there's a binary /index file
return { asset: { eTag: exactETag, status: 200 }, redirect: null };
return {
asset: { eTag: exactETag, status: 200 },
redirect: null,
};
} else {
if (
(redirectResult = await safeRedirect(
Expand Down Expand Up @@ -293,12 +320,18 @@ const htmlHandlingForceTrailingSlash = async (
} else if (pathname.endsWith("/")) {
if ((eTagResult = await exists(`${pathname}index.html`))) {
// /foo/index.html exists so serve at /foo/
return { asset: { eTag: eTagResult, status: 200 }, redirect: null };
return {
asset: { eTag: eTagResult, status: 200 },
redirect: null,
};
} else if (
(eTagResult = await exists(`${pathname.slice(0, -"/".length)}.html`))
) {
// /foo.html exists so serve at /foo/
return { asset: { eTag: eTagResult, status: 200 }, redirect: null };
return {
asset: { eTag: eTagResult, status: 200 },
redirect: null,
};
}
} else if (pathname.endsWith(".html")) {
if (
Expand All @@ -314,7 +347,10 @@ const htmlHandlingForceTrailingSlash = async (
return redirectResult;
} else if (exactETag) {
// there's both /foo.html and /foo/index.html so we serve /foo.html at /foo.html only
return { asset: { eTag: exactETag, status: 200 }, redirect: null };
return {
asset: { eTag: exactETag, status: 200 },
redirect: null,
};
} else if (
(redirectResult = await safeRedirect(
`${pathname.slice(0, -".html".length)}/index.html`,
Expand All @@ -331,7 +367,10 @@ const htmlHandlingForceTrailingSlash = async (

if (exactETag) {
// there's a binary /foo file
return { asset: { eTag: exactETag, status: 200 }, redirect: null };
return {
asset: { eTag: exactETag, status: 200 },
redirect: null,
};
} else if (
(redirectResult = await safeRedirect(
`${pathname}.html`,
Expand Down Expand Up @@ -371,7 +410,10 @@ const htmlHandlingDropTrailingSlash = async (
if (pathname.endsWith("/index")) {
if (exactETag) {
// there's a binary /index file
return { asset: { eTag: exactETag, status: 200 }, redirect: null };
return {
asset: { eTag: exactETag, status: 200 },
redirect: null,
};
} else {
if (pathname === "/index") {
if (
Expand Down Expand Up @@ -436,7 +478,10 @@ const htmlHandlingDropTrailingSlash = async (
return redirectResult;
} else if (exactETag) {
// there's both /foo.html and /foo/index.html so we serve /foo/index.html at /foo/index.html only
return { asset: { eTag: exactETag, status: 200 }, redirect: null };
return {
asset: { eTag: exactETag, status: 200 },
redirect: null,
};
} else if (
(redirectResult = await safeRedirect(
`${pathname.slice(0, -"/index.html".length)}.html`,
Expand All @@ -453,7 +498,10 @@ const htmlHandlingDropTrailingSlash = async (
if (pathname === "/") {
if ((eTagResult = await exists("/index.html"))) {
// /index.html exists so serve at /
return { asset: { eTag: eTagResult, status: 200 }, redirect: null };
return {
asset: { eTag: eTagResult, status: 200 },
redirect: null,
};
}
} else if (
(redirectResult = await safeRedirect(
Expand Down Expand Up @@ -506,13 +554,22 @@ const htmlHandlingDropTrailingSlash = async (

if (exactETag) {
// there's a binary /foo file
return { asset: { eTag: exactETag, status: 200 }, redirect: null };
return {
asset: { eTag: exactETag, status: 200 },
redirect: null,
};
} else if ((eTagResult = await exists(`${pathname}.html`))) {
// /foo.html exists so serve at /foo
return { asset: { eTag: eTagResult, status: 200 }, redirect: null };
return {
asset: { eTag: eTagResult, status: 200 },
redirect: null,
};
} else if ((eTagResult = await exists(`${pathname}/index.html`))) {
// /foo/index.html exists so serve at /foo
return { asset: { eTag: eTagResult, status: 200 }, redirect: null };
return {
asset: { eTag: eTagResult, status: 200 },
redirect: null,
};
}

return notFound(pathname, configuration, exists);
Expand All @@ -525,7 +582,10 @@ const htmlHandlingNone = async (
): Promise<Intent> => {
const exactETag = await exists(pathname);
if (exactETag) {
return { asset: { eTag: exactETag, status: 200 }, redirect: null };
return {
asset: { eTag: exactETag, status: 200 },
redirect: null,
};
} else {
return notFound(pathname, configuration, exists);
}
Expand All @@ -540,7 +600,10 @@ const notFound = async (
case "single-page-application": {
const eTag = await exists("/index.html");
if (eTag) {
return { asset: { eTag, status: 200 }, redirect: null };
return {
asset: { eTag, status: 200 },
redirect: null,
};
}
return null;
}
Expand All @@ -550,7 +613,10 @@ const notFound = async (
cwd = cwd.slice(0, cwd.lastIndexOf("/"));
const eTag = await exists(`${cwd}/404.html`);
if (eTag) {
return { asset: { eTag, status: 404 }, redirect: null };
return {
asset: { eTag, status: 404 },
redirect: null,
};
}
}
return null;
Expand All @@ -575,6 +641,7 @@ const safeRedirect = async (

if (!(await exists(destination))) {
const intent = await getIntent(destination, configuration, exists, true);
// return only if the eTag matches - i.e. not the 404 case
if (intent?.asset && intent.asset.eTag === (await exists(file))) {
return {
asset: null,
Expand All @@ -585,3 +652,44 @@ const safeRedirect = async (

return null;
};
/**
*
 * +===========================================+===========+======================+
 * | character type                            | fetch()   | encodeURIComponent() |
 * +===========================================+===========+======================+
 * | unreserved ASCII e.g. a-z                 | unchanged | unchanged            |
 * +-------------------------------------------+-----------+----------------------+
 * | reserved (sometimes encoded)              | unchanged | encoded, except      |
 * | e.g. [ ] @ $ ! ' ( ) * + , ; = : ? # & %  |           | ! ' ( ) *            |
 * +-------------------------------------------+-----------+----------------------+
 * | non-ASCII e.g. ü, and space               | encoded   | encoded              |
 * +-------------------------------------------+-----------+----------------------+
*
* 1. Decode incoming path to handle non-ASCII characters or optionally encoded characters (e.g. square brackets)
* 2. Match decoded path to manifest
* 3. Re-encode the path and redirect if the re-encoded path is different from the original path
*
* If the user uploads a file that is already URL-encoded, that is accessible only at the (double) encoded path.
* e.g. /%5Bboop%5D.html is served at /%255Bboop%255D only
*
* */

/**
* Decode all incoming paths to ensure that we can handle paths with non-ASCII characters.
*/
const decodePath = (pathname: string) => {
return pathname
.split("/")
.map((x) => decodeURIComponent(x))
.join("/");
};
/**
* Use the encoded path as the canonical path for sometimes-encoded characters
* e.g. /[boop] -> /%5Bboop%5D 307
*/
const encodePath = (pathname: string) => {
return pathname
.split("/")
.map((x) => encodeURIComponent(x))
.join("/");
};
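
The three steps in the comment block above (decode, match, re-encode and redirect) can be sketched as below. This is only an illustration under stated assumptions: `resolve` and `Resolution` are hypothetical names, the manifest lookup and `html_handling` redirects are left out, and `example.com` is a placeholder host. The two path helpers mirror `decodePath` and `encodePath` from the diff.

```typescript
// Same per-segment decode/encode as decodePath()/encodePath() in the diff.
const decodePath = (pathname: string): string =>
	pathname
		.split("/")
		.map((segment) => decodeURIComponent(segment))
		.join("/");

const encodePath = (pathname: string): string =>
	pathname
		.split("/")
		.map((segment) => encodeURIComponent(segment))
		.join("/");

type Resolution = { lookupPath: string; redirectTo: string | null };

// Hypothetical helper: returns the decoded path used for the manifest lookup
// and, if the request was not already in canonical form, the redirect target.
const resolve = (requestUrl: string): Resolution => {
	const { pathname, search } = new URL(requestUrl);
	const lookupPath = decodePath(pathname);
	const canonical = encodePath(lookupPath);
	return {
		lookupPath,
		redirectTo: canonical !== pathname ? canonical + search : null,
	};
};

// Square brackets are left alone by URL parsing, so the request is redirected.
console.log(resolve("https://example.com/[boop]?x=1"));
// Already canonical: looked up as /[boop], no redirect.
console.log(resolve("https://example.com/%5Bboop%5D"));
// A file uploaded with a literal "%5B" in its name is reached via "%255B".
console.log(resolve("https://example.com/%255Bboop%255D"));
```

Folding the canonicalisation check into the same branch as `intent.redirect`, as the handler does, means a request needing both an `html_handling` redirect and re-encoding is answered with a single 307.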
21 changes: 7 additions & 14 deletions packages/workers-shared/utils/helpers.ts
@@ -1,22 +1,15 @@
import { isAbsolute, sep } from "node:path";
import { getType } from "mime";

/** normalises sep for windows, and encodes each segment */
export const encodeFilePath = (filePath: string, sep: string) => {
const encodedPath = filePath
.split(sep)
.map((segment) => encodeURIComponent(segment))
.join("/");
/** normalises sep for windows and prefixes with `/` */
export const normalizeFilePath = (relativeFilepath: string) => {
if (isAbsolute(relativeFilepath)) {
throw new Error(`Expected relative path`);
}
const encodedPath = relativeFilepath.split(sep).join("/");
return "/" + encodedPath;
};

/** reverses encodeFilePath for accessing from file system */
export const decodeFilePath = (filePath: string, sep: string) => {
return filePath
.split("/")
.map((segment) => decodeURIComponent(segment))
.join(sep);
};

export const getContentType = (absFilePath: string) => {
let contentType = getType(absFilePath) || "application/octet-stream";
if (contentType.startsWith("text/") && !contentType.includes("charset")) {
Expand Down
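
To illustrate what `normalizeFilePath` is meant to do, here is a small sketch. It assumes the guard is intended to reject absolute paths, as the error message suggests. `normalizeWith` is a hypothetical variant that takes the path flavour as a parameter so both separators can be tried on any OS, whereas the function in the diff uses the platform's own `sep` and `isAbsolute`.

```typescript
import * as path from "node:path";

// Hypothetical variant of normalizeFilePath(): the path flavour is injected
// so Windows and POSIX behaviour can both be exercised on one machine.
const normalizeWith = (
	relativeFilepath: string,
	flavour: path.PlatformPath
): string => {
	if (flavour.isAbsolute(relativeFilepath)) {
		throw new Error("Expected relative path");
	}
	// Only the separator is normalised; no URL encoding is applied.
	return "/" + relativeFilepath.split(flavour.sep).join("/");
};

console.log(normalizeWith("béëp\\boo^p.txt", path.win32)); // "/béëp/boo^p.txt"
console.log(normalizeWith("boop/file#1.txt", path.posix)); // "/boop/file#1.txt"
```

Characters such as `^`, `#`, and non-ASCII letters pass through untouched, which is why the manifest keys in the updated deploy test are no longer percent-encoded.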
2 changes: 1 addition & 1 deletion packages/workers-shared/utils/tsconfig.json
Expand Up @@ -4,7 +4,7 @@
"lib": ["es2021"],
"module": "NodeNext",
"moduleResolution": "nodenext",
"types": ["@cloudflare/workers-types/experimental"],
"types": ["@cloudflare/workers-types/experimental", "@types/node"],
"noEmit": true,
"isolatedModules": true,
"allowSyntheticDefaultImports": true,
Expand Down
4 changes: 2 additions & 2 deletions packages/wrangler/src/__tests__/deploy.test.ts
Expand Up @@ -4427,11 +4427,11 @@ addEventListener('fetch', event => {});`
expect(manifestBodies.length).toBe(1);
expect(manifestBodies[0]).toEqual({
manifest: {
"/b%C3%A9%C3%ABp/boo%5Ep.txt": {
"/béëp/boo^p.txt": {
hash: "ff5016e92f039aa743a4ff7abb3180fa",
size: 17,
},
"/boop/file%231.txt": {
"/boop/file#1.txt": {
hash: "7574a8cd3094a050388ac9663af1c1d6",
size: 17,
},
Expand Down
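
The updated expectation above stores manifest keys as raw, unencoded paths. The sketch below shows how an encoded request path would map onto such keys, using the same per-segment decoding as `decodePath` in the asset worker. `manifestKeys` and `toManifestKey` are hypothetical names, and the real manifest is keyed by path hash rather than by a plain string set.

```typescript
// Keys as they appear in the updated test expectation.
const manifestKeys = new Set(["/béëp/boo^p.txt", "/boop/file#1.txt"]);

// Per-segment decoding, so an encoded "/" inside a segment would not be
// confused with a path separator during the split.
const toManifestKey = (requestPath: string): string =>
	requestPath
		.split("/")
		.map((segment) => decodeURIComponent(segment))
		.join("/");

const requests = [
	"/b%C3%A9%C3%ABp/boo%5Ep.txt",
	"/boop/file%231.txt",
	"/boop/missing.txt",
];

for (const requestPath of requests) {
	const key = toManifestKey(requestPath);
	console.log(requestPath, "->", key, manifestKeys.has(key));
}
```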