Editorial: Have the multipart/form-data encoding algorithm use "encode" #6141

Merged: 2 commits, Nov 11, 2020
74 changes: 29 additions & 45 deletions source
@@ -56547,59 +56547,43 @@ fur

<!-- https://hixie.ch/tests/adhoc/html/forms/submission/multipart_form-data/ -->

<!-- NOTE: This algorithm is also used by the XHR spec -->
<!-- NOTE: This algorithm is also used by
https://fetch.spec.whatwg.org/#concept-bodyinit-extract -->

<p>The <dfn export><code>multipart/form-data</code> encoding algorithm</dfn>, given an <var>entry
list</var> and <var>encoding</var>, is as follows:</p>

<ol>
<!-- the first few steps of this are the same as in the previous section -->

<li><p>Let <var>result</var> be the empty string.</p></li>

<li>
<p>For each <span data-x="formdata-entry">entry</span> in <var>entry list</var>:</p>

<ol>
<!-- the step that replaces a file with its name is missing in
this version of the algorithm -->

<li><p>For each character in the entry's name and value that cannot be expressed using the
selected character encoding, replace the character by a string consisting of a U+0026 AMPERSAND
character (&amp;), a U+0023 NUMBER SIGN character (#), one or more <span>ASCII digits</span>
representing the code point of the character in base ten, and finally a U+003B (;).</p></li>
<!-- we should say it should be the shortest possible string, no leading zeros. this whole step
is asinine, though, so... -->
<p>Return the byte sequence resulting from encoding the <var>entry list</var> using the rules
described by RFC 7578, <cite>Returning Values from Forms: <code
data-x="">multipart/form-data</code></cite>, given the following conditions:
<ref spec=RFC7578></p>

<!-- this is where the similarities with the previous section end -->
</ol>
<ul>
<li><p>Each entry in <var>entry list</var> is a <i>field</i>, the name of the entry is the
<i>field name</i> and the value of the entry is the <i>field value</i>.</p></li>

<li><p>The order of parts must be the same as the order of fields in <var>entry list</var>.
Multiple entries with the same name must be treated as distinct fields.</p></li>

<li><p>Field names, field values for non-file fields, and file names for file fields, in the
generated <code>multipart/form-data</code> resource must be set to the result of <span
data-x="encode">encoding</span> the corresponding entry's name or value with
<var>encoding</var>, converted to a byte sequence. In the case of file names, however, the
[Review comment, Member] I guess as a follow-up we should test if this is what user agents do.
[Reply, Member Author] My thoughts exactly.
precise name may be approximated if necessary (e.g., newlines could be removed from file names,
quotes could be changed to "<code data-x="">%22</code>", and characters not expressible in
<var>encoding</var> could be replaced by other characters before encoding).</p></li>

<li><p>The parts of the generated <code>multipart/form-data</code> resource that correspond to
non-file fields must not have a `<code>Content-Type</code>` header specified.</p></li>

<li><p>The boundary used by the user agent in generating the return value of this algorithm is
the <dfn export><code>multipart/form-data</code> boundary string</dfn>. (This value is used to
generate the MIME type of the form submission payload generated by this algorithm.)</p></li>
</ul>
</li>

<li>
<p>Encode the (now mutated) <var>entry list</var> using the rules described by RFC 7578,
<cite>Returning Values from Forms: <code data-x="">multipart/form-data</code></cite>, and return
the resulting byte stream. <ref spec=RFC7578></p>

<p>Each entry in <var>entry list</var> is a <i>field</i>, the name of the entry is the <i>field
name</i> and the value of the entry is the <i>field value</i>.</p>

<p>The order of parts must be the same as the order of fields in <var>entry list</var>. Multiple
entries with the same name must be treated as distinct fields.</p>

<p>The parts of the generated <code>multipart/form-data</code> resource that correspond to
non-file fields must not have a `<code>Content-Type</code>` header specified. Their names and
values must be encoded using the character encoding selected above.</p>

<p>File names included in the generated <code>multipart/form-data</code> resource (as part of
file fields) must use the character encoding selected above, though the precise name may be
approximated if necessary (e.g. newlines could be removed from file names, quotes could be
changed to "%22", and characters not expressible in the selected character encoding could be
replaced by other characters).

<p>The boundary used by the user agent in generating the return value of this algorithm is the
<dfn export><code>multipart/form-data</code> boundary string</dfn>. (This value is used to
generate the MIME type of the form submission payload generated by this algorithm.)</p> </li>
</ol>
</ol>

</div>
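
The conditions listed in the new version of the algorithm can be illustrated with a minimal Python sketch. It is not the spec's algorithm and not any user agent's implementation. The names `FileValue`, `encode_text`, `approximate` and `encode_multipart` are hypothetical. The sketch assumes that characters not expressible in the encoding fall back to decimal character references (as the removed step describes), and it applies the file-name approximations mentioned in the text (dropping newlines, changing quotes to `%22`) to field names as well, which the text does not require.

```python
import secrets
from typing import NamedTuple, Sequence, Tuple, Union


class FileValue(NamedTuple):
    """Hypothetical stand-in for a file entry value."""
    filename: str
    content_type: str
    data: bytes


def encode_text(text: str, encoding: str) -> bytes:
    # Assumption: unencodable characters become "&#NNN;" with the code
    # point in base ten, which is what Python's xmlcharrefreplace emits.
    return text.encode(encoding, errors="xmlcharrefreplace")


def approximate(raw: bytes) -> bytes:
    # Approximation from the text: remove newlines, change quotes to %22.
    return raw.replace(b"\r", b"").replace(b"\n", b"").replace(b'"', b"%22")


def encode_multipart(
    entries: Sequence[Tuple[str, Union[str, FileValue]]],
    encoding: str = "utf-8",
    boundary: Union[str, None] = None,
) -> Tuple[bytes, str]:
    """Return (body, boundary) for an ordered entry list."""
    if boundary is None:
        # Any boundary that does not occur in the payload would do.
        boundary = "----sketch" + secrets.token_hex(16)
    delimiter = b"--" + boundary.encode("ascii")
    body = bytearray()
    # Parts keep the order of the entry list; entries that share a name
    # stay distinct fields.
    for name, value in entries:
        body += delimiter + b"\r\n"
        header = (b'Content-Disposition: form-data; name="'
                  + approximate(encode_text(name, encoding)) + b'"')
        if isinstance(value, FileValue):
            header += (b'; filename="'
                       + approximate(encode_text(value.filename, encoding))
                       + b'"')
            body += header + b"\r\n"
            body += b"Content-Type: " + value.content_type.encode("ascii")
            body += b"\r\n\r\n" + value.data
        else:
            # Non-file fields carry no Content-Type header.
            body += header + b"\r\n\r\n" + encode_text(value, encoding)
        body += b"\r\n"
    body += delimiter + b"--\r\n"
    return bytes(body), boundary
```

The returned boundary plays the role of the boundary string in the last condition: a caller would use it to build the payload's MIME type, for example `"multipart/form-data; boundary=" + boundary`.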
