diff --git a/encoding.bs b/encoding.bs
index 969476a..7afd377 100644
--- a/encoding.bs
+++ b/encoding.bs
@@ -1045,12 +1045,17 @@ optional I/O queue of bytes output (default « »), return the result
-Standards are strongly discouraged from using decode, encode, and
-BOM sniff, except as needed for compatibility. Standards needing these legacy hooks will most
-likely also need to use get an encoding (to turn a label into an
-encoding) and get an output encoding (to turn an encoding into
-another encoding that is suitable to pass into encode). Other algorithms are not
-to be used directly.
+Standards are strongly discouraged from using decode, BOM sniff, and
+ encode, except as needed for compatibility. Standards needing these legacy hooks will
+ most likely also need to use get an encoding (to turn a label into an
+ encoding) and get an output encoding (to turn an encoding into
+ another encoding that is suitable to pass into encode).
+
+For the extremely niche case of URL percent-encoding, custom encoder error handling is needed.
+ The get an encoder and encode or fail algorithms are to be used for that. Other
+ algorithms are not to be used directly.
+
 To decode an I/O queue of bytes ioQueue given a fallback encoding encoding and an optional I/O queue of scalar values output (default « »), run
@@ -1111,19 +1116,63 @@ corresponding to the byte order mark found, or null otherwise.
 steps:
 Assert: encoding is not replacement or UTF-16BE/LE.
+Let encoder be the result of getting an encoder from encoding.
-Run encoding's encoder with ioQueue,
- output, and "html".
+Run encoder with ioQueue, output, and
+ "html".
 Return output.
-This is mostly a legacy hook for URLs and HTML forms. Layering
-UTF-8 encode on top is safe as it never triggers
-errors.
-[[URL]]
-[[HTML]]
+This is a legacy hook for HTML forms. Layering UTF-8 encode on top
+is safe as it never triggers errors. [[HTML]]
+
+To get an encoder from an
+encoding encoding:
+
+Assert: encoding is not replacement or UTF-16BE/LE.
+
+Return encoding's encoder.
+To encode or fail an I/O queue of scalar values ioQueue given an
+encoder encoder and an I/O queue of bytes output, run these
+steps:
+
+Let potentialError be the result of running encoder with
+ ioQueue, output, and "fatal".
+
+Push end-of-queue to output.
+
+If potentialError is an error, then return error's
+ code point's value.
+
+Return null.
+This is a legacy hook for URL percent-encoding. The caller will have to keep an
+ encoder alive as the ISO-2022-JP encoder can be in two different states when
+ returning an error. That also means that if the caller emits bytes to encode the error in
+ some way, these have to be in the range 0x00 to 0x7F, inclusive, excluding 0x0E, 0x0F, 0x1B, 0x5C,
+ and 0x7E. [[URL]]
+
+In particular, if upon returning an error the ISO-2022-JP encoder is in the
+ Roman state, the caller cannot output 0x5C (\) as it will not
+ decode as U+005C (\). For this reason, applications using encode or fail for unintended
+ purposes ought to take care to prevent the use of the ISO-2022-JP encoder in combination
+ with replacement schemes, such as those of JavaScript and CSS, that use U+005C (\) as part of the
+ replacement syntax (e.g., \u2603) or make sure to pass the replacement syntax through
+ the encoder (in contrast to URL percent-encoding).
+
+The return value is either the number representing the code point that could not be
+ encoded or null, if there was no error. When it returns non-null the caller will have to
+ invoke it again, supplying the same encoder and a new output I/O queue.
+
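
Review note: the calling convention described for encode or fail (keep one encoder alive, call again with a fresh output queue after each non-null return) may be easier to follow with a small sketch. The Python below is only an illustration and not part of the patch. The names `encode_or_fail` and `percent_encode_after_encoding` are hypothetical, Python's `codecs` incremental encoders stand in for the spec's encoders (their behavior is not guaranteed to match, especially for ISO-2022-JP), and every byte is percent-encoded for simplicity rather than only those in a percent-encode set.

```python
import codecs
from collections import deque


def encode_or_fail(encoder, io_queue, output):
    """Hypothetical stand-in: drain io_queue through encoder into output.

    Returns the value of the first code point that cannot be encoded,
    or None if the whole queue was encoded.
    """
    while io_queue:
        ch = io_queue.popleft()
        try:
            output.extend(encoder.encode(ch))
        except UnicodeEncodeError:
            # Report the offending code point; the caller decides what to emit.
            return ord(ch)
    # Queue exhausted: flush any pending encoder state.
    output.extend(encoder.encode("", final=True))
    return None


def percent_encode_after_encoding(text, label):
    """Assumed caller: reuses ONE encoder across calls, with a new output each time."""
    encoder = codecs.getincrementalencoder(label)(errors="strict")
    io_queue = deque(text)
    parts = []
    while True:
        output = bytearray()  # a fresh output queue for every invocation
        failed = encode_or_fail(encoder, io_queue, output)
        parts.append("".join(f"%{b:02X}" for b in output))
        if failed is None:
            return "".join(parts)
        # ASCII-only replacement ("&#NNNN;" percent-encoded), avoiding 0x5C and 0x7E.
        parts.append(f"%26%23{failed}%3B")


print(percent_encode_after_encoding("a\u2603b", "cp1252"))
```

The replacement bytes are all in the ASCII range and avoid the excluded values listed in the note, which is why the sketch writes them directly instead of passing them through the encoder.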