-
Notifications
You must be signed in to change notification settings - Fork 982
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(response): set content-type charset default to utf-8 #1416 #1564
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh snap! Awesome concise implementation and thorough tests 🎉
This is amazing @andersnylund, thank you ❤️
Also request @lrowe |
@@ -491,6 +491,8 @@ function patch(Response) { | |||
|
|||
if (self._charSet) { | |||
type = type + '; charset=' + self._charSet; | |||
} else { | |||
type = type + '; charset=utf-8'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Choosing a default content type for a formatter should probably be the responsibility of the formatter, since binary content type should not have a charset.
Existing formatters are currently also tied to a particular charset since they set the Content-Length
based on Buffer.byteLength(data)
(Buffer.byteLength without a charset defaults to utf-8.) Example: https://github.com/restify/node-restify/blob/master/lib/formatters/json.js
(Setting Content-Length in the formatter should no longer be needed in recent NodeJS versions (>1.5.0), see: nodejs/node#1062.)
Perhaps it would make more sense to set the Content-Type header only after calling the formatter, providing the opportunity for the formatter to set a format appropriate default charset.
const results = formatter(self.req, self, body);
if (self._charSet) {
type = type + '; charset=' + self._charSet;
}
// Update header to the derived content type for our formatter
self.setHeader('Content-Type', type);
// Finally, invoke the formatter and flush the request with it's results
return flush(self, results);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused - I thought the content type indirectly specifies the formatter, not the other way around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DonutEspresso while the content type picks the formatter the formatter should probably pick the charset if it is unspecified.
t.equal( | ||
res.headers['content-type'], | ||
'application/octet-stream; charset=utf-8' | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Binary content types should not include a charset as it only makes sense for textual content types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is true. Didn't think of that. Having a list of different mime-types that should default to utf-8 is not convenient either keeping in mind all combinations. How about a blacklist of mime-types where not to format with utf-8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The list of binary types is also unbounded. I think it makes most sense to make the formatter to be responsible for picking a default / forcing a particular charset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Python WebOb library uses an include list for setting a default charset:
* The ``default_charset`` is used as the default character set to return on the ``Content-Type`` header, if the ``Content-Type`` allows for a ``charset`` parameter. Currently the only ``Content-Type``'s that allow for a ``charset`` are defined to be: ``text/*``, ``application/xml``, and ``*/*+xml``. Any other ``Content-Type``'s will not have a ``charset`` added.
https://github.com/Pylons/webob/blob/master/src/webob/response.py#L142
We'd probably need to add at least "application/javascript", "application/json", and "/+json" to that list.
Maintaining such a list does seem very inconvenient. Attaching the default to the formatter somehow seems more extensible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this help? https://www.npmjs.com/package/mime-types
mime.charset('text/markdown') // 'UTF-8'
Binary formatters will likely return a Buffer so perhaps we could do: const results = formatter(self.req, self, body);
if (self._charSet) {
type = type + '; charset=' + self._charSet;
} else if (typeof results === "string") {
type = type + '; charset=utf-8';
}
// Update header to the derived content type for our formatter
self.setHeader('Content-Type', type);
// Finally, invoke the formatter and flush the request with it's results
return flush(self, results); |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue has been automatically closed as stale because it has not had recent activity. |
Pre-Submission Checklist
make prepush
Issues
Closes:
utf-8
charEncoding #1416Changes
Makes a Response's content-type to default to charset=utf-8, if charSet() not specifically called