feat(response): set content-type charset default to utf-8 #1416 #1564

andersnylund · 2017-11-14T20:21:02Z

Pre-Submission Checklist

Opened an issue discussing these changes before opening the PR
Ran the linter and tests via make prepush
Included comprehensive and convincing tests for changes

Issues

Closes:

Issue Default to utf-8 charEncoding #1416
Issue #
Issue #

Summarize the issues that discussed these changes

Changes

What does this PR do?

Makes a Response's content-type to default to charset=utf-8, if charSet() not specifically called

retrohacker

Oh snap! Awesome concise implementation and thorough tests 🎉

This is amazing @andersnylund, thank you ❤️

retrohacker · 2017-11-15T00:44:24Z

Also request @lrowe

lrowe · 2017-11-15T02:33:25Z

lib/response.js

@@ -491,6 +491,8 @@ function patch(Response) {

        if (self._charSet) {
            type = type + '; charset=' + self._charSet;
+        } else {
+            type = type + '; charset=utf-8';


Choosing a default content type for a formatter should probably be the responsibility of the formatter, since binary content type should not have a charset.

Existing formatters are currently also tied to a particular charset since they set the Content-Length based on Buffer.byteLength(data) (Buffer.byteLength without a charset defaults to utf-8.) Example: https://github.com/restify/node-restify/blob/master/lib/formatters/json.js

(Setting Content-Length in the formatter should no longer be needed in recent NodeJS versions (>1.5.0), see: nodejs/node#1062.)

Perhaps it would make more sense to set the Content-Type header only after calling the formatter, providing the opportunity for the formatter to set a format appropriate default charset.

const results = formatter(self.req, self, body); if (self._charSet) { type = type + '; charset=' + self._charSet; } // Update header to the derived content type for our formatter self.setHeader('Content-Type', type); // Finally, invoke the formatter and flush the request with it's results return flush(self, results);

I'm confused - I thought the content type indirectly specifies the formatter, not the other way around?

@DonutEspresso while the content type picks the formatter the formatter should probably pick the charset if it is unspecified.

lrowe · 2017-11-15T02:33:38Z

test/response.test.js

+        t.equal(
+            res.headers['content-type'],
+            'application/octet-stream; charset=utf-8'
+        );


Binary content types should not include a charset as it only makes sense for textual content types.

That is true. Didn't think of that. Having a list of different mime-types that should default to utf-8 is not convenient either keeping in mind all combinations. How about a blacklist of mime-types where not to format with utf-8?

The list of binary types is also unbounded. I think it makes most sense to make the formatter to be responsible for picking a default / forcing a particular charset.

The Python WebOb library uses an include list for setting a default charset:

* The ``default_charset`` is used as the default character set to return on the ``Content-Type`` header, if the ``Content-Type`` allows for a ``charset`` parameter. Currently the only ``Content-Type``'s that allow for a ``charset`` are defined to be: ``text/*``, ``application/xml``, and ``*/*+xml``. Any other ``Content-Type``'s will not have a ``charset`` added.

https://github.com/Pylons/webob/blob/master/src/webob/response.py#L142

We'd probably need to add at least "application/javascript", "application/json", and "/+json" to that list.

Maintaining such a list does seem very inconvenient. Attaching the default to the formatter somehow seems more extensible.

Would this help? https://www.npmjs.com/package/mime-types

mime.charset('text/markdown') // 'UTF-8'

lrowe · 2017-11-15T22:36:19Z

Binary formatters will likely return a Buffer so perhaps we could do:

        const results = formatter(self.req, self, body);
        
        if (self._charSet) {
            type = type + '; charset=' + self._charSet;
        } else if (typeof results === "string") {
            type = type + '; charset=utf-8';
        }

        // Update header to the derived content type for our formatter
        self.setHeader('Content-Type', type);

        // Finally, invoke the formatter and flush the request with it's results
        return flush(self, results);

stale · 2018-04-10T19:54:55Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale · 2018-04-24T20:09:13Z

This issue has been automatically closed as stale because it has not had recent activity.

feat(response): set content-type charset default to utf-8 restify#1416

448344d

retrohacker approved these changes Nov 15, 2017

View reviewed changes

retrohacker requested a review from yunong November 15, 2017 00:44

lrowe suggested changes Nov 15, 2017

View reviewed changes

stale bot added the Stale label Apr 10, 2018

stale bot closed this Apr 24, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(response): set content-type charset default to utf-8 #1416 #1564

feat(response): set content-type charset default to utf-8 #1416 #1564

andersnylund commented Nov 14, 2017

retrohacker left a comment •

edited

Loading

retrohacker commented Nov 15, 2017

lrowe Nov 15, 2017

DonutEspresso Nov 15, 2017

lrowe Nov 15, 2017

lrowe Nov 15, 2017

andersnylund Nov 15, 2017

lrowe Nov 15, 2017

lrowe Nov 15, 2017

retrohacker Nov 15, 2017

lrowe commented Nov 15, 2017 •

edited

Loading

stale bot commented Apr 10, 2018

stale bot commented Apr 24, 2018

feat(response): set content-type charset default to utf-8 #1416 #1564

feat(response): set content-type charset default to utf-8 #1416 #1564

Conversation

andersnylund commented Nov 14, 2017

Pre-Submission Checklist

Issues

Changes

retrohacker left a comment • edited Loading

Choose a reason for hiding this comment

retrohacker commented Nov 15, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lrowe commented Nov 15, 2017 • edited Loading

stale bot commented Apr 10, 2018

stale bot commented Apr 24, 2018

retrohacker left a comment •

edited

Loading

lrowe commented Nov 15, 2017 •

edited

Loading