Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(response): set content-type charset default to utf-8 #1416 #1564

Closed
wants to merge 1 commit into from

Conversation

andersnylund
Copy link

Pre-Submission Checklist

  • Opened an issue discussing these changes before opening the PR
  • Ran the linter and tests via make prepush
  • Included comprehensive and convincing tests for changes

Issues

Closes:

Summarize the issues that discussed these changes

Changes

What does this PR do?

Makes a Response's content-type to default to charset=utf-8, if charSet() not specifically called

Copy link
Member

@retrohacker retrohacker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh snap! Awesome concise implementation and thorough tests 🎉

This is amazing @andersnylund, thank you ❤️

@retrohacker retrohacker requested a review from yunong November 15, 2017 00:44
@retrohacker
Copy link
Member

Also request @lrowe

@@ -491,6 +491,8 @@ function patch(Response) {

if (self._charSet) {
type = type + '; charset=' + self._charSet;
} else {
type = type + '; charset=utf-8';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choosing a default content type for a formatter should probably be the responsibility of the formatter, since binary content type should not have a charset.

Existing formatters are currently also tied to a particular charset since they set the Content-Length based on Buffer.byteLength(data) (Buffer.byteLength without a charset defaults to utf-8.) Example: https://github.com/restify/node-restify/blob/master/lib/formatters/json.js

(Setting Content-Length in the formatter should no longer be needed in recent NodeJS versions (>1.5.0), see: nodejs/node#1062.)

Perhaps it would make more sense to set the Content-Type header only after calling the formatter, providing the opportunity for the formatter to set a format appropriate default charset.

        const results = formatter(self.req, self, body);
        
        if (self._charSet) {
            type = type + '; charset=' + self._charSet;
        }

        // Update header to the derived content type for our formatter
        self.setHeader('Content-Type', type);

        // Finally, invoke the formatter and flush the request with it's results
        return flush(self, results);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused - I thought the content type indirectly specifies the formatter, not the other way around?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DonutEspresso while the content type picks the formatter the formatter should probably pick the charset if it is unspecified.

t.equal(
res.headers['content-type'],
'application/octet-stream; charset=utf-8'
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Binary content types should not include a charset as it only makes sense for textual content types.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true. Didn't think of that. Having a list of different mime-types that should default to utf-8 is not convenient either keeping in mind all combinations. How about a blacklist of mime-types where not to format with utf-8?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list of binary types is also unbounded. I think it makes most sense to make the formatter to be responsible for picking a default / forcing a particular charset.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Python WebOb library uses an include list for setting a default charset:

* The ``default_charset`` is used as the default character set to return on
  the ``Content-Type`` header, if the ``Content-Type`` allows for a
  ``charset`` parameter. Currently the only ``Content-Type``'s that allow
  for a ``charset`` are defined to be: ``text/*``, ``application/xml``, and
  ``*/*+xml``. Any other ``Content-Type``'s will not have a ``charset``
  added.

https://github.com/Pylons/webob/blob/master/src/webob/response.py#L142

We'd probably need to add at least "application/javascript", "application/json", and "/+json" to that list.

Maintaining such a list does seem very inconvenient. Attaching the default to the formatter somehow seems more extensible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this help? https://www.npmjs.com/package/mime-types

mime.charset('text/markdown') // 'UTF-8'

@lrowe
Copy link
Contributor

lrowe commented Nov 15, 2017

Binary formatters will likely return a Buffer so perhaps we could do:

        const results = formatter(self.req, self, body);
        
        if (self._charSet) {
            type = type + '; charset=' + self._charSet;
        } else if (typeof results === "string") {
            type = type + '; charset=utf-8';
        }

        // Update header to the derived content type for our formatter
        self.setHeader('Content-Type', type);

        // Finally, invoke the formatter and flush the request with it's results
        return flush(self, results);

@stale
Copy link

stale bot commented Apr 10, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Stale label Apr 10, 2018
@stale
Copy link

stale bot commented Apr 24, 2018

This issue has been automatically closed as stale because it has not had recent activity.

@stale stale bot closed this Apr 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants