Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow removing @charset atrule with expanded OutputStyle #567

Closed
rzhw opened this issue Jan 11, 2019 · 12 comments
Closed

Allow removing @charset atrule with expanded OutputStyle #567

rzhw opened this issue Jan 11, 2019 · 12 comments

Comments

@rzhw
Copy link

rzhw commented Jan 11, 2019

It looks like Dart Sass always prefixes stylesheets with a @charset atrule if the output style isn't compressed.

This is unnecessary for stylesheets embedded in <style> elements:

When a style sheet is embedded in another document, such as in the STYLE element or "style" attribute of HTML, the style sheet shares the character encoding of the whole document.

The MDN has a stronger interpretation of this:

If several @charset at-rules are defined, only the first one is used, and it cannot be used inside a style attribute on an HTML element or inside the <style> element where the character set of the HTML page is relevant.

Outputting the @charset atrule should be optional, because removing this atrule isn't trivial when maintaining a sourcemap.

@nex3
Copy link
Contributor

nex3 commented Jan 12, 2019

What's your use-case here? If you want to minimize the number of bytes Sass produces, that's what --compressed mode is for. We're producing valid CSS that's as compatible as possible. Adding new options comes at a usability price, and I'm not convinced this is worth that price.

It looks like Dart Sass always prefixes stylesheets with a @charset atrule if the output style isn't compressed.

This isn't strictly true; the @charset declaration is added only for stylesheets that contain non-ASCII characters, to ensure that those characters are interpreted correctly in any context.

This is unnecessary for stylesheets embedded in <style> elements:

When a style sheet is embedded in another document, such as in the STYLE element or "style" attribute of HTML, the style sheet shares the character encoding of the whole document.

The MDN has a stronger interpretation of this:

If several @charset at-rules are defined, only the first one is used, and it cannot be used inside a style attribute on an HTML element or inside the <style> element where the character set of the HTML page is relevant.

Outputting the @charset atrule should be optional, because removing this atrule isn't trivial when maintaining a sourcemap.

MDN's use of "cannot" here is overly strong; @charset will simply be ignored by CSS processors that already have an encoding defined.

@rzhw
Copy link
Author

rzhw commented Jan 14, 2019

We do have a fairly uncommon use case use, making this possibly a nice-to-have depending on how usability gets affected.

I work on a website-generating engine, where stylesheets are embedded as <style> elements in template files. Being in a <style> element, the @charset 'utf-8'; as described above, is unnecessary, though being ignored it certainly wouldn't break anything. (The bandwidth argument is probably non-existent as well. Being 17 bytes, 19 bytes with two newlines, it would add at most... 19 MB per 1,000,000 queries before compression?)

(Our engine also currently escapes UTF-8 characters—hence why I've filed #568—which makes the @charset 'utf-8'; atrule broad but not technically incorrect. My above comments would still apply if it was, say, @charset 'us-ascii'; instead.)

@nfagerlund
Copy link

nfagerlund commented Jul 8, 2020

#644 sorted this out for the CLI and the Dart API! But the JavaScript API is still out of luck. Any chance we could get the charset option exposed to render and renderAsync? (I mean, it LOOKS like it just needs to get wired through the wrapper functions around here-ish, plus a validator function for the passed value, but I don't really know my Dart and might be missing some extra steps.)

@Awjin
Copy link
Contributor

Awjin commented Jul 9, 2020

The current JS API needs to be overhauled, and it's generally not a good practice to add API surface to something that will be deleted. The design for the next gen JS API is tracked here.

@lhtin
Copy link

lhtin commented Feb 19, 2021

This isn't strictly true; the @charset declaration is added only for stylesheets that contain non-ASCII characters, to ensure that those characters are interpreted correctly in any context.

@nex3 Hi, why the @charset declaration is not added when using compressed outputStyle, even stylesheets contain non-ASCII characters?

js:

const sass = require('sass');
const result2 = sass.renderSync({
  outputStyle: 'expanded',
  file: "./test.scss",
});
console.log('dart-sass(expanded):\n', result2.css.toString());

const result3 = sass.renderSync({
  outputStyle: 'compressed',
  file: "./test.scss",
});
console.log('dart-sass(compressed):\n', result3.css.toString());

scss file:

.a-icon {
  content: "\E91E";
}
.a-iconb {
  content: "你好"
}

output:

dart-sass(expanded):
 @charset "UTF-8";
.a-icon {
  content: "";
}

.a-iconb {
  content: "你好";
}

dart-sass(compressed):
 .a-icon{content:""}.a-iconb{content:"你好"}

@nex3
Copy link
Contributor

nex3 commented Feb 19, 2021

In compressed output style, we emit a Unicode byte-order mark instead. This has the same effect as @charset (it forces browsers to interpret the stylesheet as UTF-8) while being much shorter (but also invisible, which is why we use @charset instead in expanded mode).

@tobyfisher
Copy link

tobyfisher commented Aug 14, 2021

@nex3 using the JS API for Dart Sass ("dart-sass 1.35.1 (Sass Compiler), dart2js 2.13.3 (Dart Compiler)") I have come across this small issue, in "compressed" mode the byte-order mark seems to be corrupting(?) the first @font-face declaration I had (Chrome Version 92.0.4515.131), i.e. fonts set to "100" were showing up as "300" rather than as "100" weight, the CSS works as expected in "expanded" mode, by adding a throw-away atrule in front solved issue for me e.g.

@page{margin:1cm}@font-face{font-family:"Roboto";font-style:normal;font-weight:100;src:local(""),url("../fonts/roboto-v27-latin_greek/roboto-v27-latin_greek-100.woff2") format("woff2")}@font-face{font-family:"Roboto";font-style:normal;font-weight:300;src:local(""),url("../fonts/roboto-v27-latin_greek/roboto-v27-latin_greek-300.woff2") format("woff2")

@nex3
Copy link
Contributor

nex3 commented Aug 19, 2021

@tobyfisher Can you send me a repository with a minimal reproduction?

adamhooper added a commit to adamhooper/dart-sass that referenced this issue Aug 20, 2021
@adamhooper
Copy link
Contributor

@nex3 Maybe the spec changed since 2019? WhatWG spec is as forceful as Mozilla, as per your linked document https://drafts.csswg.org/css-syntax-3/#charset-rule:

However, there is no actual at-rule named @charset. When a stylesheet is actually parsed, any occurrences of an @charset rule must be treated as an unrecognized rule, and thus dropped as invalid when the stylesheet is grammar-checked.

Note: In CSS 2.1, @charset was a valid rule.

This causes validators (e.g., AMP) to fail when the user injects sass-generated CSS in a <style> tag, as with sass-loader.

Setting output style isn't correct, either: Unicode has also changed in the past few years. Now, U+FEFF in the middle of a text document is an "unsupported character", which I imagine might mean "undefined behavior" in principle? Candidate for becoming �? See https://www.unicode.org/faq/utf_bom.html#bom6

TL;DR: both U+FEFF or @charset are bad for <script> tags.

The solution is a charset: false option. I've implemented it in a pull request, complete with tests.

@mariusa
Copy link

mariusa commented Dec 30, 2021

charset being true by default, does it cause warning: "@charset" must be the first rule in the file on build?
vitejs/vite#5079

I'm trying to figure out the root cause (quite a few complains: https://github.com/search?q=%22must+be+the+first+rule+in+the+file%22&type=issues)

Is the cause concatenating multiple css files from various UI libraries used in a project? (which is a common use case) ?

@nex3
Copy link
Contributor

nex3 commented Dec 30, 2021

Is the cause concatenating multiple css files from various UI libraries used in a project? (which is a common use case) ?

It is generally not safe to naively concatenate multiple CSS files, because CSS does have features that must appear in a certain order in the file (not just @charset but also @import). It's not impossible to do—for example, Sass will do it correctly if you write a file that @uses multiple CSS files in a row—but it requires more intelligence than merely string concatenation.

In other words, if you're seeing bugs due to a @charset rule appearing in the middle of your concatenated CSS, that's a bug in your concatenation logic.

@mariusa
Copy link

mariusa commented Dec 30, 2021

Thanks for clarification, Natalie!
(getting the warnings via vitejs)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants