This repository has been archived by the owner on Aug 7, 2023. It is now read-only.
Encoding handling #235
Comments
@Arcanemagus, I'll do a PR if you agree with this approach.
Oh, it doesn't work the way I hoped either. Basically,
weirdan added a commit to weirdan/linter-phpcs that referenced this issue on Jan 28, 2017:
PHPCS always gets UTF-8 as its input (see AtomLinter#235)
Merged
Filed a bug on Thanks for tracing this mess down! |
Arcanemagus pushed a commit that referenced this issue on Jan 30, 2017:
* Use a single fixed encoding
  PHPCS always gets UTF-8 as its input (see #235)
* Added single-byte encoding tests
Hopefully fixed by #236 in v1.5.8.
It turns out encoding handling is still broken for single-byte encodings (see the comment on #211). That's because I assumed in my latest PRs that file content is passed as binary; however, that's not the case.
`linter-phpcs` gets the file contents as a string (with a `textEditor.getText()` call). Strings in JavaScript have no inherent encoding; instead, they are sequences of characters. Later on, this string is passed to `atom-linter.exec`, which is actually imported from `sb-exec`. `sb-exec` starts the process and eventually passes the string (it's still a string at that point) to the subprocess' `stdin.write()` method (here), but does not specify any encoding. `stdin` itself is a `stream.Writable`. The documentation for `stream.Writable.write()` says you need to specify `encoding` if you pass a string as the `chunk` argument; however, this is not enforced.

Tracing the calls deeper, I found that the default encoding used when no encoding is specified for a string is `utf8` (here). However, it really should be considered an implementation detail, as it's not documented. In fact, the documentation implies that the `encoding` argument of `stream.Writable.write()` is mandatory when passing a string as the `chunk` argument.

The end result is that, regardless of the actual file encoding, PHPCS always gets UTF-8 on stdin.
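
For anyone who wants to check the default-encoding behaviour described here, a minimal sketch follows. It uses a plain in-memory `stream.Writable` as a stand-in for the subprocess' stdin; `collectBytes` is a hypothetical helper written for this illustration, not something from `sb-exec` or `linter-phpcs`. It relies on the first `write()` on an idle stream reaching the sink synchronously.

```javascript
const { Writable } = require('stream');

// Hypothetical helper: writes `text` to an in-memory Writable and returns the
// bytes that actually reached the sink. If `encoding` is undefined, the string
// is written without one, which is the situation described in this issue.
function collectBytes(text, encoding) {
  const chunks = [];
  const sink = new Writable({
    // decodeStrings defaults to true, so strings arrive here as Buffers.
    write(chunk, _encoding, callback) {
      chunks.push(chunk);
      callback();
    },
  });
  if (encoding === undefined) {
    sink.write(text);
  } else {
    sink.write(text, encoding);
  }
  sink.end();
  return Buffer.concat(chunks);
}

// U+00E9 (e with acute accent) is one byte in latin1 but two bytes in UTF-8.
console.log(collectBytes('\u00e9').toString('hex'));           // c3a9
console.log(collectBytes('\u00e9', 'latin1').toString('hex')); // e9
```

Without an explicit encoding the character comes out as the two-byte UTF-8 sequence, which is what the subprocess would see on its stdin.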
I propose to keep UTF-8 as the only encoding used when communicating with PHPCS, as it allows us to drop all the conversion mumbo-jumbo, including that manually created iconv-lite-to-libiconv JSON mapping. This has a slight chance of breaking custom PHPCS sniffs that expect a single-byte encoding and do not properly consider the `--encoding=` parameter. But then it's probably their fault anyway. To make sure we do not rely on the implicit default encoding, I propose to convert the text to an explicitly UTF-8-encoded `Buffer` (with `Buffer.from(fileText, 'utf8')`) and pass it down to `sb-exec` to be used as stdin for the subprocess.
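
A rough sketch of the proposed approach, under some assumptions: it uses Node's own `child_process.spawnSync` rather than `sb-exec` (whose handling of a `Buffer` as stdin is not shown here), and a child Node.js process stands in for PHPCS so the example can run anywhere. `toUtf8Stdin` and `echoStdinAsHex` are hypothetical names used only for illustration.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: encode the editor text explicitly, so nothing depends
// on the stream's implicit default encoding.
function toUtf8Stdin(fileText) {
  return Buffer.from(fileText, 'utf8');
}

// Stand-in for PHPCS (an assumption for this sketch): a child Node.js process
// that prints the bytes it received on stdin as a hex string.
function echoStdinAsHex(input) {
  const script =
    "const c=[];process.stdin.on('data',d=>c.push(d));" +
    "process.stdin.on('end',()=>" +
    "process.stdout.write(Buffer.concat(c).toString('hex')));";
  // `input` is written to the child's stdin as-is when it is a Buffer.
  const result = spawnSync(process.execPath, ['-e', script], { input });
  return result.stdout.toString('ascii');
}

// The child receives exactly the UTF-8 bytes for U+00E9.
console.log(echoStdinAsHex(toUtf8Stdin('\u00e9'))); // c3a9
```

With a real PHPCS invocation, the same `Buffer` would be the stdin payload, presumably alongside an explicit `--encoding=` argument so that sniffs know what they are reading.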