Style/Encoding: false doesn't always work for ruby 2.0 #1289

jkingdon · 2014-08-19T22:50:27Z

If you have a Ruby file that contains a non-ASCII character but no # encoding: utf-8 line, rubocop may blow up with "Invalid byte sequence in us-ascii". To make it happen every time, set LC_ALL=en_US.US-ASCII (that's for MacOS Mavericks; the exact value may depend on your system).
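
For illustration, here is a minimal sketch of why those bytes trip things up. It assumes the file holds UTF-8 bytes and only uses core String methods; it does not run rubocop, and the variable names are made up for the example.

```ruby
# The UTF-8 bytes of "cafe" with an accented e, as they would sit on disk,
# without any encoding tag yet.
raw_bytes = "caf\xC3\xA9".b

# Tagged as US-ASCII (what a US-ASCII locale leads to), the bytes are invalid.
as_ascii = raw_bytes.dup.force_encoding(Encoding::US_ASCII)
ascii_valid = as_ascii.valid_encoding?

# Tagged as UTF-8, the very same bytes form a valid four-character string.
as_utf8 = raw_bytes.dup.force_encoding(Encoding::UTF_8)
utf8_valid = as_utf8.valid_encoding?

puts "US-ASCII valid? #{ascii_valid}"
puts "UTF-8 valid?    #{utf8_valid} (#{as_utf8.length} characters)"
```

Any code that later needs the string to be well-formed in its tagged encoding can fail on the US-ASCII variant.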

One workaround is to set LC_ALL to something like en_US.UTF-8 (exact value may vary depending on your system). However, since Ruby 2.0+ defaults to UTF-8 for source files, rubocop should do the same (at least when run with Ruby 2.0+) rather than rely on the system default character set. That system default becomes Ruby's Encoding.default_external, which is what rubocop currently seems to use when reading files.

bbatsov · 2014-08-27T15:16:25Z

@jonas054 Would you look into this?

jonas054 · 2014-08-28T08:28:54Z

OK.

jonas054 · 2014-08-31T08:18:00Z

What happens is that we get an exception from Parser::Source::Buffer#source= when we call it in RuboCop::ProcessedSource#parse. If there's no encoding comment in the file, Parser falls back on the encoding that's set in the parsed string, and that depends on the external encoding set in environment variables, e.g., LC_ALL.
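
A rough sketch of that fallback, using only the Ruby standard library (no Parser or RuboCop involved). Assigning Encoding.default_external here is an assumption standing in for starting Ruby under a locale such as LC_ALL=C, and the temp file stands in for an inspected source file.

```ruby
require 'tempfile'

saved_external = Encoding.default_external
read_encoding = nil
read_valid = nil

begin
  Tempfile.create('encoding_demo') do |file|
    # A Ruby source line containing UTF-8 bytes and no encoding comment.
    file.binmode
    file.write("name = 'caf\xC3\xA9'\n".b)
    file.flush

    # Stand-in for a US-ASCII locale (assumption: same effect as LC_ALL=C).
    Encoding.default_external = Encoding::US_ASCII

    # With no explicit encoding, the string is tagged with default_external.
    source = File.read(file.path)
    read_encoding = source.encoding
    read_valid = source.valid_encoding?
  end
ensure
  # Restore the process-wide setting so nothing else is affected.
  Encoding.default_external = saved_external
end

puts "encoding: #{read_encoding}, valid: #{read_valid}"
```

A string read this way carries the US-ASCII tag into whatever consumes it next.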

@bbatsov @yujinakayama I have two questions.

1. Should RuboCop's behavior depend on which Ruby version it's running under? See discussion in Enable Style/Encoding for Ruby >= 2.0 #1304.
2. Should this be solved in RuboCop or in Parser?

BTW, I was able to reproduce the problem in Linux with export LC_ALL=C.

yujinakayama · 2014-09-01T10:26:33Z

Should RuboCop's behavior depend on which Ruby version it's running under?

I think it should not.

Should this be solved in RuboCop or in Parser?

I'm not completely sure, but in this case I'd prefer handling it in RuboCop.

Parser is a versatile library and it should be flexible. I think fixing encoding in Parser is somewhat inflexible (and doing so may be a breaking change).
We already have control over it. We just need to always pass a UTF-8 encoded string to Parser.
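
One way this could look, as a sketch only: read_source_as_utf8 is a hypothetical helper, not an existing RuboCop method and not necessarily what the eventual fix does. It relies on the encoding: option of File.read so the result is tagged UTF-8 whatever the locale says.

```ruby
require 'tempfile'

# Hypothetical helper (not part of RuboCop): read a file so the returned
# string is tagged UTF-8 regardless of Encoding.default_external.
def read_source_as_utf8(path)
  File.read(path, encoding: 'UTF-8')
end

saved_external = Encoding.default_external
source = nil

begin
  Tempfile.create('utf8_demo') do |file|
    # UTF-8 bytes on disk, no encoding comment.
    file.binmode
    file.write("name = 'caf\xC3\xA9'\n".b)
    file.flush

    # Simulate a US-ASCII locale; the explicit encoding should win anyway.
    Encoding.default_external = Encoding::US_ASCII
    source = read_source_as_utf8(file.path)
  end
ensure
  Encoding.default_external = saved_external
end

puts "encoding: #{source.encoding}, valid: #{source.valid_encoding?}"
```

The resulting string could then be handed to Parser as-is. A file that is genuinely not UTF-8 would still need an encoding comment or separate handling.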

jonas054 · 2014-09-01T21:09:53Z

@yujinakayama Thanks!

[Fix #1289] Use utf-8 as default encoding for inspected files

jonas054 self-assigned this Aug 28, 2014

yujinakayama mentioned this issue Sep 2, 2014

Rubocop: Invalid byte sequence in us-ascii yujinakayama/atom-lint#73

Closed

jonas054 closed this as completed in c0c2b5e Sep 3, 2014

bbatsov added a commit that referenced this issue Sep 3, 2014

Merge pull request #1322 from jonas054/1289_default_encoding

cabc36b

[Fix #1289] Use utf-8 as default encoding for inspected files

sds mentioned this issue Sep 7, 2015

invalid byte sequence in US-ASCII sds/haml-lint#94

Closed

thomthom mentioned this issue Mar 31, 2017

Encoding issues might prevent Rubocop parsing SketchUp/rubocop-sketchup#12

Open
