Originally reported by pablodcar (Bitbucket: pablodcar, GitHub: pablodcar)
Hi, I'm thankful for this wonderful tool. We use it extensively, and I hope to contribute new APIs and features in the future.
When a source file is encoded in UTF-8 with a BOM signature, //coverage.phystokens.source_encoding// returns the correct encoding: //"utf-8-sig"//. But when the file is rendered inside the HTML template and that same encoding is used to write the report to disk, a //UnicodeDecodeError// is raised, because the BOM cannot be in the middle of the final output:
  File "/home/pablo/baco-dyn/lib/python2.6/site-packages/coverage/control.py", line 603, in html_report
    reporter.report(morfs)
  File "/home/pablo/baco-dyn/lib/python2.6/site-packages/coverage/html.py", line 87, in report
    self.report_files(self.html_file, morfs, self.config.html_dir)
  File "/home/pablo/baco-dyn/lib/python2.6/site-packages/coverage/report.py", line 83, in report_files
    report_fn(cu, self.coverage._analyze(cu))
  File "/home/pablo/baco-dyn/lib/python2.6/site-packages/coverage/html.py", line 222, in html_file
    html = html.encode(encoding)
  File "/home/pablo/baco-dyn/lib/python2.6/encodings/utf_8_sig.py", line 15, in encode
    return (codecs.BOM_UTF8 + codecs.utf_8_encode(input, errors)[0], len(input))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 18296: ordinal not in range(128)
I'm attaching a patch to decode and encode the source file in advance, using UTF-8 when utf-8-sig is detected. I hope you can review it and consider adding this change.
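For illustration only, the idea can be sketched roughly as below. This is not the attached patch: the function name `normalize_source_encoding` is hypothetical, the code is written for Python 3 rather than the Python 2.6 shown in the traceback, and it assumes the encoding has already been detected (for example by `source_encoding`). The source is decoded with the detected encoding so the BOM is dropped, and plain UTF-8 is then used for writing the report.

```python
import codecs


def normalize_source_encoding(raw_bytes, detected_encoding):
    """Hypothetical helper: decode source bytes and pick an output encoding.

    Decoding with "utf-8-sig" strips a leading BOM. For output, plain
    "utf-8" is used instead, so no BOM is written into the report.
    """
    text = raw_bytes.decode(detected_encoding)
    # codecs.lookup() normalizes spellings such as "utf_8_sig" or "UTF-8-SIG".
    if codecs.lookup(detected_encoding).name == "utf-8-sig":
        output_encoding = "utf-8"
    else:
        output_encoding = detected_encoding
    return text, output_encoding


# Usage: a source file that starts with a UTF-8 BOM.
raw = codecs.BOM_UTF8 + "name = 'caf\u00e9'\n".encode("utf-8")
text, out_encoding = normalize_source_encoding(raw, "utf-8-sig")
html_bytes = ("<pre>" + text + "</pre>").encode(out_encoding)
```

With this approach the rendered page never contains the BOM bytes, whatever position the source text takes inside the template.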
Thanks in advance,
Pablo Carballo