closes bpo-34056: Always return bytes from _HackedGetData.get_data(). (
…pythonGH-8130)

* Always return bytes from _HackedGetData.get_data().

Ensure the imp.load_source shim always returns bytes by reopening the file in
binary mode if needed. Hash-based pycs have to receive the source code in bytes.

It's tempting to change imp.get_suffixes() to always return 'rb' as a mode, but
that breaks some stdlib tests and likely third-party code, too.
(cherry picked from commit b0274f2)

Co-authored-by: Benjamin Peterson <benjamin@python.org>
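
The commit message says hash-based pycs have to receive the source code in bytes. A minimal sketch of why, using importlib.util.source_hash() (the public helper for the PEP 552 source hash, available since Python 3.7); the variable names here are illustrative and not part of the commit:

```python
import importlib.util

# Illustrative source text; any module source would do.
source = "x = 42\n"

# Hashing works on bytes and yields an 8-byte digest.
digest = importlib.util.source_hash(source.encode("utf-8"))

# Passing str is rejected, which is what a text-mode read() would hand over.
try:
    importlib.util.source_hash(source)
except TypeError:
    str_rejected = True
else:
    str_rejected = False
```

This is only a demonstration of the bytes requirement; the shim itself does not call source_hash() directly.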
benjaminp authored and miss-islington committed Jul 7, 2018
1 parent 127bd9b commit 57a81aa
Showing 3 changed files with 24 additions and 7 deletions.
13 changes: 6 additions & 7 deletions Lib/imp.py
@@ -142,17 +142,16 @@ def __init__(self, fullname, path, file=None):
     def get_data(self, path):
         """Gross hack to contort loader to deal w/ load_*()'s bad API."""
         if self.file and path == self.path:
+            # The contract of get_data() requires us to return bytes. Reopen the
+            # file in binary mode if needed.
             if not self.file.closed:
                 file = self.file
-            else:
-                self.file = file = open(self.path, 'r')
+                if 'b' not in file.mode:
+                    file.close()
+            if self.file.closed:
+                self.file = file = open(self.path, 'rb')

             with file:
-                # Technically should be returning bytes, but
-                # SourceLoader.get_code() just passed what is returned to
-                # compile() which can handle str. And converting to bytes would
-                # require figuring out the encoding to decode to and
-                # tokenize.detect_encoding() only accepts bytes.
                 return file.read()
         else:
             return super().get_data(path)
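
The reopen-in-binary-mode logic in the hunk above can be tried outside the loader. The following is a rough standalone sketch, not the code in Lib/imp.py: read_source_bytes() and demo() are hypothetical names, and the file written by demo() is a throwaway.

```python
import os
import tempfile


def read_source_bytes(file, path):
    """Return the contents of `path` as bytes, reusing `file` when possible.

    Hypothetical helper that mirrors the patched control flow.
    """
    if not file.closed and "b" not in file.mode:
        # A text-mode handle would make read() return str, so drop it.
        file.close()
    if file.closed:
        # Closed (or just discarded) handle: reopen in binary mode.
        file = open(path, "rb")
    with file:
        return file.read()


def demo():
    """Pass a text-mode handle and get bytes back anyway."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mymod.py")
        with open(path, "wb") as fp:
            fp.write(b"x = 42\n")
        text_handle = open(path, "r")  # the mode the old code opened with
        return read_source_bytes(text_handle, path)
```

Checking file.mode rather than the type of the handle keeps the sketch close to the patch and leaves an already-binary handle untouched.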
15 changes: 15 additions & 0 deletions Lib/test/test_imp.py
@@ -2,6 +2,7 @@
 import importlib.util
 import os
 import os.path
+import py_compile
 import sys
 from test import support
 from test.support import script_helper
@@ -350,6 +351,20 @@ def test_pyc_invalidation_mode_from_cmdline(self):
             res = script_helper.assert_python_ok(*args)
             self.assertEqual(res.out.strip().decode('utf-8'), expected)

+    def test_find_and_load_checked_pyc(self):
+        # issue 34056
+        with support.temp_cwd():
+            with open('mymod.py', 'wb') as fp:
+                fp.write(b'x = 42\n')
+            py_compile.compile(
+                'mymod.py',
+                doraise=True,
+                invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
+            )
+            file, path, description = imp.find_module('mymod', path=['.'])
+            mod = imp.load_module('mymod', file, path, description)
+            self.assertEqual(mod.x, 42)
+


 class ReloadTests(unittest.TestCase):

@@ -0,0 +1,3 @@
+Ensure the loader shim created by ``imp.load_module`` always returns bytes
+from its ``get_data()`` function. This fixes using ``imp.load_module`` with
+:pep:`552` hash-based pycs.
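
For readers unfamiliar with the hash-based pycs named in the entry above, here is a small sketch that builds one and inspects its header. It assumes the PEP 552 layout (4-byte magic, 4-byte little-endian flags, 8-byte source hash); inspect_checked_pyc() and the file names are made up for illustration, and the sketch avoids imp so it does not depend on that module being available.

```python
import importlib.util
import os
import py_compile
import tempfile


def inspect_checked_pyc(source=b"x = 42\n"):
    """Compile `source` as a CHECKED_HASH pyc; return (flags, hash_matches)."""
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "mymod.py")
        with open(src_path, "wb") as fp:
            fp.write(source)
        # Write the pyc next to the source so it is cleaned up with tmp.
        pyc_path = py_compile.compile(
            src_path,
            cfile=os.path.join(tmp, "mymod.pyc"),
            doraise=True,
            invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
        )
        with open(pyc_path, "rb") as fp:
            header = fp.read(16)
    flags = int.from_bytes(header[4:8], "little")
    stored_hash = header[8:16]
    # The stored hash is computed over the source *bytes*.
    return flags, stored_hash == importlib.util.source_hash(source)
```

Under the assumed layout, a checked hash-based pyc should have both low flag bits set (hash-based and check-source), and validating it means re-hashing the bytes returned by the loader's get_data().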