Use defined encoding when reading from URLs (#219)
mondeja committed Jun 10, 2024
1 parent 7a7cb71 commit 5fbe1c8
Showing 8 changed files with 52 additions and 42 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -159,8 +159,8 @@ content to include.
   **recursive** (_true_): When this option is disabled, included files are not
   processed for recursive includes. Possible values are `true` and `false`.
 - <a name="include-markdown_encoding" href="#include-markdown_encoding">#</a>
-  **encoding** (_utf-8_): Specify the encoding of the included file.
-  If not defined `utf-8` will be used.
+  **encoding** (_'utf-8'_): Specify the encoding of the included file.
+  If not defined `'utf-8'` will be used.
 - <a name="include-markdown_rewrite-relative-urls" href="#include-markdown_rewrite-relative-urls">#</a>
   **rewrite-relative-urls** (_true_): When this option is enabled (default),
   Markdown links and images in the content that are specified by a relative URL
@@ -250,8 +250,8 @@ Includes the content of a file or a group of files.
   **recursive** (_true_): When this option is disabled, included files are not
   processed for recursive includes. Possible values are `true` and `false`.
 - <a name="include_encoding" href="#include_encoding">#</a>
-  **encoding** (_utf-8_): Specify the encoding of the included file.
-  If not defined `utf-8` will be used.
+  **encoding** (_'utf-8'_): Specify the encoding of the included file.
+  If not defined `'utf-8'` will be used.

 ##### Examples

8 changes: 4 additions & 4 deletions locale/es/README.md
@@ -149,8 +149,8 @@ se encuentran en el contenido a incluir se eliminan. Los valores posibles son
   incluidos no son procesados para incluir de forma recursiva. Los valores
   posibles son `true` y `false`.
 - <a name="include-markdown_encoding" href="#include-markdown_encoding">#</a>
-  **encoding** (*utf-8*): Especifica la codificación del archivo incluído. Si no
-  se define, se usará `utf-8`.
+  **encoding** (*'utf-8'*): Especifica la codificación del archivo incluído. Si
+  no se define, se usará `'utf-8'`.
 - <a name="include-markdown_rewrite-relative-urls"
   href="#include-markdown_rewrite-relative-urls">#</a> **rewrite-relative-urls**
   (*true*): Cuando esta opción está habilitada (por defecto), los enlaces e
@@ -243,8 +243,8 @@ Los valores posibles son `true` y `false`.
   procesados para incluir de forma recursiva. Los valores posibles son `true` y
   `false`.
 - <a name="include_encoding" href="#include_encoding">#</a> **encoding**
-  (*utf-8*): Especifica la codificación del archivo incluído. Si no se define,
-  se usará `utf-8`.
+  (*'utf-8'*): Especifica la codificación del archivo incluído. Si no se define,
+  se usará `'utf-8'`.

 ##### Ejemplos

16 changes: 8 additions & 8 deletions locale/es/README.md.po
@@ -257,21 +257,21 @@ msgstr ""

 msgid ""
 "<a name=\"include-markdown_encoding\" href=\"#include-"
-"markdown_encoding\">#</a> **encoding** (*utf-8*): Specify the encoding of "
-"the included file. If not defined `utf-8` will be used."
+"markdown_encoding\">#</a> **encoding** (*'utf-8'*): Specify the encoding of "
+"the included file. If not defined `'utf-8'` will be used."
 msgstr ""
 "<a name=\"include-markdown_encoding\" href=\"#include-"
-"markdown_encoding\">#</a> **encoding** (*utf-8*): Especifica la codificación"
-" del archivo incluído. Si no se define, se usará `utf-8`."
+"markdown_encoding\">#</a> **encoding** (*'utf-8'*): Especifica la "
+"codificación del archivo incluído. Si no se define, se usará `'utf-8'`."

 msgid ""
 "<a name=\"include_encoding\" href=\"#include_encoding\">#</a> **encoding** "
-"(*utf-8*): Specify the encoding of the included file. If not defined `utf-8`"
-" will be used."
+"(*'utf-8'*): Specify the encoding of the included file. If not defined "
+"`'utf-8'` will be used."
 msgstr ""
 "<a name=\"include_encoding\" href=\"#include_encoding\">#</a> **encoding** "
-"(*utf-8*): Especifica la codificación del archivo incluído. Si no se define,"
-" se usará `utf-8`."
+"(*'utf-8'*): Especifica la codificación del archivo incluído. Si no se "
+"define, se usará `'utf-8'`."

 msgid "Configuration"
 msgstr "Configuración"
8 changes: 4 additions & 4 deletions locale/fr/README.md
@@ -149,8 +149,8 @@ trouvées dans le contenu à inclure sont supprimées. Les valeurs possibles son
   inclus ne sont pas traités pour des inclusions récursives. Les valeurs possibles
   sont `true` et `false`.
 - <a name="include-markdown_encoding" href="#include-markdown_encoding">#</a>
-  **encoding** (*utf-8*): Spécifiez l'encodage du fichier inclus. S'il n'est pas
-  défini, `utf-8` sera utilisé.
+  **encoding** (*'utf-8'*): Spécifiez l'encodage du fichier inclus. S'il n'est
+  pas défini, `'utf-8'` sera utilisé.
 - <a name="include-markdown_rewrite-relative-urls"
   href="#include-markdown_rewrite-relative-urls">#</a> **rewrite-relative-urls**
   (*true*): Lorsque cette option est activée (par défaut), liens et images
@@ -242,8 +242,8 @@ valeurs possibles sont `true` et `false`.
   traités pour des inclusions récursives. Les valeurs possibles sont `true` et
   `false`.
 - <a name="include_encoding" href="#include_encoding">#</a> **encoding**
-  (*utf-8*): Spécifiez l'encodage du fichier inclus. S'il n'est pas défini,
-  `utf-8` sera utilisé.
+  (*'utf-8'*): Spécifiez l'encodage du fichier inclus. S'il n'est pas défini,
+  `'utf-8'` sera utilisé.

 ##### Exemples

16 changes: 8 additions & 8 deletions locale/fr/README.md.po
@@ -257,21 +257,21 @@ msgstr ""

 msgid ""
 "<a name=\"include-markdown_encoding\" href=\"#include-"
-"markdown_encoding\">#</a> **encoding** (*utf-8*): Specify the encoding of "
-"the included file. If not defined `utf-8` will be used."
+"markdown_encoding\">#</a> **encoding** (*'utf-8'*): Specify the encoding of "
+"the included file. If not defined `'utf-8'` will be used."
 msgstr ""
 "<a name=\"include-markdown_encoding\" href=\"#include-"
-"markdown_encoding\">#</a> **encoding** (*utf-8*): Spécifiez l'encodage du "
-"fichier inclus. S'il n'est pas défini, `utf-8` sera utilisé."
+"markdown_encoding\">#</a> **encoding** (*'utf-8'*): Spécifiez l'encodage du "
+"fichier inclus. S'il n'est pas défini, `'utf-8'` sera utilisé."

 msgid ""
 "<a name=\"include_encoding\" href=\"#include_encoding\">#</a> **encoding** "
-"(*utf-8*): Specify the encoding of the included file. If not defined `utf-8`"
-" will be used."
+"(*'utf-8'*): Specify the encoding of the included file. If not defined "
+"`'utf-8'` will be used."
 msgstr ""
 "<a name=\"include_encoding\" href=\"#include_encoding\">#</a> **encoding** "
-"(*utf-8*): Spécifiez l'encodage du fichier inclus. S'il n'est pas défini, "
-"`utf-8` sera utilisé."
+"(*'utf-8'*): Spécifiez l'encodage du fichier inclus. S'il n'est pas défini, "
+"`'utf-8'` sera utilisé."

 msgid "Configuration"
 msgstr "Configuration"
12 changes: 6 additions & 6 deletions src/mkdocs_include_markdown_plugin/cache.py
@@ -39,24 +39,24 @@ def generate_unique_key_from_url(cls, url: str) -> str:
             hashlib.sha3_512(url.encode()).digest(),
         ).decode('utf-8')

-    def read_file(self, fpath: str) -> str:  # noqa: D102
-        with open(fpath, encoding='utf-8') as f:
+    def read_file(self, fpath: str, encoding: str = 'utf-8') -> str:  # noqa: D102
+        with open(fpath, encoding=encoding) as f:
             return f.read().split('\n', 1)[1]

-    def get_(self, url: str) -> str | None:  # noqa: D102
+    def get_(self, url: str, encoding: str = 'utf-8') -> str | None:  # noqa: D102
         key = self.generate_unique_key_from_url(url)
         fpath = os.path.join(self.cache_dir, key)
         if os.path.isfile(fpath):
             creation_time = self.get_creation_time_from_fpath(fpath)
             if time.time() < creation_time + self.expiration_seconds:
-                return self.read_file(fpath)
+                return self.read_file(fpath, encoding=encoding)
             os.remove(fpath)
         return None

-    def set_(self, url: str, value: str) -> None:  # noqa: D102
+    def set_(self, url: str, value: str, encoding: str = 'utf-8') -> None:  # noqa: D102
         key = self.generate_unique_key_from_url(url)
         fpath = os.path.join(self.cache_dir, key)
-        with open(fpath, 'w', encoding='utf-8') as f:
+        with open(fpath, 'w', encoding=encoding) as f:
             f.write(f'{int(time.time())}\n')
             f.write(value)
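
Passing `encoding` to both `set_` and `get_` matters because a cache entry has to be read back with the same encoding it was written in. The following is a minimal, self-contained sketch of that round trip. The names `write_entry`, `read_entry` and `roundtrip` are hypothetical stand-ins for illustration, not the plugin's `Cache` methods, and the expiration check is left out.

```python
import os
import tempfile
import time


def write_entry(fpath: str, value: str, encoding: str = 'utf-8') -> None:
    """Write a timestamp line followed by the cached content."""
    with open(fpath, 'w', encoding=encoding) as f:
        f.write(f'{int(time.time())}\n')
        f.write(value)


def read_entry(fpath: str, encoding: str = 'utf-8') -> str:
    """Return the cached content, skipping the leading timestamp line."""
    with open(fpath, encoding=encoding) as f:
        return f.read().split('\n', 1)[1]


def roundtrip(value: str, encoding: str) -> str:
    """Write and read back one entry using the same encoding on both sides."""
    with tempfile.TemporaryDirectory() as tmp:
        fpath = os.path.join(tmp, 'entry')
        write_entry(fpath, value, encoding=encoding)
        return read_entry(fpath, encoding=encoding)
```

Reading an entry with a different encoding than the one used to write it can raise `UnicodeDecodeError` or return garbled text, which is presumably why the commit threads the same value through both calls.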

14 changes: 10 additions & 4 deletions src/mkdocs_include_markdown_plugin/event.py
@@ -230,7 +230,9 @@ def found_include_tag(  # noqa: PLR0912, PLR0915
         expected_but_any_found = [start is not None, end is not None]
         for file_path in file_paths_to_include:
             if process.is_url(filename):
-                new_text_to_include = process.read_url(file_path, http_cache)
+                new_text_to_include = process.read_url(
+                    file_path, http_cache, encoding,
+                )
             else:
                 new_text_to_include = process.read_file(file_path, encoding)

@@ -481,7 +483,9 @@ def found_include_markdown_tag(  # noqa: PLR0912, PLR0915
         text_to_include = ''
         for file_path in file_paths_to_include:
             if process.is_url(filename):
-                new_text_to_include = process.read_url(file_path, http_cache)
+                new_text_to_include = process.read_url(
+                    file_path, http_cache, encoding,
+                )
             else:
                 new_text_to_include = process.read_file(file_path, encoding)

@@ -587,9 +591,11 @@ def found_include_markdown_tag(  # noqa: PLR0912, PLR0915
         nonlocal new_found_include_markdown_contents
         markdown_include_index = len(new_found_include_markdown_contents)
         placeholder = build_placeholder(
-            markdown_include_index, 'include-markdown')
+            markdown_include_index, 'include-markdown',
+        )
         new_found_include_markdown_contents.append(
-            (placeholder, text_to_include))
+            (placeholder, text_to_include),
+        )
         return placeholder

     # Replace contents by placeholders
12 changes: 8 additions & 4 deletions src/mkdocs_include_markdown_plugin/process.py
@@ -443,16 +443,20 @@ def read_file(file_path: str, encoding: str) -> str:
         return f.read()


-def read_url(url: str, http_cache: Cache | None) -> Any:
+def read_url(
+    url: str,
+    http_cache: Cache | None,
+    encoding: str = 'utf-8',
+) -> Any:
     """Read an HTTP location and return its content."""
     if http_cache is not None:
-        cached_content = http_cache.get_(url)
+        cached_content = http_cache.get_(url, encoding)
         if cached_content is not None:
             return cached_content
     with urlopen(Request(url)) as response:
-        content = response.read().decode('UTF-8')
+        content = response.read().decode(encoding)
     if http_cache is not None:
-        http_cache.set_(url, content)
+        http_cache.set_(url, content, encoding)
     return content
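
Before this change the response body was always decoded as UTF-8, so a remote file served in another encoding could fail to decode or come out garbled. The sketch below isolates the decoding step, using an in-memory `io.BytesIO` as a stand-in for the HTTP response so that it runs without network access. `read_and_decode` is a hypothetical helper for illustration, not part of the plugin.

```python
import io
from typing import BinaryIO


def read_and_decode(response: BinaryIO, encoding: str = 'utf-8') -> str:
    """Read all bytes from a response-like object and decode them."""
    return response.read().decode(encoding)


# Bytes as a server might send them for a Latin-1 encoded document.
body = 'se\u00f1al'.encode('latin-1')

# Decoding with the matching encoding recovers the original text.
text = read_and_decode(io.BytesIO(body), encoding='latin-1')

# Decoding the same bytes as UTF-8 (the previously hard-coded choice) fails.
try:
    read_and_decode(io.BytesIO(body))
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True
```

Keeping `'utf-8'` as the default preserves the earlier behavior for callers that do not pass an encoding.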

