Update Sphinx version #13225

Merged
merged 1 commit into from
Oct 26, 2022

nineinchnick
Member

Description

This PR updates the Sphinx version from 3.3.0 to 5.0.2. It also bumps all Python dependencies and introduces a requirements.in file that lists only direct dependencies. pip-compile from pip-tools can be used to manage the requirements.txt file and keep all dependencies up to date. It's not a hard dependency though: the file can still be managed manually, and the new comments added by pip-compile make it easier to track dependencies.

The docs generated using this version look the same. When comparing the dirs I see some minor changes in HTML headers and some paragraphs are wrapped in additional <section> tags.

I think the updated Docker image must be built manually and pushed to ghcr.io. I tested it locally using:

docker build --tag ghcr.io/trinodb/build/sphinx:5 docs/

Is this change a fix, improvement, new feature, refactoring, or other?
refactor

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)
docs

How would you describe this change to a non-technical end user or system administrator?
n/a

Related issues, pull requests, and links

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

(x) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Jul 19, 2022
@nineinchnick nineinchnick requested a review from mosabua July 19, 2022 10:20
@github-actions github-actions bot added the docs label Jul 19, 2022
@nineinchnick
Member Author

This supersedes #11994

@electrum
Member

electrum commented Jul 19, 2022

Can you show a diff -ru of the HTML output? That’s how I verify an upgrade.

@nineinchnick
Member Author

Can you show a diff -ru of the HTML output? That’s how I verify an upgrade.

For the whole html dir? That's going to be huge. Here's a single file:

% diff -ru docs/target/html/connector/hive-s3.html   ~/tmp/docs-target/html/connector/hive-s3.html 
--- docs/target/html/connector/hive-s3.html	2022-07-19 12:20:04.000000000 +0200
+++ /Users/jwas/tmp/docs-target/html/connector/hive-s3.html	2022-07-19 12:06:26.000000000 +0200
@@ -1,10 +1,11 @@
 
 <!DOCTYPE html>
 
-<html>
+<html lang="en">
   <head>
     <meta charset="utf-8" />
-    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="generator" content="Docutils 0.18.1: http://docutils.sourceforge.net/" />
+
   <meta name="viewport" content="width=device-width,initial-scale=1">
   <meta http-equiv="x-ua-compatible" content="ie=edge">
   <meta name="lang:clipboard.copy" content="Copy to clipboard">
@@ -47,19 +48,17 @@
   
   
     <title>Hive connector with Amazon S3 &#8212; Trino 391-SNAPSHOT Documentation</title>
-    <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
-    <link rel="stylesheet" href="../_static/material.css" type="text/css" />
+    <link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
+    <link rel="stylesheet" type="text/css" href="../_static/material.css" />
     <link rel="stylesheet" type="text/css" href="../_static/copybutton.css" />
     <link rel="stylesheet" type="text/css" href="../_static/trino.css" />
-    <script id="documentation_options" data-url_root="../" src="../_static/documentation_options.js"></script>
+    <script data-url_root="../" id="documentation_options" src="../_static/documentation_options.js"></script>
     <script src="../_static/jquery.js"></script>
     <script src="../_static/underscore.js"></script>
+    <script src="../_static/_sphinx_javascript_frameworks_compat.js"></script>
     <script src="../_static/doctools.js"></script>
-    <script src="../_static/language_data.js"></script>
     <script src="../_static/clipboard.min.js"></script>
     <script src="../_static/copybutton.js"></script>
-    <script async="async" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.7/latest.js?config=TeX-AMS-MML_HTMLorMML"></script>
-    <script type="text/x-mathjax-config">MathJax.Hub.Config({"tex2jax": {"inlineMath": [["\\(", "\\)"]], "displayMath": [["\\[", "\\]"]], "processRefs": false, "processEnvironments": false}})</script>
     <link rel="index" title="Index" href="../genindex.html" />
     <link rel="search" title="Search" href="../search.html" />
     <link rel="next" title="Hive connector with Azure Storage" href="hive-azure.html" />
@@ -665,21 +664,17 @@
         <div class="md-content">
           <article class="md-content__inner md-typeset" role="main">
             
-  
-<h1 id="connector-hive-s3--page-root">Hive connector with Amazon S3<a class="headerlink" href="#connector-hive-s3--page-root" title="Permalink to this headline">#</a></h1>
+  <section id="hive-connector-with-amazon-s3">
+<h1 id="connector-hive-s3--page-root">Hive connector with Amazon S3<a class="headerlink" href="#connector-hive-s3--page-root" title="Permalink to this heading">#</a></h1>
 <p>The <a class="reference internal" href="hive.html"><span class="doc">Hive connector</span></a> can read and write tables that are stored in
 <a class="reference external" href="https://aws.amazon.com/s3/">Amazon S3</a> or S3-compatible systems.
 This is accomplished by having a table or database location that
 uses an S3 prefix, rather than an HDFS prefix.</p>
 <p>Trino uses its own S3 filesystem for the URI prefixes
 <code class="docutils literal notranslate"><span class="pre">s3://</span></code>, <code class="docutils literal notranslate"><span class="pre">s3n://</span></code> and  <code class="docutils literal notranslate"><span class="pre">s3a://</span></code>.</p>
-
-<h2 id="s3-configuration-properties">S3 configuration properties<a class="headerlink" href="#s3-configuration-properties" title="Permalink to this headline">#</a></h2>
+<section id="s3-configuration-properties">
+<h2 id="s3-configuration-properties">S3 configuration properties<a class="headerlink" href="#s3-configuration-properties" title="Permalink to this heading">#</a></h2>
 <table>
-<colgroup>
-<col style="width: 30%"/>
-<col style="width: 70%"/>
-</colgroup>
 <thead>
 <tr class="row-odd"><th class="head"><p>Property name</p></th>
 <th class="head"><p>Description</p></th>
@@ -816,9 +811,9 @@
 </tr>
 </tbody>
 </table>
-
-
-<span id="hive-s3-credentials"></span><h2 id="s3-credentials">S3 credentials<a class="headerlink" href="#s3-credentials" title="Permalink to this headline">#</a></h2>
+</section>
+<section id="s3-credentials">
+<span id="hive-s3-credentials"></span><h2 id="s3-credentials">S3 credentials<a class="headerlink" href="#s3-credentials" title="Permalink to this heading">#</a></h2>
 <p>If you are running Trino on Amazon EC2, using EMR or another facility,
 it is recommended that you use IAM Roles for EC2 to govern access to S3.
 To enable this, your EC2 instances need to be assigned an IAM Role which
@@ -828,9 +823,9 @@
 setting AWS access and secret keys in the <code class="docutils literal notranslate"><span class="pre">hive.s3.aws-access-key</span></code>
 and <code class="docutils literal notranslate"><span class="pre">hive.s3.aws-secret-key</span></code> settings, and also allows EC2 to automatically
 rotate credentials on a regular basis without any additional work on your part.</p>
-
-
-<h2 id="custom-s3-credentials-provider">Custom S3 credentials provider<a class="headerlink" href="#custom-s3-credentials-provider" title="Permalink to this headline">#</a></h2>
+</section>
+<section id="custom-s3-credentials-provider">
+<h2 id="custom-s3-credentials-provider">Custom S3 credentials provider<a class="headerlink" href="#custom-s3-credentials-provider" title="Permalink to this heading">#</a></h2>
 <p>You can configure a custom S3 credentials provider by setting the configuration
 property <code class="docutils literal notranslate"><span class="pre">trino.s3.credentials-provider</span></code> to the fully qualified class name of
 a custom AWS credentials provider implementation. The property must be set in
@@ -844,9 +839,9 @@
 temporary credentials from STS (using <code class="docutils literal notranslate"><span class="pre">STSSessionCredentialsProvider</span></code>),
 IAM role-based credentials (using <code class="docutils literal notranslate"><span class="pre">STSAssumeRoleSessionCredentialsProvider</span></code>),
 or credentials for a specific use case (e.g., bucket/user specific credentials).</p>
-
-
-<span id="hive-s3-security-mapping"></span><h2 id="s3-security-mapping">S3 security mapping<a class="headerlink" href="#s3-security-mapping" title="Permalink to this headline">#</a></h2>
+</section>
+<section id="s3-security-mapping">
+<span id="hive-s3-security-mapping"></span><h2 id="s3-security-mapping">S3 security mapping<a class="headerlink" href="#s3-security-mapping" title="Permalink to this heading">#</a></h2>
 <p>Trino supports flexible security mapping for S3, allowing for separate
 credentials or IAM roles for specific users or buckets/paths. The IAM role
 for a specific query can be selected from a list of allowed roles by providing
@@ -946,10 +941,6 @@
 </pre></div>
 </div>
 <table>
-<colgroup>
-<col style="width: 45%"/>
-<col style="width: 55%"/>
-</colgroup>
 <thead>
 <tr class="row-odd"><th class="head"><p>Property name</p></th>
 <th class="head"><p>Description</p></th>
@@ -983,19 +974,14 @@
 </tr>
 </tbody>
 </table>
-
-
-<h2 id="tuning-properties">Tuning properties<a class="headerlink" href="#tuning-properties" title="Permalink to this headline">#</a></h2>
+</section>
+<section id="tuning-properties">
+<h2 id="tuning-properties">Tuning properties<a class="headerlink" href="#tuning-properties" title="Permalink to this heading">#</a></h2>
 <p>The following tuning properties affect the behavior of the client
 used by the Trino S3 filesystem when communicating with S3.
 Most of these parameters affect settings on the <code class="docutils literal notranslate"><span class="pre">ClientConfiguration</span></code>
 object associated with the <code class="docutils literal notranslate"><span class="pre">AmazonS3Client</span></code>.</p>
 <table>
-<colgroup>
-<col style="width: 33%"/>
-<col style="width: 53%"/>
-<col style="width: 14%"/>
-</colgroup>
 <thead>
 <tr class="row-odd"><th class="head"><p>Property name</p></th>
 <th class="head"><p>Description</p></th>
@@ -1042,9 +1028,9 @@
 </tr>
 </tbody>
 </table>
-
-
-<h2 id="s3-data-encryption">S3 data encryption<a class="headerlink" href="#s3-data-encryption" title="Permalink to this headline">#</a></h2>
+</section>
+<section id="s3-data-encryption">
+<h2 id="s3-data-encryption">S3 data encryption<a class="headerlink" href="#s3-data-encryption" title="Permalink to this heading">#</a></h2>
 <p>Trino supports reading and writing encrypted data in S3 using both
 server-side encryption with S3 managed keys and client-side encryption using
 either the Amazon KMS or a software plugin to manage AES encryption keys.</p>
@@ -1066,15 +1052,15 @@
 the <code class="docutils literal notranslate"><span class="pre">org.apache.hadoop.conf.Configurable</span></code> interface from the Hadoop Java API, then the Hadoop configuration
 is passed in after the object instance is created, and before it is asked to provision or retrieve any
 encryption keys.</p>
-
-
-<span id="s3selectpushdown"></span><h2 id="s3-select-pushdown">S3 Select pushdown<a class="headerlink" href="#s3-select-pushdown" title="Permalink to this headline">#</a></h2>
+</section>
+<section id="s3-select-pushdown">
+<span id="s3selectpushdown"></span><h2 id="s3-select-pushdown">S3 Select pushdown<a class="headerlink" href="#s3-select-pushdown" title="Permalink to this heading">#</a></h2>
 <p>S3 Select pushdown enables pushing down projection (SELECT) and predicate (WHERE)
 processing to <a class="reference external" href="https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectSELECTContent.html">S3 Select</a>.
 With S3 Select Pushdown, Trino only retrieves the required data from S3 instead
 of entire S3 objects, reducing both latency and network usage.</p>
-
-<h3 id="is-s3-select-a-good-fit-for-my-workload">Is S3 Select a good fit for my workload?<a class="headerlink" href="#is-s3-select-a-good-fit-for-my-workload" title="Permalink to this headline">#</a></h3>
+<section id="is-s3-select-a-good-fit-for-my-workload">
+<h3 id="is-s3-select-a-good-fit-for-my-workload">Is S3 Select a good fit for my workload?<a class="headerlink" href="#is-s3-select-a-good-fit-for-my-workload" title="Permalink to this heading">#</a></h3>
 <p>Performance of S3 Select pushdown depends on the amount of data filtered by the
 query. Filtering a large number of rows should result in better performance. If
 the query doesn’t filter any data, then pushdown may not add any additional value
@@ -1098,9 +1084,9 @@
 transfer speed and available bandwidth. Amazon S3 Select does not compress
 HTTP responses, so the response size may increase for compressed input files.</p></li>
 </ul>
-
-
-<h3 id="considerations-and-limitations">Considerations and limitations<a class="headerlink" href="#considerations-and-limitations" title="Permalink to this headline">#</a></h3>
+</section>
+<section id="considerations-and-limitations">
+<h3 id="considerations-and-limitations">Considerations and limitations<a class="headerlink" href="#considerations-and-limitations" title="Permalink to this heading">#</a></h3>
 <ul class="simple">
 <li><p>Only objects stored in CSV format are supported. Objects can be uncompressed,
 or optionally compressed with gzip or bzip2.</p></li>
@@ -1111,16 +1097,16 @@
 <li><p>S3 Select Pushdown is not a substitute for using columnar or compressed file
 formats such as ORC and Parquet.</p></li>
 </ul>
-
-
-<h3 id="enabling-s3-select-pushdown">Enabling S3 Select pushdown<a class="headerlink" href="#enabling-s3-select-pushdown" title="Permalink to this headline">#</a></h3>
+</section>
+<section id="enabling-s3-select-pushdown">
+<h3 id="enabling-s3-select-pushdown">Enabling S3 Select pushdown<a class="headerlink" href="#enabling-s3-select-pushdown" title="Permalink to this heading">#</a></h3>
 <p>You can enable S3 Select Pushdown using the <code class="docutils literal notranslate"><span class="pre">s3_select_pushdown_enabled</span></code>
 Hive session property, or using the <code class="docutils literal notranslate"><span class="pre">hive.s3select-pushdown.enabled</span></code>
 configuration property. The session property overrides the config
 property, allowing you enable or disable on a per-query basis.</p>
-
-
-<h3 id="understanding-and-tuning-the-maximum-connections">Understanding and tuning the maximum connections<a class="headerlink" href="#understanding-and-tuning-the-maximum-connections" title="Permalink to this headline">#</a></h3>
+</section>
+<section id="understanding-and-tuning-the-maximum-connections">
+<h3 id="understanding-and-tuning-the-maximum-connections">Understanding and tuning the maximum connections<a class="headerlink" href="#understanding-and-tuning-the-maximum-connections" title="Permalink to this heading">#</a></h3>
 <p>Trino can use its native S3 file system or EMRFS. When using the native FS, the
 maximum connections is configured via the <code class="docutils literal notranslate"><span class="pre">hive.s3.max-connections</span></code>
 configuration property. When using EMRFS, the maximum connections is configured
@@ -1132,9 +1118,9 @@
 <p>If your workload experiences the error <em>Timeout waiting for connection from
 pool</em>, increase the value of both <code class="docutils literal notranslate"><span class="pre">hive.s3select-pushdown.max-connections</span></code> and
 the maximum connections configuration for the file system you are using.</p>
-
-
-
+</section>
+</section>
+</section>
 
 
           </article>

Member

@mosabua mosabua left a comment

Very nice. This will need the container to be built and made available as part of the merge. I think @electrum is the only one who has done that in the past.

@nineinchnick nineinchnick added the no-release-notes This pull request does not require release notes entry label Jul 24, 2022
@electrum
Member

Ah, I see the HTML changes are non-trivial from a textual perspective. It'd be nice to have a better way to verify that nothing broke. The only real change I see in this example is the removal of the <colgroup> for the tables. Can you show screenshots for both, so we can see if the table formatting looks ok?

Assuming the formatting looks the same, or not worse, this change looks good. Thanks for updating it.
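
One possible way to check for structural changes such as the missing <colgroup> without reading a full textual diff is to count selected tags in each page of both builds. The following is only a minimal sketch, not something used in this PR: the names `TagCounter`, `tag_counts`, and `compare_builds` are hypothetical, and it assumes both builds are available as local directories with the same relative layout.

```python
from collections import Counter
from html.parser import HTMLParser
from pathlib import Path


class TagCounter(HTMLParser):
    """Count opening tags so two builds can be compared structurally."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <col .../> also arrive here by default.
        self.counts[tag] += 1


def tag_counts(html_text):
    """Return a Counter of tag name -> number of opening tags."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


def compare_builds(old_dir, new_dir, tags=("table", "colgroup", "col")):
    """Return {relative_path: {tag: (old_count, new_count)}} for pages that differ.

    Only pages present in both builds are compared; added or removed
    pages are outside the scope of this sketch.
    """
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    report = {}
    for old_file in sorted(old_dir.rglob("*.html")):
        rel = old_file.relative_to(old_dir)
        new_file = new_dir / rel
        if not new_file.is_file():
            continue
        old = tag_counts(old_file.read_text(encoding="utf-8"))
        new = tag_counts(new_file.read_text(encoding="utf-8"))
        changed = {t: (old[t], new[t]) for t in tags if old[t] != new[t]}
        if changed:
            report[rel.as_posix()] = changed
    return report
```

Calling `compare_builds` with the two html directories from the diff above would list each page whose table-related tag counts changed. Purely cosmetic differences, such as reordered attributes or the added <section> wrappers, are ignored unless `section` is added to `tags`.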

@nineinchnick
Member Author

Good catch about the column widths. I checked whether it renders correctly before but didn't notice it; it does look off. I'll see if there's something in the Sphinx config that can restore it.

Current master:
image

This branch:
image

@nineinchnick
Member Author

Found it, it was a change in Docutils: https://docutils.sourceforge.io/RELEASE-NOTES.html#release-0-18-2021-10-26

I had to add a docutils.conf to restore the previous setting.
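
The thread does not show the contents of the new docutils.conf. As a hedged sketch only, assuming the relevant Docutils 0.18 change is that the HTML5 writer emits column widths only when a table sets the `widths` option, and that the `colwidths-grid` table style restores the earlier behavior as the linked release notes describe, the file might look like this:

```ini
# Assumed contents; the actual file added in this PR is not shown in the thread.
[html writers]
table_style: colwidths-grid
```

Sphinx normally picks up a docutils.conf placed in the same directory as conf.py, but where the file lives in this repository is not stated here.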

@electrum
Member

I published a multi-architecture image:

docker buildx build --push --platform linux/amd64,linux/arm64 --tag ghcr.io/trinodb/build/sphinx:5 docs/

@nineinchnick
Member Author

@electrum, thanks! All checks are green.

@electrum electrum merged commit a9f4c08 into trinodb:master Oct 26, 2022
@github-actions github-actions bot added this to the 401 milestone Oct 26, 2022
@nineinchnick nineinchnick deleted the update-sphinx branch November 2, 2022 08:30
Labels
cla-signed docs no-release-notes This pull request does not require release notes entry
3 participants