[SPARK-46923][DOCS] Limit width of configuration tables
### What changes were proposed in this pull request?

- Assign all config tables in the documentation to the new CSS class `spark-config`.
  - Migrate the config table in `docs/sql-ref-ansi-compliance.md` from Markdown to HTML and assign it to this new CSS class as well.
- Limit the width of the config tables to the width of the main content, and force words to break and wrap if necessary.
- Remove a styling workaround for the documentation for `spark.worker.resourcesFile` that is not needed anymore thanks to these changes.
- Remove some `.global` CSS rules that, due to their [specificity][specificity], interfere with our ability to assign simple rules that apply directly to elements. (See the illustration below this list.)

[specificity]: https://developer.mozilla.org/en-US/docs/Web/CSS/Specificity
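
To illustrate the specificity problem: the first rule here is taken from the `.global` CSS removed below, while the second is a hypothetical element-level rule we might want to write instead.

```css
/* Specificity (0,1,1): one class plus one element selector. */
.global code {
  white-space: nowrap;
}

/* Specificity (0,0,1): element selector only. Even though it appears later
 * in the stylesheet, it loses to the rule above for any <code> inside
 * .global, which is why such simple rules could not take effect. */
code {
  white-space: normal;
}
```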

### Why are the changes needed?

Many configs and config defaults have very long names that normally cannot wrap. This causes tables to overflow the viewport. An egregious example of this is `spark.scheduler.listenerbus.eventqueue.executorManagement.capacity`, which has a default of `spark.scheduler.listenerbus.eventqueue.capacity`.

This change will force these long strings to break and wrap, which will keep the table widths limited to the width of the overall content. Because we are hard-coding the column widths, some tables will look slightly worse with this new layout due to extra whitespace. I couldn't figure out a practical way to prevent that while also solving the main problem of table overflow.
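
The core of the approach, condensed from the `docs/css/custom.css` changes below (comments mine): fix the table layout so column widths come from our rules rather than from the content, and allow breaks inside long words.

```css
table.spark-config {
  width: 100%;
  table-layout: fixed;        /* column widths are set by us, not by content */
  overflow-wrap: break-word;  /* long config names may break at any point */
}

/* CSS does not respect max-width on tables or table parts,
   so we have to pick a fixed width for each column. */
table.spark-config th:nth-child(1),
table.spark-config td:nth-child(1) {
  width: 30%;  /* Property Name */
}
```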

In #44755 or #44756 (whichever approach gets accepted), these config tables will be generated automatically. This will give us the opportunity to improve the styling further by setting the column width dynamically based on the content. (We cannot do that in CSS alone, since table styling in CSS is limited and we cannot use properties like `max-width`.) We will also be able to insert [word break opportunities][wbo] so that config names wrap in a more visually pleasing manner.

[wbo]: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/wbr
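
For example, the generator could emit markup like this (hypothetical output; each `<wbr>` marks a spot where the browser may break the name):

```html
<td><code>spark.<wbr>scheduler.<wbr>listenerbus.<wbr>eventqueue.<wbr>executorManagement.<wbr>capacity</code></td>
```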

### Does this PR introduce _any_ user-facing change?

Yes, it changes the presentation of tables, especially config tables, in the main documentation.

### How was this patch tested?

I built the docs and compared them visually across `master` (left) and this branch (right).

`sql-ref-ansi-compliance.html`:
<img width="200px" src="https://github.com/apache/spark/assets/1039369/1ad54764-e942-491a-8060-a23cdc2b1d3f" /> <img width="200px" src="https://github.com/apache/spark/assets/1039369/5e623d71-1f8b-41ca-b0bc-7166e9a2de7e" />

`configuration.html#scheduling`:
<img width="200px" src="https://github.com/apache/spark/assets/1039369/7a14c2d6-b9e6-4114-8902-96c80bebfc87" /> <img width="200px" src="https://github.com/apache/spark/assets/1039369/27dec813-f5a0-49de-a74d-f98b5c0f1606" />
<img width="200px" src="https://github.com/apache/spark/assets/1039369/76cfb54d-dad7-46e8-a29d-1f0d91962ce1" /> <img width="200px" src="https://github.com/apache/spark/assets/1039369/21457627-a7a4-4ede-8b8b-fcffabe1e24c" />

`configuration.html#barrier-execution-mode`:
<img width="200px" src="https://github.com/apache/spark/assets/1039369/e9b4118b-85eb-4ee5-9195-7e69e94e1008" /> <img width="200px" src="https://github.com/apache/spark/assets/1039369/2f3a1f0e-3d67-4352-b884-94b41c0dd6ea" />

`spark-standalone.html`:
<img width="200px" src="https://github.com/apache/spark/assets/1039369/c1923b0b-b57c-45c3-aff9-b8db1c0b39f6" /> <img width="200px" src="https://github.com/apache/spark/assets/1039369/6abb45e1-10e0-4a31-b1a3-91ef3a0478d1" />

`structured-streaming-kafka-integration.html#configuration`:
<img width="200px" src="https://github.com/apache/spark/assets/1039369/8505cda1-46fc-4b61-be22-16362bbf00fc" /> <img width="200px" src="https://github.com/apache/spark/assets/1039369/a5e351af-161a-442b-95a9-2501ec7934c9" />


### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44955 from nchammas/table-styling.

Authored-by: Nicholas Chammas <nicholas.chammas@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
nchammas authored and HyukjinKwon committed Jan 31, 2024
1 parent 8e29c0d commit ecdb38e
Showing 16 changed files with 143 additions and 83 deletions.
2 changes: 1 addition & 1 deletion connector/profiler/README.md
@@ -40,7 +40,7 @@ Then enable the profiling in the configuration.

### Code profiling configuration

<table class="table">
<table class="spark-config">
<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr>
<tr>
<td><code>spark.executor.profiling.enabled</code></td>
38 changes: 19 additions & 19 deletions docs/configuration.md
@@ -139,7 +139,7 @@ of the most common options to set are:

### Application Properties

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.app.name</code></td>
@@ -553,7 +553,7 @@ Apart from these, the following properties are also available, and may be useful

### Runtime Environment

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.driver.extraClassPath</code></td>
@@ -940,7 +940,7 @@ Apart from these, the following properties are also available, and may be useful

### Shuffle Behavior

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.reducer.maxSizeInFlight</code></td>
@@ -1315,7 +1315,7 @@ Apart from these, the following properties are also available, and may be useful

### Spark UI

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.eventLog.logBlockUpdates.enabled</code></td>
@@ -1755,7 +1755,7 @@ Apart from these, the following properties are also available, and may be useful

### Compression and Serialization

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.broadcast.compress</code></td>
@@ -1972,7 +1972,7 @@ Apart from these, the following properties are also available, and may be useful

### Memory Management

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.memory.fraction</code></td>
@@ -2097,7 +2097,7 @@ Apart from these, the following properties are also available, and may be useful

### Execution Behavior

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.broadcast.blockSize</code></td>
@@ -2342,7 +2342,7 @@ Apart from these, the following properties are also available, and may be useful

### Executor Metrics

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.eventLog.logStageExecutorMetrics</code></td>
@@ -2410,7 +2410,7 @@ Apart from these, the following properties are also available, and may be useful

### Networking

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.rpc.message.maxSize</code></td>
@@ -2573,7 +2573,7 @@ Apart from these, the following properties are also available, and may be useful

### Scheduling

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.cores.max</code></td>
@@ -3054,7 +3054,7 @@ Apart from these, the following properties are also available, and may be useful

### Barrier Execution Mode

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.barrier.sync.timeout</code></td>
@@ -3101,7 +3101,7 @@ Apart from these, the following properties are also available, and may be useful

### Dynamic Allocation

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.dynamicAllocation.enabled</code></td>
@@ -3243,7 +3243,7 @@ finer granularity starting from driver and executor. Take RPC module as example
like shuffle, just replace "rpc" with "shuffle" in the property names except
<code>spark.{driver|executor}.rpc.netty.dispatcher.numThreads</code>, which is only for RPC module.

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.{driver|executor}.rpc.io.serverThreads</code></td>
@@ -3281,7 +3281,7 @@ the driver or executor, or, in the absence of that value, the number of cores av
Server configurations are set in Spark Connect server, for example, when you start the Spark Connect server with `./sbin/start-connect-server.sh`.
They are typically set via the config file and command-line options with `--conf/-c`.

<table class="table">
<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.connect.grpc.binding.port</code></td>
@@ -3373,7 +3373,7 @@ External users can query the static sql config values via `SparkSession.conf` or

### Spark Streaming

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.streaming.backpressure.enabled</code></td>
@@ -3505,7 +3505,7 @@ External users can query the static sql config values via `SparkSession.conf` or

### SparkR

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.r.numRBackendThreads</code></td>
@@ -3561,7 +3561,7 @@ External users can query the static sql config values via `SparkSession.conf` or

### GraphX

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.graphx.pregel.checkpointInterval</code></td>
@@ -3735,7 +3735,7 @@ Push-based shuffle helps improve the reliability and performance of spark shuffl

### External Shuffle service(server) side configuration options

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.shuffle.push.server.mergedShuffleFileManagerImpl</code></td>
@@ -3769,7 +3769,7 @@ Push-based shuffle helps improve the reliability and performance of spark shuffl

### Client side configuration options

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.shuffle.push.enabled</code></td>
56 changes: 44 additions & 12 deletions docs/css/custom.css
@@ -557,7 +557,6 @@ pre {
border-radius: 4px;
}

-code,
pre {
font: 1em Menlo, Monaco, Consolas, "Courier New", monospace;
}
@@ -741,7 +740,6 @@ h3 {
margin: 0;
}

-.global code,
.global pre {
font: 1em Menlo, Monaco, Consolas, "Courier New", monospace;
}
@@ -761,15 +759,6 @@
border-radius: 4px;
}

-.global code {
-  font: 90% "Menlo", "Lucida Console", Consolas, monospace;
-  white-space: nowrap;
-  background: transparent;
-  border-radius: 4px;
-  padding: 0;
-  color: inherit;
-}

.global pre code {
padding: 0;
font-size: inherit;
@@ -936,8 +925,14 @@ img {

table {
width: 100%;
-  overflow-wrap: normal;
+  overflow-wrap: break-word;
border-collapse: collapse;
+  white-space: normal;
}
+
+table code {
+  overflow-wrap: break-word;
+  white-space: normal;
+}

table th,
@@ -956,3 +951,40 @@ table tr {
table tr:nth-child(2n) {
    background-color: #F1F4F5;
}
+
+table.spark-config {
+  width: 100%;
+  table-layout: fixed;
+  white-space: normal;
+  overflow-wrap: break-word;
+}
+
+/* We have long config names and formulas that often show up in tables. To prevent
+ * any table column from becoming super wide, we allow the browser to break words at
+ * any point.
+ */
+table.spark-config code,
+table.spark-config th,
+table.spark-config td {
+  white-space: normal;
+  overflow-wrap: break-word;
+}
+
+/* CSS does not respect max-width on tables or table parts (like cells, columns, etc.),
+   so we have to pick a fixed width for each column.
+   See: https://stackoverflow.com/a/8465980
+*/
+table.spark-config th:nth-child(1),
+table.spark-config td:nth-child(1) {
+  width: 30%;
+}
+
+table.spark-config th:nth-child(2),
+table.spark-config td:nth-child(2) {
+  width: 20%;
+}
+
+table.spark-config th:nth-child(4),
+table.spark-config td:nth-child(4) {
+  width: 90px;
+}
2 changes: 1 addition & 1 deletion docs/monitoring.md
@@ -145,7 +145,7 @@ Use it with caution.
Security options for the Spark History Server are covered in more detail in the
[Security](security.html#web-ui) page.

-<table>
+<table class="spark-config">
<thead>
<tr>
<th>Property Name</th>
2 changes: 1 addition & 1 deletion docs/running-on-kubernetes.md
@@ -592,7 +592,7 @@ See the [configuration page](configuration.html) for information on Spark config

#### Spark Properties

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.kubernetes.context</code></td>
6 changes: 3 additions & 3 deletions docs/running-on-yarn.md
@@ -143,7 +143,7 @@ To use a custom metrics.properties for the application master and executors, upd

#### Spark Properties

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.yarn.am.memory</code></td>
@@ -766,7 +766,7 @@ staging directory of the Spark application.

## YARN-specific Kerberos Configuration

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.kerberos.keytab</code></td>
@@ -865,7 +865,7 @@ to avoid garbage collection issues during shuffle.

The following extra configuration options are available when the shuffle service is running on YARN:

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.yarn.shuffle.stopOnFailure</code></td>
18 changes: 9 additions & 9 deletions docs/security.md
@@ -60,7 +60,7 @@ distributing the shared secret. Each application will use a unique shared secret. In
the case of YARN, this feature relies on YARN RPC encryption being enabled for the distribution of
secrets to be secure.

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.yarn.shuffle.server.recovery.disabled</code></td>
@@ -82,7 +82,7 @@ that any user that can list pods in the namespace where the Spark application is
also see their authentication secret. Access control rules should be properly set up by the
Kubernetes admin to ensure that Spark authentication is secure.

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.authenticate</code></td>
@@ -103,7 +103,7 @@ Kubernetes admin to ensure that Spark authentication is secure.
Alternatively, one can mount authentication secrets using files and Kubernetes secrets that
the user mounts into their pods.

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.authenticate.secret.file</code></td>
@@ -178,7 +178,7 @@ is still required when talking to shuffle services from Spark versions older tha

The following table describes the different options available for configuring this feature.

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.network.crypto.enabled</code></td>
@@ -249,7 +249,7 @@ encrypting output data generated by applications with APIs such as `saveAsHadoop

The following settings cover enabling encryption for data written to disk:

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.io.encryption.enabled</code></td>
@@ -317,7 +317,7 @@ below.

The following options control the authentication of Web UIs:

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.ui.allowFramingFrom</code></td>
@@ -421,7 +421,7 @@ servlet filters.

To enable authorization in the SHS, a few extra options are used:

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.history.ui.acls.enable</code></td>
@@ -734,7 +734,7 @@ Apache Spark can be configured to include HTTP headers to aid in preventing Cros
(XSS), Cross-Frame Scripting (XFS), MIME-Sniffing, and also to enforce HTTP Strict Transport
Security.

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.ui.xXssProtection</code></td>
@@ -917,7 +917,7 @@ deployment-specific page for more information.

The following options provide finer-grained control for this feature:

-<table>
+<table class="spark-config">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.security.credentials.${service}.enabled</code></td>