[SPARK-44997][DOCS] Align example order (Python -> Scala/Java -> R) in all Spark Doc Content #42712

Closed · wants to merge 4 commits
2 changes: 1 addition & 1 deletion docs/_layouts/global.html
@@ -71,9 +71,9 @@
 <li class="nav-item dropdown">
 <a href="#" class="nav-link dropdown-toggle" id="navbarAPIDocs" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">API Docs</a>
 <div class="dropdown-menu" aria-labelledby="navbarAPIDocs">
+<a class="dropdown-item" href="api/python/index.html">Python</a>
 <a class="dropdown-item" href="api/scala/org/apache/spark/index.html">Scala</a>
 <a class="dropdown-item" href="api/java/index.html">Java</a>
-<a class="dropdown-item" href="api/python/index.html">Python</a>
 <a class="dropdown-item" href="api/R/index.html">R</a>
 <a class="dropdown-item" href="api/sql/index.html">SQL, Built-in Functions</a>
 </div>

> Contributor Author (inline comment on the added Python line) — Before: [screenshot]; After: [screenshot].
6 changes: 3 additions & 3 deletions docs/index.md
@@ -120,9 +120,9 @@ options for deployment:
 
 **API Docs:**
 
+* [Spark Python API (Sphinx)](api/python/index.html)
 * [Spark Scala API (Scaladoc)](api/scala/org/apache/spark/index.html)
 * [Spark Java API (Javadoc)](api/java/index.html)
-* [Spark Python API (Sphinx)](api/python/index.html)
 * [Spark R API (Roxygen2)](api/R/index.html)
 * [Spark SQL, Built-in Functions (MkDocs)](api/sql/index.html)
 
@@ -163,7 +163,7 @@ options for deployment:
 * AMP Camps: a series of training camps at UC Berkeley that featured talks and
 exercises about Spark, Spark Streaming, Mesos, and more. [Videos](https://www.youtube.com/user/BerkeleyAMPLab/search?query=amp%20camp)
 are available online for free.
-* [Code Examples](https://spark.apache.org/examples.html): more are also available in the `examples` subfolder of Spark ([Scala]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/scala/org/apache/spark/examples),
+* [Code Examples](https://spark.apache.org/examples.html): more are also available in the `examples` subfolder of Spark ([Python]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/python),
+[Scala]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/scala/org/apache/spark/examples),
 [Java]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/java/org/apache/spark/examples),
-[Python]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/python),
 [R]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/r))
6 changes: 3 additions & 3 deletions docs/ml-pipeline.md
@@ -238,9 +238,9 @@ notes, then it should be treated as a bug to be fixed.
 
 This section gives code examples illustrating the functionality discussed above.
 For more info, please refer to the API documentation
-([Scala](api/scala/org/apache/spark/ml/package.html),
-[Java](api/java/org/apache/spark/ml/package-summary.html),
-and [Python](api/python/reference/pyspark.ml.html)).
+([Python](api/python/reference/pyspark.ml.html),
+[Scala](api/scala/org/apache/spark/ml/package.html),
+and [Java](api/java/org/apache/spark/ml/package-summary.html)).
 
 ## Example: Estimator, Transformer, and Param
 
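The doc's own example listing under this heading is collapsed in the diff. As a minimal, illustrative PySpark sketch of the Estimator → Transformer → Param flow the section names (the toy rows below are invented for the sketch, not the doc's actual listing):

```python
from pyspark.sql import SparkSession
from pyspark.ml.linalg import Vectors
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("EstimatorTransformerParam").getOrCreate()

# Toy training DataFrame of (label, features) rows -- values invented for the sketch.
training = spark.createDataFrame([
    (1.0, Vectors.dense([0.0, 1.1, 0.1])),
    (0.0, Vectors.dense([2.0, 1.0, -1.0])),
    (0.0, Vectors.dense([2.0, 1.3, 1.0])),
    (1.0, Vectors.dense([0.0, 1.2, -0.5]))], ["label", "features"])

# LogisticRegression is an Estimator; maxIter and regParam are Params.
lr = LogisticRegression(maxIter=10, regParam=0.01)

# Estimator.fit() returns a Model, which is a Transformer.
model = lr.fit(training)

# Transformer.transform() appends prediction columns to the input DataFrame.
model.transform(training).select("features", "label", "prediction").show()
```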
10 changes: 5 additions & 5 deletions docs/quick-start.md
@@ -470,19 +470,19 @@ Congratulations on running your first Spark application!
 * For an in-depth overview of the API, start with the [RDD programming guide](rdd-programming-guide.html) and the [SQL programming guide](sql-programming-guide.html), or see the "Programming Guides" menu for other components.
 * For running applications on a cluster, head to the [deployment overview](cluster-overview.html).
 * Finally, Spark includes several samples in the `examples` directory
-([Scala]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/scala/org/apache/spark/examples),
+([Python]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/python),
+[Scala]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/scala/org/apache/spark/examples),
 [Java]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/java/org/apache/spark/examples),
-[Python]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/python),
 [R]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/r)).
 You can run them as follows:
 
 {% highlight bash %}
-# For Scala and Java, use run-example:
-./bin/run-example SparkPi
-
 # For Python examples, use spark-submit directly:
 ./bin/spark-submit examples/src/main/python/pi.py
 
+# For Scala and Java, use run-example:
+./bin/run-example SparkPi
+
 # For R examples, use spark-submit directly:
 ./bin/spark-submit examples/src/main/r/dataframe.R
 {% endhighlight %}
20 changes: 10 additions & 10 deletions docs/rdd-programming-guide.md
@@ -945,9 +945,9 @@ documentation](https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#h
 
 The following table lists some of the common transformations supported by Spark. Refer to the
 RDD API doc
-([Scala](api/scala/org/apache/spark/rdd/RDD.html),
+([Python](api/python/reference/api/pyspark.RDD.html#pyspark.RDD),
+[Scala](api/scala/org/apache/spark/rdd/RDD.html),
 [Java](api/java/index.html?org/apache/spark/api/java/JavaRDD.html),
-[Python](api/python/reference/api/pyspark.RDD.html#pyspark.RDD),
 [R](api/R/reference/index.html))
 and pair RDD functions doc
 ([Scala](api/scala/org/apache/spark/rdd/PairRDDFunctions.html),
@@ -1059,9 +1059,9 @@ for details.
 
 The following table lists some of the common actions supported by Spark. Refer to the
 RDD API doc
-([Scala](api/scala/org/apache/spark/rdd/RDD.html),
+([Python](api/python/reference/api/pyspark.RDD.html#pyspark.RDD),
+[Scala](api/scala/org/apache/spark/rdd/RDD.html),
 [Java](api/java/index.html?org/apache/spark/api/java/JavaRDD.html),
-[Python](api/python/reference/api/pyspark.RDD.html#pyspark.RDD),
 [R](api/R/reference/index.html))
 
 and pair RDD functions doc
@@ -1207,9 +1207,9 @@ In addition, each persisted RDD can be stored using a different *storage level*,
 to persist the dataset on disk, persist it in memory but as serialized Java objects (to save space),
 replicate it across nodes.
 These levels are set by passing a
-`StorageLevel` object ([Scala](api/scala/org/apache/spark/storage/StorageLevel.html),
-[Java](api/java/index.html?org/apache/spark/storage/StorageLevel.html),
-[Python](api/python/reference/api/pyspark.StorageLevel.html#pyspark.StorageLevel))
+`StorageLevel` object ([Python](api/python/reference/api/pyspark.StorageLevel.html#pyspark.StorageLevel),
+[Scala](api/scala/org/apache/spark/storage/StorageLevel.html),
+[Java](api/java/index.html?org/apache/spark/storage/StorageLevel.html))
 to `persist()`. The `cache()` method is a shorthand for using the default storage level,
 which is `StorageLevel.MEMORY_ONLY` (store deserialized objects in memory). The full set of
 storage levels is:
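A minimal PySpark sketch of the `persist()`/`cache()` calls described in this hunk, assuming a local SparkContext (the RDD contents are invented):

```python
from pyspark import SparkContext, StorageLevel

sc = SparkContext("local[2]", "PersistSketch")

rdd = sc.parallelize(range(1000))  # toy dataset, invented for the sketch

# Pass a StorageLevel to persist(); MEMORY_AND_DISK keeps partitions in memory
# and spills to disk when they do not fit.
rdd.persist(StorageLevel.MEMORY_AND_DISK)

# cache() is the shorthand for persist(StorageLevel.MEMORY_ONLY).
squares = rdd.map(lambda x: x * x).cache()

print(squares.count())  # the first action materializes and stores the RDD
print(squares.count())  # later actions reuse the persisted partitions
```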
@@ -1596,9 +1596,9 @@ as Spark does not support two contexts running concurrently in the same program.
 
 You can see some [example Spark programs](https://spark.apache.org/examples.html) on the Spark website.
 In addition, Spark includes several samples in the `examples` directory
-([Scala]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/scala/org/apache/spark/examples),
+([Python]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/python),
+[Scala]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/scala/org/apache/spark/examples),
 [Java]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/java/org/apache/spark/examples),
-[Python]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/python),
 [R]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/r)).
 You can run Java and Scala examples by passing the class name to Spark's `bin/run-example` script; for instance:
 
@@ -1619,4 +1619,4 @@ For help on deploying, the [cluster mode overview](cluster-overview.html) descri
 in distributed operation and supported cluster managers.
 
 Finally, full API documentation is available in
-[Scala](api/scala/org/apache/spark/), [Java](api/java/), [Python](api/python/) and [R](api/R/).
+[Python](api/python/), [Scala](api/scala/org/apache/spark/), [Java](api/java/) and [R](api/R/).
2 changes: 1 addition & 1 deletion docs/sql-getting-started.md
@@ -108,7 +108,7 @@ As an example, the following creates a DataFrame based on the content of a JSON
 
 ## Untyped Dataset Operations (aka DataFrame Operations)
 
-DataFrames provide a domain-specific language for structured data manipulation in [Scala](api/scala/org/apache/spark/sql/Dataset.html), [Java](api/java/index.html?org/apache/spark/sql/Dataset.html), [Python](api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.html) and [R](api/R/reference/SparkDataFrame.html).
+DataFrames provide a domain-specific language for structured data manipulation in [Python](api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.html), [Scala](api/scala/org/apache/spark/sql/Dataset.html), [Java](api/java/index.html?org/apache/spark/sql/Dataset.html) and [R](api/R/reference/SparkDataFrame.html).
 
 As mentioned above, in Spark 2.0, DataFrames are just a Dataset of `Row`s in the Scala and Java APIs. These operations are also referred to as "untyped transformations", in contrast to the "typed transformations" that come with strongly typed Scala/Java Datasets.
 
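A short PySpark sketch of the untyped DataFrame operations this hunk describes (the column names and rows are invented for the sketch):

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("UntypedOps").getOrCreate()

# Toy DataFrame -- schema and rows invented for the sketch.
df = spark.createDataFrame([("Alice", 34), ("Bob", 29), ("Cathy", 31)],
                           ["name", "age"])

# Untyped (DataFrame) operations expressed through the DSL:
df.select(col("name"), (col("age") + 1).alias("age_next_year")).show()
df.filter(col("age") > 30).show()
df.groupBy("age").count().show()
```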
5 changes: 3 additions & 2 deletions docs/sql-programming-guide.md
@@ -55,8 +55,9 @@ A DataFrame is a *Dataset* organized into named columns. It is conceptually
 equivalent to a table in a relational database or a data frame in R/Python, but with richer
 optimizations under the hood. DataFrames can be constructed from a wide array of [sources](sql-data-sources.html) such
 as structured data files, tables in Hive, external databases, or existing RDDs.
-The DataFrame API is available in Scala,
-Java, [Python](api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.html#pyspark.sql.DataFrame), and [R](api/R/index.html).
+The DataFrame API is available in
+[Python](api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.html#pyspark.sql.DataFrame), Scala,
+Java and [R](api/R/index.html).
 In Scala and Java, a DataFrame is represented by a Dataset of `Row`s.
 In [the Scala API][scala-datasets], `DataFrame` is simply a type alias of `Dataset[Row]`.
 In the [Java API][java-datasets], by contrast, users need to use `Dataset<Row>` to represent a `DataFrame`.
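As a hedged sketch of constructing a DataFrame from one of the sources listed above -- a structured data file -- in Python (the JSON path is illustrative):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("DataFrameSources").getOrCreate()

# Build a DataFrame from a structured data file; the path is illustrative.
df = spark.read.json("examples/src/main/resources/people.json")

df.printSchema()
df.show()
```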
22 changes: 11 additions & 11 deletions docs/streaming-programming-guide.md
@@ -762,9 +762,9 @@ DStreams can be created with data streams received through custom receivers. See
 For testing a Spark Streaming application with test data, one can also create a DStream based on a queue of RDDs, using `streamingContext.queueStream(queueOfRDDs)`. Each RDD pushed into the queue will be treated as a batch of data in the DStream, and processed like a stream.
 
 For more details on streams from sockets and files, see the API documentation of the relevant functions in
-[StreamingContext](api/scala/org/apache/spark/streaming/StreamingContext.html) for
-Scala, [JavaStreamingContext](api/java/index.html?org/apache/spark/streaming/api/java/JavaStreamingContext.html)
-for Java, and [StreamingContext](api/python/reference/api/pyspark.streaming.StreamingContext.html#pyspark.streaming.StreamingContext) for Python.
+[StreamingContext](api/python/reference/api/pyspark.streaming.StreamingContext.html#pyspark.streaming.StreamingContext) for Python,
+[StreamingContext](api/scala/org/apache/spark/streaming/StreamingContext.html) for Scala,
+and [JavaStreamingContext](api/java/index.html?org/apache/spark/streaming/api/java/JavaStreamingContext.html) for Java.
 
 ### Advanced Sources
 {:.no_toc}
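A hedged PySpark sketch of the `queueStream` testing pattern described in this hunk (the queue contents and batch interval are invented):

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext("local[2]", "QueueStreamSketch")
ssc = StreamingContext(sc, batchDuration=1)

# Each RDD pushed into the queue is consumed as one batch of the DStream.
rdd_queue = [sc.parallelize(range(i * 100, (i + 1) * 100)) for i in range(5)]
stream = ssc.queueStream(rdd_queue)

stream.map(lambda x: x % 10).countByValue().pprint()

ssc.start()
ssc.awaitTerminationOrTimeout(10)  # stop after the queue is drained
ssc.stop(stopSparkContext=True)
```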
@@ -1265,12 +1265,12 @@ JavaPairDStream<String, String> joinedStream = windowedStream.transform(rdd -> r
 
 In fact, you can also dynamically change the dataset you want to join against. The function provided to `transform` is evaluated every batch interval and will therefore use the current dataset that the `dataset` reference points to.
 
-The complete list of DStream transformations is available in the API documentation. For the Scala API,
-see [DStream](api/scala/org/apache/spark/streaming/dstream/DStream.html)
+The complete list of DStream transformations is available in the API documentation. For the Python API,
+see [DStream](api/python/reference/api/pyspark.streaming.DStream.html#pyspark.streaming.DStream).
+For the Scala API, see [DStream](api/scala/org/apache/spark/streaming/dstream/DStream.html)
 and [PairDStreamFunctions](api/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.html).
 For the Java API, see [JavaDStream](api/java/index.html?org/apache/spark/streaming/api/java/JavaDStream.html)
 and [JavaPairDStream](api/java/index.html?org/apache/spark/streaming/api/java/JavaPairDStream.html).
-For the Python API, see [DStream](api/python/reference/api/pyspark.streaming.DStream.html#pyspark.streaming.DStream).
 
***
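A hedged Python counterpart of the `transform`-based dynamic join described above; `word_counts` and the spam RDD are invented for the sketch:

```python
# Assumes an active StreamingContext `ssc` and a DStream `word_counts`
# of (word, count) pairs from earlier in the guide.
spam_info_rdd = ssc.sparkContext.parallelize([("spamword", True)])  # invented

# transform() is re-evaluated every batch interval, so the RDD referenced
# here can be swapped for a new dataset between batches.
cleaned = word_counts.transform(lambda rdd: rdd.subtractByKey(spam_info_rdd))
cleaned.pprint()
```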

@@ -2564,21 +2564,21 @@ additional effort may be necessary to achieve exactly-once semantics. There are
     - [Custom Receiver Guide](streaming-custom-receivers.html)
 * Third-party DStream data sources can be found in [Third Party Projects](https://spark.apache.org/third-party-projects.html)
 * API documentation
+  - Python docs
+    * [StreamingContext](api/python/reference/api/pyspark.streaming.StreamingContext.html#pyspark.streaming.StreamingContext) and [DStream](api/python/reference/api/pyspark.streaming.DStream.html#pyspark.streaming.DStream)
   - Scala docs
     * [StreamingContext](api/scala/org/apache/spark/streaming/StreamingContext.html) and
       [DStream](api/scala/org/apache/spark/streaming/dstream/DStream.html)
     * [KafkaUtils](api/scala/org/apache/spark/streaming/kafka/KafkaUtils$.html),
-      [KinesisUtils](api/scala/org/apache/spark/streaming/kinesis/KinesisInputDStream.html),
+      [KinesisUtils](api/scala/org/apache/spark/streaming/kinesis/KinesisInputDStream.html)
   - Java docs
     * [JavaStreamingContext](api/java/index.html?org/apache/spark/streaming/api/java/JavaStreamingContext.html),
       [JavaDStream](api/java/index.html?org/apache/spark/streaming/api/java/JavaDStream.html) and
       [JavaPairDStream](api/java/index.html?org/apache/spark/streaming/api/java/JavaPairDStream.html)
     * [KafkaUtils](api/java/index.html?org/apache/spark/streaming/kafka/KafkaUtils.html),
       [KinesisUtils](api/java/index.html?org/apache/spark/streaming/kinesis/KinesisInputDStream.html)
-  - Python docs
-    * [StreamingContext](api/python/reference/api/pyspark.streaming.StreamingContext.html#pyspark.streaming.StreamingContext) and [DStream](api/python/reference/api/pyspark.streaming.DStream.html#pyspark.streaming.DStream)
 
-* More examples in [Scala]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/scala/org/apache/spark/examples/streaming)
+* More examples in [Python]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/python/streaming)
+  and [Scala]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/scala/org/apache/spark/examples/streaming)
   and [Java]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/java/org/apache/spark/examples/streaming)
-  and [Python]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/python/streaming)
 * [Paper](http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-259.pdf) and [video](http://youtu.be/g171ndOHgJ0) describing Spark Streaming.