[SPARK-12069][SQL] Update documentation with Datasets #10060

Closed
wants to merge 7 commits into from

marmbrus
Contributor

@marmbrus marmbrus commented Dec 1, 2015

No description provided.

@Experimental
@implicitNotFound("Unable to find encoder for type stored in a Dataset. Primitive types " +
"(Int, String, etc) and Products (case classes) and primitive types are supported by " +
"importing sqlContext.implicits._ Support for serializing other types will be added in future " +
Contributor

@marmbrus Primitive types mentioned twice ? Is it ok ?
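
For readers unfamiliar with the annotation in this hunk: `@implicitNotFound` replaces the compiler's generic "could not find implicit value" error with a custom message, which is why the wording of that message matters. A minimal sketch in plain Scala follows. `ToyEncoder`, `ToyImplicits`, and `ToyEncoderDemo` are hypothetical names made up for illustration; they are not Spark's `Encoder` API.

```scala
import scala.annotation.implicitNotFound

// Hypothetical stand-in for an encoder type class; this is NOT Spark's Encoder.
@implicitNotFound("Unable to find ToyEncoder for type ${T}. Import ToyImplicits._ for Int and String.")
trait ToyEncoder[T] {
  def encode(value: T): String
}

// Hypothetical object playing the role that sqlContext.implicits plays in the PR.
object ToyImplicits {
  implicit val intEncoder: ToyEncoder[Int] = new ToyEncoder[Int] {
    def encode(value: Int): String = s"int:$value"
  }
  implicit val stringEncoder: ToyEncoder[String] = new ToyEncoder[String] {
    def encode(value: String): String = s"string:$value"
  }
}

object ToyEncoderDemo {
  // Requires an implicit ToyEncoder[T]; the compiler resolves it at each call site.
  def encode[T](value: T)(implicit enc: ToyEncoder[T]): String = enc.encode(value)

  import ToyImplicits._

  val encodedInt: String = encode(42)
  val encodedString: String = encode("spark")

  // encode(3.14) would not compile here: no ToyEncoder[Double] is in scope, so the
  // compiler reports the custom @implicitNotFound message instead of the default one.
}
```

Because the message is what users see when an import is missing, a duplicated phrase in it shows up directly in their compiler output.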

@SparkQA

SparkQA commented Dec 1, 2015

Test build #46946 has finished for PR 10060 at commit 649541c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* Encoders are not intended to be thread-safe and thus they are allow to avoid internal locking
* and reuse internal buffers to improve performance.
* == Scala ==
* Encoders are generally created automatically though implicits from a `SQLContext`.
Contributor

I might be mistaken but I think you meant to write "through" and not "though".

Contributor

It would also be great to expand this slightly and explain what can be inferred automatically right now.
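
As a rough illustration of what the error message in this PR says is inferred automatically (primitive types and case classes, via `sqlContext.implicits._`), here is a sketch against the Spark 1.6 API. It assumes `spark-sql` 1.6 is on the classpath and uses a local master. `Person`, `EncoderInferenceSketch`, and the app name are made up for the example and do not come from the PR.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Dataset, SQLContext}

// Hypothetical case class, defined at the top level so the
// reflection-based encoder derivation can see it.
case class Person(name: String, age: Long)

object EncoderInferenceSketch {
  // Builds two small Datasets and returns the collected results.
  def run(): (Array[Int], Array[String]) = {
    val conf = new SparkConf().setAppName("encoder-inference-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val sqlContext = new SQLContext(sc)
      // Brings the implicit encoders and the toDS() conversion into scope.
      import sqlContext.implicits._

      // Encoder[Int] is supplied implicitly for a primitive type.
      val numbers: Dataset[Int] = Seq(1, 2, 3).toDS()
      val incremented: Array[Int] = numbers.map(_ + 1).collect()

      // Encoder[Person] is supplied implicitly for a case class (a Product).
      val people: Dataset[Person] = Seq(Person("Andy", 32L)).toDS()
      val names: Array[String] = people.map(_.name).collect()

      (incremented, names)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (incremented, names) = run()
    println(incremented.mkString(", "))
    println(names.mkString(", "))
  }
}
```

Types outside these two groups have no implicit encoder in scope, which is the case the `@implicitNotFound` message is meant to explain.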

@@ -19,6 +19,9 @@ package org.apache.spark.sql

import java.lang.reflect.Modifier

import org.apache.spark.annotation.Experimental
Contributor

import order

@SparkQA

SparkQA commented Dec 3, 2015

Test build #47151 has finished for PR 10060 at commit 3e53a4c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

## Datasets

A Dataset is a new experimental interface added in Spark 1.6 that tries to provide the benefits of
RDDs (strong typing, ability to use powerful lambda functions) with the benifits of Spark SQL's
Member

benifits -> benefits

@SparkQA

SparkQA commented Dec 8, 2015

Test build #47356 has finished for PR 10060 at commit 3ff7a46.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -9,18 +9,51 @@ title: Spark SQL and DataFrames

# Overview

-Spark SQL is a Spark module for structured data processing. It provides a programming abstraction called DataFrames and can also act as distributed SQL query engine.
+Spark SQL is a Spark module for structured data processing. Unlike the basic Spark RDD API, the interfaces provided
+by Spark SQL provide Spark with more about the structure of both the data and the computation being performed. Internally,
Contributor

Is there a word missing between "more" and "about" like information?

@BenFradet
Contributor

I made a few comments, but otherwise it's clear.

@marmbrus
Contributor Author

marmbrus commented Dec 8, 2015

Thanks for the comments!

@SparkQA

SparkQA commented Dec 8, 2015

Test build #47366 has finished for PR 10060 at commit 4b51ad7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus marmbrus changed the title from "[WIP][SPARK-12069][SQL] Update documentation with Datasets" to "[SPARK-12069][SQL] Update documentation with Datasets" on Dec 8, 2015
asfgit pushed a commit that referenced this pull request Dec 8, 2015
Author: Michael Armbrust <michael@databricks.com>

Closes #10060 from marmbrus/docs.

(cherry picked from commit 3959489)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@asfgit asfgit closed this in 3959489 Dec 9, 2015
@marmbrus marmbrus deleted the docs branch March 8, 2016 00:04
6 participants