Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some bug fixes and docs fixes #2

Merged
merged 5 commits into from
Mar 20, 2015
Merged

Conversation

shivaram
Copy link

I ran R CMD check --as-cran which found a few errors in the code and docs. I fixed the code issues and tried to address a few of the docs issues in this PR. We still have some problem with duplicate arguments for cases where RDD and DataFrame methods exist and documenting generics with '...', but we can fix that in a future PR.

cc @davies

@davies
Copy link

davies commented Mar 20, 2015

LGTM, merging

davies pushed a commit that referenced this pull request Mar 20, 2015
Some bug fixes and docs fixes
@davies davies merged commit 463e28c into amplab-extras:R Mar 20, 2015
davies pushed a commit that referenced this pull request Mar 25, 2015
…, because this will make some UDAF can not work.

spark avoid old inteface of hive, then some udaf can not work like "org.apache.hadoop.hive.ql.udf.generic.GenericUDAFAverage"

Author: DoingDone9 <799203320@qq.com>

Closes apache#5131 from DoingDone9/udaf and squashes the following commits:

9de08d0 [DoingDone9] Update HiveUdfSuite.scala
49c62dc [DoingDone9] Update hiveUdfs.scala
98b134f [DoingDone9] Merge pull request #5 from apache/master
161cae3 [DoingDone9] Merge pull request #4 from apache/master
c87e8b6 [DoingDone9] Merge pull request #3 from apache/master
cb1852d [DoingDone9] Merge pull request #2 from apache/master
c3f046f [DoingDone9] Merge pull request #1 from apache/master
shivaram pushed a commit that referenced this pull request Apr 7, 2015
…her duplicate in HiveQl

Author: DoingDone9 <799203320@qq.com>

Closes apache#4973 from DoingDone9/sort_token and squashes the following commits:

855fa10 [DoingDone9] Update HiveQl.scala
c7080b3 [DoingDone9] Sort these tokens in alphabetic order to avoid further duplicate in HiveQl
c87e8b6 [DoingDone9] Merge pull request #3 from apache/master
cb1852d [DoingDone9] Merge pull request #2 from apache/master
c3f046f [DoingDone9] Merge pull request #1 from apache/master
shivaram pushed a commit that referenced this pull request Apr 7, 2015
… failed!!

wrong code : val tmpDir = Files.createTempDir()
not Files should Utils

Author: DoingDone9 <799203320@qq.com>

Closes apache#5198 from DoingDone9/FilesBug and squashes the following commits:

6e0140d [DoingDone9] Update InsertIntoHiveTableSuite.scala
e57d23f [DoingDone9] Update InsertIntoHiveTableSuite.scala
802261c [DoingDone9] Merge pull request #7 from apache/master
d00303b [DoingDone9] Merge pull request #6 from apache/master
98b134f [DoingDone9] Merge pull request #5 from apache/master
161cae3 [DoingDone9] Merge pull request #4 from apache/master
c87e8b6 [DoingDone9] Merge pull request #3 from apache/master
cb1852d [DoingDone9] Merge pull request #2 from apache/master
c3f046f [DoingDone9] Merge pull request #1 from apache/master
shivaram pushed a commit that referenced this pull request Apr 7, 2015
Added optional model type parameter for  NaiveBayes training. Can be either Multinomial or Bernoulli.

When Bernoulli is given the Bernoulli smoothing is used for fitting and for prediction as per: http://nlp.stanford.edu/IR-book/html/htmledition/the-bernoulli-model-1.html.

 Default for model is original Multinomial fit and predict.

Added additional testing for Bernoulli and Multinomial models.

Author: leahmcguire <lmcguire@salesforce.com>
Author: Joseph K. Bradley <joseph@databricks.com>
Author: Leah McGuire <lmcguire@salesforce.com>

Closes apache#4087 from leahmcguire/master and squashes the following commits:

f3c8994 [leahmcguire] changed checks on model type to requires
acb69af [leahmcguire] removed enum type and replaces all modelType parameters with strings
2224b15 [Leah McGuire] Merge pull request #2 from jkbradley/leahmcguire-master
9ad89ca [Joseph K. Bradley] removed old code
6a8f383 [Joseph K. Bradley] Added new model save/load format 2.0 for NaiveBayesModel after modelType parameter was added.  Updated tests.  Also updated ModelType enum-like type.
852a727 [leahmcguire] merged with upstream master
a22d670 [leahmcguire] changed NaiveBayesModel modelType parameter back to NaiveBayes.ModelType, made NaiveBayes.ModelType serializable, fixed getter method in NavieBayes
18f3219 [leahmcguire] removed private from naive bayes constructor for lambda only
bea62af [leahmcguire] put back in constructor for NaiveBayes
01baad7 [leahmcguire] made fixes from code review
fb0a5c7 [leahmcguire] removed typo
e2d925e [leahmcguire] fixed nonserializable error that was causing naivebayes test failures
2d0c1ba [leahmcguire] fixed typo in NaiveBayes
c298e78 [leahmcguire] fixed scala style errors
b85b0c9 [leahmcguire] Merge remote-tracking branch 'upstream/master'
900b586 [leahmcguire] fixed model call so that uses type argument
ea09b28 [leahmcguire] Merge remote-tracking branch 'upstream/master'
e016569 [leahmcguire] updated test suite with model type fix
85f298f [leahmcguire] Merge remote-tracking branch 'upstream/master'
dc65374 [leahmcguire] integrated model type fix
7622b0c [leahmcguire] added comments and fixed style as per rb
b93aaf6 [Leah McGuire] Merge pull request #1 from jkbradley/nb-model-type
3730572 [Joseph K. Bradley] modified NB model type to be more Java-friendly
b61b5e2 [leahmcguire] added back compatable constructor to NaiveBayesModel to fix MIMA test failure
5a4a534 [leahmcguire] fixed scala style error in NaiveBayes
3891bf2 [leahmcguire] synced with apache spark and resolved merge conflict
d9477ed [leahmcguire] removed old inaccurate comment from test suite for mllib naive bayes
76e5b0f [leahmcguire] removed unnecessary sort from test
0313c0c [leahmcguire] fixed style error in NaiveBayes.scala
4a3676d [leahmcguire] Updated changes re-comments. Got rid of verbose populateMatrix method. Public api now has string instead of enumeration. Docs are updated."
ce73c63 [leahmcguire] added Bernoulli option to niave bayes model in mllib, added optional model type parameter for training. When Bernoulli is given the Bernoulli smoothing is used for fitting and for prediction http://nlp.stanford.edu/IR-book/html/htmledition/the-bernoulli-model-1.html
shivaram pushed a commit that referenced this pull request Apr 7, 2015
1. Test JARs are built & published
1. log4j.resources is explicitly excluded. Without this, downstream test run logging depends on the order the JARs are listed/loaded
1. sql/hive pulls in spark-sql &...spark-catalyst for its test runs
1. The copied in test classes were rm'd, and a test edited to remove its now duplicate assert method
1. Spark streaming is now build with the same plugin/phase as the rest, but its shade plugin declaration is kept in (so different from the rest of the test plugins). Due to (#2), this means the test JAR no longer includes its log4j file.

Outstanding issues:
* should the JARs be shaded? `spark-streaming-test.jar` does, but given these are test jars for developers only, especially in the same spark source tree, it's hard to justify.
* `maven-jar-plugin` v 2.6 was explicitly selected; without this the apache-1.4 parent template JAR version (2.4) chosen.
* Are there any other resources to exclude?

Author: Steve Loughran <stevel@hortonworks.com>

Closes apache#5119 from steveloughran/stevel/patches/SPARK-6433-test-jars and squashes the following commits:

81ceb01 [Steve Loughran] SPARK-6433 add a clearer comment explaining what the plugin is doing & why
a6dca33 [Steve Loughran] SPARK-6433 : pull configuration section form archive plugin
c2b5f89 [Steve Loughran] SPARK-6433 omit "jar" goal from jar plugin
fdac51b [Steve Loughran] SPARK-6433 -002; indentation & delegate plugin version to parent
650f442 [Steve Loughran] SPARK-6433 patch 001: test JARs are built; sql/hive pulls in spark-sql & spark-catalyst for its test runs
shivaram pushed a commit that referenced this pull request Apr 7, 2015
Author: Yin Huai <yhuai@databricks.com>
Author: Michael Armbrust <michael@databricks.com>

Closes apache#5333 from yhuai/lookupRelationLock and squashes the following commits:

59c884f [Michael Armbrust] [SQL] Lock metastore client in analyzeTable
7667030 [Yin Huai] Merge pull request #2 from marmbrus/pr/5333
e4a9b0b [Michael Armbrust] Correctly lock on MetastoreCatalog
d6fc32f [Yin Huai] Missing `)`.
1e241af [Yin Huai] Protect InsertIntoHive.
fee7e9c [Yin Huai] A test?
5416b0f [Yin Huai] Just protect client.
shivaram pushed a commit that referenced this pull request Apr 7, 2015
…s that order.dataType does not match NativeType

It did not conside that order.dataType does not match NativeType. So i add "case other => ..." for other cenarios.

Author: DoingDone9 <799203320@qq.com>

Closes apache#4959 from DoingDone9/case_ and squashes the following commits:

6278846 [DoingDone9] Update rows.scala
cb1852d [DoingDone9] Merge pull request #2 from apache/master
c3f046f [DoingDone9] Merge pull request #1 from apache/master
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants