Commit

added rjava to fix dependencies issue in travis ci
trinker committed Aug 22, 2015
1 parent 37c0af6 commit 11afa1d
Showing 3 changed files with 28 additions and 26 deletions.
9 changes: 5 additions & 4 deletions .travis.yml
@@ -1,4 +1,4 @@
-language: c
+language: java
 
 sudo: required
 before_install:
@@ -7,7 +7,8 @@ before_install:
 - ./travis-tool.sh bootstrap
 install:
 - sh -e /etc/init.d/xvfb start
-- ./travis-tool.sh aptget_install r-cran-xml
+- ./travis-tool.sh aptget_install r-cran-xml
+- ./travis-tool.sh install_r_binary rjava
 - ./travis-tool.sh install_github hadley/devtools
 - ./travis-tool.sh install_deps
 - ./travis-tool.sh github_package jimhester/covr
@@ -20,7 +21,7 @@ notifications:
 on_failure: change
 env:
 global:
-- R_BUILD_ARGS="--resave-data=best"
+- R_BUILD_ARGS="--resave-data=best"
 - R_CHECK_ARGS="--as-cran"
 - DISPLAY=:99.0
-- BOOTSTRAP_LATEX=1
+- BOOTSTRAP_LATEX=1
2 changes: 1 addition & 1 deletion README.Rmd
@@ -35,7 +35,7 @@ knitr::opts_chunk$set(fig.path = "inst/figure/")
 
 **sentimentr** is designed to quickly calculate text polarity sentiment at the sentence level and optionally aggregate by rows or grouping variable(s).
 
-**sentimentr** is a response to my own needs with sentiment detection that were not addressed by the current **R** tools. My own `polarity` function in the **qdap** package is slower on larger data sets. It is a dictionary lookup approach that tries to incorporate weighting for valence shifters (negation and amplifiers/deamplifiers). Matthew Jockers created the [**syuzhet**](http://www.matthewjockers.net/2015/02/02/syuzhet/) package that utilizes dictionary lookups for the Bing, NRC, and Afinn methods. He also utilizes a wrapper for the [Stanford coreNLP](http://nlp.stanford.edu/software/corenlp.shtml) which uses much more sophisticated analysis. Jockers's dictionary methods are fast but are more prone to error in the case of valence shifters. Jockers [addressed these critiques](http://www.matthewjockers.net/2015/03/04/some-thoughts-on-annies-thoughts-about-syuzhet/) with regard to analyzing general sentiment in a piece of literature. He points to the accuracy of the Stanford detection as well. In my own work I need better accuracy than a simple dictionary lookup that considers valence shifters yet retains the speed that Stanford's parser does not have. This leads to a trade off of speed vs. accuracy. The equation below describes the dictionary method of **sentimentr** that may give better results than a dictionary approach that does not consider valence shifters but will likely still be less accurate than Stanford's approach. Simply, **sentimentr** attempts to balance accuracy and speed.
+**sentimentr** is a response to my own needs with sentiment detection that were not addressed by the current **R** tools. My own `polarity` function in the **qdap** package is slower on larger data sets. It is a dictionary lookup approach that tries to incorporate weighting for valence shifters (negation and amplifiers/deamplifiers). Matthew Jockers created the [**syuzhet**](http://www.matthewjockers.net/2015/02/02/syuzhet/) package that utilizes dictionary lookups for the Bing, NRC, and Afinn methods. He also utilizes a wrapper for the [Stanford coreNLP](http://nlp.stanford.edu/software/corenlp.shtml) which uses much more sophisticated analysis. Jockers's dictionary methods are fast but are more prone to error in the case of valence shifters. Jockers [addressed these critiques](http://www.matthewjockers.net/2015/03/04/some-thoughts-on-annies-thoughts-about-syuzhet/) explaining that the method is good with regard to analyzing general sentiment in a piece of literature. He points to the accuracy of the Stanford detection as well. In my own work I need better accuracy than a simple dictionary lookup; something that considers valence shifters yet optimizes speed which Stanford's parser does not. This leads to a trade off of speed vs. accuracy. The equation below describes the dictionary method of **sentimentr** that may give better results than a dictionary approach that does not consider valence shifters but will likely still be less accurate than Stanford's approach. Simply, **sentimentr** attempts to balance accuracy and speed.
 
 
 # The Equation
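
The README paragraph in this hunk contrasts a plain dictionary lookup with one that accounts for valence shifters (negation and amplifiers/deamplifiers). The sketch below is a toy illustration of that contrast only; it is not the equation used by **sentimentr**. The lexicon, the shifter word lists, the `window` size, and the `boost` weight are all invented for the example.

```python
# Toy lexicon and shifter lists, invented for illustration only.
POLARITY = {"good": 1.0, "happy": 1.0, "bad": -1.0, "sad": -1.0}
NEGATORS = {"not", "never", "no"}
AMPLIFIERS = {"very", "really"}
DEAMPLIFIERS = {"barely", "slightly"}


def plain_score(sentence):
    """Sum the polarity of each known word, ignoring its context."""
    words = sentence.lower().split()
    return sum(POLARITY.get(w, 0.0) for w in words)


def shifted_score(sentence, window=2, boost=0.5):
    """Dictionary lookup that also inspects the words just before each hit.

    `window` and `boost` are hypothetical parameters chosen for this sketch.
    """
    words = sentence.lower().split()
    total = 0.0
    for i, word in enumerate(words):
        if word not in POLARITY:
            continue
        value = POLARITY[word]
        context = words[max(0, i - window):i]
        # Amplifiers strengthen the polarized word, deamplifiers weaken it.
        for c in context:
            if c in AMPLIFIERS:
                value *= 1 + boost
            elif c in DEAMPLIFIERS:
                value *= 1 - boost
        # An odd number of negators in the window flips the sign.
        if sum(c in NEGATORS for c in context) % 2 == 1:
            value = -value
        total += value
    return total


print(plain_score("the food was not good"))    # context is ignored
print(shifted_score("the food was not good"))  # negation flips the sign
```

The plain lookup scores "not good" as positive because it never looks at the preceding word, which is the kind of error the paragraph attributes to dictionary methods without valence shifters.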
43 changes: 22 additions & 21 deletions README.md
@@ -31,15 +31,16 @@ more sophisticated analysis. Jockers's dictionary methods are fast but
 are more prone to error in the case of valence shifters. Jockers
 [addressed these
 critiques](http://www.matthewjockers.net/2015/03/04/some-thoughts-on-annies-thoughts-about-syuzhet/)
-with regard to analyzing general sentiment in a piece of literature. He
-points to the accuracy of the Stanford detection as well. In my own work
-I need better accuracy than a simple dictionary lookup that considers
-valence shifters yet retains the speed that Stanford's parser does not
-have. This leads to a trade off of speed vs. accuracy. The equation
-below describes the dictionary method of **sentimentr** that may give
-better results than a dictionary approach that does not consider valence
-shifters but will likely still be less accurate than Stanford's
-approach. Simply, **sentimentr** attempts to balance accuracy and speed.
+explaining that the method is good with regard to analyzing general
+sentiment in a piece of literature. He points to the accuracy of the
+Stanford detection as well. In my own work I need better accuracy than a
+simple dictionary lookup; something that considers valence shifters yet
+optimizes speed which Stanford's parser does not. This leads to a
+trade off of speed vs. accuracy. The equation below describes the
+dictionary method of **sentimentr** that may give better results than a
+dictionary approach that does not consider valence shifters but will
+likely still be less accurate than Stanford's approach. Simply,
+**sentimentr** attempts to balance accuracy and speed.
 
 
 Table of Contents
@@ -355,16 +356,16 @@ see that Stanford takes the longest time while **sentimentr** and
 
 Unit: milliseconds
 expr                          min         lq       mean     median
-stanford()             19817.0754 20438.8711 21034.3110 21060.6668
-sentimentr_hu_liu()      221.1074   222.3142   226.9894   223.5210
-sentimentr_sentiword()   968.1034   968.8658   976.8568   969.6282
-syuzhet_binn()           321.1439   326.2029   398.5578   331.2619
-syuzhet_nrc()            754.1079   760.8421   793.2174   767.5763
-syuzhet_afinn()          131.7537   139.6177   145.6872   147.4817
+stanford()             20519.1232 20620.1182 20684.4025 20721.1132
+sentimentr_hu_liu()      224.5367   232.9833   238.5421   241.4299
+sentimentr_sentiword()   977.2767   980.9229   987.7338   984.5692
+syuzhet_binn()           254.8387   293.6495   310.7012   332.4602
+syuzhet_nrc()            787.3683   790.1853   831.2212   793.0022
+syuzhet_afinn()          118.1905   138.8190   149.8055   159.4475
         uq        max neval
-21642.9288 22225.1907     3
-  229.9305   236.3399     3
-  981.2335   992.8388     3
-  437.2647   543.2675     3
-  812.7721   857.9679     3
-  152.6540   157.8262     3
+20767.0422 20812.9712     3
+  245.5448   249.6597     3
+  992.9624  1001.3556     3
+  338.6324   344.8045     3
+  853.1477   913.2931     3
+  165.6131   171.7787     3
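
To make the speed side of the trade-off concrete, the sketch below turns the median column of the updated benchmark into slowdown ratios relative to the fastest method. The median values are copied from the added rows of the benchmark; the dictionary layout and the `slowdown_vs_fastest` helper are hypothetical names introduced for this example.

```python
# Median timings in milliseconds, copied from the added (+) benchmark rows.
median_ms = {
    "stanford": 20721.1132,
    "sentimentr_hu_liu": 241.4299,
    "sentimentr_sentiword": 984.5692,
    "syuzhet_binn": 332.4602,
    "syuzhet_nrc": 793.0022,
    "syuzhet_afinn": 159.4475,
}


def slowdown_vs_fastest(timings):
    """Return each timing divided by the smallest timing in the mapping."""
    fastest = min(timings.values())
    return {name: ms / fastest for name, ms in timings.items()}


ratios = slowdown_vs_fastest(median_ms)

# Print methods from fastest to slowest with their relative slowdown.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:<22} {ratio:7.1f}x")
```

Each expression was evaluated only three times (`neval` is 3), so these ratios are best read as rough orders of magnitude rather than precise measurements.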
