Release 1.0.0 (#45)
* bump version

* add weights argument (#29)

* remove unused variable (#26)

* fix edf computation for deg = 2

* bump version + update docs

* remove package cache on appveyor

* weighted bandwidth selection

* weighted fitting

* adjust R interface

* add unit tests

* a few more sanity checks

* update docs

* update docs

* fix devtools install link

* stabilize weighted quantiles

* ensure scale != 0 in case of too many ties

* remove stripping of debug symbols

* prepare release

* update docs

* allow already jittered input to be treated as continuous

* some improvements to interpolation (#32)

* use sample quantiles as grid points

* fix unweighted quantile computation

* improve safety of integral computations.

* make sure that weighted version is consistent

* improve grid extensions

* fix boundary handling for bounded domain

* fix interpolation bug when grid distance is small

* install gsl

* extend grid a bit further

* fix bug for weighted quantile with zero weight

* decrease lower bound for interpolation, related to vinecopulib/rvinecopulib#157

* More numerical stability (#35)

* hybrid quantile/equidist grid

* normalize weights

* fix weighted influence and use it as safeguard

* avoid double computation of influence for deg = 0

* interpolant positivity and smooth extrapolation (#37)

* switch to linear interpolation if result is negative

* use positive constraints for spline coefficients

* Gaussian tail for extrapolation

* optimal bandwidths for all degrees (#38)

* missing inline

* fix levels in pkde1d

* stabilize edf computation

* tabs = 2 spaces (#39)

* simplified, deterministic jittering (#40)

* tabs = 2 spaces

* simplified, deterministic jittering

* prepare standalone c++ version with interface (#41)

* rename lpdens -> kde1d

* use namespaces

* bw selection in Kde1d class

* jittering + nan handling in cpp

* statistical functions

* constructor from interpolation grid

* all discrete functionality in cpp

* kde1d dir structure

* separate wrappers from interface

* adapt R code

* go to office

* fix nans in grid

* fix boundary transform order

* refactor unit tests

* Fast integration (#42)

* rename lpdens -> kde1d

* use namespaces

* bw selection in Kde1d class

* jittering + nan handling in cpp

* statistical functions

* constructor from interpolation grid

* all discrete functionality in cpp

* kde1d dir structure

* separate wrappers from interface

* adapt R code

* go to office

* fix nans in grid

* fix boundary transform order

* refactor unit tests

* unnormalized algorithm

* faster integration by sorting input

* Fit models with the fast fourier transform (#43)

* isolate KdeFFT class

* almost done with fft

* proper weights in edf calculation

* fix grid related issues

* eliminate dead code

* safer integration

* reorder arguments for the feels

* prepare release (#44)

* Release 0.4.0 (#31)

* bump version

* add weights argument (#29)

* remove unused variable (#26)

* fix edf computation for deg = 2

* bump version + update docs

* remove package cache on appveyor

* weighted bandwidth selection

* weighted fitting

* adjust R interface

* add unit tests

* a few more sanity checks

* update docs

* update docs

* fix devtools install link

* stabilize weighted quantiles

* ensure scale != 0 in case of too many ties

* remove stripping of debug symbols

* prepare release

* update docs

* bump version

* update NEWS

* update docs

* update NEWS and DESCRIPTION

* update API docs

* backwards compatibility with bad code in rvinecopulib *sigh*

* initialization order warning

* typo in API doc

* CRAN comments

* API docs one more time
tnagler authored Nov 15, 2019
1 parent 5502fec commit 5eaf81f
Showing 56 changed files with 3,012 additions and 1,994 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -4,3 +4,4 @@
 .Ruserdata
 revdep
 /revdep/.cache.rds
+.vscode
1 change: 1 addition & 0 deletions .travis.yml
@@ -27,6 +27,7 @@ before_install:
 - sudo apt-get install gcc-5 g++-5 gfortran-5
 - sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 100
 - sudo update-alternatives --install /usr/bin/gfortran gfortran /usr/bin/gfortran-5 100
+- sudo apt-get install libgsl-dev

 repos:
 CRAN: http://cran.rstudio.com
5 changes: 2 additions & 3 deletions DESCRIPTION
@@ -1,7 +1,7 @@
 Package: kde1d
 Type: Package
 Title: Univariate Kernel Density Estimation
-Version: 0.4.0
+Version: 1.0.0
 Authors@R: c(
     person("Thomas", "Nagler",, "mail@tnagler.com", role = c("aut", "cre")),
     person("Thibault", "Vatter",, "thibault.vatter@gmail.com", role = c("aut"))
@@ -20,13 +20,12 @@ LinkingTo:
     Rcpp,
     RcppEigen
 Imports:
-    cctools,
     graphics,
     Rcpp,
     qrng,
     stats,
     utils
-RoxygenNote: 6.1.1
+RoxygenNote: 7.0.0
 Suggests:
     testthat
 URL: https://github.com/tnagler/kde1d
3 changes: 1 addition & 2 deletions NAMESPACE
@@ -6,13 +6,12 @@ S3method(plot,kde1d)
 S3method(print,kde1d)
 S3method(summary,kde1d)
 export(dkde1d)
+export(equi_jitter)
 export(kde1d)
 export(pkde1d)
 export(qkde1d)
 export(rkde1d)
 importFrom(Rcpp,sourceCpp)
-importFrom(cctools,cont_conv)
-importFrom(cctools,expand_as_numeric)
 importFrom(graphics,lines)
 importFrom(graphics,plot)
 importFrom(qrng,ghalton)
25 changes: 25 additions & 0 deletions NEWS.md
@@ -1,3 +1,28 @@
+# kde1d 1.0.0
+
+NEW FEATURES
+
+* optimal plug-in bandwidth selection for all polynomial degrees (#38).
+
+* avoid randomness through simplified, deterministic jittering, see
+  `equi_jitter()` (#40).
+
+* removed dependency `cctools`.
+
+* headers in `inst/include` can be used as standalone C++ library with
+  convenience wrappers for R (#41).
+
+* (several times) faster `pkde1d()`, `qkde1d()`, and `rkde1d()` due to
+  a more clever algorithm for numerical integration (#42).
+
+* faster `kde1d()` thanks to the Fast Fourier Transform (#43).
+
+BUG FIXES
+
+* improvements to numerical stability, inter- and extrapolation (#32, #35,
+  #37).
+
+
 # kde1d 0.4.0

 NEW FEATURE
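The FFT-based fit announced for #43 lives in the C++ headers, which are not part of this excerpt. As a rough illustration of the general idea only, the base-R sketch below bins the data on an equidistant grid and convolves the bin counts with a Gaussian kernel via `fft()`. The name `binned_kde_fft`, its arguments, the nearest-grid-point binning and the 4-bandwidth grid extension are all assumptions made for this example; the package itself additionally handles local polynomial degrees, boundaries and weights.

```r
# Hypothetical sketch (not kde1d's implementation): Gaussian kernel density
# estimate on an equidistant grid, computed as an FFT convolution of bin counts.
binned_kde_fft <- function(x, bw, m = 512) {
  # extend the grid 4 bandwidths beyond the data to capture the kernel tails
  lo <- min(x) - 4 * bw
  hi <- max(x) + 4 * bw
  grid <- seq(lo, hi, length.out = m)
  delta <- (hi - lo) / (m - 1)

  # nearest-grid-point binning: counts[i] = number of observations at grid[i]
  counts <- tabulate(round((x - lo) / delta) + 1, nbins = m)

  # kernel values at all grid lags -(m - 1), ..., (m - 1)
  kern <- dnorm(((-(m - 1)):(m - 1)) * delta, sd = bw)

  # zero-pad both sequences so the circular FFT convolution equals the linear one
  n_fft <- nextn(3 * m - 2)
  a <- c(counts, rep(0, n_fft - m))
  b <- c(kern, rep(0, n_fft - length(kern)))
  conv <- Re(fft(fft(a) * fft(b), inverse = TRUE)) / n_fft

  # entry j + m - 1 of the convolution is sum_i counts[i] * K(grid[j] - grid[i])
  list(grid = grid, dens = conv[m:(2 * m - 1)] / length(x))
}

x <- qnorm(ppoints(200))              # deterministic stand-in for a sample
fit <- binned_kde_fft(x, bw = 0.3)
sum(fit$dens) * diff(fit$grid[1:2])   # total mass, should be close to 1
```

The cost is dominated by three FFTs of length roughly `3 * m`, so it does not grow with the number of kernel evaluations per observation, which is the usual motivation for this kind of approach.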
40 changes: 14 additions & 26 deletions R/RcppExports.R
@@ -3,7 +3,9 @@

 #' fits a kernel density estimate and calculates the effective degrees of
 #' freedom.
-#' @param x vector of observations.
+#' @param x vector of observations; categorical data must be converted to
+#' non-negative integers.
+#' @param nlevels the number of factor levels; 0 for continuous data.
 #' @param bw the bandwidth parameter.
 #' @param xmin lower bound for the support of the density, `NaN` means no
 #' boundary.
@@ -13,49 +15,35 @@
 #' @return An `Rcpp::List` containing the fitted density values on a grid and
 #' additional information.
 #' @noRd
-fit_kde1d_cpp <- function(x, bw, xmin, xmax, deg, weights) {
-    .Call('_kde1d_fit_kde1d_cpp', PACKAGE = 'kde1d', x, bw, xmin, xmax, deg, weights)
+fit_kde1d_cpp <- function(x, nlevels, bw, mult, xmin, xmax, deg, weights) {
+    .Call('_kde1d_fit_kde1d_cpp', PACKAGE = 'kde1d', x, nlevels, bw, mult, xmin, xmax, deg, weights)
 }

 #' computes the pdf of a kernel density estimate by interpolation.
 #' @param x vector of evaluation points.
-#' @param R_object the fitted object passed from R.
+#' @param kde1d_r the fitted object passed from R.
 #' @return a vector of pdf values.
 #' @noRd
-dkde1d_cpp <- function(x, R_object) {
-    .Call('_kde1d_dkde1d_cpp', PACKAGE = 'kde1d', x, R_object)
+dkde1d_cpp <- function(x, kde1d_r) {
+    .Call('_kde1d_dkde1d_cpp', PACKAGE = 'kde1d', x, kde1d_r)
 }

 #' computes the cdf of a kernel density estimate by numerical integration.
 #' @param x vector of evaluation points.
-#' @param R_object the fitted object passed from R.
+#' @param kde1d_r the fitted object passed from R.
 #' @return a vector of cdf values.
 #' @noRd
-pkde1d_cpp <- function(x, R_object) {
-    .Call('_kde1d_pkde1d_cpp', PACKAGE = 'kde1d', x, R_object)
+pkde1d_cpp <- function(q, kde1d_r) {
+    .Call('_kde1d_pkde1d_cpp', PACKAGE = 'kde1d', q, kde1d_r)
 }

 #' computes the quantile of a kernel density estimate by numerical inversion
 #' (bisection method).
 #' @param x vector of evaluation points.
-#' @param R_object the fitted object passed from R.
+#' @param kde1d_r the fitted object passed from R.
 #' @return a vector of quantiles.
 #' @noRd
-qkde1d_cpp <- function(x, R_object) {
-    .Call('_kde1d_qkde1d_cpp', PACKAGE = 'kde1d', x, R_object)
-}
-
-#' @param x vector of observations
-#' @param grid_size number of equally-spaced points over which binning is
-#' performed to obtain kernel functional approximation
-#' @param weights vector of weights for each observation (can be empty).
-#' @return the selected bandwidth
-#' @noRd
-select_bw_cpp <- function(x, bw, mult, discrete, weights) {
-    .Call('_kde1d_select_bw_cpp', PACKAGE = 'kde1d', x, bw, mult, discrete, weights)
-}
-
-quan <- function(x, a, w) {
-    .Call('_kde1d_quan', PACKAGE = 'kde1d', x, a, w)
+qkde1d_cpp <- function(p, kde1d_r) {
+    .Call('_kde1d_qkde1d_cpp', PACKAGE = 'kde1d', p, kde1d_r)
 }
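
The roxygen comments above describe the cdf as a numerical integral of the density and the quantile as a numerical inversion by bisection, but the C++ routines behind these wrappers are not shown in this excerpt. The following base-R sketch illustrates both ideas on a density tabulated on a grid. The helpers `cdf_on_grid` and `quantile_bisect`, the trapezoidal rule and the stopping tolerance are assumptions chosen for the example, not the package's actual algorithm.

```r
# Hypothetical helpers illustrating the two ideas named in the comments above;
# they are not the package's C++ routines.

# cdf at q: trapezoidal integration of density values tabulated on a grid
cdf_on_grid <- function(q, grid, dens) {
  # cumulative trapezoid areas at the grid points
  cum <- c(0, cumsum(diff(grid) * (head(dens, -1) + tail(dens, -1)) / 2))
  cum <- cum / cum[length(cum)]   # normalize the total mass to 1
  # linear interpolation between grid points; 0 and 1 outside the grid
  approx(grid, cum, xout = q, yleft = 0, yright = 1)$y
}

# quantile at level p: bisection on a non-decreasing cdf over [lower, upper]
quantile_bisect <- function(p, cdf, lower, upper, tol = 1e-10) {
  sapply(p, function(pp) {
    lo <- lower
    hi <- upper
    while (hi - lo > tol) {
      mid <- (lo + hi) / 2
      if (cdf(mid) < pp) lo <- mid else hi <- mid
    }
    (lo + hi) / 2
  })
}

# standard normal density as a stand-in for a fitted estimate
grid <- seq(-6, 6, length.out = 2001)
dens <- dnorm(grid)
cdf <- function(q) cdf_on_grid(q, grid, dens)
cdf(0)                                          # close to 0.5
quantile_bisect(c(0.1, 0.5, 0.9), cdf, -6, 6)   # close to qnorm(c(0.1, 0.5, 0.9))
```

Bisection only needs the cdf to be monotone, so it keeps working when the interpolated density has kinks, at the price of one cdf evaluation per halving step.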

45 changes: 45 additions & 0 deletions R/jitter.R
@@ -0,0 +1,45 @@
+#' Conditionally equidistant jittering
+#'
+#' Converts ordered variables to numeric and adds deterministic uniform noise.
+#' See *Details*.
+#'
+#' Jittering makes discrete variables continuous by adding noise. This simple
+#' trick makes it possible to consistently estimate densities with tools
+#' designed for the continuous case (see Nagler, 2018a/b). The drawback is that
+#' estimates are random and the noise may deteriorate the estimate by chance.
+#'
+#' Here, we add a form of deterministic noise that makes estimators well
+#' behaved. Tied occurrences of a factor level are spread out uniformly
+#' (i.e., equidistantly) on the interval \eqn{[-0.5, 0.5]}. This is similar to
+#' adding random noise that is uniformly distributed, conditional on the
+#' observed outcome. Integrating over the outcome, one can check that the
+#' unconditional noise distribution is also uniform on \eqn{[-0.5, 0.5]}.
+#'
+#' Asymptotically, the deterministic jittering variant is equivalent to the
+#' random one.
+#'
+#' @param x observations; the function does nothing if `x` is already numeric.
+#'
+#' @export
+#'
+#' @references
+#' Nagler, T. (2018a). *A generic approach to nonparametric function estimation
+#' with mixed data.* Statistics & Probability Letters, 137:326–330,
+#' [arXiv:1704.07457](https://arxiv.org/abs/1704.07457)
+#'
+#' Nagler, T. (2018b). *Asymptotic analysis of the jittering kernel density
+#' estimator.* Mathematical Methods of Statistics, in press,
+#' [arXiv:1705.05431](https://arxiv.org/abs/1705.05431)
+#'
+#' @examples
+#' x <- as.factor(rbinom(10, 1, 0.5))
+#' equi_jitter(x)
+equi_jitter <- function(x) {
+  if (is.numeric(x))
+    return(x)
+  x <- as.numeric(x)
+  tab <- table(x)
+  noise <- unname(unlist(lapply(tab, function(l) -0.5 + 1:l / (l + 1))))
+  s <- sort(x, index.return = TRUE)
+  (s$x + noise)[rank(x, ties.method = "first", na.last = "keep")]
+}
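
To make the noise construction concrete, here is a small worked example. It reuses the function body from the diff above under the hypothetical name `equi_jitter_demo` (renamed only so the snippet runs without the package installed) and applies it to a five-element factor. For a level with `l` tied observations, the offsets are `-0.5 + k / (l + 1)` for `k = 1, ..., l`.

```r
# Copy of the function body from the diff above, under a hypothetical name so
# that it does not clash with the exported equi_jitter().
equi_jitter_demo <- function(x) {
  if (is.numeric(x))
    return(x)
  x <- as.numeric(x)                # factor levels -> integer codes
  tab <- table(x)                   # number of ties per level
  # for a level with l ties: offsets -0.5 + k / (l + 1), k = 1, ..., l
  noise <- unname(unlist(lapply(tab, function(l) -0.5 + 1:l / (l + 1))))
  s <- sort(x, index.return = TRUE)
  # add the offsets in sorted order, then map back to the original order
  (s$x + noise)[rank(x, ties.method = "first", na.last = "keep")]
}

x <- factor(c("a", "b", "a", "a", "b"))
equi_jitter_demo(x)
# "a" (code 1, 3 ties) -> 0.75, 1.00, 1.25
# "b" (code 2, 2 ties) -> 2 - 1/6, 2 + 1/6
# in the original order: 0.75, 2 - 1/6, 1.00, 1.25, 2 + 1/6
```

Because the offsets depend only on the tie counts and on the order of appearance, repeated calls on the same input give identical results, which is the point of replacing random jitter.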