diff --git a/R/comparators.R b/R/comparators.R index 0232713..c362592 100644 --- a/R/comparators.R +++ b/R/comparators.R @@ -24,9 +24,9 @@ #' (\code{TRUE}/\code{FALSE} or \code{1}/\code{0}) result. The result should #' not contain missing values. #' -#' The `jaro_winkler`, `lcs` and `jaccard` functions use the corresponding +#' The \code{jaro_winkler}, \code{lcs} and \code{jaccard} functions use the corresponding #' methods from \code{\link{stringdist}} except that they are transformed from -#' a distnce to a similarity score. +#' a distance to a similarity score. #' #' @return #' The functions return a comparison function (see details). diff --git a/R/data.R b/R/data.R index 0e4924b..cf70524 100644 --- a/R/data.R +++ b/R/data.R @@ -4,13 +4,13 @@ #' Contains fictional records of 7 persons. #' #' \itemize{ -#' \item id the id of the person; this contains no errors and can be used to +#' \item \code{id} the id of the person; this contains no errors and can be used to #' validate the linkage. -#' \item lastname the lastname of the person; contains errors. -#' \item firstname the firstname of the persons; contains errors. -#' \item address the address; contains errors. -#' \item sex the sex; contains errors and missing values. -#' \item postcode the postcode; contains no errors. +#' \item \code{lastname} the last name of the person; contains errors. +#' \item \code{firstname} the first name of the persons; contains errors. +#' \item \code{address} the address; contains errors. +#' \item \code{sex} the sex; contains errors and missing values. +#' \item \code{postcode} the postcode; contains no errors. #' } #' #' @docType data diff --git a/R/filter_pairs_for_deduplication.R b/R/filter_pairs_for_deduplication.R index d4e5a7b..3a8ac65 100644 --- a/R/filter_pairs_for_deduplication.R +++ b/R/filter_pairs_for_deduplication.R @@ -4,7 +4,7 @@ #' #' In case of deduplication one tries to link a data set to itself. Therefore, #' comparisons only have to be made for records for which the index of the -#' records from the first data set is larger than the index fron the record from +#' records from the first data set is larger than the index from the record from #' the second data set. #' #' @param pairs a \code{pairs} object, such as generated by diff --git a/R/pair_blocking.R b/R/pair_blocking.R index 51aaba1..09b5b25 100644 --- a/R/pair_blocking.R +++ b/R/pair_blocking.R @@ -22,7 +22,7 @@ #' @return #' When \code{large} is \code{FALSE}, a \code{data.frame} with two columns, #' \code{x} and \code{y}, is returned. Columns \code{x} and \code{y} are -#' rownumbers from \code{data.frame}s \code{x} and \code{y} respectively. +#' row numbers from \code{data.frame}s \code{x} and \code{y} respectively. #' When \code{large} is \code{TRUE}, an object of type \code{ldat} is returned. #' #' @examples diff --git a/R/predict_problink_em.R b/R/predict_problink_em.R index 79dfb11..2f581e4 100644 --- a/R/predict_problink_em.R +++ b/R/predict_problink_em.R @@ -1,12 +1,12 @@ #' Calculate weights and probabilities for pairs #' -#' @param object an object of type `problink_em` as produced by +#' @param object an object of type \code{problink_em} as produced by #' \code{\link{problink_em}}. #' @param pairs a object with pairs for which to calculate weights. -#' @param newdata an alternative name for the `pairs` argument. Specify -#' `newdata` or `pairs`. -#' @param type a character vector of length one speicifying what to calculate. +#' @param newdata an alternative name for the \code{pairs} argument. Specify +#' \code{newdata} or \code{pairs}. +#' @param type a character vector of length one specifying what to calculate. #' See results for more information. #' @param binary convert comparison vectors to binary vectors using the #' comparison function in comparators. @@ -15,12 +15,12 @@ #' @param ... unused. #' #' @return -#' In case of `type == "weights"` returns a vector (\code{\link{lvec}} or -#' regular R-vector depending on the type of `pairs`). with the linkage weights. -#' In case of `type == "mpost"` returns a vector with the posterior m-probabilities -#' (probability that a pair is a match). In case of `type == "probs"` returns a +#' In case of \code{type == "weights"} returns a vector (\code{\link{lvec}} or +#' regular R-vector depending on the type of \code{pairs}). with the linkage weights. +#' In case of \code{type == "mpost"} returns a vector with the posterior m-probabilities +#' (probability that a pair is a match). In case of \code{type == "probs"} returns a #' data.frame or \code{\link{ldat}} with the m- and u-probabilities and posterior -#' m- and u probabilities. In case `type == "all"` returns a `data.frame` or +#' m- and u probabilities. In case \code{type == "all"} returns a \code{data.frame} or #' \code{\link{ldat}} with both probabilities and weights. #' #' @import ldat diff --git a/R/problink_em.R b/R/problink_em.R index feefa91..1de65d5 100644 --- a/R/problink_em.R +++ b/R/problink_em.R @@ -9,7 +9,7 @@ #' should be lists with numeric values. The names of the elements in the list #' should correspond to the names in \code{by_x} in \code{\link{compare_pairs}}. #' @param p0 the initial estimate of the probability that a pair is a match. -#' @param tol when the change in the m and u-probabilities is smaller than tol +#' @param tol when the change in the m and u-probabilities is smaller than \code{tol} #' the algorithm is stopped. #' #' @return diff --git a/R/score_problink.R b/R/score_problink.R index bd69e4a..c82f740 100644 --- a/R/score_problink.R +++ b/R/score_problink.R @@ -18,7 +18,7 @@ #' \code{...} argument that causes the calculation of multiple scores (such #' are \code{type = "all"}). In that case the text given by \code{var} is #' prepended to the names of the variables returned by -#' \code{\link{predict.problink_em}} (with a seperator '\code{_}'). +#' \code{\link{predict.problink_em}} (with a separator '\code{_}'). #' #' When \code{add = FALSE} the scores are returned as is. #' diff --git a/cran-comments.md b/cran-comments.md index 823f97c..b91ca33 100644 --- a/cran-comments.md +++ b/cran-comments.md @@ -4,6 +4,7 @@ This is a new package. ## Test environments * local ubuntu 18.04 install, R 3.4.4 +* local ubuntu 18.04 install, R 3.4.4 + valgrind * R-devel on windows using the R-builder * R with valgrind and sanitizers on rhub * R-devel on rhub @@ -11,7 +12,9 @@ This is a new package. ## R CMD check results -0 errors | 0 warnings | 0 notes +0 errors | 0 warnings | 1 notes + +1 note because it is a new package ## Reverse dependencies diff --git a/man/comparators.Rd b/man/comparators.Rd index a5c380f..b9320d4 100644 --- a/man/comparators.Rd +++ b/man/comparators.Rd @@ -45,9 +45,9 @@ a previous comparison. The function should translate that result to a binary (\code{TRUE}/\code{FALSE} or \code{1}/\code{0}) result. The result should not contain missing values. -The `jaro_winkler`, `lcs` and `jaccard` functions use the corresponding +The \code{jaro_winkler}, \code{lcs} and \code{jaccard} functions use the corresponding methods from \code{\link{stringdist}} except that they are transformed from -a distnce to a similarity score. +a distance to a similarity score. } \examples{ cmp <- identical() diff --git a/man/filter_pairs_for_deduplication.Rd b/man/filter_pairs_for_deduplication.Rd index a64d39b..d518e9c 100644 --- a/man/filter_pairs_for_deduplication.Rd +++ b/man/filter_pairs_for_deduplication.Rd @@ -13,6 +13,6 @@ filter_pairs_for_deduplication(pairs) \description{ In case of deduplication one tries to link a data set to itself. Therefore, comparisons only have to be made for records for which the index of the -records from the first data set is larger than the index fron the record from +records from the first data set is larger than the index from the record from the second data set. } diff --git a/man/linkexample.Rd b/man/linkexample.Rd index c16b1aa..808997c 100644 --- a/man/linkexample.Rd +++ b/man/linkexample.Rd @@ -11,13 +11,13 @@ Contains fictional records of 7 persons. } \details{ \itemize{ - \item id the id of the person; this contains no errors and can be used to + \item \code{id} the id of the person; this contains no errors and can be used to validate the linkage. - \item lastname the lastname of the person; contains errors. - \item firstname the firstname of the persons; contains errors. - \item address the address; contains errors. - \item sex the sex; contains errors and missing values. - \item postcode the postcode; contains no errors. + \item \code{lastname} the last name of the person; contains errors. + \item \code{firstname} the first name of the persons; contains errors. + \item \code{address} the address; contains errors. + \item \code{sex} the sex; contains errors and missing values. + \item \code{postcode} the postcode; contains no errors. } } \keyword{datasets} diff --git a/man/pair_blocking.Rd b/man/pair_blocking.Rd index 0bf64d1..e49602c 100644 --- a/man/pair_blocking.Rd +++ b/man/pair_blocking.Rd @@ -27,7 +27,7 @@ number of pairs that are kept in memory.} \value{ When \code{large} is \code{FALSE}, a \code{data.frame} with two columns, \code{x} and \code{y}, is returned. Columns \code{x} and \code{y} are -rownumbers from \code{data.frame}s \code{x} and \code{y} respectively. +row numbers from \code{data.frame}s \code{x} and \code{y} respectively. When \code{large} is \code{TRUE}, an object of type \code{ldat} is returned. } \description{ diff --git a/man/predict.problink_em.Rd b/man/predict.problink_em.Rd index 1bcbdc3..3652a8c 100644 --- a/man/predict.problink_em.Rd +++ b/man/predict.problink_em.Rd @@ -9,15 +9,15 @@ ...) } \arguments{ -\item{object}{an object of type `problink_em` as produced by +\item{object}{an object of type \code{problink_em} as produced by \code{\link{problink_em}}.} \item{pairs}{a object with pairs for which to calculate weights.} -\item{newdata}{an alternative name for the `pairs` argument. Specify -`newdata` or `pairs`.} +\item{newdata}{an alternative name for the \code{pairs} argument. Specify +\code{newdata} or \code{pairs}.} -\item{type}{a character vector of length one speicifying what to calculate. +\item{type}{a character vector of length one specifying what to calculate. See results for more information.} \item{binary}{convert comparison vectors to binary vectors using the @@ -29,12 +29,12 @@ When missing \code{attr(pairs, 'comparators')} is used.} \item{...}{unused.} } \value{ -In case of `type == "weights"` returns a vector (\code{\link{lvec}} or -regular R-vector depending on the type of `pairs`). with the linkage weights. -In case of `type == "mpost"` returns a vector with the posterior m-probabilities -(probability that a pair is a match). In case of `type == "probs"` returns a +In case of \code{type == "weights"} returns a vector (\code{\link{lvec}} or +regular R-vector depending on the type of \code{pairs}). with the linkage weights. +In case of \code{type == "mpost"} returns a vector with the posterior m-probabilities +(probability that a pair is a match). In case of \code{type == "probs"} returns a data.frame or \code{\link{ldat}} with the m- and u-probabilities and posterior -m- and u probabilities. In case `type == "all"` returns a `data.frame` or +m- and u probabilities. In case \code{type == "all"} returns a \code{data.frame} or \code{\link{ldat}} with both probabilities and weights. } \description{ diff --git a/man/problink_em.Rd b/man/problink_em.Rd index 6656b4e..aa2041f 100644 --- a/man/problink_em.Rd +++ b/man/problink_em.Rd @@ -18,7 +18,7 @@ should correspond to the names in \code{by_x} in \code{\link{compare_pairs}}.} \item{p0}{the initial estimate of the probability that a pair is a match.} -\item{tol}{when the change in the m and u-probabilities is smaller than tol +\item{tol}{when the change in the m and u-probabilities is smaller than \code{tol} the algorithm is stopped.} } \value{ diff --git a/man/score_problink.Rd b/man/score_problink.Rd index 0e9d3d5..241bfb1 100644 --- a/man/score_problink.Rd +++ b/man/score_problink.Rd @@ -29,7 +29,7 @@ arguments are passed on to \code{\link{predict.problink_em}} using the \code{...} argument that causes the calculation of multiple scores (such are \code{type = "all"}). In that case the text given by \code{var} is prepended to the names of the variables returned by -\code{\link{predict.problink_em}} (with a seperator '\code{_}'). +\code{\link{predict.problink_em}} (with a separator '\code{_}'). When \code{add = FALSE} the scores are returned as is. } diff --git a/vignettes/deduplication.Rmd b/vignettes/deduplication.Rmd index 18f0513..9fe0d91 100644 --- a/vignettes/deduplication.Rmd +++ b/vignettes/deduplication.Rmd @@ -4,7 +4,7 @@ author: "Jan van der Laan" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > - %\VignetteIndexEntry{Vignette Title} + %\VignetteIndexEntry{Deduplication using reclin} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- diff --git a/vignettes/introduction_to_reclin.Rmd b/vignettes/introduction_to_reclin.Rmd index 0cb981b..d9cc006 100644 --- a/vignettes/introduction_to_reclin.Rmd +++ b/vignettes/introduction_to_reclin.Rmd @@ -4,7 +4,7 @@ author: "Jan van der Laan" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > - %\VignetteIndexEntry{Vignette Title} + %\VignetteIndexEntry{Introduction to reclin} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} ---