Update plot_citedby() to fix color error/retractions
A publication citing DO has been retracted and 6 colors are no
longer sufficient. This updates color_set to avoid errors
[BREAKING CHANGE - partial] and adds a 'retracted' argument to
specify how those publications should be handled.
allenbaron committed Mar 15, 2024
1 parent a44f9fa commit f6dc1f3
Showing 3 changed files with 85 additions and 15 deletions.
5 changes: 5 additions & 0 deletions NEWS.md
@@ -12,6 +12,11 @@
 * `read_omim()` now additionally parses official API-key requiring
 phenotypicSeries.txt downloads and may be able to handle additional API-key
 requiring downloads.
+* `plot_citedby()`:
+    * _[BREAKING CHANGE]_ `color_set` argument now requires names and one color
+    for each of the 7 possible publication types when specifying colors manually.
+    * `retracted` argument added to specify how retracted articles should be
+    managed.
 
 ### New
 * `download_omim()` downloads official API-key requiring files directly from
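The breaking change noted in NEWS.md can be illustrated with a minimal migration sketch. It assumes the seven publication type names listed in the Colors section of the roxygen docs, and it reuses the hex values from the old and new defaults. The final call is left commented out because it assumes the package exporting `plot_citedby()` is attached and that the default data file exists.

```r
# Old style: 6 unnamed colors (the previous default), no longer sufficient.
old_colors <- c("#C45055", "#934FBB", "#95B1BB", "#83C85F", "#B9964B", "#4C3E45")

# New style: one named color per publication type.
pub_types <- c("Article", "Book", "Clinical Trial", "Conference",
               "Review", "Other", "Retracted")
new_colors <- c(
    "Article" = "#4C3E45", "Book" = "#83C85F", "Clinical Trial" = "#B9964B",
    "Conference" = "#95B1BB", "Review" = "#934FBB", "Other" = "#C45055",
    "Retracted" = "#000000"
)

# Quick self-check before passing the vector on.
stopifnot(length(new_colors) == 7, setequal(names(new_colors), pub_types))

# Hypothetical call (assumes the package and data file are available):
# plot_citedby(color_set = new_colors, retracted = "include")
```

Naming the colors ties each one to a publication type, so a category that is absent from the data no longer shifts the remaining colors.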
64 changes: 54 additions & 10 deletions R/plot.R
@@ -107,34 +107,78 @@ plot_branch_counts <- function(DO_repo, out_dir = "graphics/website",
 #' citing the DO, as a string.
 #' @param out_dir The directory where the plot `"DO_cited_by_count.png"`
 #' should be saved, as a string. If `NULL` the plot is not saved to disk.
-#' @param color_set A set of 6 colors or the prefix of the color set to use from
-#' [DO_colors]. Available sets include: "sat", "accent1", "accent2",
-#' and "orange". The default and light versions of the specified color set
-#' will be used.
+#' @param color_set A named set of 7 colors, one for each of
+#' the possible publication types (see Colors section) or the
+#' prefix of the color set to use from [DO_colors], as a character vector.
+#' @param retracted How to handle retracted publications, as a string.
+#' One of:
+#' * "warn" (default) to drop them with a warning.
+#' * "include" to display them in the plot in their own category.
+#' * "other" to include them in the "Other" category.
 #' @inheritParams plot_branch_counts
 #'
 #' @section Data Preparation:
 #' To prepare data, execute `scripts/citedby_full_procedure.R`.
 #'
+#' @section Colors:
+#' If specifying a color set manually, one color should be included for each of
+#' the following publication types: "Article", "Book", "Clinical Trial",
+#' "Conference", "Review", "Other", "Retracted". "Other" serves as a catch all
+#' category (generally a small subset of otherwise uncategorized publications).
+#'
+#' Sets available in [DO_colors] include: "sat" (saturated), "accent1",
+#' "accent2", and "orange". The default and light versions of the specified
+#' color set will be used to generate a gradient.
+#'
 #' @export
 plot_citedby <- function(data_file = "data/citedby/DO_citedby.csv",
                          out_dir = "graphics/website",
-                         color_set = c("#C45055", "#934FBB", "#95B1BB", "#83C85F", "#B9964B", "#4C3E45"),
+                         color_set = c(
+                             "Article" = "#4C3E45", "Clinical Trial" = "#B9964B",
+                             "Book" = "#83C85F", "Conference" = "#95B1BB",
+                             "Review" = "#934FBB", "Other" = "#C45055",
+                             "Retracted" = "#000000"
+                         ),
+                         retracted = "warn",
                          w = 6, h = 3.15) {
+    retracted <- match.arg(retracted, c("warn", "include", "other"))
+    color_nm <- c("Retracted", "Other", "Review", "Conference", "Book",
+                  "Clinical Trial", "Article")
 
     df <- readr::read_csv(data_file) %>%
         dplyr::mutate(
             Year = lubridate::year(.data$pub_date),
             pub_type = clean_pub_type(.data$pub_type)
         )
 
-    # set color ramp
-    if (length(color_set) > 1) {
-        cb_colors <- color_set
-    } else {
+    retracted_n <- sum(df$pub_type == "Retracted")
+    if (retracted_n > 0) {
+        if (retracted == "warn") {
+            df <- dplyr::filter(df, .data$pub_type != "Retracted")
+            rlang::warn(paste0(retracted_n, " retracted publication(s) dropped."))
+        }
+        if (retracted == "other") {
+            df <- dplyr::mutate(
+                df,
+                pub_type = dplyr::recode(.data$pub_type, Retracted = "Other")
+            )
+        }
+    }
+
+    # prepare colors
+    color_n <- dplyr::n_distinct(df$pub_type)
+    if (length(color_set) == 1) {
         cb_colors <- grDevices::colorRampPalette(
             DO_colors[paste0(color_set, c("_light", ""))]
-        )(dplyr::n_distinct(df$pub_type))
+        )(color_n)
+    } else {
+        if (length(color_set) != 7 || !setequal(names(color_set), color_nm)) {
+            rlang::abort("`color_set` must specify a DO_colors color set or 7 named colors")
+        }
+        # order colors to match publication type order
+        cb_colors <- color_set[color_nm]
+        # use only colors corresponding to publication types in the data
+        cb_colors <- cb_colors[names(cb_colors) %in% df$pub_type]
     }
 
     g <- ggplot2::ggplot(data = df) +
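The retraction handling and color selection in the diff above can be tried in isolation with a base-R sketch on toy data. The helpers `handle_retracted()` and `select_colors()` are hypothetical names introduced only for illustration; they are not part of the package. Base `warning()` and `stop()` stand in for the rlang calls, and the two hex values used for the gradient are placeholders, not actual `DO_colors` entries.

```r
# Publication types in the order used for plotting in the diff above.
color_nm <- c("Retracted", "Other", "Review", "Conference", "Book",
              "Clinical Trial", "Article")

# Hypothetical helper: apply the `retracted` policy to a vector of types.
handle_retracted <- function(pub_type,
                             retracted = c("warn", "include", "other")) {
    retracted <- match.arg(retracted)
    is_retracted <- pub_type == "Retracted"
    if (!any(is_retracted)) return(pub_type)
    if (retracted == "warn") {
        warning(sum(is_retracted), " retracted publication(s) dropped.")
        return(pub_type[!is_retracted])
    }
    if (retracted == "other") {
        pub_type[is_retracted] <- "Other"
    }
    pub_type  # "include" leaves the vector unchanged
}

# Hypothetical helper: validate, order, and subset a named color vector.
select_colors <- function(color_set, pub_type) {
    if (length(color_set) != 7 || !setequal(names(color_set), color_nm)) {
        stop("`color_set` must be 7 colors named by publication type")
    }
    ordered <- color_set[color_nm]          # match publication type order
    ordered[names(ordered) %in% pub_type]   # keep only types present in data
}

toy_types <- c("Article", "Review", "Retracted", "Article", "Book")
colors <- c(
    "Article" = "#4C3E45", "Clinical Trial" = "#B9964B",
    "Book" = "#83C85F", "Conference" = "#95B1BB",
    "Review" = "#934FBB", "Other" = "#C45055",
    "Retracted" = "#000000"
)

kept <- handle_retracted(toy_types, "other")
used <- select_colors(colors, kept)

# Single-prefix case: interpolate a gradient with one color per type.
# Placeholder endpoints, NOT the real DO_colors values.
ramp <- grDevices::colorRampPalette(c("#F2C9CB", "#C45055"))(length(unique(kept)))
```

With `retracted = "other"` the retracted record is counted under "Other", so the "Retracted" color drops out of the subset while the remaining colors keep their fixed order.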
31 changes: 26 additions & 5 deletions man/plot_citedby.Rd

Some generated files are not rendered by default.