Documentation #88

Merged
merged 31 commits into from
Aug 19, 2024
31 commits
28b2418
Merge pull request #55 from Public-Health-Scotland/development
Moohan Jul 24, 2023
0f5c401
Update maintainer to Megan (#69)
Moohan Dec 12, 2023
7f98dc7
Update README.Rmd (#64)
Moohan Dec 12, 2023
448b721
Bug - speed up `get_chi()` (#68)
Jennit07 Dec 12, 2023
beeea06
Render `README.md` after changes to the `.Rmd` version (#70)
github-actions[bot] Dec 13, 2023
8abe664
Bump actions/checkout from 3 to 4 (#66)
dependabot[bot] Dec 13, 2023
d50d412
Bump peter-evans/create-pull-request from 4 to 5 (#65)
dependabot[bot] Dec 13, 2023
60e2a52
Bump stefanzweifel/git-auto-commit-action from 4 to 5 (#67)
dependabot[bot] Dec 13, 2023
9730b1a
Bump JamesIves/github-pages-deploy-action from 4.4.3 to 4.5.0 (#71)
dependabot[bot] Dec 18, 2023
75e6dd0
Bump actions/upload-artifact from 3 to 4 (#72)
dependabot[bot] Dec 18, 2023
d04692f
Bump peter-evans/create-pull-request from 5 to 6 (#74)
dependabot[bot] Feb 12, 2024
8cd62a8
Bump actions/cache from 3 to 4 (#73)
dependabot[bot] Feb 12, 2024
fdff827
Update README.md (#75)
Jennit07 Feb 13, 2024
b5eab41
change in episode file cost variable vector (#76)
SwiftySalmon Feb 13, 2024
18b6909
force keytime format to hms (#77)
lizihao-anu Mar 19, 2024
0347056
Bump JamesIves/github-pages-deploy-action from 4.5.0 to 4.6.0 (#79)
dependabot[bot] Apr 22, 2024
aed3aad
add vignette for SLFhelper documentation
Jennit07 Jun 14, 2024
9f875e6
Style package
Jennit07 Jun 14, 2024
46c9f6e
Hide messages
Jennit07 Jun 14, 2024
78bc990
remove conflict
Jennit07 Jun 17, 2024
ac69bd2
Style package
Jennit07 Jun 17, 2024
7cb140f
Split up documentation into 3 vignettes
Jennit07 Jul 26, 2024
0c54bd4
add a comparison table to show the efficiency improvement
lizihao-anu Jul 29, 2024
79148e5
Update - round memory size
Jennit07 Aug 16, 2024
fb509d0
replace columns by col_select and add tidyselect
lizihao-anu Aug 16, 2024
2c38d4c
Style package
lizihao-anu Aug 16, 2024
9e04b8d
update ep_file_vars and indiv_file_vars
lizihao-anu Aug 16, 2024
b1edb49
add session memory recommendation
lizihao-anu Aug 16, 2024
5b62801
Merge branch 'development' into documentation
lizihao-anu Aug 16, 2024
362789d
Update R-CMD-check.yaml
lizihao-anu Aug 16, 2024
6b9061e
fix cmd build error
lizihao-anu Aug 16, 2024
4 changes: 2 additions & 2 deletions .github/workflows/R-CMD-check.yaml
@@ -18,14 +18,14 @@ jobs:
  strategy:
  fail-fast: false
  matrix:
- r_version: ['3.6.1', '4.0.2', '4.1.2', 'release', 'devel']
+ r_version: ['4.0.2', '4.1.2', 'release']

  env:
  GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
  R_KEEP_PKG_SOURCE: yes

  steps:
- - uses: actions/checkout@v3
+ - uses: actions/checkout@v4

  - uses: r-lib/actions/setup-pandoc@v2

6 changes: 3 additions & 3 deletions .github/workflows/document.yaml
@@ -13,7 +13,7 @@ jobs:
  GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
  steps:
  - name: Checkout repo
- uses: actions/checkout@v3
+ uses: actions/checkout@v4
  with:
  fetch-depth: 0

@@ -34,7 +34,7 @@ jobs:

  - name: Commit and create a Pull Request on development
  if: ${{ github.ref == 'refs/heads/development' }}
- uses: peter-evans/create-pull-request@v4
+ uses: peter-evans/create-pull-request@v6
  with:
  commit-message: "Update documentation"
  branch: document_development
@@ -46,6 +46,6 @@ jobs:

  - name: Commit and push changes on all other branches
  if: ${{ github.ref != 'refs/heads/development' }}
- uses: stefanzweifel/git-auto-commit-action@v4
+ uses: stefanzweifel/git-auto-commit-action@v5
  with:
  commit_message: "Update documentation"
2 changes: 1 addition & 1 deletion .github/workflows/lint.yaml
@@ -14,7 +14,7 @@ jobs:
  env:
  GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
  steps:
- - uses: actions/checkout@v3
+ - uses: actions/checkout@v4

  - uses: r-lib/actions/setup-r@v2
  with:
4 changes: 2 additions & 2 deletions .github/workflows/pkgdown.yaml
@@ -22,7 +22,7 @@ jobs:
  permissions:
  contents: write
  steps:
- - uses: actions/checkout@v3
+ - uses: actions/checkout@v4

  - uses: r-lib/actions/setup-pandoc@v2

@@ -41,7 +41,7 @@ jobs:

  - name: Deploy to GitHub pages 🚀
  if: github.event_name != 'pull_request'
- uses: JamesIves/github-pages-deploy-action@v4.4.3
+ uses: JamesIves/github-pages-deploy-action@v4.6.0
  with:
  clean: false
  branch: gh-pages
6 changes: 3 additions & 3 deletions .github/workflows/render-README.yaml
@@ -13,7 +13,7 @@ jobs:
  GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
  steps:
  - name: Checkout repo
- uses: actions/checkout@v3
+ uses: actions/checkout@v4
  with:
  fetch-depth: 0

@@ -35,7 +35,7 @@ jobs:

  - name: Commit and create a Pull Request on production
  if: ${{ github.ref == 'refs/heads/production' }}
- uses: peter-evans/create-pull-request@v5
+ uses: peter-evans/create-pull-request@v6
  with:
  commit-message: "Render `README.md` after changes to the `.Rmd` version"
  branch: render_readme
@@ -47,6 +47,6 @@ jobs:

  - name: Commit and push changes on all other branches
  if: ${{ github.ref != 'refs/heads/production' }}
- uses: stefanzweifel/git-auto-commit-action@v4
+ uses: stefanzweifel/git-auto-commit-action@v5
  with:
  commit_message: "Render `README.md` after changes to the `.Rmd` version"
8 changes: 4 additions & 4 deletions .github/workflows/style.yaml
@@ -13,7 +13,7 @@ jobs:
  GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
  steps:
  - name: Checkout repo
- uses: actions/checkout@v3
+ uses: actions/checkout@v4
  with:
  fetch-depth: 0

@@ -46,7 +46,7 @@ jobs:
  shell: Rscript {0}

  - name: Cache styler
- uses: actions/cache@v3
+ uses: actions/cache@v4
  with:
  path: ${{ steps.styler-location.outputs.location }}
  key: ${{ runner.os }}-styler-${{ github.sha }}
@@ -60,7 +60,7 @@ jobs:

  - name: Commit and create a Pull Request on development
  if: ${{ github.ref == 'refs/heads/development' }}
- uses: peter-evans/create-pull-request@v4
+ uses: peter-evans/create-pull-request@v6
  with:
  commit-message: "Style package"
  branch: document_development
@@ -72,6 +72,6 @@ jobs:

  - name: Commit and push changes on all other branches
  if: ${{ github.ref != 'refs/heads/development' }}
- uses: stefanzweifel/git-auto-commit-action@v4
+ uses: stefanzweifel/git-auto-commit-action@v5
  with:
  commit_message: "Style package"
4 changes: 2 additions & 2 deletions .github/workflows/test-coverage.yaml
@@ -15,7 +15,7 @@ jobs:
  GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}

  steps:
- - uses: actions/checkout@v3
+ - uses: actions/checkout@v4

  - uses: r-lib/actions/setup-r@v2
  with:
@@ -44,7 +44,7 @@ jobs:

  - name: Upload test results
  if: failure()
- uses: actions/upload-artifact@v3
+ uses: actions/upload-artifact@v4
  with:
  name: coverage-test-failures
  path: ${{ runner.temp }}/package
16 changes: 8 additions & 8 deletions DESCRIPTION
@@ -4,15 +4,14 @@ Title: Useful functions for working with the Source Linkage Files
  Version: 0.10.0.9000
  Authors@R: c(
  person("Public Health Scotland", , , "phs.source@phs.scot", role = "cph"),
- person("James", "McMahon", , "james.mcmahon@phs.scot", role = c("cre", "aut"),
- comment = c(ORCID = "0000-0002-5380-2029"))
+ person("James", "McMahon", , "james.mcmahon@phs.scot", role = c("aut"),
+ comment = c(ORCID = "0000-0002-5380-2029")),
+ person("Megan", "McNicol", , "megan.mcnicol2@phs.scot", role = c("cre", "aut"))
  )
- Description: This package provides a few helper functions for working with
- the Source Linkage Files (SLFs). The functions are mainly focussed on
- making the first steps of analysis easier. They can read in and filter
- the files in an efficient way using minimal syntax. If you find a bug
- or have any ideas for new functions or improvements get in touch or
- submit a pull request.
+ Description: This package provides helper functions for working with
+ the Source Linkage Files (SLFs). The functions are mainly focused on
+ making the first steps of analysis easier. They can read and filter
+ the files efficiently using minimal code.
  License: MIT + file LICENSE
  URL: https://public-health-scotland.github.io/slfhelper/,
  https://github.com/Public-Health-Scotland/slfhelper
@@ -25,6 +24,7 @@ Imports:
  dplyr (>= 1.1.2),
  fs (>= 1.6.2),
  fst (>= 0.9.8),
+ hms,
  lifecycle (>= 1.0.3),
  magrittr (>= 2.0.3),
  openssl (>= 2.0.6),
6 changes: 5 additions & 1 deletion R/get_anon_chi.R
@@ -18,9 +18,9 @@
  #' chi_cohort %>% get_anon_chi()
  #' chi_cohort %>% get_anon_chi(chi_var = "upi_number")
  #' }
  get_anon_chi <- function(chi_cohort, chi_var = "chi", drop = TRUE, check = TRUE) {
  [GitHub Actions / lint] Check warning on line 21 in R/get_anon_chi.R (col 81): [line_length_linter] Lines should not be more than 80 characters. This line is 82 characters.
  if (check) {
  # Optional code, if the user has phsmethods installed check the CHIs with it.
  [GitHub Actions / lint] Check warning on line 23 in R/get_anon_chi.R (col 81): [line_length_linter] Lines should not be more than 80 characters. This line is 81 characters.
  if (rlang::is_installed("phsmethods")) {
  checked_chi <- phsmethods::chi_check(
  dplyr::pull(chi_cohort, {{ chi_var }})
@@ -39,11 +39,11 @@
  ))
  } else if (n_invalid > 0) {
  cli::cli_alert_warning(
  "{n_invalid} CHI number{?s} {?is/are} invalid according to {.fn phsmethods::chi_check}."
  [GitHub Actions / lint] Check warning on line 42 in R/get_anon_chi.R (col 81): [line_length_linter] Lines should not be more than 80 characters. This line is 98 characters.
  )
  print(
  tibble::tibble(
  {{ chi_var }} := dplyr::pull(chi_cohort, {{ chi_var }})[which_invalid],
  [GitHub Actions / lint] Check warning on line 46 in R/get_anon_chi.R (col 81): [line_length_linter] Lines should not be more than 80 characters. This line is 83 characters.
  reason = checked_chi[which_invalid]
  )
  )
@@ -54,7 +54,11 @@
  lookup <- tibble::tibble(
  chi = unique(chi_cohort[[chi_var]])
  ) %>%
- dplyr::mutate(anon_chi = convert_chi_to_anon_chi(.data$chi))
+ dplyr::mutate(
+ chi = dplyr::if_else(is.na(.data$chi), "", .data$chi),
+ anon_chi = purrr::map_chr(.data$chi, openssl::base64_encode),
+ anon_chi = dplyr::if_else(.data$anon_chi == "", NA_character_, .data$anon_chi)
  [GitHub Actions / lint] Check warning on line 60 in R/get_anon_chi.R (col 81): [line_length_linter] Lines should not be more than 80 characters. This line is 84 characters.
+ )

  chi_cohort <- chi_cohort %>%
  dplyr::left_join(
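The change above follows a common pattern: transform only the distinct values, using `""` as a placeholder for `NA`, then map the results back onto every row. A minimal base R sketch of that pattern is shown here. It is illustrative only: `encode_one()` is a hypothetical stand-in for the real encoder (it simply swaps digits), the IDs are made up, and `match()` is used in place of the package's `dplyr::left_join()`.

```r
# Hypothetical stand-in for the real encoder: swaps each digit d for 9 - d.
encode_one <- function(x) chartr("0123456789", "9876543210", x)

# Made-up identifiers, including a duplicate and a missing value.
ids <- c("0123456789", NA, "9876543210", "0123456789")

# Build the lookup on distinct values only, replacing NA with "" first.
keys <- unique(ids)
keys[is.na(keys)] <- ""
encoded <- vapply(keys, encode_one, character(1), USE.NAMES = FALSE)

# Turn the placeholder result back into NA.
encoded[encoded == ""] <- NA_character_

# Map the lookup back onto every row. Rows with a missing id find no match
# in `keys`, so they come back as NA.
result <- encoded[match(ids, keys)]
result
```

The transformation runs once per distinct value rather than once per row, which is where the saving comes from when identifiers repeat many times.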
24 changes: 10 additions & 14 deletions R/get_chi.R
@@ -19,8 +19,11 @@ get_chi <- function(data, anon_chi_var = "anon_chi", drop = TRUE) {
  lookup <- tibble::tibble(
  anon_chi = unique(data[[anon_chi_var]])
  ) %>%
- dplyr::mutate(chi = convert_anon_chi_to_chi(.data$anon_chi))
-
+ dplyr::mutate(
+ anon_chi = dplyr::if_else(is.na(.data$anon_chi), "", .data$anon_chi),
+ chi = unname(convert_anon_chi_to_chi(.data$anon_chi)),
+ chi = dplyr::if_else(.data$chi == "", NA_character_, .data$chi)
+ )
  data <- data %>%
  dplyr::left_join(
  lookup,
@@ -36,17 +39,10 @@ get_chi <- function(data, anon_chi_var = "anon_chi", drop = TRUE) {
  return(data)
  }

- convert_anon_chi_to_chi <- function(anon_chi) {
- chi <- purrr::map_chr(
- anon_chi,
- ~ dplyr::case_match(.x,
- NA_character_ ~ NA_character_,
- "" ~ "",
- .default = openssl::base64_decode(.x) %>%
- substr(2, 2) %>%
- paste0(collapse = "")
- )
- )
+ convert_anon_chi_to_chi <- Vectorize(function(anon_chi) {
+ chi <- openssl::base64_decode(anon_chi) %>%
+ substr(2, 2) %>%
+ paste0(collapse = "")

  return(chi)
- }
+ })
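Two details of the new `convert_anon_chi_to_chi()` can be illustrated with a small base R sketch. It is not the package code: `decode_digits()` is a hypothetical helper that starts from a plain string rather than base64 input, and the digit strings are made up. It shows why `substr(2, 2)` recovers digits from raw bytes, and why the caller wraps the result in `unname()`.

```r
# Hypothetical stand-in for the decoder, working on a plain string.
decode_digits <- function(x) {
  # charToRaw() gives one byte per character. as.character() on a raw vector
  # yields two-character hex strings, so the digit "1" (0x31) becomes "31".
  hex <- as.character(charToRaw(x))
  # For ASCII digits (0x30 to 0x39) the second hex character is the digit.
  paste0(substr(hex, 2, 2), collapse = "")
}

decode_all <- Vectorize(decode_digits)

out <- decode_all(c("0123456789", "9876543210"))

# Vectorize() wraps mapply(), which uses a character input as the names of
# the result, so `out` is a named character vector.
names(out)

# unname() drops those names and leaves a plain character vector.
unname(out)

# An empty string has no bytes, so it decodes to "".
decode_digits("")
```

Because the empty string decodes to `""`, the calling code can map `NA` to `""` before decoding and turn `""` back into `NA` afterwards, as the `dplyr::mutate()` call in the diff does.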
11 changes: 11 additions & 0 deletions R/read_slf.R
@@ -182,6 +182,17 @@ read_slf_episode <- function(
  dev = dev
  )
  )
+
+ if ("keytime1" %in% colnames(data)) {
+ data <- data %>%
+ dplyr::mutate(keytime1 = hms::as_hms(.data$keytime1))
+ }
+ if ("keytime2" %in% colnames(data)) {
+ data <- data %>%
+ dplyr::mutate(keytime2 = hms::as_hms(.data$keytime2))
+ }
+
+ return(data)
  }

  #' Read a Source Linkage individual file
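The two `if` blocks added above convert each keytime column only when it is present. The same effect could be sketched in one step with `dplyr::across()` and `dplyr::any_of()`. This is an alternative formulation and not what the PR does. The example tibble is hypothetical, and storing keytime as seconds since midnight is an assumption; the real files may hold it in another form.

```r
library(dplyr)

# Hypothetical episode extract. keytime2 is deliberately absent.
data <- tibble(
  anon_chi = c("a", "b"),
  keytime1 = c(34200, 50730) # assumed: seconds since midnight
)

# any_of() silently skips names that are not columns, so no existence
# check is needed before converting.
data <- data %>%
  mutate(across(any_of(c("keytime1", "keytime2")), hms::as_hms))

class(data$keytime1)
```

The explicit `if` blocks in the PR are arguably easier to read for two columns; the `across()` form scales better if more time columns are added later.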
20 changes: 13 additions & 7 deletions README.Rmd
@@ -23,13 +23,19 @@ knitr::opts_chunk$set(

  # slfhelper

- The goal of slfhelper is to provide some easy-to-use functions that make working with the Source Linkage Files as painless and efficient as possible.
+ The goal of slfhelper is to provide some easy-to-use functions that make working with the Source Linkage Files as painless and efficient as possible. It is only intended for use by PHS employees and will only work on the PHS R infrastructure.

  ## Installation

- The preferred method of installation is to use the [{`pak`} package](https://pak.r-lib.org/), which does an excellent job of handling the errors which sometimes occur.
+ The simplest way to install to the PHS Posit Workbench environment is to use the [PHS Package Manager](https://ppm.publichealthscotland.org/client/#/repos/3/packages/slfhelper), this will be the default setting and means you can install `slfhelper` as you would any other package.

- ```{r package_install}
+ ``` {r package_install_ppm}
+ install.packages("slfhelper")
+ ```
+
+ If this doesn't work you can install it directly from GitHub, there are a number of ways to do this, we recommend the [{`pak`} package](https://pak.r-lib.org/).
+
+ ```{r package_install_github}
  # Install pak (if needed)
  install.packages("pak")

@@ -41,9 +47,9 @@ pak::pak("Public-Health-Scotland/slfhelper")

  ### Read a file

- **Note:** Reading a full file is quite slow and will use a lot of memory, we would always recommend doing a column selection to only keep the variables that you need for your analysis. Just doing this will dramatically speed up the read-time.
+ **Note:** Reading a full file is quite slow and will use a lot of memory, we would always recommend doing a column selection to only keep the variables that you need for your analysis. Just doing this will dramatically speed up the read time.

- We provide some data snippets to help with the column selection and filtering.
+ We provide some data snippets to help with column selection and filtering.

  ```{r helper_data}
  library(slfhelper)
@@ -99,11 +105,11 @@ ep_1718 <- read_slf_episode(c("1718", "1819", "1920"),
  ) %>%
  get_chi()

- # Change chi numbers from data above back to anon_chi
+ # Change chi numbers from the data above back to anon_chi
  ep_1718_anon <- ep_1718 %>%
  get_anon_chi(chi_var = "chi")

- # Add anon_chi to cohort sample
+ # Add anon_chi to the cohort sample
  chi_cohort <- chi_cohort %>%
  get_anon_chi(chi_var = "upi_number")
  ```
52 changes: 44 additions & 8 deletions README.md
@@ -14,13 +14,24 @@ stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://

  The goal of slfhelper is to provide some easy-to-use functions that make
  working with the Source Linkage Files as painless and efficient as
- possible.
+ possible. It is only intended for use by PHS employees and will only
+ work on the PHS R infrastructure.

  ## Installation

- The preferred method of installation is to use the [{`pak`}
- package](https://pak.r-lib.org/), which does an excellent job of
- handling the errors which sometimes occur.
+ The simplest way to install to the PHS Posit Workbench environment is to
+ use the [PHS Package
+ Manager](https://ppm.publichealthscotland.org/client/#/repos/3/packages/slfhelper),
+ this will be the default setting and means you can install `slfhelper`
+ as you would any other package.
+
+ ``` r
+ install.packages("slfhelper")
+ ```
+
+ If this doesn’t work you can install it directly from GitHub, there are
+ a number of ways to do this, we recommend the [{`pak`}
+ package](https://pak.r-lib.org/).

  ``` r
  # Install pak (if needed)
@@ -37,9 +48,9 @@ pak::pak("Public-Health-Scotland/slfhelper")
  **Note:** Reading a full file is quite slow and will use a lot of
  memory, we would always recommend doing a column selection to only keep
  the variables that you need for your analysis. Just doing this will
- dramatically speed up the read-time.
+ dramatically speed up the read time.

- We provide some data snippets to help with the column selection and
+ We provide some data snippets to help with column selection and
  filtering.

  ``` r
@@ -54,6 +65,31 @@ View(partnerships)

  # See a list with descriptions for the recids
  View(recids)
+
+ # See a list of Long term conditions
+ View(ltc_vars)
+
+ # See a list of bedday related variables
+ View(ep_file_bedday_vars)
+
+ # See a list of cost related variables
+ View(ep_file_cost_vars)
+ ```
+
+ ``` r
+ library(slfhelper)
+
+ # Read a group of variables e.g. LTCs (arth, asthma, atrialfib etc)
+ # A nice 'catch all' for reading in all of the LTC variables
+ ep_1718 <- read_slf_episode("1718", col_select = c("anon_chi", ltc_vars))
+
+ # Read in a group of variables e.g. bedday related variables (yearstay, stay, apr_beddays etc)
+ # A 'catch all' for reading in bedday related variables
+ ep_1819 <- read_slf_episode("1819", col_select = c("anon_chi", ep_file_bedday_vars))
+
+ # Read in a group of variables e.g. cost related variables (cost_total_net, apr_cost)
+ # A 'catch all' for reading in cost related variables
+ ep_1920 <- read_slf_episode("1920", col_select = c("anon_chi", ep_file_cost_vars))
  ```

  ``` r
@@ -97,11 +133,11 @@ ep_1718 <- read_slf_episode(c("1718", "1819", "1920"),
  ) %>%
  get_chi()

- # Change chi numbers from data above back to anon_chi
+ # Change chi numbers from the data above back to anon_chi
  ep_1718_anon <- ep_1718 %>%
  get_anon_chi(chi_var = "chi")

- # Add anon_chi to cohort sample
+ # Add anon_chi to the cohort sample
  chi_cohort <- chi_cohort %>%
  get_anon_chi(chi_var = "upi_number")
  ```
Binary file modified data/ep_file_cost_vars.rda
Binary file not shown.
Binary file modified data/ep_file_vars.rda
Binary file not shown.
Binary file modified data/indiv_file_vars.rda
Binary file not shown.