msnz

The goal of the msnz package is to provide functions to simplify data analysis from the New Zealand Multiple Sclerosis Prevalence and related studies, for researchers associated with the New Zealand Brain Research Institute (NZBRI).

The package also provides a function to estimate life expectancy from the New Zealand cohort life tables, which may be of wider use. These individual estimates are either:

  • Deterministic - the function simply looks up the expected year of death for an individual of a given sex, year of birth, and conditional age, or
  • Probabilistic - the function runs an individual-level random simulation for a person of given characteristics. This means that the generated sample will have a realistic level of variability across individuals, rather than people with the same demographics having an identical expected year of death.

Installation

This package is not available via CRAN. It is hosted in a repository on GitHub at https://github.com/nzbri/msnz

The usual installation route, install.packages('msnz'), therefore does not work, and yields an unhelpful error message stating that the package is not available for your version of R. Instead, install msnz from its development repository on GitHub as follows:

# install.packages('remotes')
remotes::install_github('nzbri/msnz')

If remotes::install_github('nzbri/msnz') is invoked again later, the package will be downloaded and installed only if the version on GitHub is newer than the one installed locally.

If problems arise in a new release, you can downgrade to a previous version by specifying the name of a particular release to revert to, e.g.

remotes::install_github('nzbri/msnz@v0.3.0')

Functions specific to the MS prevalence study

The package provides three convenience functions to extract information from the unique identifier (UIN) assigned to each participant (see Richardson et al. (2012) Method for identifying eligible individuals for a prevalence survey in the absence of a disease register or population register. Internal Medicine Journal, 42, 1207-1212):

library(msnz)

msnz::dob_from_uin('01011970FAB')
#> [1] "1970-01-01"

msnz::sex_from_uin('01011970FAB')
#> [1] Female
#> Levels: Female Male

msnz::initials_from_uin('01011970FAB')
#> [1] "AB"
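
To make the UIN structure concrete, here is a minimal base R sketch of how such an identifier could be split into its parts. It is not the msnz implementation: parse_uin() is a hypothetical helper, and the layout it assumes (eight digits DDMMYYYY, then a one-letter sex code F or M, then the initials) is inferred only from the example outputs above, which cannot distinguish day-month from month-day order.

```r
# Hypothetical helper for illustration only; not part of msnz.
# ASSUMED layout: DDMMYYYY (8 digits), sex code (F or M), then initials.
parse_uin <- function(uin) {
  stopifnot(is.character(uin), all(nchar(uin) >= 10))

  # Characters 1-8: date of birth (day-month-year order is an assumption)
  dob <- as.Date(substr(uin, 1, 8), format = '%d%m%Y')

  # Character 9: sex code, mapped to a factor with the levels shown above
  sex <- factor(substr(uin, 9, 9),
                levels = c('F', 'M'),
                labels = c('Female', 'Male'))

  # Characters 10 onwards: initials
  initials <- substr(uin, 10, nchar(uin))

  data.frame(dob = dob, sex = sex, initials = initials,
             stringsAsFactors = FALSE)
}

parsed <- parse_uin('01011970FAB')
parsed
```

For real analyses, use the package functions themselves, which may validate or handle edge cases that this sketch ignores.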

It also provides two package-wide constants, giving the original census date of the New Zealand MS Prevalence study, and the censoring date for survival analyses, set to 15 years later:

msnz::census_date
#> [1] "2006-03-07"

msnz::censoring_date
#> [1] "2021-03-07"
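
The relationship between the two constants can be checked with base R date arithmetic. The sketch below defines its own local variables rather than using the package constants, so it runs without msnz installed. The dates are copied from the output above.

```r
# Local copy of the census date shown above (not the package constant itself)
census_date <- as.Date('2006-03-07')

# seq() on a Date with by = '15 years' steps forward in whole calendar years;
# the second element is the date 15 years after the census
censoring_date <- seq(census_date, by = '15 years', length.out = 2)[2]

format(censoring_date)
```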

Calculating New Zealand life expectancy

The function described in this section is the only one in the package likely to be of general use to external researchers. The package incorporates the New Zealand Cohort Life Tables provided by Statistics New Zealand, as released in March 2021. (See https://www.stats.govt.nz/information-releases/new-zealand-cohort-life-tables-march-2022-update for source data).

The msnz::expected_year_of_death() function allows one to calculate the life expectancy of a New Zealander, given their year of birth, sex, and some conditional age (see below for explanation).

msnz::expected_year_of_death(year_of_birth = 1970, 
                             sex = 'female', 
                             conditional_age = 50)
#> [1] 2058.5

# calculate remaining life expectancy (in years from the conditional age year):
year_of_birth <- 1970
conditional_age <- 50

expected_year_of_death(year_of_birth = year_of_birth,
                       sex = 'female', 
                       conditional_age = conditional_age) - 
  (year_of_birth + conditional_age) # = 2020
#> [1] 38.5

It is important to specify the conditional age, that is, an age the person is known to have reached. If it is 0, the returned value reflects life expectancy at birth. The expected year of death is later for a person known to have reached a given age than it was at their birth, when they were still subject to the risks of child mortality and early adulthood:

# life expectancy at birth, which is subject to infant mortality and elevated 
# risks in teenage years/early adulthood (e.g. traffic accidents and suicide):
msnz::expected_year_of_death(year_of_birth = 1970, 'female', 
                             conditional_age = 0)
#> [1] 2055.2

# given that we know that a person has lived to a certain age, their expected 
# year of death should be greater, as they have survived through a period of
# mortality risks:
msnz::expected_year_of_death(year_of_birth = 1970, 'female', 
                             conditional_age = 50)
#> [1] 2058.5
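
The size of this effect can be read off the two outputs above with simple arithmetic. The sketch below hard-codes those two values rather than calling the package, so it runs without msnz installed. The variable names are illustrative only.

```r
year_of_birth <- 1970

# Values copied from the outputs shown above
death_year_at_birth <- 2055.2  # conditional_age = 0
death_year_at_50    <- 2058.5  # conditional_age = 50

# Expected age at death under each condition
age_at_death_at_birth <- death_year_at_birth - year_of_birth
age_at_death_at_50    <- death_year_at_50 - year_of_birth

# Extra expected years gained by having already survived to age 50
gain <- death_year_at_50 - death_year_at_birth

round(c(at_birth = age_at_death_at_birth,
        at_50    = age_at_death_at_50,
        gain     = gain), 1)
```

In other words, on these figures a woman born in 1970 had an expected age at death of about 85.2 years at birth, rising to about 88.5 years once she is known to have reached 50.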

The values returned can either be deterministic (extracted directly from the life table) or the result of a random simulation process (to better approximate the distribution of values that would be seen in an actual population). The simulation approach is particularly useful in a survival analysis that compares survival in a given sample against a synthetic sample derived from the population. That is, for each person in the actual sample, we generate a synthetic comparison person, randomly simulated from the population, conditional on the target person's year of birth, sex, and age at a censoring date. This gives a much more natural-looking survival distribution for the population comparison than assigning every synthetic person exactly the same tabulated survival for those values.

The simulation process is invoked by specifying the parameter method = 'sample', rather than the default value method = 'median', which simply returns the tabulated median value from the cohort life tables. Using the 'sample' method runs a simulation for a person of the given year of birth, sex, and conditional age (e.g. the age they had reached at the census date). The simulation proceeds by iterating up from that conditional age, one year at a time. At each year of age, a random number is generated (uniformly distributed between 0.0 and 1.0). If that number is less than the life table probability of such a person surviving to the next year, the age is incremented and the simulation continues for that individual. If the random number exceeds that probability, then the current age is assigned as the synthetic person's age of death.
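
The year-by-year procedure described above can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the msnz source: simulate_age_at_death() is a hypothetical function, and toy_surv_prob is an invented survival curve rather than Statistics New Zealand data, so the resulting ages are not meaningful estimates.

```r
# Hypothetical illustration of the sampling loop; not the msnz implementation.
# surv_prob[a + 1] is the ASSUMED probability that a person aged a survives
# to age a + 1 (so the vector covers ages 0, 1, 2, ...).
simulate_age_at_death <- function(conditional_age, surv_prob) {
  age <- conditional_age
  max_age <- length(surv_prob)
  while (age < max_age) {
    # Draw a uniform random number and compare it with the survival probability
    if (runif(1) < surv_prob[age + 1]) {
      age <- age + 1  # survived this year: move on to the next age
    } else {
      break           # did not survive: current age is the age at death
    }
  }
  age
}

# Toy survival curve for ages 0 to 109 (invented numbers, NOT life table data):
# annual mortality rises exponentially with age and is capped at 1
toy_surv_prob <- pmax(0, 1 - 0.0001 * exp(0.085 * (0:109)))

set.seed(42)  # fixed seed so the simulated values are reproducible
simulated_ages <- replicate(1000, simulate_age_at_death(50, toy_surv_prob))

# Convert simulated ages at death to years of death for someone born in 1970
simulated_years <- 1970 + simulated_ages
summary(simulated_ages)
```

Because each synthetic person gets an independent sequence of random draws, the simulated ages at death spread out across a distribution rather than collapsing onto a single tabulated value.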

For fuller details on other function parameters, look up the help via ?expected_year_of_death. These include control over what estimate is returned (i.e. the median or the 5th or 95th percentile), the number of simulated samples returned per individual, and a seed to allow for reproducibility of the randomly simulated values.
