The goal of watercostaccra is to provide access to, and documentation of, two surveys conducted in November 2023 in Accra, Ghana: a household survey on water costs and coping mechanisms, and a water point survey. The data sets are associated with a project report completed by Elizabeth Vicario for the “data science for openwashdata” course offered by openwashdata.org.
You can install the development version of watercostaccra from GitHub with:
```r
# install.packages("devtools")
devtools::install_github("openwashdata/watercostaccra")
```
Alternatively, you can download the individual data sets as a CSV or XLSX file from the table below.
dataset | CSV | XLSX |
---|---|---|
watercostaccra1 | Download CSV | Download XLSX |
watercostaccra2 | Download CSV | Download XLSX |
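As a minimal sketch of working with a downloaded file instead of the installed package, the CSV can be read with base R. The file name `watercostaccra1.csv` and the helper `read_survey_csv()` are assumptions made for illustration; adjust the path to wherever the file was actually saved.

```r
# Hypothetical helper: read a downloaded survey CSV into a data frame.
read_survey_csv <- function(path) {
  # Fail early with a clear message if the path is wrong
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  # Keep text columns as character, matching the variable types listed in this README
  read.csv(path, stringsAsFactors = FALSE)
}

# Assumed local file name; only read it if the file is present
csv_path <- "watercostaccra1.csv"
if (file.exists(csv_path)) {
  households <- read_survey_csv(csv_path)
  print(dim(households))
}
```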
The package provides access to data on household water costs, coping mechanisms, and water point estimates.
```r
library(watercostaccra)
```
The `watercostaccra1` data set contains data from a household survey on water costs and coping strategies in Accra. It has 116 observations and 89 variables. The `watercostaccra2` data set contains data from a water point survey, also conducted in Accra. It has 49 observations and 30 variables. For an overview of the variable names, see the following table.
variable_name | variable_type | description |
---|---|---|
id | double | identification number of household |
community | character | one of two communities surveyed (Korle Gonno or Abuja) |
housing_type | character | housing type (\[1\] block unit: unit in a row of apartments made of cement blocks, \[2\] wood unit: unit in a row of apartments made of wood, house, \[3\] compound house: single-story L- or C-shaped house with multiple units around a shared courtyard, \[4\] multi-story apartment building, \[5\] wooden shack, \[6\] no structure, \[7\] other) |
respondent_relationship_to_hh | character | respondent’s relationship to the household head (respondent identified) |
gender | character | gender of respondent (respondent identified) |
tenure | character | tenure status (renter, homeowner, or living without payment) |
years_in_community | integer | number of years respondent has lived in community |
adult_count | double | number of adults in household including respondent. Household is described as those “eating from the same pot” |
child_count | double | number of children under 18 in household. Household is described as those “eating from the same pot” |
rooms_in_hh | double | number of rooms used for sleeping. Household is described as those “eating from the same pot” |
business_ownership | character | household or respondent owns a business (respondent-owned or household-owned) |
business_location | character | home-based, fixed location outside home, or mobile location |
business_category | character | type of business (e.g., salon, shop, water vending) |
business_water_use | character | respondent’s business uses water beyond typical needs of household (true or false) |
business_water_source | character | primary source of water for business use (packaged water, piped to home, piped to neighbor’s home, piped to compound, commercial or public tap, borehole, dug well, spring water, delivered water) |
primary_dw_source | character | primary source of drinking water (packaged water, piped to home, piped to neighbor’s home, piped to compound, commercial or public tap, borehole, dug well, spring water, delivered water) |
dw_reason_x | character | respondent reasons for using drinking water source (convenience, affordability, availability, temperature, cleanliness, taste, habit or cultural norm, trustworthiness, health, other) |
package_type_preference | character | respondent typically purchases individual sachets/bottles, multipacks of these, or both |
package_size_reason_x | character | reason for purchasing preferred package type (storage space in home, cost effectiveness, temperature at time of purchase, availability of money, convenience, size needed for respondent or household, avoiding wasting water by purchasing when needed) |
dw_treatment | character | treatment methods of water before drinking |
primary_water_source | character | primary water source for non-drinking water (packaged water, piped to home, piped to neighbor’s home, piped to compound, commercial or public tap, borehole, dug well, spring water, delivered water) |
primary_source_reason_x | character | reason for using primary source of non-drinking water (proximity to home, convenience, affordability, availability, cleanliness, other) |
other_non_dw_source_use | logical | respondent uses at least one source besides primary non-drinking water source (true or false) |
other_non_dw_sources_x | character | additional water source(s) for non-drinking water (packaged water, piped to home, piped to neighbor’s home, piped to compound, commercial or public tap, borehole, dug well, spring water, delivered water) |
secondary_source_reason_x | character | reason for using secondary source of non-drinking water (primary source is not available, primary source is not clean, primary source is crowded, availability of shower stalls, convenient location) |
tap_payment_mode | character | respondent’s mechanism for paying for piped water (all respondents use piped water as a primary or secondary source), options including pay_to_fetch, shares_bill, and both. |
daily_hh_water_cost_for_pay_to_fetch | double | daily estimated cost of drinking water for respondent’s household |
daily_hh_water_cost_phhm_for_pay_to_fetch | double | daily estimated cost of drinking water for respondent’s household per household member |
past_struggle_to_find_water | logical | respondent has struggled to find water before (defined as extreme difficulty to access water) (true or false) |
time_of_last_struggle_to_find_water | character | respondent’s last time of struggle to find water (e.g., in the last week) |
weekdays_struggle_to_find_water | double | days in a week the respondent typically struggles to find or pay for water |
past_struggle_primary_reason | character | primary reason for past struggles to find water (availability, high cost, distance to nearest source) |
tap_closure_knowledge_x | character | respondent’s knowledge about tap closures (usually known, sometimes known, expected due to patterns in closures, not known, or no answer) |
coping_mechanism_x | character | strategies for coping with water shortage (spending more on the same amount of water, purchasing extra water to store at home, using another source, using packaged water for cooking, skipping cooking, using packaged water for bathing, skipping bathing, closing business due to water shortage, skipping laundry) |
water_storage_drinking_water | logical | respondent typically stores drinking water at home (true or false) |
water_storage_non_drinking_water | logical | respondent typically stores non-drinking water at home (true or false) |
water_storage_none | logical | respondent typically does not store water at home (true or false) |
storage_containers_x | character | if respondent typically stores non-drinking water, types of storage containers (plastic jugs also called jerry cans or Kufuor gallons, uncovered or covered barrels, other covered or uncovered containers) |
estimated_non_dw_storage_capacity | double | estimated capacity of storage for non-drinking water in liters |
estimated_stored_non_dw | double | estimated actual stored non-drinking water in liters |
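The variables above can be summarised with base R. The following is a minimal sketch, not part of the package: the helper `median_cost_by_community()` is hypothetical, and it assumes the columns `community` and `daily_hh_water_cost_for_pay_to_fetch` are present as named in the table.

```r
# Hypothetical helper: median daily pay-to-fetch water cost per community.
median_cost_by_community <- function(data) {
  # The formula interface of aggregate() drops rows with missing values by default
  aggregate(
    daily_hh_water_cost_for_pay_to_fetch ~ community,
    data = data,
    FUN = median
  )
}

# Run on the packaged data only if the package is installed
if (requireNamespace("watercostaccra", quietly = TRUE)) {
  households <- watercostaccra::watercostaccra1
  print(dim(households))  # rows and columns of the household survey
  print(median_cost_by_community(households))
}
```

The median is used here rather than the mean so that a few very high reported costs do not dominate the comparison; that is a choice made for this sketch, not one taken from the project report.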
Here is an example illustrating health risks associated with the water samples collected in Accra.
```r
library(watercostaccra)
library(ggplot2)
library(dplyr)
library(tidyr)

long_data <- watercostaccra2 |>
  pivot_longer(cols = c(coli_mpn_health_risk, tc_mpn_health_risk),
               names_to = "risk_type",
               values_to = "health_risk")

# Count occurrences of each health_risk category within each community and risk_type
count_data <- long_data |>
  group_by(community, risk_type, health_risk) |>
  summarise(count = n(), .groups = "drop")

facet_labels <- c(
  coli_mpn_health_risk = "Coliform MPN health risk",
  tc_mpn_health_risk = "Total Coliform MPN health risk"
)

# Create the bar plot
ggplot(count_data, aes(x = community, y = count, fill = health_risk)) +
  geom_col(position = "dodge") +
  facet_wrap(~ risk_type, labeller = labeller(risk_type = facet_labels)) +
  labs(title = "Health risk assessment by community",
       x = "community",
       y = "count",
       fill = "health risk") +
  scale_fill_brewer(palette = "Dark2") +
  theme_minimal()
```
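The counts behind the plot can also be inspected as a plain table. This is a minimal base R sketch; the helper `risk_counts()` is hypothetical and relies only on the `community`, `coli_mpn_health_risk`, and `tc_mpn_health_risk` columns used in the example above.

```r
# Hypothetical helper: cross-tabulate one health-risk column against community.
risk_counts <- function(data, risk_col = "coli_mpn_health_risk") {
  # useNA = "ifany" keeps missing risk categories visible instead of dropping them
  table(
    community = data$community,
    health_risk = data[[risk_col]],
    useNA = "ifany"
  )
}

# Run on the packaged data only if the package is installed
if (requireNamespace("watercostaccra", quietly = TRUE)) {
  print(risk_counts(watercostaccra::watercostaccra2))
  print(risk_counts(watercostaccra::watercostaccra2, "tc_mpn_health_risk"))
}
```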
Data are available as CC-BY.
Please cite this package using:
```r
citation("watercostaccra")
#> To cite package 'watercostaccra' in publications use:
#>
#>   Götschmann M, Vicario E, Davidson B, Amankwaa E, Zhong M (2024).
#>   _watercostaccra: Household water costs and coping strategies data
#>   from metropolitan Accra_. R package version 0.0.0.9000,
#>   <https://github.com/openwashdata/watercostaccra>.
#>
#> A BibTeX entry for LaTeX users is
#>
#>   @Manual{,
#>     title = {watercostaccra: Household water costs and coping strategies data from metropolitan Accra},
#>     author = {Margaux Götschmann and Elizabeth Vicario and Betty Avanu Davidson and Ebenezer F. Amankwaa and Mian Zhong},
#>     year = {2024},
#>     note = {R package version 0.0.0.9000},
#>     url = {https://github.com/openwashdata/watercostaccra},
#>   }
```