Standardize definition of tidy data (#1532)
This commit updates the definition of tidy data used in the README and the vignette to match the one presented in ["R for Data Science (2e)"](https://r4ds.hadley.nz/data-tidy#sec-tidy-data).
matthewjnield authored Nov 3, 2023
1 parent 47a030f commit b4d1ec2
Showing 3 changed files with 9 additions and 9 deletions.
6 changes: 3 additions & 3 deletions README.Rmd
@@ -24,9 +24,9 @@ knitr::opts_chunk$set(

The goal of tidyr is to help you create __tidy data__. Tidy data is data where:

-1. Every column is a variable.
-1. Every row is an observation.
-1. Every cell is a single value.
+1. Each variable is a column; each column is a variable.
+1. Each observation is a row; each row is an observation.
+1. Each value is a cell; each cell is a single value.

Tidy data describes a standard way of storing data that is used wherever possible throughout the [tidyverse](https://www.tidyverse.org/). If you ensure that your data is tidy, you'll spend less time fighting with the tools and more time working on your analysis. Learn more about tidy data in `vignette("tidy-data")`.

6 changes: 3 additions & 3 deletions README.md
@@ -17,9 +17,9 @@ coverage](https://codecov.io/gh/tidyverse/tidyr/branch/main/graph/badge.svg)](ht
The goal of tidyr is to help you create **tidy data**. Tidy data is data
where:

-1. Every column is a variable.
-2. Every row is an observation.
-3. Every cell is a single value.
+1. Each variable is a column; each column is a variable.
+2. Each observation is a row; each row is an observation.
+3. Each value is a cell; each cell is a single value.

Tidy data describes a standard way of storing data that is used wherever
possible throughout the [tidyverse](https://www.tidyverse.org/). If you
6 changes: 3 additions & 3 deletions vignettes/tidy-data.Rmd
@@ -99,11 +99,11 @@ Variables may change over the course of analysis. Often the variables in the raw

Tidy data is a standard way of mapping the meaning of a dataset to its structure. A dataset is messy or tidy depending on how rows, columns and tables are matched up with observations, variables and types. In **tidy data**:

-1. Every column is a variable.
+1. Each variable is a column; each column is a variable.

-2. Every row is an observation.
+2. Each observation is a row; each row is an observation.

-3. Every cell is a single value.
+3. Each value is a cell; each cell is a single value.

This is Codd's 3rd normal form, but with the constraints framed in statistical language, and the focus put on a single dataset rather than the many connected datasets common in relational databases. **Messy data** is any other arrangement of the data.
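
For illustration only, and not part of this commit: a minimal sketch of what the three rules above mean in practice. The `messy` data frame, its column names, and its values are hypothetical; the only tidyr call assumed is `pivot_longer()`.

```r
library(tidyr)

# Hypothetical messy table: the `year` variable is spread across the column
# names, so the columns `2022` and `2023` hold values rather than variables.
messy <- data.frame(
  country = c("A", "B"),
  `2022` = c(10, 20),
  `2023` = c(11, 21),
  check.names = FALSE
)

# Reshape so that each variable (country, year, cases) is a column,
# each country-year observation is a row, and each cell holds one value.
tidy <- pivot_longer(
  messy,
  cols = c(`2022`, `2023`),
  names_to = "year",
  values_to = "cases"
)

print(tidy)
```

In this sketch the two-row wide table becomes four rows with the columns `country`, `year`, and `cases`, which satisfies both directions of each rule (for example, every variable is a column and every column is a variable).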
