Standardize definition of tidy data (#1532)

This commit updates the definition of tidy data used in the README and the vignette to match the one presented in ["R for Data Science (2e)"](https://r4ds.hadley.nz/data-tidy#sec-tidy-data).
tidyverse · Nov 3, 2023 · b4d1ec2
1 parent 47a030f
commit b4d1ec2
Showing 3 changed files with 9 additions and 9 deletions.
diff --git a/README.Rmd b/README.Rmd
@@ -24,9 +24,9 @@ knitr::opts_chunk$set(
 
 The goal of tidyr is to help you create __tidy data__. Tidy data is data where:
 
-1. Every column is a variable.
-1. Every row is an observation.
-1. Every cell is a single value.
+1. Each variable is a column; each column is a variable.
+1. Each observation is a row; each row is an observation.
+1. Each value is a cell; each cell is a single value.
 
 Tidy data describes a standard way of storing data that is used wherever possible throughout the [tidyverse](https://www.tidyverse.org/). If you ensure that your data is tidy, you'll spend less time fighting with the tools and more time working on your analysis. Learn more about tidy data in `vignette("tidy-data")`.
 

diff --git a/README.md b/README.md
@@ -17,9 +17,9 @@ coverage](https://codecov.io/gh/tidyverse/tidyr/branch/main/graph/badge.svg)](ht
 The goal of tidyr is to help you create **tidy data**. Tidy data is data
 where:
 
-1.  Every column is a variable.
-2.  Every row is an observation.
-3.  Every cell is a single value.
+1.  Each variable is a column; each column is a variable.
+2.  Each observation is a row; each row is an observation.
+3.  Each value is a cell; each cell is a single value.
 
 Tidy data describes a standard way of storing data that is used wherever
 possible throughout the [tidyverse](https://www.tidyverse.org/). If you

diff --git a/vignettes/tidy-data.Rmd b/vignettes/tidy-data.Rmd
@@ -99,11 +99,11 @@ Variables may change over the course of analysis. Often the variables in the raw
 
 Tidy data is a standard way of mapping the meaning of a dataset to its structure. A dataset is messy or tidy depending on how rows, columns and tables are matched up with observations, variables and types. In **tidy data**:
 
-1.  Every column is a variable.
+1.  Each variable is a column; each column is a variable.
 
-2.  Every row is an observation.
+2.  Each observation is a row; each row is an observation.
 
-3.  Every cell is a single value.
+3.  Each value is a cell; each cell is a single value.
 
 This is Codd's 3rd normal form, but with the constraints framed in statistical language, and the focus put on a single dataset rather than the many connected datasets common in relational databases. **Messy data** is any other arrangement of the data.
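
As an illustration of the three rules in the updated definition, here is a minimal sketch that reshapes a messy table into a tidy one with `tidyr::pivot_longer()`. The data frame and its column names (`country`, `year`, `cases`) are hypothetical and are not part of this commit.

```r
library(tidyr)

# Hypothetical messy table: the headers "2019" and "2020" are values of a
# year variable, so not every column is a variable.
messy <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(11, 21),
  check.names = FALSE
)

# Move the year headers into a `year` column and the counts into `cases`,
# so each variable is a column and each observation is a row.
tidy <- pivot_longer(
  messy,
  cols = c("2019", "2020"),
  names_to = "year",
  values_to = "cases"
)

print(tidy)
```

In the reshaped table each row is one country-year observation and each cell holds a single value, which is the arrangement the revised wording describes.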