From b4d1ec2cb4dd9e8f6dbd658f11a5fee4ce9f24be Mon Sep 17 00:00:00 2001
From: Matt Nield <64328730+matthewjnield@users.noreply.github.com>
Date: Fri, 3 Nov 2023 12:46:53 -0400
Subject: [PATCH] Standardize definition of tidy data (#1532)

This commit updates the definition of tidy data used in the README and the vignette to match the one presented in ["R for Data Science (2e)"](https://r4ds.hadley.nz/data-tidy#sec-tidy-data).
---
 README.Rmd              | 6 +++---
 README.md               | 6 +++---
 vignettes/tidy-data.Rmd | 6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/README.Rmd b/README.Rmd
index 6f810170c..115fe14ec 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -24,9 +24,9 @@ knitr::opts_chunk$set(
 
 The goal of tidyr is to help you create __tidy data__. Tidy data is data where:
 
-1. Every column is a variable.
-1. Every row is an observation.
-1. Every cell is a single value.
+1. Each variable is a column; each column is a variable.
+1. Each observation is a row; each row is an observation.
+1. Each value is a cell; each cell is a single value.
 
 Tidy data describes a standard way of storing data that is used wherever possible throughout the [tidyverse](https://www.tidyverse.org/). If you ensure that your data is tidy, you'll spend less time fighting with the tools and more time working on your analysis. Learn more about tidy data in `vignette("tidy-data")`.
 
diff --git a/README.md b/README.md
index 033e8497b..16ca7fcc9 100644
--- a/README.md
+++ b/README.md
@@ -17,9 +17,9 @@ coverage](https://codecov.io/gh/tidyverse/tidyr/branch/main/graph/badge.svg)](ht
 The goal of tidyr is to help you create **tidy data**. Tidy data is data
 where:
 
-1. Every column is a variable.
-2. Every row is an observation.
-3. Every cell is a single value.
+1. Each variable is a column; each column is a variable.
+2. Each observation is a row; each row is an observation.
+3. Each value is a cell; each cell is a single value.
 
 Tidy data describes a standard way of storing data that is used wherever
 possible throughout the [tidyverse](https://www.tidyverse.org/). If you
diff --git a/vignettes/tidy-data.Rmd b/vignettes/tidy-data.Rmd
index 64fc55859..c1fa65c92 100644
--- a/vignettes/tidy-data.Rmd
+++ b/vignettes/tidy-data.Rmd
@@ -99,11 +99,11 @@ Variables may change over the course of analysis. Often the variables in the raw
 
 Tidy data is a standard way of mapping the meaning of a dataset to its structure. A dataset is messy or tidy depending on how rows, columns and tables are matched up with observations, variables and types. In **tidy data**:
 
-1. Every column is a variable.
+1. Each variable is a column; each column is a variable.
 
-2. Every row is an observation.
+2. Each observation is a row; each row is an observation.
 
-3. Every cell is a single value.
+3. Each value is a cell; each cell is a single value.
 
 This is Codd's 3rd normal form, but with the constraints framed in statistical language, and the focus put on a single dataset rather than the many connected datasets common in relational databases. **Messy data** is any other arrangement of the data.
 