- Renamed `table4` and `table5` to `table4a` and `table4b` to make their connection clearer.
- `full_seq()` works correctly for dates and date/times.
- Restored compatibility with R < 3.3.0 by avoiding `getS3method(envir = )` (#205, @krlmlr).
`separate_rows()` separates observations with multiple delimited values into separate rows (#69, @aaronwolen).
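
As a minimal sketch of what `separate_rows()` does, assuming a toy data frame (`df`, with hypothetical columns `id` and `tags`) that is not part of tidyr:

```r
library(tidyr)

# Hypothetical example data: one row per id, several comma-delimited tags.
df <- data.frame(
  id = c(1, 2),
  tags = c("a,b", "c"),
  stringsAsFactors = FALSE
)

# Each delimited value in `tags` gets its own row; `id` is repeated as needed.
long <- separate_rows(df, tags, sep = ",")
long
```

Here the single row with `"a,b"` should become two rows, so `long` ends up with one row per tag.
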
- `complete()` preserves grouping created by dplyr (#168).
- `expand()` (and hence `complete()`) preserves the ordered attribute of factors (#165).
- `full_seq()` preserves attributes for dates and date/times (#156), and sequences no longer need to start at 0.
- `gather()` can now gather together list columns (#175), and `gather_.data.frame(na.rm = TRUE)` now only removes missing values if they're actually present (#173).
- `nest()` returns correct output if every variable is nested (#186).
- `separate()` fills from right-to-left (not left-to-right!) when `fill = "left"` (#170, @dgrtwo).
- `separate()` and `unite()` now automatically drop removed variables from grouping (#159, #177).
- `spread()` gains a `sep` argument. If non-null, this will name columns as `key<sep>value`. Additionally, if `sep` is `NULL`, missing values will be converted to `<NA>` (#68).
- `spread()` works in the presence of list-columns (#199).
- `unnest()` works with non-syntactic names (#190).
- `unnest()` gains a `.sep` argument. If non-null, this will rename the columns of nested data frames to include both the original column name and the nested column name, separated by `.sep` (#184).
- `unnest()` gains a `.id` argument that works the same way as in `bind_rows()`. This is useful if you have a named list of data frames or vectors (#125).
- Moved in useful sample datasets from the DSR package.
- Made compatible with both dplyr 0.4 and 0.5.
- tidyr functions that create new columns are more aggressive about re-encoding the column names as UTF-8.
- Fixed bug in `nest()` where nested data was ending up in the wrong row (#158).
`nest()` and `unnest()` have been overhauled to support a useful way of structuring data frames: the nested data frame. In a grouped data frame, you have one row per observation, and additional metadata define the groups. In a nested data frame, you have one row per group, and the individual observations are stored in a column that is a list of data frames. This is a useful structure when you have lists of other objects (like models) with one element per group.
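
The nested structure described above can be sketched as follows. This is an illustration only: it assumes dplyr is installed, uses the built-in `mtcars` data, and the choice of `lm(mpg ~ wt)` as the per-group model is hypothetical.

```r
library(tidyr)
library(dplyr)

# One row per group (cyl); the observations live in the list-column `data`.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  nest()

# Fit one (illustrative) model per group: a list with one element per row.
models <- lapply(by_cyl$data, function(d) lm(mpg ~ wt, data = d))

by_cyl
```

Because there is one row per group, a list such as `models` lines up with the rows of `by_cyl` and could be stored alongside `data` as another list-column.
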
- `nest()` now produces a single list of data frames called "data" rather than a list column for each variable. Nesting variables are not included in nested data frames. It also works with grouped data frames made by `dplyr::group_by()`. You can override the default column name with `.key`.
- `unnest()` gains a `.drop` argument which controls what happens to other list columns. By default, they're kept if the output doesn't require row duplication; otherwise they're dropped.
- `unnest()` now has `mutate()` semantics for `...`; this allows you to unnest transformed columns more easily. (Previously it used select semantics.)
- `expand()` once again allows you to evaluate arbitrary expressions like `full_seq(year)`. If you were previously using `c()` to create nested combinations, you'll now need to use `nesting()` (#85, #121).
- `nesting()` and `crossing()` allow you to create nested and crossed data frames from individual vectors. `crossing()` is similar to `base::expand.grid()`.
- `full_seq(x, period)` creates the full sequence of values from `min(x)` to `max(x)` every `period` values.
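
A short sketch of how `expand()`, `nesting()`, `crossing()` and `full_seq()` fit together, assuming a small made-up data frame (`df`, with hypothetical columns `year` and `site`):

```r
library(tidyr)

# Hypothetical data: 2011 is missing, and site "b" was only seen in 2013.
df <- data.frame(
  year = c(2010, 2012, 2013),
  site = c("a", "a", "b"),
  stringsAsFactors = FALSE
)

# Every year from min(year) to max(year), in steps of 1.
years <- full_seq(df$year, period = 1)

# Arbitrary expressions inside expand(): all sites crossed with all years.
all_combos <- expand(df, site, year = full_seq(year, 1))

# nesting() keeps only the combinations that actually occur in the data.
seen_combos <- expand(df, nesting(site, year))

# crossing() builds a crossed data frame from individual vectors.
grid <- crossing(x = 1:2, y = c("a", "b"))
```

With this input, `all_combos` covers 2 sites by 4 years, while `seen_combos` has only the 3 site/year pairs present in `df`.
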
- `fill()` fills in `NULL`s in list-columns.
- `fill()` gains a `.direction` argument so that it can fill either upwards or downwards (#114).
- `gather()` now stores the key column as character, by default. To revert to the previous behaviour of using a factor (which allows you to preserve the ordering of the columns), use `factor_key = TRUE` (#96).
- All tidyr verbs do the right thing for grouped data frames created by `group_by()` (#122, #129, #81).
- `seq_range()` has been removed. It was never used or announced.
- `spread()` once again creates columns of mixed type when `convert = TRUE` (#118, @jennybc). `spread()` with `drop = FALSE` handles zero-length factors (#56). Spreading a data frame with only key and value columns creates a one-row output (#41).
- `unite()` now removes old columns before adding new (#89, @krlmlr).
- `separate()` now warns if the defunct `...` argument is used (#151, @krlmlr).
- Fixed bug where attributes of non-gather columns were lost (#104).
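
To illustrate two of the entries above (the character key in `gather()` and the fill direction in `fill()`), here is a hedged sketch. The data frames `wide` and `readings` are made up, and the argument names `factor_key` and `.direction` are those of the released tidyr functions.

```r
library(tidyr)

# Hypothetical wide data; note that column b comes before column a.
wide <- data.frame(id = 1:2, b = c(1, 2), a = c(3, 4))

# Default: the key column is a character vector.
long_chr <- gather(wide, key, value, -id)

# factor_key = TRUE: the key is a factor whose levels follow column order.
long_fct <- gather(wide, key, value, -id, factor_key = TRUE)

# Hypothetical data with gaps, filled upwards from the next non-missing value.
readings <- data.frame(
  quarter = c(NA, "Q1", NA, "Q2"),
  n = 1:4,
  stringsAsFactors = FALSE
)
filled_up <- fill(readings, quarter, .direction = "up")
```

Keeping the key as a factor is mainly useful when the original column order matters, for example when plotting.
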
- New `complete()` provides a wrapper around `expand()`, `left_join()` and `replace_na()` for a common task: completing a data frame with missing combinations of variables.
- `fill()` fills in missing values in a column with the last non-missing value (#4).
- New `replace_na()` makes it easy to replace missing values with something meaningful for your data.
- `nest()` is the complement of `unnest()` (#3).
- `unnest()` can now work with multiple list-columns at the same time. If you don't supply any column names, it will unlist all list-columns (#44). `unnest()` can also handle columns that are lists of data frames (#58).
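
The first three entries above can be sketched together. The data frames `df` and `readings` and their column names are hypothetical; only `complete()`, `replace_na()` and `fill()` come from tidyr.

```r
library(tidyr)

# Hypothetical data: the combination (group = "b", item = 2) is missing.
df <- data.frame(
  group = c("a", "a", "b"),
  item = c(1, 2, 1),
  value = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# complete() adds the missing combination, with NA in `value`.
completed <- complete(df, group, item)

# replace_na() swaps those NAs for a value that makes sense for the data.
zero_filled <- replace_na(completed, list(value = 0))

# fill() carries the last non-missing value forward within a column.
readings <- data.frame(
  quarter = c("Q1", NA, NA, "Q2"),
  n = 1:4,
  stringsAsFactors = FALSE
)
filled <- fill(readings, quarter)
```

Whether 0 is a meaningful replacement depends on the data; for counts it often is, for measurements it usually is not.
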
- tidyr no longer depends on reshape2. This should fix issues if you also try to load reshape (#88).
- `%>%` is re-exported from magrittr.
- `expand()` now supports nesting and crossing (see examples for details). This comes at the expense of creating new variables inline (#46).
- `expand_()` does SE evaluation correctly so you can pass it a character vector of column names (or list of formulas etc) (#70).
- `extract()` is 10x faster because it now uses stringi instead of base R regular expressions. It also returns NA instead of throwing an error if the regular expression doesn't match (#72).
- `extract()` and `separate()` preserve character vectors when `convert` is TRUE (#99).
- The internals of `spread()` have been rewritten, and now preserve all attributes of the input `value` column. This means that you can now spread date (#62) and factor (#35) inputs.
- `spread()` gives a more informative error message if `key` or `value` don't exist in the input data (#36).
- `separate()` only displays the first 20 failures (#50). It has finer control over what happens if there are too few matches: you can fill with missing values on either the "left" or the "right" (#49). `separate()` no longer throws an error if the number of pieces isn't as expected; instead it drops extra values, fills on the right, and gives a warning.
- If the input is NA, `separate()` and `extract()` both silently return NA outputs, rather than throwing an error (#77).
- Experimental `unnest()` method for lists has been removed.
- Experimental `expand()` function (#21).
- Experimental `unnest()` function for converting named lists into data frames (#3, #22).
- `extract_numeric()` preserves negative signs (#20).
- `gather()` has better defaults if `key` and `value` are not supplied. If `...` is omitted, `gather()` selects all columns (#28). Performance is now comparable to `reshape2::melt()` (#18).
- `separate()` gains an `extra` argument which lets you control what happens to extra pieces. The default is to throw an "error", but you can also "merge" or "drop".
- `spread()` gains a `drop` argument, which allows you to preserve missing factor levels (#25). It converts factor value variables to character vectors, instead of embedding a matrix inside the data frame (#35).
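
The `separate()` entries in this changelog describe two controls: `extra` for surplus pieces and `fill` for too few pieces. A minimal sketch, assuming a made-up data frame `df` with a single hypothetical column `x`, and setting both arguments explicitly rather than relying on defaults:

```r
library(tidyr)

# Hypothetical input: rows with two, three and one "-"-delimited pieces.
df <- data.frame(x = c("a-b", "a-b-c", "a"), stringsAsFactors = FALSE)

# Keep surplus pieces by merging them into the last column;
# pad short rows with NA on the right.
merged <- separate(df, x, c("first", "second"), sep = "-",
                   extra = "merge", fill = "right")

# Discard surplus pieces; pad short rows with NA on the left.
dropped <- separate(df, x, c("first", "second"), sep = "-",
                    extra = "drop", fill = "left")
```

In `merged` the three-piece row keeps `"b-c"` in `second`; in `dropped` it keeps only `"b"`, and the one-piece row has its value in `second` with `NA` in `first`.
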