textshape version 1.0.2

@trinker trinker released this 24 Feb 15:24
· 75 commits to master since this release

NEWS

Versioning

Releases will be numbered with the following semantic versioning format:

<major>.<minor>.<patch>

And constructed with the following guidelines:

  • Breaking backward compatibility bumps the major (and resets the minor
    and patch)
  • New additions that do not break backward compatibility bump the minor
    (and reset the patch)
  • Bug fixes and miscellaneous changes bump the patch

textshape 1.0.2

BUG FIXES

  • tidy_list with a list of unnamed data.frames resulted in an error (see
    issue #7). This issue has been fixed.
  • split_word.data.frame and split_token.data.frame both incorrectly named
    the word and token index columns sentence_id. These columns are now named
    word_id and token_id respectively.
  • split_token gets a more robust splitting algorithm.

NEW FEATURES

  • column_to_rownames added to enable one to quickly move a column into the
    rownames within a pipeline. This is useful when turning a data.frame into a
    matrix.
  • tidy_list picks up the ability to tidy a list of named vectors into three
    columns.
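
The column-to-rownames step described above can be sketched in base R. This is
an illustration of the idea only, not the package's implementation; the data
set `dat` and its column names are made up for the example.

```r
# Hypothetical example data: one label column followed by numeric columns.
dat <- data.frame(
  term = c("apple", "pear", "plum"),
  doc1 = c(2, 0, 1),
  doc2 = c(0, 3, 1),
  stringsAsFactors = FALSE
)

# Drop the label column and reuse its values as the row names.
with_rownames <- dat[, -1, drop = FALSE]
rownames(with_rownames) <- dat[[1]]

# Only numeric columns remain, so the result is a numeric matrix
# whose rows are labelled by the former first column.
mat <- as.matrix(with_rownames)
```

Doing this by hand takes several statements and a temporary object, which is
presumably why a single pipeline-friendly function is convenient.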

CHANGES

  • as.tibble removed from all function arguments. This was a nice interactive
    feature that made programming very difficult to reason about. Having an
    environment dependent output would result in no adoption of the textshape
    package as a dependency. Additionally, set_output and tibble_output,
    two complementary functions, have been removed without being deprecated.
    The problem was so egregious, and the package young enough, that removal
    without deprecation was warranted.

textshape 1.0.1

NEW FEATURES

  • Users can now globally select a tibble output rather than a data.table
    output for all functions that returned a data.table. This can be set
    globally via set_output. If the user does not set the output type,
    textshape tries to infer it based on whether or not the user has dplyr
    loaded. If dplyr is loaded then tibble is the default output.
  • set_output and tibble_output added to globally set the output type
    (tibble or data.table) and to check/infer the desired output type.

textshape 1.0.0

CHANGES

  • bind_list, bind_table, & bind_vector have been renamed to the more
    meaningful forms of tidy_list, tidy_table, & tidy_vector. The former
    versions are now deprecated. This bumps the version to 1.0.0 as this is a
    major change that breaks backward compatibility.

textshape 0.1.0 - 0.2.0

NEW FEATURES

  • bind_list added to rbind a list of named data.frames or vectors.
  • split_transcript added to split a transcript style vector (e.g.,
    c("greg: Who me", "sarah: yes you!")) into a name and dialogue vector that
    is coerced to a data.table.
  • change_index added for extracting the indices of changes in runs within an
    atomic vector. Pairs well with split_index.
  • bind_vector added to cbind a named atomic vector's names and values.
  • bind_table added to cbind a table's names and values.
  • duration method for numeric vectors added as well as starts and ends
    functions for calculating start and end times from a numeric vector.
  • from_to added to prepare speaker data for a network plot given the flowing
    nature of discourse.
  • tidy_dtm & tidy_tdm added to convert a DocumentTermMatrix
    or TermDocumentMatrix into a tidied data.frame.
  • tidy_colo_dtm & tidy_colo_tdm added to convert a DocumentTermMatrix
    or TermDocumentMatrix into a collocation matrix and then a tidied data.frame.
  • unique_pairs added to complement the output of tidy_colo_dtm &
    tidy_colo_tdm. Enables the removal of duplicated collocating pairs caused
    by symmetrical mirroring of the upper and lower triangle of the collocation
    matrix.
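
The transcript split described in the split_transcript entry can be
approximated in base R. This is a rough sketch rather than the package's code:
it assumes the first colon separates speaker from dialogue, uses a plain
data.frame instead of a data.table, and the column names person and dialogue
are chosen for illustration.

```r
# The example vector from the entry above.
x <- c("greg: Who me", "sarah: yes you!")

# Text before the first colon is the speaker; everything after it is the dialogue.
person   <- trimws(sub(":.*$", "", x))
dialogue <- trimws(sub("^[^:]*:", "", x))

# Assumption: a plain data.frame stands in for the data.table the package returns.
transcript <- data.frame(
  person   = person,
  dialogue = dialogue,
  stringsAsFactors = FALSE
)
```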

CHANGES

  • split_index now uses change_index(x) as the default when x is an atomic
    vector.
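
To make the pairing of change indices and splitting concrete, here is a base R
sketch of the idea. It does not call the package; the vector `x` and the
helper names change_at, run_id, and runs are made up, and the exact output
format of change_index and split_index may differ.

```r
# Hypothetical atomic vector containing four runs.
x <- c("a", "a", "b", "b", "b", "c", "a", "a")

# Positions where a value differs from the one before it,
# that is, where each new run starts (after the first).
change_at <- which(x[-1] != x[-length(x)]) + 1L
# 3 6 7

# Give every element the id of the run it belongs to, then split on that id.
run_id <- cumsum(seq_along(x) %in% change_at)
runs <- unname(split(x, run_id))
```

With the change points as the default split locations, an atomic vector is cut
into its runs without the caller having to supply indices.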

textshape 0.0.1

Tools that can be used to reshape text data.