Releases: grunwaldlab/poppr

Poppr version 2.8.3

18 Jun 19:22

This version of poppr has a bug fix, a much-needed improvement to the display of single-population nodes in MSN (by @fdchevalier), and minor changes to stability that should not be noticed by users.

The new MSN no longer draws single-population nodes as pies, so they look more presentable.

[Image: msn-2019-06-18-4]

I also accidentally forgot to include the NEWS file in the official release, but you can still keep track of that here:

BUG FIX

  • read.genalex() now correctly parses strata when the user imports data that
    contains duplicated data AND has some individuals named as integers less than
    the number of samples in the data (prepended by zeroes)
    (See #202).

NEW FEATURES

  • MSN functions: nodes with single populations displayed as circles instead of
    pies. (@fdchevalier, #203)

MISC

  • mlg.vector() is now safer as it now uses a for loop instead of a
    function with the out-of-scope operator (<<-) (see #205)
  • shufflepop() is now safer as it now uses a for loop instead of a
    function with the out-of-scope operator (<<-) (see #205)
  • The MLG class gains a new distenv slot, which will store the environment
    where the distance function or matrix exists. This is accompanied by an
    accessor of the same name (see #206).
  • "mlg.filter<-"() replacement methods will no longer search the global
    environment when evaluating the distance function or matrix (see #206).
  • Tests for mlg.filter() no longer assign objects to the global environment.
  • DOIs for the publications have been added to the DESCRIPTION.
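
The move away from <<- in mlg.vector() and shufflepop() can be illustrated with a minimal base R sketch. This is not poppr's code: label_with_superassign() and label_with_loop() are hypothetical helpers that label values by order of first appearance, once with a closure that modifies a variable outside its own scope and once with a for loop that keeps all state local.

```r
# Hypothetical illustration only; this is NOT the source of mlg.vector().
x <- c("b", "a", "b", "c", "a")

# Pattern 1: an anonymous function that updates `seen` in the enclosing
# function with <<-. If `seen` were not defined there, <<- would keep
# searching parent environments and could end up writing to the global one.
label_with_superassign <- function(x) {
  seen <- character(0)
  vapply(x, function(g) {
    if (!g %in% seen) seen <<- c(seen, g)
    match(g, seen)
  }, integer(1), USE.NAMES = FALSE)
}

# Pattern 2: a plain for loop; every assignment stays in the function's scope.
label_with_loop <- function(x) {
  seen <- character(0)
  out  <- integer(length(x))
  for (i in seq_along(x)) {
    if (!x[i] %in% seen) seen <- c(seen, x[i])
    out[i] <- match(x[i], seen)
  }
  out
}

res_superassign <- label_with_superassign(x)
res_loop        <- label_with_loop(x)
```

Both helpers return the same labels; the loop version simply never assigns outside its own frame, which is the safety property the MISC notes refer to.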

Poppr version 2.8.2

17 Mar 16:36

This is a maintenance release that has no visible impact for users.

Version 2.8.1

29 Aug 10:24

This is a maintenance release for poppr.

BUG FIX

  • An error that appeared in some AMOVA calls with genind objects with
    character-based alleles was fixed (see #190 for details).

DOCUMENTATION

  • aboot() documentation was updated to add the citation and make clear its
    purpose and limitations.

MISC

poppr version 2.8.0

20 May 00:39
578a7c4

This release contains an updated win.ia(), AMOVA for genlight objects, and a faster and more efficient calculation of Euclidean distance for genlight objects (see image).

image

NEWS

BUG FIX

  • win.ia() now has more consistent behavior with chromosome structure and will
    no longer result in an integer overflow.
    (see #179). Thanks to @MarisaMiller
    for the detailed bug report.
  • plot_filter_stats() will plot stats if supplied a list of thresholds.

ALGORITHMIC CHANGE

  • win.ia() may produce slightly different results because of two changes:
    1. The windows will now always start at position one on any given chromosome.
      This will result in some windows at the beginning of chromosomes having a
      value of NA if the first variant starts beyond the first window.
    2. Windows are now calculated for each chromosome independently. The previous
      version first concatenated chromosomes with at least a window-sized gap
      between the chromosomes, but failed to ensure that the window always started
      at the beginning of the chromosome. This version fixes that issue.
      (see #179).
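
The two windowing rules above can be sketched in base R. This is not the win.ia() implementation: window_means() is a hypothetical helper, the positions and values are made up, and a simple mean stands in for the index of association. It only shows windows being computed per chromosome, always starting at position one, with empty windows reported as NA.

```r
# Hypothetical sketch of per-chromosome windowing; NOT the win.ia() source.
pos    <- c(250, 310, 420, 50, 120)                 # made-up variant positions
chrom  <- c("chr1", "chr1", "chr1", "chr2", "chr2") # chromosome of each variant
values <- c(1, 2, 3, 4, 5)                          # stand-in per-variant statistic
window <- 100                                       # window size in base pairs

window_means <- function(pos, chrom, values, window) {
  # handle each chromosome independently
  lapply(split(seq_along(pos), chrom), function(idx) {
    # windows always start at position 1 of the chromosome
    n_win <- ceiling(max(pos[idx]) / window)
    bin   <- factor(ceiling(pos[idx] / window), levels = seq_len(n_win))
    # windows that contain no variants come back as NA
    as.vector(tapply(values[idx], bin, mean))
  })
}

res <- window_means(pos, chrom, values, window)
```

In this toy data the first variant on chr1 sits at position 250, so the first two windows of chr1 are NA, while chr2 is windowed separately from its own position one.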

DEPRECATION

  • The chromosome_buffer argument for win.ia() has been permanently set to
    TRUE and deprecated as it is no longer used.

NEW FEATURES

  • poppr.amova() will now handle genlight/snpclone objects.
    See #185 for details.

  • bitwise.dist() now has two new options: euclidean and scale_missing.
    When both of these are set to TRUE, the distance measured will be Euclidean
    scaled for the amount of missing data in each comparison. This matches the
    output of base R's dist() function at a fraction of the time and memory.
    See #176 for details.

  • make_haplotypes() is now a generic defined for both genind and genlight.

  • genind2genalex() will no longer write to "genalex.csv" by default. Instead,
    it will warn the user and write to a temporary file.
    See #175 for details.

  • genind2genalex() now has an overwrite parameter set to FALSE to prevent
    accidental overwriting of files.

  • win.ia() has a new argument name_window, which will give each element in
    the result the designation of the terminal position of that window. Thanks to
    @MarisaMiller for the suggestion!

  • pair.ia() can now calculate p-values via permutations (I forgot to add this in the official NEWS).

DOCUMENTATION

  • cutoff_predictor() was added to the MLG vignette

poppr version 2.8.0 release candidate

16 May 22:11
6aa71b2
Pre-release

This release contains an updated win.ia(), AMOVA for genlight objects, and a faster and more efficient calculation of Euclidean distance for genlight objects (see image).

image

NEWS

BUG FIX

  • win.ia() now has more consistent behavior with chromosome structure and will
    no longer result in an integer overflow.
    (see #179). Thanks to @MarisaMiller
    for the detailed bug report.
  • plot_filter_stats() will plot stats if supplied a list of thresholds.

ALGORITHMIC CHANGE

  • win.ia() may produce slightly different results because of two changes:
    1. The windows will now always start at position one on any given chromosome.
      This will result in some windows at the beginning of chromosomes having a
      value of NA if the first variant starts beyond the first window.
    2. Windows are now calculated for each chromosome independently. The previous
      version first concatenated chromosomes with at least a window-sized gap
      between the chromosomes, but failed to ensure that the window always started
      at the beginning of the chromosome. This version fixes that issue.
      (see #179).

DEPRECATION

  • The chromosome_buffer argument for win.ia() has been permanently set to
    TRUE and deprecated as it is no longer used.

NEW FEATURES

  • poppr.amova() will now handle genlight/snpclone objects.
    See #185 for details.

  • bitwise.dist() now has two new options: euclidean and scale_missing.
    When both of these are set to TRUE, the distance measured will be Euclidean
    scaled for the amount of missing data in each comparison. This matches the
    output of base R's dist() function at a fraction of the time and memory.
    See #176 for details.

  • make_haplotypes() is now a generic defined for both genind and genlight.

  • genind2genalex() will no longer write to "genalex.csv" by default. Instead,
    it will warn the user and write to a temporary file.
    See #175 for details.

  • genind2genalex() now has an overwrite parameter set to FALSE to prevent
    accidental overwriting of files.

  • win.ia() has a new argument name_window, which will give each element in
    the result the designation of the terminal position of that window. Thanks to
    @MarisaMiller for the suggestion!

DOCUMENTATION

  • cutoff_predictor() was added to the MLG vignette

poppr version 2.7.1

16 Mar 21:45
25bd4ba

This minor release updates the AMOVA documentation since it was accidentally lost in version 2.7.0.

Polysat is also added to imports since it's a lightweight package.

poppr version 2.7.0

16 Mar 17:33
2e22b55

Changes in poppr version 2.7

Poppr version 2.7 introduces a change to how AMOVA is calculated (thanks to Patrick Meirmans for the impetus and sample data) and two new functions for data conversion:

  • make_haplotypes() for splitting data into pseudo-haplotypes
  • as.genambig() for converting genind/genclone objects to polysat's
    genambig class.

The changes will be outlined here.

Calculating \(\rho\): AMOVA from allele frequencies

Rho is a method of calculating population differentiation in the AMOVA
framework without considering within-individual variance and is analogous
to Fst for use with autotetraploid organisms (Ronfort et al. 1998;
Meirmans and Liu 2018). The process uses the Euclidean distance of allele
frequencies and can be performed by setting within = FALSE.

library("poppr")
data("Pinf")
Pinf
#> 
#> This is a genclone object
#> -------------------------
#> Genotype information:
#> 
#>    72 multilocus genotypes 
#>    86 tetraploid individuals
#>    11 codominant loci
#> 
#> Population information:
#> 
#>     2 strata - Continent, Country
#>     2 populations defined - South America, North America

# be sure to recode your polyploid data so that there are no zeroes for placeholders
(prc <- recode_polyploids(Pinf, newploidy = TRUE))
#> 
#> This is a genclone object
#> -------------------------
#> Genotype information:
#> 
#>    72 multilocus genotypes 
#>    86 diploid (55) and triploid (31) individuals
#>    11 codominant loci
#> 
#> Population information:
#> 
#>     2 strata - Continent, Country
#>     2 populations defined - South America, North America

# calculate rho
rho  <- poppr.amova(prc, ~Continent/Country, within = FALSE, cutoff = .1)
rho$statphi
#>                              Phi
#> Phi-samples-total     0.12713922
#> Phi-samples-Continent 0.05269217
#> Phi-Continent-total   0.07858802

Here, the value of \(\rho\) is 0.1271392.

Changes in AMOVA for poppr 2.7 can affect your results

The process of calculating AMOVA in poppr involved four steps:

  1. If the data were diploid, genotypes were split into pseudo-haplotypes
  2. A distance matrix was calculated using diss.dist() and the square root was taken
  3. The matrix and hierarchy were prepared for either ade4 or pegas
  4. AMOVA was calculated using either ade4 or pegas

In this new version of poppr, you now have access to make_haplotypes(), the
function that splits genotypes into pseudo-haplotypes.

The major change in poppr 2.7 is that dist() has replaced diss.dist() as the default distance calculation for AMOVA.

Changing diss.dist() to dist()

The default distance calculation for all AMOVA was diss.dist(), which is a
dissimilarity distance. For haploid or pseudo-haploid data, this is
equivalent to a squared Euclidean distance, and was appropriate for
calculating the distance for use when the within = TRUE option was set
(which was the default). This method, however, was not appropriate when
within-individual variation was not considered (within = FALSE).

For example, this is how the previous versions of poppr would have calculated
\(\rho\):

dissim <- diss.dist(prc)
old  <- poppr.amova(prc, ~Continent/Country, within = FALSE, cutoff = .1, 
                    dist = dissim, squared = TRUE)
#> 
#>  No loci with missing values above 10% found.
#> Distance matrix is non-euclidean.
#> Using quasieuclid correction method. See ?quasieuclid for details.
old$statphi
#>                              Phi
#> Phi-samples-total     0.15032208
#> Phi-samples-Continent 0.12439575
#> Phi-Continent-total   0.02960965

If we compare this result to the one above, we can see that there is a
distinct difference in the values of \(\rho\).

AMOVA with missing data

The dist() function handles missing data differently than diss.dist(), so
you may see small differences in your results (for details, see this
StackOverflow answer: https://stackoverflow.com/a/18117751/2752888).
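
As a rough illustration of what dist() does with missing values, the base R sketch below uses a made-up two-row matrix. It assumes the behavior described in the dist() help page: columns with missing values are dropped for that pair, and the sum of squares is scaled up in proportion to the number of columns actually used.

```r
# Made-up data: row "a" has one missing value out of four columns
x <- rbind(a = c(0, 1, 2, NA),
           b = c(2, 1, 0, 1))

# Euclidean distance as computed by base R
d <- as.numeric(dist(x))

# Manual version: sum of squares over the complete columns,
# scaled by (total columns) / (columns used), then square-rooted
complete <- !is.na(x["a", ]) & !is.na(x["b", ])
ss       <- sum((x["a", complete] - x["b", complete])^2)
manual   <- sqrt(ss * ncol(x) / sum(complete))
```

Here d and manual agree, which shows that the missing column is compensated for by rescaling rather than ignored outright.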

For example, the nancycats data set has an average of 2.3% missing data. This
results in a small shift in the \(\Phi\) statistics. Here are the results with
version 2.7:

data(nancycats)
strata(nancycats) <- data.frame(colony = pop(nancycats))
new <- poppr.amova(nancycats, ~colony, cutoff = .1)
#> 
#>  No loci with missing values above 10% found.
#> Distance matrix is non-euclidean.
#> Using quasieuclid correction method. See ?quasieuclid for details.
new$statphi
#>                          Phi
#> Phi-samples-total  0.1971382
#> Phi-samples-colony 0.1235778
#> Phi-colony-total   0.0839327

To show the results from previous versions, we need to use the new
make_haplotypes() function to create pseudo-haplotypes:

nanhaps <- make_haplotypes(nancycats)

# confirm that the number of individuals is double that of the original data
nInd(nanhaps)
#> [1] 474
2 * nInd(nancycats)
#> [1] 474

# calculate squared Euclidean distance
d2n <- diss.dist(nanhaps)

# calculate AMOVA
old <- poppr.amova(nanhaps, ~colony/Individual, cutoff = .1, 
                   dist = d2n, squared = TRUE)
#> 
#>  No loci with missing values above 10% found.
#> Distance matrix is non-euclidean.
#> Using quasieuclid correction method. See ?quasieuclid for details.
old$statphi
#>                           Phi
#> Phi-samples-total  0.19292024
#> Phi-samples-colony 0.12109840
#> Phi-colony-total   0.08171772

The different treatment of the missing data has created a difference of
0.004218 in \(\Phi_{ST}\).

Converting genind/genclone to polysat

Polysat is a package that works with polyploid microsatellite data. You can
install it from CRAN with install.packages("polysat"). The poppr function
as.genambig() will convert from genind to genambig:

library("polysat") # load polysat
Pinf
#> 
#> This is a genclone object
#> -------------------------
#> Genotype information:
#> 
#>    72 multilocus genotypes 
#>    86 tetraploid individuals
#>    11 codominant loci
#> 
#> Population information:
#> 
#>     2 strata - Continent, Country
#>     2 populations defined - South America, North America
Pinf.ga <- as.genambig(Pinf) # Convert to genambig
summary(Pinf.ga)             # Show the summary of the contents
#> Dataset with allele copy number ambiguity.
#> Insert dataset description here.
#> Number of missing genotypes: 10
#> 86 samples, 11 loci.
#> 2 populations.
#> Ploidies: 2 3 NA
#> Length(s) of microsatellite repeats: NA

Once you have your genambig object, you can use all the functions polysat has
available.

Created on 2018-03-16 by the reprex package (v0.2.0).

References

Ronfort, Joëlle, Eric Jenczewski, Thomas Bataillon, and François Rousset. "Analysis of population structure in autotetraploid species." Genetics 150, no. 2 (1998): 921-930.

Meirmans, Patrick G., and Shenglin Liu. "Analysis of Molecular Variance (AMOVA) for autopolyploids." Submitted (2018).

poppr version 2.6.1

15 Jan 18:35
141db47

This is a bugfix release that fixes an attempt to read computer memory that was not allocated.

BUG FIX

  • An out-of-bounds memory access error in bitwise.dist() was fixed.
    See #169 for details.

poppr version 2.6.0

11 Jan 17:01
v.2.6.0
453f08f

The biggest feature of this release is the scaling of nodes by area in the minimum spanning networks. More details on that here: https://zkamvar.github.io/blog/poppr-2-6-0-better-network-plotting/

NEW FUNCTIONS

  • The new function boot.ia() is conceptually similar to resample.ia(),
    except it resamples with replacement.
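
The difference between the two resampling schemes can be sketched with base R's sample(). This is only a conceptual illustration with made-up sample names; it does not call boot.ia() or resample.ia(), and the sample sizes used here are assumptions rather than the defaults of either function.

```r
set.seed(1)                 # fixed seed so this sketch is reproducible
ids <- paste0("ind", 1:10)  # made-up individual names

# Resampling without replacement: each individual appears at most once
without_repl <- sample(ids, size = 5, replace = FALSE)

# Resampling with replacement (bootstrap): individuals may repeat
with_repl <- sample(ids, size = length(ids), replace = TRUE)
```

A draw without replacement can never contain duplicates, whereas a bootstrap draw of the same size as the data usually does.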

NEW FEATURES

  • The function resample.ia() now can resample individuals weighted by their
    Psex value.
  • The minimum spanning networks will now scale nodes by area instead of radius.
    This gives a more accurate picture of the differences between MLGs. See
    #154 for details.
  • A legend for samples/node is now added to all minimum spanning networks. See
    #158 for details.
  • The imsn() option for node size scale has been changed to a slider.
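
The effect of scaling nodes by area rather than radius can be checked with a little arithmetic. The base R sketch below uses made-up sample counts and a hypothetical circle_area() helper; it is not the plotting code used by poppr.

```r
# Made-up sample counts for three MLG nodes
n_samples <- c(1, 4, 9)

# Hypothetical helper: area of a circle with the given radius
circle_area <- function(radius) pi * radius^2

# Radius scaling: radius proportional to n, so area grows with n^2
area_by_radius <- circle_area(n_samples)

# Area scaling: radius proportional to sqrt(n), so area grows with n
area_by_area <- circle_area(sqrt(n_samples))

# Areas relative to the single-sample node
rel_radius <- area_by_radius / area_by_radius[1]  # 1, 16, 81
rel_area   <- area_by_area / area_by_area[1]      # 1, 4, 9
```

With radius scaling, a node holding nine samples is drawn with 81 times the area of a single-sample node; with area scaling it is drawn with nine times the area, which matches the actual ratio of samples.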

BUG FIX

  • An issue where data with sample names containing apostrophes could not be
    imported was fixed (Identified in
    #156).
  • A bug in imsn() where custom MLGs would result in an error was fixed. See
    #155 for details.
  • A bug in plot_poppr_msn() where setting scale.leg = FALSE would result in a
    very small MSN plot was fixed.
  • mlg() now works properly for snpclone and genlight objects. See
    #155 for details.

DEPENDENCIES

  • The minimum version of igraph has been set to 1.0.0.

poppr version 2.5.0

13 Sep 01:34

Version 2.5.0 of poppr contains very important bug fixes for read.genalex() and all functions that use Bruvo's distance (see details).

ALGORITHMIC CHANGE

  • Identified in #139, Bruvo's distance will now consider all possible combinations
    of ordered alleles in the calculation under the genome addition and loss models
    for missing data. This will affect those who have polyploid data that contain
    more than one missing allele at any genotype.

    To facilitate comparison, the global option old.bruvo.model has been created.
    By default it is set to FALSE, indicating that poppr should use the ordered
    allele combinations. If the user wants to use the method considering unordered
    allele combinations, they can set options(old.bruvo.model = TRUE).

    It must be repeated that this does not affect haploid or diploid comparisons,
    those that use the infinite alleles model, or those who do not have more than
    one missing allele at any genotype.
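
A minimal sketch of toggling the option for a single comparison is shown below. It uses only base R's options() and getOption(); the placeholder comment marks where a Bruvo's distance calculation would go, and the save-and-restore pattern is a suggestion rather than something poppr requires.

```r
# Switch to the old (unordered) model and remember the previous setting
old_opts <- options(old.bruvo.model = TRUE)

# Confirm which model is active
using_old_model <- getOption("old.bruvo.model")

# ... calculate Bruvo's distance with the old model here ...

# Restore whatever was set before, so later analyses use the new default
options(old_opts)
```

Restoring the saved options afterwards avoids accidentally running the rest of a session under the old model.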

DEPRECATION

  • The warning for a short repeat length vector for Bruvo's distance is
    deprecated and will become an error in the future.
  • jack.ia() is deprecated in favor of resample.ia() for clarity.

BUG FIX

  • A bug in read.genalex() where removed samples would have incorrect strata
    labels was fixed. Thanks to Hernán Dario Capador-Barreto for identifying it.
    See #147.

MISC

  • The internal plotting function for mlg.table now uses tidy evaluation for
    dplyr versions > 0.5.0.
  • The package reshape2 was removed from imports and replaced with base functions
    (see #144 for details).

NEW IMPORTS

  • Due to the migration to dplyr version 0.7.0, poppr now imports the !!
    operator from the rlang package.