Genealogy Part 3

vrrenske edited this page Oct 24, 2019 · 2 revisions

Have a first look at your data

We have our three connected datasets, myTreeDF, myTreePhylo and myTreeNetwork, either from the example data or from your own data imported from SuperSegger or Oufti.

Here, myTreeDF is a data frame containing all the information needed to build a tree, plus some extra columns holding characteristics of each cell, for instance fluormean. These characteristics were recorded by SuperSegger, the program I used to generate this dataset. To see the whole data frame, use View(). To see only the first rows, use head():

head(myTreeDF)
##   node cell birth death edgelength fluorsum fluormean fluorsum_D
## 1  105    1     1     1          0   142476  473.3422     142476
## 2  106    2     1     1          0   117816  473.1566     117816
## 3  107    3     1     8          7    76344  474.1863     129799
## 4  108    4     1     1          0    95546  475.3532      95546
## 5  109    5     1     5          4    81153  474.5789     116804
## 6  110    6     1     8          7    78298  474.5333     152386
##   fluormean_D parent child1 child2 root nodelabel
## 1    473.3422    104     10     11    0         0
## 2    473.1566    104     12     13    0         0
## 3    466.9029    104     24     25    0         0
## 4    475.3532    104     14     15    0         0
## 5    469.0924    104     20     21    0         0
## 6    463.1793    104     28     29    0         0
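Beyond head(), a few base R calls give a quick feel for the table. The sketch below rebuilds the six rows shown above as a small stand-in data frame (the name toyTreeDF is hypothetical, and only a subset of the columns is copied) so that it runs without the real dataset. It also checks that, in these rows, edgelength equals death minus birth; whether that holds for every row of your own data is an assumption worth verifying.

```r
# Toy stand-in for myTreeDF: values copied from the head() output above.
# The name toyTreeDF is hypothetical; only some of the columns are included.
toyTreeDF <- data.frame(
  node       = c(105, 106, 107, 108, 109, 110),
  cell       = 1:6,
  birth      = c(1, 1, 1, 1, 1, 1),
  death      = c(1, 1, 8, 1, 5, 8),
  edgelength = c(0, 0, 7, 0, 4, 7),
  fluormean  = c(473.3422, 473.1566, 474.1863, 475.3532, 474.5789, 474.5333),
  parent     = rep(104, 6)
)

# Column types and dimensions at a glance
str(toyTreeDF)

# In these six rows, edgelength equals death - birth
lifetimeMatches <- all(toyTreeDF$edgelength == toyTreeDF$death - toyTreeDF$birth)
print(lifetimeMatches)

# Distribution of one per-cell characteristic
fluoSummary <- summary(toyTreeDF$fluormean)
print(fluoSummary)

# Cells with a non-zero edge length
longLived <- toyTreeDF[toyTreeDF$edgelength > 0,
                       c("cell", "birth", "death", "edgelength")]
print(longLived)
```

Replacing toyTreeDF with myTreeDF should run the same checks on the full dataset, provided the column names match the ones shown above.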

What I saved as myTreePhylo is a phylo object. It contains essentially the same data as myTreeDF, but stored in such a way that the ggtree package and other packages recognize it as a phylogenetic tree. You can have a look at the data structure by using View() or summary():

summary(myTreePhylo)
##
## Phylogenetic tree: myTreePhylo
##
##   Number of tips: 103
##   Number of nodes: 95
##   Branch lengths:
##     mean: 7.923858
##     variance: 34.69315
##     distribution summary:
##    Min. 1st Qu.  Median 3rd Qu.    Max.
##       0       3       8      12      31
##   Root edge: 0
##   First ten tip labels: 26
##                         36
##                         37
##                         54
##                         55
##                         66
##                         67
##                         75
##                         81
##                         88
##   First ten node labels: 0
##                          1
##                          2
##                          3
##                          4
##                          5
##                          6
##                          7
##                          8
##                          9
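To get a feel for what a phylo object stores, it can help to build a tiny one by hand. The sketch below assumes the ape package is installed (the summary output above has the format ape prints for its phylo class, but that myTreePhylo was built with ape is an assumption here). The three-tip tree toyPhylo is hypothetical and only serves as an illustration.

```r
# Assumes the ape package is available: install.packages("ape")
library(ape)

# Hypothetical three-tip tree in Newick format, for illustration only
toyPhylo <- read.tree(text = "((A:1,B:2):1,C:3);")

# Basic counts, comparable to the first lines of summary()
print(Ntip(toyPhylo))    # number of tips
print(Nnode(toyPhylo))   # number of internal nodes

# The underlying pieces of the object
print(toyPhylo$tip.label)     # labels of the tips
print(toyPhylo$edge)          # two-column matrix: parent node, child node
print(toyPhylo$edge.length)   # one branch length per row of the edge matrix
```

The same accessors (Ntip(), Nnode(), $edge, $edge.length) can be tried on myTreePhylo to see how the mother and daughter connections from myTreeDF are encoded.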

Finally, myTreeNetwork is an igraph network. It again contains the connections between mother and daughter cells, but in a different format, one more commonly used to plot network data (for instance social networks or protein cascades). We’ll look into how to use this later in the tutorial, but for now you can already take a quick look by plotting the network. For this, first install and load igraph:

install.packages("igraph")
library("igraph")

Now, you can plot the network:

plot(myTreeNetwork)
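If you want to see how such a network is put together, here is a minimal sketch that builds a small igraph object from a mother-daughter edge list. The data frame toyEdges and its column names are hypothetical, and this is not necessarily how myTreeNetwork itself was created. It only assumes igraph is installed as above.

```r
library(igraph)

# Hypothetical edge list: one row per mother-daughter connection
toyEdges <- data.frame(
  mother   = c("1", "1", "2", "2"),
  daughter = c("2", "3", "4", "5")
)

# The first two columns are read as the start and end of each edge
toyNet <- graph_from_data_frame(toyEdges, directed = TRUE)

print(vcount(toyNet))   # number of cells (vertices)
print(ecount(toyNet))   # number of mother-daughter links (edges)

# A tree layout is often easier to read for genealogies than the default layout
plot(toyNet,
     layout = layout_as_tree(toyNet),
     vertex.size = 25,
     edge.arrow.size = 0.5)
```

The same vcount(), ecount() and layout_as_tree() calls can be applied to myTreeNetwork, although with many cells the plot may need smaller vertices and labels to stay legible.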


⬅️ Genealogy Part 2: Data Import ▪️ ◾ ▪️ Genealogy Part 4: Basic Trees ➡️