Segmentation Tutorial Part 5
Finally, we are there: the analysis. Is there a difference in segmentation between programs? And between people?
Below, I’ll show how to use ggplot2 and basic R tools to get answers. If you're just here to learn BactMAP, you can skip through to the next section where we will visualize the segmentation results next to each other.
To plot the distribution of cell length, I can quickly make a box plot or a violin plot of the cell distribution. For this, I use ggplot2, a plotting system where you first give the basics (dataset, variables) to ggplot(), after which you add layer upon layer to make it look nicer.
Note that I load ggplot2 first, because I use many ggplot2 functions below.
library(ggplot2)
ggplot(allFrames$finalframe, aes(x=condition, y=max_um)) + geom_violin()
You can see that there is quite a difference in the distributions! It is so large that I would like to get a bit more information on the single data points. You can plot those using geom_dotplot() or geom_jitter() instead of geom_violin().
However, there’s one issue. Because the mesh dataset contains many x/y coordinate points for the outline of each cell, each cell is represented many times in the dataset. Therefore, it is better to make the dataset smaller by keeping only the variables that are uniform per cell before doing any other plotting. I do this with the command unique(). A bonus is that a smaller dataset is also faster to work with.
Below, I take the unique rows of the columns "cell", "frame", "condition", "max_um", "maxwum" and "area" of allFrames$finalframe:
onePerCell <- unique(allFrames$finalframe[,c("cell", "frame", "condition", "max_um", "maxwum", "area")])
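To see what unique() does here, the following is a minimal sketch on a toy data frame. The data frame `mesh`, its values, and the outline column names `X` and `Y` are made up for illustration and are not taken from allFrames$finalframe.

```r
# Toy stand-in for a mesh dataset (hypothetical values and column names):
# two cells, each described by three outline points.
mesh <- data.frame(
  cell      = c(1, 1, 1, 2, 2, 2),
  frame     = 1,
  condition = "Toy_Program",
  max_um    = c(2.1, 2.1, 2.1, 3.4, 3.4, 3.4),
  X         = c(0.0, 0.5, 1.0, 4.0, 4.6, 5.2),
  Y         = c(0.0, 0.3, 0.0, 1.0, 1.4, 1.0)
)

nrow(mesh)        # 6 rows: one per outline point

# Keep only the columns that are constant within a cell,
# then drop the duplicated rows.
toyPerCell <- unique(mesh[, c("cell", "frame", "condition", "max_um")])

nrow(toyPerCell)  # 2 rows: one per cell
```

This only collapses to one row per cell when every selected column is constant within a cell. If a column that varies along the outline (such as the coordinates) is included, unique() keeps one row per outline point.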
Now that I have a dataset with only one row per cell, I can plot the single data points. To give you an idea of the options available, I also add a boxplot on top to show the distribution, use theme_minimal() to get a white background and change the y axis label using ylab(). Check the documentation on geom_dotplot and geom_boxplot for more information on their layout options.
ggplot(onePerCell, aes(x=condition, y=max_um)) + #base plot
geom_dotplot(binaxis="y", stackdir="center", binwidth=0.05, color=NA, fill="grey") + #dotplot over y axis, centered, grey fill
geom_boxplot(fill="white", size=1, width=0.1, outlier.color=NA) + # small (width=0.1) boxplot, removed outliers
theme_minimal() + #black/white simple layout
ylab("cell length (micron)") #y axis label
I like to see the single data points because you get a better view of the outliers. In the last paragraph of this tutorial I’ll list a few extra options for plotting nice graphs.
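The single data points can also be drawn with geom_jitter(), which was mentioned earlier as an alternative to geom_dotplot(). The sketch below uses randomly generated toy data (the data frame `toy` and its two condition names are hypothetical), so that it runs without the tutorial dataset. With the real data, `toy` would be replaced by the one-row-per-cell data frame.

```r
library(ggplot2)

# Hypothetical toy data: two "segmentations" with slightly different lengths
set.seed(1)
toy <- data.frame(
  condition = rep(c("Program_A", "Program_B"), each = 50),
  max_um    = c(rnorm(50, mean = 3.0, sd = 0.5),
                rnorm(50, mean = 3.5, sd = 0.6))
)

p <- ggplot(toy, aes(x = condition, y = max_um)) +
  # spread points horizontally only (height = 0), so the lengths stay exact
  geom_jitter(width = 0.2, height = 0, alpha = 0.5, color = "grey40") +
  # narrow boxplot on top; outliers hidden because the points are already shown
  geom_boxplot(fill = "white", width = 0.1, outlier.shape = NA) +
  theme_minimal() +
  ylab("cell length (micron)")

p
```

Setting height = 0 is a deliberate choice: jitter along the y axis would shift the plotted cell lengths away from their measured values.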
When looking at the previous graph, it looks like the number of cells found per segmentation is pretty similar. Let’s check just to be sure. We can do that by having a look at the frequency table:
table(onePerCell$condition)
##
## Clement_Oufti Jun_Oufti Jun_SuperSegger
## 844 952 867
## Lance_MicrobeJ Renske_Morphometrics
## 806 802
Alternatively, we can plot a bar graph:
ggplot(onePerCell, aes(x=condition)) + geom_bar()
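To put a number on "pretty similar", one option is to compare the spread of the counts to their mean. This is a rough sketch, not part of the original analysis. The counts are retyped from the frequency table above, and with the dataset loaded you could use table(onePerCell$condition) directly instead.

```r
# Cell counts per segmentation, copied from the frequency table output
counts <- c(Clement_Oufti = 844, Jun_Oufti = 952, Jun_SuperSegger = 867,
            Lance_MicrobeJ = 806, Renske_Morphometrics = 802)

# Difference between the largest and smallest count
spread <- diff(range(counts))
spread                         # 150 (952 - 802)

# Spread relative to the mean count, as a percentage
relSpread <- 100 * spread / mean(counts)
round(relSpread, 1)

# Which segmentation found the most and the fewest cells
names(counts)[which.max(counts)]   # "Jun_Oufti"
names(counts)[which.min(counts)]   # "Renske_Morphometrics"
```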
Apart from cell length and number, we can also look at other cell dimensions, such as width or area. I will show these at the end of the tutorial.
⬅️ Segmentation Tutorial part 4: Using combineDataframes | Segmentation Tutorial part 6: Compare segmentations visually ➡️