Friends don't let friends make certain types of data visualization - What are they and why are they bad.
This project is motivated by the popular data visualization do's and don'ts. I decided to create my own version in Python for two reasons. First, it lets me practice my Python skills. Second, it lets me reiterate the data visualization problems that can occur when good practices are not followed.
See the original Friends Don't Let Friends here.
The Scripts/ directory contains the Python files that generate the graphics shown below.
Requirements
- Python
- Pandas
- SciPy
- Matplotlib
Bar plots for means separation have to be the first one. Means separation plots are some of the most common plots in scientific publications. We have two or more groups, each containing multiple observations; the groups may have different means, variances, and distributions. The task of the visualization is to show both the means and the spread (dispersion) of the data.
In this example, two groups have similar means and standard deviations, but quite different distributions. Are they really "the same"? A bar plot shows only the mean and an error bar, so it cannot answer that question. Just don't use bar plots for means separation, or at least check a couple of things (such as the shape of each distribution) before settling on a bar plot.
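The point can be sketched with a small simulation. This is a minimal illustration, not the code in Scripts/: the group names, sample sizes, distribution parameters, and the `summarize` and `plot_comparison` helpers are all assumptions made up for this example. Group A is a single bell-shaped cluster and group B is two tight clusters, chosen so that the means and standard deviations come out close while the shapes differ.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the example is reproducible

n = 200  # hypothetical sample size per group
# Group A: one bell-shaped cluster centred on 1.
group_a = [rng.gauss(1.0, 1.0) for _ in range(n)]
# Group B: two tight clusters at 0 and 2, so almost nothing sits near the mean.
group_b = ([rng.gauss(0.0, 0.1) for _ in range(n // 2)]
           + [rng.gauss(2.0, 0.1) for _ in range(n // 2)])
groups = {"A": group_a, "B": group_b}


def summarize(values):
    """Return the mean, the SD, and the share of points within 0.5 of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    near_mean = sum(abs(v - mean) < 0.5 for v in values) / len(values)
    return {"mean": mean, "sd": sd, "near_mean": near_mean}


summary = {name: summarize(values) for name, values in groups.items()}
for name, stats in summary.items():
    print(f"{name}: mean={stats['mean']:.2f} sd={stats['sd']:.2f} "
          f"share near mean={stats['near_mean']:.2f}")


def plot_comparison(groups, path="bar_vs_points.png"):
    """Draw a bar plot (mean and SD) next to the raw points for each group."""
    # Imported here so the summary above runs even without Matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    names = list(groups)
    positions = list(range(len(names)))
    means = [statistics.mean(groups[k]) for k in names]
    sds = [statistics.stdev(groups[k]) for k in names]

    fig, (ax_bar, ax_pts) = plt.subplots(1, 2, figsize=(8, 4), sharey=True)
    ax_bar.bar(positions, means, yerr=sds, capsize=6)
    ax_bar.set_title("Bar plot: mean and SD")

    jitter = random.Random(0)  # horizontal jitter so points do not overlap
    for pos, name in zip(positions, names):
        xs = [pos + jitter.uniform(-0.15, 0.15) for _ in groups[name]]
        ax_pts.scatter(xs, groups[name], s=8, alpha=0.5)
    ax_pts.set_title("Raw observations")

    for ax in (ax_bar, ax_pts):
        ax.set_xticks(positions)
        ax.set_xticklabels(names)
    fig.tight_layout()
    fig.savefig(path, dpi=150)
    plt.close(fig)
```

Calling `plot_comparison(groups)` writes the two panels side by side to `bar_vs_points.png`. The bars should look nearly interchangeable, while the raw points show one cluster for A and two for B, which is the difference the printed "share near mean" column hints at.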
Violin plots make no sense for small n
Color scales should be used thoughtfully when visualizing data