Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
{ | ||
"hash": "e7ffcce1c929e7411a0f0723e5df9813", | ||
"result": { | ||
"markdown": "# Applications: Explore {#sec-explore-applications}\n\n\n\n\n\n## Case study: Effective communication of exploratory results {#case-study-effective-comms}\n\nGraphs can powerfully communicate ideas directly and quickly.\nWe all know, after all, that \"a picture is worth 1000 words.\" Unfortunately, however, there are times when an image conveys a message which is inaccurate or misleading.\n\nThis chapter focuses on how graphs can best be utilized to present data accurately and effectively.\nAlong with data modeling, creative visualization is somewhat of an art.\nHowever, even with an art, there are recommended guiding principles.\nWe provide a few best practices for creating data visualizations.\n\n### Keep it simple\n\nWhen creating a graphic, keep in mind what it is that you'd like your reader to see.\nColors should be used to group items or differentiate levels in meaningful ways.\nColors can be distracting when they are only used to brighten up the plot.\n\nConsider a manufacturing company that has summarized their costs into five different categories.\nIn the two graphics provided in Figure @fig-pie-to-bar, notice that the magnitudes in the pie chart are difficult for the eye to compare.\nThat is, can your eye tell how different \"Buildings and administration\" is from \"Workplace materials\" when looking at the slices of pie?\nAdditionally, the colors in the pie chart do not mean anything and are therefore distracting.\nLastly, the three-dimensional aspect of the image does not improve the reader's ability to understand the data presented.\n\nAs an alternative, a bar plot has been provided.\nNotice how much easier it is to identify the magnitude of the differences across categories while not being distracted by other aspects of the image.\nTypically, a bar plot will be easier for the reader to digest than a pie chart, especially if the categorical data being plotted has more than just a few levels.\n\n\n::: {.cell}\n\n:::\n\n::: {.cell 
layout-ncol=\"2\"}\n::: {.cell-output-display}\n![A pie chart (with added irrelevant features) as compared to a simple bar plot.](images/pie-3d.jpg){#fig-pie-to-bar-1 width=50%}\n:::\n\n::: {.cell-output-display}\n![A pie chart (with added irrelevant features) as compared to a simple bar plot.](06-explore-applications_files/figure-html/fig-pie-to-bar-2.png){#fig-pie-to-bar-2 width=50%}\n:::\n:::\n\n\n### Use color to draw attention\n\nThere are many reasons why you might choose to add **color** to your plots.\nAn important principle to keep in mind is to use color to draw attention.\nOf course, you should still think about how visually pleasing your visualization is, and if you're adding color to make it visually pleasing without drawing attention to a particular feature, that might be fine.\nHowever, you should be critical of default coloring and explicitly decide whether to include color and how.\nNotice that in Plot B in Figure @fig-red-bar the coloring is done in such a way as to draw the reader's attention to one particular piece of information.\nThe default coloring in Plot A can be distracting and makes the reader question, for example, is there something similar about the red and purple bars?\nAlso note that not everyone sees color the same way, so it's often useful to add color and one more feature (e.g., pattern) so that you can refer to the features you're drawing attention to in multiple ways.\n\n\n::: {.cell}\n::: {.cell-output-display}\n![The default coloring in the first bar plot does nothing for the understanding of the data. 
In the second plot, the color draws attention directly to the bar on Buildings and Administration.](06-explore-applications_files/figure-html/fig-red-bar-1.png){#fig-red-bar width=90%}\n:::\n:::\n\n\n\n\n### Tell a story\n\nFor many graphs, an important aspect is the inclusion of information which is not provided in the dataset that is being plotted.\nThe external information serves to contextualize the data and helps communicate the narrative of the research.\nIn Figure @fig-duke-hires, the graph on the right is **annotated** with information about the start of the university's fiscal year, which contextualizes the information provided by the data.\nSometimes the additional information may be a diagonal line given by $y = x$: points above the line quickly show the reader which values have a $y$ coordinate larger than the $x$ coordinate; points below the line show the opposite.\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![Credit: Angela Zoss and Eric Monson, Duke Data Visualization Services](images/time-series-story.png){#fig-duke-hires width=100%}\n:::\n:::\n\n\n### Order matters\n\nMost software programs have built-in methods for some of the plot details.\nFor example, the default option for the software program used in this text, R, is to order the bars in a bar plot alphabetically.\nAs seen in Figure @fig-brexit-bars, the alphabetical ordering isn't particularly meaningful for describing the data.\nSometimes it makes sense to **order** the bars from tallest to shortest (or vice versa).\nBut in this case, the best ordering is probably the one in which the response options were presented in the survey question.\nAn ordering which does not make sense in the context of the problem (e.g., alphabetically here) can mislead the reader who might take a quick glance at the axes and not read the bar labels carefully.\n\n\n\n\n\nIn September 2019, a YouGov survey asked 1,639 adults in Great Britain the following question[^06-explore-applications-1]:\n\n[^06-explore-applications-1]: Source: [YouGov Survey 
Results](https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/x0msmggx08/YouGov%20-%20Brexit%20and%202019%20election.pdf), retrieved Oct 7, 2019.\n\n> How well or badly do you think the government are doing at handling Britain's exit from the European Union?\n>\n> - Very well\n> - Fairly well\n> - Fairly badly\n> - Very badly\n> - Don't know\n\n\n::: {.cell}\n\n:::\n\n::: {.cell}\n::: {.cell-output-display}\n![Bar plot three different ways. Plot A: Alphabetic ordering of levels, Plot B: Bars ordered in descending order of frequency, Plot C: Bars ordered in the same order as they were presented in the survey question.](06-explore-applications_files/figure-html/fig-brexit-bars-1.png){#fig-brexit-bars width=100%}\n:::\n:::\n\n\n### Make the labels as easy to read as possible\n\nThe Brexit survey results were additionally broken down by region in Great Britain.\nThe stacked bar plot allows for comparison of Brexit opinion across the five regions.\nIn Figure @fig-brexit-region the bars are vertical in Plot A and horizontal in Plot B. While the quantitative information in the two graphics is identical, flipping the graph and creating horizontal bars provides more space for the **axis labels**.\nThe easier the categories are to read, the more the reader will learn from the visualization.\nRemember, the goal is to convey as much information as possible in a succinct and clear manner.\n\n\n::: {.cell}\n::: {.cell-output-display}\n![Stacked bar plots vertically and horizontally. 
The horizontal orientation makes the region labels easier to read.](06-explore-applications_files/figure-html/fig-brexit-region-1.png){#fig-brexit-region width=100%}\n:::\n:::\n\n\n\n\n### Pick a purpose\n\nEvery graphical decision should be made with a **purpose**.\nAs previously mentioned, sticking with default options is not always best for conveying the narrative of your data story.\nStacked bar plots tell one part of a story.\nDepending on your research question, they may not tell the part of the story most important to the research.\nFigure @fig-seg-three-ways provides three different ways of representing the same information.\nIf the most important comparison across regions is proportion, you might prefer Plot A. If the most important comparison across regions also considers the total number of individuals in the region, you might prefer Plot B. If a separate bar plot for each region makes the point you'd like, use Plot C, which has been **faceted** by region.\n\n\n\n\n\nPlot C in Figure @fig-seg-three-ways also provides full titles and a succinct URL with the data source.\nOther deliberate decisions to consider include using informative labels and avoiding redundancy.\n\n\n::: {.cell}\n::: {.cell-output-display}\n![Three different representations of the two variables including survey opinion and region. Use the graphic that best conveys the data narrative at hand.](06-explore-applications_files/figure-html/fig-seg-three-ways-1.png){#fig-seg-three-ways width=90%}\n:::\n:::\n\n\n\n\n### Select meaningful colors\n\n<!-- An example with an ordinal variable with more levels would be better. 
-->\n\nOne last consideration when building graphs is the choice of colors.\nDefault or rainbow colors are not always the choice which will best distinguish the levels of your variables.\nMuch research has been done to find color combinations which are distinct and which are clear for differently sighted individuals.\nThe cividis scale works well with ordinal data [@Nunez:2018].\nFigure @fig-brexit-viridis shows the same plot with two different colorings.\n\n\n::: {.cell}\n::: {.cell-output-display}\n![Identical bar plots with two different coloring options. Plot A uses a default color scale; Plot B uses colors from the cividis scale.](06-explore-applications_files/figure-html/fig-brexit-viridis-1.png){#fig-brexit-viridis width=90%}\n:::\n:::\n\n\nIn this chapter, different representations are contrasted to demonstrate best practices in creating graphs.\nThe fundamental principle is that your graph should provide maximal information succinctly and clearly.\nLabels should be clear and oriented horizontally for the reader.\nDon't forget titles and, if possible, include the source of the data.\n\n\\clearpage\n\n## Interactive R tutorials {#explore-tutorials}\n\nNavigate the concepts you've learned in this chapter in R using the following self-paced tutorials.\nAll you need is your browser to get started!\n\n::: {.alltutorials data-latex=\"\"}\n[Tutorial 2: Exploratory data analysis](https://openintrostat.github.io/ims-tutorials/02-explore/)\\\n::: {.content-hidden unless-format=\"pdf\"} https://openintrostat.github.io/ims-tutorials/02-explore\n:::\n\n:::\n\n::: {.singletutorial data-latex=\"\"}\n[Tutorial 2 - Lesson 1: Visualizing categorical data](https://openintro.shinyapps.io/ims-02-explore-01/)\\\n::: {.content-hidden unless-format=\"pdf\"} https://openintro.shinyapps.io/ims-02-explore-01\n:::\n\n:::\n\n::: {.singletutorial data-latex=\"\"}\n[Tutorial 2 - Lesson 2: Visualizing numerical data](https://openintro.shinyapps.io/ims-02-explore-02/)\\\n::: 
{.content-hidden unless-format=\"pdf\"} https://openintro.shinyapps.io/ims-02-explore-02\n:::\n\n:::\n\n::: {.singletutorial data-latex=\"\"}\n[Tutorial 2 - Lesson 3: Summarizing with statistics](https://openintro.shinyapps.io/ims-02-explore-03/)\\\n::: {.content-hidden unless-format=\"pdf\"} https://openintro.shinyapps.io/ims-02-explore-03\n:::\n\n:::\n\n::: {.singletutorial data-latex=\"\"}\n[Tutorial 2 - Lesson 4: Case study](https://openintro.shinyapps.io/ims-02-explore-04/)\\\n::: {.content-hidden unless-format=\"pdf\"} https://openintro.shinyapps.io/ims-02-explore-04\n:::\n\n:::\n\n::: {.content-hidden unless-format=\"pdf\"}\nYou can also access the full list of tutorials supporting this book at\\\n<https://openintrostat.github.io/ims-tutorials>.\n:::\n\n::: {.content-visible when-format=\"html\"}\nYou can also access the full list of tutorials supporting this book [here](https://openintrostat.github.io/ims-tutorials).\n:::\n\n## R labs {#explore-labs}\n\nFurther apply the concepts you've learned in this part in R with computational labs that walk you through a data analysis case study.\n\n::: {.singlelab data-latex=\"\"}\n[Intro to data - Flight delays](https://www.openintro.org/go?id=ims-r-lab-intro-to-data)\\\n::: {.content-hidden unless-format=\"pdf\"} https://www.openintro.org/go?id=ims-r-lab-intro-to-data\n:::\n\n:::\n\n::: {.content-hidden unless-format=\"pdf\"}\nYou can also access the full list of labs supporting this book at\\\n<https://www.openintro.org/go?id=ims-r-labs>.\n:::\n\n::: {.content-visible when-format=\"html\"}\nYou can also access the full list of labs supporting this book [here](https://www.openintro.org/go?id=ims-r-labs).\n:::\n", | ||
"supporting": [ | ||
"06-explore-applications_files" | ||
], | ||
"filters": [ | ||
"rmarkdown/pagebreak.lua" | ||
], | ||
"includes": {}, | ||
"engineDependencies": {}, | ||
"preserve": {}, | ||
"postProcess": true | ||
} | ||
} |