01-how-i-got-into-Bayes.Rpres

Bayesian Regression Models with RStanARM
========================================================
author: TJ Mahr
date: Sept. 21, 2016
autosize: true
incremental: false

Madison R Users Group

<small>
[Github repository](https://github.com/tjmahr/MadR_RStanARM)
<br/>
[@tjmahr](https://twitter.com/tjmahr)
</small>


Overview
===============================================================================

- **[How I got into Bayesian statistics](http://rpubs.com/tjmahr/rep-crisis)**
- [Some intuition-building about Bayes theorem](http://rpubs.com/tjmahr/bayes-theorem)
- [Tour of RStanARM](http://rpubs.com/tjmahr/rstanarm-tour)
- [Where to learn more about Bayesian statistics](http://rpubs.com/tjmahr/bayes-learn-more)


August 2015: The "Crisis" in Psychology
===============================================================================
left: 50%

[Open Science Collaboration (2015)][osf2015] tries to replicate 100 studies
published in 3 psychology different journals in 2008.

![Scatter plot of original vs replicated effect sizes]
(./assets/reproducibility.PNG)

***

- Boil a study down to 1 test statistic and 1 effect size.
  - Compare replication's test and effect size against original.
- Approximately 36% of the studies are replicated (same test statistic.
- On average, effect sizes in replications are half that of the original
  studies.

[osf2015]: http://science.sciencemag.org/content/349/6251/aac4716


Reactions
===============================================================================

We're doomed.

![You sit on a throne of lies!](./assets/throne-of-lies.gif)

Most findings are probably false.

***

No, this is business as usual.

![Busy body penguin with a hat and briefcase](./assets/business-as-usual.gif)

Any credible discipline has to do this kind of house-cleaning from time to time.


Lots of hand-wringing and soul-searching
===============================================================================

Some reactionary:

- Replication creates an [industry for incompetent hacks][replication-hacks].
- Here come the [methodological terrorists][method-terrorists]!

![dumpster fire dot gif](./assets/dumpster-fire.gif)

***

Some constructive:

- [Everything is f'ed][everything-fucked] -- so what else is new?
- Increased rigor and openness are [a good thing][repeat-after-me].

[replication-hacks]: http://www.sciencedirect.com/science/article/pii/S002210311600007X
[method-terrorists]: http://andrewgelman.com/2016/09/21/what-has-happened-down-here-is-the-winds-have-changed/
[everything-fucked]: https://hardsci.wordpress.com/2016/08/11/everything-is-fucked-the-syllabus/
[repeat-after-me]: https://thenib.com/repeat-after-me


I'm in the business-as-usual camp, by the way
===============================================================================

Better research practices are catching on.

- Pre-registration
- Power analyses and [other foresight][stats-autopsy].
- Meta-analytic tools to assess the health of a field
- Better disclosure of other unanalyzed measurements

[stats-autopsy]: http://www.johndcook.com/blog/2010/05/13/statistical-autopsy/


Crisis made me think more of questionable data analysis practices
===============================================================================

Basically, unintentional acts and rituals to appease Statistical Significance
gods.

- HARKing (hypothesizing after results are known)
  - Telling a story to fit the data.
- Garden of forking data
  - Conducting countless sub-tests and sub-analyses on the data
- p-hacking
  - Doing these tests in order to find a significant effect.
- Selective reporting
  - Reporting only the tests that yielded a significant result.


My response to the crisis
===============================================================================

I want to avoid these errors.

- I want to level up my stats and explore new techniques.
  - Maybe more robust estimation techniques
  - Maybe machine learning techniques to complement conventional analyses.
- I want something less finicky than statistical significance.
  - p-values don't mean what many people think they mean.
  - Neither do confidence intervals.
  - Statistical significance is not related to practical significance.


December 2015
===============================================================================

- I started reading the Gelman and Hill book.
  - This is the book for [the `arm` package][arm].
- It emphasizes estimation, uncertainty and simulation.
- Midway through, the book pivots to Bayesian estimation.
- I roll with it.

***

![Cover of Data Analysis USing Regression and Multilevel/Hierarchical Models](./assets/arm.jpeg)

[arm]: https://cran.rstudio.com/web/packages/arm/


Using ARM
===============================================================================
title: false

I try to download the example BUGS scripts, but they are no longer supported.

Instead, I find [a page with the title][use-stan]:

> ### Use Stan instead.

<br/>
Okay, I'll give it try...

[use-stan]: http://www.stat.columbia.edu/~gelman/bugsR/


January 2016
===============================================================================

I'm down a rabbit hole, writing Stan models to fit the models from the ARM
book, and there is an influx of Bayesian tools for R.

- [Statistical Rethinking][rethinking], a book that reteaches regression from a
  Bayesian perspective with R and Stan, is released.
- New version of [brms][brms] is released. This package converts R model code
  into Stan programs.
- [RStanARM][rstanarm] is released.
- This blog post circulates: ["R Users Will Now Inevitably Become Bayesians"]
  [we-r-bayesians].

I eat all this up. I become a convert.

[rethinking]: http://xcelab.net/rm/statistical-rethinking/
[brms]: https://cran.r-project.org/web/packages/brms/index.html
[rstanarm]: https://cran.r-project.org/web/packages/rstanarm/index.html
[we-r-bayesians]: https://thinkinator.com/2016/01/12/r-users-will-now-inevitably-become-bayesians/


Long story short
===============================================================================

The replication crisis sparked my curiosity, and a wave of new tools and
resources made it really easy to get started with Bayesian stats.

No formal coursework. I learned about the math/statistics from playing with
code.

My goal with these toosl has been to make better, more honest scientific
summaries of observed data.


Next
===============================================================================

- [How I got into Bayesian statistics](http://rpubs.com/tjmahr/rep-crisis)
- **[Some intuition-building about Bayes theorem](http://rpubs.com/tjmahr/bayes-theorem)**
- [Tour of RStanARM](http://rpubs.com/tjmahr/rstanarm-tour)
- [Where to learn more about Bayesian statistics](http://rpubs.com/tjmahr/bayes-learn-more)