{ "metadata": { "name": "", "signature": "sha256:448d3166c863013e4d0f1da36e5e9d40fa46db19b24a5dd859045198cf29a708" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "#All models are wrong, some models are useful" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "###Different forms of bias\n", "- selection bias - picking and choosing certain parts\n", "- publication bias (file drawer problem)\n", "- censoring bias\n", "- length bias\n", "- sampling bias - not feasible to sample every one, pay a lot attention to how samples are obtained\n", "\n", "###Why sample from a population?\n", "- often the only feasible way\n", "- general meta-question: what would you do if you had all the data?\n", "- important for computational reasons\n", "\n", "###Many sampling schemes\n", "- simple random sampling - complete random sample\n", "- stratified sampling - population divided into different groups \n", "- cluster sampling - pick a random cluster and sample everyone in that cluster\n", "- snowball sampling - relevant for network, to reach hard to reach population\n", "\n", "###Absolute vs Relative\n", "- in simple random sampling, which matters more, relative or absolute sample size?\n", "- absolute matters much more than relative (example sampling for a bowl of soup vs pot)\n", "\n", "###Bias of an Estimator\n", "- bias of an estimator is how far off it is on average\n", "- why not substract off the bias?\n", "- Consider bias-variance trade-off\n", "- When model gets more complicated, you overfit, lower bias but variance increase\n", "\n", "###How to combine independent estimators for a parameter into 1 estimator?\n", "- Average those numbers (simplistic)\n", "- Giving them weights, all weights should sum to 1\n", "- How to choose weights? The higher standard error, the less weight\n", "- Weights should be inversely proportional to variance\n", "- Important to consider what weights to use" ] } ], "metadata": {} } ] }