Commit

Built site for gh-pages
Quarto GHA Workflow Runner committed Oct 14, 2024
1 parent 4f105d0 commit bc662c1
Showing 5 changed files with 236 additions and 234 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
9d791467
7736b6e9
2 changes: 1 addition & 1 deletion assignments/search.json
@@ -334,7 +334,7 @@
"href": "template6.html#data-preparation-and-sampling-from-the-posterior",
"title": "Notebook for Assignment 6",
"section": "2.1 Data preparation and sampling from the posterior",
"text": "2.1 Data preparation and sampling from the posterior\nData assembly happens here:\n\n# These are our observations y: the proportion of students handing in each assignment (1-8),\n# sorted by year (row-wise) and assignment (column-wise).\n# While the code suggest a matrix structure, \n# the result will actually be a vector of length N = no_years * no_assignments\npropstudents<-c(c(176, 174, 158, 135, 138, 129, 126, 123)/176,\n c(242, 212, 184, 177, 174, 172, 163, 156)/242,\n c(332, 310, 278, 258, 243, 242, 226, 224)/332,\n c(301, 269, 231, 232, 217, 208, 193, 191)/301,\n c(245, 240, 228, 217, 206, 199, 191, 182)/245)\n# These are our predictors x: for each observation, the corresponding assignment number.\nassignment <- rep(1:8, 5)\n# These are in some sense our test data: the proportion of students handing in the last assignment (9),\n# sorted by year. \n# Usually, we would not want to split our data like that and instead\n# use e.g. Leave-One-Out Cross-Validation (LOO-CV, see e.g. http://mc-stan.org/loo/index.html)\n# to evaluate model performance.\npropstudents9 = c(121/176, 153/242, 218/332, 190/301, 175/245)\n# The total number of assignments\nno_assignments = 9\n# The assignment numbers for which we want to generate predictions\nx_predictions = 1:no_assignments\n# (Cmd)Stan(R) expects the data to be passed in the below format:\nmodel_data = list(N=length(assignment),\n x=assignment,\n y=propstudents,\n no_predictions=no_assignments,\n x_predictions=x_predictions)\n\nSampling from the posterior distribution happens here:\n\n# This reads the file at the specified path and tries to compile it. 
\n# If it fails, an error is thrown.\nretention_model = cmdstan_model(\"assignment6_linear_model.stan\")\n# This \"out <- capture.output(...)\" construction suppresses output from cmdstanr\n# See also https://github.com/stan-dev/cmdstanr/issues/646\nout <- capture.output(\n # Sampling from the posterior distribution happens here:\n fit <- retention_model$sample(data=model_data, refresh=0, show_messages=FALSE)\n)\n\nDraws postprocessing happens here:\n\n# This extracts the draws from the sampling result as a data.frame.\ndraws_df = fit$draws(format=\"draws_df\")\n\n# This does some data/draws wrangling to compute the 5, 50 and 95 percentiles of \n# the mean at the specified covariate values (x_predictions). \n# It can be instructive to play around with each of the data processing steps\n# to find out what each step does, e.g. by removing parts from the back like \"|> gather(pct,y,-x)\"\n# and printing the resulting data.frame.\nmu_quantiles_df = draws_df |> \n subset_draws(variable = c(\"mu_pred\")) |> \n summarise_draws(~quantile2(.x, probs = c(0.05, .5, 0.95))) |> \n mutate(x = 1:9) |> \n pivot_longer(c(q5, q50, q95), names_to = c(\"pct\"))\n# Same as above, but for the predictions.\ny_quantiles_df = draws_df |> \n subset_draws(variable = c(\"y_pred\")) |> \n summarise_draws(~quantile2(.x, probs = c(0.05, .5, 0.95))) |> \n mutate(x = 1:9) |> \n pivot_longer(c(q5, q50, q95), names_to = c(\"pct\"))\n\nPlotting happens here:\n\nggplot() +\n # scatter plot of the training data: \n geom_point(\n aes(x, y, color=assignment), \n data=data.frame(x=assignment, y=propstudents, assignment=\"1-8\")\n) +\n # scatter plot of the test data:\n geom_point(\n aes(x, y, color=assignment), \n data=data.frame(x=no_assignments, y=propstudents9, assignment=\"9\")\n) +\n # you have to tell us what this plots:\n geom_line(aes(x,y=value,linetype=pct), data=mu_quantiles_df, color='grey', linewidth=1.5) +\n # you have to tell us what this plots:\n geom_line(aes(x,y=value,linetype=pct), 
data=y_quantiles_df, color='red') +\n # adding xticks for each assignment:\n scale_x_continuous(breaks=1:no_assignments) +\n # adding labels to the plot:\n labs(y=\"assignment submission %\", x=\"assignment number\") +\n # specifying that line types repeat:\n scale_linetype_manual(values=c(2,1,2)) +\n # Specify colours of the observations:\n scale_colour_manual(values = c(\"1-8\"=\"black\", \"9\"=\"blue\")) +\n # remove the legend for the linetypes:\n guides(linetype=\"none\")",
"text": "2.1 Data preparation and sampling from the posterior\nData assembly happens here:\n\n# These are our observations y: the proportion of students handing in each assignment (1-8),\n# sorted by year (row-wise) and assignment (column-wise).\n# While the code suggest a matrix structure, \n# the result will actually be a vector of length N = no_years * no_assignments\npropstudents<-c(c(176, 174, 158, 135, 138, 129, 126, 123)/176,\n c(242, 212, 184, 177, 174, 172, 163, 156)/242,\n c(332, 310, 278, 258, 243, 242, 226, 224)/332,\n c(301, 269, 231, 232, 217, 208, 193, 191)/301,\n c(245, 240, 228, 217, 206, 199, 191, 182)/245,\n c(264, 249, 215, 221, 215, 206, 192, 186)/264)\n# These are our predictors x: for each observation, the corresponding assignment number.\nassignment <- rep(1:8, 6)\n# These are in some sense our test data: the proportion of students handing in the last assignment (9),\n# sorted by year. \n# Usually, we would not want to split our data like that and instead\n# use e.g. Leave-One-Out Cross-Validation (LOO-CV, see e.g. http://mc-stan.org/loo/index.html)\n# to evaluate model performance.\npropstudents9 = c(121/176, 153/242, 218/332, 190/301, 175/245, 179/264)\n# The total number of assignments\nno_assignments = 9\n# The assignment numbers for which we want to generate predictions\nx_predictions = 1:no_assignments\n# (Cmd)Stan(R) expects the data to be passed in the below format:\nmodel_data = list(N=length(assignment),\n x=assignment,\n y=propstudents,\n no_predictions=no_assignments,\n x_predictions=x_predictions)\n\nSampling from the posterior distribution happens here:\n\n# This reads the file at the specified path and tries to compile it. 
\n# If it fails, an error is thrown.\nretention_model = cmdstan_model(\"assignment6_linear_model.stan\")\n# This \"out <- capture.output(...)\" construction suppresses output from cmdstanr\n# See also https://github.com/stan-dev/cmdstanr/issues/646\nout <- capture.output(\n # Sampling from the posterior distribution happens here:\n fit <- retention_model$sample(data=model_data, refresh=0, show_messages=FALSE)\n)\n\nDraws postprocessing happens here:\n\n# This extracts the draws from the sampling result as a data.frame.\ndraws_df = fit$draws(format=\"draws_df\")\n\n# This does some data/draws wrangling to compute the 5, 50 and 95 percentiles of \n# the mean at the specified covariate values (x_predictions). \n# It can be instructive to play around with each of the data processing steps\n# to find out what each step does, e.g. by removing parts from the back like \"|> gather(pct,y,-x)\"\n# and printing the resulting data.frame.\nmu_quantiles_df = draws_df |> \n subset_draws(variable = c(\"mu_pred\")) |> \n summarise_draws(~quantile2(.x, probs = c(0.05, .5, 0.95))) |> \n mutate(x = 1:9) |> \n pivot_longer(c(q5, q50, q95), names_to = c(\"pct\"))\n# Same as above, but for the predictions.\ny_quantiles_df = draws_df |> \n subset_draws(variable = c(\"y_pred\")) |> \n summarise_draws(~quantile2(.x, probs = c(0.05, .5, 0.95))) |> \n mutate(x = 1:9) |> \n pivot_longer(c(q5, q50, q95), names_to = c(\"pct\"))\n\nPlotting happens here:\n\nggplot() +\n # scatter plot of the training data: \n geom_point(\n aes(x, y, color=assignment), \n data=data.frame(x=assignment, y=propstudents, assignment=\"1-8\")\n) +\n # scatter plot of the test data:\n geom_point(\n aes(x, y, color=assignment), \n data=data.frame(x=no_assignments, y=propstudents9, assignment=\"9\")\n) +\n # you have to tell us what this plots:\n geom_line(aes(x,y=value,linetype=pct), data=mu_quantiles_df, color='grey', linewidth=1.5) +\n # you have to tell us what this plots:\n geom_line(aes(x,y=value,linetype=pct), 
data=y_quantiles_df, color='red') +\n # adding xticks for each assignment:\n scale_x_continuous(breaks=1:no_assignments) +\n # adding labels to the plot:\n labs(y=\"assignment submission %\", x=\"assignment number\") +\n # specifying that line types repeat:\n scale_linetype_manual(values=c(2,1,2)) +\n # Specify colours of the observations:\n scale_colour_manual(values = c(\"1-8\"=\"black\", \"9\"=\"blue\")) +\n # remove the legend for the linetypes:\n guides(linetype=\"none\")",
"crumbs": [
"Templates",
"Notebook for Assignment 6"
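The hunk above appends a sixth year of counts to `propstudents`, changes `rep(1:8, 5)` to `rep(1:8, 6)`, and adds `179/264` to `propstudents9`. A minimal R sketch of the same data assembly follows. It assumes the counts are held in a matrix (the name `counts` is hypothetical and not part of the template) so that the number of years is derived rather than hard-coded, and it uses the first-assignment count as the denominator, which matches the values in the diff.

```r
# Submission counts for assignments 1-8, one row per course year
# (values copied from the new side of the diff; `counts` is a hypothetical name).
counts <- rbind(
  c(176, 174, 158, 135, 138, 129, 126, 123),
  c(242, 212, 184, 177, 174, 172, 163, 156),
  c(332, 310, 278, 258, 243, 242, 226, 224),
  c(301, 269, 231, 232, 217, 208, 193, 191),
  c(245, 240, 228, 217, 206, 199, 191, 182),
  c(264, 249, 215, 221, 215, 206, 192, 186)  # row added in this commit
)

# Divide each row by its first entry (the denominator used in the template),
# then flatten row-wise so the order matches the original c(...) vector.
propstudents <- as.vector(t(counts / counts[, 1]))

# Derive the predictor from the matrix shape instead of hard-coding the year count.
assignment <- rep(seq_len(ncol(counts)), nrow(counts))

# The Stan data block expects x and y to have the same length N.
stopifnot(length(propstudents) == length(assignment))
```

Deriving `assignment` from `nrow(counts)` means a future year only needs one new row, rather than a matching edit to the `rep()` call.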
Binary file modified assignments/template3_files/figure-html/unnamed-chunk-10-2.png
