-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.qmd
80 lines (59 loc) · 3.25 KB
/
README.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
---
title: "GitHub Actions For Scientific Data Workflows"
format: gfm
bibliography: references.bib
---
<!-- README.md is generated from README.qmd. Do not edit README.md, edit README.qmd instead! -->
<!-- badges: start -->
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![DOI](https://zenodo.org/badge/880484181.svg)](https://doi.org/10.5281/zenodo.14232509)
<!-- badges: end -->
This repository contains some example uses of GitHub Actions for automating scientific data workflows.
It is inspired heavily by [a workshop](https://githubactionstutorial-usrse24.readthedocs.io/en/latest/intro.html) given at US-RSE 2024 by [\@valentina-s](https://github.com/valentina-s), [\@gbrencher](https://github.com/gbrencher), and [\@scottyhq](https://github.com/scottyhq).
Their book and repository contain additional examples that are more python focused and maybe more advanced, while the examples in this repository are more R focused and will be used in a shorter format [workshop](https://datascience.cct.arizona.edu/events/automating-research-data-workflows-github-actions) at the University of Arizona.
## Using this repository
This is a template repository, so you can make a copy of it by clicking the green "use this template" button.
![](images/Screenshot.png)
Once you've successfully created a repository in your own GitHub username or organization, you can navigate to the "Actions" tab where you'll find workflows numbered roughly in order of complexity.
The .yaml files defining the workflows have them all set to be triggered on `workflow_dispatch` so you can run them by selecting a workflow and clicking the "Run workflow" button.
The .yaml files defining the workflows are in `.github/workflows`.
R code run by some of these workflows is in the `R/` folder.
Reports rendered by some of the actions include the source for this document, `README.qmd` , and `docs/report.qmd`.
## Example data
The example data in `data/heliconia_sample.csv` is modified from @bruna2023.
Data validation was run on this dataset using GitHub Actions in this repo: <https://github.com/BrunaLab/HeliconiaSurveys>.
### References
::: {#refs}
:::
------------------------------------------------------------------------
## Report
One example workflow involves re-rendering this README to include some updated histograms of the example dataset in `data/heliconia_sample.csv`
```{r}
#| include: false
library(ggplot2)
library(readr)
library(tidyr)
library(dplyr)
heliconia <- read_csv("data/heliconia_sample.csv")
heliconia_tidy <- heliconia |>
select(-starts_with("notes_")) |>
pivot_longer(
cols = c(-ranch, -plot, -tag_number, -row, -column),
names_sep = "_",
names_to = c("variable", "year")
) |>
pivot_wider(names_from = "variable", values_from = "value")
```
```{r}
#| echo: false
#| fig-height: 3
ggplot(heliconia_tidy, aes(shoots)) +
geom_histogram(bins = 20, na.rm = TRUE) +
labs(title = "Shoot Count")
ggplot(heliconia_tidy, aes(height)) +
geom_histogram(bins = 20, na.rm = TRUE) +
labs(title = "Height (cm)")
ggplot(heliconia_tidy, aes(infl)) +
geom_histogram(bins = 20, na.rm = TRUE) +
labs(title = "Infloresence Count")
```