Merge pull request #1 from NICD-UK/script-templates
Script templates
m-misiura authored Feb 9, 2023
2 parents a321581 + 965fc1b commit ea7f7f4
Showing 16 changed files with 230 additions and 12 deletions.
24 changes: 12 additions & 12 deletions README.md
@@ -9,7 +9,7 @@ pip install cookiecutter
 cookiecutter https://github.com/NICD-UK/project-template
 ```

-You will be prompted for nine inputs:
+You will be prompted for eleven inputs:

 1. Project Name
 2. Project Directory Name
@@ -19,8 +19,9 @@ You will be prompted for nine inputs:
 6. Project Sponsor Email
 7. Project Summary
 8. Raw Data Directory
-9. `venv` Project (No / Yes)
-10. `git` Project (No / Yes)
+9. Language (Python / R)
+10. `venv` Project (No / Yes)
+11. `git` Project (No / Yes)

 ## Organization

@@ -32,7 +33,6 @@ data/
 ├─ model/
 ├─ raw/
 ├─ wrangle/
-notebooks/
 reports/
 ├─ clean/
 ├─ final/
@@ -50,7 +50,7 @@ src/
 - **Determine Objectives:**
 - **Determine Deliverables:**
 - **Determine Resources:**
-- **Plan Project:**
+- **Plan Project:**

 ### 2. Data Preparation and Understanding

@@ -60,16 +60,16 @@ src/

 ### 3. Prototyping

-- **Develop Data Product**
-- **Evaluate Data Product**
-- **Approve Data Product**
+- **Develop Data Product:**
+- **Evaluate Data Product:**
+- **Approve Data Product:**

 ### 4. Production

-- **Deploy Data Product**
-- **Monitor Data Product**
-- **Maintain Data Product**
-- **Close Project**
+- **Deploy Data Product:**
+- **Monitor Data Product:**
+- **Maintain Data Product:**
+- **Close Project:**

 ## Guide

1 change: 1 addition & 0 deletions cookiecutter.json
@@ -7,6 +7,7 @@
     "project_sponsor_email": "Project Sponsor Email",
     "project_summary": "Project Summary",
     "raw_data_directory": "data/raw",
+    "language": ["Python", "R"],
     "venv_project": ["No", "Yes"],
     "git_project": ["No", "Yes"]
 }
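
The new `language` entry is a list, which cookiecutter treats as a choice variable: the prompt offers each item and the first item is the default. A minimal sketch of the values the prompts fall back to, using only the keys visible in this hunk. The `default_context` helper is hypothetical and not part of cookiecutter.

```python
import json

# Excerpt of cookiecutter.json: only the keys visible in the hunk above.
COOKIECUTTER_JSON_EXCERPT = """
{
    "raw_data_directory": "data/raw",
    "language": ["Python", "R"],
    "venv_project": ["No", "Yes"],
    "git_project": ["No", "Yes"]
}
"""


def default_context(raw_json: str) -> dict:
    """Return the value each prompt takes when every default is accepted.

    Hypothetical helper: a list is read as a choice variable whose first
    item is the default; any other value is its own default.
    """
    context = json.loads(raw_json)
    return {
        key: value[0] if isinstance(value, list) else value
        for key, value in context.items()
    }


if __name__ == "__main__":
    print(default_context(COOKIECUTTER_JSON_EXCERPT))
```

Under this reading, accepting every default produces a Python project without `venv` or `git` setup.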
14 changes: 14 additions & 0 deletions hooks/post_gen_project.py
@@ -1,7 +1,21 @@
 import subprocess
+import glob
+import os

 venv_project = "{{cookiecutter.venv_project}}"
 git_project = "{{cookiecutter.git_project}}"
+language = "{{cookiecutter.language}}"
+
+# create Python project
+if language == "Python":
+    os.remove("{{cookiecutter.project_directory_name}}.Rproj")
+    for file in glob.glob("**/*.Rmd", recursive=True):
+        os.remove(file)
+
+# create R project
+if language == "R":
+    for file in glob.glob("**/*.py", recursive=True):
+        os.remove(file)

 # create venv project
 if venv_project == "Yes":
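
The language branch of the hook can be sketched as a standalone function, which makes the pruning rule easier to exercise outside cookiecutter. This is an illustration under assumptions, not the hook itself: `prune_language_files` and its `project_dir` argument are hypothetical names, it uses `pathlib` rather than the `glob` and `os` calls in the commit, and it removes any `.Rproj` file at the project root rather than the single named file.

```python
import tempfile
from pathlib import Path
from typing import List


def prune_language_files(project_dir: Path, language: str) -> List[Path]:
    """Delete template files for the language that was not chosen.

    Hypothetical helper mirroring the hook: a Python project drops the
    root-level .Rproj file and every .Rmd file; an R project drops every
    .py file. Returns the removed paths, sorted.
    """
    if language == "Python":
        doomed = list(project_dir.glob("*.Rproj")) + list(project_dir.rglob("*.Rmd"))
    elif language == "R":
        doomed = list(project_dir.rglob("*.py"))
    else:
        raise ValueError(f"unsupported language: {language!r}")
    for path in doomed:
        path.unlink()
    return sorted(doomed)


if __name__ == "__main__":
    # Build a throwaway project layout and prune it as a Python project.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "src" / "clean").mkdir(parents=True)
        for name in ("demo.Rproj", "src/clean/clean.Rmd", "src/clean/clean.py"):
            (root / name).touch()
        removed = prune_language_files(root, "Python")
        print([p.relative_to(root).as_posix() for p in removed])
```

Taking the project directory as an argument, instead of relying on the working directory as the hook does, lets the rule run against a temporary directory.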
Empty file.
16 changes: 16 additions & 0 deletions {{cookiecutter.project_directory_name}}/reports/clean/clean.Rmd
@@ -0,0 +1,16 @@
# Load Libraries
```{r message=FALSE}
library(glue)
library(here)
library(tidyverse)
```

# Setup
```{r}
data_name <- "<data-name>"
```

# Read Data
```{r}
clean_data <- read_rds(here("data", "clean", glue("{data_name}.rds")))
```
10 changes: 10 additions & 0 deletions {{cookiecutter.project_directory_name}}/reports/clean/clean.py
@@ -0,0 +1,10 @@
#%% Load Libraries
import pandas
from pyprojroot import here
import os

#%% Setup
data_name = "<data-name>"

#%% Read Data
clean_data = pandas.read_pickle(os.path.join(here(), "data", "clean", f"{data_name}.pkl"))
@@ -0,0 +1,16 @@
# Load Libraries
```{r message=FALSE}
library(glue)
library(here)
library(tidyverse)
```

# Setup
```{r}
data_name <- "<data-name>"
```

# Read Data
```{r}
wrangle_data <- read_rds(here("data", "wrangle", glue("{data_name}.rds")))
```
10 changes: 10 additions & 0 deletions {{cookiecutter.project_directory_name}}/reports/wrangle/wrangle.py
@@ -0,0 +1,10 @@
#%% Load Libraries
import pandas
from pyprojroot import here
import os

#%% Setup
data_name = "<data-name>"

#%% Read Data
wrangle_data = pandas.read_pickle(os.path.join(here(), "data", "wrangle", f"{data_name}.pkl"))
28 changes: 28 additions & 0 deletions {{cookiecutter.project_directory_name}}/src/README.md
@@ -0,0 +1,28 @@
# Transformation Checklist

## Motivation

## Cleaning Checklist

For each data source:

- [ ] read data from `/data/raw/`
- [ ] ...
- [ ] write data to `/data/clean/`

## Wrangling Checklist

For each data product:

- [ ] read data from `/data/clean/`
- [ ] ...
- [ ] write data to `/data/wrangle/`

## Processing

For models:





26 changes: 26 additions & 0 deletions {{cookiecutter.project_directory_name}}/src/clean/clean.Rmd
@@ -0,0 +1,26 @@
# Load Libraries
```{r message=FALSE}
library(glue)
library(here)
library(tidyverse)
```

# Setup
```{r}
data_name <- "<data-name>"
```

# Read Data
```{r}
raw_data <- read_csv(here("data", "raw", glue("{data_name}.csv")))
```

# Clean Data
```{r}
clean_data <- raw_data
```

# Write Data
```{r}
write_rds(clean_data, here("data", "clean", glue("{data_name}.rds")))
```
16 changes: 16 additions & 0 deletions {{cookiecutter.project_directory_name}}/src/clean/clean.py
@@ -0,0 +1,16 @@
#%% Load Libraries
import pandas
from pyprojroot import here
import os

#%% Setup
data_name = "<data-name>"

#%% Read Data
raw_data = pandas.read_csv(os.path.join(here(), "data", "raw", f"{data_name}.csv"))

#%% Clean Data
clean_data = raw_data

#%% Write Data
clean_data.to_pickle(os.path.join(here(), "data", "clean", f"{data_name}.pkl"))
16 changes: 16 additions & 0 deletions {{cookiecutter.project_directory_name}}/src/model/model.Rmd
@@ -0,0 +1,16 @@
# Load Libraries
```{r message=FALSE}
library(glue)
library(here)
library(tidyverse)
```

# Setup
```{r}
data_name <- "<data-name>"
```

# Read Data
```{r}
wrangle_data <- read_rds(here("data", "wrangle", glue("{data_name}.rds")))
```
10 changes: 10 additions & 0 deletions {{cookiecutter.project_directory_name}}/src/model/model.py
@@ -0,0 +1,10 @@
#%% Load Libraries
import pandas
from pyprojroot import here
import os

#%% Setup
data_name = "<data-name>"

#%% Read Data
wrangle_data = pandas.read_pickle(os.path.join(here(), "data", "wrangle", f"{data_name}.pkl"))
26 changes: 26 additions & 0 deletions {{cookiecutter.project_directory_name}}/src/wrangle/wrangle.Rmd
@@ -0,0 +1,26 @@
# Load Libraries
```{r message=FALSE}
library(glue)
library(here)
library(tidyverse)
```

# Setup
```{r}
data_name <- "<data-name>"
```

# Read Data
```{r}
clean_data <- read_rds(here("data", "clean", glue("{data_name}.rds")))
```

# Wrangle Data
```{r}
wrangle_data <- clean_data
```

# Write Data
```{r}
write_rds(wrangle_data, here("data", "wrangle", glue("{data_name}.rds")))
```
16 changes: 16 additions & 0 deletions {{cookiecutter.project_directory_name}}/src/wrangle/wrangle.py
@@ -0,0 +1,16 @@
#%% Load Libraries
import pandas
from pyprojroot import here
import os

#%% Setup
data_name = "<data-name>"

#%% Read Data
clean_data = pandas.read_pickle(os.path.join(here(), "data", "clean", f"{data_name}.pkl"))

#%% Wrangle Data
wrangle_data = clean_data

#%% Write Data
wrangle_data.to_pickle(os.path.join(here(), "data", "wrangle", f"{data_name}.pkl"))
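
The Python script templates above share one path convention: `data/<stage>/<data_name>`, with a `.csv` suffix for raw data and `.pkl` for the clean and wrangle outputs. A small sketch of that convention, assuming a hypothetical `stage_path` helper that is not part of the template:

```python
from pathlib import Path

# File suffix per pipeline stage, as used by the Python templates:
# raw data is read from CSV, clean and wrangle outputs are pickles.
STAGE_SUFFIX = {"raw": ".csv", "clean": ".pkl", "wrangle": ".pkl"}


def stage_path(project_root: Path, stage: str, data_name: str) -> Path:
    """Build the data file path for one stage (hypothetical helper)."""
    if stage not in STAGE_SUFFIX:
        raise ValueError(f"unknown stage: {stage!r}")
    return project_root / "data" / stage / f"{data_name}{STAGE_SUFFIX[stage]}"


if __name__ == "__main__":
    root = Path("my-project")  # placeholder project root
    for stage in ("raw", "clean", "wrangle"):
        print(stage_path(root, stage, "<data-name>").as_posix())
```

Keeping the suffix table in one place would let each script name only its input and output stage instead of repeating the `os.path.join` call.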
@@ -0,0 +1,13 @@
Version: 1.0

RestoreWorkspace: No
SaveWorkspace: No
AlwaysSaveHistory: No

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8

RnwWeave: Sweave
LaTeX: pdfLaTeX
