---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# readfoundry
readfoundry provides a simple wrapper around Foundry to aid in the development
of reproducible analytical pipelines (RAPs) within DHSC.
readfoundry can either be used to run raw SparkSQL against the Foundry SQL
server or be used with `dbplyr` to generate the SQL for you.
## Installation
You can install the readfoundry package like so:
``` r
if (!requireNamespace("librarian", quietly = TRUE)) install.packages("librarian")
librarian::shelf(DataS-DHSC/readfoundry)
```
## Usage
To use the package you will need the path to the dataset you want to query
([Identify a dataset's RID or filepath • Palantir](https://www.palantir.com/docs/foundry/analytics-connectivity/identify-dataset-rid/)) as well as a bearer token from your Foundry session that has permission to
export this dataset ([User-generated tokens • Palantir](https://www.palantir.com/docs/foundry/platform-security-third-party/user-generated-tokens/)). This token should be
saved in your project's `.Renviron` file under the key "BEARER" (please do not
add your `.Renviron` file to version control).
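For example, the relevant line in your project's `.Renviron` file might look
like the sketch below (the token value is a placeholder):
```
# .Renviron - do not commit this file to version control
BEARER=your-foundry-token-here
```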
To then run raw SQL call:
``` r
# create connection
fapi <- readfoundry::FoundryReader$new(Sys.getenv("BEARER"))
# SparkSQL query to run
sql_query <-
paste(
"SELECT",
"*",
"FROM \"/path/to/dataset\"",
"WHERE metric_name IN ('example', 'metric', 'names')"
)
# get query results
df <- fapi$run_query(sql_query)
```
To use the `dbplyr` wrapper, start your pipe chain with a call to
`fapi$get_dataset("path/to/dataset")` and, when you have finished transforming
your data, collect the results by calling `fapi$collect()`, e.g.:
``` r
# attach dplyr for filter() and the pipe operator
library(dplyr)
# create connection
fapi <- readfoundry::FoundryReader$new(Sys.getenv("BEARER"))
# get details of metrics available
df_stats <- fapi$get_metrics("path/to/dataset")
# get transformed data
df <- fapi$get_dataset("path/to/dataset") %>%
filter(metric_name == "example_name") %>%
fapi$collect()
```
### Notes
SparkSQL uses single quotes around values, with the exception of the double
quotes required around the dataset path. Back ticks are also not supported for
variable names.
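For example, a query string that follows these rules might look like the
sketch below (the dataset path and column names are placeholders):
``` r
# single quotes around string values, double quotes around the dataset path,
# and no back ticks around column names
sql_query <- paste(
  "SELECT metric_name, value",
  "FROM \"/path/to/dataset\"",
  "WHERE metric_name = 'example_name'"
)
```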
For more information on the available SQL functions, see the [Palantir Spark SQL Reference](https://www.palantir.com/docs/foundry/transforms-sql/spark-reference/).
To change the log level of the wrapper, use:
``` r
futile.logger::flog.threshold(
futile.logger::INFO,
name = fapi$get_log_name()
)
```
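The same call works with the other `futile.logger` thresholds, for example
`futile.logger::DEBUG` for more detailed output or `futile.logger::WARN` for
less.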
## Licence
Unless stated otherwise, the codebase is released under the MIT License. This
covers both the codebase and any sample code in the documentation. The
documentation is © Crown copyright and available under the terms of the
[Open Government Licence v3.0](https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/).