Add new functions #21

Merged
merged 18 commits into from
May 24, 2024

Conversation

Allaht2 (Contributor) commented May 22, 2024

Add functions SotkanetInteractive, SotkanetCleanCache and get_sotkanet. Also add helper functions sotkanet_read_cache, sotkanet_write_cache and write_frictionless_metadata.

@Allaht2 Allaht2 requested a review from pitkant May 22, 2024 07:44
#' @param cache_dir a path to cache directory.
#' @param query_hash a character used to identify the data.frame.
#'
#' @references See citation("sotkanet")

Member

@references tag should be used for the following purpose: "Use @references point to published material about the package that users might find helpful." Source: https://mpn.metworx.com/packages/roxygen2/7.1.0/articles/rd.html

Putting the package citation in References is also done in the eurostat package, so we should look into whether this is a sensible thing to do.

capture.output(print(
paste0("GetDataSotkanet(indicators = ", search_id,
", years = ", years[1], ":", years[length(years)],
", genders = ", gender_selection, ", regions = ",

Member

gender_selection needs to be wrapped in single or double quotes. I got the following error when trying to re-run retrieval code printed here:

GetDataSotkanet(indicators = 10027, years = 2000:2015, genders = total, regions = NULL, region.category = NULL)
Error: object 'total' not found
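
One way to avoid this class of error is to build the printed call with deparse(), which renders each value as R source code, so character values come out quoted. A minimal sketch, assuming a hypothetical helper build_call_string() that is not part of the package:

```r
# Hypothetical helper (not part of sotkanet): render arguments as R source code.
build_call_string <- function(fun_name, ...) {
  args <- list(...)
  # deparse() turns each value into the code that would recreate it,
  # so "total" becomes "\"total\"" and NULL becomes "NULL"
  rendered <- vapply(
    args,
    function(a) paste(deparse(a), collapse = ""),
    character(1)
  )
  paste0(fun_name, "(", paste0(names(args), " = ", rendered, collapse = ", "), ")")
}

call_string <- build_call_string(
  "GetDataSotkanet",
  indicators = 10027,
  years = 2000:2015,
  genders = "total",
  regions = NULL,
  region.category = NULL
)

# writeLines() emits the raw string, without print()'s index prefix or escapes
writeLines(call_string)
```

The same approach would cover regions and region.category, since any character input is quoted by deparse() regardless of which argument it belongs to.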

Member

The function works nicely but there are still some hiccups:

  1. If I input region.category (and probably regions) without quotes, the function accepts it as valid, but the printed download code then lacks quotes around the value:
Enter the regions (empty for default): 
Enter the region.category (empty for default): POHJOISMAAT
[...]
#### DOWNLOAD PARAMETERS: 

[1] "get_sotkanet(indicators = 10027, years = 2000:2005, genders = c('male', 'female'), regions = NULL, region.category = POHJOISMAAT, lang = 'fi')"

This results in R searching for a variable called POHJOISMAAT, not string "POHJOISMAAT":

get_sotkanet(indicators = 10027, years = 2000:2005, genders = c('male', 'female'), regions = NULL, region.category = POHJOISMAAT, lang = 'fi')
Error: object 'POHJOISMAAT' not found

So I guess it's more about fine-tuning the example download string.

  2. I get this weird tooltip when I input data manually:
[screenshot: tooltip]

I wonder what causes it? It's obviously useless / misleading. Maybe it's from RStudio and not R?

Member

Then there's the hassle with selecting regions:

Enter the regions (empty for default): c("Suomi", "Ruotsi")
Enter the region.category (empty for default): 
Input for regions not found from dataset: c("Suomi", "Ruotsi") 
 Please check your parameter input for validity and correctness.

In the code for downloading data, this input gets mangled by extra escape characters:

"get_sotkanet(indicators = 10027, years = 2000:2005, genders = c('male', 'female'), regions = c(\"Suomi\", \"Ruotsi\"), region.category = NULL, lang = 'fi')"
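
The backslashes are most likely an artifact of print(), which shows a character string in its escaped form; the string itself contains plain double quotes. A short illustration, using a shortened stand-in for the generated call string:

```r
# Shortened stand-in for the generated download code
code <- "get_sotkanet(regions = c(\"Suomi\", \"Ruotsi\"))"

print(code)       # escaped form: quotes shown as \" plus an index prefix
writeLines(code)  # raw characters, ready to copy and paste

# capture.output() makes the difference between the two visible
printed <- capture.output(print(code))
written <- capture.output(writeLines(code))
```

Emitting the download code with writeLines() or cat() instead of print() would therefore leave the quotes unescaped.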

From the API documentation we see that we should be using region codes anyway for selecting regions, not names:

region
^\d+$
Alueen tunnus, jota käytetään SOTKAnetin ja sen APIn kutsuissa

For example, Sweden is 1044 and Finland is 1045. It might be a good idea to give the end user a hint on how to input multiple values: wrapped in c(), separated by commas...?

Enter the regions (empty for default): 1044, 1045
Enter the region.category (empty for default): 
Input for regions not found from dataset: 1044, 1045 
 Please check your parameter input for validity and correctness.
#### DOWNLOAD PARAMETERS: 

[1] "get_sotkanet(indicators = 10027, years = 2000:2005, genders = c('male', 'female'), regions = 1044, 1045, region.category = NULL, lang = 'fi')"

Maybe the ideal solution would be to read all available options for each dataset from the metadata and print them in the language that the user chose in the first step. Then the user could select one or more regions without having to care about Sotkanet region codes.
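
As a simpler interim step, comma-separated region codes such as the "1044, 1045" input above could be parsed into an integer vector before validation. A minimal sketch, assuming a hypothetical helper parse_region_codes() that is not part of the package:

```r
# Hypothetical helper (not part of sotkanet): turn "1044, 1045" into c(1044L, 1045L).
parse_region_codes <- function(input) {
  input <- trimws(input)
  if (!nzchar(input)) {
    return(NULL)  # empty input means "use the default"
  }
  parts <- trimws(strsplit(input, ",", fixed = TRUE)[[1]])
  # The API documentation quoted above describes region ids as digits only
  bad <- parts[!grepl("^[0-9]+$", parts)]
  if (length(bad) > 0) {
    stop("Not a numeric region code: ", paste(bad, collapse = ", "))
  }
  as.integer(parts)
}

parse_region_codes("1044, 1045")  # integer vector with both codes
parse_region_codes("")            # NULL, i.e. the default
```

Rejecting non-numeric input early gives the user a clearer message than the generic "not found from dataset" warning.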

Member

Another option would be to remove the regions and region.category selection altogether, since the filtering is done locally anyway.

pitkant (Member) commented May 23, 2024

frictionless:::print.datapackage function (print method) produces an error when saving a data package object, specifically this line:

if (startsWith(x$id %||% "", "http")) {
    cli::cat_line(cli::format_inline("For more information, see {.url {x$id}}."))
  }

Tested with this:

x <- get_sotkanet(10027, frictionless = TRUE)
> x
A Data Package with 1 resource:
• sotkanet
Error in startsWith(x$id %||% "", "http") : non-character object(s)

I'm not entirely sure what causes this. Is it the id-10027 (x$`id-10027`) object?

Allaht2 (Contributor, Author) commented May 24, 2024

The issue was that frictionless:::print.datapackage() looks for the id prefix in the package metadata and then checks whether it is a link, and this check was failing. I changed the prefix from id to sotkanet, which fixed the problem.
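
A possible explanation for why x$id returned a non-character object: for lists, R's $ operator falls back to partial name matching when there is no exact match, so x$id can pick up an element whose name merely starts with id. A minimal sketch with a plain list standing in for the data package (the element name id-10027 is taken from the question above; its content here is made up):

```r
# Plain list standing in for the data package object
pkg <- list(
  resources = list("sotkanet"),
  `id-10027` = list(title = "placeholder")  # made-up content
)

pkg$id       # partial match: returns the `id-10027` element (a list)
pkg[["id"]]  # exact matching only: NULL

# Same situation as the failing check in the print method
res <- try(startsWith(pkg$id, "http"), silent = TRUE)
inherits(res, "try-error")

# With a different prefix, nothing partially matches "id" any more
pkg2 <- list(
  resources = list("sotkanet"),
  `sotkanet-10027` = list(title = "placeholder")
)
is.null(pkg2$id)
```

If this is indeed the mechanism, it would explain why renaming the prefix was enough to make the print method work again.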

This was linked to issues on May 24, 2024
pitkant merged commit 970550a into v0.10-dev on May 24, 2024
Issues this pull request may close: sotkanet_interactive(), Metadata handling, Data caching