Add paginators #30
Comments
Hi colleagues, it seems that I've run into the paginator problem myself :) I was trying to get all of my historical training jobs:
sm_client <- paws::sagemaker(config = list(region = 'myregion'))
total_training_jobs <- list()
j <- 1
sequence_var <- seq.POSIXt(from = as.POSIXct("2020-04-01 00:00:00"), to=as.POSIXct("2020-11-20 00:00:00"), by="hour")
for (i in sequence_var) {
  total_training_jobs[[j]] <- sm_client$list_training_jobs(MaxResults = 100, CreationTimeAfter = i)
  j <- j + 1
}
And I got a nice 400 ThrottlingException. Has anyone tried a workaround? BR |
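For context, here is a rough sketch of why the loop above is likely to be throttled: the hourly sequence yields several thousand timestamps, so the loop sends that many requests back to back. The sketch also shows a side effect of `for` in R, which drops the POSIXct class so that each `i` is a plain number. UTC is an assumption made here to avoid daylight-saving effects; the original code uses the local time zone.

```r
# Hourly timestamps over the same range as above (tz = "UTC" is an assumption).
sequence_var <- seq(
  from = as.POSIXct("2020-04-01 00:00:00", tz = "UTC"),
  to = as.POSIXct("2020-11-20 00:00:00", tz = "UTC"),
  by = "hour"
)

# Number of API calls the loop would make: one per timestamp.
n_requests <- length(sequence_var)

# A for loop iterates over the underlying numeric vector and drops the class,
# so the loop variable is seconds since the epoch, not a POSIXct value.
first_i <- NULL
for (i in sequence_var) {
  first_i <- i
  break
}
loop_value_is_posixct <- inherits(first_i, "POSIXct")

# Indexing the vector directly keeps the class intact.
first_by_index <- sequence_var[[1]]
index_value_is_posixct <- inherits(first_by_index, "POSIXct")
```

Iterating with `seq_along(sequence_var)` and indexing is one way to keep the timestamps as POSIXct, although it does not reduce the number of requests.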
Hey, sorry about that. I'll look into this over the weekend. To my knowledge, the usual approach is to delay requests by some amount. |
I put together this attempt at a paginator. You supply it with your AWS API call as the argument to parameter f.
# Get all pages of a given API call, retrying with exponential backoff.
paginate <- function(f, max_retries = 5) {
  # Capture the unevaluated call and the caller's environment so that later
  # pages can be requested with the same arguments plus NextToken.
  call <- substitute(f)
  env <- parent.frame()
  resp <- f
  result <- list(resp)
  while ("NextToken" %in% names(resp) && length(resp$NextToken) > 0 && resp$NextToken != "") {
    call$NextToken <- resp$NextToken
    # Retry with exponential backoff.
    # See https://docs.aws.amazon.com/general/latest/gr/api-retries.html.
    # See also https://github.com/paws-r/paws/blob/main/examples/error_handling.R.
    retry <- TRUE
    retries <- 0
    while (retry) {
      resp <- tryCatch(eval(call, envir = env), error = function(e) e)
      if (inherits(resp, "error")) {
        # Give up once max_retries retries have failed.
        if (retries == max_retries) stop(resp)
        wait_time <- 2^retries / 10
        Sys.sleep(wait_time)
        retries <- retries + 1
      } else {
        retry <- FALSE
      }
    }
    result <- c(result, list(resp))
  }
  return(result)
}
For an example, see below (using CloudWatch instead of SageMaker in my case). In your case, you'll need to modify the call to use a fixed creation time.
results <- paginate(
  cw$get_metric_data(
    MetricDataQueries = metric_data_queries,
    StartTime = as.POSIXct("2020-01-01"),
    EndTime = as.POSIXct("2020-11-22")
  )
) |
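To illustrate the "fixed creation time" advice for the SageMaker case, here is a minimal sketch of following NextToken by hand. It does not call AWS: `mock_list_training_jobs` is a hypothetical stand-in for `sm_client$list_training_jobs()`, and the five fake jobs and the page size of 2 are made up for the example.

```r
# Hypothetical stand-in for sm_client$list_training_jobs(): serves 5 fake jobs
# in pages of MaxResults and returns NextToken while more pages remain.
mock_list_training_jobs <- function(MaxResults = 2, CreationTimeAfter = NULL,
                                    NextToken = NULL) {
  jobs <- paste0("job-", 1:5)
  start <- if (is.null(NextToken)) 1 else as.integer(NextToken)
  end <- min(start + MaxResults - 1, length(jobs))
  resp <- list(
    TrainingJobSummaries = lapply(jobs[start:end], function(n) list(TrainingJobName = n))
  )
  if (end < length(jobs)) resp$NextToken <- as.character(end + 1)
  resp
}

# One fixed creation time; NextToken alone moves through the results.
created_after <- as.POSIXct("2020-04-01 00:00:00", tz = "UTC")

pages <- list()
token <- NULL
repeat {
  resp <- mock_list_training_jobs(
    MaxResults = 2,
    CreationTimeAfter = created_after,
    NextToken = token
  )
  pages <- c(pages, list(resp))
  token <- resp$NextToken
  # Stop when the service no longer returns a token.
  if (is.null(token) || length(token) == 0 || identical(token, "")) break
}

# Flatten the pages into one character vector of job names.
job_names <- unlist(lapply(pages, function(p) {
  vapply(p$TrainingJobSummaries, function(j) j$TrainingJobName, character(1))
}))
```

With a real client, the same loop would issue one request per page rather than one per hour of history, which should make throttling far less likely.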
Of course. How silly of me to have overlooked the NextToken approach. The solution works perfectly, @davidkretch, thanks for that! BR |
For paginators, I am toying with the idea of an apply-style family. So we have the standard paginator, which loops over every token.
library(paws.common)
s3 <- paws::s3()
out <- paginate(
  s3$list_objects_v2(
    Bucket = "my_bucket"
  )
)
Secondly, we have the apply "family" of paginators, which let users apply a function to each response from the operation. Basic example:
out <- paginate_lapply(
  s3$list_objects_v2(
    Bucket = "my_bucket"
  ),
  \(resp) {
    resp$Contents
  }
)
What are your thoughts on this? I would like your feedback before I go too far down the rabbit hole 😆 |
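As a rough illustration of the apply-style idea, the sketch below applies a function to each page as it arrives. It is not the paws.common implementation: `paginate_lapply_sketch` and `mock_list_objects` are hypothetical names, the operation is passed as a plain function rather than as a call, and the mock uses a generic NextToken field (real operations name their tokens differently, for example S3's ListObjectsV2 uses ContinuationToken and NextContinuationToken).

```r
# Hypothetical helper: `operation` is a function that accepts NextToken, and
# `fun` is applied to every page; the results are collected in a list.
paginate_lapply_sketch <- function(operation, fun) {
  out <- list()
  token <- NULL
  repeat {
    resp <- operation(NextToken = token)
    # Wrap in list() so that a NULL result from fun() is still recorded.
    out <- c(out, list(fun(resp)))
    token <- resp$NextToken
    if (is.null(token) || identical(token, "")) break
  }
  out
}

# Hypothetical mock operation: one object per page, three pages in total.
mock_list_objects <- function(NextToken = NULL) {
  keys <- c("a.csv", "b.csv", "c.csv")
  i <- if (is.null(NextToken)) 1 else as.integer(NextToken)
  resp <- list(Contents = list(list(Key = keys[i])))
  if (i < length(keys)) resp$NextToken <- as.character(i + 1)
  resp
}

# Extract the object keys from each page, then combine them.
out <- paginate_lapply_sketch(mock_list_objects, function(resp) {
  vapply(resp$Contents, function(obj) obj$Key, character(1))
})
keys <- unlist(out)
```

Applying the function per page means only the extracted fields need to be kept, rather than every full response.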
paws v0.4.0 has now been released to CRAN. I will close this ticket for now. |