FR: Support returning incomplete results after an interrupt #98

Open
krlmlr opened this issue Oct 13, 2019 · 9 comments
Labels
feature a feature request or enhancement

Comments

@krlmlr
Member

krlmlr commented Oct 13, 2019

Scenario: I'm running a query and didn't realize how many results are returned. The progress bar (#26) indicates that it's going to take too long, but I'd like to look at the partial results. I press Escape (or Ctrl + C), but the intermediate results are gone.

As discussed in #86, we could store intermediate results in a private environment and expose them with a function, e.g. gh_partial(). Catching interrupts is a bad idea.
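A minimal sketch of the private-environment idea, for illustration only. Everything here is hypothetical: `.partial_store`, `collect_pages()` and the `fetch_page` callback are stand-ins and not part of gh, and `gh_partial()` is just the name proposed above. The store is reset at the start of each call, but nothing here addresses concurrent calls.

```r
# Hypothetical sketch; none of these names exist in gh.
.partial_store <- new.env(parent = emptyenv())

# `fetch_page` stands in for one paginated request. It takes a page
# number and returns NULL once there are no more pages.
collect_pages <- function(fetch_page) {
  # Reset first, so results from an earlier call are never reported.
  .partial_store$pages <- list()
  i <- 1
  repeat {
    page <- fetch_page(i)
    if (is.null(page)) break
    # Store each page as soon as it arrives, so that an interrupt
    # (or an error) during a later request does not discard it.
    .partial_store$pages[[i]] <- page
    i <- i + 1
  }
  .partial_store$pages
}

# Accessor for whatever the last call managed to download.
gh_partial <- function() {
  .partial_store$pages
}
```

After an interrupted `collect_pages()` call, `gh_partial()` would return the pages stored so far.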

@gaborcsardi
Member

I am not convinced that this is a good idea. It seems hard to implement it in a way that ensures that the incomplete storage belongs to the last interrupted gh() call.

@gaborcsardi
Member

Catching interrupts could actually work, except that curl is buggy: jeroen/curl#216

@gaborcsardi gaborcsardi added the feature a feature request or enhancement label Jan 21, 2020
@gaborcsardi
Member

gaborcsardi commented Jan 21, 2020

Sorry, what I meant above is that returning the incomplete results is a great idea, but storing them in a private place and making sure that they belong to the last request seems hard.

E.g. what if you get an interrupt before anything was downloaded? Then the previous incomplete results are still there. What if you call gh() from async code (e.g. the async package)? Any kind of concurrency makes this hard.

Maybe we can fix the interrupt issue in curl, and then catch the interrupt. That would be pretty cool, because you could even continue the program, from the browser, with .tryResumeInterrupt().

But of course catching the interrupt is not so simple, either, because we don't want to do that if gh() is in downstream code.

@jeroen
Member

jeroen commented Jan 21, 2020

It may be easier to implement this with the async curl api, like so:

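get_data <- function(){
  # Endpoint that drips 500 bytes over 10 seconds
  url <- 'https://nghttp2.org/httpbin/drip?duration=10&numbytes=500'
  pool <- curl::new_pool()
  # In-memory buffer that collects the bytes received so far
  buf <- rawConnection(raw(0), "r+")
  # The exit handler also runs on interrupt and returns what was buffered
  on.exit({
    out <- rawConnectionValue(buf)
    close(buf)
    return(out)
  })
  # The data callback is called with each incoming chunk as a raw vector
  curl::curl_fetch_multi(url, pool = pool, data = function(x, ...){
    writeBin(x, buf)
  })
  # Perform the queued request; this is the call that gets interrupted
  curl::multi_run(pool = pool)
}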

# Interrupt this after a few sec:
get_data()

@gaborcsardi
Member

gaborcsardi commented Jan 21, 2020

Yeah, but why do the callbacks of the multi API support interrupt handlers while the easy API callbacks don't?

@jeroen
Member

jeroen commented Jan 22, 2020

I just don't really know a good way to implement this in C.

@gaborcsardi
Member

You could use an approach that is in the cleancall package, and just call R_CheckUserInterrupt() normally.

@gaborcsardi
Member

We could just return the result, with a warning, instead of storing it somewhere?
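Roughly what that could look like, as a sketch under assumptions rather than anything gh does today. `collect_with_warning()` and the `fetch_page` callback are hypothetical names, and the sketch assumes the interrupt actually reaches R as an `interrupt` condition, which is exactly what the curl issue above gets in the way of.

```r
# Hypothetical sketch; not part of gh.
# `fetch_page` stands in for one paginated request and returns NULL
# once there are no more pages.
collect_with_warning <- function(fetch_page) {
  pages <- list()
  tryCatch(
    {
      i <- 1
      repeat {
        page <- fetch_page(i)
        if (is.null(page)) break
        pages[[i]] <- page
        i <- i + 1
      }
    },
    # Ctrl + C / Escape is signalled as a condition of class "interrupt".
    interrupt = function(cnd) {
      warning(
        "Interrupted; returning ", length(pages), " incomplete page(s).",
        call. = FALSE
      )
    }
  )
  # Either the full result, or whatever was collected before the interrupt.
  pages
}
```

Nothing is stored outside the call, so there is no stale state to worry about, but the caller has to notice the warning to know that the result is incomplete.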

@hadley
Member

hadley commented Feb 7, 2023

I forgot where, but I know we've used this pattern elsewhere (i.e. on interrupt, warn and return the current values). But maybe we should only do this when executing in the global environment, to prevent some additional layer of functions from getting an incomplete result?
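One way the global-environment check might be sketched, purely as an illustration. `collect_top_level_only()`, its `caller` argument and the `fetch_page` callback are hypothetical; comparing `parent.frame()` with `globalenv()` is only one possible way to detect a top-level call.

```r
# Hypothetical sketch; not part of gh.
# `caller` defaults to the environment the function was called from.
collect_top_level_only <- function(fetch_page, caller = parent.frame()) {
  pages <- list()
  run <- function() {
    i <- 1
    repeat {
      page <- fetch_page(i)
      if (is.null(page)) break
      pages[[i]] <<- page
      i <- i + 1
    }
  }
  if (identical(caller, globalenv())) {
    # Called directly by the user: warn and return what we have.
    tryCatch(run(), interrupt = function(cnd) {
      warning("Interrupted; returning incomplete results.", call. = FALSE)
    })
  } else {
    # Called from other code: let the interrupt propagate as usual.
    run()
  }
  pages
}
```

With this split, code that wraps the function never silently receives a truncated result, because the interrupt is only caught for direct top-level calls.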


4 participants