bucket.getFiles should document (at the top) the `on()` way of iterating. #673

jgeewax · 2015-06-17T23:26:30Z

Right now, in http://googlecloudplatform.github.io/gcloud-node/#/docs/v0.15.0/storage/bucket?method=getFiles

bucket.getFiles(function(err, files, nextQuery, apiResponse) {
  if (nextQuery) {
    // nextQuery will be non-null if there are more results.
    bucket.getFiles(nextQuery, function(err, files, nextQ, apiResponse) {});
  }
});

which is ... absurdly confusing -- I just want to iterate through all the files.

Can we change the examples to document the way to iterate through all files, and abort the traversal early if necessary?

bucket.getFiles().on('data', function(file) {
  console.log(file);
  if (some condition) {
    this.end();
  }
});

jgeewax · 2015-06-17T23:26:58Z

/cc @stephenplusplus : Kicking to you. I'm watching someone struggle with this right now.... and the recursive traversal stuff is confusing the hell out of them.

stephenplusplus · 2015-06-17T23:31:14Z

Separate and name the function to make it easier. We can document how to use streams, but it really isn't the solution. Streams are best when you indend to consume and pass-through all of the data to a connecting stream. It would be best to have a "recurse for me, then give me all of the results in my callback that only gets called once" option and avoid the concept of terminating early. The user always has pageSize and maxResults to avoid getting ridiculous size result sets.

stephenplusplus · 2015-06-17T23:34:05Z

I'll put a better proposal together tomorrow to demo what I'm thinking.

jgeewax · 2015-06-17T23:44:17Z

OK -- just the whole recursive calling next query... is confusing. People want to say "call this method again on the next page"...

stephenplusplus · 2015-06-18T20:09:08Z

just the whole recursive calling next query... is confusing.

Yeah, it is. But it's the most unsurprising implementation. I don't think it would be less confusing to tell the user "your callback is going to be called an unknown amount of times, and each time the result set will be of a different length." I would rather use the callback once after we have all of the results consumed, and not even let the user know the nextQuery/API-determined pagination exists.

As far as the streams go, I added support for the premature ending of a stream as part of the search PR: stephenplusplus@abca6f1 -- after that lands, I'll start to use stream-router throughout the rest of the library.

The only thing about using streams to solve this problem is, it's kind of a weird use of them. The .on('data') API is nicer to deal with, but it's not typically where you see streams being used. A better use case for streams could be something like getting a bunch of data, filtering it down to a set you care about, parsing the metadata out of them and displaying the results in a terminal (bucket.getFiles().pipe(filterFilesOverTenMb).pipe(parseMetadata).pipe(process.stdout)). Streams usually aren't about getting a large set of results, then closing the pipe once you've found the result you're looking for.

Still though, supporting pre-mature closing is probably a safe plan, since the user has ultimate say over how they want to do things.

I still haven't thought of a great API to improve how nextQuery paginating works currently (other than removing it completely), but it's on my mind 😵

jgeewax · 2015-06-18T22:22:00Z

I don't think it would be less confusing to tell the user "your callback is going to be called an unknown amount of times, and each time the result set will be of a different length."

But it's ... already being called an unknown number of times with the recursive way, right?

I would rather use the callback once after we have all of the results consumed, and not event let the user know the nextQuery/API-determined pagination exists.

Hmm... I really appreciated the idea of being able to say "when I've gotten all the items in this chunk, go go again"... this.nextPage() would have solved that problem....

To put it in comparison to Python, we have an Iterator that basically lets you do:

for item in bucket.list_objects():
  print item
  if item.name == 'whatever':
    break

Which is paginating stuff behind the scenes. This is (I think) the .on('data', ...) syntax which... I'm OK with .

stephenplusplus · 2015-06-18T22:31:16Z

But it's ... already being called an unknown number of times with the recursive way, right?

It depends what we're talking about. Going off of "People want to say "call this method again on the next page"...", I thought you meant you wanted getFiles to recurse itself, and execute that callback with each result set returned from the API.

As far as "nextQuery()" vs "nextQuery" being an object, I fought my case in that other issue. I think it's better to give an object to the user, which I think is where we landed. That, plus documenting how to use the streams to early-exit. There will be a big sweep of all the dual methods soon after Search lands. I have to go through and implement the new stream-router, which will be a great time to catch the docs up.

This is (I think) the .on('data', ...) syntax which... I'm OK with .

Yeah, it will work this way.

jgeewax · 2015-06-18T22:37:22Z

I thought you meant you wanted getFiles to recurse itself

I think I did mean that, but if it's not acceptable... then being able to say that "we'll run this function for each page until you tell us to stop (or we've gone through all the items)" is fine too...

Bottom line though is... the whole nextQuery thing, and having to re-establish the callback again, or define it elsewhere and pass it in that calls the next query with a reference to itself and all that stuff is... not the way we should be telling users "here's how you page through all the data". It's hugely confusing...

stephenplusplus · 2015-06-18T22:59:07Z

I'd like to find a higher level solution for the nextQuery process as well. We can change nextQuery the object into a continueProcessing() function. Now I remember (I think) that is where we left off in our last discussion.

jgeewax · 2015-06-18T22:59:49Z

Or something like nextQuery.runAgain(); or something of that nature...?

stephenplusplus · 2015-06-18T23:02:37Z

Something like that might work. I'll be thinking on it 💭

stephenplusplus · 2015-06-25T17:15:38Z

Just an update: merged #648 so I can get this going.

* fix(docs): describe fallback rest option Use gapic-generator-typescript v2.15.1. PiperOrigin-RevId: 456946341 Source-Link: googleapis/googleapis@88fd18d Source-Link: googleapis/googleapis-gen@accfa37 Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiYWNjZmEzNzFmNjY3NDM5MzEzMzM1YzY0MDQyYjA2M2MxYzUzMTAyZSJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>

🤖 I have created a release *beep* *boop* --- ## [5.0.1](googleapis/nodejs-language@v5.0.0...v5.0.1) (2022-06-30) ### Bug Fixes * **docs:** describe fallback rest option ([#673](googleapis/nodejs-language#673)) ([6767804](googleapis/nodejs-language@6767804)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

* fix(docs): describe fallback rest option Use gapic-generator-typescript v2.15.1. PiperOrigin-RevId: 456946341 Source-Link: googleapis/googleapis@88fd18d Source-Link: googleapis/googleapis-gen@accfa37 Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiYWNjZmEzNzFmNjY3NDM5MzEzMzM1YzY0MDQyYjA2M2MxYzUzMTAyZSJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>

🤖 I have created a release *beep* *boop* --- ## [5.0.1](googleapis/nodejs-language@v5.0.0...v5.0.1) (2022-06-30) ### Bug Fixes * **docs:** describe fallback rest option ([#673](googleapis/nodejs-language#673)) ([6767804](googleapis/nodejs-language@6767804)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>

samples: pull in latest typeless bot, clean up some comments Source-Link: https://github.com/googleapis/synthtool/commit/0a68e568b6911b60bb6fd452eba4848b176031d8 Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-nodejs:latest@sha256:5b05f26103855c3a15433141389c478d1d3fe088fb5d4e3217c4793f6b3f245e

* chore: fix release_level to reflect current status * 🦉 Updates from OwlBot See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * 🦉 Updates from OwlBot See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com> Co-authored-by: Benjamin E. Coe <bencoe@google.com>

jgeewax added enhancement api: storage Issues related to the Cloud Storage API. labels Jun 17, 2015

jgeewax assigned stephenplusplus Jun 17, 2015

stephenplusplus mentioned this issue Jun 25, 2015

streamrouter: implement across library #692

Merged

stephenplusplus closed this as completed in #692 Jul 15, 2015

sofisl mentioned this issue Oct 11, 2022

migrate code from googleapis/nodejs-language #3395

Merged

4 tasks

release-please bot mentioned this issue Oct 11, 2022

chore: release main #3397

Closed

sofisl mentioned this issue Nov 10, 2022

migrate code from googleapis/nodejs-translate #3520

Merged

4 tasks

sofisl pushed a commit that referenced this issue Nov 10, 2022

chore: release 6.2.2 (#673)

12b6b71

Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>

sofisl mentioned this issue Nov 10, 2022

migrate code from googleapis/nodejs-tasks #3530

Merged

4 tasks

sofisl pushed a commit that referenced this issue Nov 10, 2022

fix(deps): use google-gax v3.5.2 (#673)

75cfc2a

sofisl mentioned this issue Nov 11, 2022

migrate code from googleapis/nodejs-asset #3567

Merged

4 tasks

release-please bot mentioned this issue Nov 12, 2022

chore: release main #3529

Merged

sofisl mentioned this issue Nov 17, 2022

migrate code from googleapis/nodejs-compute #3666

Merged

4 tasks

sofisl mentioned this issue Jan 10, 2023

migrate code from googleapis/nodejs-dlp #3837

Merged

4 tasks

sofisl pushed a commit that referenced this issue Jan 10, 2023

docs(node): support "stable"/"preview" release level (#1312) (#673)

bc718b6

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bucket.getFiles should document (at the top) the `on()` way of iterating. #673

bucket.getFiles should document (at the top) the `on()` way of iterating. #673

jgeewax commented Jun 17, 2015

jgeewax commented Jun 17, 2015

stephenplusplus commented Jun 17, 2015

stephenplusplus commented Jun 17, 2015

jgeewax commented Jun 17, 2015

stephenplusplus commented Jun 18, 2015

jgeewax commented Jun 18, 2015

stephenplusplus commented Jun 18, 2015

jgeewax commented Jun 18, 2015

stephenplusplus commented Jun 18, 2015

jgeewax commented Jun 18, 2015

stephenplusplus commented Jun 18, 2015

stephenplusplus commented Jun 25, 2015

bucket.getFiles should document (at the top) the on() way of iterating. #673

bucket.getFiles should document (at the top) the on() way of iterating. #673

Comments

jgeewax commented Jun 17, 2015

jgeewax commented Jun 17, 2015

stephenplusplus commented Jun 17, 2015

stephenplusplus commented Jun 17, 2015

jgeewax commented Jun 17, 2015

stephenplusplus commented Jun 18, 2015

jgeewax commented Jun 18, 2015

stephenplusplus commented Jun 18, 2015

jgeewax commented Jun 18, 2015

stephenplusplus commented Jun 18, 2015

jgeewax commented Jun 18, 2015

stephenplusplus commented Jun 18, 2015

stephenplusplus commented Jun 25, 2015

bucket.getFiles should document (at the top) the `on()` way of iterating. #673

bucket.getFiles should document (at the top) the `on()` way of iterating. #673