Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return query results as an iterator #37

Closed
chuckwondo opened this issue Apr 18, 2024 · 2 comments · Fixed by #89 or #91
Closed

Return query results as an iterator #37

chuckwondo opened this issue Apr 18, 2024 · 2 comments · Fixed by #89 or #91
Labels
enhancement New feature or request

Comments

@chuckwondo
Copy link
Collaborator

To give users better control over memory consumption, particularly when executing queries that produce large search results (hundreds of thousands, or even millions of granules), query results should not be obtained in their entirety and put into a list all at once before returning the results to the user. This can easily lead to "out of memory" errors for large results. Instead, an iterator (generator) should be returned so the user has the option to limit the number of results held in memory at any given moment.

Since this would be a breaking change to the existing get and get_all query methods, it might be worth defining a new method. However, given that this library has not reached a 1.0 release, making such a breaking change would not be unreasonable. Alternatively, a new method could be added while deprecating the existing ones. Either way, I recommend that returning an iterator being the only option, not an additional option, otherwise it can be confusing to users as to why there are multiple options. This also follows the Zen of Python's "There should be one-- and preferably only one --obvious way to do it." Further, returning an iterator does not prevent the user from populating a list themselves, if they wish.

There has been discussion amongst the users and maintainers of the earthaccess library about this very topic, and a number of us agreed that it makes more sense to implement this here rather than in earthaccess, which simply calls through to this library.

@frankinspace frankinspace added the enhancement New feature or request label Apr 18, 2024
@frankinspace
Copy link
Collaborator

Agreed, generator return type would be preferred implementation.

I would concur with the plan to add new functions, mark the existing ones as deprecated indicating that the list-type functions will be removed in the 1.0.0 release. I know this library is used in a few different places but I don't have a good handle on everywhere it gets used; so I think it is important to have at least 1, maybe more, minor versions with the existing functions marked as deprecated before completely removing them.

@briannapagan
Copy link
Collaborator

briannapagan commented Apr 30, 2024

We (GES DISC) are using this library for some operational scripts so would also vote for minor versions before completely removing. Was going to use today's community for earthaccess to discuss this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
3 participants