
What if non-ResultSet SELECT execute() returned an error when maxRows was too small? #146

Closed
cjbj opened this issue Jul 22, 2015 · 24 comments

@cjbj
Member

cjbj commented Jul 22, 2015

Will you be impacted if there is a change in SELECT execute() behavior
for non-Result Sets when the number of query rows is larger than maxRows
(https://github.com/oracle/node-oracledb/blob/master/doc/api.md#propdbmaxrows)?

@dmcghan had an example of a 98-row table where a (non-Result Set) query
with maxRows of 100 currently returns all rows. If the table grows to
102 rows, the query will return 100 rows and the application will
continue happily. The application won't know there are 'missing' rows.

Now that node-oracledb has Result Sets, it is easier to use them for queries
where the number of rows isn't known in advance, so you don't need to
use huge, speculative maxRows values. And now that we have pre-fetching,
detecting an incomplete fetch can be done efficiently.

The options are:

  1. Don't change the behavior: only return maxRows rows.
  2. Return maxRows rows, but also have a callback parameter attribute to indicate that not all data was fetched.
  3. Optionally return an NJS error when not all data was fetched.
  4. Always return an NJS error when not all data was fetched.
This discussion is only about non-Result Set queries.
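For illustration, here is a minimal sketch of the current non-Result Set behaviour (assuming a connection obtained elsewhere; the EMPLOYEES table is just an illustrative name):

var oracledb = require('oracledb');

oracledb.maxRows = 100;  // global limit on rows fetched by a non-Result Set execute()

connection.execute(
  "SELECT employee_id, last_name FROM employees",
  function (err, result) {
    if (err) { console.error(err.message); return; }
    // If the table holds 102 rows, result.rows.length is still 100 and
    // nothing in 'result' indicates that rows were left behind.
    console.log("Fetched " + result.rows.length + " rows");
  });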

Let us know what you think. Is the change beneficial? What code do
you have that depends on the current behavior? How complex do you
want the driver to be? Is the current behavior a feature or a risk?

(Tagging a few random contributors so they see this: @Bigous @bchr02 @rinie @hvrauhal @bjouhier @sagiegurari @tmanolat @nelreina @hexkode @hellboy81 @SorinTO @pmarino90 @jaw187) Don't feel bad if your handle isn't tagged here, I just ran out of patience digging through names.

@rinie

rinie commented Jul 22, 2015

1 would be my first choice, although 2 would be nice if not too much work.

Now that you have result sets (nice!) the proper change would be to use them.

Regards,
Rinie


@bjouhier

We'll be doing most (if not all) of our reading with result sets so this won't impact us.

Reading without result sets makes sense when you have very small tables or when you do paging. In both cases it seems important to guarantee that what you get is what you asked for (if you ask for a page of 200 you should get a page of 200, not 100 because of a global default). If this is not the case, it should not happen silently. So I would go for option 4. Don't make the API more complex, just make it safe with fail-fast.
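For illustration, a paging fetch might look like this (a sketch; the table, page size and the 12c OFFSET/FETCH syntax are illustrative):

var PAGE_SIZE = 200;

connection.execute(
  "SELECT order_id, status FROM orders ORDER BY order_id " +
  "OFFSET :offset ROWS FETCH NEXT :pagesize ROWS ONLY",
  { offset: 0, pagesize: PAGE_SIZE },
  function (err, result) {
    if (err) { console.error(err.message); return; }
    // If the global oracledb.maxRows default was not raised, this page comes
    // back with only 100 of the 200 requested rows and no indication of the
    // shortfall; option 4 would turn that into an error.
    console.log("Got " + result.rows.length + " of " + PAGE_SIZE + " rows");
  });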

@rinie

rinie commented Jul 22, 2015

We often have the case that we show a user just the first 10..80 rows of
data to give some feedback about why our application makes a certain
decision. In that case an error seems inappropriate. IMHO 'first rows'
behaviour without errors/exceptions should be the default. I would not
like being forced into using result sets for that case, instead of using
result sets for paging and 'all rows'.

Then again, we agree that reading with result sets is the right thing to
do...


@bjouhier

@rinie If you only want the first 80 rows you should limit your query with a FETCH FIRST 80 ROWS ONLY clause. maxRows looks more like a system setting to avoid excessive buffering than an application-level setting.
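For example (a sketch; the FETCH FIRST clause needs Oracle Database 12c, and the table name is illustrative):

// Limit the rows in the SQL itself rather than relying on maxRows.
// On older database versions a wrapping query with ROWNUM <= 80 does the same.
connection.execute(
  "SELECT customer_id, name FROM customers ORDER BY name " +
  "FETCH FIRST 80 ROWS ONLY",
  function (err, result) {
    if (err) { console.error(err.message); return; }
    console.log(result.rows.length + " rows (at most 80, by the query itself)");
  });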

@bjouhier

Maybe proposal 3 is a good compromise.

@rinie

rinie commented Jul 22, 2015

@bjouhier In part of our system the user can enter a query. I do not want
to modify the query to fetch 80 rows; maxRows 80 as a default is fine with
me. This is how TOAD returns data, and then you have to say 'fetch more'.
My preference is to show just a page of 80 rows as a default and 'fetch
more' when needed. In other applications, making a mistake in the where
clause and getting a result of 10,000 rows that I cannot abort is very
annoying.

Coding the number of rows in a query is not my idea of appropriate SQL. I
want to use the same query on a tablet or on a laptop; excessive buffering
or slow VPN connections should not affect my query, just my preference in
page size and client-side maxRows.

Then again, from now on all our queries will use result sets. Changing the
pre-0.7 behaviour into an error seems annoying: what works in 0.6 breaks in
0.7, forcing you to change old code. Just an appropriate warning in the
documentation would be sufficient for me. That way, if you encounter the
problem and have time to fix it, you can adapt the code to result sets. If
you don't have time to fix the code, being forced to go back to 0.6 is not
nice.


@Bigous
Contributor

Bigous commented Jul 22, 2015

I like option 2 for convenience, but I strongly suggest everyone use Result Sets.

@bjouhier

@rinie Option 3 could be made backwards compatible. For us it is not a problem to limit with SQL because our SQL statements are generated from metadata, not hardcoded.

@hellboy81

3

@bchr02

bchr02 commented Jul 23, 2015

Option 2 is my preference.

@cjbj
Member Author

cjbj commented Jul 23, 2015

Thanks for the comments so far. Keep 'em coming.

(And let us know how you like Result Sets & prefetching)
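If you haven't tried them yet, a minimal Result Set loop looks something like this (a sketch only; the table name, batch size, and use of the prefetchRows option are illustrative):

var NUM_ROWS = 100;  // rows fetched per getRows() call

connection.execute(
  "SELECT employee_id, last_name FROM employees",
  [],                                          // no bind variables
  { resultSet: true, prefetchRows: NUM_ROWS },
  function (err, result) {
    if (err) { console.error(err.message); return; }
    fetchRowsFromRS(result.resultSet);
  });

function fetchRowsFromRS(resultSet) {
  resultSet.getRows(NUM_ROWS, function (err, rows) {
    if (err) {
      console.error(err.message);
      resultSet.close(function () {});
      return;
    }
    if (rows.length > 0) {
      console.log("Got " + rows.length + " rows");
      fetchRowsFromRS(resultSet);              // loop until no rows remain
    } else {
      resultSet.close(function (err) { if (err) console.error(err.message); });
    }
  });
}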

@kbhanush
Member

While Result Sets are the way to go, I'd support option 2 for those unusual cases. I also don't think it makes the driver overly complex... and I'd rather handle the situation through a callback attribute than through error handling.

Nice, guys! Result Sets and prefetching are cool features, especially for pagination.

@sagiegurari

If no Result Set is used, option 2 seems to give the best output you could ask for: you get both the data and a hint that more is available if you wish to pull more. That could be good when showing results in a grid for the user. I'm not for errors in this scenario, as it seems it would be used here as flow control.

Naturally the best choice is to give some sort of behaviour option parameter in the execute() call and/or on getConnection() (with a default of option 2).

@cjbj
Member Author

cjbj commented Jul 24, 2015

Option 2 would be like:

{ rows: [ . . . ],
  somenewflag: true,  // or false
  resultSet: undefined,
  outBinds: undefined,
  rowsAffected: undefined,
  metaData: [ . . . ] }

If the flag indicates that the fetch was incomplete, the app would have to reexecute it with a bigger maxRows and get all the data again; it wouldn't be possible just to get the next set of results. That's what Result Sets are for.

Now, what should the flag be called? rowsTruncated?
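In application code, checking the flag would look something like this (rowsTruncated is only the proposed name; nothing here exists yet):

connection.execute(
  "SELECT employee_id FROM employees",   // illustrative query
  function (err, result) {
    if (err) { console.error(err.message); return; }
    if (result.rowsTruncated) {
      // Fewer rows than the table holds: re-execute with a bigger maxRows,
      // or switch to a Result Set; you cannot fetch "the rest" from here.
      console.warn("Only " + result.rows.length + " rows fetched; more exist");
    }
    console.log(result.rows);
  });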

@dmcghan

dmcghan commented Jul 24, 2015

@cjbj I like option 2; I definitely think an attribute that lets us know our data has been truncated should be there. The name rowsTruncated would work fine.

I also think option 3 should be added as well. It's already a pain to have to check for errors after every I/O operation in Node.js. I wouldn't want to have to check for errors and whether rowsTruncated was true after every execute.

I'd like to be able to set an option on the base driver class (for defaults) and on executes (for overrides), maybe called errorWhenRowsTruncated (terrible name, I know), that turns this condition into an error for me.
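Something along these lines, perhaps (both the global attribute and the execute() option are entirely hypothetical, as is the name):

var oracledb = require('oracledb');

oracledb.errorWhenRowsTruncated = true;   // driver-wide default (hypothetical)

connection.execute(
  "SELECT * FROM big_table",              // illustrative query
  [],
  { errorWhenRowsTruncated: false },      // per-execute override (hypothetical)
  function (err, result) {
    if (err) { console.error(err.message); return; }  // would fire on truncation when enabled
    console.log(result.rows.length + " rows");
  });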

@nelreina

+1 @cjbj

@Buto

Buto commented Aug 2, 2015

Hi Chris!

We vote for option 2.

all my best,


@cjbj
Member Author

cjbj commented Aug 3, 2015

Looks like option 2 is the favourite. @bjouhier has some good arguments for option 4, though.

@bjouhier

bjouhier commented Aug 3, 2015

@cjbj There seems to be a consensus around option 2. People (like me) who want option 4 can implement it with a light wrapper around the option 2 API.
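For example, such a wrapper could be as small as this (rowsTruncated being the hypothetical option 2 attribute discussed above):

// Hypothetical wrapper turning the proposed option 2 flag into a hard error
// (i.e. option 4 semantics), layered on top of connection.execute().
function executeStrict(connection, sql, binds, options, callback) {
  connection.execute(sql, binds, options, function (err, result) {
    if (err) return callback(err);
    if (result.rowsTruncated) {           // proposed attribute, not yet real
      return callback(new Error("Query returned more rows than maxRows"));
    }
    callback(null, result);
  });
}

// Usage: executeStrict(conn, "SELECT ...", [], { maxRows: 200 }, cb);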

@bchr02

bchr02 commented Aug 7, 2015

For anyone interested, please have a look at my post in #65 (comment) which explains in detail my thoughts on non-ResultSet SELECT execute() behavior.

@rinie @bjouhier @Buto @nelreina @dmcghan @kbhanush @sagiegurari @Bigous

@rinie

rinie commented Aug 7, 2015

I am interested, but disagree. In my opinion Node.js (and TCP/IP, file I/O,
etc.) is about pages and streaming. With huge queries and all rows 'in
memory' you trade database speed for virtual memory paging. We encountered
30-second timeout problems with PHP and oci_fetch_all... For spreadsheet
writing (not in Node.js but Delphi) we just switched to a streaming xls
component. The same thing seems to exist for Node.js, although I do not
know the quality: https://github.com/guyonroche/exceljs#streaming-xlxs-writer.

Your example feels like doing file I/O via mmap
(https://en.wikipedia.org/wiki/Mmap). That is not my cup of tea.

Rinie


@bjouhier

bjouhier commented Aug 7, 2015

Disagree too. See my reply in the other issue.

@Bigous
Contributor

Bigous commented Aug 11, 2015

@bchr02 I think you have a point here... I mean... all the other available Oracle drivers work with this option, so why not this one?
But I agree with @bjouhier and @rinie when they say streaming is almost always better for everyone (database, TCP, Node.js and the client application).
I think we should put this on the list for when the driver gets to RC status, where the fine tuning comes in and we can treat this as a bug or known issue.

@cjbj cjbj added the question label Aug 17, 2015
@cjbj
Member Author

cjbj commented Aug 17, 2015

We should follow this up in Issue #158

@cjbj cjbj closed this as completed Aug 17, 2015