Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Key-Value Pair Query Results, Streamed Data Access #507

Open
Shanmugavel-J opened this issue Nov 22, 2024 · 14 comments
Open

Support for Key-Value Pair Query Results, Streamed Data Access #507

Shanmugavel-J opened this issue Nov 22, 2024 · 14 comments
Labels
enhancement New feature or request

Comments

@Shanmugavel-J
Copy link

Shanmugavel-J commented Nov 22, 2024

Currently, the npm databend-driver has the following limitations that impact usability and efficiency:

  1. Array of Arrays for Results: Query results are returned as arrays of arrays containing only values. To fetch field names (keys) alongside these values, the same query must be executed again using a different function. This approach is inefficient and cumbersome.
  2. Variant Types in String Format: Variant types are returned in string format instead of actual objects or arrays. This requires explicit parsing for each record, adding unnecessary overhead and complexity.
  3. Lack of Streaming Support: The absence of streaming options makes handling large datasets inefficient.
  4. Async Query Execution: There is no support for asynchronous query execution (e.g., async/await), which is a common pattern in modern database drivers like those for MySQL and Snowflake.

Ex:- const results = await conn.exec('SELECT * FROM test;');

results = [{ id: 1 }, { id: 2 }]

Proposed Features:

  1. Key-Value Pair Rows: An option to fetch query results as objects with field names (keys) and corresponding values (e.g., JSON format).
  2. Proper Handling of Variant Types: Return variant types as actual objects or arrays instead of strings to eliminate the need for explicit parsing.
  3. Streaming Support: Introduce support for streaming query results to efficiently handle large datasets.
  4. Async Query Execution: Add methods for executing queries using async/await to align with modern asynchronous programming patterns.

These features would significantly improve the usability, performance, and developer experience of the databend-driver. Thanks!

@everpcpc
Copy link
Member

everpcpc commented Dec 5, 2024

Thank you for your detailed feedback on the npm databend-driver. We are pleased to inform you that the mentioned limitations have been addressed:

  1. Array of Arrays for Results: You can now use row.data() instead of row.values() to fetch query results with field names.
  2. Variant Types in String Format: Support has been added for returning variant types as actual objects or arrays by setting the parameter { variantAsObject: true } for row or client.
  3. Lack of Streaming Support: Our queryIter() method already returns rows as a streaming result.
  4. Async Query Execution: All methods in our Node.js client already support async/await.

We hope these improvements enhance your experience with the databend-driver. If you have any further questions or feedback, please feel free to share.

Thank you again for your valuable input!

@Shanmugavel-J
Copy link
Author

@everpcpc Thanks for your support. We encountered an issue with the output of the record. For example, when we call queryIter or any method to execute a query, the output should ideally be in its respective data type, correct? However, for bigint, the output is returned as '7n' instead of 7 (even though it’s not a very large value). Using the MySQL handler, it provides the correct result. Does the output remain consistent across types—for instance, varchar as a string, and number and boolean in their proper formats? Am I correct?

@everpcpc
Copy link
Member

everpcpc commented Dec 6, 2024

@Shanmugavel-J

Thank you for your feedback. I'd like to clarify the data type mappings between Databend and JavaScript native types. You can find a full list of these mappings in our documentation here: Databend Driver Type Mapping.

Regarding your specific concern, '7n' is indeed the correct output for bigint in JavaScript. In JavaScript, '7n' represents a BigInt, which is a primitive type. For more details on BigInt, you can refer to the MDN documentation: BigInt in JavaScript.

To answer your query about consistency across types:

  • varchar is returned as a string.
  • number and boolean are returned in their proper formats.
  • bigint is returned as a BigInt (e.g., '7n').

This ensures that the output remains consistent and adheres to JavaScript's handling of different data types. If you have any further questions, feel free to ask!

@Shanmugavel-J
Copy link
Author

@everpcpc Thanks! I indeed knew it but since the mysql handler port gave in a different way which made me ask it.

@everpcpc
Copy link
Member

everpcpc commented Dec 6, 2024

@Shanmugavel-J Oh, I see. The type mapping depends on how the driver converts the response from the database. We recommend using the native Databend driver for better support. The MySQL handler is intended for transitional use and is just a compatible implementation. It has not been fully tested, so there may be some corner cases that are not covered.

@Shanmugavel-J
Copy link
Author

Shanmugavel-J commented Dec 6, 2024

@everpcpc Yes, I noticed the features have been released. I’m now updating to the databend-driver package. Instead of an iterator for streams, I had requested a Node.js Readable stream, similar to what Snowflake and mysql provides.

Which could be piped to another stream for transformation and would be useful in many ways

@everpcpc
Copy link
Member

everpcpc commented Dec 6, 2024

@Shanmugavel-J I get it. We will add support for Node.js Readable stream in upcoming releases, making data streaming operations more flexible and convenient.

@Shanmugavel-J
Copy link
Author

@everpcpc That sounds great please let me know once that feature is added for fetching the rows in stream format.

Thanks again for your support!!

@everpcpc
Copy link
Member

@Shanmugavel-J Hi, we have released a new version with Readable stream support. You could refer to the examples: https://www.npmjs.com/package/databend-driver?activeTab=readme#usage

@Shanmugavel-J
Copy link
Author

@everpcpc I will check and get back to you. Thanks!

@Shanmugavel-J
Copy link
Author

Shanmugavel-J commented Dec 11, 2024

@everpcpc Please find my below observations,

  1. From which node version does the package support? is it from version 16?

  2. queryIterExt Method

When i execute the above method i could see its comment saying the rows, schema & stats could be fetched but i don't get the rows as output when i execute this method but i get the stats (execution time etc). but Ideally it should provide me with the rows, schema & the stats am I correct?

  1. Stream It works as expected. I can use the nodejs stream in any way right other than that mentioned in the doc ?.

One observation was like, I couldn't see any types for stream. Could you please update it so that it would be great to see what that function would return?

example code (Other way to use stream):

const rows = await conn.queryIter("SELECT * FROM my_table1;");
const stream = rows.stream();
stream.on('data', (data) => {
  console.log(data.data());
});
stream.on('end', async () => {
  console.log('end');
});
stream.on('error', (err) => {
  console.log(err);
});
  1. Error

When i execute the below logic after successful execution of the below code, i get the error attached

Query:

const client = new Client(  `databend+http://${process.env.DATABEND_USERNAME}:${process.env.DATABEND_PASSWORD}@${process.env.DATABEND_HOST}:8000/?sslmode=disable`,
  );
try {
  const conn = await client.getConn();
  const status = await conn.exec("CREATE TEMP TABLE my_table1 (id INT, description STRING);");
  console.log(status);
} catch (e) {
  console.log(e);
}

Error:

thread '<unnamed>' panicked at core/src/client.rs:894:13:
there is no reactor running, must be called from the context of a Tokio 1.x runtime
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at core/src/panicking.rs:221:5:
panic in a function that cannot unwind
stack backtrace:
   0:     0x7f76fbc88e3a - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h5b6bd5631a6d1f6b
   1:     0x7f76fbcb0443 - core::fmt::write::h7550c97b06c86515
   2:     0x7f76fbc85c83 - std::io::Write::write_fmt::h7b09c64fe0be9c84
   3:     0x7f76fbc88c82 - std::sys::backtrace::BacktraceLock::print::h2395ccd2c84ba3aa
   4:     0x7f76fbc89d6c - std::panicking::default_hook::{{closure}}::he19d4c7230e07961
   5:     0x7f76fbc89bb2 - std::panicking::default_hook::hf614597d3c67bbdb
   6:     0x7f76fbc8a347 - std::panicking::rust_panic_with_hook::h8942133a8b252070
   7:     0x7f76fbc8a1a6 - std::panicking::begin_panic_handler::{{closure}}::hb5f5963570096b29
   8:     0x7f76fbc89319 - std::sys::backtrace::__rust_end_short_backtrace::h6208cedc1922feda
   9:     0x7f76fbc89e6c - rust_begin_unwind
  10:     0x7f76fb3164dd - core::panicking::panic_nounwind_fmt::h357fc035dc231634
  11:     0x7f76fb316572 - core::panicking::panic_nounwind::hd0dad372654c389a
  12:     0x7f76fb316696 - core::panicking::panic_cannot_unwind::h65aefd062253eb19
  13:     0x7f76fb36f4fa - napi::bindgen_runtime::raw_finalize_unchecked::h95d712912f7a460e
  14:           0xb455fa - _ZN15node_napi_env__13CallFinalizerEPFvP10napi_env__PvS2_ES2_S2_
  15:           0xb288a4 - _ZThn40_N6v8impl9Reference8FinalizeEv
  16:           0xb49182 - _ZZN4node11Environment11CloseHandleI11uv_handle_sZN6v8impl12_GLOBAL__N_118ThreadSafeFunction26CloseHandlesAndMaybeDeleteEbEUlPS2_E_EEvPT_T0_ENUlS6_E_4_FUNES6_
  17:          0x16674d1 - uv__finish_close
                               at /home/iojs/build/ws/out/../deps/uv/src/unix/core.c:319:5
  18:          0x16674d1 - uv__run_closing_handles
                               at /home/iojs/build/ws/out/../deps/uv/src/unix/core.c:333:5
  19:          0x16674d1 - uv_run
                               at /home/iojs/build/ws/out/../deps/uv/src/unix/core.c:421:5
  20:           0xb0af60 - _ZN4node11Environment14CleanupHandlesEv
  21:           0xb0b02c - _ZN4node11Environment10RunCleanupEv
  22:           0xac8ed7 - _ZN4node15FreeEnvironmentEPNS_11EnvironmentE
  23:           0xbc8caa - _ZN4node16NodeMainInstance3RunEv
  24:           0xb3ec18 - _ZN4node22LoadSnapshotDataAndRunEPPKNS_12SnapshotDataEPKNS_20InitializationResultE
  25:           0xb4272f - _ZN4node5StartEiPPc
  26:     0x7f77134ee083 - __libc_start_main
                               at /build/glibc-LcI20x/glibc-2.31/csu/../csu/libc-start.c:308:16
  27:           0xac1fee - _start
  28:                0x0 - <unknown>
thread caused non-unwinding panic. aborting.

Object (Variant Data) one I will try and let you know. Thanks!

@everpcpc
Copy link
Member

cc @youngsofun please check this error:

thread '' panicked at core/src/client.rs:894:13:
there is no reactor running, must be called from the context of a Tokio 1.x runtime
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
thread '' panicked at core/src/panicking.rs:221:5:
panic in a function that cannot unwind

@everpcpc
Copy link
Member

Hi, @Shanmugavel-J

  1. We've tested this driver with Node versions 18, 20, and 22, but not with 16 since it is deprecated. However, it might still work with version 16 as it is a native plugin, so feel free to try it out and report any bugs.
  2. queryIterExt is not intended for customer use, but it is okay to use it. This function returns either a row or statistics with each fetch.
  3. rows.stream() returns a common implementation for node:stream:Readable with Row as the item in objectMode.
  4. This error will be fixed in the next release.

@Shanmugavel-J
Copy link
Author

@everpcpc Thanks! The variantObject option works as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants