batch.Flush() generates only one 'QueryStart' entry in system.query_log #1271

Closed
9 tasks
hongker opened this issue Apr 12, 2024 · 6 comments

Comments

Contributor

hongker commented Apr 12, 2024

Observed

1. When a batch is reused and its data is sent with Flush(), only one 'QueryStart' entry is generated in system.query_log; there is no QueryFinish entry.
2. The data is flushed to the server successfully, but a query issued immediately afterwards does not return the most recently inserted rows; they only become visible after waiting a few seconds.

Expected behaviour

1. As with batch.Send(), a successful Flush() should generate a QueryFinish entry.
2. Which setting controls how soon data inserted via Flush() can be queried successfully?

Code example

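package code

import (
	"context"
	"fmt"
	"log"
	"testing"
	"time"

	"github.com/ClickHouse/clickhouse-go/v2/lib/driver"
)

// conn is assumed to be initialised elsewhere before the test runs.
var conn driver.Conn

func TestPrepareBatch(t *testing.T) {
	ctx := context.Background()
	insertSQL := "INSERT INTO default.users (name,ts)"

	// The batch is prepared once and reused across all iterations.
	batch, err := conn.PrepareBatch(ctx, insertSQL)
	if err != nil {
		t.Fatal(err)
	}
	for {
		time.Sleep(time.Second)
		start := time.Now()
		for i := 0; i < 100000; i++ {
			if err = batch.Append(fmt.Sprintf("test%d", i), start); err != nil {
				t.Fatal(err)
			}
		}

		// Flush sends the buffered rows; Send is never called.
		if err = batch.Flush(); err != nil {
			t.Fatal(err)
		}
		log.Printf("insert 100000 rows cost:%v", time.Since(start))
	}
}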

query_log

type:                                  QueryStart
event_date:                            2024-04-12
event_time:                            2024-04-12 12:09:42
event_time_microseconds:               2024-04-12 12:09:42.928242
query_start_time:                      2024-04-12 12:09:42
query_start_time_microseconds:         2024-04-12 12:09:42.928242
query_duration_ms:                     0
read_rows:                             0
read_bytes:                            0
written_rows:                          0
written_bytes:                         0
result_rows:                           0
result_bytes:                          0
memory_usage:                          0
current_database:                      default
query:                                 INSERT INTO default.users (name,ts) VALUES
formatted_query:                       
normalized_query_hash:                 2989151148266274288
query_kind:                            Insert
databases:                             ['default']
tables:                                ['default.users']
columns:                               []
partitions:                            []
projections:                           []
views:                                 []
exception_code:                        0
exception:                             
stack_trace:                           
is_initial_query:                      1
user:                                  admin
query_id:                              3aa0d52e-5230-47b0-9247-7ed96ba8f392
address:                               ::ffff:127.0.0.1
port:                                  32912
initial_user:                          admin
initial_query_id:                      3aa0d52e-5230-47b0-9247-7ed96ba8f392
initial_address:                       ::ffff:127.0.0.1
initial_port:                          32912
initial_query_start_time:              2024-04-12 12:09:42
initial_query_start_time_microseconds: 2024-04-12 12:09:42.928242
interface:                             1
is_secure:                             0

Details

Environment

  • clickhouse-go version: v2.22.4
  • Interface: ClickHouse API / database/sql compatible driver
  • Go version: go1.19.10
  • Operating system: linux
  • ClickHouse version: 23.8.2 revision 5446
  • Is it a ClickHouse Cloud?
  • ClickHouse Server non-default settings, if any:
  • CREATE TABLE statements for tables involved:
CREATE TABLE default.users
(
    `name` String CODEC(ZSTD(9)),
    `ts` DateTime('Asia/Shanghai') CODEC(ZSTD(9))
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(ts)
ORDER BY ts
SETTINGS index_granularity = 8192
Contributor

jkaflik commented Apr 12, 2024

QueryFinish is not logged as the query is not finished until you trigger batch.Send(). Flush() is used to free the client side buffer and send it to the server.
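
As a rough illustration of the distinction described above, the sketch below flushes periodically to release the client-side buffer and calls Send exactly once at the end so that the INSERT is actually finished. The `batch` interface, `streamInsert`, and `recordingBatch` are hypothetical names introduced for this example; only the method names Append, Flush, and Send are taken from clickhouse-go's batch type, and the recording type is a stand-in rather than the driver's real behaviour.

```go
package main

import "fmt"

// batch is a hypothetical local interface covering only the methods used
// here. clickhouse-go's driver.Batch exposes methods with these names.
type batch interface {
	Append(v ...any) error
	Flush() error
	Send() error
}

// streamInsert appends every row to b, calls Flush after each flushEvery
// rows to release the client-side buffer, and calls Send once at the end.
// Per the maintainer's comment, the query is only finished by Send.
func streamInsert(b batch, rows [][]any, flushEvery int) error {
	if flushEvery <= 0 {
		return fmt.Errorf("flushEvery must be positive, got %d", flushEvery)
	}
	for i, row := range rows {
		if err := b.Append(row...); err != nil {
			return fmt.Errorf("append row %d: %w", i, err)
		}
		if (i+1)%flushEvery == 0 {
			if err := b.Flush(); err != nil {
				return fmt.Errorf("flush after row %d: %w", i, err)
			}
		}
	}
	// Send transmits whatever is still buffered and ends the INSERT.
	return b.Send()
}

// recordingBatch is a stand-in (assumption, not the real driver) that only
// records which calls were made, so the call sequence can be inspected.
type recordingBatch struct {
	buffered int
	flushes  int
	sent     bool
}

func (r *recordingBatch) Append(v ...any) error { r.buffered++; return nil }
func (r *recordingBatch) Flush() error          { r.flushes++; r.buffered = 0; return nil }
func (r *recordingBatch) Send() error           { r.sent = true; r.buffered = 0; return nil }
```

With a real connection, the value returned by conn.PrepareBatch would presumably be passed as `b`; after Send returns, a new batch would need to be prepared for the next round of rows.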

jkaflik closed this as completed Apr 12, 2024
Contributor Author

hongker commented Apr 12, 2024

> QueryFinish is not logged as the query is not finished until you trigger batch.Send(). Flush() is used to free the client side buffer and send it to the server.

How can I trigger QueryFinish when using batch.Flush()? I need to reuse the batch.

Contributor

jkaflik commented Apr 12, 2024

> I need to reuse the batch

What is your goal in "reusing the batch"?

Contributor Author

hongker commented Apr 12, 2024

> > I need to reuse the batch
>
> What is your goal in "reusing the batch"?

1. Inserting data faster, because the connection does not need to be closed inside the for loop.
2. Keeping memory allocation stable, because Flush() frees the client-side buffer.

Contributor

jkaflik commented Apr 12, 2024

@hongker

> Inserting data faster, because the connection does not need to be closed inside the for loop.

There is no need to close or reinitialize the connection after every batch.Send.
Reusing a batch in the way you described is not supported. Reusing the block buffer could eventually be implemented, but it requires a bit of investigation, since it depends on the server's first block:

func (c *connect) firstBlock(ctx context.Context, on *onProcess) (*proto.Block, error) {

Would you like to contribute this enhancement?
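
The supported pattern mentioned above (keep the connection, prepare a fresh batch for each round, and finish it with Send) might look roughly like the following sketch. `prepareFunc`, `insertRounds`, `fakeBatch`, and `newFakePrepare` are hypothetical names introduced for this illustration; the fake types are stand-ins so the example runs without a server and do not reflect how the driver works internally.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// batch is a hypothetical local interface with the two methods used here;
// clickhouse-go's driver.Batch offers methods with these names.
type batch interface {
	Append(v ...any) error
	Send() error
}

// prepareFunc is a hypothetical adapter around a long-lived connection.
type prepareFunc func(ctx context.Context, query string) (batch, error)

// insertRounds prepares a new batch for every round on the same connection
// and completes each one with Send, instead of reusing a single batch.
func insertRounds(ctx context.Context, prepare prepareFunc, query string, rounds, rowsPerRound int) error {
	for r := 0; r < rounds; r++ {
		b, err := prepare(ctx, query)
		if err != nil {
			return fmt.Errorf("prepare batch %d: %w", r, err)
		}
		start := time.Now()
		for i := 0; i < rowsPerRound; i++ {
			if err := b.Append(fmt.Sprintf("test%d", i), start); err != nil {
				return fmt.Errorf("append row %d of batch %d: %w", i, r, err)
			}
		}
		// Send finishes this INSERT; the next round starts a new one.
		if err := b.Send(); err != nil {
			return fmt.Errorf("send batch %d: %w", r, err)
		}
	}
	return nil
}

// fakeBatch is a stand-in that counts rows and completed sends.
type fakeBatch struct {
	rows int
	sent *int
}

func (f *fakeBatch) Append(v ...any) error { f.rows++; return nil }
func (f *fakeBatch) Send() error           { *f.sent++; return nil }

// newFakePrepare returns a prepareFunc that counts how many batches were
// prepared and sent, for trying out insertRounds without a server.
func newFakePrepare(prepared, sent *int) prepareFunc {
	return func(ctx context.Context, query string) (batch, error) {
		*prepared++
		return &fakeBatch{sent: sent}, nil
	}
}
```

Against a real server, `prepare` could presumably be a small closure that calls conn.PrepareBatch(ctx, query) on a driver.Conn opened once at startup, so the connection itself is never closed between rounds.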

Contributor Author

hongker commented Apr 17, 2024

> @hongker
>
> > Inserting data faster, because the connection does not need to be closed inside the for loop.
>
> There is no need to close or reinitialize the connection after every batch.Send. Reusing a batch in the way you described is not supported. Reusing the block buffer could eventually be implemented, but it requires a bit of investigation, since it depends on the server's first block:
>
> func (c *connect) firstBlock(ctx context.Context, on *onProcess) (*proto.Block, error) {
>
> Would you like to contribute this enhancement?

I have created a pull request, but a system error occurred in CI, similar to #1265.
Who can fix this failure (run-tests / single-node (1.22, 24.2) failed)?
