Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature][Connector-V2][Clickhouse] Clickhouse supports spaced write and is relatively real-time data source friendly #4328

Closed
wants to merge 17 commits into from

Conversation

zhilinli123
Copy link
Contributor

Purpose of this pull request

Check list

@zhilinli123 zhilinli123 reopened this Mar 10, 2023
zhilinli added 7 commits March 10, 2023 21:35
…and is relatively real-time data source friendly apache#4309
…feature-flush-ck-interval

# Conflicts:
#	docs/en/start-v2/locally/quick-start-flink.md
#	docs/en/start-v2/locally/quick-start-seatunnel-engine.md
…and is relatively real-time data source friendly apache#4309
…and is relatively real-time data source friendly apache#4309
…and is relatively real-time data source friendly apache#4309
…and is relatively real-time data source friendly apache#4309
…and is relatively real-time data source friendly apache#4309
…and is relatively real-time data source friendly apache#4309
Comment on lines 90 to 92
this.statement = statementMap.get(shardRouter.getShard(shardKey));
this.clickHouseStatement = statement.getJdbcBatchStatementExecutor();
this.sizeHolder = statement.getIntHolder();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use this to save fields which mapping with shardKey, because one writer can hold many different shardKey, so it will have many statement,sizeHolder. Use this to save it may cause conflusion.

Copy link
Contributor Author

@zhilinli123 zhilinli123 Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be?
`statement = statementMap.get(shardRouter.getShard(shardKey));

clickHouseStatement = statement.getJdbcBatchStatementExecutor();

sizeHolder = statement.getIntHolder();
`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be? `statement = statementMap.get(shardRouter.getShard(shardKey));

clickHouseStatement = statement.getJdbcBatchStatementExecutor();

sizeHolder = statement.getIntHolder(); `

No, just do not use ClickhouseSinkWriter object's fields to save statement and sizeHolder. Because it belong shard, not writer. One writer may have many shard.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


Should it be? statement = statementMap.get(shardRouter.getShard(shardKey)); clickHouseStatement = statement.getJdbcBatchStatementExecutor(); sizeHolder = statement.getIntHolder();

No, just do not use ClickhouseSinkWriter object's fields to save statement and sizeHolder. Because it belong shard, not writer. One writer may have many shard.

Can you do some hints on code?

Copy link
Contributor Author

@zhilinli123 zhilinli123 Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

statement and sizeHolder Create internal objects?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean one writer may have many sizeHolder, clickHouseStatement and statement. Because one writer have many shard, one shard have one sizeHolder, clickHouseStatement and statement. So the this.sizeHolder will change when last row had change ( the row change then the last shard also change).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your explanation. I think I understand

…and is relatively real-time data source friendly apache#4309
@zhilinli123 zhilinli123 reopened this Mar 13, 2023
username = "default"
password = ""
batch_interval_ms = "6000"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this line.
batch_interval_ms?You mean using the default?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line 49 is blank.

private transient ClickhouseBatchStatement statement;
private transient IntHolder sizeHolder;
// Whether pre-initialization is required
private transient boolean isOpen;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field is not necessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a historical commentary Please check the latest

sizeHolder.setValue(sizeHolder.getValue() + 1);
addIntoBatch(element, this.clickHouseStatement);
this.sizeHolder.setValue(this.sizeHolder.getValue() + 1);
tryOpen();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move it into constructer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a historical commentary Please check the latest

Copy link
Contributor

@TaoZex TaoZex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member

@Hisoka-X Hisoka-X left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1.
I don't think it is a good way to add additional threads to write data in the sink. This implementation method makes resources uncontrollable, and also loses the original intention of the committer interface design.

@Hisoka-X Hisoka-X added the don't merge There needs to be a specific reason in the PR, and it cannot be merged for the time being. label May 5, 2023
@TyrantLucifer
Copy link
Member

same as #4349

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
connectors-v2 don't merge There needs to be a specific reason in the PR, and it cannot be merged for the time being. feature New feature reviewed Waiting for code update
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants