NullPointerException while loading Parquet file using ClickHouseClient #1494

Closed
porechajp opened this issue Dec 3, 2023 · 1 comment · Fixed by #1666

porechajp commented Dec 3, 2023

Hello,

I am trying to load a Parquet file into a ClickHouse table using com.clickhouse.client.ClickHouseClient, and the execution fails at the end with the following exception.

Please note that the data does get loaded successfully.

Exception:

Exception in thread "main" java.util.concurrent.ExecutionException: java.lang.NullPointerException: Cannot invoke "com.clickhouse.data.ClickHouseDataProcessor.getInputStream()" because "this.processor" is null
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
	at com.tnt.dbimex.ChImportApplication.main(ChImportApplication.java:27)
Caused by: java.lang.NullPointerException: Cannot invoke "com.clickhouse.data.ClickHouseDataProcessor.getInputStream()" because "this.processor" is null
	at com.clickhouse.client.ClickHouseStreamResponse.close(ClickHouseStreamResponse.java:94)
	at com.clickhouse.client.ClickHouseClient.lambda$load$8(ClickHouseClient.java:444)
	at com.clickhouse.client.ClickHouseClient.run(ClickHouseClient.java:232)
	at com.clickhouse.client.ClickHouseClient.lambda$submit$4(ClickHouseClient.java:284)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
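
The trace above is consistent with a task that finishes its real work and only then fails during cleanup: `future.get()` wraps the late `NullPointerException` in an `ExecutionException`, even though the side effect (the insert) has already happened. The following is a minimal JDK-only sketch of that pattern. The class `LateFailureSketch`, the `loaded` flag, and `loadThenFailOnCleanup` are hypothetical names used for illustration and are not part of clickhouse-java.

```java
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicBoolean;

public class LateFailureSketch {
    // Hypothetical flag standing in for "the rows were inserted".
    static final AtomicBoolean loaded = new AtomicBoolean(false);

    // Does the work first, then fails in a cleanup step (assumed shape, not the library's code).
    static CompletableFuture<String> loadThenFailOnCleanup() {
        return CompletableFuture.supplyAsync(() -> {
            loaded.set(true);                 // the load itself succeeds
            Object processor = null;          // no processor was created
            Objects.requireNonNull(processor, "processor is null"); // cleanup throws here
            return "summary";
        });
    }

    public static void main(String[] args) throws InterruptedException {
        CompletableFuture<String> future = loadThenFailOnCleanup();
        try {
            future.get();
        } catch (ExecutionException e) {
            // The work is done, but the caller still sees the cleanup failure.
            System.out.println("loaded=" + loaded.get()
                    + ", cause=" + e.getCause().getClass().getSimpleName());
        }
    }
}
```

Under these assumptions the caller cannot distinguish a failed load from a failed cleanup without inspecting the cause, which matches the observation that the data arrives despite the exception.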

The code is:

	public static void main(String[] args) throws FileNotFoundException, InterruptedException, ExecutionException {
		var node = ClickHouseNode.of("http://127.0.0.1:8123?compress=0"); // ?compress_algorithm=gzip

		var future = ClickHouseClient.load(node, "STG_2.AUDIT_TRAIL",
				ClickHousePassThruStream.of(new FileInputStream("C:/Temp/AUDIT_TRAIL.parquet"),
						ClickHouseCompression.NONE, ClickHouseFormat.Parquet));

		System.out.println(future.get());
	}

While stepping through with a debugger, I found that the following code in com.clickhouse.client.ClickHouseStreamResponse does not check whether processor is null:

    @Override
    public void close() {
        final ClickHouseInputStream input = processor.getInputStream();
        if (closed || input.isClosed()) {
            return;
        }

Permalink: https://github.com/ClickHouse/clickhouse-java/blob/6a0856fdc4dbfed89dd2d0030f20accdec63bce8/clickhouse-client/src/main/java/com/clickhouse/client/ClickHouseStreamResponse.java#L94C45-L94C54

The processor remains null because, in the constructor of ClickHouseStreamResponse, ClickHouseDataStreamFactory.getInstance().getProcessor returns null for ClickHouseFormat.Parquet.

This is because the following method instantiates a ClickHouseDataProcessor only if the format is RowBinary, RowBinaryWithNamesAndTypes, or a text format:

    public ClickHouseDataProcessor getProcessor(ClickHouseDataConfig config, ClickHouseInputStream input,
            ClickHouseOutputStream output, Map<String, Serializable> settings, List<ClickHouseColumn> columns)
            throws IOException {
        ClickHouseFormat format = ClickHouseChecker.nonNull(config, ClickHouseDataConfig.TYPE_NAME).getFormat();
        ClickHouseDataProcessor processor = null;
        if (ClickHouseFormat.RowBinary == format || ClickHouseFormat.RowBinaryWithNamesAndTypes == format) {
            processor = new ClickHouseRowBinaryProcessor(config, input, output, columns, settings);
        } else if (format.isText()) {
            processor = new ClickHouseTabSeparatedProcessor(config, input, output, columns, settings);
        }
        return processor;
    }
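
To make the fall-through concrete, here is a small standalone sketch that mirrors the branching of the method above. The `Format` enum, its `isText` flags, and `processorFor` are hypothetical stand-ins, not the library's types; the only point is that any format that is neither row-binary nor text ends up with a null result.

```java
public class ProcessorDispatchSketch {
    // Hypothetical subset of formats; the boolean marks text formats.
    enum Format {
        RowBinary(false),
        RowBinaryWithNamesAndTypes(false),
        TabSeparated(true),
        CSV(true),
        Parquet(false);

        private final boolean text;

        Format(boolean text) {
            this.text = text;
        }

        boolean isText() {
            return text;
        }
    }

    // Mirrors the if / else-if chain: returns a label instead of a real processor.
    static String processorFor(Format format) {
        String processor = null;
        if (format == Format.RowBinary || format == Format.RowBinaryWithNamesAndTypes) {
            processor = "row-binary";
        } else if (format.isText()) {
            processor = "tab-separated";
        }
        return processor; // stays null for binary formats such as Parquet
    }

    public static void main(String[] args) {
        for (Format format : Format.values()) {
            System.out.println(format + " -> " + processorFor(format));
        }
    }
}
```

In this sketch, `Parquet` is the only listed format that maps to null, which is the case the `close()` method then trips over.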

It looks like the processor matters for read use cases but may not be needed for load use cases, so we could probably add a null check to the close method of com.clickhouse.client.ClickHouseStreamResponse.
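
A rough sketch of what such a null check could look like follows. It uses hypothetical stand-in classes (`Processor`, `Response`) rather than the real ClickHouseDataProcessor and ClickHouseStreamResponse, and it is only an illustration of the idea, not necessarily the change that was merged in #1666.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class NullSafeCloseSketch {
    /** Hypothetical stand-in for a data processor that owns an input stream. */
    static final class Processor {
        private final InputStream input;

        Processor(InputStream input) {
            this.input = input;
        }

        InputStream getInputStream() {
            return input;
        }
    }

    /** Hypothetical stand-in for a response whose processor may be null. */
    static final class Response implements AutoCloseable {
        private final Processor processor; // null when the format has no processor
        private boolean closed;

        Response(Processor processor) {
            this.processor = processor;
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            if (closed) {
                return; // already closed, nothing to do
            }
            closed = true;
            if (processor == null) {
                return; // guard: no processor means no stream to close
            }
            try {
                processor.getInputStream().close();
            } catch (IOException e) {
                // A failure while closing is ignored in this sketch.
            }
        }
    }

    public static void main(String[] args) {
        Response withoutProcessor = new Response(null);
        withoutProcessor.close(); // does not throw
        Response withProcessor = new Response(new Processor(new ByteArrayInputStream(new byte[0])));
        withProcessor.close();
        System.out.println(withoutProcessor.isClosed() && withProcessor.isClosed()); // prints "true"
    }
}
```

The guard is placed before the first dereference of the processor, so a load with a pass-through format would close cleanly while the read path keeps its existing behavior.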

Note: This issue seems to have been introduced in version 0.4.5; it works fine up to 0.4.4.

@porechajp porechajp changed the title NullPoitnerException while loading Parquet file using ClickHouseClient NullPointerException while loading Parquet file using ClickHouseClient Dec 3, 2023

lee170 commented Apr 9, 2024

I have encountered the same behavior with clickhouse-jdbc-0.5.0.jar. The rows are inserted, but a NullPointerException occurs at the end of the query.
