feat: output of Copy. #12594

Merged: 2 commits, Aug 29, 2023
youngsofun
Member

@youngsofun youngsofun commented Aug 26, 2023

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

columns:

  • file
  • rows_loaded
  • errors_seen
  • first_error
  • first_error_line
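
For illustration, one row of this output could be modelled as below. This is only a sketch: the type name `CopyFileResult`, the field types, and the two helper methods are assumptions made for the example and are not taken from the PR's code.

```rust
/// Hypothetical row type mirroring the five output columns listed above.
/// Field types are assumed, not taken from the PR.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct CopyFileResult {
    pub file: String,
    pub rows_loaded: u64,
    pub errors_seen: u64,
    /// Message of the first error hit in this file, if any.
    pub first_error: Option<String>,
    /// Line number of that first error, if any.
    pub first_error_line: Option<u64>,
}

impl CopyFileResult {
    /// Count rows that were loaded successfully.
    pub fn record_ok(&mut self, rows: u64) {
        self.rows_loaded += rows;
    }

    /// Count one error; only the first one reported is remembered.
    pub fn record_error(&mut self, message: &str, line: u64) {
        self.errors_seen += 1;
        if self.first_error.is_none() {
            self.first_error = Some(message.to_string());
            self.first_error_line = Some(line);
        }
    }
}
```

Keeping `first_error` and `first_error_line` optional lets a cleanly loaded file be reported with empty error columns.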

candidates

  • status = loaded, load failed or partially loaded. (can be deduced from rows_loaded and errors_seen)
  • FIRST_ERROR_COLUMN_NAME (helpful; try adding it in the next PR?)
  • rows_parsed (redundant for now: WHERE is not supported, so each parsed row is counted either in rows_loaded or in errors_seen)
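
A minimal sketch of how the status candidate could be deduced from the two counters. The enum, the function name, and the exact mapping (in particular treating a file with no rows and no errors as loaded) are assumptions for illustration; the PR does not define them.

```rust
/// Hypothetical status values, named after the three cases in the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LoadStatus {
    Loaded,
    LoadFailed,
    PartiallyLoaded,
}

/// Deduce a status from the two counters.
/// Assumed mapping: no errors means loaded (even for an empty file),
/// errors with no loaded rows means failed, otherwise partially loaded.
pub fn deduce_status(rows_loaded: u64, errors_seen: u64) -> LoadStatus {
    match (rows_loaded, errors_seen) {
        (_, 0) => LoadStatus::Loaded,
        (0, _) => LoadStatus::LoadFailed,
        _ => LoadStatus::PartiallyLoaded,
    }
}
```

Because the mapping is a pure function of rows_loaded and errors_seen, a client can compute it without an extra column.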

Files skipped because they were already loaded are not reported.

file_status is stored in table_ctx each time a chunk of data is deserialized; this information is converted to a DataBlock and returned as the result after the pipeline finishes successfully. In distributed copy, each slave sends its status to the master before shut_down, reusing the channel for progressInfo.
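
The flow described above can be sketched roughly as follows. Everything here is hypothetical: `CopyStatus` stands in for whatever the table context actually holds, `merge` stands in for the master folding in a status received from another node, and picking the error with the lowest line number when merging is an assumption, not something the PR states.

```rust
use std::collections::HashMap;

/// Hypothetical per-file counters; names follow the PR's output columns.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct FileStatus {
    pub rows_loaded: u64,
    pub errors_seen: u64,
    /// (message, line) of the earliest error seen so far, if any.
    pub first_error: Option<(String, u64)>,
}

impl FileStatus {
    /// Keep the error with the lowest line number (assumed tie-break rule).
    fn note_error(&mut self, message: String, line: u64) {
        let earlier = match &self.first_error {
            Some((_, known)) => line < *known,
            None => true,
        };
        if earlier {
            self.first_error = Some((message, line));
        }
    }
}

/// Hypothetical stand-in for the status kept on the table context.
#[derive(Debug, Default)]
pub struct CopyStatus {
    files: HashMap<String, FileStatus>,
}

impl CopyStatus {
    /// Called once per deserialized chunk of a file.
    pub fn add_chunk(&mut self, file: &str, rows_loaded: u64, errors: Vec<(String, u64)>) {
        let entry = self.files.entry(file.to_string()).or_default();
        entry.rows_loaded += rows_loaded;
        for (message, line) in errors {
            entry.errors_seen += 1;
            entry.note_error(message, line);
        }
    }

    /// Fold in the status reported by another node before it shuts down.
    pub fn merge(&mut self, other: CopyStatus) {
        for (file, status) in other.files {
            let entry = self.files.entry(file).or_default();
            entry.rows_loaded += status.rows_loaded;
            entry.errors_seen += status.errors_seen;
            if let Some((message, line)) = status.first_error {
                entry.note_error(message, line);
            }
        }
    }

    /// Final rows, sorted by file path so the output is deterministic.
    pub fn into_sorted_rows(self) -> Vec<(String, FileStatus)> {
        let mut rows: Vec<_> = self.files.into_iter().collect();
        rows.sort_by(|a, b| a.0.cmp(&b.0));
        rows
    }
}
```

Sorting only at the end keeps the per-chunk update cheap while still giving tests a stable row order.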

tasks

  • text files: errors are not summarized by error code, because the current error code is not. first_error may be more useful.
  • COPY from Parquet reports status too (but no errors).
  • NDJSON: files are not split, so that the first error line can be reported.
  • sort results by path to ease tests
  • tests

todo but not in pr:


@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Aug 26, 2023
@youngsofun youngsofun marked this pull request as draft August 26, 2023 16:45
@youngsofun youngsofun force-pushed the files branch 11 times, most recently from 8d25efa to c5a89e0 Compare August 28, 2023 13:10
@youngsofun youngsofun marked this pull request as ready for review August 28, 2023 13:37
@youngsofun youngsofun force-pushed the files branch 3 times, most recently from 7fdcd15 to 54c36d5 Compare August 29, 2023 12:49
@youngsofun youngsofun merged commit 0ca9a3c into databendlabs:main Aug 29, 2023
52 checks passed
andylokandy pushed a commit to andylokandy/databend that referenced this pull request Nov 27, 2023
* feat: output of Copy.

* update tests.
Development

Successfully merging this pull request may close these issues.

feat: COPY INTO returns more status
3 participants