Deprecate Parquet Support for Bulk Data Export #3139

Closed
prb112 opened this issue Dec 21, 2021 · 0 comments

prb112 commented Dec 21, 2021

Is your feature request related to a problem? Please describe.

The BulkData feature includes an export to application/parquet.
This pulls in extra dependencies on Spark (and spark-sql), which cache the Parquet output and flush it through stocator-fs to S3/COS.

This feature is fairly brittle. Our recommendation is to export to ndjson and then transform the ndjson to Parquet.
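
For illustration only, here is a minimal sketch of the "export to ndjson, then transform" route as a post-processing step outside the server. It is not part of this project: the function names and the choice of columns are hypothetical, and the Parquet write assumes the third-party pyarrow package is installed. It only projects top-level fields, so nested FHIR elements would need their own flattening.

```python
import json
from typing import Any, Dict, Iterable, List


def read_ndjson(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse NDJSON: one JSON object per non-blank line."""
    return [json.loads(line) for line in lines if line.strip()]


def to_columns(records: List[Dict[str, Any]], fields: List[str]) -> Dict[str, List[Any]]:
    """Project the chosen top-level fields into column lists.

    A record that lacks a field contributes None to that column.
    """
    return {name: [rec.get(name) for rec in records] for name in fields}


def write_parquet(columns: Dict[str, List[Any]], path: str) -> None:
    """Write the columns to a Parquet file (assumes pyarrow is available)."""
    import pyarrow as pa
    import pyarrow.parquet as pq

    pq.write_table(pa.table(columns), path)


def convert(src: str, dst: str, fields: List[str]) -> None:
    """Convert one exported NDJSON file into one Parquet file."""
    with open(src, encoding="utf-8") as f:
        records = read_ndjson(f)
    write_parquet(to_columns(records, fields), dst)
```

A call such as `convert("Patient_1.ndjson", "Patient_1.parquet", ["resourceType", "id", "gender", "birthDate"])` would then run after the export finishes; the file names here are placeholders, not actual export output names.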

Describe the solution you'd like

  • Remove the Parquet feature.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Acceptance Criteria

  1. GIVEN [a precondition]
    AND [another precondition]
    WHEN [test step]
    AND [test step]
    THEN [verification step]
    AND [verification step]

Additional context
Add any other context or screenshots about the feature request here.

@prb112 prb112 added the enhancement and bulk-data labels Dec 21, 2021
@prb112 prb112 closed this as completed Jan 26, 2022