[C++][Parquet][CI] Seed corpus should also use bad data files #43703

Closed
pitrou opened this issue Aug 15, 2024 · 3 comments

pitrou commented Aug 15, 2024

Describe the enhancement requested

Currently our Parquet fuzzing seed corpus consists entirely of testing data files, see here:

# Add Parquet testing examples
cp ${ARROW_CPP}/submodules/parquet-testing/data/*.parquet ${CORPUS_DIR}

We should also add files from the bad_data directory in the testing repository.
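
A minimal sketch of what the extended corpus step could look like. It assumes the `bad_data` directory sits next to `data` inside the `parquet-testing` submodule and that its files also use the `.parquet` extension; the helper name `add_parquet_seeds` is hypothetical and not part of the existing script.

```shell
# Hypothetical helper: copy every *.parquet file from a source directory
# into the corpus directory. Directories with no matching files are skipped.
add_parquet_seeds() {
  src_dir="$1"
  corpus_dir="$2"
  mkdir -p "${corpus_dir}"
  for f in "${src_dir}"/*.parquet; do
    # An unmatched glob stays literal, so check that the file exists.
    [ -e "$f" ] || continue
    cp "$f" "${corpus_dir}/"
  done
}

# Assumed layout: data/ and bad_data/ are siblings in the submodule.
if [ -n "${ARROW_CPP:-}" ] && [ -n "${CORPUS_DIR:-}" ]; then
  # Add Parquet testing examples (valid files)
  add_parquet_seeds "${ARROW_CPP}/submodules/parquet-testing/data" "${CORPUS_DIR}"
  # Add known-bad Parquet files as additional seeds
  add_parquet_seeds "${ARROW_CPP}/submodules/parquet-testing/bad_data" "${CORPUS_DIR}"
fi
```

Both directories are copied into the same flat corpus directory, so a file in `bad_data` that shares a name with one in `data` would overwrite it; whether that matters depends on the actual file names in the testing repository.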

Component(s)

C++, Continuous Integration, Parquet


pitrou commented Aug 15, 2024

@mapleFU Would you like to take a look?


mapleFU commented Aug 15, 2024

I'd like to add this tonight when I'm back home.

pitrou pushed a commit that referenced this issue Aug 15, 2024
…esting (#43708)

### Rationale for this change

Introducing more bad_data for testing

### What changes are included in this PR?

* Upgrade parquet-testing
* Introduce more bad_data
* Update fuzz generation

### Are these changes tested?

They're tests :-)

### Are there any user-facing changes?

no

* GitHub Issue: #43703

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
pitrou added this to the 18.0.0 milestone Aug 15, 2024

pitrou commented Aug 15, 2024

Issue resolved by pull request 43708
#43708

pitrou closed this as completed Aug 15, 2024