Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

misc(parquet): IWYU and fix linking errors #11653

Closed
wants to merge 1 commit into from

Conversation

zuyu
Copy link
Contributor

@zuyu zuyu commented Nov 25, 2024

Fix the linking errors for PageReader.cpp.o in CI: https://github.com/facebookincubator/velox/actions/runs/11977566772/job/33395741582

  • IWYU: add the header velox/dwio/parquet/writer/arrow/LevelConversion.h for DefLevelsToBitmap, DefRepLevelsToList, and DefRepLevelsToBitmap

    switch (mode) {
    case LevelMode::kNulls:
    DefLevelsToBitmap(
    definitionLevels_.data() + begin, end - begin, info, &bits);
    break;
    case LevelMode::kList: {
    arrow::DefRepLevelsToList(
    definitionLevels_.data() + begin,
    repetitionLevels_.data() + begin,
    end - begin,
    info,
    &bits,
    lengths);
    // Convert offsets to lengths.
    for (auto i = 0; i < bits.values_read; ++i) {
    lengths[i] = lengths[i + 1] - lengths[i];
    }
    break;
    }
    case LevelMode::kStructOverLists: {
    DefRepLevelsToBitmap(
    definitionLevels_.data() + begin,
    repetitionLevels_.data() + begin,
    end - begin,
    info,
    &bits);
    break;
    }
    // Converts def_levels to validity bitmaps for non-list arrays and structs that
    // have at least one member that is not a list and has no list descendents. For
    // lists use DefRepLevelsToList and structs where all descendants contain a
    // list use DefRepLevelsToBitmap.
    void PARQUET_EXPORT DefLevelsToBitmap(
    const int16_t* def_levels,
    int64_t num_def_levels,
    LevelInfo level_info,
    ValidityBitmapInputOutput* output);
    // Reconstructs a validity bitmap and list offsets for a list arrays based on
    // def/rep levels. The first element of offsets will not be modified if
    // rep_levels starts with a new list. The first element of offsets will be used
    // when calculating the next offset. See documentation onf DefLevelsToBitmap
    // for when to use this method vs the other ones in this file for
    // reconstruction.
    //
    // Offsets must be sized to 1 + values_read_upper_bound.
    void PARQUET_EXPORT DefRepLevelsToList(
    const int16_t* def_levels,
    const int16_t* rep_levels,
    int64_t num_def_levels,
    LevelInfo level_info,
    ValidityBitmapInputOutput* output,
    int32_t* offsets);
    void PARQUET_EXPORT DefRepLevelsToList(
    const int16_t* def_levels,
    const int16_t* rep_levels,
    int64_t num_def_levels,
    LevelInfo level_info,
    ValidityBitmapInputOutput* output,
    int64_t* offsets);
    // Reconstructs a validity bitmap for a struct every member is a list or has
    // a list descendant. See documentation on DefLevelsToBitmap for when more
    // details on this method compared to the other ones defined above.
    void PARQUET_EXPORT DefRepLevelsToBitmap(
    const int16_t* def_levels,
    const int16_t* rep_levels,
    int64_t num_def_levels,
    LevelInfo level_info,
    ValidityBitmapInputOutput* output);

  • Link the library velox_dwio_arrow_parquet_writer_lib, which contains LevelConversion.cpp

    velox_add_library(
    velox_dwio_arrow_parquet_writer_lib
    ArrowSchema.cpp
    ArrowSchemaInternal.cpp
    ColumnWriter.cpp
    Encoding.cpp
    Encryption.cpp
    EncryptionInternal.cpp
    Exception.cpp
    FileEncryptorInternal.cpp
    FileDecryptorInternal.cpp
    FileWriter.cpp
    LevelComparison.cpp
    LevelComparisonAvx2.cpp
    LevelConversion.cpp

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 25, 2024
Copy link

netlify bot commented Nov 25, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 6150408
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/6744e1da86039b0008f3a6c8

@@ -20,6 +20,7 @@
#include "velox/dwio/common/BufferUtil.h"
#include "velox/dwio/common/ColumnVisitors.h"
#include "velox/dwio/parquet/thrift/ThriftTransport.h"
#include "velox/dwio/parquet/writer/arrow/LevelConversion.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If LevelConversion.h is used in both Reader and Writer, then maybe we should move it to common.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Or velox/dwio/parquet/arrow.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Or velox/dwio/parquet/arrow.

Let's put it in common. We will gradually remove arrow code in the future, so don't want another arrow folder.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Opened #11678. I started working on this.
I will open a PR today.

@zuyu zuyu requested a review from yingsu00 November 26, 2024 17:17
@zuyu zuyu closed this Nov 27, 2024
@zuyu zuyu deleted the parquet-linking-error branch November 27, 2024 18:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants