[feat](catalog)Support Pre-Execution Authentication for HMS Type Iceberg Catalog Operations. #43445

Merged
merged 5 commits into from
Nov 18, 2024


CalvinKirs
Member

@CalvinKirs CalvinKirs commented Nov 7, 2024

What problem does this PR solve?

Support Pre-Execution Authentication for HMS Type Iceberg Catalog Operations

Summary

This PR introduces a new utility class, PreExecutionAuthenticator, which is designed to ensure pre-execution authentication for HMS (Hive Metastore) type operations on Iceberg catalogs. This is especially useful in environments where secure access is required, such as Kerberos-based Hadoop ecosystems. By integrating PreExecutionAuthenticator, each relevant operation will undergo an authentication step prior to execution, maintaining security compliance.

Motivation

In environments utilizing an Iceberg catalog with an HMS backend, many operations may require authentication to access secure data or perform privileged tasks. Given that operations on HMS-type catalogs typically run within a Hadoop environment secured by Kerberos, ensuring each operation is executed within an authenticated context is essential. Previously, there was no standardized mechanism to enforce pre-execution authentication, which led to potential security gaps. This PR aims to address this issue by introducing an extensible authentication utility.

Key Changes

Addition of PreExecutionAuthenticator Utility Class

  • Provides a standard way to perform pre-execution authentication for tasks.
  • Leverages HadoopAuthenticator (when available) to execute tasks within a privileged context using doAs.
  • Supports execution with or without authentication, enabling flexibility for both secure and non-secure environments.

Integration with Iceberg Catalog Operations

  • All relevant HMS-type catalog operations will now use PreExecutionAuthenticator to perform pre-execution authentication.
  • Ensures that operations like createDb, dropDb, and other privileged tasks are executed only after authentication.

Extensible Design

  • PreExecutionAuthenticator is adaptable to other future authentication methods, if needed, beyond Hadoop and Kerberos.
  • CallableToPrivilegedExceptionActionAdapter class allows any Callable task to be executed within a PrivilegedExceptionAction, making it versatile for various task types.
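The key changes above can be illustrated with a minimal, self-contained sketch. It is not the code merged in this PR: the `Authenticator` interface is a hypothetical stand-in for HadoopAuthenticator (only a `doAs`-shaped method is assumed), `RecordingAuthenticator` is a test double invented for the example, and the `execute` method name is an assumption. Only the class names PreExecutionAuthenticator and CallableToPrivilegedExceptionActionAdapter come from the description.

```java
import java.security.PrivilegedExceptionAction;
import java.util.concurrent.Callable;

class PreExecutionAuthSketch {

    /** Hypothetical stand-in for HadoopAuthenticator; only the doAs shape is assumed. */
    interface Authenticator {
        <T> T doAs(PrivilegedExceptionAction<T> action) throws Exception;
    }

    /** Wraps a Callable so it can be passed where a PrivilegedExceptionAction is expected. */
    static final class CallableToPrivilegedExceptionActionAdapter<T>
            implements PrivilegedExceptionAction<T> {
        private final Callable<T> callable;

        CallableToPrivilegedExceptionActionAdapter(Callable<T> callable) {
            this.callable = callable;
        }

        @Override
        public T run() throws Exception {
            return callable.call();
        }
    }

    /** Runs a task inside doAs when an authenticator is configured, directly otherwise. */
    static final class PreExecutionAuthenticator {
        private final Authenticator authenticator; // null means a non-secure environment

        PreExecutionAuthenticator(Authenticator authenticator) {
            this.authenticator = authenticator;
        }

        <T> T execute(Callable<T> task) throws Exception {
            if (authenticator == null) {
                return task.call();
            }
            return authenticator.doAs(new CallableToPrivilegedExceptionActionAdapter<T>(task));
        }
    }

    /** Test double (an assumption of this sketch) that counts doAs invocations. */
    static final class RecordingAuthenticator implements Authenticator {
        int doAsCalls = 0;

        @Override
        public <T> T doAs(PrivilegedExceptionAction<T> action) throws Exception {
            doAsCalls++; // a real implementation would switch to the logged-in user here
            return action.run();
        }
    }

    public static void main(String[] args) throws Exception {
        RecordingAuthenticator auth = new RecordingAuthenticator();
        PreExecutionAuthenticator secured = new PreExecutionAuthenticator(auth);
        // Stand-in for a catalog operation such as createDb.
        String result = secured.execute(() -> "createDb(demo_db)");
        System.out.println(result + " ran with doAs calls = " + auth.doAsCalls);

        PreExecutionAuthenticator plain = new PreExecutionAuthenticator(null);
        System.out.println(plain.execute(() -> "dropDb(demo_db)") + " ran without authentication");
    }
}
```

Keeping the adapter separate from the authenticator means callers only ever hand over a plain `Callable`, so a different authentication mechanism could be swapped in behind `execute` without touching call sites.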

Check List (For Author)

  • Test

    • Manual test (add detailed scripts or steps below)
mysql> CREATE TABLE ha
    ->        (
    ->            vendor_id BIGINT,
    ->            trip_id BIGINT,
    ->            trip_distance FLOAT,
    ->            fare_amount DOUBLE,
    ->            store_and_fwd_flag STRING,
    ->            ts DATETIME
    ->        );
Query OK, 0 rows affected (2.08 sec)

mysql> show create table ha;
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                                                                                                                                              |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ha    | CREATE TABLE `ha` (
  `vendor_id` bigint NULL,
  `trip_id` bigint NULL,
  `trip_distance` float NULL,
  `fare_amount` double NULL,
  `store_and_fwd_flag` text NULL,
  `ts` datetimev2(6) NULL
) ENGINE=ICEBERG_EXTERNAL_TABLE
LOCATION 'xxxxx'
PROPERTIES (
  "doris.version" = "doris-2.1.6-rc04-67ee7f53e6",
  "write.parquet.compression-codec" = "zstd"
);

mysql>        INSERT INTO iceberg.ck_iceberg.ha
    ->        VALUES
    ->         (1, 1000371, 1.8, 15.32, 'N', '2024-01-01 9:15:23'),
    ->         (2, 1000372, 2.5, 22.15, 'N', '2024-01-02 12:10:11'),
    ->         (2, 1000373, 0.9, 9.01, 'N', '2024-01-01 3:25:15'),
    ->         (1, 1000374, 8.4, 42.13, 'Y', '2024-01-03 7:12:33');  
Query OK, 4 rows affected (5.10 sec)
{'status':'COMMITTED', 'txnId':'35030'}

mysql> select * from ha;
+-----------+---------+---------------+-------------+--------------------+----------------------------+
| vendor_id | trip_id | trip_distance | fare_amount | store_and_fwd_flag | ts                         |
+-----------+---------+---------------+-------------+--------------------+----------------------------+
|         1 | 1000371 |           1.8 |       15.32 | N                  | 2024-01-01 09:15:23.000000 |
|         2 | 1000372 |           2.5 |       22.15 | N                  | 2024-01-02 12:10:11.000000 |
|         2 | 1000373 |           0.9 |        9.01 | N                  | 2024-01-01 03:25:15.000000 |
|         1 | 1000374 |           8.4 |       42.13 | Y                  | 2024-01-03 07:12:33.000000 |
+-----------+---------+---------------+-------------+--------------------+----------------------------+
4 rows in set (1.20 sec)
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.
  • Release note

    None

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…erg Catalog Operations

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.


…erg Catalog Operations

…ype Iceberg Catalog Operations"

This reverts commit d90b608.
…a Common Interface

### Optimize Column-Level Permission Checks Using Table-Level Permissions:

Since having column-level permissions does not imply table-level permissions, but having table-level permissions does imply permissions on all columns within the table, we can streamline column permission checks. When checking column-level permissions, we can first check if the user has table-level permissions. If table-level permissions are granted, column-level checks become unnecessary. Only if table-level permissions are absent do we proceed with specific column-level permission checks.
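A minimal sketch of the table-first check described above, assuming a hypothetical in-memory `Grants` structure keyed by `db.table` and `db.table.column` strings. None of these names come from the Doris code base; they only illustrate the order of the checks.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ColumnPrivCheckSketch {

    /** Hypothetical grant store: keys are "db.table" and "db.table.column". */
    static final class Grants {
        final Set<String> tables = new HashSet<>();
        final Set<String> columns = new HashSet<>();
    }

    /** Returns true if the user may access every requested column. */
    static boolean checkColsPriv(Grants grants, String db, String table, Set<String> cols) {
        String tableKey = db + "." + table;
        // A table-level grant implies access to all columns, so the per-column loop is skipped.
        if (grants.tables.contains(tableKey)) {
            return true;
        }
        // No table-level grant: every requested column must be granted individually.
        for (String col : cols) {
            if (!grants.columns.contains(tableKey + "." + col)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Grants grants = new Grants();
        grants.tables.add("sales.orders");
        grants.columns.add("sales.users.id");

        Set<String> wanted = new HashSet<>(Arrays.asList("id", "email"));
        System.out.println(checkColsPriv(grants, "sales", "orders", wanted));
        System.out.println(checkColsPriv(grants, "sales", "users", wanted));
    }
}
```

In this sketch the first call is answered by the table-level grant alone, while the second falls through to the per-column loop and fails on the ungranted `email` column.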

### Global Permissions Shortcut

Global-level permissions typically grant full access across all operations. Therefore, to optimize permission checks, we can add an early check for global permissions. If the user has global permissions, they are authorized, and further permission checks at the database, table, or column levels are unnecessary, allowing us to return immediately.
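The early return can be sketched as follows. The `Grants` class and the `checkTblPriv` name are hypothetical, and the assumption that a database-level grant also covers its tables is made only for this illustration; the commit message itself only states the global shortcut.

```java
import java.util.HashSet;
import java.util.Set;

class GlobalPrivShortcutSketch {

    /** Hypothetical grant store for one user. */
    static final class Grants {
        boolean global = false;
        final Set<String> dbs = new HashSet<>();
        final Set<String> tables = new HashSet<>(); // keys are "db.table"
    }

    /** Checks access to a table, consulting the broadest scope first. */
    static boolean checkTblPriv(Grants grants, String db, String table) {
        // Global permission authorizes everything, so return before any narrower lookup.
        if (grants.global) {
            return true;
        }
        // Assumption of this sketch: a database-level grant covers its tables.
        if (grants.dbs.contains(db)) {
            return true;
        }
        return grants.tables.contains(db + "." + table);
    }

    public static void main(String[] args) {
        Grants admin = new Grants();
        admin.global = true;

        Grants analyst = new Grants();
        analyst.tables.add("sales.orders");

        System.out.println(checkTblPriv(admin, "any_db", "any_table"));
        System.out.println(checkTblPriv(analyst, "sales", "orders"));
        System.out.println(checkTblPriv(analyst, "sales", "users"));
    }
}
```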
@morningman
Contributor

run buildall

@CalvinKirs
Member Author

run buildall

@CalvinKirs CalvinKirs changed the title [feat](catalog)Support Pre-Execution Authentication for HMS Type Iceberg Catalog Operations [feat](catalog)Support Pre-Execution Authentication for HMS Type Iceberg Catalog Operations. Nov 13, 2024
@CalvinKirs CalvinKirs changed the title [feat](catalog)Support Pre-Execution Authentication for HMS Type Iceberg Catalog Operations. [feat](catalog)Support Pre-Execution Authentication for HMS Type Iceberg Catalog Operations. Nov 13, 2024
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 18, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@CalvinKirs CalvinKirs merged commit 9b983ca into apache:master Nov 18, 2024
28 of 30 checks passed
@CalvinKirs CalvinKirs deleted the master-auth-test-117 branch November 18, 2024 03:17
github-actions bot pushed a commit that referenced this pull request Nov 18, 2024
…erg Catalog Operations. (#43445)

github-actions bot pushed a commit that referenced this pull request Nov 18, 2024
…erg Catalog Operations. (#43445)

CalvinKirs added a commit that referenced this pull request Nov 18, 2024
…MS Type Iceberg Catalog Operations. #43445 (#44129)

Cherry-picked from #43445

Co-authored-by: Calvin Kirs <guoqiang@selectdb.com>
morningman pushed a commit that referenced this pull request Nov 18, 2024
…MS Type Iceberg Catalog Operations. #43445 (#44127)

Cherry-picked from #43445

Co-authored-by: Calvin Kirs <guoqiang@selectdb.com>
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.8-merged dev/3.0.3-merged p0_b reviewed
4 participants