Commit

[GOBBLIN-1832] Emit warning instead of failing job for retention of Hive Table Views (#3695)

Hive retention should not run on a view, since a view should not have access to delete the underlying data. Rather than failing the job, the version finder now logs a warning when the dataset is a view, because retention jobs may be configured to include both Hive tables and views. This lets us determine dynamically at runtime whether to skip retention for a given dataset, rather than statically allowlisting or denylisting tables in the configuration.

Co-authored-by: Urmi Mustafi <umustafi@linkedin.com>
umustafi and Urmi Mustafi authored May 11, 2023
1 parent 3a045cd commit 05c732c
Showing 1 changed file with 11 additions and 1 deletion.
@@ -18,11 +18,13 @@

 import java.io.IOException;
 import java.util.Collection;
+import java.util.Collections;
 import java.util.List;

 import lombok.extern.slf4j.Slf4j;

 import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.TableType;
 import org.apache.hadoop.hive.ql.metadata.Partition;

 import com.google.common.base.Function;
@@ -56,6 +58,8 @@ public Class<? extends DatasetVersion> versionClass() {
    * Calls {@link #getDatasetVersion(Partition)} for every {@link Partition} found.
    * <p>
    * Note: If an exception occurs while processing a partition, that partition will be ignored in the returned collection
+   * Also note that if the dataset passed is a view type, we will return an empty list even if the underlying table is
+   * partitioned.
    * </p>
    *
    * @throws IllegalArgumentException if <code>dataset</code> is not a {@link HiveDataset}. Or if {@link HiveDataset#getTable()}
@@ -69,7 +73,13 @@ public Collection<HiveDatasetVersion> findDatasetVersions(Dataset dataset) throw
     final HiveDataset hiveDataset = (HiveDataset) dataset;

     if (!hiveDataset.getTable().isPartitioned()) {
-      throw new IllegalArgumentException("HiveDatasetVersionFinder is only compatible with partitioned hive tables");
+      if (hiveDataset.getTable().getTableType() == TableType.VIRTUAL_VIEW) {
+        log.warn("Skipping processing a view type dataset: ", ((HiveDataset) dataset).getTable().getTableName());
+        return Collections.emptyList();
+      } else {
+        throw new IllegalArgumentException("HiveDatasetVersionFinder is only compatible with partitioned hive tables. "
+            + "This is a snapshot hive table.");
+      }
     }

     try (AutoReturnableObject<IMetaStoreClient> client = hiveDataset.getClientPool().getClient()) {
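
The control flow introduced by this change can be sketched in isolation as follows. This is a minimal, dependency-free illustration and not the Gobblin implementation: `ViewRetentionSketch`, `TableInfo`, and the local `TableType` enum are hypothetical stand-ins for the Hive and Gobblin types, and `java.util.logging` replaces SLF4J so the sketch compiles on its own. Note that with SLF4J, an argument is only rendered where the message contains a `{}` placeholder, so the sketch builds the message with the table name included.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.logging.Logger;

// Dependency-free sketch; every type here is a hypothetical stand-in, not Gobblin or Hive API.
class ViewRetentionSketch {
  private static final Logger LOG = Logger.getLogger(ViewRetentionSketch.class.getName());

  // Stand-in for org.apache.hadoop.hive.metastore.TableType.
  enum TableType { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW }

  // Stand-in for the table metadata that the version finder inspects.
  static final class TableInfo {
    final String name;
    final TableType type;
    final List<String> partitions;

    TableInfo(String name, TableType type, List<String> partitions) {
      this.name = name;
      this.type = type;
      this.partitions = partitions;
    }

    // Simplification: Hive decides this from partition keys, not from existing partitions.
    boolean isPartitioned() {
      return !partitions.isEmpty();
    }
  }

  // Views are skipped with a warning; other unpartitioned tables still fail fast.
  static List<String> findVersions(TableInfo table) {
    if (!table.isPartitioned()) {
      if (table.type == TableType.VIRTUAL_VIEW) {
        LOG.warning("Skipping processing a view type dataset: " + table.name);
        return Collections.emptyList();
      }
      throw new IllegalArgumentException("Only compatible with partitioned tables: " + table.name);
    }
    return table.partitions;
  }

  public static void main(String[] args) {
    TableInfo view = new TableInfo("db.some_view", TableType.VIRTUAL_VIEW,
        Collections.<String>emptyList());
    TableInfo table = new TableInfo("db.events", TableType.MANAGED_TABLE,
        Arrays.asList("2023-05-10", "2023-05-11"));
    System.out.println(findVersions(view).size());
    System.out.println(findVersions(table).size());
  }
}
```

Returning an empty collection for views means a retention job that covers both tables and views has nothing to delete for the view and can continue with the remaining datasets, while an unpartitioned non-view table still raises `IllegalArgumentException`.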