Support Nessie catalog #19
Comments
Looking forward to this feature so that we can conduct testing. |
Any update on supporting the Nessie catalog? |
@jbonofre might take it up after the Java 1.5.0 release. |
@ajantha-bhat Any rough idea about when this will be available? thanks! |
I would also like to know if it is estimated to be worked on soon, I'd find it very useful. Thx! |
Hi, we would like to contribute to this issue, is it possible? |
It looks like Nessie has announced REST catalog support. This would make a native Nessie integration redundant. |
ATM, Nessie has Iceberg REST API on |
Is there a release date? |
It might be best to talk about Nessie releases in the project's Zulip chat (the join link is on projectnessie.org) :) |
Nessie 0.90.2 and later support the Iceberg REST Catalog API. |
I think this issue can be considered fixed, thanks to Nessie's support for the REST Catalog API. |
I want to create Iceberg tables using pyiceberg and store them in a MinIO store. For this I have created Docker containers for the services nessie, minio, and dremio. This is what I currently do with Spark:

# DEFINE SENSITIVE VARIABLES
NESSIE_URI = "http://nessie:19120/api/v1"

conf = (

# Start Spark Session
spark = SparkSession.builder.config(conf=conf).getOrCreate()

# LOAD A CSV INTO AN SQL VIEW
csv_df = spark.read.format("csv").option("header", "true").load("../datasets/df_open_2023.csv")

# CREATE AN ICEBERG TABLE FROM THE SQL VIEW
spark.sql("CREATE TABLE IF NOT EXISTS nessie.df_open_2023 USING iceberg AS SELECT * FROM csv_open_2023").show()

# QUERY THE ICEBERG TABLE
spark.sql("SELECT * FROM nessie.df_open_2023 limit 10").show()

Please tell me how to do it with pyiceberg. |
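For readers with the same question, the Spark steps above (read a CSV, create a table, query it) could look roughly like this in PyIceberg. This is a minimal sketch, not a confirmed recipe from the thread: it assumes a `catalog` object has already been obtained from PyIceberg's `load_catalog`, that PyArrow is installed, and that a recent PyIceberg release is used in which `create_table` accepts a PyArrow schema. The namespace `demo` and the helper names are hypothetical.

```python
"""Sketch: write a CSV into an Iceberg table through an already-loaded PyIceberg catalog."""


def read_csv_as_arrow(csv_path):
    """Read a CSV file into a PyArrow table (the header row supplies column names)."""
    # Imported lazily so the other helpers can be used without PyArrow installed.
    import pyarrow.csv as pacsv

    return pacsv.read_csv(csv_path)


def create_and_append(catalog, namespace, table_name, arrow_table):
    """Create `namespace.table_name` from the Arrow schema, then append the rows."""
    identifier = f"{namespace}.{table_name}"
    # Both calls raise if the namespace or table already exists.
    catalog.create_namespace(namespace)
    table = catalog.create_table(identifier, schema=arrow_table.schema)
    table.append(arrow_table)
    return identifier


def preview(catalog, identifier, limit=10):
    """Return the first `limit` rows of the table as a PyArrow table."""
    return catalog.load_table(identifier).scan(limit=limit).to_arrow()


# Hypothetical usage, assuming `catalog` was loaded elsewhere:
#   rows = read_csv_as_arrow("../datasets/df_open_2023.csv")
#   ident = create_and_append(catalog, "demo", "df_open_2023", rows)
#   print(preview(catalog, ident))
```

Unlike the Spark version, there is no SQL view step: the Arrow table plays that role, and its schema is used to create the Iceberg table.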
Generally speaking, you use the REST catalog against a running Nessie server: |
iceberg-python/pyiceberg/catalog/rest.py Line 248 in c30e43a
however according to https://py.iceberg.apache.org/api/
|
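As an illustration of the REST-catalog approach described in this comment, the sketch below builds the properties for PyIceberg's `load_catalog`. It rests on assumptions not stated in the thread: that the Nessie server exposes its Iceberg REST endpoint under `/iceberg/` on port 19120 (as opposed to the `/api/v1` native endpoint used in the Spark snippet earlier), and that the host is named `nessie` as in the Docker setup described above. The helper names are hypothetical; check the Nessie documentation for the exact URI of your version.

```python
"""Sketch: point PyIceberg's REST catalog at a Nessie server."""


def nessie_rest_properties(host="nessie", port=19120):
    """Build catalog properties; the `/iceberg/` path is an assumption."""
    return {
        "type": "rest",
        "uri": f"http://{host}:{port}/iceberg/",
    }


def load_nessie_catalog(name="nessie", **extra_properties):
    """Load the catalog; requires PyIceberg and a reachable Nessie server."""
    # Imported lazily so building the properties does not require PyIceberg.
    from pyiceberg.catalog import load_catalog

    properties = {**nessie_rest_properties(), **extra_properties}
    return load_catalog(name, **properties)


# Hypothetical usage:
#   catalog = load_nessie_catalog()
#   print(catalog.list_namespaces())
```

Keeping the properties in a plain dictionary makes it easy to inspect what is actually sent to `load_catalog` when debugging connection or validation errors.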
I encountered an issue while using the load_catalog() method, where it shows the following error: To address this, I attempted to use load_rest("rest", <config_dict>), but I encountered a validation issue in the ConfigResponse model while working with the RestCatalog from PyIceberg. It seems that the defaults and overrides fields are required in the ConfigResponse model, but the Nessie REST API is not responding with these fields as expected. Even after passing them explicitly in the response, I am still getting a validation error. |
@cee-shubham I am having a similar issue. If someone has managed to load a Nessie catalog using pyiceberg's |
@sean-pasabi I was able to get pyiceberg working with REST catalog exposed by Nessie, at least as a proof of concept: https://github.com/edgarrmondragon/-learn-iceberg-nessie |
@edgarrmondragon I have a similar |
@edgarrmondragon I have followed your code, and while the namespace and table were successfully created and are visible in the MinIO bucket, I encountered an error when appending data to the table. The error is related to AWS access permissions, specifically an "ACCESS_DENIED" issue during a HeadObject operation. Below is the relevant error message: |
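The thread does not establish the cause of this error. One thing that may be worth checking is whether the PyIceberg client itself has the MinIO endpoint and credentials, since appending data writes files from the client side. The sketch below builds the S3 FileIO properties that PyIceberg documents (`s3.endpoint`, `s3.access-key-id`, `s3.secret-access-key`, `s3.region`). The endpoint `http://minio:9000`, the placeholder credentials, and the helper name are hypothetical.

```python
"""Sketch: S3 FileIO properties for a MinIO store, to merge into catalog properties."""


def minio_fileio_properties(endpoint, access_key, secret_key, region="us-east-1"):
    """Return PyIceberg FileIO properties for an S3-compatible store."""
    return {
        "s3.endpoint": endpoint,
        "s3.access-key-id": access_key,
        "s3.secret-access-key": secret_key,
        "s3.region": region,
    }


# Hypothetical usage: pass these alongside the REST catalog properties, e.g.
#   from pyiceberg.catalog import load_catalog
#   catalog = load_catalog(
#       "nessie",
#       type="rest",
#       uri="<Nessie Iceberg REST URI>",
#       **minio_fileio_properties("http://minio:9000", "<access-key>", "<secret-key>"),
#   )
```

If the error persists with these set, the permissions of the MinIO user on the bucket and the warehouse location configured on the Nessie side are the next things to look at.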
Hey @cee-shubham, did you mean @edgarrmondragon, because I haven't given any code? |
Feature Request / Improvement
PyIceberg has added support for the Glue catalog. We need support for the Nessie catalog too, just like the Hive, Glue, and REST catalogs.
Migrated from apache/iceberg#6414