[0.6.x] Backport PR #324 and #493 for fixing dead links in docs (#556)
* Github Action to check links in documentation (#324)

* add github action to check md link

* Only run under `mkdocs/**`

* ws

* make lint

---------

Co-authored-by: Fokko Driesprong <fokko@apache.org>

* Fix dead links in docs (#493)
Backport to 0.6.1

---------

Co-authored-by: Kevin Liu <kevinjqliu@users.noreply.github.com>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
3 people committed Mar 29, 2024
1 parent f65b5c8 commit 813adbe
Showing 4 changed files with 41 additions and 1 deletion.
16 changes: 16 additions & 0 deletions .github/workflows/check-md-link.yml
@@ -0,0 +1,16 @@
name: Check Markdown links

on:
  push:
    paths:
      - mkdocs/**
    branches:
      - 'main'
  pull_request:

jobs:
  markdown-link-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - uses: gaurav-nelson/github-action-markdown-link-check@v1
4 changes: 4 additions & 0 deletions mkdocs/docs/SUMMARY.md
@@ -17,6 +17,8 @@

<!-- prettier-ignore-start -->

<!-- markdown-link-check-disable -->

- [Getting started](index.md)
- [Configuration](configuration.md)
- [CLI](cli.md)
@@ -28,4 +30,6 @@
- [How to release](how-to-release.md)
- [Code Reference](reference/)

<!-- markdown-link-check-enable-->

<!-- prettier-ignore-end -->
20 changes: 20 additions & 0 deletions mkdocs/docs/configuration.md
@@ -81,6 +81,8 @@ For the FileIO there are several configuration options available:

### S3

<!-- markdown-link-check-disable -->

| Key | Example | Description |
| -------------------- | ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| s3.endpoint | https://10.0.19.25/ | Configure an alternative endpoint of the S3 service for the FileIO to access. This could be used to use S3FileIO with any s3-compatible object storage service that has a different endpoint, or access a private S3 endpoint in a virtual private cloud. |
@@ -91,17 +93,25 @@ For the FileIO there are several configuration options available:
| s3.proxy-uri | http://my.proxy.com:8080 | Configure the proxy server to be used by the FileIO. |
| s3.connect-timeout | 60.0 | Configure socket connection timeout, in seconds. |

<!-- markdown-link-check-enable-->
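For context, these FileIO properties are supplied as catalog properties. Below is a minimal sketch, assuming a catalog named `default` is already configured (e.g. in `.pyiceberg.yaml`) and using only the keys visible in the table above; all values are placeholders. The `hdfs.*`, `adlfs.*`, and `gcs.*` keys in the following sections are passed in the same way.

```python
from pyiceberg.catalog import load_catalog

# Placeholder values taken from the examples in the table above.
catalog = load_catalog(
    "default",
    **{
        "s3.endpoint": "https://10.0.19.25/",
        "s3.proxy-uri": "http://my.proxy.com:8080",
        "s3.connect-timeout": "60.0",
    },
)
```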

### HDFS

<!-- markdown-link-check-disable -->

| Key | Example | Description |
| -------------------- | ------------------- | ------------------------------------------------ |
| hdfs.host | https://10.0.19.25/ | Configure the HDFS host to connect to |
| hdfs.port | 9000 | Configure the HDFS port to connect to. |
| hdfs.user | user | Configure the HDFS username used for connection. |
| hdfs.kerberos_ticket | kerberos_ticket | Configure the path to the Kerberos ticket cache. |

<!-- markdown-link-check-enable-->

### Azure Data lake

<!-- markdown-link-check-disable -->

| Key | Example | Description |
| ----------------------- | ----------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| adlfs.connection-string | AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqF...;BlobEndpoint=http://localhost/ | A [connection string](https://learn.microsoft.com/en-us/azure/storage/common/storage-configure-connection-string). This could be used to use FileIO with any adlfs-compatible object storage service that has a different endpoint (like [azurite](https://github.com/azure/azurite)). |
@@ -112,8 +122,12 @@ For the FileIO there are several configuration options available:
| adlfs.client-id | ad667be4-b811-11ed-afa1-0242ac120002 | The client-id |
| adlfs.client-secret | oCA3R6P\*ka#oa1Sms2J74z... | The client-secret |

<!-- markdown-link-check-enable-->

### Google Cloud Storage

<!-- markdown-link-check-disable -->

| Key | Example | Description |
| -------------------------- | ------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| gcs.project-id | my-gcp-project | Configure Google Cloud Project for GCS FileIO. |
@@ -128,6 +142,8 @@ For the FileIO there are several configuration options available:
| gcs.default-location | US | Configure the default location where buckets are created, like 'US' or 'EUROPE-WEST3'. |
| gcs.version-aware | False | Configure whether to support object versioning on the GCS bucket. |

<!-- markdown-link-check-enable-->

## REST Catalog

```yaml
@@ -145,6 +161,8 @@ catalog:
cabundle: /absolute/path/to/cabundle.pem
```
<!-- markdown-link-check-disable -->
| Key | Example | Description |
| ---------------------- | ----------------------- | -------------------------------------------------------------------------------------------------- |
| uri | https://rest-catalog/ws | URI identifying the REST Server |
Expand All @@ -155,6 +173,8 @@ catalog:
| rest.signing-name | execute-api | The service signing name to use when SigV4 signing a request |
| rest.authorization-url | https://auth-service/cc | Authentication URL to use for client credentials authentication (default: uri + 'v1/oauth/tokens') |

<!-- markdown-link-check-enable-->
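The same settings can also be passed programmatically rather than through the YAML file. A minimal sketch, assuming a REST catalog named `default` and using only the `uri` key from the table above (the endpoint is a placeholder):

```python
from pyiceberg.catalog import load_catalog

# Placeholder endpoint; other rest.* keys from the table above
# can be passed the same way.
catalog = load_catalog("default", **{"uri": "https://rest-catalog/ws"})
```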

## SQL Catalog

The SQL catalog requires a database for its backend. PyIceberg supports PostgreSQL and SQLite through psycopg2. The database connection has to be configured using the `uri` property. See SQLAlchemy's [documentation for URL format](https://docs.sqlalchemy.org/en/20/core/engines.html#backend-specific-urls):
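A minimal sketch of the `uri` property in use, assuming a SQLite backend; the catalog name, database file, and warehouse path are placeholders:

```python
from pyiceberg.catalog.sql import SqlCatalog

# Database file and warehouse location are placeholders.
catalog = SqlCatalog(
    "default",
    **{
        "uri": "sqlite:////tmp/pyiceberg_catalog.db",
        "warehouse": "file:///tmp/warehouse",
    },
)
```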
2 changes: 1 addition & 1 deletion mkdocs/docs/index.md
@@ -61,7 +61,7 @@ You either need to install `s3fs`, `adlfs`, `gcsfs`, or `pyarrow` to be able to

## Connecting to a catalog

Iceberg leverages the [catalog to have one centralized place to organize the tables](https://iceberg.apache.org/catalog/). This can be a traditional Hive catalog to store your Iceberg tables next to the rest, a vendor solution like the AWS Glue catalog, or an implementation of Icebergs' own [REST protocol](https://github.com/apache/iceberg/tree/main/open-api). Checkout the [configuration](configuration.md) page to find all the configuration details.
Iceberg leverages the [catalog to have one centralized place to organize the tables](https://iceberg.apache.org/concepts/catalog/). This can be a traditional Hive catalog to store your Iceberg tables next to the rest, a vendor solution like the AWS Glue catalog, or an implementation of Icebergs' own [REST protocol](https://github.com/apache/iceberg/tree/main/open-api). Checkout the [configuration](configuration.md) page to find all the configuration details.

For the sake of demonstration, we'll configure the catalog to use the `SqlCatalog` implementation, which will store information in a local `sqlite` database. We'll also configure the catalog to store data files in the local filesystem instead of an object store. This should not be used in production due to the limited scalability.
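A minimal sketch of this connection step, assuming a catalog named `default` has already been configured as described on the configuration page:

```python
from pyiceberg.catalog import load_catalog

# Assumes a catalog named "default" is defined in ~/.pyiceberg.yaml
# (or via environment variables); any configured name works.
catalog = load_catalog("default")
```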

