add adls docs
avriiil authored and ion-elgreco committed Sep 27, 2024
1 parent 2498837 commit fa344f7
Showing 2 changed files with 59 additions and 0 deletions.
57 changes: 57 additions & 0 deletions docs/integrations/object-storage/adls.md
@@ -0,0 +1,57 @@
# Azure ADLS Storage Backend

`delta-rs` offers native support for using Microsoft Azure Data Lake Storage (ADLS) as an object storage backend.

You don’t need to install any extra dependencies to read/write Delta tables to ADLS with engines that use `delta-rs`. You do need to configure your ADLS access credentials correctly.

## Passing Credentials Explicitly

You can pass ADLS credentials to your query engine explicitly.

For Polars, you do this with the `storage_options` keyword, as demonstrated in the example below. This forwards your credentials to the `object_store` library that Polars uses for cloud storage access under the hood. Read the [`object_store` documentation](https://docs.rs/object_store/latest/object_store/azure/enum.AzureConfigKey.html#variants) for more information on defining specific credentials.
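To avoid hard-coding secrets in a script, the `storage_options` dictionary can be assembled from environment variables. The following is only a minimal sketch: the helper name `storage_options_from_env` and the environment variable names `AZURE_STORAGE_ACCOUNT_NAME` and `AZURE_STORAGE_ACCESS_KEY` are assumptions made for illustration, and the dictionary keys mirror the ones used in the examples in this document.

```python
import os
from typing import Mapping


def storage_options_from_env(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Build a storage_options dict from environment variables.

    The variable names below are illustrative assumptions; adjust them
    to whatever your deployment actually sets.
    """
    return {
        # Raises KeyError if a variable is missing, so a misconfigured
        # environment fails early instead of at write time.
        "ACCOUNT_NAME": env["AZURE_STORAGE_ACCOUNT_NAME"],
        "ACCESS_KEY": env["AZURE_STORAGE_ACCESS_KEY"],
    }


# Usage (requires both variables to be set in your shell):
# storage_options = storage_options_from_env()
```

The resulting dictionary can be passed as `storage_options` in the same way as a hand-written one.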

## Example: Write Delta table to ADLS with Polars

Using Polars, you can write a Delta table to ADLS directly like this:

```python
import polars as pl

df = pl.DataFrame({"foo": [1, 2, 3, 4, 5]})

# define container name (replace the placeholder with your own value)
container = "<container_name>"

# define credentials (replace the placeholders with your own values)
storage_options = {
    "ACCOUNT_NAME": "<account_name>",
    "ACCESS_KEY": "<access_key>",
}

# write Delta to ADLS
df.write_delta(
    f"abfs://{container}/delta_table",
    storage_options=storage_options,
)
```

## Example with pandas

For libraries without a direct `write_delta` method (such as pandas), you can use the `write_deltalake` function from the `deltalake` library:

```python
import pandas as pd
from deltalake import write_deltalake

df = pd.DataFrame({"foo": [1, 2, 3, 4, 5]})

# `container` and `storage_options` are defined as in the Polars example above
write_deltalake(
    f"abfs://{container}/delta_table_pandas",
    df,
    storage_options=storage_options,
)
```
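Reading a table back works along the same lines. The sketch below assumes the table written in the pandas example exists and that the same `container` and `storage_options` values are available; the helper names `table_uri` and `read_delta_as_pandas` are hypothetical and exist only for this illustration.

```python
def table_uri(container: str, table: str) -> str:
    """Build an abfs:// URI in the same shape as the write examples."""
    return f"abfs://{container}/{table}"


def read_delta_as_pandas(container: str, storage_options: dict, table: str = "delta_table_pandas"):
    """Load a Delta table from ADLS into a pandas DataFrame."""
    # Imported lazily so the URI helper can be used without deltalake installed.
    from deltalake import DeltaTable

    dt = DeltaTable(table_uri(container, table), storage_options=storage_options)
    return dt.to_pandas()


# Usage (needs valid credentials and an existing table):
# df = read_delta_as_pandas(container, storage_options)
```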

## Using Local Authentication

If your local session is authenticated using the Azure CLI then you can write Delta tables directly to ADLS. Read more about this in the [Azure CLI documentation](https://learn.microsoft.com/en-us/cli/azure/).
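As a rough sketch of what this could look like: the `object_store` configuration keys include a `use_azure_cli` option, which can be passed through `storage_options` in place of an access key. Whether this flag is required in your setup, and whether the account name must be given explicitly, are assumptions here; the helper names `azure_cli_storage_options` and `write_with_cli_auth` are hypothetical.

```python
def azure_cli_storage_options(account_name: str) -> dict[str, str]:
    """Storage options that defer authentication to a logged-in Azure CLI session."""
    return {
        "account_name": account_name,
        # Assumption: ask object_store to obtain tokens via `az login` credentials.
        "use_azure_cli": "true",
    }


def write_with_cli_auth(df, container: str, account_name: str) -> None:
    """Write a pandas DataFrame to ADLS without passing an access key."""
    # Imported lazily so the options helper works without deltalake installed.
    from deltalake import write_deltalake

    write_deltalake(
        f"abfs://{container}/delta_table",
        df,
        storage_options=azure_cli_storage_options(account_name),
    )


# Usage (after running `az login` in the same session):
# write_with_cli_auth(df, "<container_name>", "<account_name>")
```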
2 changes: 2 additions & 0 deletions mkdocs.yml
@@ -82,6 +82,8 @@ nav:
- api/exceptions.md
- Integrations:
- Object Storage:
- integrations/object-storage/adls.md
- integrations/object-storage/gcs.md
- integrations/object-storage/hdfs.md
- integrations/object-storage/s3.md
- integrations/object-storage/s3-like.md