update docs, also parse from env var
nameexhaustion committed Jul 18, 2024
1 parent 075d1a2 commit fdb89c2
Showing 4 changed files with 30 additions and 13 deletions.
6 changes: 6 additions & 0 deletions crates/polars-io/src/cloud/options.rs
@@ -476,6 +476,12 @@ impl CloudOptions {
    {
        let mut this = Self::default();

        if let Ok(v) = std::env::var("HF_TOKEN") {
            this.config = Some(CloudConfig::Http {
                headers: vec![("Authorization".into(), format!("Bearer {}", v))],
            })
        }

        for (i, (k, v)) in config.into_iter().enumerate() {
            let (k, v) = (k.as_ref(), v.into());

8 changes: 5 additions & 3 deletions py-polars/polars/io/csv/functions.py
@@ -962,7 +962,9 @@ def scan_csv(
Parameters
----------
source
Path to a file.
Path(s) to a file or directory
When needing to authenticate for scanning cloud locations, see the `storage_options`
parameter.
has_header
Indicate if the first row of the dataset is a header or not. If set to False,
column names will be autogenerated in the following format: `column_x`, with
@@ -1054,15 +1056,15 @@ def scan_csv(
Expand path given via globbing rules.
storage_options
Options that indicate how to connect to a cloud provider.
If the cloud provider is not supported by Polars, the storage options
are passed to `fsspec.open()`.
The cloud providers currently supported are AWS, GCP, and Azure.
See supported keys here:
* `aws <https://docs.rs/object_store/latest/object_store/aws/enum.AmazonS3ConfigKey.html>`_
* `gcp <https://docs.rs/object_store/latest/object_store/gcp/enum.GoogleConfigKey.html>`_
* `azure <https://docs.rs/object_store/latest/object_store/azure/enum.AzureConfigKey.html>`_
* Hugging Face (`hf://`): Accepts an API key under the `token` parameter: \
`{'token': '...'}`, or by setting the `HF_TOKEN` environment variable.
If `storage_options` is not provided, Polars will try to infer the information
from environment variables.
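The docstring text above describes two ways to authenticate against `hf://` paths: an explicit `{'token': '...'}` entry in `storage_options`, or the `HF_TOKEN` environment variable. A minimal sketch of how a caller might choose between them follows. The helper names and the dataset path are hypothetical placeholders, and the `scan_csv` call assumes a Polars build that includes this change.

```python
def hf_storage_options(token=None):
    """Return storage_options for an hf:// scan (hypothetical helper).

    With an explicit token, pass it under the documented `token` key.
    Without one, return None so that Polars falls back to environment
    variables such as HF_TOKEN.
    """
    if token is not None:
        return {"token": token}
    return None


def scan_hf_csv(path, token=None):
    """Lazily scan a CSV on Hugging Face (assumes hf:// support in Polars)."""
    import polars as pl  # imported here so the helper above works without Polars

    return pl.scan_csv(path, storage_options=hf_storage_options(token))


# Hypothetical usage; the path is a placeholder, not a real dataset:
# lf = scan_hf_csv("hf://datasets/<user>/<repo>/data.csv", token="...")
```

Returning `None` rather than an empty dict keeps the no-token case identical to not passing `storage_options` at all.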
6 changes: 5 additions & 1 deletion py-polars/polars/io/ipc/functions.py
@@ -333,7 +333,9 @@ def scan_ipc(
Parameters
----------
source
Path to a IPC file.
Path(s) to a file or directory
When needing to authenticate for scanning cloud locations, see the `storage_options`
parameter.
n_rows
Stop reading from IPC file after reading `n_rows`.
cache
@@ -354,6 +356,8 @@ def scan_ipc(
* `aws <https://docs.rs/object_store/latest/object_store/aws/enum.AmazonS3ConfigKey.html>`_
* `gcp <https://docs.rs/object_store/latest/object_store/gcp/enum.GoogleConfigKey.html>`_
* `azure <https://docs.rs/object_store/latest/object_store/azure/enum.AzureConfigKey.html>`_
* Hugging Face (`hf://`): Accepts an API key under the `token` parameter: \
`{'token': '...'}`, or by setting the `HF_TOKEN` environment variable.
If `storage_options` is not provided, Polars will try to infer the information
from environment variables.
23 changes: 14 additions & 9 deletions py-polars/polars/io/parquet/functions.py
@@ -59,12 +59,14 @@ def read_parquet(
Parameters
----------
source
Path to a file or a file-like object (by "file-like object" we refer to objects
Path(s) to a file or directory
When needing to authenticate for scanning cloud locations, see the `storage_options`
parameter.
File-like objects are supported (by "file-like object" we refer to objects
that have a `read()` method, such as a file handler like the builtin `open`
function, or a `BytesIO` instance). If the path is a directory, files in that
directory will all be read.
For file-like objects,
stream position may not be updated accordingly after reading.
function, or a `BytesIO` instance). For file-like objects, stream position
may not be updated accordingly after reading.
columns
Columns to select. Accepts a list of column indices (starting at zero) or a list
of column names.
@@ -106,15 +108,15 @@ def read_parquet(
Reduce memory pressure at the expense of performance.
storage_options
Options that indicate how to connect to a cloud provider.
If the cloud provider is not supported by Polars, the storage options
are passed to `fsspec.open()`.
The cloud providers currently supported are AWS, GCP, and Azure.
See supported keys here:
* `aws <https://docs.rs/object_store/latest/object_store/aws/enum.AmazonS3ConfigKey.html>`_
* `gcp <https://docs.rs/object_store/latest/object_store/gcp/enum.GoogleConfigKey.html>`_
* `azure <https://docs.rs/object_store/latest/object_store/azure/enum.AzureConfigKey.html>`_
* Hugging Face (`hf://`): Accepts an API key under the `token` parameter: \
`{'token': '...'}`, or by setting the `HF_TOKEN` environment variable.
If `storage_options` is not provided, Polars will try to infer the information
from environment variables.
@@ -320,8 +322,9 @@ def scan_parquet(
Parameters
----------
source
Path(s) to a file
If a single path is given, it can be a globbing pattern.
Path(s) to a file or directory
When needing to authenticate for scanning cloud locations, see the `storage_options`
parameter.
n_rows
Stop reading from parquet file after reading `n_rows`.
row_index_name
@@ -365,6 +368,8 @@ def scan_parquet(
* `aws <https://docs.rs/object_store/latest/object_store/aws/enum.AmazonS3ConfigKey.html>`_
* `gcp <https://docs.rs/object_store/latest/object_store/gcp/enum.GoogleConfigKey.html>`_
* `azure <https://docs.rs/object_store/latest/object_store/azure/enum.AzureConfigKey.html>`_
* Hugging Face (`hf://`): Accepts an API key under the `token` parameter: \
`{'token': '...'}`, or by setting the `HF_TOKEN` environment variable.
If `storage_options` is not provided, Polars will try to infer the information
from environment variables.
