Add docs about using dtype #1903

Merged
merged 12 commits into from
May 12, 2023
14 changes: 13 additions & 1 deletion python-sdk/docs/astro/sql/operators/load_file.rst
@@ -118,8 +118,20 @@ Parameters to use when loading a file to a Pandas dataframe
:start-after: [START load_file_example_6]
:end-before: [END load_file_example_6]

#. **load_options** - :ref:`load_options`
#. **load_options** - Use :ref:`load_options` to configure how the SDK loads data from your file to the dataframe.

.. note::

    Depending on the file type you provide, the Astro SDK uses `read_csv <https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html#pandas-read-csv>`_, `read_json <https://pandas.pydata.org/docs/reference/api/pandas.read_json.html>`_, or `read_parquet <https://pandas.pydata.org/docs/reference/api/pandas.read_parquet.html>`_ to parse your file and load it into a dataframe. If the Astro SDK fails to automatically load data from your file into a dataframe, configure `dtype <https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html>`_ in :ref:`load_options` to manually specify the schema for the dataframe.
    For example, to load data from a ``.csv`` file with two columns, ``id`` and ``name``, add the following to your code:

    ```
    dataframe = load_file(
        input_file=File(path),
        use_native_support=False,
        load_options=[PandasLoadOptions(dtype={'id': int, 'name': str})],
    )
    ```
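
To see why an explicit ``dtype`` can matter, the following minimal sketch calls ``pandas.read_csv`` directly, outside the Astro SDK. It assumes the ``dtype`` mapping reaches ``read_csv`` unchanged; the CSV content and the variable names are hypothetical and exist only for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content: the ``id`` values carry leading zeros.
csv_text = "id,name\n007,Ana\n042,Bo\n"

# Without dtype, pandas infers ``id`` as an integer column and drops the zeros.
inferred = pd.read_csv(io.StringIO(csv_text))

# With dtype, each column's type is fixed up front instead of being inferred.
typed = pd.read_csv(io.StringIO(csv_text), dtype={"id": str, "name": str})

print(inferred["id"].tolist())  # [7, 42]
print(typed["id"].tolist())     # ['007', '042']
```

Declaring ``id`` as ``str`` here keeps the values exactly as written in the file, which is the kind of schema control that ``dtype`` in :ref:`load_options` is meant to provide.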

Parameters for native transfer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~