kedro.io.CSVS3DataSet does not use load_args #2

Lucianois · 2019-05-20T16:58:10Z

Description

When loading a dataframe from S3, the default arguments are used instead of the ones configured under load_args in catalog.yml.

Context

Trying to load a CSV from S3 with custom load_args.

Steps to Reproduce

Add the following entry to catalog.yml:
XXX.csv.s3:
type: CSVS3DataSet # https://kedro.readthedocs.io/en/latest/kedro.io.CSVS3DataSet.html
load_args: # https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
sep: '\t'
encoding: "ISO-8859-1"

Create a pipeline to display the data in XXX.csv.s3, for example:

    node(
        read_display,
        ["XXX.csv.s3"],
        None
    )

kedro run
Check if data is being displayed correctly.
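
The issue does not show `read_display`. A minimal sketch of what such a node function might look like, assuming it receives the loaded pandas DataFrame and only prints it (the body is hypothetical, not taken from the reporter's project):

```python
import pandas as pd


def read_display(df: pd.DataFrame) -> None:
    """Hypothetical node function: print a preview and return nothing."""
    # The column names show at a glance whether the separator was applied.
    print(list(df.columns))
    print(df.head())
```

Because the node declares `None` as its output, the function returns nothing; a single fused column name in the printed list would suggest the tab separator was ignored.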

Expected Result

Data should be split by tab

Actual Result

Data is loaded without custom sep.
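
To make the difference concrete, here is a small sketch using `pandas.read_csv` directly on an in-memory sample. The sample text and variable names are made up for illustration; S3 and Kedro are not involved.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample standing in for the file on S3.
RAW = "id\tname\n1\talpha\n2\tbeta\n"

# Default arguments: pandas splits on commas, so each row stays one column.
default_df = pd.read_csv(io.StringIO(RAW))

# With the separator from load_args applied, the rows split on tabs.
load_args = {"sep": "\t"}
tabbed_df = pd.read_csv(io.StringIO(RAW), **load_args)

print(default_df.shape)
print(tabbed_df.shape)
```

The first frame has a single column whose name contains the tab character, which matches the "loaded without custom sep" symptom; the second has separate `id` and `name` columns.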

Your Environment

Kedro version used: v0.14
Python version used: 3.6.8
Operating system and version: MacOSX 10.14.5

This does not happen in Kedro v0.13.1.dev53.

Checklist

Include labels so that we can categorise your issue

Add a "Component" label to the issue
Add a "Priority" label to the issue

Pet3ris · 2019-05-21T21:00:50Z

Hey @Lucianois - you can submit code snippets using "```" like so.

From the example you have submitted, it's not clear whether you applied the correct YAML formatting. For instance, sep should be nested under load_args.

Lucianois · 2019-05-21T21:14:46Z

    type: CSVS3DataSet # https://kedro.readthedocs.io/en/latest/kedro.io.CSVS3DataSet.html
    load_args: # https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
        sep: '\t'
        encoding: "ISO-8859-1"
    credentials: prod_s3
    bucket_name: bucket1
    filepath: path-to-csv

@tsanikgr already identified the problem.

yetudada · 2019-05-21T21:20:41Z

Thanks for the comment @Pet3ris! And, @Lucianois, thank you so much for submitting this issue and for the updated comment. We have a fix for this that we're about to push through.
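
The fix itself is not shown in this thread. As a rough illustration of the behaviour the issue expects, the sketch below uses a hypothetical `InMemoryCSVDataSet` class (not Kedro's `CSVS3DataSet`, and not its actual implementation) that stores `load_args` and forwards them to `pandas.read_csv` on load:

```python
import io
from typing import Any, Dict, Optional

import pandas as pd


class InMemoryCSVDataSet:
    """Hypothetical stand-in for a CSV dataset; reads from a string, not S3."""

    def __init__(self, raw: str, load_args: Optional[Dict[str, Any]] = None) -> None:
        self._raw = raw
        # Copy so later changes to the caller's dict do not leak in.
        self._load_args = dict(load_args or {})

    def load(self) -> pd.DataFrame:
        # The key step: the configured arguments reach the reader.
        return pd.read_csv(io.StringIO(self._raw), **self._load_args)


dataset = InMemoryCSVDataSet("a\tb\n1\t2\n", load_args={"sep": "\t"})
frame = dataset.load()
print(list(frame.columns))
```

If `load()` called `pd.read_csv` without unpacking `self._load_args`, the data would load with pandas defaults, which is the symptom reported above.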

tsanikgr · 2019-05-26T12:31:10Z

Fixed

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

tsanikgr closed this as completed May 26, 2019

amandakys mentioned this issue Mar 3, 2023

Introducing Utility Modules to Kedro #2388

Closed

This was referenced Aug 3, 2023

Support versioning of the underlying dataset with PartitionedDataset #521

Closed

Support versioning of PartitionedDataset classes #2857

Closed

datajoely mentioned this issue Aug 18, 2023

Easier CustomDataset Creation #1936

Open

deepyaman mentioned this issue Aug 30, 2023

Vizro Discussion #2898

Closed

merelcht added a commit that referenced this issue Jan 11, 2024

Fix mypy strict issues #2 (#3497)

5d021ea

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
