Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use columns defined for COPY to Redshift so we never rely on order #2245

Merged
merged 3 commits into from
Oct 3, 2017

Conversation

jamesmcm
Copy link
Contributor

@jamesmcm jamesmcm commented Oct 2, 2017

Description

See issue #2244

Motivation and Context

Right now COPYs to Redshift are dependent upon the column ordering which can break imports.

Have you tested this? If so, how?

I've tested it locally with our columns defined.

Copy link
Collaborator

@dlstadther dlstadther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting!

I have listed one code refactor request.

source=f,
creds=self._credentials(),
options=self.copy_options)
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you could do something more like below to reduce duplicate code

colnames = ''
if len(self.columns) > 0:
    colnames = ",".join(x[0] for x in self.columns])
cursor.execute("""\
    copy {table} {cols} from '{source}'
    credentials '{creds}'
    {options}
    ;""".format(
        table=self.table,
        colnames='({})'.format(colnames),
        source=f,
        creds=self._credentials(),
        options=self.copy_options
    )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed :)

@dlstadther
Copy link
Collaborator

I'm going to pull this branch down and verify its compatibility with my ETLs (as i'm not familiar with the data mapping options with COPY). Will report back soon

@jamesmcm
Copy link
Contributor Author

jamesmcm commented Oct 2, 2017

Thanks, I don't think it should break anything, unless you have defined the columns there but are reliant on it importing them out of order, and I can't really think of any use case where that is desirable.

@dlstadther dlstadther merged commit 9cbf188 into spotify:master Oct 3, 2017
@dlstadther
Copy link
Collaborator

Thanks @jamesmcm !

This was referenced Jun 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants