-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use columns defined for COPY to Redshift so we never rely on order #2245
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for submitting!
I have listed one code refactor request.
luigi/contrib/redshift.py
Outdated
source=f, | ||
creds=self._credentials(), | ||
options=self.copy_options) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps you could do something more like below to reduce duplicate code
colnames = ''
if len(self.columns) > 0:
colnames = ",".join(x[0] for x in self.columns])
cursor.execute("""\
copy {table} {cols} from '{source}'
credentials '{creds}'
{options}
;""".format(
table=self.table,
colnames='({})'.format(colnames),
source=f,
creds=self._credentials(),
options=self.copy_options
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed :)
I'm going to pull this branch down and verify its compatibility with my ETLs (as i'm not familiar with the data mapping options with COPY). Will report back soon |
Thanks, I don't think it should break anything, unless you have defined the columns there but are reliant on it importing them out of order, and I can't really think of any use case where that is desirable. |
Thanks @jamesmcm ! |
Description
See issue #2244
Motivation and Context
Right now COPYs to Redshift are dependent upon the column ordering which can break imports.
Have you tested this? If so, how?
I've tested it locally with our columns defined.