prevent templated field logic checks in operators __init__ in BigQueryToPostgresOperator operator #36491
@@ -36,8 +34,6 @@ class BigQueryToPostgresOperator(BigQueryToSqlBaseOperator):
    :param postgres_conn_id: Reference to :ref:`postgres connection id <howto/connection:postgres>`.
    """

    template_fields: Sequence[str] = (*BigQueryToSqlBaseOperator.template_fields, "dataset_id", "table_id")
Isn't this a breaking change?
I think the real issue is with:
airflow/airflow/providers/google/cloud/transfers/bigquery_to_sql.py
Lines 96 to 99 in 6802d41
try:
    self.dataset_id, self.table_id = dataset_table.split(".")
except ValueError:
    raise ValueError(f"Could not parse {dataset_table} as <dataset>.<table>") from None
and it will affect all operators that inherit from the base class.
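A minimal sketch of why parsing a templated value in the constructor is fragile, using plain Python classes rather than the real Airflow operators. The names `EagerOperator`, `DeferredOperator`, `render`, and `eager_fails` are hypothetical, and `render` is only a toy substitute for template rendering. The point is that `__init__` sees the raw template string, not the rendered value.

```python
class EagerOperator:
    """Hypothetical stand-in: parses dataset_table in __init__, before rendering."""

    def __init__(self, dataset_table: str):
        try:
            self.dataset_id, self.table_id = dataset_table.split(".")
        except ValueError:
            raise ValueError(f"Could not parse {dataset_table} as <dataset>.<table>") from None


class DeferredOperator:
    """Hypothetical stand-in: stores the raw value and parses it at execute time."""

    template_fields = ("dataset_table",)

    def __init__(self, dataset_table: str):
        # No logic on the templated value here; it may still be a template.
        self.dataset_table = dataset_table

    def render(self, context: dict) -> None:
        # Toy replacement for template rendering, assumed for illustration only.
        for key, value in context.items():
            self.dataset_table = self.dataset_table.replace("{{ " + key + " }}", value)

    def execute(self) -> tuple:
        # By now the value has been rendered, so splitting is meaningful.
        dataset_id, table_id = self.dataset_table.split(".")
        return dataset_id, table_id


def eager_fails(raw: str) -> bool:
    """Return True if constructing EagerOperator with this raw value raises."""
    try:
        EagerOperator(raw)
    except ValueError:
        return True
    return False
```

Under these assumptions, a template without a dot makes the eager constructor raise, while a template that happens to contain exactly one dot (such as `{{ params.table }}`) is split into two meaningless halves without any error. Deferring the split until after rendering avoids both cases.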
Come to think of it, it might indeed be breaking, as fields that don't exist in the parent's template_fields are removed from the child's definition.
I suggest reverting it for now.
@romsharon98 instead of deleting this line, try to hardcode all of the values that should be templated and see if it works (a bit ugly, but I don't have a better idea for now).
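The inheritance concern can be illustrated with plain Python class attributes. In this sketch, `Base`, `ChildWithOverride`, `ChildWithoutOverride`, and `lost_fields` are hypothetical stand-ins, and the field names in `Base.template_fields` are placeholders rather than the real base-class values.

```python
from typing import Sequence


class Base:
    # Placeholder field names; the real base class's list may differ.
    template_fields: Sequence[str] = ("target_table_name", "impersonation_chain")


class ChildWithOverride(Base):
    # Mirrors the line under discussion: extend the parent's fields.
    template_fields: Sequence[str] = (*Base.template_fields, "dataset_id", "table_id")


class ChildWithoutOverride(Base):
    # With the override deleted, attribute lookup falls back to Base.template_fields.
    pass


def lost_fields() -> set:
    """Fields that stop being templated once the child's override is removed."""
    return set(ChildWithOverride.template_fields) - set(ChildWithoutOverride.template_fields)
```

Here `lost_fields()` contains exactly the two names that only the child declared, which is the behaviour change being debated.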
Thanks for your insight, @eladkal!
Can you help me understand why this is a breaking change?
This is how I understand it:
Let's assume I revert the PR, so both "dataset_id" and "table_id" are templated fields for BigQueryToPostgresOperator.
But the parent constructor (BigQueryToSqlBaseOperator) always overwrites them in the lines you mentioned.
So, as I understand it, the reverted line has no effect.
SCRATCH_THAT:
I think we should not treat it as breaking (or at least that is how I understand it here).
I think the only scenario where it would matter is:
- Someone creates a custom operator derived from BigQueryToPostgresOperator
- The same someone adds new fields there "dataset_id" and "table_id"
- And expects them to be templated.
Even if it worked previously, that was accidental and unintended, and we should treat this change as a bug fix. If someone adds new fields in a derived operator, it is their responsibility to add those fields to the templated fields.
While this change might technically break someone's implementation, IMHO we should treat it as a bug fix because:
a) there is a very low chance this will happen
b) while we are technically breaking something, we are bringing things back to how they were intended to work. Having those fields in this operator was accidental, not intentional
I will repeat it until it sticks - SemVer and the "breaking" classification are not about whether something is "technically" broken but about whether our intentions changed. If we applied the "breaking change" label to every change that alters behaviour, then pretty much every single bug fix would be "technically" breaking, because it changes behaviour.
UPDATE: I just realized I missed the parent class. Let me revise it.
@eladkal is right - but we should not revert it - instead we should add those two fields to the base class.
Sounds good to me.
A note from the technical perspective of the validation pre-commit:
As the validation is currently based on very simplified AST parsing, it would be better for now to define the fields directly (i.e., template_fields = ['a', 'b']) rather than relying on the parent's fields (i.e., template_fields = (*ParentClass.template_fields, 'b')); otherwise the validation might fail.
The cost would be a minor abuse of inheritance, which can be fixed later.
Correct. This is really what I also proposed - to move template_fields = ['dataset_id', 'table_id'] to BigQueryToSqlBaseOperator. This is where they belong, and this is what will make them consistent with the AST check.
Followup PR #36663
This reverts commit f070efa.
related: #36484
fix BigQueryToPostgresOperator operator for this cherry-picking: #33786