Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎉 Google Ads improvement: Support user-specified queries #5302

Merged
merged 39 commits into from
Sep 10, 2021

Conversation

vovavovavovavova
Copy link
Contributor

@vovavovavovavova vovavovavovavova commented Aug 10, 2021

What

Closes #5165
Also this one #5873

How

This allow user to specify the custom GAQL queries as a base for report builder.
However, there are some limitations for now:

* if user wants stream to be incremental, he should not use literals WHERE and ORDER BY. Currently supported format is SELECT a, b,c,d,e ... FROM table. The constructions above should be available for full-refresh streams, but it is not recommended.
* we do not validate queries for now, as well do not check fields compatibility one with another. Those may be done by user manually using
query builder

Recommended reading order

  1. x.java
  2. y.python

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • docs/SUMMARY.md
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions
  • Connector added to connector index like described here

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions
  • Connector version bumped like described here

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here

Connector Generator

  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed.

@github-actions github-actions bot added the area/connectors Connector related issues label Aug 10, 2021
@vovavovavovavova vovavovavovavova changed the title Valdemar/#5165 custom query stream Google Ads improvement: Support user-specified queries Aug 10, 2021
@vovavovavovavova
Copy link
Contributor Author

vovavovavovavova commented Aug 10, 2021

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1117013432
❌ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1117013432

@jrhizor jrhizor temporarily deployed to more-secrets August 10, 2021 14:35 Inactive
@vovavovavovavova
Copy link
Contributor Author

vovavovavovavova commented Aug 10, 2021

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1117027074
❌ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1117027074

@jrhizor jrhizor temporarily deployed to more-secrets August 10, 2021 14:39 Inactive
@vovavovavovavova
Copy link
Contributor Author

vovavovavovavova commented Aug 10, 2021

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1117060572
✅ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1117060572

@jrhizor jrhizor temporarily deployed to more-secrets August 10, 2021 14:50 Inactive
@vovavovavovavova
Copy link
Contributor Author

vovavovavovavova commented Aug 10, 2021

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1117091261
✅ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1117091261

@jrhizor jrhizor temporarily deployed to more-secrets August 10, 2021 15:00 Inactive
@@ -69,6 +108,11 @@ def streams(self, config: Mapping[str, Any]) -> List[Stream]:
incremental_stream_config = dict(
api=google_api, conversion_window_days=config["conversion_window_days"], start_date=config["start_date"]
)

custom_query_streams = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to validate custom queries in the check_connection method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can use this for validating queries.

Copy link
Contributor Author

@vovavovavovavova vovavovavovavova Aug 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used regex in spec.json, since google query validator works on client (UI javascript) side


custom_query_streams = [
CustomQuery(custom_query_config=config["custom_query"][i], **incremental_stream_config)
for i in range(len(config.get("custom_query", [])))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need range here???

for stream in self.streams(config=config):
if not isinstance(stream, (CustomQueryFullRefresh, CustomQueryIncremental)):
streams.append(stream.as_airbyte_stream())
# TODO: extend with custom defined streams
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if you did this here actually, is this still a relevant comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a mark where the default discover logic was extended, will be removed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need to change it at all, just implement this in streams

if not isinstance(stream, (CustomQueryFullRefresh, CustomQueryIncremental)):
streams.append(stream.as_airbyte_stream())
# TODO: extend with custom defined streams
for usr_query in config.get("custom_query", []):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't you do this inside CustomQueryFullRefresh and CustomQueryIncremental classes ?

Copy link
Contributor Author

@vovavovavovavova vovavovavovavova Aug 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need this to be made in discover, to fill schemas (dynamically, it is very important)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what stops you from doing this in streams??? this is what get_json_schema for

@avida avida force-pushed the valdemar/#5165_custom_query_stream branch from d8994c9 to 93de86a Compare September 7, 2021 11:32
@jrhizor jrhizor temporarily deployed to more-secrets September 7, 2021 11:35 Inactive
@avida avida force-pushed the valdemar/#5165_custom_query_stream branch 2 times, most recently from ae86f71 to dde02b2 Compare September 7, 2021 11:42
@avida
Copy link
Contributor

avida commented Sep 7, 2021

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1209270261
❌ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1209270261
🐛 https://gradle.com/s/agdxktqyh53js

@jrhizor jrhizor temporarily deployed to more-secrets September 7, 2021 11:45 Inactive
@avida avida linked an issue Sep 7, 2021 that may be closed by this pull request
@avida avida force-pushed the valdemar/#5165_custom_query_stream branch from dde02b2 to dae0d6f Compare September 7, 2021 11:52
@avida
Copy link
Contributor

avida commented Sep 7, 2021

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1209298457
✅ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1209298457

@jrhizor jrhizor temporarily deployed to more-secrets September 7, 2021 11:55 Inactive
@avida avida force-pushed the valdemar/#5165_custom_query_stream branch from dae0d6f to 612ffc3 Compare September 7, 2021 12:05
Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great! one question on conflicting where clauses

@@ -67,6 +67,27 @@
"maximum": 1095,
"default": 14,
"examples": [14]
},
"custom_queries": {
"type": "array",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"type": "array",
"type": "array",
"title": "Custom GAQL Queries"

if google_data_type == "ENUM":
field_value = {"type": "string", "enum": list(node.enum_values)}
elif google_data_type == "MESSAGE": # this can be anything (or skip as additionalproperties) ?
output_type = ["string", "number", "array", "object", "boolean", "null"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's just make it a string and cast it as that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, but it took some time, also added test for stream that contains MESSAGE field.

result_query = query.replace(columns, new_columns)

# Modify/insert where condition
where_cond = CustomQuery.WHERE_EXPR.search(result_query)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a very cool approach, I like it. The only issue I see here is WHERE could contain a time-bound function like LAST_14_DAYS etc. then wouldn't it conflict with this incremental logic?

In this case I think we should not add the where condition and just default to what the user did

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case I've added checking if custom query contain "segments.date" field on check_connection method. It wont allow adding time-bound conditions to custom queries.

@sherifnada sherifnada self-requested a review September 9, 2021 04:34
@avida
Copy link
Contributor

avida commented Sep 9, 2021

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1216544888
✅ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1216544888

@jrhizor jrhizor temporarily deployed to more-secrets September 9, 2021 08:36 Inactive
@avida
Copy link
Contributor

avida commented Sep 9, 2021

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1216717428
✅ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1216717428

@jrhizor jrhizor temporarily deployed to more-secrets September 9, 2021 09:27 Inactive
@avida avida changed the title Google Ads improvement: Support user-specified queries 🎉 Google Ads improvement: Support user-specified queries Sep 9, 2021
@avida
Copy link
Contributor

avida commented Sep 10, 2021

/publish connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1220121286
✅ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/1220121286

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation
Projects
None yet
6 participants