Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connector Facebook-Marketing: update insights streams with custom entries for fields, breakdowns and action_breakdowns #4864

Closed
Show file tree
Hide file tree
Changes from 11 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
{
"properties": {
"action_device": {"type": ["null", "string"]},
"action_canvas_component_name": {"type": ["null", "string"]},
"action_carousel_card_id": {"type": ["null", "string"]},
"action_carousel_card_name": {"type": ["null", "string"]},
"action_destination": {"type": ["null", "string"]},
"action_reaction": {"type": ["null", "string"]},
"action_target_id": {"type": ["null", "string"]},
"action_type": {"type": ["null", "string"]},
"action_video_sound": {"type": ["null", "string"]},
"action_video_type": {"type": ["null", "string"]},
"ad_format_asset": {"type": ["null", "string"]},
"age": {"type": ["null", "string"]},
"app_id": {"type": ["null", "string"]},
"body_asset": {"type": ["null", "string"]},
"call_to_action_asset": {"type": ["null", "string"]},
"country": {"type": ["null", "string"]},
"description_asset": {"type": ["null", "string"]},
"device_platform": {"type": ["null", "string"]},
"dma": {"type": ["null", "string"]},
"frequency_value": {"type": ["null", "string"]},
"gender": {"type": ["null", "string"]},
"hourly_stats_aggregated_by_advertiser_time_zone": {"type": ["null", "string"]},
"hourly_stats_aggregated_by_audience_time_zone": {"type": ["null", "string"]},
"image_asset": {"type": ["null", "string"]},
"impression_device": {"type": ["null", "string"]},
"link_url_asset": {"type": ["null", "string"]},
"place_page_id": {"type": ["null", "string"]},
"platform_position": {"type": ["null", "string"]},
"product_id": {"type": ["null", "string"]},
"publisher_platform": {"type": ["null", "string"]},
"region": {"type": ["null", "string"]},
"skan_conversion_id": {"type": ["null", "string"]},
"title_asset": {"type": ["null", "string"]},
"video_asset": {"type": ["null", "string"]}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,22 @@
# Copyright (c) 2021 Airbyte, Inc., all rights reserved.
#

import json
from datetime import datetime
from typing import Any, List, Mapping, Optional, Tuple, Type, MutableMapping
from jsonschema import RefResolver

from airbyte_cdk.entrypoint import logger
from airbyte_cdk.logger import AirbyteLogger
from airbyte_cdk.models import AirbyteConnectionStatus, AuthSpecification, ConnectorSpecification, DestinationSyncMode, OAuth2Specification, Status

from typing import Any, List, Mapping, Optional, Tuple, Type

import pendulum
from airbyte_cdk.models import AuthSpecification, ConnectorSpecification, DestinationSyncMode, OAuth2Specification
from airbyte_cdk.sources import AbstractSource
from airbyte_cdk.sources.streams import Stream
from airbyte_cdk.sources.streams.core import package_name_from_class
from airbyte_cdk.sources.utils.schema_helpers import ResourceSchemaLoader
from pydantic import BaseModel, Field
from source_facebook_marketing.api import API
from source_facebook_marketing.streams import (
Expand All @@ -26,6 +35,17 @@
)


class InsightConfig(BaseModel):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vladimir-remar did you validate that the json output by spec works with the UI via the instructions here? Just checking because I don't remember if we support a list of objects

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sherifnada we do actually :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sherifnada thanks to @keu, I attached some images from UI
Screenshot from 2021-10-14 11-40-23
It will be filled like this
Screenshot from 2021-10-14 11-42-16
And it will look like this with two elements
Screenshot from 2021-10-14 11-42-50
and from destination side
Screenshot from 2021-10-14 11-44-12


name: str = Field(description="The name value of insight")

fields: Optional[List[str]] = Field(description="A list of chosen fields for fields parameter", default=[])

breakdowns: Optional[List[str]] = Field(description="A list of chosen breakdowns for breakdowns", default=[])

action_breakdowns: Optional[List[str]] = Field(description="A list of chosen action_breakdowns for action_breakdowns", default=[])


class ConnectorConfig(BaseModel):
class Config:
title = "Source Facebook Marketing"
Expand Down Expand Up @@ -65,6 +85,9 @@ class Config:
minimum=1,
maximum=30,
)
insights: Optional[List[InsightConfig]] = Field(
vladimir-remar marked this conversation as resolved.
Show resolved Hide resolved
description="A list wich contains insights entries, each entry must have a name and can contains fields, breakdowns or action_breakdowns)"
)


class SourceFacebookMarketing(AbstractSource):
Expand Down Expand Up @@ -104,10 +127,11 @@ def streams(self, config: Mapping[str, Any]) -> List[Type[Stream]]:
days_per_job=config.insights_days_per_job,
)

return [
streams = [
Campaigns(api=api, start_date=config.start_date, end_date=config.end_date, include_deleted=config.include_deleted),
AdSets(api=api, start_date=config.start_date, end_date=config.end_date, include_deleted=config.include_deleted),
Ads(api=api, start_date=config.start_date, end_date=config.end_date, include_deleted=config.include_deleted),

AdCreatives(api=api),
AdsInsights(**insights_args),
AdsInsightsAgeAndGender(**insights_args),
Expand All @@ -118,6 +142,22 @@ def streams(self, config: Mapping[str, Any]) -> List[Type[Stream]]:
AdsInsightsActionType(**insights_args),
]

return self._update_insights_streams(insights=config.insights, args=insights_args, streams=streams)

def check(self, logger: AirbyteLogger, config: Mapping[str, Any]) -> AirbyteConnectionStatus:
"""Implements the Check Connection operation from the Airbyte Specification. See https://docs.airbyte.io/architecture/airbyte-specification."""
try:
check_succeeded, error = self.check_connection(logger, config)
if not check_succeeded:
return AirbyteConnectionStatus(status=Status.FAILED, message=repr(error))
except Exception as e:
return AirbyteConnectionStatus(status=Status.FAILED, message=repr(e))

self._check_insights_entries(config.get('insights', []))

return AirbyteConnectionStatus(status=Status.SUCCEEDED)


def spec(self, *args, **kwargs) -> ConnectorSpecification:
"""
Returns the spec for this integration. The spec is a JSON-Schema object describing the required configurations (e.g: username and password)
Expand All @@ -128,11 +168,73 @@ def spec(self, *args, **kwargs) -> ConnectorSpecification:
changelogUrl="https://docs.airbyte.io/integrations/sources/facebook-marketing",
supportsIncremental=True,
supported_destination_sync_modes=[DestinationSyncMode.append],
connectionSpecification=ConnectorConfig.schema(),
connectionSpecification=expand_local_ref(ConnectorConfig.schema()),
authSpecification=AuthSpecification(
auth_type="oauth2.0",
oauth2Specification=OAuth2Specification(
rootObject=[], oauthFlowInitParameters=[], oauthFlowOutputParameters=[["access_token"]]
),
)
),
)

def _update_insights_streams(self, insights, args, streams) -> List[Type[Stream]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest the following approach for these custom streams:

  1. The "standard" streams offered by the connector are always available (if a user doesn't want them they can always just deselect them)
  2. If the user inputs any custom streams, they are named as custom_<user-input-name> and appended to the list of streams

This way it is very very obvious to the user what is happening. This is especially important as the connector's config is updated over time e.g: a user might call a stream ads_insights today, then remove it next week, and now the data represented in that stream is mixed with the "standard" ads insight stream and the "custom" insight stream. Making the suggestion above to make such situations much less likely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the change in the lastest commit

"""Update method, if insights have values returns streams replacing the
default insights streams else returns streams

"""
if not insights:
return streams

insights_custom_streams = list()

for insight in insights:
args["name"] = f"Custom{insight.name}"
args["fields"] = list(set(insight.fields))
args["breakdowns"] = list(set(insight.breakdowns))
args["action_breakdowns"] = list(set(insight.action_breakdowns))
insight_stream = AdsInsights(**args)
insights_custom_streams.append(insight_stream)

return streams + insights_custom_streams

def _check_insights_entries(self, insights):
vladimir-remar marked this conversation as resolved.
Show resolved Hide resolved

default_fields = list(ResourceSchemaLoader(package_name_from_class(self.__class__)).get_schema("ads_insights").get("properties", {}).keys())
default_breakdowns = list(ResourceSchemaLoader(package_name_from_class(self.__class__)).get_schema("ads_insights_breakdowns").get("properties", {}).keys())
default_actions_breakdowns = [e for e in default_breakdowns if 'action_' in e]

for insight in insights:
vladimir-remar marked this conversation as resolved.
Show resolved Hide resolved
if insight.get('fields') and not self._check_values(default_fields, insight.get('fields')):
message = f"For {insight.get('name')} that field are not espected in fields"
vladimir-remar marked this conversation as resolved.
Show resolved Hide resolved
raise Exception("Config validation error: " + message) from None
if insight.get('breakdowns') and not self._check_values(default_breakdowns, insight.get('breakdowns')):
message = f"For {insight.get('name')} that breakdown are not espected in breakdowns"
raise Exception("Config validation error: " + message) from None
if insight.get('action_breakdowns') and not self._check_values(default_actions_breakdowns, insight.get('action_breakdowns')):
message = f"For {insight.get('name')} that action_breakdowns are not espected in action_breakdowns"
raise Exception("Config validation error: " + message) from None

return True

def _check_values(self, default_value: List[str], custom_value: List[str]) -> bool:
for e in custom_value:
if e not in default_value:
logger.error(f"{e} does not appers in {default_value}")
vladimir-remar marked this conversation as resolved.
Show resolved Hide resolved
return False
return True


def expand_local_ref(schema, resolver=None, **kwargs):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sherifnada thanks to @keu
With this, we can set properly the insights in the UI.

resolver = resolver or RefResolver("", schema)
if isinstance(schema, MutableMapping):
if "$ref" in schema:
ref_url = schema.pop("$ref")
url, resolved_schema = resolver.resolve(ref_url)
schema.update(resolved_schema)
for key, value in schema.items():
schema[key] = expand_local_ref(value, resolver=resolver)
return schema
elif isinstance(schema, List):
return [expand_local_ref(item, resolver=resolver) for item in schema]

return schema
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
from datetime import datetime
from typing import Any, Iterable, Iterator, List, Mapping, MutableMapping, Optional, Sequence

import airbyte_cdk.sources.utils.casing as casing
import backoff
import pendulum
from airbyte_cdk.models import SyncMode
Expand Down Expand Up @@ -291,10 +292,32 @@ class AdsInsights(FBMarketingIncrementalStream):

breakdowns = []

def __init__(self, buffer_days, days_per_job, **kwargs):
def __init__(
self,
buffer_days,
days_per_job,
name: str = None,
fields: List[str] = None,
breakdowns: List[str] = None,
action_breakdowns: List[str] = None,
**kwargs,
):

super().__init__(**kwargs)
self.lookback_window = pendulum.duration(days=buffer_days)
self._days_per_job = days_per_job
self._fields = fields
self.action_breakdowns = action_breakdowns or self.action_breakdowns
self.breakdowns = breakdowns or self.breakdowns
self._new_class_name = name

@property
def name(self) -> str:
"""
:return: Stream name. By default this is the implementing class name, but it can be overridden as needed.
"""
name = self._new_class_name or self.__class__.__name__
return casing.camel_to_snake(name)

def read_records(
self,
Expand Down Expand Up @@ -389,12 +412,16 @@ def get_json_schema(self) -> Mapping[str, Any]:
:return: A dict of the JSON schema representing this stream.
"""
schema = ResourceSchemaLoader(package_name_from_class(self.__class__)).get_schema("ads_insights")
if self._fields:
schema["properties"] = {k: v for k, v in schema["properties"].items() if k in self._fields}
schema["properties"].update(self._schema_for_breakdowns())
return schema

@cached_property
def fields(self) -> List[str]:
"""List of fields that we want to query, for now just all properties from stream's schema"""
if self._fields:
return self._fields
schema = ResourceSchemaLoader(package_name_from_class(self.__class__)).get_schema("ads_insights")
return list(schema.get("properties", {}).keys())

Expand Down
8 changes: 7 additions & 1 deletion docs/integrations/sources/facebook-marketing.md
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,12 @@ Facebook heavily throttles API tokens generated from Facebook Apps by default, m

See Facebook's [documentation on rate limiting](https://developers.facebook.com/docs/marketing-api/overview/authorization/#access-levels) for more information on requesting a quota upgrade.

## Custom Insights
In order to retrieve specific fields from Facebook Ads Insights combined with other breakdowns, there is a mechanism to allow you to choose which fields and breakdowns to sync.
It is highly recommended to follow the [documenation](https://developers.facebook.com/docs/marketing-api/insights/breakdowns), as there are limitations related to breakdowns. Some fields can not be requested and many others just work combined with specific fields, for example, the breakdown **app_id** is only supported with the **total_postbacks** field.
By now, the only check done when setting up a source is to check if the fields, breakdowns and action breakdowns are within the ones provided by Facebook. This is, if you enter a good input, it's gonna be validated, but after, if the calls to Facebook API with those pareameters fails you will receive an error from the API.
As a summary, custom insights allows to replicate only some fields, resulting in sync speed increase.

#### Data type mapping

| Integration Type | Airbyte Type | Notes |
Expand All @@ -90,6 +96,7 @@ See Facebook's [documentation on rate limiting](https://developers.facebook.com/

| Version | Date | Pull Request | Subject |
| :--- | :--- | :--- | :--- |
| 0.2.21 | 2021-10-05 | [4864](https://github.com/airbytehq/airbyte/pull/4864) | Update insights streams with custom entries for fields, breakdowns and action_breakdowns |
| 0.2.20 | 2021-10-04 | [6719](https://github.com/airbytehq/airbyte/pull/6719) | Update version of facebook\_bussiness package to 12.0 |
| 0.2.19 | 2021-09-30 | [6438](https://github.com/airbytehq/airbyte/pull/6438) | Annotate Oauth2 flow initialization parameters in connector specification |
| 0.2.18 | 2021-09-28 | [6499](https://github.com/airbytehq/airbyte/pull/6499) | Fix field values converting fail |
Expand All @@ -113,4 +120,3 @@ See Facebook's [documentation on rate limiting](https://developers.facebook.com/
| 0.1.3 | 2021-02-15 | [1990](https://github.com/airbytehq/airbyte/pull/1990) | Support Insights stream via async queries |
| 0.1.2 | 2021-01-22 | [1699](https://github.com/airbytehq/airbyte/pull/1699) | Add incremental support |
| 0.1.1 | 2021-01-15 | [1552](https://github.com/airbytehq/airbyte/pull/1552) | Release Native Facebook Marketing Connector |