-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/tiktok #68
Feat/tiktok #68
Conversation
ingestr/src/sources.py
Outdated
and "ad_id" not in dimensions | ||
): | ||
raise ValueError( | ||
"You must provide one ID dimension. Please use one ID dimension from the following options: [AUCTION_ADVERTISER, AUCTION_AD, AUCTION_CAMPAIGN, AUCTION_ADGROUP]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest writing the actual dimension names here, this error message is very confusing
ingestr/src/sources.py
Outdated
dimensions = fields[1].split(",") | ||
if ( | ||
"campaign_id" not in dimensions | ||
and "advertiser_id" not in dimensions | ||
and "adgroup_id" not in dimensions | ||
and "ad_id" not in dimensions | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could it be that the user puts space in between by accident? e.g. custom:campaign_id ,stat_time_day:clicks
, in which case this would fail. I suggest cleaning them up first.
ingestr/src/sources.py
Outdated
if kwargs.get("interval_start"): | ||
start_date = ensure_pendulum_datetime( | ||
str(kwargs.get("interval_start")) | ||
).in_tz(time_zone[0]) | ||
else: | ||
Default_date = pendulum.now().subtract(days=90) | ||
start_date = ensure_pendulum_datetime(Default_date).in_tz(time_zone[0]) | ||
|
||
if kwargs.get("interval_end"): | ||
end_date = ensure_pendulum_datetime(str(kwargs.get("interval_end"))).in_tz( | ||
time_zone[0] | ||
) | ||
else: | ||
end_date = ensure_pendulum_datetime(pendulum.now()).in_tz(time_zone[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if kwargs.get("interval_start"): | |
start_date = ensure_pendulum_datetime( | |
str(kwargs.get("interval_start")) | |
).in_tz(time_zone[0]) | |
else: | |
Default_date = pendulum.now().subtract(days=90) | |
start_date = ensure_pendulum_datetime(Default_date).in_tz(time_zone[0]) | |
if kwargs.get("interval_end"): | |
end_date = ensure_pendulum_datetime(str(kwargs.get("interval_end"))).in_tz( | |
time_zone[0] | |
) | |
else: | |
end_date = ensure_pendulum_datetime(pendulum.now()).in_tz(time_zone[0]) | |
start_date = pendulum.now().subtract(days=90).in_tz(time_zone[0]) | |
end_date = ensure_pendulum_datetime(pendulum.now()).in_tz(time_zone[0]) | |
if kwargs.get("interval_start"): | |
start_date = ensure_pendulum_datetime(kwargs.get("interval_start")).in_tz(time_zone[0]) | |
if kwargs.get("interval_end"): | |
end_date = ensure_pendulum_datetime(kwargs.get("interval_end")).in_tz( | |
time_zone[0] | |
) |
"data_level": data_level, | ||
"start_date": start_time, | ||
"end_date": end_time, | ||
"page_size": 1000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this configurable? ideally we should get it from the --page-size
variable in ingestr
|
||
while True: | ||
self.params["page"] = current_page | ||
response = create_client().get( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why create the client in every step of the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to create a client every time. It's my mistake. I have corrected that.
flat_structure(items=items, time_zone=self.time_zone) | ||
|
||
all_items.extend(items) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest yielding these pages instead of collecting them, otherwise we might run into memory issues
ingestr/src/sources.py
Outdated
@@ -37,6 +38,7 @@ | |||
from ingestr.src.slack import slack_source | |||
from ingestr.src.stripe_analytics import stripe_source | |||
from ingestr.src.table_definition import table_string_to_dataclass | |||
from ingestr.src.tiktok_ads._init_ import tiktok_source |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you change ingestr/src/tiktok_ads/_init_.py
to ingestr/src/tiktok_ads/__init__.py
, you will be able to treat tiktok_ads
as a python module.
That way, the import statement can change to:
from ingestr.src.tiktok_ads import tiktok_source
endpoint = "custom_reports" | ||
|
||
parsed_uri = urlparse(uri) | ||
source_fields = parse_qs(parsed_uri.query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can tiktok access token contain =
, +
or &
? We may need to escape those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding this I am not sure. We have not escaped them before, so we need to discuss this with the team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with leaving this as is.
ingestr/src/sources.py
Outdated
str(kwargs.get("interval_start")) | ||
).in_tz(time_zone[0]) | ||
else: | ||
Default_date = pendulum.now().subtract(days=90) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a default start date in order to be able to load data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need it in this case because the API expects a start and end date.
endpoint = "custom_reports" | ||
|
||
parsed_uri = urlparse(uri) | ||
source_fields = parse_qs(parsed_uri.query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with leaving this as is.
Following changes are made in this PR:
Note:
Filter logic is not yet implemented and will be added soon