feat: data collection for link click statistics #209

Merged
merged 34 commits into from
Jun 29, 2020
@kylerwsm kylerwsm commented Jun 19, 2020

Problem

This PR starts data collection for our upcoming link click statistics feature.

Closes #168.

Solution

We collect the data and store it directly in dedicated tables, each representing a statistic we will show to users. In this implementation, badly formatted data is not written to the statistics databases.

More information on how each statistic is implemented:

Daily Statistics

  • Uses a composite key that combines the short URL and the date.
  • The date is in Singapore time.
  • Index is unique and uses the shortUrl and date fields
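
To illustrate the date handling above, here is a minimal sketch of deriving the Singapore calendar date for a click and combining it with the short URL. It is not taken from this PR: `getSingaporeDate` and `dailyKey` are hypothetical names, and the fixed UTC+8 shift relies on Singapore not observing daylight saving.

```typescript
// Singapore Standard Time is UTC+8 all year round (no daylight saving).
const SGT_OFFSET_MS = 8 * 60 * 60 * 1000;

// Hypothetical helper: returns the Singapore calendar date as YYYY-MM-DD.
function getSingaporeDate(now: Date = new Date()): string {
  // Shift the instant by +8h, then read the UTC fields of the shifted value.
  const shifted = new Date(now.getTime() + SGT_OFFSET_MS);
  return shifted.toISOString().slice(0, 10);
}

// Hypothetical helper: a string form of the (shortUrl, date) composite key.
function dailyKey(shortUrl: string, now: Date = new Date()): string {
  return `${shortUrl}|${getSingaporeDate(now)}`;
}
```

A click at 17:00 UTC falls on the next calendar day in Singapore, which is why the date cannot simply be taken from the UTC timestamp.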

Weekday Statistics

  • Uses the short URL as the primary key.
  • Each day is represented by an integer from 0 (Sunday) to 6 (Saturday). This matches the values returned by `getDay()` on JavaScript Date objects.
  • Index is unique and uses the shortUrl, weekday, and hours fields
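
A minimal sketch of how the `weekday` and `hours` values could be derived for a click. It assumes these are also measured in Singapore time, which this description states only for the daily statistics; `getSingaporeWeekdayAndHour` is a hypothetical name.

```typescript
// Assumption: weekday and hour buckets use Singapore time (UTC+8, no DST).
const SINGAPORE_OFFSET_MS = 8 * 60 * 60 * 1000;

interface WeekdayBucket {
  weekday: number; // 0 (Sunday) to 6 (Saturday), as in Date.prototype.getDay()
  hours: number; // 0 to 23
}

// Hypothetical helper: maps an instant to its Singapore weekday and hour.
function getSingaporeWeekdayAndHour(now: Date = new Date()): WeekdayBucket {
  // Shift by +8h and read UTC fields so the server's own time zone is irrelevant.
  const shifted = new Date(now.getTime() + SINGAPORE_OFFSET_MS);
  return { weekday: shifted.getUTCDay(), hours: shifted.getUTCHours() };
}
```

Reading the UTC fields of a shifted timestamp avoids `getDay()` and `getHours()`, which would otherwise depend on the time zone configured on the server.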

Device Statistics

  • Uses the short URL as the primary key.
  • Uses the ua-parser-js library to parse the user agent string into a device type.
  • Limitation: Unlike mobile and tablet devices, desktop devices do not identify their device type in the user agent string. I have therefore treated user agent strings that do not identify a device type as desktop devices.
  • Index is unique and uses only the shortUrl field
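
The desktop fallback described above could look roughly like the sketch below. The `DeviceCategory` type, the `toDeviceCategory` name, and the `"others"` bucket for types such as consoles or smart TVs are assumptions for illustration, not code from this PR. The parser call is shown only in a comment so the snippet runs without the library installed.

```typescript
// Hypothetical categories; "others" is an assumed catch-all bucket.
type DeviceCategory = "mobile" | "tablet" | "desktop" | "others";

// With ua-parser-js, the input would come from something like:
//   const deviceType = new UAParser(userAgent).getDevice().type;
// which is undefined when the user agent does not identify a device type.
function toDeviceCategory(deviceType: string | undefined): DeviceCategory {
  switch (deviceType) {
    case "mobile":
      return "mobile";
    case "tablet":
      return "tablet";
    case undefined:
      // No device type reported: default to desktop, per the limitation above.
      return "desktop";
    default:
      // Assumption: any other reported type is grouped together.
      return "others";
  }
}
```

Keeping the mapping separate from the parsing makes the fallback rule easy to unit test with plain strings.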

@kylerwsm kylerwsm force-pushed the stats branch 2 times, most recently from 5108462 to e971b06 Compare June 23, 2020 07:14
@kylerwsm kylerwsm requested a review from yong-jie June 23, 2020 07:51
@yong-jie
Member

Have confirmed with @kylerwsm that data collection works when tested on staging.

@kylerwsm kylerwsm requested a review from liangyuanruo June 23, 2020 09:33
@kylerwsm kylerwsm marked this pull request as ready for review June 23, 2020 16:31
Contributor

@liangyuanruo liangyuanruo left a comment

As commented. It would be good to try creating the tables without indexes, then run an EXPLAIN SELECT ... to confirm that lookups can still rely on the composite primary keys without any table scans. Indexes add write overhead, so it is best if the composite PKs alone can provide indexed lookup.

@kylerwsm kylerwsm force-pushed the stats branch 3 times, most recently from a0e2a9e to 7218a3f Compare June 25, 2020 13:09
kylerwsm added 26 commits June 29, 2020 16:48
Successfully merging this pull request may close these issues.

Collection of data prior to the launch of analytics