Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Link Statistics with SQL Server Functions #264

Merged
merged 11 commits into from
Jul 15, 2020
Merged

Link Statistics with SQL Server Functions #264

merged 11 commits into from
Jul 15, 2020

Conversation

kylerwsm
Copy link
Contributor

@kylerwsm kylerwsm commented Jul 7, 2020

Problem

The previous attempt to log link statistics is too computationally expensive.

We needed an alternative that is able to handle and thrive under high user traffic.

Solution

Rather than logging using Sequelize model instances, we turned to SQL server functions. This function will be initialised by the server when it starts, and will be called to increment the relevant statistics on redirects.

Additionally, @liangyuanruo and I have load tested this implementation with the following results:

  • 199,978/200,000 clicks were logged
  • 22 network requests failed from HTTP 502 Bad Gateway and did not obtain a redirect
  • Load was about ~200/s

@kylerwsm kylerwsm marked this pull request as ready for review July 7, 2020 12:14
Copy link
Contributor

@liangyuanruo liangyuanruo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes as requested.

ON CONFLICT ("shortUrl", "date")
DO UPDATE SET "clicks" = "${clicksTable}"."clicks" + 1;
-- Update weekday clicks.
INSERT INTO "${weekdayTable}" ("shortUrl", "weekday", "hours", "clicks", "createdAt", "updatedAt")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is your plan to convert and display data in GMT+8 to the user?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my previous testings, the timestamps the function is using from the database is already in GMT+8. And since the logged link statistics data strips all timezone information, there should not be a need to convert the timezone of the data to the user.

Copy link
Contributor

@liangyuanruo liangyuanruo Jul 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe that's only because your localhost time is in GMT+8. the database in staging/production are in UTC.

select current_timestamp;

> 2020-07-14 04:38:38.918593+00

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have rebased, and added explicit timezones to the function

@kylerwsm kylerwsm requested a review from liangyuanruo July 8, 2020 06:45
Copy link
Contributor

@liangyuanruo liangyuanruo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should address the timezone issue in SQL while there is still a timestamp object to work with. i'm thinking we can conduct a final load test and put this into production tomorrow, if you're up for it.

LoneRifle and others added 11 commits July 15, 2020 17:26
LinkStatisticsRepository updates click stats by finding the instance,
then incrementing. This may result in larger transactions than needed,
given that the increment can be done in a single statement. Rework
the implementation so that we update, and if not present, we insert.
We will also do this in many small transactions.

Attempts to implement using `Model.upsert()` were not successful, due
to Sequelize creating temporary functions which fail to reference the
target table.

- interface - rework jsdoc, change return type to boolean
- impl - use `Model.update()` and `Sequelize.literal`, if not found,
  `Model.insert()` or throw not found error
- Drop the constraint of doing stats updates in a single txn
@liangyuanruo liangyuanruo merged commit 25e060e into develop Jul 15, 2020
@liangyuanruo liangyuanruo deleted the temp-func branch July 15, 2020 10:27
xming13 referenced this pull request in xming13/GoGovSG Aug 21, 2020
* feat(link statistics) Use a single SQL function to increment clicks across the following tables:

- devices_stats
- daily_stats
- weekday_stats
- urls

Transactions whedn updating clicks are implicit as operations are carried out within the scope of a function, which either succeed in their entirety or rolled back upon failure.

Co-authored-by: LoneRifle <LoneRifle@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants