The tracerboy API is an experimental event tracking and reporting web service, built as part of an assignment at Narrative.
Since this application is not yet published as a Docker image or packaged in any other way, I assume that eager engineers will have basic tooling installed on their machines, namely SBT and Docker. The developer experience can be further improved with Nix, although it is not a mandatory requirement.
This project uses Postgres with the TimescaleDB extension to store all analytical events, together with continuous aggregates backed by hypertables. The application bundles Flyway, which automatically detects the state of the database and executes the appropriate migrations when the application boots or reloads.
Make sure that the application can access a Postgres instance, or use the Docker Compose wrapper script `./bin/tracerboy-dev.sh` to boot one up. This will create a new Docker container with Postgres and the TimescaleDB extension installed and configured. It will also set the username to `tb`, the password to `tb`, and initialise a new empty database named `tb`.

```shell
./bin/tracerboy-dev.sh up pg
```
Boot up the application with `sbt run` and the `DATABASE_URL` environment variable preconfigured:

```shell
export DATABASE_URL="jdbc:postgresql://localhost:5432/tb?user=tb&password=tb"
sbt run
```
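As a rough illustration of how a service like this might consume `DATABASE_URL`, here is a minimal, dependency-free sketch that splits the JDBC URL into its base and its `user`/`password` query parameters. The names `DbConfig` and `parseDatabaseUrl` are hypothetical, not part of the tracerboy codebase:

```scala
// Hypothetical helper: split a JDBC URL of the form
// "jdbc:postgresql://host:port/db?user=u&password=p" into its parts.
final case class DbConfig(jdbcUrl: String, user: String, password: String)

def parseDatabaseUrl(raw: String): DbConfig = {
  val Array(base, query) = raw.split("\\?", 2) // assumes a query string is present
  val params = query
    .split("&")
    .map(_.split("=", 2))
    .collect { case Array(k, v) => k -> v }
    .toMap
  DbConfig(base, params.getOrElse("user", ""), params.getOrElse("password", ""))
}

val cfg = parseDatabaseUrl("jdbc:postgresql://localhost:5432/tb?user=tb&password=tb")
// cfg.user == "tb", cfg.password == "tb"
```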
Or build a Docker image and boot up the application with the help of Docker Compose:

```shell
sbt docker:publishLocal
./bin/tracerboy-dev.sh up tracerboy tracerboy-gw
```
If you wish to scale the number of replicas up or down, use something along the following lines:

```shell
./bin/tracerboy-dev.sh up -d
./bin/tracerboy-dev.sh up --scale tracerboy=3 -d
./bin/tracerboy-dev.sh stop # To stop everything
```
When interacting with the service via Docker, please use port `4000` as opposed to port `9090`, which is used when running natively on the host operating system.

```shell
curl -D - --request POST \
  127.0.0.1:4000/analytics\?timestamp=1662981405\&user=Oto+Brglez\&event=click
```
The main endpoint for accepting tracking information has the following query parameters:

```
POST /analytics?timestamp={millis_since_epoch}&user={user_id}&event={click|impression}
```
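To show what a client-side call to this endpoint could look like, here is a small sketch that builds the request URL with proper form encoding. The `analyticsUrl` helper is purely illustrative (it is not part of the service); it mirrors the parameters documented above:

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical client-side helper: build the POST /analytics URL.
def analyticsUrl(host: String, timestamp: Long, user: String, event: String): String = {
  require(Set("click", "impression").contains(event), s"unsupported event: $event")
  val u = URLEncoder.encode(user, UTF_8) // encodes "Oto Brglez" as "Oto+Brglez"
  s"http://$host/analytics?timestamp=$timestamp&user=$u&event=$event"
}

analyticsUrl("127.0.0.1:4000", 1662981405L, "Oto Brglez", "click")
// → "http://127.0.0.1:4000/analytics?timestamp=1662981405&user=Oto+Brglez&event=click"
```

The resulting URL matches the `curl` example above, which is what the encoding of the space in the user name is for.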
The endpoint for reporting/analytics can be accessed on the following path:

```
GET /analytics
```
If you wish to run with reloading in development mode, please consider using sbt-revolver:

```shell
sbt "~service/reStart"
```
The unit test suite bundled with the application can be run with the help of sbt:

```shell
sbt test
```
The integration tests are packaged as a separate module and can be invoked via Gatling:

```shell
sbt integration/GatlingIt/test
```
The Gatling traffic simulation will run against the service on localhost:9090. If this project is to become more serious in the future, I would suggest using Testcontainers and reusing the existing Docker Compose setup and configuration as per the docs.
- Although the assignment identifies "user" with "username", it would be wiser to use proper UUIDs in a production setup.
- Since the application is "stateless" there is no common shared state among running instances, thus making the application easy to scale.
- The `timestamp` query parameter in the `POST /analytics` request should likely be hidden from the outside and set on the server when the data is received and processed.
- In a production use case, the business-logic handling could be improved with ZIO Prelude `Validation` or Cats `Validated`.
- The analytical aggregation is implemented with the help of TimescaleDB's hypertables and real-time continuous aggregates. In a real-world scenario, each of the aggregates would also need a proper retention policy.
- Why I've chosen TimescaleDB over the alternatives is neatly captured and explained in this article. Other interesting options would be InfluxDB, MongoDB (time series) or other specialised time-series databases.
- Additional work should be done on logging and monitoring if this application is to be used in the wild.
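To make the validation idea above concrete, here is a dependency-free sketch of accumulating validation for the `/analytics` parameters, using plain `Either` from the standard library. In production this would map naturally onto Cats `Validated` or ZIO Prelude `Validation`, which accumulate errors for you; the names `TrackedEvent` and `validate` are illustrative only:

```scala
// Illustrative accumulating validation for the /analytics query parameters.
final case class TrackedEvent(timestamp: Long, user: String, event: String)

def validate(timestamp: String, user: String, event: String): Either[List[String], TrackedEvent] = {
  val ts  = timestamp.toLongOption.toRight(List(s"timestamp '$timestamp' is not a number"))
  val usr = if (user.nonEmpty) Right(user) else Left(List("user must not be empty"))
  val evt =
    if (Set("click", "impression").contains(event)) Right(event)
    else Left(List(s"event must be 'click' or 'impression', got '$event'"))

  (ts, usr, evt) match {
    case (Right(t), Right(u), Right(e)) => Right(TrackedEvent(t, u, e))
    case _ =>
      // Collect every error, the way Validated/Validation would.
      Left(List(ts, usr, evt).flatMap(_.left.toOption).flatten)
  }
}

validate("1662981405", "Oto Brglez", "click") // Right(TrackedEvent(1662981405, "Oto Brglez", "click"))
validate("oops", "", "hover")                 // Left with three error messages
```

The payoff over fail-fast `Either` chaining is that a single bad request reports all of its problems at once rather than only the first one.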