Keenon Werling edited this page Oct 14, 2022 · 1 revision

Welcome to the AddBiomechanics wiki!

Terms cheat sheet

  • AWS: Amazon Web Services
  • AWS S3: "Simple Storage Service", a file storage service with effectively unlimited capacity. Files can be uploaded directly to S3 by end-users in their web browsers, and S3 can also serve "static files" as a website host. Very versatile, and very cheap.
  • AWS IoT: "Internet of Things" (a totally hype-driven name). Really, this is just a PubSub service that end-users can connect to from their web browsers over WebSockets.
  • AWS Cognito: A service to manage users, log them in/out, etc. Crucially, it also integrates with AWS S3 and AWS IoT to allow logged-in users to connect directly to AWS services from their web browser, which saves us from having to run intermediate servers.
  • PubSub: "Publish Subscribe", a simple networking abstraction where everyone connects to a central server, and can send messages to the server to be forwarded (or "published") to everyone else connected to the server. When you publish a message, you send it to a "topic." To avoid getting flooded with messages, you can "subscribe" to just subsets of topics, to only see those messages.
  • WebSocket: a technology that gives code running in a web browser something like a native socket. A WebSocket is a long-lived connection to a server (or in our case, to AWS IoT) over which the server can push messages back to the web browser in real time.
  • Docker: a way to build and manage "containers", which are like light-weight virtual machines. Docker images save their entire filesystem, so your code can ship along with the rest of the computer it's running on. When you run a Docker container on your computer, it looks/feels a lot like booting a virtual machine.
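
To make the PubSub entry above concrete, here is a minimal in-memory sketch of publishing to topics and subscribing to only some of them. It is a toy stand-in, not the AWS IoT API: the `InMemoryPubSub` class and the topic names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the topic and the message that was published to it.
Handler = Callable[[str, str], None]


class InMemoryPubSub:
    """Toy stand-in for a PubSub server (hypothetical, not the AWS IoT API)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler that only sees messages sent to `topic`."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Forward `message` to every subscriber of `topic`; return how many got it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, message)
        return len(handlers)


broker = InMemoryPubSub()
received: List[str] = []

# Subscribe to one (hypothetical) topic only.
broker.subscribe("/subject/123/log", lambda topic, msg: received.append(msg))

broker.publish("/subject/123/log", "processing started")  # delivered
broker.publish("/subject/456/log", "someone else's log")  # nobody subscribed here
```

After these calls, `received` holds only the message from the subscribed topic, which is the "avoid getting flooded" behavior described above.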

High-level architecture description:

AddBiomechanics' design aims for a few simple quality-of-life goals:

  • Nobody carries a pager. (Corollary: The user experience doesn't crash when our physical servers crash.)
  • We never rent virtual machines from Amazon (too expensive, too finicky). We use physical servers on campus for any processing we need.
  • The physical servers we manage are NOT accessible from the public internet. They only communicate with the world by downloading from and re-uploading to AWS S3, and by receiving and sending messages over AWS IoT.

To achieve this, we build a static single-page website, using React, that's hosted on AWS S3. Users log in (without leaving the page), are authenticated by AWS Cognito, and then manage (upload/download/create/delete) their files directly on AWS S3 from their web browser. Whenever the user changes their files, their browser publishes a message on AWS IoT.

That whole experience can exist without us running any servers, and costs only a few dollars a month to host.

The only remaining part of the user experience is how we actually process the user's uploaded data. We do that on physical server(s) on the Stanford campus, which are behind the firewall and not exposed directly to the internet. Those servers run a Docker container, which connects to AWS S3 and AWS IoT and listens for changes. When a user marks their subject "Ready to process", their browser uploads a special empty flag file to AWS S3 and sends a message to AWS IoT. When our processing server sees this, it downloads the subject files from S3, processes them (sending log messages back to AWS IoT in case the user wants to follow along), re-uploads the finished data to S3, and sends a message to IoT saying it has finished. The user's page then updates with the new data, showing that their subject has finished processing.
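
The processing flow above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual server code: a plain dict stands in for AWS S3, a callback stands in for publishing to AWS IoT, and the flag file name, key layout, topic names, and `process_ready_subjects` function are all hypothetical. Removing the flag file once processing finishes is also an assumption of this sketch.

```python
from typing import Callable, Dict, List, Tuple

READY_FLAG = "READY_TO_PROCESS"  # hypothetical name for the empty flag file


def process_ready_subjects(
    storage: Dict[str, bytes],
    publish: Callable[[str, str], None],
    process: Callable[[Dict[str, bytes]], Dict[str, bytes]],
) -> List[str]:
    """Process every subject folder in `storage` that contains the flag file."""
    finished: List[str] = []
    flag_suffix = "/" + READY_FLAG
    # Snapshot the flagged keys first, since storage is modified in the loop.
    for flag_key in sorted(k for k in storage if k.endswith(flag_suffix)):
        subject = flag_key[: -len(flag_suffix)]
        prefix = subject + "/"
        # "Download" the subject's files, skipping the flag itself.
        inputs = {
            key[len(prefix):]: data
            for key, data in storage.items()
            if key.startswith(prefix) and key != flag_key
        }
        publish(subject + "/log", "processing started")
        results = process(inputs)
        # "Re-upload" the finished data next to the inputs.
        for name, data in results.items():
            storage[prefix + name] = data
        # Assumption: clear the flag so the subject is not processed twice.
        del storage[flag_key]
        publish(subject + "/status", "finished")
        finished.append(subject)
    return finished


# Stand-in for the bucket: one flagged subject, one not yet flagged.
storage: Dict[str, bytes] = {
    "users/alice/subject1/input.dat": b"raw data",
    "users/alice/subject1/READY_TO_PROCESS": b"",
    "users/bob/subject2/input.dat": b"not flagged yet",
}
messages: List[Tuple[str, str]] = []


def fake_process(inputs: Dict[str, bytes]) -> Dict[str, bytes]:
    """Placeholder for the real processing step."""
    return {"results.txt": b"processed %d file(s)" % len(inputs)}


done = process_ready_subjects(
    storage, lambda topic, msg: messages.append((topic, msg)), fake_process
)
```

Only the flagged subject is processed. The unflagged one is left untouched until its own flag file appears.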

That's it!

This approach is cheap to run and very scalable. Users can still upload/download files even when our processing servers crash: the site will just say "waiting for server" when they ask to process data, which is not urgent enough to require carrying pagers.
