This project is being open-sourced, but is still a work in progress. It was designed to be a horizontally scalable audio-processing micro-service for deployment on AWS Lambda. In order to complete the open sourcing, it needs to be generalised. Coming soon!
This is a micro-service written for serverless architecture (currently running on AWS Lambda) that performs audio processing and metadata creation for a single user's radcast.
This app uses the Node.js framework and is designed to be run remotely as an AWS Lambda function. It is absolutely CRITICAL that anyone working on this project make themselves familiar with the setup configuration, as there are a number of config steps that can easily go wrong, rendering the code non-functional. Full setup instructions are documented in the setup.md document. This doc also contains information on how to deploy the project to the Lambda function.
The function configuration - memory allocation & effects on timing and costing - has been tested and the results are documented in a Google Sheet.
It's highly advisable that you become familiar with the nuances associated with Lambda functions before working on this code - cold start effects, max execution time, ephemerality of memory, invocation methods, and so on. In summary, though: a Lambda function is a docker container managed by AWS for you. Every time your function is invoked (there are various options for this), AWS will spin up a new instance, allow it to perform the work, and once complete, the instance will be destroyed. It suits applications that are lightweight, stateless, and completely modular (amongst other qualities).
This project shouldn't have a quick start actually. You should read the entire setup document. But anyway, you're probably a millennial, so here goes:
- Export the function from the AWS console by selecting Actions --> Export function. Or you can check it out from its repository.
- Set up the AWS SAM CLI:
```
$ pip install --user aws-sam-cli
```
- Make sure the sam binary is globally available in your shell:
```
$ export PATH=$PATH:~/Library/Python/2.7/bin
```
where the path is wherever your pip modules install to.
- Restart your shell and type:
```
$ sam --version
```
It should display the version; if it doesn't, something has gone wrong.
- The template.yaml file should be included in the git repo, so you shouldn't have to run:
```
$ sam init
```
- Install the required dependencies:
```
$ npm install
```
- Running: use
```
$ sam local invoke "createSingleRadcast" -e test/unit_test_radcast_dev.json
```
This should start a Docker container on your local machine and invoke the function with an input.
- You can use the debugger of your choice - connect it to port 5858.
- If you were running any other Docker containers with port 5858 open as a debug port, you may need to restart Docker.
- Requirements for local invocation: Docker, and Python 2.7 or Python 3.6.
- To attach a debugger, invoke with the debug port exposed:
```
$ sam local invoke "createSingleRadcast" -e test/unit_test_radcast_dev.json --debug-port 5858
```
|-- audio-handling
| |-- audio-manager: Class that handles audio download, transcoding and upload
|-- config
| |-- api_key.js: Key to validate requests with
| |-- constants.js: General project constants
| |-- radcast-audio-private-key.json: Firebase project service account key
|-- models: Model classes to represent Radcast data structures
|-- node_modules: external dependencies / packages
|-- radcast-creation
| |-- radcast_creator.js: Manager class that manages the sequential process of creating radcasts
| |-- session_metrics.js: Class that records and reports metrics about the creation session (duration, success rate etc)
|-- supporting-files: Non-source-code files to aid with project set up / documentation
|-- test: Invocation source files for different types of tests
|-- utilities:
| |-- utilities.js: Arbitrary, generic functions required around the project
|-- index.js: 'Main' file to which the event object that generated the invocation is passed.
|-- package-lock.json: Generated by npm after npm install; represents the exact, explicit, versioned and full dependency tree.
|-- package.json: User defined dependencies
|-- readme.md: You are here.
|-- template.yaml: Defines the environment/resources AWS must create to test the function locally. Note these configurations don't extend to the hosted function. Those must be set in the AWS console.
On invocation, the function should be supplied a single user UID to create a radcast for. On receipt of this event, the following will happen:
- All active feeds and their episodes are downloaded, as well as the user's feed subscriptions
- Episode validation: Checks the episodes pointed to by the user records are valid
- Audio stitching: Loads individual files from GCS, transcodes and concatenates them into 1 radcast audio file, then saves (uploads) the output to GCS
- Writes the metadata related to the radcast to the relevant subcollection of the user's doc
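As a rough illustration, the entry point might look like the sketch below. The event field name (`userUID`), the class name `RadcastCreator`, and its method are guesses for illustration, not confirmed from the source.

```js
// index.js - hypothetical sketch of the entry point.
// The real field and class names in this project may differ.
const RadcastCreator = require('./radcast-creation/radcast_creator');

exports.handler = async (event) => {
  const userUID = event.userUID; // hypothetical field carrying the target user's UID
  const creator = new RadcastCreator(userUID);
  // Runs validation, audio stitching, and the metadata write described above.
  return creator.createRadcast();
};
```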
(Will add image when I get back - I don't want to risk the package being too big to deploy)
Episode validation requires the following requirements to pass:

1. User must be actively subscribed to the feed: `activeSubscription: true`
2. Story episode:
   - 2.1 `active: true`
   - 2.2 `audioURL: <String>` (i.e. non-null)
   - 2.3 `episodeIndex` >= user's index for that feed
3. Daily episode:
   - 3.1 `currentDate - 2 days <= relevantDate <= currentDate + hrs`
   - 3.2 `audioURL: <String>` (i.e. non-null)
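A minimal sketch of these rules as a predicate, assuming plain-object episode and subscription records with the field names listed above (the helper name, the `type` discriminator, and `userIndex` are assumptions):

```js
// Hypothetical sketch of the validation rules above.
const TWO_DAYS_MS = 2 * 24 * 60 * 60 * 1000;

function isEpisodeValid(episode, subscription, now = new Date()) {
  // Rule 1: user must be actively subscribed to the feed.
  if (!subscription || subscription.activeSubscription !== true) return false;

  if (episode.type === 'story') {
    // Rules 2.1-2.3.
    return episode.active === true &&
      typeof episode.audioURL === 'string' &&
      episode.episodeIndex >= subscription.userIndex; // user's index for that feed
  }

  if (episode.type === 'daily') {
    // Rules 3.1-3.2. The upper bound ("+ hrs") is unspecified in the
    // source, so only the two-day lower bound is checked here.
    return typeof episode.audioURL === 'string' &&
      new Date(episode.relevantDate) >= new Date(now.getTime() - TWO_DAYS_MS);
  }

  return false;
}
```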
The process of episode validation involves downloading all active feeds and their episodes subcollections. The data for each feed and episode are fed to classes in the project and used to instantiate objects. This is for coherence and modularity. The feed class has its episodes stored in two ways: 1) by episode UID, and 2) by index.
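For illustration, the dual storage might look like this (the class shape and property names are guesses):

```js
// Hypothetical sketch of a feed holding its episodes under two indexes.
class Feed {
  constructor(feedData) {
    this.uid = feedData.uid;
    this.episodesByUID = {};   // lookup by episode UID
    this.episodesByIndex = {}; // lookup by episode index
  }

  addEpisode(episode) {
    this.episodesByUID[episode.uid] = episode;
    this.episodesByIndex[episode.episodeIndex] = episode;
  }
}
```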
Once validated episodes have been acquired, a `Radcast` object is created (to hold the metadata specific to the radcast), and an `AudioManager` object is created to perform the following:
- Download individual episode audio files
- Concatenate these audio files
- Upload the resultant file to GCS
The audio download and stitching process is complicated by general intros and context intros. General intros (which can be disabled by users) feature at the beginning of radcasts if they exist. Then come context intros, which may or may not exist, depending on whether the content team have uploaded them for a particular episode. Then come the episodes in the user's set priority order. A radcast intro and episode order could look like this:
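For example (a purely hypothetical ordering):

General intro → Context intros (grouped) → Episode 1 → Episode 2 → Episode 3 (episodes in the user's priority order)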
The complication here is that context intro metadata is coupled to its relevant episode document, and it doesn't have a priority of its own - this needs to be inferred at radcast creation time. The `AudioDownload` class is used to aid with sequential downloading and ordering during downloading and audio concatenation. The methodology is difficult to explain simply, but here's my best attempt:

- `AudioDownload` objects are created for all main episode audio objects - they are ordered by priority and injected into a `downloadSequence` array on the `Radcast` object.
- If required, on a second pass, `AudioDownload` objects are created for context intros, and these objects are spliced into the `downloadSequence` array at the correct index - either 0 or 1, depending on whether there is a welcome episode (see the sketch after this list).
- Audio files are downloaded in order using the `downloadSequence` array.
- On successful download of each `AudioDownload` object, a `RadcastEpisode` is created and added to the `Radcast` object. The special case is context intros, where only one `RadcastEpisode` is created for all of them.
- Once download is complete, the `downloadSequence` array is again used to set up the audio concatenation. This is done using the ffmpeg library (described below).
- The resultant radcast audio file is then uploaded to GCS.
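A minimal sketch of the sequence-building step referenced above, assuming hypothetical shapes for the episode objects and the `AudioDownload` constructor (only `downloadSequence` and the 0/1 splice position come from the description):

```js
// Hypothetical sketch: build the ordered downloadSequence described above.
// AudioDownload is the project's class; its constructor arguments are guessed.
function buildDownloadSequence(radcast, episodes, contextIntros, hasWelcomeEpisode) {
  // First pass: main episodes, ordered by the user's priority.
  radcast.downloadSequence = episodes
    .slice()
    .sort((a, b) => a.priority - b.priority)
    .map((ep) => new AudioDownload(ep.audioURL));

  // Second pass: context intros are spliced in near the front -
  // index 0, or 1 if a welcome episode occupies the first slot.
  const insertIndex = hasWelcomeEpisode ? 1 : 0;
  const introDownloads = contextIntros.map((intro) => new AudioDownload(intro.audioURL));
  radcast.downloadSequence.splice(insertIndex, 0, ...introDownloads);

  return radcast.downloadSequence;
}
```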
Audio concatenation is done using the ffmpeg library. This software needs to be compiled, which was difficult in the docker environment, hence we used a publicly available static binary. It's a mature piece of software, so it should be fine not to update the binary frequently.
Before explaining how we concat the audio files, it's important to understand the fundamentals of this process. Audio files are encoded raw digital audio data, and come in many different flavours of encoding. The key properties to consider with audio encoding are:
- Sampling rate
- Bit depth
- Frame rate (for mpeg audio)
- Bit rate
I won't go into detail about these (read about them online), but essentially, if you merge audio files with varying encoding properties without addressing the differences, audio players have a very difficult time understanding the file. To this end, to achieve a high quality output, we completely decode and re-encode the individual audio files, which adds processing overhead but guarantees output quality and readability.
We use ffmpeg via the command line, and access the command line using the `node-cmd` package. The command used has the following structure.
Command Explanation
- load multiple inputs with -i inputFilePath.ext
- specify a complex_filter, this means it supports multiple inputs and outputs
- Specify the input streams in a chain: [0:0][1:0] - this means 'select' stream 0 from input 0 and stream 0 from input 1. We use single-stream files, since they only contain audio, making this simple.
- Specify the filter 'concat' for merging streams in order
- Provide filter arguments, n number of segments to concat, v number of video streams for output, a number of audio streams for output
- Specify an output 'tag' - like an alias
- Finally, provide the output arguments and output file. Unlike the example, we add some extra arguments:
- -map_metadata -1 => Remove metadata from inputs (this is critical)
- -c:a => specify the output audio codec - currently using mp3
- -b:a => specify output bitrate - currently 128kb/s
- -metadata => Add the metadata tags we actually want to be there
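Putting the pieces together, a two-input version of the command might look like the sketch below. File names and the title tag are placeholders; the project runs the command via `node-cmd`, but this sketch uses Node's built-in `child_process` so it runs without extra dependencies.

```js
// Hypothetical reconstruction of the concat command described above.
const { exec } = require('child_process');

const cmd = [
  'ffmpeg',
  '-i intro.mp3',    // input 0 (placeholder file name)
  '-i episode1.mp3', // input 1 (placeholder file name)
  // [0:0][1:0] selects stream 0 of each input; concat joins n=2 segments
  // into 0 video streams and 1 audio stream, tagged [out].
  "-filter_complex '[0:0][1:0]concat=n=2:v=0:a=1[out]'",
  "-map '[out]'",
  '-map_metadata -1',             // strip input metadata (critical - see note below)
  '-c:a libmp3lame',              // output audio codec: mp3
  '-b:a 128k',                    // output bitrate: 128kb/s
  '-metadata title="My radcast"', // placeholder metadata tag
  'output.mp3',
].join(' ');

exec(cmd, { timeout: 180000 }, (err, stdout, stderr) => {
  if (err) throw err;
  console.log('Transcode complete');
});
```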
Note on removing input metadata tags
This is essential because a number of the audio files in the Radcast system are edited by the creative team with software programs that add some kind of 'end of file' tag, and this causes streaming libraries to stop buffering the radcast audio file data when they read this tag.
Transcoding timeout
There is a 3 minute (180s) timeout on the transcoding process. This is to safeguard against the shell process hanging. It is untested (yikes), as there has not been time to simulate such a failure yet. This timeout means that if the Lambda docker container gets reused, this process won't be running - it's unclear what AWS does to 'refresh' the container between invocations, so doing this manually is safer.
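If the timeout is managed manually rather than through a library option, the guard might look like this sketch (not the project's actual implementation):

```js
// Hypothetical manual timeout guard around the transcode child process.
const { spawn } = require('child_process');

function runWithTimeout(command, args, timeoutMs = 180000) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    // Kill the process if it's still running after the timeout, so a
    // reused container never inherits a hung transcode.
    const timer = setTimeout(() => {
      child.kill('SIGKILL');
      reject(new Error(`Transcode timed out after ${timeoutMs}ms`));
    }, timeoutMs);
    child.on('exit', (code) => {
      clearTimeout(timer);
      code === 0 ? resolve() : reject(new Error(`ffmpeg exited with code ${code}`));
    });
    child.on('error', (err) => {
      clearTimeout(timer);
      reject(err);
    });
  });
}
```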
There's a very simple script that performs the scheduling of radcasts for each user. The protocol for scheduling a radcast is:

- The `nextRadcastDate` must be in the past.
- Find the next day for a radcast using the `radcastFrequency` dictionary.
- Find the diff (number of days) between the next day and today; use this and the `scheduledTime` to create a date object for the next radcast (see the sketch after this list).
- Upload this to the user's doc.
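A minimal sketch of the computation, assuming `radcastFrequency` maps weekday numbers to booleans and `scheduledTime` is an `{hours, minutes}` pair (both shapes are guesses):

```js
// Hypothetical sketch of the scheduling protocol above.
function nextRadcastDate(radcastFrequency, scheduledTime, now = new Date()) {
  // Walk forward up to a week to find the next enabled weekday.
  for (let diff = 1; diff <= 7; diff++) {
    const candidate = new Date(now);
    candidate.setDate(now.getDate() + diff);
    if (radcastFrequency[candidate.getDay()]) {
      candidate.setHours(scheduledTime.hours, scheduledTime.minutes, 0, 0);
      return candidate; // the caller uploads this to the user's doc
    }
  }
  return null; // no radcast days enabled
}
```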
The error handling for the createSingleRadcast function comes in 2 flavours:
- Hard errors - complete failure to create radcast
- Soft errors - partial failure wherein a radcast can still be delivered
In the case of a hard error, the VM managing the task queue (of which this function invocation is one task) will re-attempt the task by simply re-invoking the function. It will do this twice, hence total attempts = 3, before giving up.
The two kinds of soft errors are validation errors and download errors; both are discussed above. If they occur, they will not prevent the radcast being delivered UNLESS the success rate drops below a threshold (60% - set in the AudioManager class). Failures are reported both to Snowden (Elk) and in the CMS.
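A sketch of the threshold check, under the assumption that it is the episode success rate being compared against the 60% figure (the names here are hypothetical):

```js
// Hypothetical sketch of the soft-error threshold described above.
const SUCCESS_RATE_THRESHOLD = 0.6; // the 60% figure set in the AudioManager class

function shouldDeliverRadcast(attemptedEpisodes, successfulEpisodes) {
  const successRate = successfulEpisodes / attemptedEpisodes;
  // Validation and download failures are tolerated until the success
  // rate falls below the threshold; past that, the radcast is not delivered.
  return successRate >= SUCCESS_RATE_THRESHOLD;
}
```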