Serverless investment portfolio management.
This project assists with investment portfolio management. It retrieves account balances from financial institutions and deploys a web application summarising the actions required to rebalance the portfolio.
You can take a look at a demo deployment here.
The project consists of several pieces working together:
- A GitHub Actions workflow periodically runs Python code to retrieve account balances.
- Balances and asset prices are stored in a plain text Beancount ledger.
- AWS Cloud Development Kit is used to define the cloud application in code.
- Datasette transforms data in the Beancount ledger into an interactive website.
- Mangum allows the Datasette web application to run on AWS Lambda, resulting in a practically zero-cost deployment.
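To make "the actions required to rebalance the portfolio" concrete, here is a minimal sketch of the underlying arithmetic. It is not the project's code: the project derives its summary from the Beancount ledger, while the function name `rebalance_actions`, the asset names and the figures below are all hypothetical.

```python
from decimal import Decimal


def rebalance_actions(balances, targets):
    """Return the amount to buy (+) or sell (-) per asset to reach the targets.

    balances: current market value per asset.
    targets: target weight per asset; weights are assumed to sum to 1.
    """
    total = sum(balances.values(), Decimal(0))
    return {
        asset: (total * weight - balances.get(asset, Decimal(0))).quantize(
            Decimal("0.01")
        )
        for asset, weight in targets.items()
    }


# Hypothetical portfolio: 7,000 in shares and 3,000 in bonds, targeting 60/40.
balances = {"shares": Decimal("7000"), "bonds": Decimal("3000")}
targets = {"shares": Decimal("0.6"), "bonds": Decimal("0.4")}
actions = rebalance_actions(balances, targets)
print(actions)
```

With these made-up numbers the result is to sell 1,000 of shares and buy 1,000 of bonds.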
- Summary
- Contents
- How-to guides
- How to set up a local development environment
- How to activate Mamba environment
- How to update Node.js
- How to update AWS CDK Toolkit
- How to update AWS Construct Library
- How to update Python dependencies
- How to update Lambda function Python dependencies
- How to serve Datasette locally
- How to rotate personal access tokens
- How to update ubank device credentials
- Explanation
- Reference
Create the portfolio Mamba environment defined in environment.yml:
micromamba create --file environment.yml --yes
Activate the portfolio environment:
micromamba activate portfolio
Install Node.js dependencies:
npm ci
Install Python dependencies:
poetry install
Install Lambda function Python dependencies:
cd cdk/function
poetry install
cd -
Activate the portfolio environment with the following command:
micromamba activate portfolio
Pin the nodejs dependency in environment.yml to the active LTS version listed on this page.
Install the latest version of AWS CDK Toolkit:
npm install aws-cdk
Update just the AWS Construct Library (aws-cdk-lib):
poetry update aws-cdk-lib
Update Python dependencies to their latest versions (according to version constraints in pyproject.toml):
poetry update
Update the Lambda function's dependencies, which are defined in a separate directory:
cd cdk/function
poetry update
cd -
Build an SQLite database from a Beancount ledger using commands almost identical to those in the deploy workflow:
micromamba activate portfolio
bean-sql ../portfolio-ledger/portfolio.beancount cdk/function/portfolio.db
sqlite3 cdk/function/portfolio.db < ../portfolio-ledger/target_allocation.sql
sqlite3 cdk/function/portfolio.db < tables.sql
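The two `sqlite3 ... < file.sql` lines above simply run a SQL script against the database. If the shell client is not available, roughly the same thing can be done from Python's standard library. This is a sketch only: the helper name `apply_sql_script` and the demonstration table are hypothetical and are not part of this repository.

```python
import sqlite3
import tempfile
from pathlib import Path


def apply_sql_script(db_path, script_path):
    """Run every statement in a SQL file against a SQLite database."""
    script = Path(script_path).read_text()
    connection = sqlite3.connect(db_path)
    try:
        connection.executescript(script)
        connection.commit()
    finally:
        connection.close()


# Demonstration against a throwaway database; the table is made up.
with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "example.db"
    script_path = Path(tmp) / "example.sql"
    script_path.write_text(
        "CREATE TABLE target (asset TEXT, weight REAL);\n"
        "INSERT INTO target VALUES ('shares', 0.6);\n"
    )
    apply_sql_script(db_path, script_path)
    connection = sqlite3.connect(db_path)
    rows = connection.execute("SELECT asset, weight FROM target").fetchall()
    connection.close()

print(rows)
```

In this project the equivalent calls would be along the lines of `apply_sql_script("cdk/function/portfolio.db", "tables.sql")`.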
Run the Datasette web application locally using the following command.
Changes to metadata.yaml will restart the web application, which is useful when developing dashboards.
Unlike the production application deployed to AWS, the local application does not have GitHub authentication configured.
datasette cdk/function/portfolio.db --reload --metadata cdk/function/metadata.yaml
Complete the following steps after a personal access token expiry notification is received.
For the portfolio/actions/write token:
- Regenerate the token.
- Update portfolio-ledger's TOKEN repository secret.
For the portfolio-ledger/contents/write token:
- Regenerate the token.
- Update portfolio's TOKEN repository secret.
ubank device credentials are stored in AWS Parameter Store. Enrol a new device and update the parameter with the following commands:
$ python -m ubank name@domain.com --output device.json
Enter ubank password:
Enter security code sent to 04xxxxx789: 123456
$ aws ssm put-parameter \
--name "/portfolio/ubank-device" \
--value "$(< device.json)" \
--type SecureString \
--overwrite \
--region us-east-1 \
--no-cli-pager
{
"Version": 19,
"Tier": "Standard"
}
$ rm device.json
Check the parameter's value with the following command:
$ aws ssm get-parameter \
--name "/portfolio/ubank-device" \
--with-decryption \
--region us-east-1 \
--output text \
--query 'Parameter.Value' \
--no-cli-pager
{
"hardware_id": "d5c79ef7-8d6a-4feb-b129-a7f54440a348",
"device_id": "85ce55d4-4175-4016-bc37-1b563c680763",
...
}
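Before uploading device.json to Parameter Store, it can be worth checking that the file is well-formed. The sketch below is an assumption-laden example rather than part of the ubank tooling: the helper name `check_device_json` is hypothetical, and it only checks for the two fields visible in the output above (the real file contains more).

```python
import json

# Only the two fields shown in the output above; the real file has more.
REQUIRED_KEYS = {"hardware_id", "device_id"}


def check_device_json(text):
    """Return the parsed device credentials, or raise ValueError."""
    try:
        device = json.loads(text)
    except json.JSONDecodeError as error:
        raise ValueError(f"not valid JSON: {error}") from error
    if not isinstance(device, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS.difference(device)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return device


# Sample input using the values shown above.
sample = (
    '{"hardware_id": "d5c79ef7-8d6a-4feb-b129-a7f54440a348", '
    '"device_id": "85ce55d4-4175-4016-bc37-1b563c680763"}'
)
device = check_device_json(sample)
print(sorted(device))
```

In practice the text would come from the file itself, for example `check_device_json(open("device.json").read())`, run before `aws ssm put-parameter`.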
Credentials used to authenticate with financial institutions are stored in this repository 😱. This makes it a breeze to develop and test things locally.
Encryption and decryption are handled using git-crypt. GitHub Actions workflows decrypt the secrets file when required.
This application serves financial information over the internet. The Datasette plugin datasette-auth-github is used to control access to the application.
The OAuth application Portfolio Datasette was registered on GitHub, with Authorization callback URL set to https://portfolio.brodie.id.au/-/github-auth-callback.
The Lambda function configures the plugin's OAuth client ID and secret settings with values retrieved from Parameter Store.
Access is restricted to my GitHub user ID. Forbidden requests are redirected to the GitHub auth page using the datasette-redirect-forbidden plugin.
Portfolio information is summarised in dashboard charts using the datasette-dashboards plugin. This information should be visible on the index page, rather than having to navigate to the dashboard page.
datasette-dashboards documentation suggests using <iframe> elements to embed dashboards and charts in HTML.
However, it was difficult to achieve a responsive layout without scrollbars using this approach.
Instead, a subset of elements from the dashboard page are included in the index page's HTML.
Datasette's index page is customised using the description_html metadata property.
The Datasette CLI and the extract-dashboard.py script are used to extract HTML elements from the dashboard page.
The following command extracts dashboard HTML elements to the clipboard for easy pasting into metadata.yaml:
datasette serve \
--get https://portfolio.brodie.id.au/-/dashboards/portfolio \
--metadata cdk/function/metadata.yaml \
cdk/function/portfolio.db | \
python extract-dashboard.py | \
pbcopy
This command does the same for the demo application:
datasette serve \
--get https://portfolio-demo.brodie.id.au/-/dashboards/portfolio \
--metadata cdk/function/metadata.yaml \
cdk/function/portfolio.db | \
python extract-dashboard.py | \
pbcopy
Lambda allocates CPU proportional to the amount of memory configured. Request durations of multiple seconds were observed with the default setting of 128 MB. Occasional timeouts occurred when the default timeout of 3 seconds was exceeded.
Setting memory size to 1024 MB resulted in much shorter durations: 100 ms or less. Costs should be comparable or even reduced as we're using more expensive compute but for less time.
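The cost claim can be checked with a little arithmetic. Lambda bills compute in GB-seconds (configured memory multiplied by duration), so a larger memory setting is cheaper whenever the duration falls by a bigger factor than the memory rises. The sketch below uses the 100 ms figure from above; the 3000 ms figure for 128 MB is an illustrative assumption standing in for "multiple seconds", and per-request charges are ignored.

```python
def gb_seconds(memory_mb, duration_ms):
    """Billed compute for one invocation, in GB-seconds."""
    return (memory_mb / 1024) * (duration_ms / 1000)


slow = gb_seconds(128, 3000)  # 128 MB, assumed 3 s duration
fast = gb_seconds(1024, 100)  # 1024 MB, 100 ms duration

print(slow)  # 0.375
print(fast)  # 0.1
```

Under these assumptions the 1024 MB configuration uses less than a third of the billed compute per request, despite having eight times the memory.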
This section describes the workflows used by this project.
flowchart LR
dispatch --> deploy
subgraph portfolio
s2([Schedule]) --> test[<pre>test</pre> workflow]
s([Schedule]) --> update[<pre>update</pre> workflow]
p([Push]) --> deploy[<pre>deploy</pre> workflow]
end
subgraph "portfolio-ledger (private)"
p2([Push]) --> dispatch[<pre>dispatch</pre> workflow]
end
Financial institutions' websites and APIs are subject to change. The test workflow runs pytest tests fortnightly. A failed run indicates that something on a financial institution's end has changed and the code needs to be fixed.
The update workflow updates the Beancount ledger in portfolio-ledger with the latest balances and asset prices.
It is scheduled to run approximately every 10 days.
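The word "approximately" is likely a consequence of cron syntax, which GitHub Actions uses for schedules and which cannot express an exact 10-day interval across month boundaries. The sketch below assumes a day-of-month field of `*/10`; that is a guess, as the workflow file is not shown here, and the helper name `step_days` is hypothetical. It lists the days of the month on which such a schedule would fire.

```python
import calendar


def step_days(year, month, step):
    """Days of the month matched by a cron day-of-month field of `*/step`."""
    last_day = calendar.monthrange(year, month)[1]
    return list(range(1, last_day + 1, step))


print(step_days(2024, 1, 10))  # [1, 11, 21, 31]
print(step_days(2024, 2, 10))  # [1, 11, 21]
```

Under that assumption, runs are 10 days apart within a month, but the gap around the start of each month is shorter (one day after a 31-day month).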
The deploy workflow converts a Beancount ledger contained in the private portfolio-ledger repository to an SQLite database and then deploys the CDK application to AWS.
It is triggered when changes are pushed to this repository or the portfolio-ledger repository.
dispatch (in portfolio-ledger)
This workflow triggers the deploy workflow when changes are pushed to portfolio-ledger, regardless of whether the changes were manual or automatic.
My AWS account was configured to trust GitHub's OpenID Connect (OIDC) provider. This allows workflows to deploy to AWS without using long-lived credentials.
Note
This hack is no longer required after reverse engineering SelfWealth's mobile API. I'll keep this information here, because it may be required again in the future.
Some financial institutions' websites behave differently when accessed from the GitHub Actions network, likely due to overly sensitive anti-bot protection. Code that would successfully retrieve a balance when run on a computer at home would fail when run on GitHub Actions.
To work around this, the update and test workflows connect to a Tailscale network and route traffic via an exit node at home.
Name | Description |
---|---|
portfolio-ledger/contents/write | Grants contents:write access to portfolio-ledger repository. Used by update workflow to update Beancount ledger, and by deploy workflow to check out portfolio-ledger. |
portfolio/actions/write | Grants actions:write access to this repository. Used by portfolio-ledger's dispatch workflow to trigger this repository's deploy workflow. |
The following secrets were created in the repository:
Name | Description |
---|---|
GIT_CRYPT_KEY | Used by git-crypt to decrypt secret repository files. |
TAILSCALE_OAUTH_CLIENT_SECRET | Used in update and test workflows to connect to a Tailscale network. |
TOKEN | portfolio-ledger/contents/write personal access token. |
The Lambda function retrieves the following parameters from Parameter Store on startup:
Name | Description |
---|---|
/portfolio/datasette-secret | Key used to sign Datasette cookies. |
/portfolio/github-client-id | GitHub OAuth application client ID. |
/portfolio/github-client-secret | GitHub OAuth application client secret. |
/portfolio/ubank-device | Enrolled ubank device credentials. |
Parameters are stored in the us-east-1 region.
List parameters associated with this project with the following command:
aws ssm describe-parameters --region us-east-1 --parameter-filters "Key=tag:project,Values=portfolio" --query 'Parameters[*].[Name,Type]' --output text --no-cli-pager
/portfolio/datasette-secret SecureString
/portfolio/github-client-id String
/portfolio/github-client-secret SecureString
/portfolio/ubank-device SecureString