Google Bigtable Rust PostgreSQL FDW

Rust PostgreSQL foreign data wrapper for interfacing with Google Cloud Bigtable, as well as other API-compatible databases (HBase should work with some effort).

While the logic lives in Rust, it hooks into PostgreSQL's C FDW callbacks.

Quick start

You can get up and running quickly with a PG + Rust setup using docker-compose.

docker-compose up -d
docker exec -it `docker ps | grep btpgext_vm | awk '{print $1}'` /bin/bash

This will set up statically linked Rust 1.16 and PG 9.6 in an Ubuntu Xenial image and drop you into a shell, where you can run make install and start playing.

Roadmap

  • SELECT
  • SELECT LIMIT
  • SELECT OFFSET
  • SELECT WHERE
  • INSERT
  • UPDATE - an update can be achieved by inserting an existing key
  • DELETE
  • Support for PG 9.3+
  • Useful EXPLAIN
  • Reduce C boilerplate

Installation

  • PostgreSQL 9.6+
  • stable Rust, get it using rustup.

git clone https://github.com/durch/google-bigtable-postgres-fdw.git
cd google-bigtable-postgres-fdw
make install
psql -U postgres

Initial DB setup

CREATE EXTENSION bigtable;
CREATE SERVER test FOREIGN DATA WRAPPER bigtable OPTIONS (instance '`instance_id`', project '`project_id`');
CREATE FOREIGN TABLE test(bt json) SERVER test OPTIONS (name '`table_name`');
CREATE USER MAPPING FOR postgres SERVER test OPTIONS (credentials_path '`path_to_service_account_json_credentials`');
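
For illustration, the same setup with hypothetical values filled in (my-instance, my-project, my-table and the credentials path are invented, substitute your own):

-- hypothetical values, adjust to your environment
CREATE SERVER test FOREIGN DATA WRAPPER bigtable OPTIONS (instance 'my-instance', project 'my-project');
CREATE FOREIGN TABLE test(bt json) SERVER test OPTIONS (name 'my-table');
CREATE USER MAPPING FOR postgres SERVER test OPTIONS (credentials_path '/etc/google/credentials.json');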

Usage

You can use gen.py to generate some test data. Modify gen.py to adjust the number of generated records, and change the column key in the generated output so that it matches a column family that exists in your Bigtable. Running python gen.py outputs test.sql, which can be fed into PG.

psql -U postgres < test.sql
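
gen.py's exact output isn't reproduced here, but each generated statement should follow the INSERT format described below; a hypothetical test.sql excerpt:

-- hypothetical excerpt of a generated test.sql
INSERT INTO test VALUES (
    '[{"row_key": "row-0", "family": "cf1", "qualifier": "test", "value": "value-0"}]'
);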

SELECT

One Bigtable row is returned per PG row. LIMIT is pushed down to the BT side, and rows are returned as json, so they can be further manipulated using Postgres json functions and operators. WHERE is evaluated on the PG side, so be sure to grab what you need from BT.

SELECT * FROM test;
SELECT * FROM test LIMIT 100;

SELECT bt->'familyName', bt->'qualifier' FROM test WHERE bt->>'rowKey' ~* '.*regex.*';
SELECT bt->'familyName', bt->'qualifier' FROM test WHERE bt->>'rowKey' = 'exact';
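
Because the returned rows are ordinary json, they compose with the rest of SQL; a small sketch, assuming the familyName key from the examples above:

-- hypothetical PG-side aggregation over rows fetched from BT
SELECT bt->>'familyName' AS family, count(*) AS n
FROM test
GROUP BY bt->>'familyName';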

INSERT

INSERT takes a json array of rows, as defined in the core BT library:

[
    {
        "row_key": "string",
        "family": "string",
        "qualifier": "string",
        "value": "string"
    }
]
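
Since the payload is plain json, it can also be built with Postgres json helpers instead of a hand-written literal; a sketch, assuming the cf1 column family used in the examples:

-- hypothetical: build the row array with json_build_object/json_agg
INSERT INTO test
SELECT json_agg(json_build_object(
           'row_key', 'row-' || i,
           'family', 'cf1',
           'qualifier', 'test',
           'value', 'value-' || i))
FROM generate_series(1, 3) AS i;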

As you are passing in one json object, which gets inserted using one HTTP request, the INSERT counter always shows one row inserted; the real number of rows can be retrieved using the RETURNING clause.

-- assuming default column name from the conf snippet above:
INSERT INTO test VALUES (
        '[
           {
             "value": "{\"id\": \"6vh22634bst6ysowcquxwd57e15cudr7\", \"lat\": 27, \"lng\": -169)}",
             "qualifier": "test",
             "family": "cf1",
             "row_key": "8xhxbop9azuufgxp"
           }
         ]'
    ) RETURNING bt;

           bt
-------------------------
{"count":1,"meta":"Ok"}
(1 row)
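
A hypothetical two-row insert; per the note above, the command tag still reports one row, while RETURNING bt should carry the real count:

INSERT INTO test VALUES (
        '[
           {"row_key": "row-1", "family": "cf1", "qualifier": "test", "value": "a"},
           {"row_key": "row-2", "family": "cf1", "qualifier": "test", "value": "b"}
         ]'
    ) RETURNING bt;
-- expected to report something like {"count":2,"meta":"Ok"}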