Enable custom aggregate functions (take 2) #529

Merged 36 commits on Sep 8, 2022
Changes from 34 commits
Commits
24a6a12
initial commit.
Jul 3, 2020
fad9ba6
documentation
llimllib Sep 5, 2022
d191caa
remove no-longer-valid type
llimllib Sep 5, 2022
0d937a7
close over state initialization for performance
llimllib Sep 5, 2022
8fd3f8a
link documentation in comment
llimllib Sep 5, 2022
ba733ba
more testing
llimllib Sep 5, 2022
9e6b462
run tests if they're main
llimllib Sep 6, 2022
573afa7
accept a single arg
llimllib Sep 6, 2022
a3abdcb
this kind of works but I'm abandoning this branch
llimllib Sep 6, 2022
9daf01f
a middle road sqlite3_agg_context solution
llimllib Sep 6, 2022
ec5c72b
try out auto-updating state
llimllib Sep 6, 2022
a927950
improve quantile test, add multiple agg test
llimllib Sep 6, 2022
e643bd9
add a null to the test
llimllib Sep 6, 2022
2cbdb0e
acorn fails to parse ||=, whatever
llimllib Sep 6, 2022
b9ccd48
make eslint happy
llimllib Sep 6, 2022
ac548d4
make initial_value an argument
llimllib Sep 7, 2022
bf22aa1
test step and finalize exceptions
llimllib Sep 7, 2022
55858e9
add memory leak test
llimllib Sep 7, 2022
9a0c185
update docs to current interface
llimllib Sep 7, 2022
2445107
delete state in exception handlers
llimllib Sep 7, 2022
5b62cf6
remove null state
llimllib Sep 7, 2022
062f147
return init function and document object
llimllib Sep 7, 2022
7aff1ae
more tests and update back to init function
llimllib Sep 7, 2022
67f85e5
update redefinition test for new interface
llimllib Sep 7, 2022
b8692d4
update README to match fixed signature
llimllib Sep 7, 2022
b41e5cf
more consistent test formatting
llimllib Sep 7, 2022
d257bba
Update README.md
llimllib Sep 7, 2022
e82c286
clarify what exactly the result will contain
llimllib Sep 7, 2022
b65457c
Update README.md
lovasoa Sep 7, 2022
8d2c2e0
Update README.md
lovasoa Sep 7, 2022
f8f4a7c
Update README.md
lovasoa Sep 7, 2022
bdaa1b6
Update README.md
lovasoa Sep 7, 2022
e86d7ff
Update README.md
lovasoa Sep 7, 2022
423fc36
Improve documentation and type annotations
lovasoa Sep 8, 2022
f8e7bd3
ignore documentation in eslintrc
lovasoa Sep 8, 2022
799ebcd
reduce code size
lovasoa Sep 8, 2022
27 changes: 27 additions & 0 deletions README.md
@@ -75,6 +75,33 @@ db.create_function("add_js", add);
// Run a query in which the function is used
db.run("INSERT INTO hello VALUES (add_js(7, 3), add_js('Hello ', 'world'));"); // Inserts 10 and 'Hello world'

// You can create custom aggregation functions, by passing a name
// and a set of functions to `db.create_aggregate`:
//
// - an `init` function. This function receives no arguments and returns
// the initial value for the state of the aggregate function.
// - a `step` function. This function takes two arguments
// - the current state of the aggregation
// - a new value to aggregate to the state
// It should return a new value for the state.
// - a `finalize` function. This function receives a state object, and
// returns the final value of the aggregate. It can be omitted, in which case
// the final value of the state will be returned directly by the aggregate function.
//
// Here is an example aggregation function, `json_agg`, which will collect all
// input values and return them as a JSON array:
db.create_aggregate(
"json_agg",
{
init: () => [],
step: (state, val) => [...state, val],
finalize: (state) => JSON.stringify(state),
}
);

db.exec("SELECT json_agg(column1) FROM (VALUES ('hello'), ('world'))");
// -> The result of the query is the string '["hello","world"]'

// Export the database to a Uint8Array containing the SQLite database file
const binaryArray = db.export();
```
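The contract described in the README comments above can be sketched without a database. The helper below, `runAggregate`, is hypothetical and not part of sql.js. It only mimics, in plain JavaScript, how `init`, `step`, and `finalize` are assumed to compose over the rows of a single non-empty group, using the defaults the PR documents (`init` returning `null`, `finalize` returning the state unchanged).

```javascript
// Hypothetical helper, not part of sql.js: emulates the order in which the
// aggregate callbacks are assumed to run for one non-empty group of rows.
function runAggregate(aggregateFunctions, values) {
  if (typeof aggregateFunctions.step !== "function") {
    throw new Error("An aggregate function must have a step function");
  }
  // Defaults documented in the PR: null initial state, identity finalize.
  const init = aggregateFunctions.init || (() => null);
  const finalize = aggregateFunctions.finalize || ((state) => state);

  let state = init();
  for (const value of values) {
    // Each step receives the current state and returns the next one.
    state = aggregateFunctions.step(state, value);
  }
  return finalize(state);
}

// Same callbacks as the README's json_agg example.
const jsonAgg = {
  init: () => [],
  step: (state, val) => [...state, val],
  finalize: (state) => JSON.stringify(state),
};

// finalize omitted on purpose: the last state is the result.
const jsSum = { init: () => 0, step: (state, value) => state + value };

console.log(runAggregate(jsonAgg, ["hello", "world"])); // ["hello","world"]
console.log(runAggregate(jsSum, [1, 2])); // 3
```

Because `step` returns a new state instead of mutating shared data, the same callbacks object can be reused for any number of groups in this sketch.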
280 changes: 218 additions & 62 deletions src/api.js
@@ -225,6 +225,14 @@ Module["onRuntimeInitialized"] = function onRuntimeInitialized() {
"",
["number", "string", "number"]
);

// https://www.sqlite.org/c3ref/aggregate_context.html
// void *sqlite3_aggregate_context(sqlite3_context*, int nBytes)
var sqlite3_aggregate_context = cwrap(
"sqlite3_aggregate_context",
"number",
["number", "number"]
);
var registerExtensionFunctions = cwrap(
"RegisterExtensionFunctions",
"number",
@@ -1131,81 +1139,90 @@ Module["onRuntimeInitialized"] = function onRuntimeInitialized() {
return sqlite3_changes(this.db);
};

/** Register a custom function with SQLite
@example Register a simple function
db.create_function("addOne", function (x) {return x+1;})
db.exec("SELECT addOne(1)") // = 2
var extract_blob = function extract_blob(ptr) {
var size = sqlite3_value_bytes(ptr);
var blob_ptr = sqlite3_value_blob(ptr);
var blob_arg = new Uint8Array(size);
for (var j = 0; j < size; j += 1) {
blob_arg[j] = HEAP8[blob_ptr + j];
}
return blob_arg;
};

@param {string} name the name of the function as referenced in
SQL statements.
@param {function} func the actual function to be executed.
@return {Database} The database object. Useful for method chaining
*/
var parseFunctionArguments = function parseFunctionArguments(argc, argv) {
var args = [];
for (var i = 0; i < argc; i += 1) {
var value_ptr = getValue(argv + (4 * i), "i32");
var value_type = sqlite3_value_type(value_ptr);
var arg;
if (
value_type === SQLITE_INTEGER
|| value_type === SQLITE_FLOAT
) {
arg = sqlite3_value_double(value_ptr);
} else if (value_type === SQLITE_TEXT) {
arg = sqlite3_value_text(value_ptr);
} else if (value_type === SQLITE_BLOB) {
arg = extract_blob(value_ptr);
} else arg = null;
args.push(arg);
}
return args;
};
var setFunctionResult = function setFunctionResult(cx, result) {
switch (typeof result) {
case "boolean":
sqlite3_result_int(cx, result ? 1 : 0);
break;
case "number":
sqlite3_result_double(cx, result);
break;
case "string":
sqlite3_result_text(cx, result, -1, -1);
break;
case "object":
if (result === null) {
sqlite3_result_null(cx);
} else if (result.length != null) {
var blobptr = allocate(result, ALLOC_NORMAL);
sqlite3_result_blob(cx, blobptr, result.length, -1);
_free(blobptr);
} else {
sqlite3_result_error(cx, (
"Wrong API use : tried to return a value "
+ "of an unknown type (" + result + ")."
), -1);
}
break;
default:
sqlite3_result_null(cx);
}
};

/** Register a custom function with SQLite
@example <caption>Register a simple function</caption>
db.create_function("addOne", function (x) {return x+1;})
db.exec("SELECT addOne(1)") // = 2

@param {string} name the name of the function as referenced in
SQL statements.
@param {function} func the actual function to be executed.
@return {Database} The database object. Useful for method chaining
*/
Database.prototype["create_function"] = function create_function(
name,
func
) {
function wrapped_func(cx, argc, argv) {
var args = parseFunctionArguments(argc, argv);
var result;
function extract_blob(ptr) {
var size = sqlite3_value_bytes(ptr);
var blob_ptr = sqlite3_value_blob(ptr);
var blob_arg = new Uint8Array(size);
for (var j = 0; j < size; j += 1) {
blob_arg[j] = HEAP8[blob_ptr + j];
}
return blob_arg;
}
var args = [];
for (var i = 0; i < argc; i += 1) {
var value_ptr = getValue(argv + (4 * i), "i32");
var value_type = sqlite3_value_type(value_ptr);
var arg;
if (
value_type === SQLITE_INTEGER
|| value_type === SQLITE_FLOAT
) {
arg = sqlite3_value_double(value_ptr);
} else if (value_type === SQLITE_TEXT) {
arg = sqlite3_value_text(value_ptr);
} else if (value_type === SQLITE_BLOB) {
arg = extract_blob(value_ptr);
} else arg = null;
args.push(arg);
}
try {
result = func.apply(null, args);
} catch (error) {
sqlite3_result_error(cx, error, -1);
return;
}
switch (typeof result) {
case "boolean":
sqlite3_result_int(cx, result ? 1 : 0);
break;
case "number":
sqlite3_result_double(cx, result);
break;
case "string":
sqlite3_result_text(cx, result, -1, -1);
break;
case "object":
if (result === null) {
sqlite3_result_null(cx);
} else if (result.length != null) {
var blobptr = allocate(result, ALLOC_NORMAL);
sqlite3_result_blob(cx, blobptr, result.length, -1);
_free(blobptr);
} else {
sqlite3_result_error(cx, (
"Wrong API use : tried to return a value "
+ "of an unknown type (" + result + ")."
), -1);
}
break;
default:
sqlite3_result_null(cx);
}
setFunctionResult(cx, result);
}
if (Object.prototype.hasOwnProperty.call(this.functions, name)) {
removeFunction(this.functions[name]);
@@ -1229,6 +1246,145 @@ Module["onRuntimeInitialized"] = function onRuntimeInitialized() {
return this;
};

/** Register a custom aggregate with SQLite
@example <caption>Register a custom sum function</caption>
db.create_aggregate("js_sum", {
init: () => 0,
step: (state, value) => state + value,
finalize: state => state
});
db.exec("SELECT js_sum(column1) FROM (VALUES (1), (2))"); // = 3

@param {string} name the name of the aggregate as referenced in
SQL statements.
@param {object} aggregateFunctions
object containing at least a step function.
@param {function(): T} [aggregateFunctions.init = ()=>null]
a function receiving no arguments and returning an initial
value for the aggregate function. The initial value will be
null if this key is omitted.
@param {function(T, any) : T} aggregateFunctions.step
a function receiving the current state and a value to aggregate
and returning a new state.
Will receive the value from init for the first step.
@param {function(T): any} [aggregateFunctions.finalize = (state)=>state]
a function returning the result of the aggregate function
given its final state.
If omitted, the value returned by the last step
will be used as the final value.
@return {Database} The database object. Useful for method chaining
@template T
*/
Database.prototype["create_aggregate"] = function create_aggregate(
name,
aggregateFunctions
) {
if (!Object.hasOwnProperty.call(aggregateFunctions, "step")
) {
throw "An aggregate function must have a step function in " + name;
}

// Default initializer and finalizer
function init() { return null; }
function finalize(state) { return state; }

aggregateFunctions["init"] = aggregateFunctions["init"] || init;
aggregateFunctions["finalize"] = aggregateFunctions["finalize"]
|| finalize;

// state maps each aggregate context pointer p to the state of that
// invocation, so that multiple invocations of this aggregate never
// step on each other
var state = {};

function wrapped_step(cx, argc, argv) {
// > The first time the sqlite3_aggregate_context(C,N) routine is
// > called for a particular aggregate function, SQLite allocates N
// > bytes of memory, zeroes out that memory, and returns a pointer
// > to the new memory.
//
// We're going to use that pointer as a key into our state object,
// since using sqlite3_aggregate_context as it's meant to be used
// through WebAssembly seems to be very difficult. Just allocate
// one byte.
var p = sqlite3_aggregate_context(cx, 1);

// If this is the first invocation of wrapped_step, call `init`
//
// Make sure that every path through the step and finalize
// functions deletes the value state[p] when it's done so we don't
// leak memory and possibly stomp the init value of future calls
if (!Object.hasOwnProperty.call(state, p)) {
state[p] = aggregateFunctions["init"].apply(null);
}

var args = parseFunctionArguments(argc, argv);
var mergedArgs = [state[p]].concat(args);
try {
state[p] = aggregateFunctions["step"].apply(null, mergedArgs);
} catch (error) {
delete state[p];
sqlite3_result_error(cx, error, -1);
}
}

function wrapped_finalize(cx) {
var result;
var p = sqlite3_aggregate_context(cx, 1);
try {
result = aggregateFunctions["finalize"].apply(null, [state[p]]);
} catch (error) {
delete state[p];
sqlite3_result_error(cx, error, -1);
return;
}

setFunctionResult(cx, result);

delete state[p];
}

if (Object.prototype.hasOwnProperty.call(this.functions, name)) {
removeFunction(this.functions[name]);
delete this.functions[name];
}
if (Object.prototype.hasOwnProperty.call(
this.functions,
name + "__finalize"
)) {
removeFunction(this.functions[name + "__finalize"]);
delete this.functions[name + "__finalize"];
}
// The signature of the wrapped function is :
// void wrapped(sqlite3_context *db, int argc, sqlite3_value **argv)
var step_ptr = addFunction(wrapped_step, "viii");

// The signature of the wrapped function is :
// void wrapped(sqlite3_context *db)
var finalize_ptr = addFunction(wrapped_finalize, "vi");
this.functions[name] = step_ptr;
this.functions[name + "__finalize"] = finalize_ptr;

// passing null to the sixth parameter defines this as an aggregate
// function
//
// > An aggregate SQL function requires an implementation of xStep and
// > xFinal and NULL pointer must be passed for xFunc.
// - http://www.sqlite.org/c3ref/create_function.html
this.handleError(sqlite3_create_function_v2(
this.db,
name,
aggregateFunctions["step"].length - 1,
SQLITE_UTF8,
0,
0,
step_ptr,
finalize_ptr,
0
));
return this;
};
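
The pointer-keyed bookkeeping in `wrapped_step` and `wrapped_finalize` can be illustrated in isolation. The sketch below is a stand-in under stated assumptions: `makeAggregate` and the integer "pointers" are made up for illustration, whereas in the PR the key comes from `sqlite3_aggregate_context(cx, 1)` and errors are reported through `sqlite3_result_error` rather than rethrown.

```javascript
// Hypothetical stand-in for the wrappers in create_aggregate. The integer
// "pointers" passed in are invented; they play the role of the value
// returned by sqlite3_aggregate_context(cx, 1) in the PR.
function makeAggregate(aggregateFunctions) {
  const init = aggregateFunctions.init || (() => null);
  const finalize = aggregateFunctions.finalize || ((s) => s);
  const state = {}; // one entry per live aggregate context

  return {
    step(p, value) {
      // First step for this context: create its state lazily.
      if (!Object.prototype.hasOwnProperty.call(state, p)) {
        state[p] = init();
      }
      try {
        state[p] = aggregateFunctions.step(state[p], value);
      } catch (error) {
        delete state[p]; // do not leak the entry when step throws
        throw error; // the PR reports via sqlite3_result_error instead
      }
    },
    finalize(p) {
      try {
        return finalize(state[p]);
      } finally {
        delete state[p]; // always release the entry, even on error
      }
    },
    liveContexts() {
      return Object.keys(state).length;
    },
  };
}

const sum = makeAggregate({ init: () => 0, step: (s, v) => s + v });

// Two interleaved aggregations, e.g. two groups of the same query.
sum.step(1001, 1);
sum.step(2002, 10);
sum.step(1001, 2);
sum.step(2002, 20);

const first = sum.finalize(1001);
const second = sum.finalize(2002);
console.log(first, second, sum.liveContexts()); // 3 30 0
```

Deleting the entry on every exit path matters for two reasons named in the PR's comments: it avoids leaking memory, and it prevents a later invocation that happens to get the same pointer from inheriting stale state.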

// export Database to Module
Module.Database = Database;
};
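
The argument count registered with `sqlite3_create_function_v2` is computed as `aggregateFunctions["step"].length - 1`. The sketch below shows what that expression yields for a few shapes of `step`. `declaredSqlArity` is a hypothetical helper that only repeats the expression, and the results follow from how JavaScript defines `Function.prototype.length`, which stops counting at the first default or rest parameter.

```javascript
// Hypothetical helper mirroring the expression used in the PR:
//   aggregateFunctions["step"].length - 1
function declaredSqlArity(step) {
  return step.length - 1;
}

const oneArg = (state, value) => state + value;
const twoArgs = (state, x, y) => state + x * y;
// Rest and default parameters are not counted by Function.prototype.length.
const restArgs = (state, ...values) => state + values.length;
const defaulted = (state, value = 0) => state + value;

console.log(declaredSqlArity(oneArg)); // 1
console.log(declaredSqlArity(twoArgs)); // 2
console.log(declaredSqlArity(restArgs)); // 0
console.log(declaredSqlArity(defaulted)); // 0
```

Under this reading, a `step` written with a rest or default parameter would be registered as taking zero SQL arguments, so spelling out each parameter explicitly is the safer choice.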
1 change: 1 addition & 0 deletions src/exported_functions.json
@@ -41,5 +41,6 @@
"_sqlite3_result_int",
"_sqlite3_result_int64",
"_sqlite3_result_error",
"_sqlite3_aggregate_context",
"_RegisterExtensionFunctions"
]
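
The new `cwrap("sqlite3_aggregate_context", ...)` call in `src/api.js` needs a matching `_sqlite3_aggregate_context` entry here. A small check along these lines could catch a missing export. It is a sketch only: `missingExports` is a hypothetical helper, not part of the sql.js build, and it assumes the usual Emscripten convention that exported C symbols carry a leading underscore while `cwrap` takes the name without it.

```javascript
// Hypothetical consistency check, not part of the sql.js build.
// Returns the cwrap'ed names that have no "_"-prefixed export entry.
function missingExports(wrappedNames, exportedFunctions) {
  const exported = new Set(exportedFunctions);
  return wrappedNames.filter((name) => !exported.has("_" + name));
}

// The tail of exported_functions.json as shown in this diff.
const exportedFunctions = [
  "_sqlite3_result_int",
  "_sqlite3_result_int64",
  "_sqlite3_result_error",
  "_sqlite3_aggregate_context",
  "_RegisterExtensionFunctions",
];

// A few of the names wrapped in src/api.js.
const wrapped = [
  "sqlite3_result_error",
  "sqlite3_aggregate_context",
  "RegisterExtensionFunctions",
];

console.log(missingExports(wrapped, exportedFunctions)); // []
```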