Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 18 changed files with 190 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
book |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
[book] | ||
authors = ["Jubilee Young"] | ||
language = "en" | ||
multilingual = false | ||
src = "src" | ||
title = "Building Postgres Extensions with Rust" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# Working with Postgres Extensions | ||
|
||
- [Working with PGRX](./extension/README.md) | ||
- [Building Extensions with PGRX](./extension/build.md) | ||
- [Cross Compiling](./extension/build/cross-compile.md) | ||
- [Writing Extensions with PGRX](./extension/write.md) | ||
- [Testing Extensions with PGRX](./extension/test.md) | ||
- [Memory Checking](./extension/test/memory-checking.md) | ||
- [Basics of Postgres Internals](./pg-internal.md) | ||
- [Pass-By-Datum](./pg-internal/datum.md) | ||
- [Memory Contexts](./pg-internal/memory-context.md) | ||
- [Varlena Types](./pg-internal/varlena.md) | ||
- [`sigsetjmp` & `siglongjmp`](./pg-internal/setjmp-longjmp.md) | ||
- [Contributing](./contributing.md) | ||
- [PGRX Internals](./contributing/pgrx-internal.md) | ||
- [Releases](./contributing/release.md) | ||
- [Design Decisions](./design-decisions.md) |
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# PGRX Internals |
File renamed without changes.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
# Working with PGRX | ||
|
||
The motivation for pgrx is that writing Postgres extensions with `pgxs.mk` requires: | |
- writing a bunch of C code that must manually handle many Postgres invariants | ||
- writing SQL that then loads the extension properly, including many | ||
[CREATE FUNCTION] and [CREATE TYPE] declarations | ||
|
||
This requires programmers who wish to write Postgres extensions to become | |
experts in C, SQL, and the inner workings of Postgres, on top of having useful | |
domain knowledge for the actual extension. | |
|
||
Alternatively, with Rust, safe abstractions can be designed to encode the | ||
invariants that Postgres requires in types. Powerful procedural macros can | ||
generate the code to handle the Postgres function ABI, or even write the | ||
needed SQL declarations! This is what pgrx does, with the intent to allow | ||
writing extensions correctly while only being familiar with a single language: | ||
Rust. | ||
|
||
...and pgrx. While the annotations that pgrx requires are easy to write, they | ||
aren't necessarily automatic. And even if it is simpler than `pgxs.mk`, the | ||
pgrx build system, primarily wielded through `cargo pgrx`, sometimes needs user | ||
intervention to fix problems. Most of this is in service of allowing you to | ||
adjust exactly how much pgrx assists you, so that it doesn't get in the way | ||
if you need to force something inside it. | ||
|
||
<!-- the following may currently be aspirational rather than actual --> | ||
This guide assumes that you, the reader, are a Rust programmer, but it does not | |
expect you to be intimately familiar with Postgres, nor deeply versed in FFI. | |
It may assume some familiarity with C and SQL, but as long as you have written | |
`extern "C"` or `LEFT JOIN` before, you should be fine. | |
|
||
[CREATE FUNCTION]: https://www.postgresql.org/docs/current/sql-createfunction.html | ||
[CREATE TYPE]: https://www.postgresql.org/docs/current/sql-createtype.html |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# Building Extensions with PGRX | ||
<!-- TODO: explain the build system more --> |
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Testing Extensions with PGRX | ||
|
||
Both `cargo test` and `cargo pgrx test` can be used to run tests using the `pgrx-tests` framework. | ||
Tests annotated with `#[pg_test]` will be run inside a Postgres database. | ||
<!-- TODO: explain the test framework more --> |
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Writing Extensions with PGRX | ||
|
||
<!-- TODO: write all of this --> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# Basics of Postgres Internals |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# Pass-By-Datum | ||
|
||
Postgres functions pass values that can hypothetically have any type to one another primarily | |
through the "Datum" type. It is declared thus in the source code: | |
```c | ||
typedef uintptr_t Datum; | ||
``` | ||
|
||
In practice, Postgres uses Datum more like a sort of union, which might be logically described as | |
```rust | ||
#[repr(magic)] // This is not actually ABI-conformant | ||
union Datum { | ||
bool, | ||
i8, | ||
i16, | ||
i32, | ||
i64, | ||
f32, | ||
f64, | ||
Oid, | ||
*mut varlena, | ||
*mut c_char, // null-terminated cstring | ||
*mut c_void, | ||
} | ||
``` | ||
|
||
Thus, sometimes it is a raw pointer, and sometimes it is a value that can fit in a pointer. | ||
This causes it to incur several of the hazards of being a raw pointer, likely to a lifetime-bound | ||
allocation, yet be copied around with the gleeful abandon that one reserves for ordinary bytes. | ||
The only way to determine which variant is actually in use is to have some other contextual data. | |
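
To illustrate why the contextual data matters, here is a minimal sketch in plain Rust. It is a toy model, not the pgrx `Datum` type: the names `ToyDatum`, `TypeTag`, `describe`, and `demo` are hypothetical, and the sign-extension in `from_i32` is an assumption of this sketch rather than a statement about Postgres.

```rust
use std::ffi::{c_char, CStr, CString};

/// Hypothetical stand-in for `Datum`: one pointer-sized word with no type tag.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyDatum(usize);

impl ToyDatum {
    fn from_i32(value: i32) -> Self {
        // Assumption of this sketch: sign-extend to pointer width.
        ToyDatum(value as isize as usize)
    }

    fn from_bool(value: bool) -> Self {
        ToyDatum(value as usize)
    }

    fn from_cstr(ptr: *const c_char) -> Self {
        ToyDatum(ptr as usize)
    }

    fn as_i32(self) -> i32 {
        // Truncate back to 32 bits.
        self.0 as i32
    }

    fn as_bool(self) -> bool {
        self.0 != 0
    }

    /// # Safety
    /// The datum must hold a pointer to a live, null-terminated string.
    unsafe fn as_cstr<'a>(self) -> &'a CStr {
        unsafe { CStr::from_ptr(self.0 as *const c_char) }
    }
}

/// The "other contextual data": the datum itself cannot tell us this.
enum TypeTag {
    Int4,
    Bool,
    CString,
}

fn describe(datum: ToyDatum, tag: TypeTag) -> String {
    match tag {
        TypeTag::Int4 => format!("int4 {}", datum.as_i32()),
        TypeTag::Bool => format!("bool {}", datum.as_bool()),
        TypeTag::CString => {
            // Only sound if the tag is right and the pointee is still alive.
            let s = unsafe { datum.as_cstr() };
            format!("cstring {}", s.to_string_lossy())
        }
    }
}

fn demo() -> Vec<String> {
    let owned = CString::new("hello").expect("no interior NUL");
    vec![
        describe(ToyDatum::from_i32(-7), TypeTag::Int4),
        describe(ToyDatum::from_bool(true), TypeTag::Bool),
        describe(ToyDatum::from_cstr(owned.as_ptr()), TypeTag::CString),
    ]
    // `owned` is dropped here; a copy of the third datum kept past this point would dangle.
}
```

Because `ToyDatum` is `Copy`, nothing stops the pointer variant from being duplicated and outliving its allocation, which is the hazard described above.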
|
||
<!-- TODO: finish out the Datum<'_> drafts and provide alternatives to worrying about pointers --> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
# Memory Contexts | ||
|
||
Postgres uses a set of "memory contexts" in order to manage memory and prevent leakage, despite | ||
the fact that Postgres code may churn through tables with literally millions of rows. Most of the | ||
memory contexts that an extension's code is likely to be invoked in are transient contexts that | ||
will not outlive the current transaction. These memory contexts will be freed, including all of | ||
their contents, at the end of that transaction. This means that allocations using memory contexts | ||
will quickly be cleaned up, even in C extensions that don't have the power of Rust's compile-time | ||
memory management. However, this is incompatible with certain assumptions Rust makes about safety, | ||
thus making it tricky to correctly bind this code. | ||
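
As a rough illustration of "everything in the context is freed at once", here is a toy arena in plain Rust. It is a sketch under assumptions, not how Postgres or pgrx implements memory contexts: `ToyContext`, `alloc_zeroed`, `live_chunks`, and `run_transaction` are hypothetical names.

```rust
use std::cell::RefCell;

/// Hypothetical toy memory context: it owns every allocation made through it.
struct ToyContext {
    chunks: RefCell<Vec<*mut [u8]>>,
}

impl ToyContext {
    fn new() -> Self {
        ToyContext { chunks: RefCell::new(Vec::new()) }
    }

    /// Allocate `size` zeroed bytes that live exactly as long as the context.
    fn alloc_zeroed(&self, size: usize) -> &mut [u8] {
        let raw: *mut [u8] = Box::into_raw(vec![0u8; size].into_boxed_slice());
        self.chunks.borrow_mut().push(raw);
        // SAFETY: `raw` is a fresh, unique allocation that is freed only in `Drop`,
        // and the returned borrow keeps the context alive until it ends.
        unsafe { &mut *raw }
    }

    fn live_chunks(&self) -> usize {
        self.chunks.borrow().len()
    }
}

impl Drop for ToyContext {
    fn drop(&mut self) {
        // Free every allocation at once; nothing was freed individually.
        for raw in self.chunks.borrow_mut().drain(..) {
            // SAFETY: each pointer came from `Box::into_raw` and is freed exactly once.
            unsafe { drop(Box::from_raw(raw)) };
        }
    }
}

/// Models one "transaction": allocate freely, then let the context clean up.
fn run_transaction() -> u32 {
    let cx = ToyContext::new();
    let row = cx.alloc_zeroed(4);
    row.copy_from_slice(&[1, 2, 3, 4]);
    let sum: u32 = row.iter().map(|&b| u32::from(b)).sum();
    sum
    // `cx` is dropped here, releasing `row` and anything else allocated in it.
}
```

In this toy, the borrow checker rejects any attempt to keep `row` alive past `cx`. A real memory context can be reset or deleted by C code that the Rust compiler never sees, which is part of why a correct binding is harder than this sketch suggests.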
|
||
<!-- TODO: finish out `MemCx` drafts and provide alternatives to worrying about allocations --> | ||
|
||
## What `palloc` calls into | |
In extension code, especially that written in C, you may notice calls to the following functions | ||
for allocation and deallocation, instead of the usual `malloc` and `free`: | ||
|
||
```c | ||
typedef size_t Size; | ||
|
||
extern void *palloc(Size size); | ||
extern void *palloc0(Size size); | ||
extern void *palloc_extended(Size size, int flags); | ||
|
||
extern void pfree(void *pointer); | ||
``` | ||
<!-- | ||
// Only in Postgres 16+ | ||
extern void *palloc_aligned(Size size, Size alignto, int flags); | ||
--> | ||
When combined with appropriate type definitions, calling the `palloc` family of functions is | |
equivalent to calling the following functions with `CurrentMemoryContext` as the first argument: | |
```c | ||
typedef struct MemoryContextData *MemoryContext; | ||
#define PGDLLIMPORT | ||
extern PGDLLIMPORT MemoryContext CurrentMemoryContext; | ||
extern void *MemoryContextAlloc(MemoryContext context, Size size); | ||
extern void *MemoryContextAllocZero(MemoryContext context, Size size); | ||
extern void *MemoryContextAllocExtended(MemoryContext context, | ||
Size size, int flags); | ||
``` | ||
<!-- | ||
// Only in Postgres 16+ | ||
extern void *MemoryContextAllocAligned(MemoryContext context, | ||
Size size, Size alignto, int flags); | ||
--> | ||
|
||
Notice that `pfree` takes only the pointer as an argument, which effectively means that every | |
allocation must somehow record which context it belongs to. | |
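
The relationship between `palloc`, `MemoryContextAlloc`, and `pfree` can be modeled with a small bookkeeping sketch in safe Rust. This is a toy, not Postgres's actual chunk layout: `ToyAllocator`, `ContextId`, `Header`, and the integer handles standing in for pointers are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical identifier for a memory context.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ContextId(u32);

/// Per-allocation bookkeeping, playing the role of a chunk header.
struct Header {
    owner: ContextId,
    size: usize,
}

/// Toy allocator: handles stand in for pointers, and no real memory is allocated.
struct ToyAllocator {
    current: ContextId,
    next_handle: usize,
    headers: HashMap<usize, Header>,
}

impl ToyAllocator {
    fn new(initial: ContextId) -> Self {
        ToyAllocator { current: initial, next_handle: 1, headers: HashMap::new() }
    }

    /// Models `MemoryContextAlloc`: the caller names the context explicitly.
    fn memory_context_alloc(&mut self, context: ContextId, size: usize) -> usize {
        let handle = self.next_handle;
        self.next_handle += 1;
        self.headers.insert(handle, Header { owner: context, size });
        handle
    }

    /// Models `palloc`: the same call, with the current context filled in.
    fn palloc(&mut self, size: usize) -> usize {
        let context = self.current;
        self.memory_context_alloc(context, size)
    }

    /// Models `pfree`: only the handle is passed, so the owner is looked up
    /// from the allocation's own header. Returns `None` for an unknown handle.
    fn pfree(&mut self, handle: usize) -> Option<ContextId> {
        self.headers.remove(&handle).map(|header| header.owner)
    }

    /// Total bytes still allocated in `context`.
    fn live_bytes(&self, context: ContextId) -> usize {
        self.headers
            .values()
            .filter(|header| header.owner == context)
            .map(|header| header.size)
            .sum()
    }
}
```

Note that `pfree` returns the chunk to the context it was allocated in, even if `current` has changed since the allocation was made.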
|
||
### `CurrentMemoryContext` makes `impl Deref` hard | ||
|
||
<!-- TODO: this segment. --> | ||
|
||
### Assigning lifetimes to `palloc` is hard | ||
|
||
<!-- TODO: this segment. --> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# sigsetjmp & siglongjmp | ||
|
||
In order to handle errors that may be distributed widely across the database and deeply nested, | ||
Postgres uses `sigsetjmp` and `siglongjmp` in a certain "calling convention" to handle a stack | ||
of error-handling steps. At a "try-catch" site, `sigsetjmp` is called, and at an error site, | ||
`siglongjmp` is called, each time manipulating a global stack of error contexts to allow nested | ||
try-catches. Rust code should preferably not be jumped over, because its destructors only run | |
properly via unwinding. To address this, pgrx guards calls into C with a function that handles the | |
global state and then panics. Likewise, Rust panics are hooked in ways that then propagate into | |
errors in Postgres. | |
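
The Rust half of this arrangement can be sketched with the standard library alone. The example below does not call `sigsetjmp` or `siglongjmp` and is not pgrx's implementation; it only shows a panic unwinding through Rust frames, running destructors along the way, and being converted into an ordinary error value at a boundary. `DropCounter`, `fallible`, and `guarded` are hypothetical names.

```rust
use std::cell::Cell;
use std::panic::{self, AssertUnwindSafe};

/// Counts how many times its destructor has run.
struct DropCounter<'a>(&'a Cell<u32>);

impl Drop for DropCounter<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Ordinary Rust code that owns a value with a destructor and may panic.
fn fallible(input: i32, drops: &Cell<u32>) -> i32 {
    let _cleanup = DropCounter(drops);
    if input < 0 {
        panic!("negative input");
    }
    input * 2
}

/// Boundary: catch the unwinding panic and turn it into an error value,
/// so that nothing unwinds past this function.
fn guarded(input: i32, drops: &Cell<u32>) -> Result<i32, String> {
    panic::catch_unwind(AssertUnwindSafe(|| fallible(input, drops))).map_err(|payload| {
        // Panic payloads are usually a `&str` or a `String`.
        payload
            .downcast_ref::<&str>()
            .map(|message| message.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned())
            .unwrap_or_else(|| "unknown panic".to_string())
    })
}
```

The destructor of `_cleanup` runs on both the success path and the panic path. A `siglongjmp` over the same frame would instead skip it, which is why the two mechanisms have to be translated into each other at the boundary.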
|
||
<!-- | ||
TODO: Make the next statement slightly untrue by making it easier to call functions unsoundly so | ||
that we can call certain functions in tight loops with only a single guard on the inner loop. | ||
--> | ||
The functions normally accessed via `pgrx::pg_sys` are `unsafe`, but are less unsafe than some C | ||
functions because of this guard. You do not need to worry about `siglongjmp` when calling those. | ||
However, if you define your own `extern "C" fn` for *Postgres* to call, you may need to apply | ||
`#[pg_guard]` to handle such deep nesting between Rust and C calling contexts. | ||
|
||
If you do, try to limit the amount of code that lies within the scope of that guard, as it is easy | |
to make a mistake that makes this guard useless. Code inside the guarded scope should not have | |
any destructors, because it runs *after* `sigsetjmp` is called. Thus, destructors in that scope | |
will be skipped over! The FFI functions mentioned above, which are already guarded by pgrx, | |
each wrap only one call, which is the most appropriate scope in the majority of cases. | |
|
||
<!-- TODO: Provide more context on appropriate code, explain C-unwind a bit --> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# Varlena Types |