docs: Add documentation for new v1.1 features #126

Merged · 1 commit · Feb 4, 2022
50 changes: 38 additions & 12 deletions README.md
@@ -54,34 +54,60 @@ _Oura_ running in `daemon` mode can be configured to use custom filters to pinpo

If the available out-of-the-box features don't satisfy your particular use-case, _Oura_ can be used as a library in your Rust project to set up tailor-made pipelines. Each component (sources, filters, sinks, etc) in _Oura_ aims to be self-contained and reusable. For example, custom filters and sinks can be built while reusing the existing sources.

## How it Works

Oura is, in essence, just a pipeline for processing events. Each stage of the pipeline fulfills a different role:

- Source Stages: are in charge of pulling data from the blockchain and mapping the raw blocks into smaller, more granular events. Each event is then sent through the output port of the stage for further processing.
- Filter Stages: receive individual events from the source stage and apply some sort of transformation to each one. The transformations applied will depend on the particular use-case, but they usually revolve around selecting relevant events and enriching them with extra information.
- Sink Stages: receive the final events from the filter stage and submit the payload to some external system, database or service for further processing. A minimal sketch of how these stages fit together is shown after the diagram below.

![diagram](assets/diagram.png)
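
To make the flow concrete, the following is a minimal, illustrative daemon configuration wiring one stage of each kind together. The stage types and fields shown here (an N2N source, a Fingerprint filter and a terminal sink) are only examples; the exact options for each built-in stage are covered later in the documentation.

```toml
# illustrative daemon config: one source, one filter chain, one sink

# source stage: pulls blocks from a node and maps them into granular events
[source]
type = "N2N"
address = ["Tcp", "relays-new.cardano-mainnet.iohk.io:3001"]
magic = "mainnet"

# filter stages: applied in sequence to every event coming out of the source
[[filters]]
type = "Fingerprint"

# sink stage: submits the final events to an external system or service
[sink]
type = "Terminal"
```

Filters are declared as an array of tables (`[[filters]]`) because several filter stages can be chained one after another.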

## Feature Status

- Sources
- [x] chain-sync full-block (node-to-client)
- [ ] chain-sync headers-only (node-to-node)
- [x] chain-sync + block-fetch (node-to-node)
- [ ] shared file system
- Sinks
- [x] Kafka topic
- [x] Elasticsearch index / data stream
- [x] Rotating log files with compression
- [ ] Redis streams
- [ ] AWS SQS queue
- [ ] AWS Lambda call
- [ ] GCP PubSub
- [x] webhook (http post)
- [x] terminal (append-only, tail-like)
- [ ] TUI
- Events / Parsers
- [x] block events (start, end)
- [x] transaction events (inputs, outputs, assets)
- [x] metadata events (labels, content)
- [x] mint events (policy, asset, quantity)
- [x] pool registration events
- [x] delegation events
- [x] CIP-25 metadata parser (image, files)
- [ ] CIP-15 metadata parser
- Filters
- [x] by event type (block, tx, mint, cert, etc)
- [x] by asset subject (policy, name, etc)
- [x] by metadata keys
- [ ] by block property (size, tx count)
- [ ] by tx property (fee, has native script, has plutus script, etc)
- [ ] by utxo property (address, asset, amount range)
- Enrichment
- [ ] policy info from metadata service
- [ ] input tx info from Blockfrost api
- [ ] address translation from ADAHandle
- [x] cherry pick by event type (block, tx, mint, cert, etc)
- [x] cherry pick by asset subject (policy, name, etc)
- [x] cherry pick by metadata keys
- [ ] cherry pick by block property (size, tx count)
- [ ] cherry pick by tx property (fee, has native script, has plutus script, etc)
- [ ] cherry pick by utxo property (address, asset, amount range)
- [ ] enrich events with policy info from external metadata service
- [ ] enrich input tx info from Blockfrost API
- [ ] enrich addresses descriptions using ADAHandle
- Other
- [x] stateful chain cursor to recover from restarts
- [ ] buffer stage to hold blocks until they reach a certain depth

## Known Limitations

- Oura only knows how to process blocks from the Shelley era. We are working on adding support for Byron in a future release.
- Oura reads events from minted blocks / transactions. Support for querying the mempool is planned for a future release.
- Oura will notify about chain rollbacks as a new event. The business logic for "undoing" the already processed events is the responsibility of the consumer. We're working on adding support for a "buffer" filter stage which can hold blocks until they reach a configurable depth (number of confirmations).

## Contributing

Binary file added assets/diagram.png
2 changes: 2 additions & 0 deletions book/src/SUMMARY.md
@@ -9,6 +9,7 @@
- [Usage](./usage/README.md)
- [Watch Mode](./usage/watch.md)
- [Daemon Mode](./usage/daemon.md)
- [Dump Mode](./usage/dump.md)
- [Library](./usage/library.md)
- [Filters](./filters/README.md)
- [Fingerprint](./filters/fingerprint.md)
@@ -21,6 +22,7 @@
- [Kafka](./sinks/kafka.md)
- [Elasticsearch](./sinks/elastic.md)
- [Webhook](./sinks/webhook.md)
- [Logs](./sinks/logs.md)
- [Reference](reference/README.md)
- [Data Dictionary](./reference/data_dictionary.md)
- [Guides](./guides/README.md)
1 change: 1 addition & 0 deletions book/src/sinks/README.md
@@ -10,5 +10,6 @@ These are the existing sinks that are included as part of the main _Oura_ codebase:
- [Kafka](kafka.md): a sink that sends each event into a Kafka topic.
- [Elasticsearch](elastic.md): a sink that writes events into an Elasticsearch index or data stream.
- [Webhook](webhook.md): a sink that outputs each event as an HTTP call to a remote endpoint.
- [Logs](logs.md): a sink that saves events to the file system using JSONL text files.

New sinks are being developed; information will be added to this documentation to reflect the updated list. Contributions and feature requests are welcome in our [Github Repo](https://github.com/txpipe/oura).
26 changes: 26 additions & 0 deletions book/src/sinks/logs.md
@@ -0,0 +1,26 @@
# Logs

A sink that saves events into the file system. Each event is JSON-encoded and appended to the end of a text file. Files are rotated once they reach a certain size. Optionally, old files can be automatically compressed once they have been rotated.

## Configuration

Example sink section config

```toml
[sink]
type = "Logs"
output_path = "/var/oura/mainnet"
output_format = "JSONL"
max_bytes_per_file = 1_000_000
max_total_files = 10
compression = true
```

### Section: `sink`

- `type`: the literal value `Logs`.
- `output_path`: the path-like prefix for the output log files.
- `output_format` (optional): specifies the type of syntax to use for the serialization of the events. The only available option at the moment is `JSONL` (JSON + line break).
- `max_bytes_per_file` (optional): the maximum number of bytes to write to a file before rotating it.
- `max_total_files` (optional): the maximum number of files to keep in the file system before the oldest ones start being deleted.
- `compression` (optional): a boolean indicating if the rotated files should be compressed.
2 changes: 1 addition & 1 deletion book/src/usage/README.md
Expand Up @@ -2,6 +2,6 @@

_Oura_ provides three different execution modes:

- [Dameon](daemon.md): a fully-configurable pipeline that runs in the background. Sources, filters and sinks can be combined to fulfil particular use-cases.
- [Daemon](daemon.md): a fully-configurable pipeline that runs in the background. Sources, filters and sinks can be combined to fulfil particular use-cases.
- [Watch](watch.md): to watch live block events from a node directly in the terminal. It is meant for humans; it uses colors and throttling to facilitate reading.
- [Dump](dump.md): to dump live block events from a node into rotating log files or stdout. It uses the JSONL format to persist the events.
9 changes: 9 additions & 0 deletions book/src/usage/daemon.md
@@ -44,6 +44,11 @@ type = "Z"
# custom config fields for this sink type
foo = "123"
bar = "789"

# optional cursor settings, remove section to disable feature
[cursor]
type = "File"
path = "/var/oura/cursor"
```

### The `source` section
Expand All @@ -58,6 +63,10 @@ This section specifies a collection of filters that are applied in sequence to e

This section specifies the destination of the data. The special `type` field must always be present and contain a value matching one of the available built-in sinks. The rest of the fields in the section will depend on the selected `type`. See the [sinks](../sinks/index.md) section for a list of available options.

### The `cursor` section

This section specifies how to configure the "cursor" feature. A cursor is a reference to the current position of the pipeline. If the pipeline needs to restart for whatever reason and a cursor is available, the pipeline will start reading from that point in the chain. Removing the section from the config will disable the cursor feature.
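
For instance, the file-based cursor from the example above is declared as a small section of its own:

```toml
[cursor]
type = "File"              # persist the cursor on the local file system
path = "/var/oura/cursor"  # file that stores the latest processed position
```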

### Full Example

Here's an example configuration file that uses a Node-to-Node source and outputs the events into a Kafka sink:
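
The sketch below is illustrative only; the relay address, broker list and topic name are placeholder values rather than values taken from the actual example file.

```toml
# hypothetical values for illustration only
[source]
type = "N2N"
address = ["Tcp", "relays-new.cardano-mainnet.iohk.io:3001"]
magic = "mainnet"

[sink]
type = "Kafka"
brokers = ["127.0.0.1:9092"]
topic = "oura-events"
```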
2 changes: 2 additions & 0 deletions book/src/usage/library.md
@@ -1 +1,3 @@
# Library

Coming Soon!