apf is a framework for creating a dockerized pipeline to process an alert stream, which can be easily deployed on a local machine or distributed using Kubernetes.
First developed to process ZTF data, it can be used for any streaming or static data processing pipeline.
apf can be installed with pip:

```bash
pip install apf_base
```

This will install the apf Python package and the apf command line script.
apf is based on steps connected through Apache Kafka topics. Each step is composed of a consumer and is isolated from other steps inside a Docker container.
When running, the step calls its execute() method for each message or message batch consumed. A step can have multiple producer and database back-end plugins that can be accessed inside the execute() method to implement more complex logic.
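For illustration, a minimal step could look like the sketch below. It assumes the `GenericStep` base class is importable from `apf.core.step` (as shown in the apf documentation) and that `execute()` receives whatever the configured consumer yields; `ExampleStep` is a hypothetical name.

```python
import logging

from apf.core.step import GenericStep  # import path as shown in the apf docs

logger = logging.getLogger("ExampleStep")


class ExampleStep(GenericStep):
    """Hypothetical step that logs every consumed message."""

    def execute(self, message):
        # Called by the framework for each consumed message
        # (or message batch, depending on the consumer settings).
        logger.info("Processing message: %s", message)
```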
This generic step greatly reduces the development effort for each component of the pipeline and makes it easier to test each component separately.
- Automatic Metric Sender (KafkaMetrics)
- Automatic Code Generation (`apf new-step <step_name>`)
- Multiple Consumer Plugins (selected via step configuration, as sketched after this list):
  - Kafka
  - AVRO
  - CSV
  - JSON
- Producers:
  - Kafka
  - CSV
- Metrics:
  - Kafka
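To show how a step might wire these plugins together, here is a minimal configuration sketch. The key names (`CONSUMER_CONFIG`, `PRODUCER_CONFIG`, `STEP_CONFIG`, `CLASS`, `PARAMS`), class paths, and parameter values are assumptions based on the plugin names above, not the authoritative schema; consult the apf documentation for the exact settings format.

```python
# settings.py -- illustrative sketch only; all key names, class paths,
# and parameter values below are assumptions, not the official schema.
CONSUMER_CONFIG = {
    "CLASS": "apf.consumers.KafkaConsumer",  # assumed plugin class path
    "TOPICS": ["ztf_alerts"],
    "PARAMS": {
        "bootstrap.servers": "localhost:9092",
        "group.id": "example_step",
    },
}

PRODUCER_CONFIG = {
    "CLASS": "apf.producers.KafkaProducer",  # assumed plugin class path
    "TOPIC": "example_step_output",
    "PARAMS": {
        "bootstrap.servers": "localhost:9092",
    },
}

STEP_CONFIG = {
    "CONSUMER_CONFIG": CONSUMER_CONFIG,
    "PRODUCER_CONFIG": PRODUCER_CONFIG,
}
```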
A quick-start guide for creating a new step can be found here.
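For example, the code generation command listed above scaffolds the boilerplate for a new step; the step name `example_step` below is only a placeholder:

```bash
apf new-step example_step
```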