
# Kubernetes Resources


The UrbanOS platform uses Kubernetes to host both the core microservice applications and external applications, such as Redis. The core applications can optionally be configured to connect to managed cloud services for external dependencies. If external dependencies are enabled and configured as part of the UrbanOS Helm release, they are managed under Helm and hosted on the Kubernetes cluster instead of by a managed cloud service.
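As a rough sketch of that choice, a Helm values override might toggle a dependency between in-cluster and cloud-managed hosting. The keys below are illustrative assumptions only, not the actual schema of the UrbanOS charts; consult each chart's values.yaml for the real toggles.

```yaml
# values.override.yaml -- hypothetical keys for illustration only;
# the real toggles live in the UrbanOS charts' values.yaml files.
redis:
  enabled: true                        # hosted on the cluster under this Helm release
elasticsearch:
  enabled: false                       # assume a managed cloud service instead
  externalHost: "search.example.com"   # placeholder endpoint for the managed service
```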

Core microservice applications can be found here: https://github.com/UrbanOS-Public/smartcitiesdata

UrbanOS chart configuration can be found here: https://github.com/UrbanOS-Public/charts

This document aims to summarize most Kubernetes resources, with a strong focus on dependencies between resources and how they are managed. The resources present will differ based on how the platform is configured, so this document assumes all resources are hosted on the Kubernetes cluster.

## Applications

This is a list of the various applications used in the UrbanOS platform. External applications can be cloud managed or deployed alongside the core applications in the Kubernetes cluster.

| Name | Description | Core/External | Required |
| --- | --- | --- | --- |
| Alchemist | Microservice that performs transformations within the data pipeline. Reads extracted data from the data pipeline (from Reaper) and writes transformed data back to the data pipeline (to Valkyrie). | Core | Yes |
| Andi | Microservice that hosts the front end for adding and configuring datasets. Used by admins/curators of the system. Pushes events onto the Kafka event-stream for the system to react to. | Core | Yes |
| Discovery-API | Microservice for accessing information about datasets as well as the data itself. Queries Trino to access the persisted data stack. Reacts to the event-stream, but not the data pipeline. | Core | Yes |
| Discovery-Streams | Microservice that streams processed data over a websocket for subscription-based data consumption. Reacts to the event-stream and is also the tail of the data pipeline, reading from it after Forklift has persisted the data. | Core | Yes |
| Discovery-UI | Microservice that hosts the front end for the data access portion of the system. Used by the end user. | Core | Yes |
| Elasticsearch | Search engine used for finding datasets in Discovery. | External | Yes |
| Forklift | Microservice that writes processed data from the Kafka data pipeline into the persisted data stack. | Core | Yes |
| Hive | A distributed, fault-tolerant data warehouse system that enables analytics at a massive scale. | External | Yes |
| Kafka | An event streaming platform used to communicate domain-level events over an "event-stream" topic and to distribute processed data between the core microservices. | External | Yes |
| Minio | A high-performance, S3-compatible object store. Used as a Kubernetes alternative to cloud buckets for storing data. | External | No, but some S3-compatible interface is required |
| Postgres | PostgreSQL is an object-relational database system that uses and extends the SQL language. Used as the persistence layer for Andi and Discovery-API, and by Trino/Hive to store metadata. | External | Yes |
| Raptor | Microservice used as an authentication layer. Retrieves application API keys, verifies access groups, and is partially responsible for logging in. | Core | Yes |
| Reaper | Microservice that pulls data into the platform. Handles scheduling data extractions as well as parsing and performing ingestion steps as configured by the system curators. Begins the Kafka data pipeline by parsing and writing extracted data onto it. | Core | Yes |
| Redis | An in-memory object store, primarily used to manage entity state (datasets, ingestions, organizations, and more) in the core applications, with some additional caching when querying external APIs such as Auth0. | External | Yes |
| Trino | A tool designed to efficiently query vast amounts of data using distributed queries. Trino is the only way to read/write data to/from the persisted data stack. | External | Yes |
| Valkyrie | Microservice used to validate and standardize data. If incoming data is of the wrong data type but can be converted, Valkyrie standardizes it; if incoming data cannot be standardized, Valkyrie rejects it. | Core | Yes |
| Vault | A secret store that can be deployed on Kubernetes with fine-grained access control. Only used for the "Secret" ingestion extract step; not used for user credentials. | External | Yes |

## Pods

Pods run containers built from (Docker) images, which isolate an environment containing the dependencies an application needs. Generally, pods are managed by a deployment, operator, or stateful set. External information can be connected to pods via environment variables and storage volumes (Persistent Volume Claims), as sketched below.
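As a minimal, generic sketch of those two connection points (not taken from any UrbanOS chart; every name below is a placeholder):

```yaml
# Generic pod spec fragment -- all names are placeholders, not UrbanOS values.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
    - name: app
      image: example/app:1.0
      env:
        - name: REDIS_HOST             # external information via an environment variable
          valueFrom:
            secretKeyRef:
              name: example-secrets
              key: redis-host
      volumeMounts:
        - name: data                   # external storage via a PVC-backed volume
          mountPath: /var/lib/app
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: example-app-data
```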

| Name | Kubernetes Only? | Description | Used By | Uses | Safe to delete? | Kustomized? | Managed by Operator? | Chart repository | Triage Flow/Troubleshooting Docs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| `*-alchemist-*` | Yes | Hosts the Alchemist microservice. Applies transformations, defined by the curator through Andi, to each message that Reaper put onto the data pipeline, and writes the transformed data back to the pipeline for Valkyrie to pick up. | N/A | Kafka: reads events from the event-stream; also reads from the data pipeline to perform transformations on a data extraction.<br>Redis: manages entity state. | Yes, it will restart. | No | No | https://github.com/UrbanOS-Public/charts/tree/master/charts/alchemist | N/A |
| `andi-*` | Yes | Hosts the ANDI front end. | Curator: curators/admins use Andi to register ingestions, datasets, access groups, and more. | Redis: maintains entity state.<br>Postgres: stores front-end UI state.<br>Kafka: places messages on the event-stream to be read downstream; also reads from the event-stream.<br>Auth0: connects to an external Auth0 tenant for authentication. | Yes, it will restart from the last Kafka message. | No | No | https://github.com/UrbanOS-Public/charts/tree/master/charts/andi | N/A |
| `discovery-api-*` | Yes | Hosts the Discovery-API microservice. Can be queried directly via its API or used by Discovery-UI as a backend. | End user: can query discovery-api directly to obtain data from the persisted data stack.<br>Discovery-UI: uses discovery-api as a backend for data retrieval. | Kafka: reads from the event-stream to receive entity updates.<br>Redis: maintains entity state.<br>Auth0: connects to an external Auth0 tenant for authentication.<br>Elasticsearch: used to search for datasets.<br>Presto: used to query data already saved to the system. | Yes, it will restart from the last Kafka message. | No | No | https://github.com/UrbanOS-Public/charts/tree/master/charts/discovery-api | N/A |
| `discovery-streams-*` | Yes | Hosts a websocket service that retrieves data; can be subscribed to externally via a WS connection. | End users: subscribe to a websocket to receive data based on subscriptions. | Redis: stores entity state in the viewstore.<br>Kafka: used to receive the latest processed data to publish externally. | Yes, it will restart from the last Kafka message. | No | No | https://github.com/UrbanOS-Public/charts/tree/master/charts/discovery-streams | N/A |
| `discovery-ui-*` | Yes | Hosts the Discovery front-end application. | End user: front end for end users. | Redis: maintains entity state.<br>Discovery-API: used as a backend to obtain data. | Yes, it will restart from the last Kafka message. | No | No | https://github.com/UrbanOS-Public/charts/tree/master/charts/discovery-ui | N/A |
| `elasticsearch-master-*` | No, can be cloud managed | At least one of the high-availability pods that host the Elasticsearch server. | Discovery-API: connects to Elasticsearch to search datasets. | N/A | Partially; it will restart but drop any current transactions. | No | No | https://github.com/elastic/helm-charts | N/A |
| `forklift-*` | Yes | "Lifts" processed data from the Kafka data pipeline into the data stack (Trino/Hive/Presto/Minio). | N/A | Redis: maintains entity state.<br>Kafka: reads from both the main event-stream for entity updates and the data pipeline for processed data.<br>Presto: used to store processed data.<br>Minio: used to manage the buckets that the data is stored in. | Yes, it will restart from the last Kafka message. | No | No | https://github.com/UrbanOS-Public/charts/tree/master/charts/forklift | N/A |
| `hive-metastore-*` | No, can be cloud managed | Hosts the Hive portion of the data stack (Trino/Hive/Presto/Minio). | Trino: uses Hive as a connector to data stored in Minio; all queries into Hive come from Trino. | Minio (optional): can serve as the S3-compatible connection that Hive needs to manage data storage.<br>Postgres: used to save metadata about queries. | Partially; it will restart but drop any current transactions. | No | No | https://github.com/trinodb/charts | N/A |
| `kafka-exporter-*` | No, can be cloud managed | Exports Kafka metrics for Prometheus. Managed by the strimzi-kafka-operator. | Prometheus: collects metrics about Kafka. | Kafka: monitors all Kafka topics and consumers. | Yes, it will restart. | No | Yes: strimzi-kafka-operator | https://github.com/strimzi/strimzi-kafka-operator/tree/main/helm-charts/helm3/strimzi-kafka-operator | N/A |
| `kafka-scraper-cron-*` | Yes | Runs a quick cronjob that logs the primary Kafka metrics. This is a stand-in for clusters that do not have Prometheus installed but want to monitor logs to detect errors. Since Kafka metrics can indicate a backup of messages, this cronjob unifies application metrics with a log aggregator. | Log aggregator: if a log aggregator is installed on the cluster, it uses this cron to collect Kafka metrics in log form. | Kafka Exporter: queries the exporter's endpoint to retrieve and log the current metrics. | Yes, but that cronjob run will be skipped. | No | No | https://github.com/UrbanOS-Public/charts/tree/master/charts/kafka | N/A |
| `minio-pool-*` | No, any S3-compatible service can be used | Controller for the primary data storage layer. Managed by the minio-operator and configured by the Minio Tenant (see the Tenant sketch after this table). | Trino: used by Trino/Hive to store data.<br>Forklift: indirectly used by Forklift (through Trino) to read/write data.<br>Discovery-API: indirectly used by Discovery-API (through Trino) to read data. | N/A | Yes, it will restart. | No | Yes: minio-operator | https://github.com/minio/operator/tree/master/helm | N/A |
| `minio-operator-*` | Yes | Hosts the Minio operator, which uses the Tenant configuration to manage Minio-related resources and operations, such as establishing SSL certs, creating the initial user and buckets, and creating secrets. | N/A | Tenant: detects installed tenants and manages them based on their configuration. | Partially; it will restart but drop any current transactions. | No | No, it is the operator. | https://github.com/minio/operator/tree/master/helm | N/A |
| `pipeline-entity-operator-*` | Yes | Hosts the Kafka entity operator, which monitors cluster state to manage Kafka topics and users. Responsible for creating topics and users as configured in the chart (see the KafkaTopic sketch after this table). | N/A | Kafka: ensures configured Kafka topics are created and configured based on chart values. | Yes, it will restart. | No | Yes: strimzi-kafka-operator | https://github.com/strimzi/strimzi-kafka-operator/tree/main/helm-charts/helm3/strimzi-kafka-operator | N/A |
| `pipeline-kafka-*` | No | These pods host the Kafka brokers. Brokers handle reads and writes and synchronize data between each other to ensure reliability and fault tolerance; they also provide storage for written messages. | Andi: reads/writes the event-stream topic.<br>Reaper: reads/writes the event-stream topic; also begins the data pipeline for each batch of ingested data.<br>Alchemist: reads/writes the event-stream topic; also used in the data pipeline, after Reaper, to perform transformations on each batch of data.<br>Valkyrie: reads/writes the event-stream topic; also used in the data pipeline, after Alchemist, to validate data types.<br>Forklift: reads/writes the event-stream topic; also used in the data pipeline, after Valkyrie, to persist the pipeline's data into the primary data stack (Trino/Hive/Presto/Minio).<br>Discovery-API: reads/writes the event-stream topic.<br>Discovery-Streams: reads/writes the event-stream topic; also used in the data pipeline, after Forklift, to stream the processed data to any subscribers. | N/A | Partially; it will restart but drop any current transactions. | No | Yes: strimzi-kafka-operator | https://github.com/strimzi/strimzi-kafka-operator/tree/main/helm-charts/helm3/strimzi-kafka-operator | N/A |
| `pipeline-zookeeper-*` | No | Hosts the Apache ZooKeeper controller, which manages the status of the Kafka brokers. Responsible for operations like electing a broker leader in the context of high availability and configuring brokers based on the charts. | N/A | Kafka: ensures configured Kafka brokers are created and configured based on chart values. | Yes, it will restart. | No | Yes: strimzi-kafka-operator | https://github.com/strimzi/strimzi-kafka-operator/tree/main/helm-charts/helm3/strimzi-kafka-operator | N/A |
| `*postgres*` | No | The name of this pod depends on the release name, which may vary. Hosts the Postgres database, which is not managed under the primary UrbanOS release and must be deployed independently as its own release. | Andi: stores front-end UI state.<br>Discovery-API: maintains organization and user state.<br>Trino/Hive: stores metadata about data. | N/A | Partially; it will restart but drop any current transactions. | No | No | N/A (deployed as its own release) | N/A |
| `raptor-*` | Yes | Hosts the Raptor microservice, an authentication layer for the platform. Responsible for retrieving application API keys, verifying access groups, and partially responsible for logging in. | Andi: validates API keys and sets access groups.<br>Discovery-API: creates/validates API keys and reads/writes access groups. | Auth0: uses the Auth0 API to obtain roles and user metadata. | Yes, it will restart. | No | No | https://github.com/UrbanOS-Public/charts/tree/master/charts/raptor | N/A |
| `reaper-*` | Yes | Hosts the Reaper microservice. Responsible for scheduling and performing data extractions from external sources. The first step in the data pipeline. | N/A | Kafka: reads events from the event-stream; also begins the data pipeline by placing batched data onto Kafka topics.<br>Redis: manages entity state. | Yes, it will restart. | No | No | https://github.com/UrbanOS-Public/charts/tree/master/charts/reaper | N/A |
| `redis-*-master-*` & `redis-*-replicas-*` | Yes | The master pod performs all write operations to Redis and offloads some read operations to the replicas; if the master goes down, a replica is promoted. Redis is primarily used to manage entity state in the applications, with some additional caching mechanisms. | Andi: manages entity state; also caches external API calls, like Auth0, to avoid rate limiting.<br>Reaper: manages entity state.<br>Alchemist: manages entity state.<br>Valkyrie: manages entity state.<br>Forklift: manages entity state; also used to track the message-count state of a data extraction.<br>Discovery-API: manages entity state; also caches external API calls, like Auth0, to avoid rate limiting. | N/A | Yes, it will restart. | No | No | https://github.com/bitnami/charts/tree/main/bitnami/redis | N/A |
| `*-trino-coordinator-*` & `*-trino-worker-*` | No | The coordinator pod receives, parses, and plans the execution of a SQL query; it distributes the execution plan to the workers and returns the combined results once finished. | Forklift: sends SQL queries that write the data pipeline's output to the persisted data stack.<br>Discovery-API: sends SQL queries that read data to be reported to the end user. | Hive: used as a connector to access data stored behind the S3 interface. | Yes, it will restart. | Yes; kustomizations are needed on an OpenShift cluster because the Trino chart does not allow configuring the Security Context within the deployment spec (see the kustomize sketch after this table). | No | https://github.com/trinodb/charts/tree/main/charts/trino | N/A |
| `strimzi-cluster-operator-*` | Yes | The Kafka operator. Manages all Kafka resources, including the entity operator. | N/A | Kafka: generally manages all Kafka resources, either directly or through another operator. | Yes, it will restart. | No | No, it is the operator. | https://github.com/strimzi/strimzi-kafka-operator/tree/main/helm-charts/helm3/strimzi-kafka-operator | N/A |
| `valkyrie-*` | Yes | Hosts the Valkyrie microservice. Responsible for reading transformed data (after Alchemist) from the data pipeline, validating the data types, standardizing when possible, and writing the data back to the pipeline for Forklift to consume. | N/A | Kafka: reads events from the event-stream; also reads from the data pipeline to perform data validation and standardization on a data extraction.<br>Redis: manages entity state. | Yes, it will restart. | No | No | https://github.com/UrbanOS-Public/charts/tree/master/charts/valkyrie | N/A |
| `vault-*` | No | These pods host the high-availability Vault service, which is used to store and retrieve secrets. | Andi: writes to Vault when a curator adds a secret via the ingestion extract step.<br>Reaper: reads from Vault to retrieve the secret for the ingestion secret step. | N/A | Yes, it will restart. | No | No | https://github.com/hashicorp/vault-helm | N/A |
| `vault-agent-injector-*` | No | Handles injecting secrets defined in YAML for Kubernetes upon Vault deployment. Not used in our application. | N/A | N/A | Yes, it will restart. | No | No | https://github.com/hashicorp/vault-helm | N/A |
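For the `pipeline-entity-operator-*` row above: the Strimzi entity operator reconciles `KafkaTopic` custom resources into actual topics. The `kind`, `apiVersion`, and `strimzi.io/cluster` label below are Strimzi's real API, but the topic name, cluster name, and settings are illustrative assumptions rather than UrbanOS's actual configuration.

```yaml
# Illustrative KafkaTopic -- the resource shape is real Strimzi API;
# the name, cluster label, and settings are placeholders.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: event-stream
  labels:
    strimzi.io/cluster: pipeline       # must match the Kafka cluster's name
spec:
  partitions: 3
  replicas: 3
  config:
    retention.ms: 604800000            # keep messages for 7 days
```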
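For the Trino row's "Kustomized?" note: on OpenShift, a kustomize overlay applied on top of the rendered chart output can add the security context that the chart itself does not expose. This is a minimal sketch under that assumption; the file names, patch target, and securityContext values are placeholders, not the platform's actual overlay.

```yaml
# kustomization.yaml -- illustrative overlay; names and values are placeholders.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - rendered-trino.yaml                # e.g. the output of `helm template` for the Trino chart
patches:
  - target:
      kind: Deployment
      name: trino-coordinator
    patch: |-
      - op: add
        path: /spec/template/spec/securityContext
        value:
          runAsNonRoot: true
```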
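Similarly, for the `minio-pool-*` and `minio-operator-*` rows: the MinIO operator creates pool pods from a `Tenant` custom resource. The `Tenant` kind and `minio.min.io/v2` apiVersion are the operator's real API, but the name and pool sizing below are placeholder assumptions.

```yaml
# Illustrative MinIO Tenant -- the resource shape is the operator's real API;
# the name and pool sizing are placeholders.
apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  name: urbanos-minio
spec:
  pools:
    - servers: 4                       # number of minio-pool-* pods to run
      volumesPerServer: 4
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
```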