Glossary

  • Control Plane — (or: CP) Our server and API that store user input and control pipelines.
  • API - gRPC API for the control plane. All installations share the same private proto API.
  • Console — (or: Front UI) A separate layer of Console APIs that serves the UI in all installations. The term "Console" also refers to our UI team.
  • Endpoint - Configuration of a database; can be a Source or a Destination.
  • Transfer — (or: Flow, Connector) A data pipeline; essentially a link between a Source and a Destination.
  • Runtime - The place where we execute our workload. We use external runtimes (such as a separate EC2 machine, K8S jobs, a YT vanilla operation, or compute cloud virtual machines in Yandex.cloud).
  • Data Plane — (or: DP) Our main value: the actual executor of Transfers. It is executed inside a specific Runtime.
  • Abstract - (or: Abstract 1) The set of base structs and interfaces that a transfer operates with.
  • Abstract 2 - (or: Base) An alternative set of interfaces that a transfer may operate with. The main difference is that it operates on an abstract Event rather than a concrete Change Item, which makes it more flexible.
  • Change Item — (or: Event) Transfer is a LOGICAL data transfer service. The minimum unit of data is a logical ROW (object). The source and the target communicate via ChangeItems.
  • Provider — Pluggable data storage implementation.
  • Model — Configuration of a specific plugin. Usually a plain old Go struct with fields. The fields are part of the contract between CP and DP, so any change to them must be compatibility-aware.
  • Source — (or: Replication, ReplicationSource) Replication data provider. An infinite worker that pushes newly arrived Change Items into a Sink.
  • Storage — Snapshot data provider. A finite worker that reads data in batches.
  • Sink — Data writer.
  • DataProvider - Abstract 2 implementation of a snapshot/replication data source.
  • Target - Abstract 2 implementation of a data sink.
  • Task — (or: Operation) Finite work inside the data plane.
  • Replication — Infinite work inside the data plane. Executes a Source-to-Sink connection inside the corresponding Runtime.
  • Pool - (or: Worker) An infinite control plane worker that schedules Replications and Tasks.
  • Job — (or: VM, Worker, Instance) Each data plane can run on several runtime artifacts (such as a VM, Pod, or YT Job).
  • Process — (or: Thread, Goroutine). Inside each data plane we can run parallel reads of data.
  • Parser — Parses raw unstructured data into Change Items.
  • Serializer — Serializes Change Items into a byte sequence.
  • Module Owner - (or: Module Maintainer) The package owner stated in the OWNER section of ya.make files.
  • Feature Owner - A person or group of people responsible for a linked group of packages (or for a provider as a whole). The current mapping of feature owners is here, but this is a more informal separation. In most cases, automation already knows who is most responsible for a feature based on the amount of contribution.
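
The relationship between Change Item, Storage, Source, and Sink described above can be sketched in Go. This is a minimal illustration under stated assumptions, not the project's actual interfaces: every type, field, and method name below (`ChangeItem`, `Sink.Push`, `Storage.LoadSnapshot`, `Source.Run`, `memSink`, `sliceStorage`, `chanSource`, `runDemo`) is hypothetical and chosen only to mirror the glossary terms.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ChangeItem is a hypothetical, simplified logical row.
// The field set is an assumption made for this sketch.
type ChangeItem struct {
	Kind   string // assumed values: "insert", "update", "delete"
	Table  string
	Values map[string]any
}

// Sink is the data writer.
type Sink interface {
	Push(items []ChangeItem) error
}

// Storage is a snapshot data provider: finite work, delivered in batches.
type Storage interface {
	LoadSnapshot(ctx context.Context, push func([]ChangeItem) error) error
}

// Source is a replication data provider: it runs until the context is
// cancelled (or, in this sketch, until its input channel is closed).
type Source interface {
	Run(ctx context.Context, sink Sink) error
}

// memSink keeps everything it receives in memory.
type memSink struct{ items []ChangeItem }

func (s *memSink) Push(items []ChangeItem) error {
	s.items = append(s.items, items...)
	return nil
}

// sliceStorage serves a fixed set of rows in batches of batchSize.
type sliceStorage struct {
	rows      []ChangeItem
	batchSize int
}

func (s sliceStorage) LoadSnapshot(ctx context.Context, push func([]ChangeItem) error) error {
	if s.batchSize <= 0 {
		return errors.New("batchSize must be positive")
	}
	for start := 0; start < len(s.rows); start += s.batchSize {
		if err := ctx.Err(); err != nil {
			return err
		}
		end := start + s.batchSize
		if end > len(s.rows) {
			end = len(s.rows)
		}
		if err := push(s.rows[start:end]); err != nil {
			return err
		}
	}
	return nil
}

// chanSource forwards each newly arrived item to the sink.
type chanSource struct{ events <-chan ChangeItem }

func (s chanSource) Run(ctx context.Context, sink Sink) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case item, ok := <-s.events:
			if !ok {
				return nil // channel closed: stop (demo only)
			}
			if err := sink.Push([]ChangeItem{item}); err != nil {
				return err
			}
		}
	}
}

// runDemo performs a snapshot followed by replication into one sink and
// returns the row count after the snapshot and after replication.
func runDemo() (int, int, error) {
	ctx := context.Background()
	sink := &memSink{}
	var storage Storage = sliceStorage{batchSize: 2, rows: []ChangeItem{
		{Kind: "insert", Table: "users", Values: map[string]any{"id": 1}},
		{Kind: "insert", Table: "users", Values: map[string]any{"id": 2}},
		{Kind: "insert", Table: "users", Values: map[string]any{"id": 3}},
	}}
	if err := storage.LoadSnapshot(ctx, sink.Push); err != nil {
		return 0, 0, err
	}
	afterSnapshot := len(sink.items)

	events := make(chan ChangeItem, 1)
	events <- ChangeItem{Kind: "update", Table: "users", Values: map[string]any{"id": 1}}
	close(events)
	var source Source = chanSource{events: events}
	if err := source.Run(ctx, sink); err != nil {
		return 0, 0, err
	}
	return afterSnapshot, len(sink.items), nil
}

func main() {
	afterSnapshot, total, err := runDemo()
	fmt.Println(afterSnapshot, total, err)
}
```

In this sketch the Storage side terminates once its rows are exhausted, while the Source side only stops when its context is cancelled or its input ends. That mirrors the finite/infinite distinction the glossary draws between Task and Replication.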