SinaraML Glossary

pestovid edited this page Mar 12, 2024 · 13 revisions

The SinaraML framework provides SinaraML Server, SinaraML Storage, SinaraML Lib, SinaraML Step Template, and SinaraML Tools to the data scientist.

SinaraML Server - Jupyter Server with all the libraries needed for working with data and training models. SinaraML provides three Basic Servers for different purposes: classic ML, computer vision (CV), and natural language processing (NLP).

SinaraML Storage - long-term storage where ML pipelines store their input and output entities. Depending on the infrastructure, SinaraML Storage can be implemented on top of S3, HDFS, or a local disk.

SinaraML Volume - personal data store, which can be a file system directory or a Docker volume.

SinaraML Lib - compact library that contains everything you need to create ML pipelines: data preparation and versioning, model versioning, and model serving.

SinaraML Archive - part of the SinaraML Lib that provides an efficient way to store large files and pipeline entities in the SinaraML Storage.

Bento Archive - BentoML Service that contains only artifacts, without a REST API.

SinaraML Step - ML pipeline step that consists of one or several substeps. Each substep is a Jupyter Notebook written according to defined rules.

SinaraML Step Template - component (or step) template for creating a SinaraML Step, that is, an ML pipeline step. An ML pipeline consists of several steps, each based on this template.

Pipeline Fabric - tool used to create pipelines according to the required pipeline profile.

Pipeline Profile or Pipeline Blueprint - list of predefined steps that can be used to quickly create several steps united in a single pipeline.

Pipeline Designer Lib - library used by the Pipeline Fabric to create and manage pipelines.

Interactive run - execution of a substep cell by cell in the Jupyter server UI.

Job run - execution of a Python job file (for example, step.dev.py), which runs all substeps of the current step.

SinaraML CLI - set of CLI tools for creating, deleting, stopping, and starting a SinaraML Server, building Docker images from BentoServices created by ML pipelines, managing and visualizing ML pipelines, etc.

SinaraML Basic - preconfigured personal platform that runs on a desktop or on a remote virtual machine, which can be located on-premises or in a cloud.

SinaraML Customizable Infra - way to customize orchestration of SinaraML Server, SinaraML Storage and SinaraML Spark for integration with your infrastructure (Git, Docker repos, authentication and authorization methods including Active Directory).

SinaraML Customizable Development Process - way to configure SinaraML Template and SinaraML Step to fit your development process.

SinaraML Examples - library of ready-to-use, configurable ML pipelines that can be customized for your needs.

SinaraML Tutorials - step-by-step instructions on how to set up and use SinaraML. Includes this Glossary and examples of typical pipelines.

SinaraML Book - in-depth guide to DSMLOps.

Run modes - interactive run or job run.

Dataflow - paradigm based on representing computations as a directed graph, where nodes are computations and data flows along the edges. More broadly, dataflow is the movement of data through a system composed of software, hardware, or a combination of both.

Control Flow - order in which the individual steps of a pipeline are executed.
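
The relationship between Dataflow and Control Flow can be illustrated with a minimal Python sketch. It does not use SinaraML Lib: the step names and the `execution_order` helper are hypothetical and chosen only for illustration. The dictionary describes the dataflow (which step consumes the output of which other steps), and a topological sort from the standard library's `graphlib` module derives one valid control flow from it.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline (not taken from SinaraML): each key is a step,
# each value is the set of steps whose outputs it consumes.
# These dependencies are the edges of the dataflow graph.
dataflow = {
    "data_load": set(),
    "data_prep": {"data_load"},
    "model_train": {"data_prep"},
    "model_eval": {"model_train", "data_prep"},
}


def execution_order(graph):
    """Return one valid control flow (execution order) for a dataflow graph."""
    # static_order() yields every step only after all steps it depends on.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dataflow)
print(order)
```

In this example each step depends on the one before it, so only one execution order is possible. In a graph with independent branches, several orders would be valid, which is why the dataflow alone does not always fix the control flow.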