forked from harvard-edge/cs249r_book
-
Notifications
You must be signed in to change notification settings - Fork 0
/
workflow.qmd
90 lines (61 loc) · 8.48 KB
/
workflow.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
# AI Workflow
![_DALL·E 3 Prompt: Create a rectangular illustration of a stylized flowchart representing the AI workflow/pipeline. From left to right, depict the stages as follows: ‘Data Collection’ with a database icon, ‘Data Preprocessing’ with a filter icon, ‘Model Design’ with a brain icon, ‘Training’ with a weight icon, ‘Evaluation’ with a checkmark, and ‘Deployment’ with a rocket. Connect each stage with arrows to guide the viewer horizontally through the AI processes, emphasizing the sequential and interconnected nature of these steps._](./images/cover_ai_workflow.png)
In this chapter, we'll explore the machine learning (ML) workflow, setting the stage for subsequent chapters that delve into the specifics. To ensure we don't lose sight of the bigger picture, this chapter offers a high-level overview of the steps involved in the ML workflow.
The ML workflow is a structured approach that guides professionals and researchers through the process of developing, deploying, and maintaining ML models. This workflow is generally divided into several crucial stages, each contributing to the effective development of intelligent systems.
::: {.callout-tip}
## Learning Objectives
* Understand the ML workflow and gain insights into the structured approach and stages involved in developing, deploying, and maintaining machine learning models.
* Learn about the unique challenges and distinctions between workflows for Traditional machine learning and embedded AI.
* Appreciate the various roles involved in ML projects and understand their respective responsibilities and significance.
* Understanding the importance, applications, and the considerations for implementing ML models in resource-constrained environments.
* Gain awareness about the ethical and legal aspects that need to be considered and adhered to in ML and embedded AI projects.
* Establish a basic understanding of ML workflows and roles to be well-prepared for deeper exploration in the following chapters.
:::
## Overview
![Multi-step design methodology for the development of a machine learning model. Commonly referred to as the machine learning lifecycle](./images/ML_life_cycle.png)
Developing a successful machine learning model requires a systematic workflow. This end-to-end process enables you to build, deploy and maintain models effectively. It typically involves the following key steps:
1. **Problem Definition** - Start by clearly articulating the specific problem you want to solve. This focuses your efforts during data collection and model building.
2. **Data Collection to Preparation** - Gather relevant, high-quality training data that captures all aspects of the problem. Clean and preprocess the data to get it ready for modeling.
3. **Model Selection and Training** - Choose a machine learning algorithm suited to your problem type and data. Consider pros and cons of different approaches. Feed the prepared data into the model to train it. Training time varies based on data size and model complexity.
4. **Model Evaluation** - Test the trained model on new unseen data to measure its predictive accuracy. Identify any limitations.
6. **Model Deployment** - Integrate the validated model into applications or systems to start operationalization.
7. **Monitor and Maintain** - Track model performance in production. Retrain periodically on new data to keep it current.
Following this structured **ML workflow** helps guide you through the key phases of development. It ensures you build effective and robust models that are ready for real-world deployment. The end result is higher quality models that solve your business needs.
The ML workflow is iterative, requiring ongoing monitoring and potential adjustments. Additional considerations include:
* **Version Control**: Keep track of code and data changes to reproduce results and revert to earlier versions if needed.
* **Documentation**: Maintain detailed documentation to allow for workflow understanding and reproduction.
* **Testing**: Rigorously test the workflow to ensure its functionality.
* **Security**: Safeguard your workflow and data, particularly when deploying models in production settings.
## Traditional vs. Embedded AI
The ML workflow serves as a universal guide, applicable across various platforms including cloud-based solutions, edge computing, and tinyML. However, the workflow for Embedded AI introduces unique complexities and challenges, which not only make it a captivating domain but also pave the way for remarkable innovations.
### Resource Optimization
- **Traditional ML Workflow**: Prioritizes model accuracy and performance, often leveraging abundant computational resources in cloud or data center environments.
- **Embedded AI Workflow**: Requires careful planning to optimize model size and computational demands, given the resource constraints of embedded systems. Techniques like model quantization and pruning are crucial.
### Real-time Processing
- **Traditional ML Workflow**: Less emphasis on real-time processing, often relying on batch data processing.
- **Embedded AI Workflow**: Prioritizes real-time data processing, making low latency and quick execution essential, especially in applications like autonomous vehicles and industrial automation.
### Data Management and Privacy
- **Traditional ML Workflow**: Processes data in centralized locations, often necessitating extensive data transfer and focusing on data security during transit and storage.
- **Embedded AI Workflow**: Leverages edge computing to process data closer to its source, reducing data transmission and enhancing privacy through data localization.
### Hardware-Software Integration
- **Traditional ML Workflow**: Typically operates on general-purpose hardware, with software development occurring somewhat independently.
- **Embedded AI Workflow**: Involves a more integrated approach to hardware and software development, often incorporating custom chips or hardware accelerators to achieve optimal performance.
## Roles & Responsibilities
Creating an ML solution, especially for embedded AI, is a multidisciplinary effort involving various specialists.
Here's a rundown of the typical roles involved:
| Role | Responsibilities |
|--------------------------------|----------------------------------------------------------------------------------------------------|
| Project Manager | Oversees the project, ensuring timelines and milestones are met. |
| Domain Experts | Offer domain-specific insights to define project requirements. |
| Data Scientists | Specialize in data analysis and model development. |
| Machine Learning Engineers | Focus on model development and deployment. |
| Data Engineers | Manage data pipelines. |
| Embedded Systems Engineers | Integrate ML models into embedded systems. |
| Software Developers | Develop software components for AI system integration. |
| Hardware Engineers | Design and optimize hardware for the embedded AI system. |
| UI/UX Designers | Focus on user-centric design. |
| QA Engineers | Ensure the system meets quality standards. |
| Ethicists and Legal Advisors | Consult on ethical and legal compliance. |
| Operations and Maintenance Personnel | Monitor and maintain the deployed system. |
| Security Specialists | Ensure system security. |
Understanding these roles is crucial for the successful completion of an ML project. As we proceed through the upcoming chapters, we'll delve into each role's essence and expertise, fostering a comprehensive understanding of the complexities involved in embedded AI projects. This holistic view not only facilitates seamless collaboration but also nurtures an environment ripe for innovation and breakthroughs.