Repository for DCA0305, an undergraduate course about Machine Learning Workflows and Pipelines

ivanovitchm/mlops

Federal University of Rio Grande do Norte

Technology Center

Department of Computer Engineering and Automation

Machine Learning Based Systems Design

Reference | Link
📚 Noah Gift, Alfredo Deza | Practical MLOps
📚 Chip Huyen | Designing ML Systems
📚 Jason Brownlee | Deep Learning for NLP
💣 ChatGPT | OpenAI Chat
😃 CS329S - ML Systems Design | Stanford's MLOps course
🎯 Machine Learning Operations | MLOps Community

Lessons

Week 01

  • Open in PDF Course Outline
  • 🎯 Week Goals
    • Your main goals for this week are to create a personal repository for tracking your progress and coursework, gain access to the GitHub Education Pro pack, learn how to start coding instantly with GitHub Codespaces, complete the GitHub Learning Game, and read the complementary material.
  • 🎉 GitHub Education Benefits
    • GitHub Education Pro: Get access to the GitHub Education Pro pack by visiting GitHub Education
  • 🚀 Instant Coding with Codespaces
  • 📖 Learning Resources
    • GitHub Learning Game: Check out the interactive Git learning game at GitHub Learning Game
    • Michael A. Lones. How to avoid machine learning pitfalls: a guide for academic researchers Arxiv
  • Thread.run(study_mlops) 💻🚀

Week 02

  • Open in PDF Command Line Interface Fundamentals
  • 🎯 Week Goals
    • Get ready to unlock your inner Data Science Ninja! 🥷 This week is all about getting hands-on with the command line. Why? Because it's like the Swiss Army knife for any data scientist: versatile, indispensable, and oh-so-powerful!

Week 03

  • Open in PDF Clean Code Principles for Data Science and Machine Learning
  • 🎯 Week Goals
    • This week is all about mastering the art of writing clean and efficient code. As a future data scientist or machine learning engineer, writing code that is both understandable and maintainable is crucial. We'll dig into principles like DRY and KISS, practice refactoring, and see how they can be applied to data science and machine learning projects.
  • 🤲 Hands-On Activities
    • Jupyter Topic Name: Explore the practical aspects of the concepts discussed this week. Learn through coding exercises and real-world examples.
    • Open in Dataquest: Functions: fundamentals and intermediate
      • 👊 Skills You'll Gain: You will learn how to a) define and create functions and pipelines, b) debug functions, c) define default arguments, d) use multiple return statements, e) return multiple variables, f) work with variable scopes, and more.
      • ⏳ Estimated time: 6h
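A minimal sketch of the function skills listed above (default arguments, multiple return statements, returning several values, and variable scope). The function names and data are hypothetical and are not taken from the course material.

```python
def summarize(values, precision=2):
    """Return (mean, spread) of a list, rounded to `precision` decimals."""
    if not values:  # first return statement: early exit on empty input
        return None, None
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    # second return statement: two values packed into a tuple
    return round(mean, precision), round(spread, precision)


def make_scaler(factor):
    """Return a function that multiplies its input by `factor`."""
    def scale(x):
        return x * factor  # `factor` is looked up in the enclosing scope
    return scale


mean, spread = summarize([1.0, 2.5, 4.0])  # tuple unpacking of both results
double = make_scaler(2)
```

Here `precision` falls back to its default when omitted, and `scale` can still read `factor` after `make_scaler` has returned, which is the enclosing-scope rule at work.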

Week 04

Week 05

  • Open in PDF Handling Errors, Writing Tests and Logs
  • 🎯 Week Goals
    • This week, we dive into three essential pillars of reliable machine learning systems: error handling, testing, and logging. These are critical skills for building robust, maintainable ML pipelines and applications.
  • 🤲 Hands-On Activities
    • Jupyter Practical Error Handling, Testing, and Logging: Explore these concepts through coding exercises and real-world examples.
  • 📖 Learning Resources
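The three pillars named in the week goals can be combined in a few lines. The sketch below is only an illustration using the standard library; `safe_ratio` and `test_safe_ratio` are hypothetical names, and the course notebook may use different tools and conventions.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")


def safe_ratio(numerator, denominator):
    """Divide two numbers, logging the outcome and raising ValueError on zero."""
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        # Error handling: log the problem, then raise a clearer exception
        logger.error("denominator is zero (numerator=%s)", numerator)
        raise ValueError("denominator must be non-zero") from None
    logger.info("ratio computed: %s", result)
    return result


def test_safe_ratio():
    """Test written with plain asserts: one normal case, one expected failure."""
    assert safe_ratio(1, 4) == 0.25
    try:
        safe_ratio(1, 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

A test runner such as pytest would collect `test_safe_ratio` by its name prefix; calling the function directly works as well.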

Week 06

  • Open in PDF Machine Learning Fundamentals
    • What is Machine Learning (ML)? Open in Loom
    • ML types Open in Loom
    • Main challenges of ML
      • Variables, pipeline, and controlling chaos Open in Loom
      • Train, dev and test sets Open in Loom
      • Bias vs Variance Open in Loom
    • Evaluation metrics
      • How to choose an evaluation metric? Open in Loom
      • Threshold metrics Open in Loom
      • Ranking metrics Open in Loom
  • Open in PDF Essential Guide for NLP
    • Pre-processing and cleaning
    • Text representation
    • Jupyter Preparing Text Data: manual tokenization, NLTK, scikit-learn, Keras, and so on.
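For the threshold metrics listed under Evaluation metrics above, a small worked sketch may help. It assumes binary labels and predicted scores with a cut-off of 0.5; the function names and the sample data are hypothetical, and this is not the code used in the lessons.

```python
def confusion_counts(y_true, y_score, threshold=0.5):
    """Count TP, FP, FN, TN after thresholding the predicted scores."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1; an undefined ratio is reported as 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


y_true = [1, 0, 1, 1, 0]
y_score = [0.9, 0.6, 0.4, 0.8, 0.1]
tp, fp, fn, tn = confusion_counts(y_true, y_score)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

Moving `threshold` up or down trades false positives against false negatives, which is why these metrics depend on the chosen cut-off.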

Week 07

  • Open in PDF Steps to Process Film Review Data for Sentiment Analysis
    • Jupyter Hands on Extract, Transform and Load (ETL): opening and reading files, cleaning the content, compiling a preliminary vocabulary, processing multiple files, refining the vocabulary, saving the final vocabulary.
  • Deep Learning Fundamentals
    • The perceptron Open in Loom
    • Building Neural Networks Open in Loom
    • Matrix Dimension Open in Loom
    • Applying Neural Networks Open in Loom
    • Training a Neural Network Open in Loom
    • Backpropagation with Pencil & Paper Open in Loom
    • Learning rate & Batch Size Open in Loom
    • Exponentially Weighted Average Open in Loom
    • Adam, Momentum, RMSProp, Learning Rate Decay Open in Loom
    • Hands on DL fundamentals Open in Dataquest
      • You'll learn how to: a) understand how neural networks are represented; b) understand how adding hidden layers can improve model performance; c) understand how neural networks capture nonlinearity in the data.
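Returning to the ETL steps listed for the film review data at the start of this week (read files, clean the content, compile a vocabulary over multiple files, refine it, save it), the flow can be sketched with the standard library alone. All function names, the `.txt` file layout, and the `min_count` cut-off are assumptions made for illustration; the course notebook may clean and filter differently.

```python
import string
from collections import Counter
from pathlib import Path


def clean_doc(text):
    """Split a review into lowercase alphabetic tokens longer than one character."""
    table = str.maketrans("", "", string.punctuation)
    tokens = [word.translate(table).lower() for word in text.split()]
    return [tok for tok in tokens if tok.isalpha() and len(tok) > 1]


def build_vocab(directory):
    """Count tokens across every .txt file found in `directory`."""
    vocab = Counter()
    for path in sorted(Path(directory).glob("*.txt")):
        vocab.update(clean_doc(path.read_text(encoding="utf-8")))
    return vocab


def refine_vocab(vocab, min_count=2):
    """Keep only tokens that occur at least `min_count` times, sorted by name."""
    return sorted(tok for tok, count in vocab.items() if count >= min_count)


def save_vocab(tokens, filename):
    """Write the final vocabulary to disk, one token per line."""
    Path(filename).write_text("\n".join(tokens), encoding="utf-8")
```

A typical call chain would be `save_vocab(refine_vocab(build_vocab(some_directory)), some_file)`, where both arguments are paths you choose.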

Week 08

Week 09

  • Open in PDF An MLOps case study using Weights and Biases

    • Jupyter Fetch Data.
    • Jupyter EDA.
    • Jupyter Preprocessing Data.
    • Jupyter Data Check.
    • Jupyter Data Segregation.
    • Jupyter Vocabulary creation.
    • Jupyter Train.
  • Sequence models for Deep Learning Open in Dataquest

    • You'll learn how to: a) describe sequential neural network models; b) determine when to use RNN, GRU, and LSTM; c) implement a sequential model using a basic RNN.

Week 10

  • Natural Language Processing for Deep Learning Open in Dataquest
    • You'll learn how to: a) process and explore text data, b) visualize text data using a word cloud, c) implement tokenization and word embeddings, d) build sequence models, and e) build a transformer-based text classification model.
  • Hands-on project: the goal is to use Weights and Biases and Directed Acyclic Graphs (DAGs) to build a pipeline for an NLP project.
    • Building a Data Pipeline Open in Dataquest
    • You'll learn how to: a) write a robust pipeline with a scheduler in Python, b) use advanced Python concepts like closures, decorators, and more.
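To make the DAG, closure, and decorator ideas above concrete, here is a minimal sketch of a task pipeline built on the standard library's `graphlib` (Python 3.9+). The `Pipeline` class, its `task` decorator, and the three example tasks are hypothetical; the sketch does not use Weights and Biases, has no scheduler, and is not the implementation from the Dataquest lesson.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Pipeline:
    """Register tasks with dependencies and run them in topological order."""

    def __init__(self):
        self.tasks = {}    # task name -> function
        self.depends = {}  # task name -> names of the tasks it depends on

    def task(self, depends_on=()):
        """Decorator factory; the inner function closes over `depends_on`."""
        def register(func):
            self.tasks[func.__name__] = func
            self.depends[func.__name__] = tuple(depends_on)
            return func
        return register

    def run(self):
        """Run every task once, passing in the outputs of its dependencies."""
        results = {}
        for name in TopologicalSorter(self.depends).static_order():
            inputs = [results[dep] for dep in self.depends[name]]
            results[name] = self.tasks[name](*inputs)
        return results


pipeline = Pipeline()


@pipeline.task()
def fetch():
    return ["Good film", "Bad film"]


@pipeline.task(depends_on=["fetch"])
def preprocess(reviews):
    return [review.lower().split() for review in reviews]


@pipeline.task(depends_on=["preprocess"])
def vocabulary(token_lists):
    return sorted({tok for tokens in token_lists for tok in tokens})


results = pipeline.run()
```

Because the dependency map forms a directed acyclic graph, `static_order()` always yields a task after the tasks it depends on, and it raises `graphlib.CycleError` if a cycle is introduced.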
