Spring 2020-2023
Duke MIDS
This course is designed to give you a comprehensive view of cloud computing including Big Data and Machine Learning. A variety of learning resources will be used including interactive labs on Cloud Platforms (Google, AWS, Azure). This is a project-based course with extensive hands-on assignments.
Upon successful completion of this course, you will be able to:
- Summarize the fundamentals of cloud computing
- Evaluate the economics of cloud computing
- Accurately evaluate distributed computing challenges and opportunities and apply this knowledge to real-world projects.
- Develop non-linear life-long learning skills
- Build, share, and present compelling portfolios using GitHub, Hugging Face, YouTube, and LinkedIn.
- Develop metacognition skills (by teaching, we learn)
Cloud Computing Foundations
- Overview of Cloud Computing
- Cloud Adoption Framework(s)
- Economics of Cloud Computing
- Types of Cloud Services: SaaS, PaaS, IaaS, MaaS, Serverless
- IaC (Infrastructure as Code) w/ Terraform
- Continuous Delivery
Virtualization & Containerization
- CPU, Memory, I/O
- SDN (Software Defined Networks)
- SDS (Software Defined Storage)
- Containers: Docker, Kubernetes, EKS (Elastic Kubernetes Service), Google Kubernetes Engine, Container Registries
Challenges and Opportunities in Distributed Computing
- CAP Theorem
- Eventual Consistency
- Amdahl's law
- End of Moore’s Law
- ASICs and other specialized hardware: GPUs, TPUs, FPGAs
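
As a quick illustration of the Amdahl's law topic above, here is a minimal sketch of the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of work that can run in parallel and n is the number of workers. The function name `amdahl_speedup` and the 95% figure are illustrative assumptions, not course material.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup when `parallel_fraction` of the work is spread over `workers`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # The serial part cannot be sped up, no matter how many workers are added.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the job parallelizes.
    for workers in (1, 8, 64, 1024):
        print(f"{workers:>5} workers -> {amdahl_speedup(0.95, workers):.2f}x")
```

With a 5% serial portion, the speedup approaches but never reaches 20x, which is one reason that adding more cloud instances eventually stops paying off.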
Cloud Storage
- Cloud Databases: HBase, MongoDB, Cassandra, DynamoDB, Google BigQuery
- Cloud Object Storage: Amazon S3, GCP Cloud Storage, Amazon Glacier, Data Lakes, OpenStack Swift
- Distributed File Systems: Red Hat Ceph, Amazon EFS (Elastic File System), HDFS
Serverless
- AWS Cloud9 Development Environment
- FaaS (Function as a Service): AWS Lambda, GCP Cloud Functions, Azure Functions
- Cloud-Native Primitives: AWS Step Functions, AWS SQS, AWS SNS, AWS Cognito, AWS API Gateway
- Google Cloud Shell Development Environment
- Google App Engine
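
To make the FaaS topic above concrete, here is a minimal sketch of a Python AWS Lambda handler of the kind that could sit behind API Gateway. It assumes the API Gateway proxy integration event shape (a `queryStringParameters` key); the greeting logic and the `name` parameter are hypothetical and only for illustration.

```python
import json


def lambda_handler(event, context):
    """Return a JSON greeting; `event` is assumed to be an API Gateway proxy event."""
    # The key may be absent or None when no query string is sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # hypothetical parameter
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local smoke test; no AWS account is needed to call the function directly.
    print(lambda_handler({"queryStringParameters": {"name": "Duke"}}, None))
```

Because the handler is a plain function, it can be exercised locally before being deployed from an environment such as Cloud9.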
Big Data Platforms
- Batch Processing: EMR/Hadoop, AWS Batch
- ETL (Extract Transform Load): AWS Glue, AWS Athena
- Stream Processing: EMR/Spark, AWS Kinesis, Kafka
Managed Machine Learning Systems and Platforms
- AWS SageMaker
- GCP AI Platform
- Azure ML Studio
Edge Computing
- IoT: AWS Greengrass, Raspberry Pi
- Edge Machine Learning: TensorFlow Lite, Intel Movidius, Apple X12
The purpose of the async discussion forum is to facilitate a free exchange of ideas. Remain respectful of others' ideas. Active, relevant, and timely discussion is encouraged. Please refrain from simple replies such as "I agree". Use the Critical Thinking framework described in the preface of the O'Reilly book Practical MLOps.
The requirement each week is both to create a post according to the assignment and to comment in a meaningful way on posts by two other students.
Students can choose either to answer a discussion question each week or to create a pull request for a ticket on the MLOps Template project. The TAs will organize the queue of work; advanced students can also create a ticket and then work on it.
- Each week you will give a 1-5 minute demo (hard cap of 5 minutes). This trains your metacognitive abilities.
- You have the option of doing a demo on work from GitHub contributions related to class or class projects.
This course will make use of several free resources that allow students to use real cloud environments on AWS, Google, and Azure. Please set up accounts as follows:
- AWS: Create an account on AWS Educate using your school email account: https://aws.amazon.com/education/awseducate/. This is where “free” sandboxed AWS environments will launch. (Note: you are also encouraged to sign up for a “free tier” AWS account: https://aws.amazon.com/free/)
- GCP: Create an account on Qwiklabs using your school email account: https://www.qwiklabs.com/
- Azure: Create an account on Azure for students using your school email account: https://azure.microsoft.com/en-us/free/students/.
Required Readings and Media
- Berkeley View of Cloud Computing
- Google Cloud Adoption Framework (Read Whitepaper)
- AWS Cloud Adoption Framework (optional)
- The Economics of the Cloud-Microsoft
- Introduction to AWS Economics
- Gartner AI Hype Cycle
- Python for DevOps-Book
- Python for Programmers-Book
- Data Engineering with Python and AWS Lambda-Video
- Duke+Coursera: Cloud Computing for Data Coursera Course
- Gift, N (2021) Practical MLOps, Sebastopol, CA: O'Reilly
- Gift, N (2021) Cloud Computing for Data Analysis
- Gift, N (2020) Pragmatic AI: An Introduction to Cloud-Based Machine Learning
- Gift, N (2022) Developing on AWS with CSharp
- Coursera-DE-C2-Lab1-Linux
- Coursera-DE-C2-Lab2-Using-Bash
- Coursera-DE-C2-Lab3-Building-Bash-Scripts
- Coursera-DE-C2-Lab4-Composing-File-Data-Solutions
- AWS Training
- AWS Educate
- AWS Academy
- Google Qwiklabs
- Microsoft Learn
- Python in One Hour
- Know Thyself: The Science of Self-Awareness
- DataCamp - CLI Automation Python
- AWS Training & Certification
- Google Qwiklabs - Hands-On Cloud Training
- Coursera
- Google Cloud Platform Fundamentals: Core Infrastructure
- edX
- Applied Computer Vision with Python Lectures: https://learning.oreilly.com/videos/applied-computer-vision/60652VIDEOPAIMLL/
- Learn Python in One Hour: https://learning.oreilly.com/videos/learn-python-in/60645VIDEOPAIML/
- Cloud Computing with Python: https://learning.oreilly.com/videos/cloud-computing-with/60650VIDEOPAIML/
- Python for Data Science with Colab and Pandas in One Hour: https://learning.oreilly.com/videos/python-for-data/62062021VIDEOPAIML/
- GCP Cloud Functions: https://learning.oreilly.com/videos/learn-gcp-cloud/50101VIDEOPAIML/
- Azure AutoML: https://learning.oreilly.com/videos/learn-azure-ml/50104VIDEOPAIML/
- AWS Bootcamp
- Logic to Live
- AWS Lambda Python Cloud9 and Boto3 One Hour
- Learn AWS Cloudshell
- Using AWS Sagemaker
- Learn to build Data Pipelines
- Hello World IAC with AWS CDK
- GitHub Actions vs AWS CodeBuild for CI
- AWS Sagemaker Autopilot from Zero
- AWS Cloud Practitioner
- AWS ML
- AWS SA
- Building AI Applications with GCP: https://learning.oreilly.com/videos/building-ai-applications/9780135973462/
- Build GCP Cloud Functions: https://learning.oreilly.com/videos/learn-gcp-cloud/50101VIDEOPAIML/
- Google Cloud Functions for the Impatient
- Data Science, Pandas, and Colab: https://learning.oreilly.com/videos/python-for-data/62062021VIDEOPAIML/
- Python and DevOps: https://learning.oreilly.com/videos/python-devops-in/61272021VIDEOPAIML/
- Python Command-line Tools: https://learning.oreilly.com/videos/learn-python-command-line/50102VIDEOPAIML/
- Docker containers: https://learning.oreilly.com/videos/learn-docker-containers/50103VIDEOPAIML/
- Learn the Vim Text Editor: https://learning.oreilly.com/videos/learn-vim-in/50100VIDEOPAIML/