Introduction to ML Safety

ML systems are rapidly increasing in size, acquiring new capabilities, and increasingly being deployed in high-stakes settings. As with other powerful technologies, safety for ML should be a leading research priority. In this course we'll discuss how researchers can shape the process that will lead to strong AI systems and steer that process in a safer direction. We'll cover technical topics aimed at reducing existential risks (X-Risks) from strong AI, namely withstanding hazards ("Robustness"), identifying hazards ("Monitoring"), reducing inherent ML system hazards ("Control"), and reducing systemic hazards ("Systemic Safety"). At the end, we will zoom out to discuss additional, more abstract existential hazards and how to increase safety without unintended side effects. For the course content and assignments, refer to the schedule.

Prerequisites

This is a topics course in machine learning, so a solid background in machine learning and deep learning is necessary. If you don't have this background, we recommend Weeks 1-6 of MIT 6.036 followed by Lectures 1-13 of the University of Michigan's EECS498 or Weeks 1-6 and 11-12 of NYU's Deep Learning.

Syllabus

Safety Engineering: Risk Decomposition, A Systems View of Safety, Black Swans
Robustness: Adversaries, Long Tails
Monitoring: Anomalies, Interpretable Uncertainty, Transparency, Trojans, Emergent Behavior
Control: Honesty, Value Learning, Machine Ethics, Intrasystem Goals
Systemic Safety: ML for Improved Epistemics, ML for Improved Cyberdefense, Cooperative AI
Additional X-Risk Discussion: Future Scenarios, Selection Pressures, Avoiding Capabilities Externalities
