This page is subject to change before the start of the course.

Course Description

Data-driven models are increasingly used in many domains to assist human decision-making that significantly affects people’s lives, from hiring and promotion, college admissions, and judicial decisions to business and public service delivery. These decision aids have been made possible by both voluminous data and new data science tools that can exploit complex structures and patterns in data.

Learning Objectives

This course covers both concepts and practice to help students understand and cope with the ethical challenges in data science and data-driven decision-making. We will introduce (a) the core concepts of fairness and interpretability/explainability and (b) analytic and technical tools to mitigate emerging problems in the real world.

Concepts

Recognize where, and understand why, (un)fairness and ethical issues arise when applying data science to real-world problems
Learn how to conceptualize, measure, and mitigate bias in data-driven decision-making
Learn how to evaluate models and make data-driven decision-making more interpretable and explainable
Learn to think critically about data-driven decisions and policy questions, and evaluate a project with these concerns in mind

Practice

Develop fluency in the key technical, ethical, policy, and legal terms and concepts that are relevant to a normative assessment of data science
Learn common approaches and emerging tools for measuring, mitigating or managing these ethical concerns
Gain exposure to technical, legal and policy documents that help in understanding the current regulatory environment and anticipating future developments
Design and implement data science applications using real-world datasets, and systematically evaluate and justify the chosen approach to deal with the ethical concerns

Course Content

Topics to be covered:

Big data’s disparate impact
Decision-making by humans and machines
Decision-making by machines and big data
Sources of unfairness/biases
Formal notions and statistical measures of fairness
Fair ML and bias mitigation
Interpretability & explainability in AI
Ethics and privacy
Legal and policy perspectives, etc.

See the course schedule for weekly topics.
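To give a flavor of what a "statistical measure of fairness" can look like, here is a minimal Python sketch of one commonly used measure, the demographic parity difference (the gap in favorable-decision rates between groups). This is an illustration only: the measure chosen, the function names, and the toy data are assumptions made for this example, not necessarily what the course will cover or use.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def selection_rates(decisions: Sequence[int],
                    groups: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Share of favorable decisions (1 = favorable, 0 = not) within each group."""
    if len(decisions) != len(groups):
        raise ValueError("decisions and groups must have the same length")
    totals = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += decision
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(decisions: Sequence[int],
                                  groups: Sequence[Hashable]) -> float:
    """Largest gap in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())


# Made-up toy data: eight decisions for two hypothetical groups "a" and "b".
toy_decisions = [1, 1, 1, 0, 1, 0, 0, 0]
toy_groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
toy_gap = demographic_parity_difference(toy_decisions, toy_groups)
```

In the toy data, group "a" receives a favorable decision three times out of four and group "b" once out of four, so the gap is 0.5. Other formal notions of fairness condition on outcomes or scores and can disagree with this one.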

Computing

This course will use R and/or Python for computing. GitHub will be used for homework and project assignments, where tools such as Jupyter Notebooks or R Markdown will be used for creating reproducible data science documents.

Prerequisites

Students are expected to be familiar with the basics of Probability and Statistics and Data Mining/Machine Learning, and should be comfortable programming with DM/ML toolkits. Students should be willing to do interdisciplinary research and be comfortable learning concepts by reading technical, legal and policy documents.

Grading

Grades are based on the three major activities listed below. Assignments are due as scheduled, and the grade on late work will be reduced by 10% for each day it is late. See the assignment page for more details.

40% in-class participation and reading (including quizzes and reading reflection/discussion)
30% homework and midterm
30% final project (including several milestones)
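The arithmetic behind the weights and the late penalty can be sketched in Python as follows. This is an informal illustration, not the official grade calculation: the function names and the 0 to 100 scale are assumptions, and the sketch assumes each late day removes 10% of the earned score (floored at zero), which is only one possible reading of the policy. The assignment page is authoritative.

```python
from typing import Mapping

# Weights taken from the grading breakdown above.
WEIGHTS = {
    "participation": 0.40,
    "homework_midterm": 0.30,
    "final_project": 0.30,
}

LATE_PENALTY_PER_DAY = 0.10  # 10% per day late, as stated in the policy


def apply_late_penalty(score: float, days_late: int) -> float:
    """Assumption: each late day removes 10% of the earned score, never below 0."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    factor = max(0.0, 1.0 - LATE_PENALTY_PER_DAY * days_late)
    return score * factor


def course_grade(scores: Mapping[str, float]) -> float:
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())
```

For example, under these assumptions a homework scored at 90 and submitted two days late would count as about 72, and component scores of 100, 80, and 90 would combine to a course grade of about 91.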

Class Participation

Class participation will be assessed through online quizzes and discussions assigned each week.

Readings

This course will use online materials and academic readings, with reading assignments throughout the semester. Links to electronic copies of the readings will be provided. There are no textbooks.

University Policies

See the university policies page.

Syllabus (Fall 2024)