
ECE 6514 - Trustworthy Machine Learning

Course Description

Fundamentals of trustworthy machine learning. Overview of modern machine learning techniques and the associated security, privacy, and data quality issues; adversarial machine learning (e.g., decision-time attacks and defenses, data poisoning attacks and defenses, robustness certification); privacy-preserving machine learning (e.g., membership inference attacks, model inversion attacks, differential privacy, federated learning); and strategic data collection and utilization (e.g., data valuation, data selection).

Why take this course?

The deployment of machine learning in real-world applications calls for a set of complementary techniques that ensure machine learning is trustworthy. Trust has a broad meaning in this context: it includes securing machine learning pipelines in adversarial environments, preserving privacy when models are trained on sensitive and proprietary datasets, and understanding how data affects model behavior. Trustworthy machine learning has advanced rapidly in recent years, but these advances are largely scattered across research papers and are not covered by textbooks or by existing courses at Virginia Tech. A new course is needed to give students a systematic understanding of state-of-the-art techniques for building trustworthy learning pipelines. This course covers the fundamentals of trustworthy machine learning. It begins with an overview of the security, privacy, and data quality issues in machine learning and the connections among them, and then covers recent techniques for addressing each issue.

Learning Objectives

  • Analyze security and privacy vulnerabilities of machine learning models.
  • Apply techniques to enhance model robustness and understand their limitations.
  • Understand and describe the principles of differential privacy.
  • Apply federated learning to train models on distributed data sources.
  • Compare techniques for data valuation.
  • Design and assess techniques for data selection.