The Third AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-22)
February 28, 2022
Virtual Venue Room: Blue 2
The availability of massive amounts of data, coupled with high-performance cloud computing
platforms, has driven significant progress in artificial intelligence and, in particular,
machine learning and optimization. It has profoundly impacted several areas, including computer
vision, natural language processing, and transportation. However, the use of rich data sets
also raises significant privacy concerns: such data often reveal sensitive personal information
that can be exploited, without the knowledge or consent of the individuals involved, for
various purposes, including monitoring, discrimination, and illegal activities.
The third AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-22) held at the
Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)
builds on the success of the previous editions,
AAAI PPAI-20 and
AAAI PPAI-21,
to provide a platform for researchers, AI practitioners, and policymakers to discuss technical
and societal issues and present solutions related to privacy in AI applications.
The workshop will focus on both the theoretical and practical challenges related to the design
of privacy-preserving AI systems and algorithms and will have strong multidisciplinary
components, including soliciting contributions about policy, legal issues, and societal
impact of privacy in AI.
Finally, the workshop will welcome papers that describe the release of privacy-preserving benchmarks and data sets that can be used by the community to solve fundamental problems of interest, including in machine learning and optimization for health systems and urban networks, to mention but a few examples.
The workshop will be a one-day meeting. It will include a number of technical sessions; a poster session where presenters can discuss their work, with the aim of further fostering collaborations; multiple invited talks covering crucial challenges for the field of privacy-preserving AI applications, including policy and societal impacts; and several tutorial talks. The workshop will conclude with a panel discussion.
Submission URL: https://cmt3.research.microsoft.com/PPAI2022
Rejected NeurIPS/AAAI papers with *average* scores of at least 4.5 may be submitted directly to PPAI along with their previous reviews. These submissions may go through a light review process or be accepted directly if the provided reviews are judged to meet the workshop standard.
All papers must be submitted in PDF format, using the AAAI-22 author kit.
Submissions should include the name(s), affiliations, and email addresses of all authors.
Submissions will be refereed on the basis of technical quality, novelty, significance, and
clarity. Each submission will be thoroughly reviewed by at least two program committee members.
Submissions of papers rejected from the AAAI 2022 technical program are welcomed.
For questions about the submission process, contact the workshop chairs.
Registration in each workshop is required for all active participants and is also open to all interested individuals. The early registration deadline is December 31. For more information, please refer to the AAAI-22 Workshop page.
Time | Talk / Presenter
---|---
09:50 | Introductory remarks
10:00 | Invited Talk: "A bottom-up approach to making differential privacy ubiquitous" by Damien Desfontaines
 | Session chair: Ferdinando Fioretto
10:45 | Spotlight Talk: Differential Privacy and Robust Statistics in High Dimensions
11:00 | Spotlight Talk: Element Level Differential Privacy: The Right Granularity of Privacy
11:15 | Break
 | Session chair: Fatemeh Mireshghallah
11:30 | Spotlight Talk: A Fairness Analysis on Private Aggregation of Teacher Ensembles
11:45 | Spotlight Talk: Benchmarking Differentially Private Synthetic Data Generation Algorithms
12:00 | Tutorial: "Differentially Private Deep Learning: Theory, Attacks, and PyTorch Implementation" by Ilya Mironov, Alexandre Sablayrolles, and Igor Shilov
13:45 | Flash Poster Presentations
14:00 | Poster Session (in VirtualChair Room Blue 2)
15:00 | Invited Talk: "When is Memorization Necessary for Machine Learning?" by Adam Smith
 | Session chair: Xi He
15:45 | Spotlight Talk: Calibration with Privacy in Peer Review: A Theoretical Study
16:00 | Spotlight Talk: APRIL: Finding the Achilles’ Heel on Privacy for Vision Transformers
16:15 | Break
16:30 | Invited Talk: "Personal Privacy and the Public Good" by Claire McKay Bowen
17:30 | Panel Discussion: Differential Privacy and its Disparate Impacts
18:20 | Concluding Remarks and Poster Session (in VirtualChair Room Blue 2)
Tutorial: "Differentially Private Deep Learning: Theory, Attacks, and PyTorch Implementation" by Ilya Mironov, Alexandre Sablayrolles, and Igor Shilov
Abstract:
The tutorial is designed as a gentle introduction to the topic of differentially private deep learning. In the first part of the tutorial we cover relevant topics in the theory of differential privacy, such as composition theorems and privacy-preserving mechanisms. In the second part, we discuss Opacus, a PyTorch-based library for performant and user-friendly differentially private training, and Privacy Linter, a library for identifying privacy violations.
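As a small, self-contained illustration of the privacy-preserving mechanisms covered in the first part, the sketch below implements the classic Laplace mechanism for a counting query. The function name and the toy data are illustrative placeholders, not taken from the tutorial materials:

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value with Laplace noise of scale sensitivity/epsilon.

    Satisfies epsilon-differential privacy for a query whose output changes
    by at most `sensitivity` when one individual's record is added or removed.
    """
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale)

# Example: a counting query (sensitivity 1) released with epsilon = 0.5.
records = [1, 0, 1, 1, 0, 1]  # illustrative 0/1 attribute, one entry per person
noisy_count = laplace_mechanism(sum(records), sensitivity=1.0, epsilon=0.5)
print(f"noisy count: {noisy_count:.2f}")
```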
We begin by motivating privacy-preserving deep learning with several examples where industry-grade models demonstrably leak training data. We briefly review the notion of differential privacy (DP) as a remediation strategy, and learn how SGD-based optimization algorithms can be adapted to DP. We take a close look at several privacy accountants (upper bounds on privacy loss) and the complementary lower bounds.
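The adaptation mentioned above is the standard DP-SGD recipe: clip each per-example gradient to a fixed L2 norm, then add Gaussian noise calibrated to that clipping bound before the parameter update. The following is a minimal, framework-free sketch of a single such step, written for illustration only; the model, loss, and hyperparameter values are placeholders, and the tutorial itself relies on Opacus for this:

```python
import torch

def dp_sgd_step(model, loss_fn, batch_x, batch_y, lr=0.1,
                max_grad_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD update: per-example clipping plus Gaussian noise on the sum."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    # Compute per-example gradients, each clipped to L2 norm <= max_grad_norm.
    for x, y in zip(batch_x, batch_y):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        clip = (max_grad_norm / (total_norm + 1e-6)).clamp(max=1.0)
        for s, g in zip(summed, grads):
            s += g * clip

    # Add noise calibrated to the clipping bound, then average and step.
    batch_size = len(batch_x)
    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.randn_like(s) * noise_multiplier * max_grad_norm
            p -= lr * (s + noise) / batch_size
```

Clipping bounds each example's influence on the update (its sensitivity), which is what allows the Gaussian noise to be calibrated and the cumulative privacy loss to be tracked by an accountant.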
In the second part of the tutorial we present our PyTorch framework for DP-SGD, Opacus. Opacus is designed to be a simple and extensible framework that can easily be plugged into an existing machine learning pipeline. Opacus’ features include implementations of several privacy accountants, vectorized computation of per-sample gradients, and support for distributed computations. We conclude by presenting the Privacy Linter, a PyTorch framework for evaluating practical privacy attacks on machine learning models. The Linter implements a number of recently proposed attacks on trained models and provides support for more advanced strategies.
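For readers who want to experiment with the library described above, a minimal training setup (as of Opacus 1.x) typically looks like the sketch below; the toy model, data, and hyperparameter values are illustrative placeholders rather than part of the tutorial materials:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy model and data, purely illustrative.
model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
dataset = TensorDataset(torch.randn(512, 20), torch.randint(0, 2, (512,)))
data_loader = DataLoader(dataset, batch_size=64)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
criterion = nn.CrossEntropyLoss()

# Wrap the model, optimizer, and data loader for DP-SGD training.
privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.0,   # noise scale relative to the clipping bound
    max_grad_norm=1.0,      # per-sample gradient clipping bound
)

# Standard training loop; gradients are clipped and noised automatically.
for epoch in range(3):
    for x, y in data_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

# Privacy accountant: epsilon spent so far at the chosen delta.
print("epsilon:", privacy_engine.get_epsilon(delta=1e-5))
```

After `make_private`, the wrapped optimizer performs per-sample clipping and noise addition transparently, and the attached accountant tracks the cumulative privacy budget that `get_epsilon` reports for a given delta.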
Invited Talk: "When is Memorization Necessary for Machine Learning?" by Adam Smith
Abstract:
Modern machine learning models are complex, and frequently encode surprising amounts of information about individual inputs. In extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (exact addresses from text, for example). In this talk, we aim to understand whether this sort of memorization is necessary for accurate learning, and what the implications are for privacy.
We describe two results that explore different aspects of this phenomenon. In the first, published at STOC 2021, we give natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples. This remains true even when the examples are high-dimensional and have entropy much higher than the sample size, and even when most of that information is ultimately irrelevant to the task at hand. Further, our results do not depend on the training algorithm or the class of models used for learning.
Our second, unpublished result shows how memorization must occur during the training process, even when the final model is succinct and depends only on the underlying distribution. This leads to new lower bounds on the memory size of one-pass streaming algorithms for fitting natural models.
Joint work with (subsets of) Gavin Brown, Mark Bun, Vitaly Feldman, and Kunal Talwar.
Invited Talk: "Personal Privacy and the Public Good" by Claire McKay Bowen
Abstract:
Both privacy experts and policymakers are at an impasse, trying to answer,
“At what point does the sacrifice to our personal information outweigh the public good?”
If public policymakers had access to society’s personal and confidential data, they could make more evidence-based, data-informed decisions that could accelerate economic recovery and improve COVID-19 vaccine distribution. Although privacy researchers strive to balance the need for data privacy and accuracy, access to personal data comes at a steep privacy cost for contributors, especially underrepresented groups. As a result, many federal statistical agencies in the United States either never produce public data or restrict access to a select few external researchers. Further, most public data users and policymakers are not familiar with data privacy and confidentiality methods, yet they must make informed policy decisions on how to best balance the need for confidential data access and privacy protection.
This talk will cover several projects conducted at the intersection of expanding access to confidential data and public policy, the lessons learned when working with data users and non-privacy researchers, and the inequity of data privacy and confidentiality methodologies for underrepresented groups.
Invited Talk: "A bottom-up approach to making differential privacy ubiquitous" by Damien Desfontaines
Abstract:
Among computer science researchers, differential privacy has been the
gold standard for anonymization for over a decade. The real world is
starting to catch up, albeit slowly. There is a need for strong
anonymization techniques in small and large organizations alike… but
it seems like only large organizations end up deploying DP for
practical use cases, and only because they can afford to invest in
specialized science and engineering teams to help them. How can we
bridge this gap, and drive the widespread adoption of differential
privacy?
This talk will outline a bottom-up approach to bring differential
privacy to a much more widespread audience. From initial outreach
efforts all the way to production deployments, I will describe what a
compelling solution could look like, and what role the scientific
community can play in these efforts.