CryptoDB
Michael P. Kim
Publications
Year
Venue
Title
2024
CRYPTO
Certifying Private Probabilistic Mechanisms
Abstract
In past years,
entire research communities have arisen to address concerns of privacy and fairness in data analysis.
At present, however, the public must trust that institutions will re-implement algorithms voluntarily to account for these social concerns.
Due to additional cost, widespread adoption is unlikely without effective legal enforcement.
A technical challenge for enforcement is that the methods proposed are often \emph{probabilistic mechanisms}, whose output must be drawn according to precise, and sometimes secret, distributions.
The Differential Privacy (DP) case is illustrative: if a cheating curator answers queries according to an overly-accurate mechanism, privacy violations could go undetected.
This raises our central question:
\textit{Can we efficiently certify the output of a probabilistic mechanism enacted by an untrusted party?}
To this end:
\begin{enumerate}
\item
We introduce two new notions: {\it Certified Probabilistic Mechanisms} (CPM) and \emph{Random Variable Commitment Schemes} (RVCS).
A CPM is an interactive protocol that forces a prover to enact a given probabilistic mechanism or be caught; importantly, the interaction does not reveal the mechanism's secret parameters.
An RVCS---a key primitive for constructing CPMs---is a commitment scheme where the verifier is convinced that the commitment is to an RV sampled according to an agreed-upon distribution, but learns nothing else.
\item
We instantiate the general notion of CPM for the special case of Certifying DP.
We build a lightweight, doubly-efficent interactive proof system to certify arbitrary-predicate counting queries released via the DP Binomial mechanism.
The construction relies on a commitment scheme with perfect hiding and additive homomorphic properties that can be used to release a broad class of queries about a committed database, constructed on top of Pedersen commitments.
\item
Finally, we demonstrate the immediate feasibility of Certified DP via a highly-efficient and scalable
prototype
implementation to answer counting queries of arbitrary predicates.
The mechanism is composed of an offline and online stage, where
the online phase allows for non-interactive certification of queries. For example, we show that CDP queries over a US Census Public Use Microdata Sample (PUMS)~\cite{census-pums} ($n=7000$) can be completed in only 1.6~ms and verified in just \emph{38~$\mu$s}. Our implementation is available in open source at \url{https://github.com/jlwatson/certified-dp}.
Coauthors
- Zoë Ruha Bell (1)
- Shafi Goldwasser (1)
- Michael P. Kim (1)
- Jean-Luc Watson (1)