CryptoDB
Arkady Yerukhimovich
Publications
Year
Venue
Title
2025
CIC
On the Privacy of Sublinear-Communication Jaccard Index Estimation via Min-hash
Abstract
<p> The min-hash sketch is a well-known technique for low-communication approximation of the Jaccard index between two input sets. Moreover, there is a folklore belief that min-hash sketch-based protocols protect the privacy of the inputs. In this paper, we consider variants of private min-hash sketch based-protocols and investigate this folklore to quantify the privacy of the min-hash sketch.</p><p> We begin our investigation by presenting a highly-efficient two-party protocol for estimating the Jaccard index while ensuring differential privacy. This protocol adds Laplacian noise to the min-hash sketch counts to provide privacy protection.</p><p> Then, we aim to understand what privacy, if any, is guaranteed if the results of the min-hash are released without any additional noise, such as in the case of historical data. We begin our investigation by considering the privacy of min-hash in a centralized setting where the hash functions are chosen by the min-hash functionality and are unknown to the participants. We show that in this case the min-hash output satisfies the standard definition of differential privacy (DP) without any additional noise.</p><p> We next consider a more practical distributed setting, where the hash function must be shared among all parties and is typically public.</p><p> Unfortunately, we show that in this public hash function setting, the min-hash output is no longer DP. We therefore consider the notion of distributional differential privacy (DDP) introduced by Bassily et al. (FOCS 2013). We show that if the honest party's set has sufficiently high min-entropy, the min-hash output achieves DDP without requiring noise.</p><p> Our findings provide guidance on how to use the min-hash sketch for private Jaccard index estimation and clarify the extent to which min-hash protocols protect input privacy, refining the common belief in their privacy guarantees. </p>
2022
TCC
Secure Sampling with Sublinear Communication
Abstract
Random sampling from specified distributions is an important tool with wide applications for analysis of large-scale data. In this paper we study how to randomly sample when the distribution is partitioned among two parties' private inputs. Of course, a trivial solution is to have one party send a (possibly encrypted) description of its weights to the other party who can then sample over the entire distribution (possibly using homomorphic encryption). However, this approach requires communication that is linear in the input size which is prohibitively expensive in many settings. In this paper, we investigate secure 2-party sampling with sublinear communication for many standard distributions. We develop protocols for L_1, and L_2 sampling. Additionally, we investigate the feasibility of sublinear product sampling, showing impossibility for the general problem and showing a protocol for a restricted case of the problem. We additionally show how such product sampling can be used to instantiate a sublinear communication 2-party exponential mechanism for differentially-private data release.
2021
EUROCRYPT
The More The Merrier: Reducing the Cost of Large Scale MPC
📺
Abstract
Secure multi-party computation (MPC) allows multiple parties to perform secure joint computations on their private inputs. Today, applications for MPC are growing with thousands of parties wishing to build federated machine learning models or trusted setups for blockchains. To address such scenarios we propose a suite of novel MPC protocols that maximize throughput when run with large numbers of parties. In particular, our protocols have both communication and computation complexity that decrease with the number of parties. Our protocols build on prior protocols based on packed secret-sharing, introducing new techniques to build more efficient computation for general circuits. Specifically, we introduce a new approach for handling \emph{linear attacks} that arise in protocols using packed secret-sharing and we propose a method for unpacking shared multiplication triples without increasing the asymptotic costs. Compared with prior work, we avoid the $\log |C|$ overhead required when generically compiling circuits of size $|C|$ for use in a SIMD computation, and we improve over folklore ``committee-based'' solutions by a factor of $O(s)$, the statistical security parameter. In practice, our protocol is up to $10X$ faster than any known construction, under a reasonable set of parameters.
2019
JOFC
(Efficient) Universally Composable Oblivious Transfer Using a Minimal Number of Stateless Tokens
Abstract
We continue the line of work initiated by Katz (Eurocrypt 2007) on using tamper-proof hardware tokens for universally composable secure computation. As our main result, we show an oblivious-transfer (OT) protocol in which two parties each create and transfer a single, stateless token and can then run an unbounded number of OTs. We also show a more efficient protocol, based only on standard symmetric-key primitives (block ciphers and collision-resistant hash functions), that can be used if a bounded number of OTs suffice. Motivated by this result, we investigate the number of stateless tokens needed for universally composable OT. We prove that our protocol is optimal in this regard for constructions making black-box use of the tokens (in a sense we define). We also show that nonblack-box techniques can be used to obtain a construction using only a single stateless token.
2014
TCC
Program Committees
- Crypto 2024
- Crypto 2023
- Eurocrypt 2022
Coauthors
- Zvika Brakerski (1)
- Seung Geol Choi (4)
- Dana Dachman-Soled (2)
- S. Dov Gordon (2)
- Adam Groce (1)
- Gene Itkis (1)
- Jonathan Katz (6)
- Mingyu Liang (1)
- Linsheng Liu (2)
- Dominique Schröder (3)
- Gil Segev (1)
- Emily Shen (1)
- Daniel Starin (1)
- Mayank Varia (1)
- David Wilson (1)
- Arkady Yerukhimovich (10)
- Hong-Sheng Zhou (2)