Mia Filić

CryptoDB

Publications

Year

Venue

Title

2024

ASIACRYPT

Deletions and Dishonesty: Probabilistic Data Structures in Adversarial Settings

Mia Filić, Keran Kocher, Ella Kummer, Anupama Unnikrishnan

Probabilistic data structures (PDS) are compact representations of high-volume data that provide approximate answers to queries about the data. They are commonplace in today's computing systems, finding use in databases, networking and more. While PDS are designed to perform well under benign inputs, they are frequently used in applications where inputs may be adversarially chosen. This may lead to a violation of their expected behaviour, for example an increase in false positive rate. In this work, we focus on PDS that handle approximate membership queries (AMQ). We consider adversarial users with the capability of making adaptive insertions, deletions and membership queries to AMQ-PDS, and analyse the performance of AMQ-PDS under such adversarial inputs. We argue that deletions significantly empower adversaries, presenting a challenge to enforcing honest behaviour when compared to insertion-only AMQ-PDS. To address this, we introduce a new concept of an honest setting for AMQ-PDS with deletions. By leveraging simulation-based security definitions, we then quantify how much harm can be caused by adversarial users to the functionality of AMQ-PDS. Our resulting bounds only require calculating the maximal false positive probability and insertion failure probability achievable in our novel honest setting. We apply our results to Cuckoo filters and Counting filters. We show how to protect these AMQ-PDS at low cost, by replacing or composing the hash functions with keyed pseudorandom functions in their construction. This strategy involves establishing practical bounds for the probabilities mentioned above. Using our new techniques, we demonstrate that achieving security against adversarial users making both insertions *and* deletions remains practical.
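As a rough illustration of the protection idea mentioned in the abstract (deriving filter indices from a keyed pseudorandom function instead of a public hash), here is a minimal sketch of a Counting filter in Python. It is not the construction from the paper: the class name `KeyedCountingFilter`, the parameters `m` (number of counters) and `k` (indices per item), and the use of HMAC-SHA256 as the PRF are all assumptions made for this example.

```python
import hashlib
import hmac
import os
from collections import Counter
from typing import List, Optional


class KeyedCountingFilter:
    """Toy counting Bloom filter; indices come from HMAC-SHA256 under a secret key.

    Hypothetical illustration only, not the scheme analysed in the paper.
    """

    def __init__(self, m: int, k: int, key: Optional[bytes] = None):
        self.m = m                      # number of counters
        self.k = k                      # indices derived per item
        self.key = key if key is not None else os.urandom(32)
        self.counters = [0] * m

    def _indices(self, item: bytes) -> List[int]:
        # One PRF call per index; the 4-byte prefix separates the k "hash functions".
        out = []
        for i in range(self.k):
            tag = hmac.new(self.key, i.to_bytes(4, "big") + item, hashlib.sha256).digest()
            out.append(int.from_bytes(tag[:8], "big") % self.m)
        return out

    def insert(self, item: bytes) -> None:
        for j in self._indices(item):
            self.counters[j] += 1

    def query(self, item: bytes) -> bool:
        # May return True for items never inserted (false positive).
        return all(self.counters[j] > 0 for j in self._indices(item))

    def delete(self, item: bytes) -> bool:
        # Refuse only if some counter would drop below zero.
        need = Counter(self._indices(item))
        if any(self.counters[j] < c for j, c in need.items()):
            return False
        for j, c in need.items():
            self.counters[j] -= c
        return True


f = KeyedCountingFilter(m=64, k=3, key=b"k" * 32)
f.insert(b"alice")
print(f.query(b"alice"))   # membership after insertion
print(f.delete(b"alice"))  # deletion succeeds
print(f.query(b"alice"))   # counters are back to zero
```

In this sketch `delete` cannot distinguish a genuine member from a false positive, so a caller can decrement counters that belong to other items. That is one plausible reading of why the abstract treats deletions as giving adversaries extra power; the paper's own honest setting and bounds are not reproduced here.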

2024

RWC

Compact Frequency Estimators in Adversarial Environments

Sam A. Markelon, Mia Filić, Thomas Shrimpton

Count-Min Sketch (CMS) and HeavyKeeper (HK) are two realizations of a compact frequency estimator (CFE). These probabilistic data structures maintain a compact summary of high-volume streaming data and provide approximate estimates of the number of times an element occurred. CFEs are commonly used in streaming settings to identify elements with the largest frequencies (i.e., top-K elements, heavy hitters, elephant flows). Finding extreme elements is important for network planning, network monitoring, recommendation systems, etc. Traditionally, probabilistic guarantees on the accuracy of frequency estimates are proved under the implicit assumption that stream elements do not depend upon the internal randomness of the structure. This assumption is often not well-matched with reality; malicious actors could be incentivized to manipulate the data stream. In this talk, we reveal vulnerabilities in CMS and HK to adaptive attacks, by presenting attacks that cause significant estimation errors. For instance, elements never seen in the stream can be manipulated to resemble heavy hitters in CMS. This could, for example, cause network flow monitoring systems relying on CFEs to identify non-existent or benign flows as possible threats. Conversely, in HK, legitimate heavy hitters can be made to disappear. We analyze our attacks analytically and experimentally, obtaining a tight agreement between the two. These negative results seem unavoidable for (at least) sketch-based CFEs with parameters that are reasonable in practice. On the positive side, we build a new CFE (Count-Keeper) that can be seen as a composition of the CMS and HK structures. Count-Keeper estimates are typically more accurate (by at least a factor of two) than CMS for “honest” streams. Further, our attacks against CMS and HK are less effective (and more resource intensive) when used against Count-Keeper. Lastly, Count-Keeper has a native ability to flag estimates that are suspicious, which neither CMS nor HK (or any other CFE, to our knowledge) admits.
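For readers unfamiliar with the Count-Min Sketch named in the abstract, the following is a minimal textbook-style sketch in Python. It does not implement the attacks, HeavyKeeper, or Count-Keeper. The class name, the `width`/`depth` parameters, and the SHA-256-based row hashing are assumptions chosen for this example.

```python
import hashlib
from typing import List


class CountMinSketch:
    """Toy Count-Min Sketch: `depth` rows of `width` counters, estimate = row minimum."""

    def __init__(self, width: int, depth: int, seed: bytes = b""):
        self.width = width
        self.depth = depth
        self.seed = seed  # hypothetical per-instance randomness
        self.rows: List[List[int]] = [[0] * width for _ in range(depth)]

    def _index(self, row: int, item: bytes) -> int:
        # One hash per row, separated by the row number.
        digest = hashlib.sha256(self.seed + row.to_bytes(4, "big") + item).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, item: bytes, count: int = 1) -> None:
        for r in range(self.depth):
            self.rows[r][self._index(r, item)] += count

    def estimate(self, item: bytes) -> int:
        # Collisions can only add to a counter, so this never underestimates.
        return min(self.rows[r][self._index(r, item)] for r in range(self.depth))


cms = CountMinSketch(width=256, depth=4)
for _ in range(5):
    cms.update(b"flow-a")
cms.update(b"flow-b", 2)
print(cms.estimate(b"flow-a"))  # at least 5
print(cms.estimate(b"flow-b"))  # at least 2
```

Because estimates are inflated by any colliding element, an element that never appeared in the stream can still receive a large estimate if enough other elements share its counters in every row. The degenerate case `width=1` makes this easy to see: every element maps to the same counter, so an unseen element is reported with the total stream count.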