Authors: |
- Yi Chen, Institute for Advanced Study, Tsinghua University, Beijing, China
- Xiaoyang Dong, Institute for Network Sciences and Cyberspace, BNRist, Tsinghua University, Beijing, China; Zhongguancun Laboratory, Beijing, China
- Jian Guo, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore
- Yantian Shen, Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Anyu Wang, Institute for Advanced Study, Tsinghua University, Beijing, China; Zhongguancun Laboratory, Beijing, China
- Xiaoyun Wang, Institute for Advanced Study, Tsinghua University, Beijing, China; Zhongguancun Laboratory, Beijing, China; Shandong Key Laboratory of Artificial Intelligence Security, Shandong, China
|
Abstract: |
The machine learning problem of
extracting neural network parameters
has been studied for nearly three decades.
Functionally equivalent extraction is a crucial goal
for research on this problem.
When the adversary has access to
the raw output of neural networks, various attacks,
including those presented at CRYPTO 2020 and EUROCRYPT 2024,
have successfully achieved this goal.
However, this goal has not been achieved
in the hard-label setting, where the network
returns only the predicted label and the raw output is inaccessible.
In this paper,
we propose the first attack that theoretically achieves
functionally equivalent extraction of ReLU neural networks
under the hard-label setting.
The effectiveness of our attack is
validated through practical experiments
on a wide range of ReLU neural networks,
including networks trained on two real-world
benchmark datasets (MNIST, CIFAR10)
widely used in computer vision.
For a neural network with $10^5$ parameters,
our attack requires only several hours on a single core. |