CryptoDB
Fanjie Ji
Publications
Year
Venue
Title
2024
TCHES
Efficient Table-Based Masking with Pre-processing
Abstract
Masking is one of the most investigated countermeasures against side-channel attacks. In a nutshell, it randomly encodes each sensitive variable into a number of shares, and compiles the cryptographic implementation into a masked one that operates over the shares instead of the original sensitive variables. Despite its provable security benefits, masking inevitably introduces additional overhead. In particular, the software implementation of masking largely slows down cryptographic implementations and requires a large number of random bits that need to be produced by a true random number generator. In this respect, reducing the overhead of masking is still an essential and challenging task. Among various known schemes, Table-Based Masking (TBM) stands out as a promising line of work enjoying the advantage of generality to any lookup table. It also allows the pre-processing paradigm, wherein a pre-processing phase is executed independently of the inputs, and a much more efficient online phase (using the precomputed tables) takes place to calculate the result. Obviously, the practicality of the pre-processing paradigm relies heavily on the efficiency of the online phase and the size of the precomputed tables. In this paper, we investigate the TBM scheme that offers a combination of linear complexity (in terms of the security order, denoted as d) during the online phase and small precomputed tables. We then apply our new scheme to AES-128, and provide an implementation on the ARM Cortex architecture. In particular, for a security order d = 8, the online phase outperforms the current state-of-the-art AES implementations on embedded processors that are vulnerable to side-channel attacks. The security order of our scheme is proven in theory and verified by the T-test in practice. Moreover, we investigate the speed overhead associated with the random bit generation in our masking technique. Our findings indicate that the speed overhead can be effectively balanced. This is mainly because the true random number generator operates in parallel with the processor's execution, ensuring a constant supply of fresh random bits for the masked computation at regular intervals.
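The encoding step described in the abstract above (splitting a sensitive variable into shares) can be illustrated with a minimal sketch of generic Boolean masking. This is not the paper's table-based scheme; the function names `encode`, `decode`, and `masked_xor` are hypothetical, and Python's `secrets` module merely stands in for a true random number generator.

```python
import secrets
from functools import reduce
from operator import xor


def encode(value: int, d: int) -> list[int]:
    """Split a byte into d + 1 Boolean shares whose XOR equals `value`."""
    masks = [secrets.randbelow(256) for _ in range(d)]
    # The last share is chosen so that all shares XOR back to `value`.
    last = reduce(xor, masks, value)
    return masks + [last]


def decode(shares: list[int]) -> int:
    """Recombine shares by XOR-ing them together."""
    return reduce(xor, shares, 0)


def masked_xor(a: list[int], b: list[int]) -> list[int]:
    """XOR is linear, so it can be applied share by share."""
    return [x ^ y for x, y in zip(a, b)]


# Security order d = 8 means 9 shares per sensitive byte.
shares_a = encode(0x3A, 8)
shares_b = encode(0xC5, 8)
result = decode(masked_xor(shares_a, shares_b))
```

Linear operations such as XOR stay cheap under this encoding; nonlinear operations such as S-box lookups are where the overhead discussed in the abstract arises.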
2024
TCHES
Random Probing Security with Precomputation
Abstract
At Eurocrypt 2014, Duc, Dziembowski and Faust proposed the random probing model to bridge the gap between the probing model proposed at Crypto 2003 and the noisy model proposed at Eurocrypt 2013. Compared with the probing model, whose noise in the leakages should (linearly) increase with the number of shares, the random probing model allows each variable to leak its value with a probability p, which reflects the physical reality of side channels much better. At Crypto 2020, Belaïd et al. proposed the Random Probing Expandability (RPE) security notion, ensuring random probing security for arbitrary-order masking algorithms with constant leakage probability. However, the complexity of existing RPE algorithms is much higher than that of probing-secure algorithms, which limits their practical use. In this paper, we investigate random probing security with precomputation, where a masked cryptographic implementation can be divided into two phases. The first phase, called preprocessing, takes random bits and returns a number of precomputed values. The second phase, called online computation, takes the input (e.g., plaintext and shares of the secret) and the precomputed values to calculate the output (e.g., ciphertext) efficiently. We describe a random probing secure precomputable scheme, which transforms an arbitrary circuit compiler with tolerant leakage probability p into a precomputable one by adding a public (but random) share that is calculated in the online phase; the tolerant leakage probability of the new compiler is min{p, 2^(-5.01)}. Then, we apply the new scheme to the bitsliced AES. Notably, the implementation on the ARM Cortex M architecture shows that the performance of the online phase is significantly improved and even comparable to masking schemes that are only secure in the probing model.
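The leakage model mentioned in the abstract above, where each variable leaks its value with probability p, can be simulated with a short sketch. It only illustrates the model's definition and says nothing about the paper's compiler; `random_probing_leak`, `all_leak_probability`, and the wire names are hypothetical, and a seeded `random.Random` is used purely for a reproducible simulation.

```python
import random


def random_probing_leak(wires: dict[str, int], p: float, rng: random.Random) -> dict[str, int]:
    """Reveal each wire value independently with probability p."""
    return {name: value for name, value in wires.items() if rng.random() < p}


def all_leak_probability(p: float, n_shares: int) -> float:
    """Probability that all n_shares independent shares leak at once."""
    return p ** n_shares


rng = random.Random(2014)  # fixed seed, simulation only
wires = {"share0": 1, "share1": 0, "share2": 1}

leak_none = random_probing_leak(wires, 0.0, rng)  # nothing leaks at p = 0
leak_all = random_probing_leak(wires, 1.0, rng)   # everything leaks at p = 1
worst_case = all_leak_probability(0.5, 3)
```

With independent leaks, the chance of exposing every share of one variable shrinks geometrically in the number of shares, which is the intuition behind tolerating a constant leakage probability.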
2023
TCHES
Efficient Private Circuits with Precomputation
Abstract
At CHES 2022, Wang et al. described a new paradigm for masked implementations using private circuits, where most intermediates can be precomputed before the input shares are accessed, significantly accelerating the online execution of masked functions. However, the masking scheme they proposed mainly featured (and was designed for) cost amortization, leaving its (limited) suitability in the above precomputation-based paradigm just as a bonus. This paper aims to provide an efficient, reliable, easy-to-use, and precomputation-compatible masking scheme. We propose a new masked multiplication over the finite field F_q suitable for precomputation, and prove its security in the composable notion called Probing-Isolating Non-Inference (PINI). In particular, the operations (e.g., AND and XOR) in the binary field can be achieved by assigning q = 2, allowing the bitsliced implementation that has been shown to be quite efficient in software. The new masking scheme is applied to the masking of the AES and SKINNY block ciphers on the ARM Cortex M architecture. The performance results show that the new scheme contributes to a significant speed-up compared with the state-of-the-art implementations. For SKINNY with block size 64, the speed and RAM requirement are significantly improved relative to AES-128 (saving around 45% of the cycles in the online computation and 60% of the RAM space for precomputed values), thanks to its smaller number of AND gates. Besides the security proof by hand, we provide formal verifications for the multiplication and T-test evaluations for the masked implementations of AES and SKINNY. Because of the structure of the new masked multiplication, our formal verification can be performed for security orders up to 16.
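For readers unfamiliar with masked multiplication in the binary field (q = 2), the following sketch shows the classic ISW AND gadget from Crypto 2003 in bitsliced form. It is deliberately not the PINI multiplication proposed in the paper above, and it does no precomputation; the names `share`, `unshare`, `isw_and`, and the 32-bit word width are assumptions chosen for illustration.

```python
import secrets
from functools import reduce
from operator import xor

WIDTH = 32  # assumed word size: each word carries 32 bitsliced bits


def share(word: int, n: int) -> list[int]:
    """Split a WIDTH-bit word into n Boolean shares."""
    masks = [secrets.randbits(WIDTH) for _ in range(n - 1)]
    return masks + [reduce(xor, masks, word)]


def unshare(shares: list[int]) -> int:
    """Recombine shares by XOR."""
    return reduce(xor, shares, 0)


def isw_and(a: list[int], b: list[int]) -> list[int]:
    """Classic ISW masked AND: output shares XOR to unshare(a) & unshare(b)."""
    n = len(a)
    c = [a[i] & b[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbits(WIDTH)  # fresh randomness per share pair
            c[i] ^= r
            # The bracketing order matters for probing security in ISW.
            c[j] ^= (r ^ (a[i] & b[j])) ^ (a[j] & b[i])
    return c


x, y = 0xDEADBEEF, 0x0F0F0F0F
product = unshare(isw_and(share(x, 3), share(y, 3)))
```

The gadget draws n(n-1)/2 fresh random words per AND, which is why a cipher with fewer AND gates is cheaper to mask.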
2022
TCHES
Side-Channel Masking with Common Shares
Abstract
To counter side-channel attacks, a masking scheme randomly encodes key-dependent variables into several shares, and transforms operations into their masked counterparts (called gadgets) operating on shares. This provably achieves the de facto standard notion of probing security. We continue the long line of works seeking to reduce the overhead of masking. Our main contribution is a new masking scheme over finite fields in which shares of different variables have a part in common. This enables the reuse of randomness / variables across different gadgets, and reduces the total cost of masked implementation. For security order d and circuit size l, the randomness requirement and computational complexity of our scheme are Õ(d^2) and Õ(ld^2) respectively, strictly improving upon the state-of-the-art Õ(d^2) and Õ(ld^3) of Coron et al. at Eurocrypt 2020. A notable feature of our scheme is that it enables a new paradigm in which many intermediates can be precomputed before executing the masked function. The precomputation consumes Õ(ld^2) and produces Õ(ld) variables to be stored in RAM. The cost of subsequent (online) computation is reduced to Õ(ld), effectively speeding up, e.g., challenge-response authentication protocols. We showcase our method on the AES on the ARM Cortex M architecture and perform a T-test evaluation. Our results show a speed-up during the online phase compared with state-of-the-art implementations, at the cost of acceptable RAM consumption and precomputation time. To prove security for our scheme, we propose a new security notion intrinsically supporting randomness / variables reusing across gadgets, and bridging the security of parallel compositions of gadgets to general compositions, which may be of independent interest.
Coauthors
- Chun Guo (1)
- Fanjie Ji (4)
- Lu Li (1)
- Yang Su (1)
- Yiteng Sun (2)
- Weijia Wang (4)
- Taoyun Wang (1)
- Bohan Wang (2)
- Yu Yu (3)
- Juelin Zhang (2)