### SCRATCHS: Side-Channel Resistant Applications Through Co-designed Hardware/Software

#### **Vianney Lapôtre**

Thematic day of the partners' club on RISC-V - GDR SOC²

September 29, 2023







### SCRATCHS project





#### Side-Channel Resistant Applications Through Co-designed Hardware/Software

- Targeted attacks: timing side channels on the microarchitecture
- Ensure efficient and on-demand constant-time execution
  - Best trade-off between hardware and toolchain contributions







Only timing side channels are considered.

The attacker:

- knows the victim program.
- measures time with cycle accuracy:
  - the victim's execution
  - its memory accesses (hit or miss)
- can interrupt the victim.
- shares cache memories with the victim.





### **Threat model**



Ensure efficient and on-demand constant-time execution

#### Sources of leakages

- Branching: if (condition(**secret**))
- Operation with variable execution time: dividend / **secret**;
- Index for memory access: array[**secret**];
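The branching and indexing patterns above can be illustrated with a small model. This is a minimal sketch in Python, meant only to show the shape of the rewriting: Python integers do not execute in constant time, and the helper names (`ct_select`, `ct_lookup`) are hypothetical, not part of the SCRATCHS toolchain. Values are assumed to fit in 32 bits.

```python
MASK32 = 0xFFFFFFFF

def leaky_select(secret_bit, a, b):
    # Secret-dependent branch: the executed path depends on the secret.
    if secret_bit:
        return a
    return b

def ct_select(secret_bit, a, b):
    # Branchless select: derive an all-ones or all-zeros mask from the bit.
    mask = (-secret_bit) & MASK32
    return (a & mask) | (b & ~mask & MASK32)

def leaky_lookup(table, secret_index):
    # Secret-dependent address: the touched cache line depends on the secret.
    return table[secret_index]

def ct_lookup(table, secret_index):
    # Read every entry so the sequence of accessed addresses is fixed,
    # and keep only the entry whose index matches the secret.
    result = 0
    for i, value in enumerate(table):
        diff = i ^ secret_index
        # is_equal is 1 when diff == 0 and 0 otherwise, computed without a branch.
        is_equal = 1 ^ (((diff | -diff) & MASK32) >> 31)
        result |= value & ((-is_equal) & MASK32)
    return result
```

The full-table scan costs one access per entry, which is the kind of software overhead that hardware support aims to avoid.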





### **Project contributions**



Ensure efficient and on-demand constant-time execution

- Hardware support for
  - a software-controlled mode for constant-time execution
  - software-controlled locking and unlocking of cache lines
- Software tools for
  - constant-time programming
  - simulation and formal verification of the proposed protections





# **Project toolchain**

- CompCert (AbsInt), with minor changes for our annotations
- RISC-V ISA extension
- Simulator to evaluate security
- Work in progress: formal proof of the proposed security mechanism












# Multi-cycle instructions in CV32E40P

| Instruction Type      | Cycles                                              |
|-----------------------|-----------------------------------------------------|
| Integer Computational | 1                                                   |
| CSR Access            | 4 (some CSRs)<br>1 (the other CSRs)                 |
| Load/Store            | 1 access<br>2 accesses (if data is non-aligned)     |
| Jump                  | 2                                                   |
| Branch                | 1 (not-taken)<br>3 (taken)                          |
| Multiplication        | 1 (32-LSBs computation)<br>5 (32-MSBs computation)  |
| Division<br>Remainder | 3-35                                                |
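To make the timing spread in the table concrete, the sketch below encodes the cycle counts as (best case, worst case) pairs and reports which instruction types can vary at run time. It is an illustrative model only: the dictionary keys and function names are hypothetical, and CSR accesses are left out because their latency depends on which CSR is addressed rather than on run-time data.

```python
# Cycle counts taken from the table above, as (best case, worst case).
CYCLES = {
    "int": (1, 1),
    "load_store": (1, 2),   # 2 accesses when the data is non-aligned
    "jump": (2, 2),
    "branch": (1, 3),       # not-taken vs taken
    "mul": (1, 1),          # 32 LSBs
    "mulh": (5, 5),         # 32 MSBs
    "div": (3, 35),         # division and remainder
}

def latency_spread(kind):
    """Difference between worst-case and best-case latency for one instruction type."""
    best, worst = CYCLES[kind]
    return worst - best

def variable_time_kinds():
    """Instruction types whose latency is not fixed."""
    return sorted(kind for kind in CYCLES if latency_spread(kind) > 0)

def timing_range(program):
    """Best-case and worst-case cycle totals for a list of instruction types."""
    best = sum(CYCLES[kind][0] for kind in program)
    worst = sum(CYCLES[kind][1] for kind in program)
    return best, worst
```

For instance, `timing_range(["int", "div", "div"])` spans 7 to 71 cycles, so two divisions alone can open a gap of 64 cycles.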



# Software controlled mode for constant time execution



Figure: CV32E40P Block diagram

With the swctm pseudo-instruction compiled as:

`csrr{s|c} rd, CONSTANT_TIME, rs1`

| Line | Instruction      | Comment                 |
|------|------------------|-------------------------|
| 1    |                  | block of sensitive code |
| 2    | `swctm`          | set CT mode             |
| 3    | `add a2, t0, a5` |                         |
| 4    | `c.addi a3, 82`  |                         |
| 5    | `div a0, a3, a2` |                         |
| 6    | `rem t1, a5, a2` |                         |
| 7    | `lw a2, 4(sp)`   |                         |
| 8    | `div a0, a3, a2` |                         |
| 9    | `swctm`          | reset CT mode           |
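The set and reset forms of `swctm` map onto the standard RISC-V CSR read-and-set (`csrrs`) and read-and-clear (`csrrc`) operations. The sketch below models those two operations on a `CONSTANT_TIME` register. It is a hedged illustration: the enable bit position (`CT_ENABLE`) and the helper `run_sensitive_block` are assumptions made for the example and are not given in the slides.

```python
CT_ENABLE = 0x1  # assumed bit position of the constant-time enable flag

class CsrFile:
    """Tiny model of a CSR file holding only the CONSTANT_TIME register."""

    def __init__(self):
        self.regs = {"CONSTANT_TIME": 0}

    def csrrs(self, csr, mask):
        # Read the CSR, then set the bits given by mask; return the old value.
        old = self.regs[csr]
        self.regs[csr] = (old | mask) & 0xFFFFFFFF
        return old

    def csrrc(self, csr, mask):
        # Read the CSR, then clear the bits given by mask; return the old value.
        old = self.regs[csr]
        self.regs[csr] = old & ~mask & 0xFFFFFFFF
        return old

def run_sensitive_block(csrs, body):
    """Enter CT mode, run body, and always leave CT mode afterwards."""
    csrs.csrrs("CONSTANT_TIME", CT_ENABLE)
    try:
        return body()
    finally:
        csrs.csrrc("CONSTANT_TIME", CT_ENABLE)
```

Wrapping the sensitive block this way mirrors the listing: the mode is on only between the two `swctm` instructions, so the rest of the program keeps its normal performance.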










# State of the Art - Mitigations of Cache-based SCA

### **Randomization based caches**

- **RPcache**<sup>a</sup>, **ScatterCache**<sup>b</sup> and **CEASER**<sup>c</sup> propose cache designs based on randomization.
- Prime+Prune+Probe<sup>d</sup> finds eviction sets in randomized caches with only hundreds of accesses.
- Randomized caches provide strong security, but they require regular updates of the cache mapping, which can be a source of performance loss.

<sup>a</sup>Wang and Lee, "New Cache Designs for Thwarting Software Cache-Based Side Channel Attacks", 2007

<sup>b</sup>Werner et al., "ScatterCache: Thwarting Cache Attacks via Cache Set Randomization", 2019

<sup>c</sup>Qureshi, "CEASER: Mitigating Conflict-Based Cache Attacks via Encrypted-Address and Remapping", 2018

<sup>d</sup>Purnal et al., "Systematic Analysis of Randomization-based Protected Cache Architectures", 2021
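The idea behind a randomized mapping, and why updating it invalidates an attacker's eviction set, can be sketched with a keyed hash. This is a toy model and not the construction used by RPcache, ScatterCache or CEASER: the keyed BLAKE2b hash, the 64 sets and the 16-byte line size are all assumptions chosen for illustration.

```python
import hashlib

NUM_SETS = 64     # assumed number of cache sets
LINE_BYTES = 16   # assumed cache line size

def set_index(address, key, line_bytes=LINE_BYTES, num_sets=NUM_SETS):
    """Map an address to a cache set through a keyed hash of its line address."""
    line_address = address // line_bytes
    digest = hashlib.blake2b(
        line_address.to_bytes(8, "little"), key=key, digest_size=8
    ).digest()
    return int.from_bytes(digest, "little") % num_sets

def conflicting_lines(target, candidates, key):
    """Candidate addresses that fall in the same set as target under this key."""
    target_set = set_index(target, key)
    return [addr for addr in candidates if set_index(addr, key) == target_set]
```

An eviction set built with `conflicting_lines` is only valid for the key in use. Once the key is refreshed the attacker has to search again, which is why the remapping period trades security against performance.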






# State of the Art - Mitigations of Cache-based SCA

#### Cache partitioning - with software support

- **NoMo-cache**<sup>a</sup> partitions the cache by allocating a set of ways to sensitive applications.
- SecDCP<sup>b</sup> (secure and non-secure ways) and COLORIS<sup>c</sup> (memory page allocation) use coarse-grained partitioning.
- Wang and Lee propose **PLcache**<sup>d</sup>, a lightweight mechanism that lets a process lock its cache lines.

Cache partitioning is (generally) a lightweight solution, but may have a major impact on performance depending on granularity.



<sup>a</sup>Domnitser et al., "Non-Monopolizable Caches: Low-Complexity Mitigation of Cache Side Channel Attacks", 2012

<sup>b</sup>Wang et al., "SecDCP: Secure dynamic cache partitioning for efficient timing channel protection", 2016

<sup>c</sup>Ye et al., "COLORIS: A dynamic cache partitioning system using page coloring", 2014

<sup>d</sup>Wang and Lee, "New Cache Designs for Thwarting Software Cache-Based Side Channel Attacks", 2007



## **PLcache limitations**

- PLcache<sup>1</sup> does not ensure constant-time accesses
  - Some accesses bypass the cache
  - Locked cache lines can be accidentally evicted by the owner process
  - The replacement policy is shared with other processes
- PLcache is not fully secure: it provides cache line reservation rather than cache line locking.
  - The replacement policy can be manipulated by an attacker for both non-locked and locked cache lines

#### Our approach

- No accidental unlock: lines are released only by explicit Unlock instructions.
- LRU-related metadata is not updated for locked lines.
- At least one free way: a Lock fails when only one unlocked way is left in the cache set.



<sup>1</sup>Wang and Lee, "New Cache Designs for Thwarting Software Cache-Based Side Channel Attacks", 2007















| # | Instruction | Address |
|---|-------------|---------|
| 1 | lock        | a1      |
| 2 | lock        | a2      |
| 3 | lw          | a3      |
| 4 | lw          | a1      |
| 5 | unlock      | a2      |
| 6 | lock        | a4      |
| 7 | lw          | a5      |
| 8 | lw          | a6      |



execution of (1)

- cache miss
- locking the data
  - cannot be evicted
  - update the policy








execution of (2)

- cache miss
- locking the data
  - cannot be evicted
  - update the policy









execution of (3)

- cache miss
- standard access
  - update the policy








execution of (4)

- cache hit
- access to a locked line
  - return the data
  - **no** update of the policy











execution of (5)

- unlocking the data
  - update the policy








execution of (6)

- cache miss
- locking the data
  - cannot be evicted
  - update the policy








execution of (7)

- cache miss
  - evict a3
- standard access
  - update the policy








execution of (8)

- cache miss
  - evict a2 (no longer locked)
- standard access
  - update the policy
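The eight steps above can be replayed with a small software model of one 4-way cache set. This is a minimal sketch and not the hardware implementation: the class and method names are hypothetical, all addresses are assumed to map to the same set, and the LRU state is modelled with a simple counter. It follows the three rules of the approach: explicit unlock only, no LRU update on accesses to locked lines, and a Lock that fails when only one unlocked way is left.

```python
from dataclasses import dataclass

@dataclass
class Line:
    tag: str
    locked: bool = False
    last_use: int = 0  # larger means more recently used

class LockableLruSet:
    """One cache set with LRU replacement plus Lock/Unlock (illustrative model)."""

    def __init__(self, ways=4):
        self.ways = ways
        self.lines = []
        self.clock = 0

    def _find(self, tag):
        return next((line for line in self.lines if line.tag == tag), None)

    def _touch(self, line):
        self.clock += 1
        line.last_use = self.clock

    def _fill(self, tag):
        """Bring tag into the set, evicting the LRU unlocked line if the set is full."""
        evicted = None
        if len(self.lines) == self.ways:
            victim = min((l for l in self.lines if not l.locked),
                         key=lambda l: l.last_use)
            self.lines.remove(victim)
            evicted = victim.tag
        line = Line(tag)
        self.lines.append(line)
        self._touch(line)
        return line, evicted

    def load(self, tag):
        line = self._find(tag)
        if line is not None:
            if not line.locked:  # locked lines keep their LRU metadata unchanged
                self._touch(line)
            return "hit", None
        _, evicted = self._fill(tag)
        return "miss", evicted

    def lock(self, tag):
        line = self._find(tag)
        if line is not None and line.locked:
            return "hit", None
        unlocked_ways = self.ways - sum(l.locked for l in self.lines)
        if unlocked_ways <= 1:  # keep at least one way free for ordinary accesses
            return "lock-fail", None
        outcome, evicted = "hit", None
        if line is None:
            line, evicted = self._fill(tag)
            outcome = "miss"
        else:
            self._touch(line)
        line.locked = True
        return outcome, evicted

    def unlock(self, tag):
        line = self._find(tag)
        if line is not None and line.locked:
            line.locked = False
            self._touch(line)  # the walkthrough updates the policy on unlock
        return "unlock", None

def run_trace(trace, ways=4):
    """Replay (operation, tag) pairs and return one (outcome, evicted) pair per step."""
    cache = LockableLruSet(ways)
    return [getattr(cache, op)(tag) for op, tag in trace]

TRACE = [("lock", "a1"), ("lock", "a2"), ("load", "a3"), ("load", "a1"),
         ("unlock", "a2"), ("lock", "a4"), ("load", "a5"), ("load", "a6")]
```

Calling `run_trace(TRACE)` reproduces the walkthrough: step 4 hits the locked line a1, step 7 evicts a3, and step 8 evicts a2, which was unlocked at step 5.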





### Lock - 4-way set associative cache architecture







## LRU with Lock & Unlock - HW implementation







#### Preliminary results

- Core: CV32E40P (RISC-V based)
- Cache: 8 KiB, 4-way set-associative, L1 data cache

|               | LUTs (w/o Lock) | FFs (w/o Lock) | BRAMs (w/o Lock) | LUTs (w/ Lock) | FFs (w/ Lock) | BRAMs (w/ Lock) |
|---------------|-----------------|----------------|------------------|----------------|---------------|-----------------|
| CPU           | 5653            | 3484           | 8.5              | 5682 (+0.51%)  | 3500 (+0.46%) | 8.5             |
| CV32E40P core | 4655            | 2260           | 0                | 4657 (+0.04%)  | 2262 (+0.09%) | 0               |
| Cache         | 988             | 1057           | 8.5              | 1015 (+2.66%)  | 1069 (+1.12%) | 8.5             |

<sup>2</sup>Synthesis for Kintex-7 chip using Vivado 2022 tool






- We consider a Prime+Probe<sup>3</sup> attack targeting the AES S-box
  - ✗ Lock mechanism is disabled





Figure: Prime+Probe on AES (1st round only, 1st plaintext byte set, key = 0x00).

Figure: Prime+Probe on AES (1st round only, 1st plaintext byte set, key = 0x42).



<sup>3</sup>Gullasch, Bangerter, and Krenn, "Cache Games – Bringing Access-Based Cache Attacks on AES to Practice", 2011
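The reason the two unprotected traces differ with the key is that the first AES round reads the S-box at index plaintext XOR key. The sketch below shows how an observed cache line narrows down the key byte. It is an illustration under stated assumptions: a byte-oriented 256-entry S-box aligned on a cache line boundary and a 16-byte line size, neither of which is given in the slides. The function names are hypothetical.

```python
SBOX_ENTRY_BYTES = 1   # byte-oriented S-box table (assumed)
LINE_BYTES = 16        # assumed cache line size
ENTRIES_PER_LINE = LINE_BYTES // SBOX_ENTRY_BYTES

def accessed_line(plaintext_byte, key_byte):
    """S-box cache line touched in the first round (index = plaintext XOR key)."""
    return (plaintext_byte ^ key_byte) // ENTRIES_PER_LINE

def key_candidates(plaintext_byte, observed_line):
    """Key bytes consistent with one observed S-box line for a known plaintext byte."""
    return [k for k in range(256)
            if accessed_line(plaintext_byte, k) == observed_line]
```

With these assumptions, a single observation leaves 16 candidates for the key byte, that is, it reveals its upper four bits. When every S-box line is locked in the cache, the attacker's probe no longer depends on which line the victim touched, so this filtering has nothing to work with.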



- We consider a Prime+Probe<sup>4</sup> attack targeting the AES S-box
  - ✓ Lock mechanism is enabled for the entire S-box

Figure: Prime+Probe on AES with the full table locked (1st round only, 1st plaintext byte set, key = 0x00).

<sup>4</sup>Gullasch, Bangerter, and Krenn, "Cache Games – Bringing Access-Based Cache Attacks on AES to Practice", 2011






|                 | Binary size (in kB) |
|-----------------|---------------------|
| Unprotected AES | 60.6                |
| Protected AES   | 60.9 (+0.5%)        |


## **Conclusion & Perspectives**



https://project.inria.fr/scratchs/

- RISC-V CV32E40P extension for constant-time execution and mitigation of cache-based SCA
- Hardware implementation on FPGA
- Current limitations and future work
  - Support for multiple cache levels
  - OS support to catch the error and run a fallback solution
  - Study of hybrid solutions including cache randomization





## SCRATCHS: Side-Channel Resistant Applications Through Co-designed Hardware/Software

Many thanks to Jean-Loup Hatchikian-Houdot, Nicolas Gaudin, Frédéric Besson, Pascal Cotret, Guy Gogniat, Guillaume Hiet, and Pierre Wilke.

Thank you for listening! Any questions?








- Domnitser, Leonid et al. "Non-Monopolizable Caches: Low-Complexity Mitigation of Cache Side Channel Attacks". In: *ACM Transactions on Architecture and Code Optimization*. Jan. 2012. doi: 10.1145/2086696.2086714.
- Gullasch, David, Endre Bangerter, and Stephan Krenn. "Cache Games – Bringing Access-Based Cache Attacks on AES to Practice". In: *2011 IEEE Symposium on Security and Privacy*. 2011, pp. 490–505. doi: 10.1109/SP.2011.22.
- Purnal, Antoon et al. "Systematic Analysis of Randomization-based Protected Cache Architectures". In: *Proc. IEEE Symposium on Security and Privacy (SP)*. May 2021. doi: 10.1109/SP40001.2021.00011.
- Qureshi, Moinuddin K. "CEASER: Mitigating Conflict-Based Cache Attacks via Encrypted-Address and Remapping". In: *Proc. International Symposium on Microarchitecture (MICRO)*. 2018. doi: 10.1109/MICRO.2018.00068.





- Wang, Yao et al. "SecDCP: Secure dynamic cache partitioning for efficient timing channel protection". In: *53rd Design Automation Conference (DAC)*. 2016. doi: 10.1145/2897937.2898086.
- Wang, Zhenghong and Ruby B. Lee. "New Cache Designs for Thwarting Software Cache-Based Side Channel Attacks". In: *Proc. International Symposium on Computer Architecture (ISCA)*. 2007. doi: 10.1145/1250662.1250723.
- Werner, Mario et al. "ScatterCache: Thwarting Cache Attacks via Cache Set Randomization". In: *Proc. 28th USENIX Security Symposium (USENIX Security)*. 2019. url: https://www.usenix.org/conference/usenixsecurity19/presentation/werner.
- Ye, Ying et al. "COLORIS: A dynamic cache partitioning system using page coloring". In: *Proc. International Conference on Parallel Architecture and Compilation Techniques (PACT)*. 2014. doi: 10.1145/2628071.2628104.

