Blacksmith Sp22

The document introduces Blacksmith, a scalable Rowhammer fuzzer that generates non-uniform access patterns to exploit vulnerabilities in DRAM, specifically bypassing Target Row Refresh (TRR) mechanisms. It demonstrates that these new patterns can trigger bit flips in all tested DDR4 DRAM devices, significantly outperforming existing methods. The findings highlight the inadequacies of current TRR implementations and emphasize the need for more effective mitigation strategies against Rowhammer attacks.

BLACKSMITH: Scalable Rowhammering in the Frequency Domain

Patrick Jattke Victor van der Veen Pietro Frigo Stijn Gunter Kaveh Razavi
ETH Zurich Qualcomm Technologies Inc. VU Amsterdam ETH Zurich ETH Zurich
pjattke@[Link] vvdveen@[Link] [Link]@[Link] sgunter@[Link] kaveh@[Link]

Abstract—We present the new class of non-uniform Rowhammer access patterns that bypass undocumented, proprietary in-DRAM Target Row Refresh (TRR) while operating in a production setting. We show that these patterns trigger bit flips on all 40 DDR4 DRAM devices in our test pool. We make a key observation that all published Rowhammer access patterns always hammer “aggressor” rows uniformly. While uniform accesses maximize the number of aggressor activations, we find that in-DRAM TRR exploits this behavior to catch aggressor rows and refresh neighboring “victims” before they fail. There is no reason, however, to limit Rowhammer attacks to uniform access patterns: smaller technology nodes make underlying DRAM technologies more vulnerable, and significantly fewer accesses are nowadays required to trigger bit flips, making it interesting to investigate less predictable access patterns. The search space for non-uniform access patterns, however, is tremendous. We design experiments to explore this space with respect to the deployed mitigations, highlighting the importance of the order, regularity, and intensity of accessing aggressor rows in non-uniform access patterns. We show how randomizing parameters in the frequency domain captures these aspects and use this insight in the design of Blacksmith, a scalable Rowhammer fuzzer that generates access patterns that hammer aggressor rows with different phases, frequencies, and amplitudes. Blacksmith finds complex patterns that trigger Rowhammer bit flips on all 40 of our recently purchased DDR4 DIMMs, 2.6× more than state of the art, and generating on average 87× more bit flips. We also demonstrate the effectiveness of these patterns on Low Power DDR4X devices. Our extensive analysis using Blacksmith further provides new insights on the properties of currently deployed TRR mitigations. We conclude that after almost a decade of research and deployed in-DRAM mitigations, we are perhaps in a worse situation than when Rowhammer was first discovered.

I. INTRODUCTION

A dangerous mistake when designing a mitigation is assuming that attackers will operate the same way after the deployment of the new mitigation. This is especially true for in-DRAM Target Row Refresh (TRR), a selection of defense mechanisms for stopping the ever-worsening Rowhammer effect in the DRAM substrate. Proprietary, undocumented in-DRAM TRR is currently the only mitigation that stands between Rowhammer and attackers exploiting it in various scenarios such as browsers, mobile phones, the cloud, and even over the network [1]–[11]. In this paper, we show how deviations from known uniform Rowhammer access patterns allow attackers to flip bits on all 40 recently-acquired DDR4 DIMMs, 2.6× more than the state of the art [12]. The effectiveness of these new non-uniform patterns in bypassing TRR highlights the need for a more principled approach to address Rowhammer.

Existing Rowhammer patterns. Data in DRAM is stored in rows of cells. These cells consist of capacitors that leak charge over time. For preserving the data, the charge needs to be restored by refreshing the cells regularly. However, it is possible to leak charge from these cells with the Rowhammer vulnerability before they have a chance to get refreshed [13]. Existing approaches trigger Rowhammer by selecting one to many different “aggressor” rows to hammer [1], [12], [14]. These aggressor rows are repeatedly accessed in a short duration before cells get refreshed, causing bit flips in “victim” rows that are adjacent to these aggressors. As an example, the double-sided Rowhammer access pattern sandwiches a victim row with two aggressor rows, maximizing charge leakage in the victim row. To leak as much charge from victim rows as possible, such patterns hammer aggressors as often as possible before their victims have a chance to get refreshed.

Target Row Refresh. Target Row Refresh (TRR) is an umbrella term for hardware mitigations against the Rowhammer vulnerability, with recent variants operating entirely inside DRAM chips [12]. At a high level, TRR aims to detect rows that are frequently accessed (i.e., hammered) and refresh their neighbors before their charge leak results in data corruptions. The challenge is finding the frequent items in a stream of DRAM accesses. However, as precise frequent item counting is expensive in hardware, TRR implementations try to estimate the frequent items (i.e., the aggressors). Recent work shows that by increasing the number of aggressors, certain implementations of TRR are unable to keep track of all aggressors and corruptions resurface [12]. A majority of TRR implementations (roughly 70%), however, remain secure since they can detect all aggressors given that they are hammered frequently enough.

Non-uniform Rowhammer patterns. We make the key observation that prior Rowhammer attacks always access aggressors uniformly. From a frequent item counting perspective, this is a straightforward case for estimating frequent items. However, there is, of course, no need for attackers to hammer in the space where TRR implementations operate effectively. Given the increasing (physical) susceptibility of DRAM to Rowhammer [15], aggressors no longer need many accesses: attackers are free to choose from many hammering strategies between the times a victim row is refreshed. While this provides many possibilities to fool the TRR’s estimation of the frequent items, at the same time, it creates a problem for attackers since the search space for non-uniform patterns is huge.

We design a series of experiments that start by randomizing the patterns and gradually discovering the essential properties that make them successful. This exploration ultimately results in a set of parameters for constructing non-uniform patterns that can effectively explore the weaknesses in existing TRR mechanisms. Notably, we find three temporal properties, namely order, regularity, and intensity, play a crucial role in constructing non-uniform patterns that can escape various TRR mechanisms.

Fig. 1: DRAM structure. Low-level view on a DRAM bank. (Figure labels: row address, row decoder, wordline, bitline, DRAM cell with access transistor and capacitor, sense amplifiers forming the row buffer.)
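
The frequent-item view of TRR used in the introduction can be illustrated with a toy model. The sketch below is not taken from any TRR implementation (the text describes those as proprietary and undocumented); it uses a textbook Misra-Gries-style summary with a hypothetical table size, and the row numbers and the helper names `track_frequent_rows` and `round_robin` are assumptions made for the example. It only shows why a small tracker keeps up with two uniformly hammered rows but can lose track when the number of aggressors exceeds its capacity.

```python
from itertools import cycle, islice


def track_frequent_rows(accesses, capacity):
    """Misra-Gries-style summary keeping at most `capacity` candidate rows."""
    counters = {}
    for row in accesses:
        if row in counters:
            counters[row] += 1
        elif len(counters) < capacity:
            counters[row] = 1
        else:
            # Table full: age every candidate and drop those reaching zero.
            for candidate in list(counters):
                counters[candidate] -= 1
                if counters[candidate] == 0:
                    del counters[candidate]
    return counters


def round_robin(rows, total_accesses):
    """Uniform hammering: visit the rows in a fixed cyclic order."""
    return list(islice(cycle(rows), total_accesses))


double_sided = [10, 12]              # hypothetical aggressor rows
many_sided = list(range(10, 30, 2))  # ten hypothetical aggressor rows

tracked_few = track_frequent_rows(round_robin(double_sided, 10_000), capacity=4)
tracked_many = track_frequent_rows(round_robin(many_sided, 10_000), capacity=4)

print("tracked (double-sided):", sorted(tracked_few))
print("untracked (ten aggressors):", sorted(set(many_sided) - set(tracked_many)))
```

In this toy model both double-sided aggressors stay in the table, whereas the ten-aggressor round-robin leaves aggressors untracked. Real samplers may behave quite differently.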


Rowhammering in the frequency domain. To capture these temporal parameters, we propose constructing non-uniform patterns in the frequency domain. Signal properties such as phase, frequency, and amplitude conveniently map to the parameters that are important in exploring the blind spots of TRR. Based on this insight, we build Blacksmith, a scalable Rowhammer fuzzer capable of generating access patterns by randomizing parameters in the frequency domain for randomly selected aggressors. In contrast to previous work [12], our novel patterns are highly complex, making it difficult for humans to explore manually. Furthermore, our scalable fuzzing-based approach makes it easy to test a large number of DRAM devices against Rowhammer, without the need for time-consuming reverse engineering. On top of generating non-uniform patterns, we can distinguish interesting DRAM-dependent temporal properties by analyzing patterns that triggered bit flips.

Our evaluation shows that Blacksmith can generate patterns that can trigger bit flips on all 40 recently purchased DDR4 DIMMs from the three major DRAM vendors (Samsung, Micron, and Hynix), a factor of 2.6× more than state-of-the-art many-sided patterns [12]. We also demonstrate the effectiveness of these patterns on 16 out of 19 Low Power DDR4X devices. These results show that instead of obscure TRR mitigations, we need to invest in principled mitigations with clear guarantees.

To gain more insights into these non-uniform patterns, we systematically evaluate how Blacksmith converges to the specific values of the different spatial and temporal parameters. Using the bit flips triggered by these patterns, we uncover interesting new properties of deployed TRR mitigations such as the number of aggressors that they track, the importance of the aggressors’ addresses, and significant differences in the number of triggered bit flips on different chips of the same device. Furthermore, we reverse-engineer properties of the TRR implementation on one of the Low Power DDR4X devices where Blacksmith could not trigger bit flips and show how a different configuration of Blacksmith could trigger bit flips on these devices.

Contributions. We make the following contributions:

(1) We present novel non-uniform Rowhammer patterns that make it difficult for TRR to estimate the potential aggressor rows accurately.
(2) We design Blacksmith, a new Rowhammer fuzzer that can effectively explore the important parameters of these non-uniform patterns by hammering in the frequency domain.
(3) We evaluate Blacksmith on 40 DDR4 DIMMs from all three major DRAM vendors, showing that it is possible to trigger bit flips on 100% of them by using non-uniform patterns. We also show Blacksmith’s ability to trigger bit flips on 16 out of 19 LPDDR4X DRAM chips.
(4) We conduct an extensive analysis of the effective patterns and bit flips found by Blacksmith to gain insights on patterns and deployed mitigations. Furthermore, we reverse-engineer the TRR mechanism of one of the LPDDR4X devices where Blacksmith could not trigger any bit flips to show how it can better be configured.

Reproducibility. To enable reproducibility, we publish the source code of Blacksmith on this URL: [Link] comsec-group/blacksmith.

Responsible disclosure. We reported our findings to affected parties by following a responsible disclosure process. In Q1-2021, we initiated the process with the NCSC Switzerland (NCSC-CH). In Q2-2021, NCSC-CH informed affected parties and shared our results with DRAM vendors, OEMs, and cloud providers. In Q3-2021, NCSC-CH sent affected parties an updated version of our work and announced the public disclosure date. In Q4-2021, we were assigned a CVE (CVE-2021-42114) and publicly disclosed Blacksmith on November 15, 2021. The three DRAM manufacturers (Samsung, SK Hynix, and Micron), Intel, AMD, Microsoft, Oracle, and Google confirmed the receipt of our findings. SK Hynix got in touch with us to discuss the LPDDR4X results. We discussed a possible mitigation with Intel and our findings in more detail with Google. None of the contacted parties informed us of their mitigation plans.

II. BACKGROUND

This section gives an overview of DRAM, including its internal organization and interaction with the memory controller. We also introduce the Rowhammer attack, widely-deployed mitigations against it, and describe common access patterns.

A. DRAM Organization

While there are different DRAM types for PCs, servers, and laptops, they share a common organization discussed here.

Addressing & Geometry. A DRAM address is composed of a channel, bank, rank, row, and column. Each channel is connected to one or multiple DIMMs, of which each can operate independently. A DIMM is equipped with multiple DRAM chips that are grouped into ranks and these, in turn, consist of multiple banks that can operate in parallel [16]. A bank is made of many DRAM cells, of which each contains a capacitor, which stores a single data bit as electrical charge, and an access transistor. These cells are arranged in a two-dimensional grid (see Figure 1) and connected row- and column-wise by a word- and a bitline, respectively. Every bank has a row buffer, an array of sense amplifiers connected to the bit lines involved in reading/writing data from/to rows.
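
To make the address components above concrete, the following minimal sketch splits an integer address into channel, rank, bank, row, and column. The field widths and their order in `FIELDS` are hypothetical and chosen only for illustration; real memory controllers use platform-specific mappings that are often undocumented and may combine address bits with XOR functions.

```python
from dataclasses import dataclass

# Hypothetical layout: (field name, width in bits), lowest bits first.
FIELDS = (("column", 10), ("bank", 4), ("rank", 1), ("channel", 1), ("row", 16))


@dataclass(frozen=True)
class DramAddress:
    channel: int
    rank: int
    bank: int
    row: int
    column: int


def decode(addr: int) -> DramAddress:
    """Split a linear address into DRAM coordinates under FIELDS."""
    parts = {}
    for name, width in FIELDS:
        parts[name] = addr & ((1 << width) - 1)
        addr >>= width
    return DramAddress(**parts)


def encode(dram: DramAddress) -> int:
    """Inverse of decode(): pack DRAM coordinates into a linear address."""
    addr, shift = 0, 0
    for name, width in FIELDS:
        addr |= getattr(dram, name) << shift
        shift += width
    return addr


def same_bank(a: DramAddress, b: DramAddress) -> bool:
    """Rows can only disturb each other when they share a bank."""
    return (a.channel, a.rank, a.bank) == (b.channel, b.rank, b.bank)


victim = DramAddress(channel=1, rank=0, bank=5, row=1234, column=77)
aggressor = DramAddress(channel=1, rank=0, bank=5, row=1235, column=0)
print(decode(encode(victim)), same_bank(victim, aggressor))
```

Selecting aggressors and victims requires knowing which addresses map to adjacent rows of the same bank, which is why the row and bank fields matter most in this setting.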
DRAM Commands [16]. Before reading or writing data to a DRAM address, the memory controller (MC) puts the associated bank in a precharged state by issuing the PRECHARGE command to DRAM, deactivating the row buffer. Next, the MC issues an ACTIVATE command, after which the requested row is loaded into the row buffer. Now, data can be read (READ) or written (WRITE); both require specifying the targeted column(s) of the loaded row. Additionally, the MC must issue REFRESH commands regularly, on average every 7.8 µs (the refresh interval or tREFI) [17], to preserve a cell’s value since the capacitors leak charge over time [18]. The REFRESH only refreshes a small subset of rows at a time, which are determined by the DRAM chip, based on a row’s last refresh time. Related to that is the retention time, typically 64 ms in DDR4 [18], [19], the minimum time that DRAM cells must be able to hold data without losing information.

B. Rowhammer

While the industry has been aware of the Rowhammer vulnerability in DRAM since at least 2012 [20], Kim et al. [13] studied the problem rigorously for the first time in their seminal paper in 2014. They observed that commodity DRAM chips from all major vendors suffer from disturbance errors induced by repeatedly opening (ACTIVATE) and closing (PRECHARGE) a DRAM row (i.e., aggressor row) in a short period of time. This action causes some cells in neighboring rows (i.e., victim rows) to leak charge at a faster pace than usual. Consequently, these cells can no longer retain their charge for the period they are supposed to before the cell is refreshed, resulting in their bits flipping.

The Rowhammer attack attracted much attention due to its devastating impact on systems security. Follow-up research showed how Rowhammer can be used to compromise users via JavaScript [2], [3], [8], [11], in the cloud [4], [5], on mobile phones [6], [7], and even over the network [9], [10].

Target Row Refresh. The industry has responded to Rowhammer by deploying a mitigation known as Target Row Refresh (TRR). Frigo et al. [12] analyzed TRR and found that it refers to a variety of different solutions with the recent variants all operating inside the DRAM chips. They further show that in-DRAM TRR tries to detect which rows are being hammered using a sampling mechanism and internally refreshes their victims before these receive their regular refresh. An ideal TRR sampler needs to keep track of every row that receives an ACTIVATE command but doing so is expensive in hardware. Instead, existing TRR mechanisms estimate the rows that are activated most often. The TRRespass fuzzer [12] shows gaps in this estimation by increasing the number of aggressor rows, causing Rowhammer bit flips to resurface on roughly 30% of modern DDR4 DIMMs. The question that we are trying to answer in this paper is whether there are more effective ways of discovering gaps in the estimation of aggressor rows.

Fig. 2: Common Rowhammer access patterns. Overview of the most common Rowhammer access patterns from prior work. (a) Spatial arrangement of aggressor rows and victim rows in DRAM memory for single-sided, double-sided, and 4-sided patterns (rows x−6 to x+2). (b) Relative activation frequency, i.e., number of ACTIVATEs per aggressor in a Rowhammer pattern.

Rowhammer Access Patterns. We use the term pattern to describe memory access sequences and denote patterns as being effective when they can trigger bit flips. In search of effective patterns for more DIMMs, we must understand how existing instances work. Figure 2a shows the three common Rowhammer access patterns. In the original work [13], the authors used two far apart aggressor rows for hammering, later termed as single-sided because, from the victim row’s point of view, its charge is being leaked from one side. Later, Seaborn and Dullien [1] showed that if a victim row is sandwiched by two aggressors, it increases the chance of bit flips (i.e., double-sided). Frigo et al. [12] introduced n-sided Rowhammer, where n refers to n − 1 victims being hammered by n aggressors from both sides. Figure 2a shows an example with n = 4. The recent SMASH attack [11] shows that it can trigger bit flips in JavaScript by synchronizing n-sided patterns with the DRAM REFRESH command. Our experiments with SMASH patterns, as discussed in Appendix A, show that while aligning with REFRESH increases the number of effective patterns found on certain DIMMs, overall, it does not compromise TRR on more devices than the original n-sided patterns.

We make a key observation that the aggressors in all these previous patterns are hammered uniformly as shown in Figure 2b. While hammering uniformly maximizes the chance of triggering a Rowhammer bit flip, since it maximizes the frequency in which the aggressors are hammered, it is also the easiest case for TRR to estimate the rows that are accessed the most (i.e., hammered). Given the increasing degree of vulnerability to Rowhammer, the aggressors no longer need to be hammered as frequently as possible, and a significantly smaller number of accesses is enough to trigger Rowhammer [15]. This provides an opportunity to better exercise the TRR’s estimation of aggressor rows by hammering non-uniformly. This paper explores the design of non-uniform patterns against in-DRAM TRR.

III. PROPERTIES OF EFFECTIVE NON-UNIFORM PATTERNS

While non-uniform access patterns will likely make it more challenging for TRR to estimate the aggressors, at the same
time, they are challenging to craft due to the large design space. Let us consider the possible number of activations in a tREFI (≈ 100 accesses), so we end up with ≈ 819 k possible activations between two (victim) row refreshes, where each could potentially be used to hammer our aggressors. Assuming that we need to hammer 10 k times, it gives us more than 6.7 × 10^23447 possibilities to distribute our double-sided aggressor accesses (see Appendix C for details). As this is impractically large, we explore the important properties of effective non-uniform patterns to reduce the size of this search space.

One possibility is to reverse-engineer specific details of various TRR implementations, as has been done in concurrent [21] and earlier work [12]. This is a time-consuming process and needs to be repeated on new devices given that vendors tend to change their implementations [12]. Instead, our goal here is to determine the generic properties of existing TRR implementations. For this purpose, we conduct a series of experiments on DIMMs A10 and B2 of the two major vendors in our test pool. We later show how these discovered properties can help in triggering bit flips on other DIMMs from the same vendors as well as in devices from other vendors.

We start exploring non-uniform patterns by randomizing the number of aggressors being hammered and their location (Section III-A). To limit the search space, we try to answer questions such as when we should hammer an aggressor and for how long. We first answer these questions for patterns that fit within a REFRESH interval (Section III-B) and later extend our search to larger patterns (Section III-C). After we understand the properties of successful patterns, we discuss how we can capture these properties when generating effective non-uniform access patterns (Section III-D).

A. Can non-uniform access patterns bypass mitigations?

We design an experiment to explore the effectiveness of non-uniform patterns. In this experiment, we assess the importance of non-uniformity by considering two extremes in the design space: (i) adding some randomness to n-sided patterns and (ii) creating randomized patterns.

In the first experiment (i), we introduce non-uniform aggressor accesses (i.e., accesses at random times) into common n-sided patterns by accessing selected aggressors more or less often than all others. This means, we access a randomly picked double-sided aggressor pair at random times during the regular accesses of an n-sided pattern.¹

The naive approach for implementing such random accesses would be using conditional branching based on some random value. However, the CPU might speculatively execute the wrong branch, leading to unwanted memory accesses. Therefore, we rewrite our branching into a statement that targets different memory locations depending on the condition’s value. As depicted in Figure 3, we precompute a bit mask that decides when and how often our aggressor pair should be hammered. This bit mask is computed based on existing work [15] that showed between 10 k and 147.5 k ACTIVATEs (Hammer Count) are required on modern DDR4 devices to trigger bit flips. Ideally, this value should be as small as possible to reduce the chance of detection by TRR, yet large enough to cause a bit flip. As we cannot determine this value for our PC-DDR DIMMs, we randomly pick a value in between 10 k and 147.5 k for each pattern. While hammering the pattern, we then use the bit mask to offset an array that points to part of our n-sided pattern or our randomly-picked double-sided aggressor pair.

Fig. 3: Non-uniform patterns experiment. (i) We take an n-sided pattern (e.g., n = 6) and based on a precomputed bit mask, randomly replace accesses to a randomly picked double-sided aggressor pair r1, r2. (ii) We create a randomized pattern and hammer a randomly-picked double-sided aggressor pair r1, r2 at random times. (Figure panels: precomputed bit mask, 6-sided pattern, randomized pattern; horizontal axis: hammering rounds.)

In experiment (ii), we follow the same methodology to access a selected double-sided aggressor pair non-uniformly; however, instead of an n-sided pattern as a basis, we now randomize the pattern’s accesses. Note that these random accesses are spread over the same bank as our aggressors, i.e., there are no fixed distances in-between aggressors like in n-sided patterns. Similarly to experiment (i), we use patterns of length n ∈ [2, 32] but we replace aggressors by our double-sided aggressor pair at random locations of the pattern. This makes all aggressor accesses in our pattern non-uniform.

We extended TRRespass [22] by these two new ways of creating patterns and try these patterns as well as the original n-sided patterns on all DIMMs of our test pool (see Appendix B) for 6 h. To see if a pattern is successful, we check all rows next to accessed rows for bit flips. The randomized approach was the most successful and could trigger bit flips on 37.5% of all devices in our test pool, followed by n-sided patterns (35%), and n-sided patterns with random accesses (27.5%). Considering all three approaches together, we observed bit flips on 20 of 40 DIMMs (50%). From these 20 DIMMs, there are 8 DIMMs where all three approaches triggered bit flips and 6 DIMMs where one (or both) of the two non-uniform approaches succeeded. Table VII in Appendix D provides more detailed results from these experiments.

These experiments confirm our assumption that there are DIMMs where we need non-uniform patterns to bypass the mitigation. This shows that non-uniformity is a promising concept for finding effective Rowhammer access patterns on more devices.

Observation (O1). Non-uniform accesses can lead to effective patterns on DIMMs where previous n-sided patterns could not trigger any bit flips.
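
The search-space estimate given at the start of this section can be reproduced numerically. The sketch below assumes that the count is the binomial coefficient "choose 10 k hammering slots out of all available activation slots", and that there are 8192 refresh intervals per 64 ms retention window (equivalent to a tREFI of 7.8125 µs, which the text rounds to 7.8 µs), each with about 100 accesses. Both assumptions are ours; the authors' own derivation is in Appendix C. The log-gamma function is used because the number is far too large for floating point.

```python
import math


def log10_binomial(n: int, k: int) -> float:
    """log10 of C(n, k), computed via log-gamma to avoid huge integers."""
    ln_value = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return ln_value / math.log(10)


REFRESH_INTERVALS = 8192      # assumed number of tREFI windows per 64 ms
ACCESSES_PER_INTERVAL = 100   # approximate value quoted in the text
HAMMER_ACCESSES = 10_000      # "we need to hammer 10 k times"

slots = REFRESH_INTERVALS * ACCESSES_PER_INTERVAL
log10_count = log10_binomial(slots, HAMMER_ACCESSES)
exponent = math.floor(log10_count)
mantissa = 10 ** (log10_count - exponent)

print(f"{slots} slots, roughly {mantissa:.1f}e{exponent} placements")
```

Under these assumptions the result has the same order of magnitude as the figure quoted in the text. The estimate is sensitive to the slot count: using 64 ms / 7.8 µs directly instead of 8192 intervals shifts the exponent by several units.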
¹ with a randomly picked number of aggressors n ∈ [2, 32], an aggressor intra-distance d ∈ [0, 16], and an aggressor intra-distance v ∈ [1, 4].

However, there are also three opposite cases where only pure n-sided patterns are effective; this indicates that these simple approaches for pattern generation are not effective enough. Besides that, we observe that our pattern search space is not optimal yet: using n-sided patterns as a base seems to be too restrictive, whereas the random approach creates an enormous search space that cannot be explored in sufficient depth within a reasonable time. Therefore, we aim to identify parameters of effective patterns that allow us to guide pattern generation and, as such, reduce the search space.

B. When should we hammer an aggressor and for how long?

Prior work [11], [12] suggests that in-DRAM TRR acts at the same time as a REFRESH. Based on this, we aim to explore the parameters of effective non-uniform patterns within two consecutive REFRESH commands, i.e., a refresh interval.

To verify when we should hammer, we design an experiment where we randomly choose a double-sided aggressor pair (a1, a2) and generate a pattern of length N, where N corresponds to the number of memory accesses that fit inside a refresh interval (determined experimentally beforehand). For each possible offset t = 0, . . . , N − 2 at which we can place the two aggressors, we craft a pattern as follows: the aggressors a1 and a2 are placed at position t and t + 1 in the pattern, respectively, and the remaining N − |{a1, a2}| = N − 2 accesses (i.e., positions 0 ≤ i < N for i ∉ {t, t + 1}) are filled up by accesses to random rows in the same bank as a1 and a2. This is depicted in Figure 4: the pattern’s aggressor accesses are highlighted in yellow and the random accesses in grey. We repeat hammering each pattern for one million rounds, i.e., long enough to see bit flips even with strong DRAM cells [15]. We note that the rows are randomly picked for each offset (including the aggressors) in each iteration of the experiment. For improving the reliability, we repeat the experiment ten times on different locations (i.e., DRAM rows). To ensure that these patterns remain inside the refresh interval, at the end of each round, we access two random rows from the same bank repeatedly until we observe a peak in the access time, which signals that a REFRESH happened.

Fig. 4: Offset & intensity experiment. Systematic probing of aggressor offsets 0...N − 2 for a pattern of length N. (Figure labels: offset, intensity, pattern execution in rounds, REF sync, aggressors (a1, a2), random rows.)

Figure 5 depicts the results of our experiment for A10, aggregated over ten DRAM locations. The best pattern, i.e., the pattern that triggered the highest number of bit flips (red bar), starts at offset 91 and generates 140 bit flips. We can see that an arbitrarily chosen aggressor offset may lead to no bit flips because the TRR sampler on this device considers the first accesses in a refresh interval, similar to the observations reported in earlier work [12]. These results suggest that towards the end of the refresh interval, only certain accesses (at offsets 80, 84, . . . , 96) are sampled. Hence, we can trigger bit flips by hammering at specific times in the last ≈23% of the refresh interval (i.e., offsets 77–98). The number of bit flips that we observe in this range is, on average, higher than for all other possible offsets within a REFRESH interval. From that we conclude that our assumption is correct: carefully choosing when to access aggressors is significant for maximizing effectiveness.

Fig. 5: Aggressor offset. Observed bit flips on A10, over ten probed locations, at which we place aggressors at different offsets in the pattern (N = 100). Using an offset of 91 (red) triggers the most (140) bit flips. (Axes: aggressor offset, bit flips.)

Observation (O2). Inserting aggressors at the “right” location in a pattern enables them to bypass the mitigation.

A natural follow-up question from this result is whether hammering our aggressor pair with greater intensity (i.e., more than only once successively) increases the number of observed bit flips. More bit flips are favorable for attacks as they typically require bit flips at specific page offsets. Hence, more bit flips increase the attack’s success rate. However, accessing an aggressor pair successively too often will likely result in a TRR. To investigate this, we extend our last experiment by repeating hammering each possible pattern offset up to five times for one million rounds in total. This experiment is depicted in Figure 4. We limit the intensity to five because higher intensities do not trigger bit flips anymore.

Fig. 6: Hammering intensity. Number of observed bit flips when repeating hammering the aggressors with different intensity (1–5), accumulated over 10 different locations on A10. Hammering with an intensity of two, starting from offset 78, triggers the most (190) bit flips. (Axes: aggressor offset 77–97, bit flips.)

In Figure 6, we show the results of this experiment. We report observed bit flips within aggressor offset 77–98 (derived from the previous experiment, see Figure 5). We can see that for some offsets, an increased hammering intensity leads to more bit flips. For example, starting from offset 78 and successively hammering two times is more effective (190 bit flips) than only a single time (110 bit flips) and also outperforms the best offset hammered only a single time (offset 91, 140 bit flips). As expected, hammering the aggressors for too long triggers a TRR, which results in fewer or no bit flips at all. This strongly indicates that TRR sampling happens at specific offsets (80, 84, 88, . . .), but it is not enough for an aggressor row to get sampled only at one of them. For example, we can see that at offset 80 with an intensity of 5, our aggressors are sampled by the mitigation; however, if we use an intensity
of 4 starting at offset 79, we also do access an aggressor at offset 80 but we do trigger bit flips. This suggests that the TRR mechanism on this device deploys a counter and we need to get sampled multiple times before a TRR. We conclude from this that there is a sweet spot up to which we can increase the intensity to induce more bit flips.

Observation (O3). Up to a specific point (sweet spot), increasing the hammering intensity leads to more bit flips.

Fig. 7: Regularity experiment. Examples of tested patterns with different intensity and number of random accesses: a pattern smaller than (a) and of equal length to (b) a refresh interval, and (c) one covering two refresh intervals. Opaque regions show the pattern’s repetition during execution. (Panels: a) intensity = 17, #random accesses = 40; b) intensity = 21, #random accesses = 54; c) intensity = 42, #random accesses = 110. Horizontal axis: time, with the REFRESH interval and the pattern’s length marked.)

Fig. 8: Hammering duration. Observed bit flips on B2 for patterns up to three REFRESH intervals, a varying number of random accesses and aggressor hammering intensities. Choosing an offset of 138 with intensity of 41 triggers the most (56) bit flips. We omit intensities without any bit flips. (Axes: random accesses in between aggressor accesses, 0–220; bit flips. Legend: aggressor hammering intensity.)

allows us to investigate how access intensity and regularity affect a pattern’s effectiveness.

Figure 8 shows the experiment’s result for intensities where we observed bit flips. As the number of observed bit flips decreased if we issued more than 200 random accesses in-between, we focus here on two refresh intervals only. In contrast to the earlier observation on A10 (Section III-B), the DIMM B2 considered here requires a higher intensity (≥ 6) to trigger any bit flips due to its different TRR implementation. The plot shows notable differences in the number of bit flips for specific pattern lengths. Interestingly, there are cases where we hammered almost the whole refresh interval (≈ 85 accesses) without being captured by the mitigation. For example, with hammering intensity of 41 and offset of 138, we first issue 41 × 2 aggs. = 82 aggressor accesses (i.e., almost a whole REFRESH interval), followed by 138 random accesses.

We conclude with two points from these findings. For an aggressor pair to successfully trigger bit flips, (1) it should not be hammered in certain (long) periods, and (2) when it is hammered, it should be with high intensity, even up to a whole refresh interval. These results naturally imply that we need to consider patterns larger than a single refresh interval.
These two properties of effective non-uniform patterns allow Observation (O4). Hammering aggressors with a high
us to reduce the search space because the pattern’s length intensity at specific points inside multiple refresh intervals
of one refresh interval implicitly limits possible offsets and allows us to bypass the mitigation more effectively.
hammering intensities for our aggressors. However, running
the same experiment on B2 , required a significantly higher
D. How can we generate new patterns based on these insights?
hammering intensity to trigger bit flips. We tried intensities up
to a whole refresh interval and could trigger only 5 bit flips In this section, we showed that non-uniformity allows finding
with an intensity of 19. Not to risk limiting our search space effective patterns where previous approaches failed (O1) and
by too much, we will also explore whether larger patterns can that it is crucial to carefully choose when, within the pattern,
be more effective in bypassing certain TRR variants, such as to issue memory accesses to the aggressors (O2). We further
the one employed in B2 . discovered that the number of successive hammering repetitions
can increase the number of bit flips (O3) and that long patterns,
C. Should our patterns be longer than one refresh interval? covering multiple refresh intervals, are necessary to discover
To answer the question of the pattern’s length, we design patterns triggering bit flips on certain DIMMs (O4).
the experiment presented in Figure 7. We first hammer two We leveraged these four observations to design and im-
randomly picked double-sided aggressors with a given intensity plement Blacksmith, a new blackbox Rowhammer fuzzer.
and then issue a varying number of alternating accesses to two Blacksmith generates patterns consisting of aggressors that are
randomly picked rows. In our experiment, we cover intensities placed in the pattern using concepts from the frequency domain,
from 1 up to 64 and between 1 and 384 random accesses such as phase, amplitude, and frequency. This enables us to
because they result in patterns of up to 64 × 2 + 2 × 192 = distinct aggressors by expressing when we access them (phase),
512 accesses, which covers five full refresh intervals. Again, how often we repeat accessing them successively (amplitude),
we repeat the experiment for each combination ten times on and how their accesses are distributed over time (frequency).
different DRAM locations and check the rows around the By fuzzing these properties, we can compose patterns that
double-sided aggressors for bit flips. Unlike before, we do not stress TRR mitigations to trigger bit flips successfully. Our
synchronize with the REFRESH anymore since our patterns approach finds parameters efficiently by probing multiple
now grow beyond a single refresh interval. This approach {phase, amplitude, frequency} sets for different aggressors
in a single pattern. This eliminates the need to explicitly select aggressors given that now the entire pattern is comprised of potential aggressors, some fooling the mitigations while the others effectively hammer. To the best of our knowledge, Blacksmith is the first fuzzer that uses this novel strategy for generating non-uniform Rowhammer patterns.

IV. BLACKSMITH
We now describe the design and implementation of Blacksmith. We first give a high-level overview of Blacksmith’s architecture (Section IV-A), followed by describing how Blacksmith generates Rowhammer patterns, including a formalization of the underlying concepts (Section IV-B). After that, we introduce Blacksmith’s parameter-tracking mode that uses bit flips as a feedback mechanism to learn parameters of effective patterns (Section IV-C). Finally, we provide selected implementation details (Section IV-D).

A. High-Level Overview
Figure 9 depicts Blacksmith’s components. The Pattern Generator (1) implements our non-uniform access patterns and randomizes the temporal aspects of the aggressors inside the pattern (i.e., when within a pattern, for how long successively, and how often aggressors are accessed). The Aggressor Mapper (2) maps aggressors to DRAM locations, i.e., assigns each aggressor of a temporal pattern to a DRAM address by using known bank/rank address functions [11], [23]. In this step, aggressors can either be distributed equidistantly over the same DRAM bank (i.e., same number of rows in between) or randomly placed with one row in between aggressors that target the same victim. These mapping parameters are also randomized during fuzzing. The mapper then derives the virtual addresses corresponding to all hammered rows and passes them to the Code Generator (3) to just-in-time (JIT) compile the hammer instructions into an executable code page. For the same reason as in Section III-A, we compile access patterns to avoid conditionals (e.g., if-else) during pattern execution as branches can be executed speculatively, resulting in unwanted memory accesses, and thus “break” our pattern’s access order. Also, it allows us to determine where we need to serialize memory reads and flushes using fences. We follow a flush-early and fence-late strategy by flushing aggressors from the cache immediately after accessing them and fencing immediately before accessing them again to minimize the performance impact of serialization. The Executor (4) then runs the compiled code page to execute the pattern for multiple refresh windows (i.e., multiple 64 ms). To ensure that we keep accessing rows with their defined frequency, we synchronize accesses with the REFRESH at the beginning of each pattern’s repetition (similar to [11]). Finally, the Memory Scanner (5) verifies if the random data pattern, written before to memory, changed during hammering. Because all of a pattern’s aggressors can potentially trigger bit flips, the Memory Scanner checks two rows around each of them for flipped bits; and if it finds any flips, it reports them and restores the original data pattern. We then either (i) hammer the same pattern with the same mapping again on a different DRAM location ((3)–(5)), (ii) hammer the same pattern with a new mapping ((2)–(5)), or (iii) generate a new pattern and repeat the whole procedure ((1)–(5)). Probing different locations is required because we could have been unlucky and tried a pattern on a memory region where cells are less vulnerable, thus resulting in no bit flips. The Parameter Manager and the DRAM Inspector are two supporting components. The Parameter Manager defines fuzzing parameters, their valid value ranges, and samples values from these ranges. The DRAM Inspector loads the proper DRAM address functions (derived from a DIMM’s number of ranks as all our evaluation systems are equal) and determines required DIMM-specific information, such as the number of possible ACTIVATEs in a refresh interval.

Fig. 9: Blacksmith’s architecture. Overview of Blacksmith’s main components, their interaction, and execution order ((1)–(5)). (Diagram components: Parameter Manager, DRAM Inspector, Pattern Generator (1), Aggressor Mapper (2), Code Generator (3), Executor (4), Memory Scanner (5).)

B. Frequency-Based Patterns
Blacksmith crafts access patterns by considering their two dimensions separately: the temporal dimension, which describes when we access a row, and the spatial dimension, which defines where in memory we hammer (i.e., bank and row). Our non-uniform access patterns focus on the temporal dimension discussed next. We consider the spatial dimension by testing a crafted frequency-based pattern on three different (randomly chosen) memory locations as the vulnerability of different DRAM cells may vary [15].

Capturing temporal properties. We use a terminology inspired by the frequency domain as composing signals with different frequencies can be used as an analogy to crafting a Rowhammer access pattern with aggressors of different frequencies.
First, we generalize the idea of aggressors by defining the notion of an aggressor tuple Ak = (a1, a2, ..., ak), i.e., an ordered access sequence of k aggressors. Our pattern’s aggressors are not associated with specific DRAM locations but we map them later to specific DRAM rows. For example, in the case of A2, we could map them like a double-sided aggressor pair. Multiple such aggressor tuples fill up a Blacksmith access pattern to improve the fuzzer’s efficiency while exploring the parameter space.
Each aggressor tuple Ak has three characteristics: a frequency, a phase, and an amplitude. The frequency f defines how often the aggressor tuple Ak is accessed within a pattern. The phase ϕ defines when from the start of the pattern a specific aggressor tuple Ak will be executed. The amplitude û
describes for how long we should hammer a specific tuple, i.e., the number of consecutive hammering repetitions of Ak.

Building a pattern. Blacksmith combines multiple aggressor tuples Ak to form an access pattern. For intertwining these Ak effectively, we define a global parameter that aids the construction: the base period. The base period T defines (and limits) the frequency of an aggressor tuple.
We depict the pattern creation in Figure 10. Before starting to fuzz, assume we determined that we can issue 64 accesses in a refresh interval, and we want our pattern to cover four refresh intervals (i.e., 4 × 64 = 256 accesses). As a result, we can choose any of {2, 4, ..., 256} as the base period. Let us pick 8 so that the frequency f of any aggressor tuple is now a multiple (or divisor) of T = 8. For instance, if f = 1 we execute the aggressor tuple once every base period, while if f = 2/8 we execute it every 4 (= 8/2) base periods. In Figure 10 we fill the pattern with an aggressor tuple A2 with (f = 1/2, ϕ = 2, û = 2) meaning that A2 is executed every two base periods (f = 1/2), it is displaced by 2 from the start of the pattern (ϕ = 2), and the aggressor tuple is always hammered two times sequentially (û = 2).

Fig. 10: Parameters of pattern generation. Example showing an aggressor tuple A2 = (a1, a2) with (f, ϕ, û) = (1/2, 2, 2) and T = 8. (Diagram: four base periods; the sequence a1 a2 a1 a2 starts at offset 2 of the 1st and of the 3rd period.)

Once a tuple is inserted, other aggressor tuples are inserted following the same logic, avoiding access slots that are already occupied by previously declared aggressor tuples. For instance, after adding A2 above, we cannot introduce another A2 with ϕ = 5 since that time slot is already filled. We refer the interested reader to Appendix E for a more detailed description of the pattern generation algorithm.
Unlike in previous work (e.g., [12], [13], [24], [25]), all accesses in our patterns can potentially trigger bit flips. That means all rows are treated as aggressors as we do not distinguish the rows that are only accessed to bypass TRR. After hammering a pattern, we can measure the distance between accessed rows and flipped rows to estimate the effective aggressors, i.e., the ones that most likely caused the bit flip. This property also implies that we need to check for bit flips around every accessed row of a pattern.

C. Parameter-Tracking Mode
To understand how Blacksmith parameters impact a pattern’s effectiveness, we implemented a parameter-tracking mode. This feature uses a pattern’s effectiveness (bit flip count) and rarity (how hard it is to find) in a feedback mechanism to learn which parameter sets are most successful. The parameter-tracking mode starts with a uniform distribution for each parameter and gradually learns, based on the aforementioned indicators, which parameter values work best for specific DIMMs. It uses the feedback to modify the parameter distributions by increasing the probabilities of parameter outcomes that were successful. Using this, we can learn what parameters and values are most important to bypass mitigations. Furthermore, it allows us to derive interesting insights, as we show in Appendix F.
We used our parameter-tracking mode to determine a golden set of parameter ranges that can find effective patterns on 37/40 DIMMs of our test pool. For three DIMMs (B2,8,9), we had to slightly increase the amplitude from (up to) one to four refresh intervals. To determine these generic parameters, we performed a 24 h run using large parameter ranges to determine the common ranges based on the discovered effective patterns. Table I shows the final ranges used in our evaluation.

Table I: Blacksmith’s parameter setup. For each pattern, we choose a number of aggressor tuples and refresh intervals (which results in the pattern’s length N). For each aggressor tuple, we pick a number of aggressors, a phase, an amplitude, and derive a frequency from the base period. The amplitude is limited by ACT_tREF, the number of possible activations in a REFRESH interval.

Parameter            Range               Sampling Unit
#Aggressor tuples    [8, 96]             Pattern
#Refresh intervals   [1, 16]             Pattern
#Aggressors          [1, 2]              Aggressor tuple
Base period          [4, N]              Aggressor tuple
Phase                [1, N]              Aggressor tuple
Amplitude            [1, ACT_tREF]       Aggressor tuple
 – for B2,8,9        [1, 4 × ACT_tREF]   Aggressor tuple

D. Implementation
Our Blacksmith fuzzer was implemented from scratch in C++11 in around 6.7 k lines of code. It uses several open-source libraries such as asmjit [26] for JIT compiling a pattern’s accesses and nlohmann/json [27] for importing and exporting JSON data (e.g., parameters) needed for analyzing and replaying effective patterns, and also for analyzing bit flips. The source code can be found on [Link]

V. EVALUATION
In this section, we evaluate the qualities of non-uniform access patterns. In Section V-A, we describe our test devices and infrastructure. After that, we present our large-scale analysis results on 40 DDR4 DIMMs in Section V-B. In Section V-C, we evaluate how our Blacksmith-generated patterns facilitate Rowhammer exploitation. For completeness, we also evaluate the effectiveness of non-uniform patterns on LPDDR4X in Section V-D. Lastly, we provide concrete examples of Blacksmith patterns in Section V-E.

A. Hardware and Fuzzer Setup
Our DDR4 DRAM test pool (Appendix B) consists of 40 DIMMs acquired in July 2020 with varying sizes, module speeds, and timings. We cover all three major DRAM vendors, abbreviated by A (20×), B (10×), and C (6×). DIMMs denoted by D (4×) do not report their DRAM vendor properly. To show that Blacksmith works in a real-world setup, we do not directly interface with DRAM devices (e.g., FPGA), but we use a traditional PC setup: ten machines equipped with an Intel i7-8700K and running Ubuntu 18.04 LTS (4.15.0). We evaluate LPDDR4X DRAM chips using an in-house, JEDEC-compliant development board that allows us to test DRAM chips from vendors A (6×), B (5×), and C (8×) while operating at 1.5GHz. Similar to previous work [12], we use a pseudorandom, non-repeating data pattern in all our evaluation runs.
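The pattern construction of Section IV-B can be illustrated with a short sketch. The code below is not Blacksmith's implementation (Appendix E describes the authors' algorithm); it is a minimal Python model, under our own assumptions, of how an aggressor tuple with a given (frequency, phase, amplitude) occupies access slots relative to a base period T. The function name place_tuple, the list-of-slots representation, the pattern length of four base periods, and the rule of rejecting a tuple on any collision are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def place_tuple(pattern, aggressors, freq, phase, amplitude, base_period):
    """Try to place one aggressor tuple into `pattern` (a list of slots).

    `freq` is relative to the base period: freq = 1/2 means the tuple is
    executed once every two base periods. Returns True on success. If any
    required slot is occupied or out of range, returns False and leaves
    `pattern` unchanged (an assumption of this sketch).
    """
    period = Fraction(base_period) / Fraction(freq)  # slots between repetitions
    if period.denominator != 1:
        raise ValueError("frequency must yield a whole number of slots")
    period = int(period)
    # One burst repeats the whole tuple `amplitude` times, e.g. a1 a2 a1 a2.
    burst = list(aggressors) * amplitude
    wanted = []
    for start in range(phase, len(pattern), period):
        for i, aggr in enumerate(burst):
            pos = start + i
            if pos >= len(pattern) or pattern[pos] is not None:
                return False  # collision or overflow: reject this tuple
            wanted.append((pos, aggr))
    for pos, aggr in wanted:  # commit only after every slot was free
        pattern[pos] = aggr
    return True


# Example of Fig. 10: A2 = (a1, a2), (f, phase, amplitude) = (1/2, 2, 2), T = 8.
pattern = [None] * 32  # four base periods (assumed length)
ok = place_tuple(pattern, ("a1", "a2"), Fraction(1, 2), 2, 2, 8)
# A second tuple starting at phase 5 hits a slot that A2 already occupies.
clash = place_tuple(pattern, ("b1", "b2"), Fraction(1, 2), 5, 2, 8)
```

In this model the first tuple fills slots 2 to 5 of the first base period and the corresponding slots of the third, and the second call is rejected, mirroring the text's remark that a slot taken by an earlier tuple cannot be reused. A fuzzer in this style would keep sampling (frequency, phase, amplitude) sets and calling such a placement routine until the pattern is full.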
B. Blacksmith Results on DDR4
We aim to evaluate the generality and effectiveness of Blacksmith by answering the question: Is our approach better at finding effective patterns on DIMMs where the state-of-the-art cannot trigger any bit flip? To answer this question, we perform a large-scale Rowhammer test and compare Blacksmith results against the data that we obtained using TRRespass [12].
We use the following evaluation methodology: (1) we run Blacksmith for 12 h on each DIMM, i.e., we generate patterns and try each on three different DRAM locations to determine if it triggers bit flips, (2) we “sweep” each effective pattern over (the same) contiguous memory region of 2 MB to determine the best pattern (i.e., most effective) based on the number of observed bit flips, (3) we “sweep” the best pattern over a contiguous memory region of 256 MB to report the best pattern’s effectiveness. By “sweeping” we refer to repeatedly moving each row of a pattern by one, hammering the pattern, and checking for flipped bits. For TRRespass, we skip step (2) and use its own definition of the best pattern based on the number of triggered bit flips during the fuzzing run. We remark that the optimality of the best pattern is relative to a fuzzing run, and it might be that there are better patterns that Blacksmith could not find within 12 hours.
Table II shows the results of our large-scale evaluation run. TRRespass found effective patterns on 15 of 40 tested DIMMs (37.5%), similar to the results from prior work (13 of 42 DIMMs, ≈ 31%) [12]. In contrast, Blacksmith found effective Rowhammer patterns on all of our 40 DIMMs (100%).
These results demonstrate Blacksmith’s effectiveness and scalability in triggering corruptions, answering our initial question positively. Blacksmith could find effective patterns that trigger, on average, 87× more bit flips than TRRespass. We show how this massive increase in the number of bit flips allows for more practical exploitation in Section V-C. Table II also suggests that while there is a trend in DRAM devices from different vendors, there are also outliers.

C. Exploitation with Non-Uniform Patterns
We discuss the consequences of these better access patterns found by Blacksmith by analyzing their effect on three existing Rowhammer exploits. For this purpose, we followed prior work [12], [25] and analyzed (i) the first Rowhammer exploit targeting page tables to gain a kernel read/write primitive [1]; (ii) the exploit from Razavi et al. [4] triggering bit flips in public RSA 2048 bit keys to allow their factorization and private key recovery; and (iii) the exploit by Gruss et al. [14] flipping bits on the [Link] library to avoid root permission checks. In our analysis, we briefly summarize each exploit; we refer to the original descriptions [1], [4], [14] for more details. We measure the number of exploitable bit flips when sweeping over a 256 MB chunk of memory and report the mean time to find them by relying on a port of the Hammertime framework [28]. We show the results for all DIMMs in Table III.
In the attack from Seaborn and Dullien [1], the aggressor triggers a bit flip on a page frame number (PFN) in a page table page, “hoping” to pivot its pointer to another (attacker-controlled) page table page. This gives an attacker read/write access to their page tables, i.e., full access to all physical memory. On a system with 16 GB memory, this makes 22 bits of every 64-bit word potentially exploitable (i.e., log2 16 GB − log2 4 kB = 34 − 12). This large number of exploitable bits makes it possible to carry out an attack even on a module that manifests very few bit flips; e.g., B3 with only 111 bit flips can be exploited in around 1 hour. The time to find an exploitable bit flip then dramatically decreases for more vulnerable modules, e.g., 22 s on average on D3. The exploit from Razavi et al. [4] gains SSH access to a co-hosted VM by flipping bits on the modulus n of an RSA-2048 public key and factoring the much easier factorable n′ (≠ n) to recover the private key. We could identify exploitable bit flips on 30 out of our 40 DIMMs (75%). Finally, Gruss et al. [14] exploit specific bit flips on code pages of the [Link] library, stored in the page cache, to gain root privileges. Their opcode flipping technique induces bit flips in cached binary files that often lead to valid opcodes with different semantics. This technique can break the password verification logic in the [Link]. Only 29/(4096 ∗ 8) bits in a 4 kB page are exploitable for this attack. Still, 15 out of our 40 DIMMs (37.5%) are susceptible to such an attack within at most 38 min 35 s (A12). These results show how non-uniform patterns largely ease exploitation. In fact, even when considering the more difficult attack (i.e., sudo [14]) we could still build an end-to-end exploit on 15 / 40 DIMMs, which is the total number of DIMMs that TRRespass could trigger bit flips on (see Table II).
Given the large number of bit flips on some devices, we would have expected to see more exploitable bit flips, e.g., in the PTE attack. We investigate this further in Section VI-A, where we show that this is due to the large variance in the number of flips in different chips from the same DIMM.

D. Blacksmith on LPDDR4X
We evaluate the impact of our non-uniform patterns on LPDDR4X memory. Due to power and die area restrictions, there are key differences compared to regular DDR DRAM that make LPDDR an interesting target for Rowhammer analysis: (i) LPDDR’s default refresh window is 32 ms, compared to 64 ms for standard DDR4; (ii) it supports dynamic temperature-based refresh changing through the MR4 Mode Register [29]; and (iii) recent devices deploy on-die ECC [15].
We applied the test methodology outlined in Section V-B to evaluate Blacksmith on 19 LPDDR4X devices. As our LPDDR4X platform is fragile, which makes it difficult to perform longer runs, we had to reduce the run time to 6 h; even then, we had to restart multiple times until we accumulated in total 6 h (this is equivalent as Blacksmith’s fuzzing is stateless). Table IV summarizes our results. We observe that Blacksmith can trigger up to two orders of magnitude more bit flips on LPDDR4X compared to DDR4 DRAM, often finding multiple bit flips in every row of every bank. This confirms previous results that indicated the lower Rowhammer tolerance of LP devices, likely a direct result of the area and power restrictions [15]. However, in contrast to DDR4, for some LPDDR4X DRAM modules from vendor B Blacksmith
Table II: Blacksmith results for DDR4 DRAM compared to TRRespass. For each DIMM, we report the number of effective patterns found (|P+|), i.e., patterns that triggered any bit flip during fuzzing; and the total number of bit flips found during fuzzing (|F_fuzz^total|). For a DIMM’s best pattern, we do a sweep over 256 MB and report the same (|F_swp^total|), plus the number of zero-to-one bit flips (|F_swp^(0→1)|). On three DIMMs, marked by †, we used an amplitude of up to 4 refresh intervals, see Table I.

Table III: Analysis of exploitation of our DRAM modules. Given the bit flips found by Blacksmith’s best pattern, we evaluate how many of these bit flips are exploitable (#Expl.) when considering three exploits. For each DIMM, we then computed the average time to find an exploitable bit flip (Time). We mark (*) values where a single measurement is available only.

Each row below lists Table II followed by Table III.
Table II columns: DIMM; Blacksmith |P+|, |F_fuzz^total|, |F_swp^total|, |F_swp^(0→1)|; TRRespass [12] |P+|, |F_fuzz^total|, |F_swp^total|, |F_swp^(0→1)|.
Table III columns: DIMM; PTE [1] #Expl., Time; RSA-2048 [4] #Expl., Time; sudo [14] #Expl., Time.
A0 47 1,061 82,183 41,471 0 – – – A0 7604 4s 210 30s 17 5m
A1 116 2,125 12,134 6,095 12 12 5 5 A1 – – 28 4m 12s – –
A2 462 106,815 134,702 68,801 715 16,054 7,404 4,563 A2 9198 6s 306 21s 13 6m 43s
A3 82 239 1,746 890 326 852 114 58 A3 73 2m 21s 3 47m 37s – –
A4 460 1,604 5,132 2,602 78 105 22 9 A4 214 33s 7 13m 16s – –
A5 42 7,771 113,190 57,655 0 – – – A5 99 1m 27s 269 34s 12 11m 41s
A6 102 17,790 98,425 49,296 4 11 4 4 A6 52 2m 12s 220 32s 9 11m 55s
A7 66 3,415 32,090 15,988 0 – – – A7 6043 6s 69 2m 5s 8 11m 11s
A8 83 11,105 92,660 46,914 0 – – – A8 64 2m 24s 184 54s 15 10m 5s
A9 349 1,176 4,889 2,461 14 844 1 1 A9 136 28s 6 9m 45s – –
A10 350 1,282 3,051 1,532 367 961 505 280 A10 216 24s 7 12m 4s – –
A11 202 632 3,171 1,630 261 479 38 25 A11 197 2m 8s 13 23m 21s – –
A12 74 13,641 43,581 22,149 0 – – – A12 6596 7s 116 55s 2 38m 35s
A13 72 9,889 59,721 30,320 0 – – – A13 4520 8s 144 49s 7 13m 44s
A14 51 9,729 64,083 32,543 1 1 4 0 A14 5172 8s 151 44s 7 14m 19s
A15 67 8,333 52,580 26,483 0 – – – A15 4567 8s 105 1m 3s 7 14m 7s
A16 372 61,493 99,552 51,029 688 5,499 1,450 983 A16 6572 6s 231 27s 13 6m 30s
A17 425 57,245 138,601 70,902 711 12,196 3,871 2,690 A17 9775 3s 324 11s 10 5m 1s
A18 126 12,689 80,601 40,876 14 14 1 1 A18 11124 5s 182 44s 23 5m 28s
A19 107 2,543 11,599 5,736 0 – – – A19 832 3s 20 1m 18s 3 6m 21s
B0 9 11 63 22 0 – – – B0 – – – – – –
B1 7 14 506 256 0 – – – B1 1 1h 44m* 1 2h 31m* – –
B2 † 9 41 15 7 7 8 5 3 B2 – – – – – –
B3 1 2 111 58 0 – – – B3 3 1h 16m – – – –
B4 101 177 1,107 577 0 – – – B4 2 1h 27m 4 34m 7s – –
B5 19 24 14 6 0 – – – B5 – – – – – –
B6 18 41 78 46 0 – – – B6 – – – – – –
B7 4 4 70 34 0 – – – B7 – – – – – –
B8 † 4 6 258 131 0 – – – B8 – – 1 26m 50s* – –
B9 † 40 86 1,223 625 0 – – – B9 3 1h 3m – – – –
C0 1 3 26 16 0 – – – C0 1 2h 8m* – – – –
C1 16 29 28 8 0 – – – C1 – – – – – –
C2 82 282 2,551 1,242 0 – – – C2 1 1h* 3 59m 39s – –
C3 6 7 636 296 0 – – – C3 – – – – – –
C4 31 57 769 385 0 – – – C4 4 59m 19s 2 2h 5m – –
C5 23 58 1,028 516 0 – – – C5 – – 1 4h 2m* – –
D0 26 250 10,646 5,329 0 – – – D0 5202 4s 23 3m 43s 4 19m 56s
D1 37 458 6,655 3,406 3 3 0 – D1 4 19m 33s 15 5m 25s – –
D2 3 16 2,030 1,008 0 – – – D2 135 40s 6 11m 41s – –
D3 41 463 6,797 3,475 8 8 1 1 D3 760 22s 32 5m 49s – –
Σ (Table II): Blacksmith |P+| = 4,133, |F_swp^total| = 1.168 M; TRRespass |P+| = 3,209, |F_swp^total| = 13,425
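The bit budgets behind the PTE and sudo columns of Table III follow from simple arithmetic, reproduced here as a sketch. It assumes, as in Section V-C, 16 GB of physical memory, 4 kB pages, and 64-bit page table entries; the helper name pfn_bits is ours, and both sizes are assumed to be powers of two.

```python
def pfn_bits(mem_bytes, page_bytes=4096):
    """Bits of a page frame number needed to address `mem_bytes` of memory.

    Computes log2(mem_bytes) - log2(page_bytes) with integer arithmetic;
    both arguments are assumed to be powers of two.
    """
    return (mem_bytes.bit_length() - 1) - (page_bytes.bit_length() - 1)


GIB = 1 << 30
pte_bits = pfn_bits(16 * GIB)      # exploitable PFN bits per 64-bit PTE
pte_fraction = pte_bits / 64       # share of each 64-bit word
sudo_fraction = 29 / (4096 * 8)    # exploitable bits in a 4 kB page [14]
```

Under these assumptions roughly a third of every page table entry is a useful target, whereas fewer than one bit in a thousand is useful for the opcode-flipping attack, which is consistent with the much larger #Expl. counts in the PTE column than in the sudo column.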

was unable to trigger any bit flip. These DRAM devices were produced recently (in 2020), likely deploying an improved mitigation scheme. To understand why Blacksmith failed to find any effective patterns on the devices B0–B2, we reverse-engineered the TRR mechanism of one of them (B0) in Section VI-C.

E. Pattern’s Complexity
We analyzed the effective patterns discovered by Blacksmith on the tested DIMMs. In Figure 11, we present three examples to show that patterns have significant differences in their parameters. Considering the complexity of these patterns, we argue that it is difficult to come up with them manually. We note that these patterns all have only one effective aggressor tuple, but we have also observed instances with more than one.
The best pattern of B2, given in Figure 11a, consists of 6 aggressor tuples that all share the same period (104), but the tuple a5,6 that caused the bit flips has a significantly higher intensity (35×). This matches what one would expect of a Rowhammer pattern: the most hammered aggressors trigger bit flips. However, this is not always the case. The effective pattern in Figure 11b from A10 consists of 9 aggressor tuples, and the aggressors a1,2 causing the bit flips are hammered with a lower intensity (22) than the pattern’s highest (35) but more often (period of 96). This agrees with the observation made in our experiments (see Section III-B), showing that hammering for too long (high intensity) might be counterproductive. Lastly,
we show the best pattern from D1 in Figure 11c. This shows how intermixing our effective aggressors with other aggressors allows us to evade TRR in this instance.

Fig. 11: Best patterns. The best patterns of DIMMs B2, A10, D1 with (frequency, phase, amplitude) for each aggressor tuple. After a pattern’s end, we show how the pattern is repeated during its execution (grey x-axis values). The aggressor tuple that triggers bit flips is depicted in red.
(a) Best pattern found on DIMM B2. a1,2: (104,0,3×); a3,4: (104,6,7×); a5,6: (104,20,35×); a7: (104,90,7×); a8: (104,97,1×); a9,10: (104,98,4×).
(b) Best pattern found on DIMM A10. a1,2: (96,0,22×); a3: (288,44,35×); a4: (288,140,35×); a5: (288,236,35×); a6,7: (96,79,6×); a8: (96,91,1×); a9,10: (96,92,1×); a11,12: (192,94,1×); a13,14: (192,190,1×).
(c) Best pattern found on DIMM D1. a1,2: (46,0,1×); a3,4: (46,2,7×); a5,6: (46,16,2×); a7,8: (92,20,5×); a9,10: (92,66,5×); a11,12: (46,30,2×); a13,14: (46,34,1×); a15,16: (46,36,3×); a17,18: (92,42,2×); a19,20: (92,88,2×).

Table IV: Blacksmith results for LPDDR4X DRAM. We report for each chip (DRAM) the no. of effective patterns found (|P+|, or max and the elapsed time to find the first 128 effective patterns), and for the best pattern, we report the total no. of observed bit flips (#Flips) and the no. of zero-to-one flips (|F_swp^(0→1)|) for a sweep over 16 MB. Additionally, we report the total capacity (GB) and refresh rate changes during the experiment; e.g., 4x→2x indicates a refresh interval of 4x tREFI (4x 3.904µs ≈ 15.6µs) during test initialization and early fuzzing, but an increasing temperature eventually resulted in a lower refresh interval of 2x tREFI (≈ 7.8µs). For C4, the refresh interval kept alternating between 2x and 4x. All DRAM devices are from 2018, except for B0–B2 from 2020 (marked with †).

DRAM   GB   |P+| (mm:ss)   #Flips     |F_swp^(0→1)|   Rate
A0     6    max (17:24)    361 K      209 K           4x
A1     6    max (13:57)    946 K      604 K           4x
A2     8    max (21:54)    993 K      572 K           4x
A3     8    max (15:04)    1.633 M    963 K           4x
A4     12   max (14:53)    844 K      531 K           4x
A5     12   max (15:05)    1.207 M    752 K           4x
B0 †   6    0              –          –               4x
B1 †   6    0              –          –               4x→2x
B2 †   6    0              –          –               4x→2x
B3     8    max (29:27)    225 K      119 K           4x→2x
B4     8    max (11:28)    1.516 M    797 K           4x→2x
C0     4    max (51:18)    140 K      78 K            4x→2x
C1     4    max (05:44)    6.560 M    3.050 M         4x
C2     6    max (05:22)    363 K      239 K           4x
C3     6    max (05:24)    12.242 M   5.092 M         4x
C4     8    max (05:11)    3.125 M    1.423 M         4x→2x/4x
C5     10   24             1,447      1,022           2x
C6     10   5              14,386     8,689           2x
C7     10   53             2,623      1,649           2x

F. Blacksmith on Devices From Another Vendor
A fourth DRAM vendor contacted us to test three of their DRAM devices against Rowhammer after the responsible disclosure. Although we have not studied these devices before, Blacksmith was able to trigger the first bit flips on them after 13m 19s, 28m 8s, and 3h 43m. This shows the strength of our scalable black-box fuzzing approach for testing DRAM devices compared to traditional reverse-engineering, which would have taken many weeks, if not months, to yield effective results.

G. Other Insights
We also investigated other properties of effective patterns like their temporal properties (Appendix F), their portability between different DIMMs (Appendix G), and the reproducibility of bit flips triggered by these patterns (Appendix H).

VI. INSIGHTS ON TRR
In this section, we investigate properties of TRR using effective patterns found by Blacksmith. In Section VI-A we start by looking into the low exploitability of some devices despite many triggered bit flips. As we wanted to understand better how TRR implementations differ across devices, we studied two characteristic properties in Section VI-B: the TRR sampler size and the TRR’s dependence on DRAM addresses. Lastly, we reverse-engineered certain aspects of TRR on B0 in Section VI-C to find out why Blacksmith could not trigger bit flips on three of our LPDDR4X devices (B0, B1, and B2) and show how it can be better configured to find effective patterns on these devices.

A. Chip Dependence
Motivated by the low number of exploitable bit flips on some devices, even though the best pattern triggered many bit flips, we started looking more into the bit flips from our fuzzing. An analysis of them revealed that on certain DRAM devices, some offsets show significantly more bit flips than others, as depicted in Figure 12. As an example, on A1 we observe that the best pattern can trigger bit flips exclusively in byte offset 6 during our sweep. Further experiments showed that using effective patterns other than the best pattern leads to bit flips on other DRAM chips but not nearly as many as when using the best pattern. Given that, we conclude that this effect is likely due to TRR rather than the chip’s underlying Rowhammer vulnerability. The analysis based on the bit flips from our fuzzing shows that this effect is present on 65% of devices in our test pool. The existence of this chip-dependent variation has been confirmed by concurrent work [30].

Fig. 12: Chip dependence. The distribution of bit flips over byte offsets (0-7) based on the DIMM’s sweep with its best pattern. (Heat map: x-axis, DIMMs A0–A19, B0–B9, C0–C5, D0–D3; y-axis, byte offset; color, bit flip intensity relative to a DIMM.)

B. TRR Sampler Size and Address Dependence
To learn more about how TRR implementations differ across DIMMs in our test pool, we use the best pattern found by Blacksmith to determine the number of rows that a sampler can track at any point in time (i.e., sampler size) and if the
[Figure 13: bar chart; x-axis: DIMM (A0–A19, B0–B9, C0–C5, D0–D3); y-axis: Sampler Size (#Aggressors); top row: Address Dependence.]
Fig. 13: TRR Sampler size & address dependence. We used effective patterns found by Blacksmith to estimate the TRR sampler size (bars) and to detect if TRR takes DRAM addresses into account (✓) or not (✗). A question mark (?) indicates an inconclusive sampler size.

[Figure 15: x-axis: Refresh Event No. (0–1056); y-axis: TRR Event? (Yes/No).]
Fig. 15: TRR distance experiment. The TRR events over 1056 refresh intervals for the device B0. The x-axis shows the refresh intervals since the first observed TRR event. We can see that roughly every 48-th refresh, no TRR is happening.
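The reduction process behind the sampler size estimate (Fig. 14) can be sketched as follows. This is a minimal illustration, not Blacksmith's implementation: `estimate_sampler_size` and `toy_oracle` are hypothetical names, and the oracle stands in for the real hardware step of hammering a pattern and checking for bit flips. The toy oracle is chosen so that the run reproduces the example of Fig. 14.

```python
from typing import Callable, List


def estimate_sampler_size(
    pattern: List[str],
    random_row: str,
    triggers_bit_flips: Callable[[List[str]], bool],
) -> List[str]:
    """Replace aggressors one by one with a random row, keeping a
    replacement only if the reduced pattern still triggers bit flips."""
    current = list(pattern)
    for i in range(len(current)):
        candidate = list(current)
        candidate[i] = random_row
        if triggers_bit_flips(candidate):
            current = candidate  # replacement accepted
        # otherwise keep the previous pattern and try the next position
    return current


def toy_oracle(pattern: List[str]) -> bool:
    # Assumption for illustration only: bit flips occur as long as
    # rows "ac" and "ad" are both still part of the pattern.
    return "ac" in pattern and "ad" in pattern


best_pattern = ["aa", "ab", "ac", "ad", "ae", "af"]
reduced = estimate_sampler_size(best_pattern, "ax", toy_oracle)
sampler_size_upper_bound = len(set(reduced))
print(reduced, sampler_size_upper_bound)
```

On real hardware the oracle would be noisy, which is why each pattern is hammered several times before a replacement is accepted or rejected.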
It.   Reduction step                                        Bit flips
1     p0: aa ab ac ad ae af  →  p1: ax ab ac ad ae af       ✔
2     p1: ax ab ac ad ae af  →  p2: ax ax ac ad ae af       ✔
3     p2: ax ax ac ad ae af  →  p3: ax ax ax ad ae af       ✗
4     p2: ax ax ac ad ae af  →  p4: ax ax ac ax ae af       ✗
5     p2: ax ax ac ad ae af  →  p5: ax ax ac ad ax af       ✔
6     p5: ax ax ac ad ax af  →  p6: ax ax ac ad ax ax       ✔
p6 ⇒ Sampler size equals 3 rows.
(aa to af: rows of the best pattern; ax: randomly picked row.)
Fig. 14: Sampler size estimation. An example showing the sampler size estimation methodology over 6 iterations (It.). The pattern p6 at the end has the minimum number of distinct rows to trigger bit flips.

[Figure 16: x-axis: Refresh Event No. (0–144); y-axis: TRR Event? (Yes/No).]
Fig. 16: Attack strategy. The refresh interval range 0–144 of Figure 15 zoomed in. We show the distance between TRR-free segments (every 48-th interval, red lines), the 6 refresh intervals where we target hammering our aggressors, and the 7 × 6 intervals where we hammer two random rows.

Algorithm 1: Experiment to determine the TRR distance.
  A2 ← PickRandomAggressorPair();
  AC ← 1.5 × DetermineRhThreshold(A2);
  DisableRefresh();
  for round ← 0 to 8192 do
      PrepareVictimRow(A2);      // restore data, refresh victim
      for i ← 0 to AC/2 do
          Hammer(A2);
      IssueRefresh();            // issues a single REFRESH
      for i ← 0 to AC/2 do
          Hammer(A2);
      CheckBitflips(A2);
  EnableRefresh();

sampler is sensitive to the DRAM address of the rows inside an effective pattern (i.e., address dependence). To increase the reliability of our experiments, we hammer each pattern ten times, each time for 5 M activations.

We estimate the sampler size using a reduction process as shown in Figure 14: we iteratively replace aggressors of the best pattern by one randomly selected row of the same bank until any further replacement would no longer trigger bit flips. The number of distinct rows at the end is an overestimation of the sampler’s size. The results in Figure 13 show that the sampler size varies across DRAM devices from just 4 up to 28 rows. We report the sampler size as inconclusive if our methodology did not lead to a reliable result.

To identify any address dependence on a given DRAM device, we replace all accessed rows that do not trigger bit flips in neighboring rows (i.e., all except the effective aggressors) by randomly selected rows of the same bank. Since these aggressors in the pattern do not contribute to bit flips, replacing them should not affect the ability to trigger bit flips. Hence, if we no longer observe bit flips, it indicates that the sampler is address-dependent. Our results in Figure 13 show that 55% of samplers in our DIMMs are address-dependent.

C. Understanding Blacksmith’s (In)Effectiveness
Our results show that Blacksmith is able to find effective patterns on all DDR4 DIMMs of our test pool (see Table II). There are, however, three LPDDR4X devices (B0, B1, and B2) where Blacksmith could not find any effective pattern (see Table IV). To better understand why, we reverse-engineer aspects of the TRR implementation on device B0. For the following experiments, we make use of our LPDDR4X-based test platform where we have control over refresh commands.

TRR distance. In the first experiment, we verify if the distance between TRRs is regularly repeating on B0. The experiment uses the fact that a TRR-triggered refresh masks bit flips. That means, if we know how often we need to hammer a location to induce a bit flip, we can determine which REFRESHes trigger TRRs. The experiment, given in Algorithm 1, works as follows: we randomly pick a double-sided aggressor pair to determine its hammer count (HC), i.e., the number of accesses needed to trigger a bit flip. The HC can be determined by disabling refreshes and repeating hammering while counting the number of activations until we observe bit flips. We then define our target activation count AC = 1.5 × HC, hammer the aggressors AC/2 times, issue a single refresh, and hammer them another AC/2 times before checking for bit flips. We repeat this experiment for one tREFW, i.e., 8192 refresh intervals, to observe the distance (in units of refresh intervals) between TRRs to our victim row.

Figure 15 shows the results for the first 1056 refresh intervals. Our data shows that on average, there is a TRR happening every 6th refresh interval; however, there are periods where TRRs happen less frequently: roughly every 48th refresh interval (red bars), one TRR event is skipped, resulting in around 12 consecutive TRR-free intervals. We conclude from this that Blacksmith, if configured properly, should be able to bypass this TRR implementation.

Building an effective pattern. Our goal is to demonstrate that we could use the TRR-free intervals to craft an effective pattern for B0. Our attack assumes that we are aligned with the proper refresh interval. Based on our previous observation
(Figure 15), we access two randomly selected rows in the six short segments of 6 intervals each, hammer our aggressors in the TRR-free segment consisting of 12 intervals, and afterward (for 6 intervals) again access the two randomly selected rows.

However, we first need to align with the right REFRESH before we start to hammer. By analyzing Figure 15 carefully, we find that there are always around 48 refresh intervals between two REFRESHes where TRRs are skipped. To make this clearer with an example, we focus on the refresh interval range 0–144 in Figure 16. Here, we can see that the two REFRESHes without TRR events are around 48 refresh intervals apart. Consequently, by shifting our pattern’s REFRESH alignment by one in every repetition, we need at most 47 tries to find the proper start interval.

We select a double-sided aggressor pair (of the same bank) to hammer during the long TRR-free segments for 6 intervals. In theory, we could hammer our double-sided aggressor pair even longer, but we determined that 6 intervals are already sufficient to trigger bit flips.

Figure 17 shows the result of our hand-crafted pattern. We can see that it took 44 tries to align with the proper REFRESH, after which we are synchronized and can trigger bit flips. By repeating this method a few times, we can see that the distance between successful offsets matches the estimated distance of 48 in Figure 15.

[Figure 17: x-axis: Refresh Interval Offset (0–200); y-axis: #Bit Flips (0–25).]
Fig. 17: Attack result of B0. Once synchronized with the proper REFRESH (after 44 tries), our manually-crafted pattern can successfully trigger bit flips every 48-th refresh interval (red line).

Effective configuration of Blacksmith for LPDDR. Our experiment in Section VI-C shows that the TRR distance is regular, which means that our frequency-based patterns should be able to bypass this TRR implementation. Comparing our insights with the fuzzing parameter ranges (see Table I) shows that the distance between where we hammer (48 intervals) is not in the range of 1 to 16 refresh intervals. Also, we need to allow hammering an aggressor tuple for at least 6 consecutive refresh intervals, whereas we currently only allow an amplitude of up to one refresh interval. This explains why Blacksmith could not find any effective patterns on this device. However, even if we consider a proper configuration, an effective pattern needs to start at the right refresh command, which may take a long time given the larger parameter space.

We assessed Blacksmith’s ability to find effective patterns on this device. We updated the parameters to consider patterns of length 36 up to 60 refresh intervals and amplitudes between 1 (i.e., access the aggressors only one time) up to 6 refresh intervals. As our current Blacksmith implementation does not consider that the specific REFRESH where we start hammering matters, we make Blacksmith refresh-aware: we disable auto-refreshes and try each pattern with 1 to 60 REFRESHes issued beforehand. While probing the refresh offset is possible in testing scenarios, an attacker without this capability will need to try the same pattern multiple times until one starts at the right refresh offset.

Results. Using the new configuration, Blacksmith could find effective patterns with the length of 48 refresh intervals after 19 min 1 s (B0), 2 h 5 min 52 s (B1), and 6 min 27 s (B2). These results show the adaptability of Blacksmith to new mitigations and the effectiveness of our approach, but also highlight the importance of a proper parameter range selection.

VII. FUTURE WORK
In this section, we discuss the impact of our new findings on future attacks and mitigations.

Improving the fuzzer’s approach. Our work shows that with blackbox fuzzing and some assumptions about a pattern’s structure, we can efficiently generate patterns bypassing TRR mitigations on a wide range of DIMMs. Although this approach is scalable and outperforms previous work [12], on certain DIMMs we could only find very few bit flips. This leaves improvements to our fuzzing strategy as an attractive direction for future research.

One possibility is tweaking the parameters of effective patterns found by Blacksmith to discover new effective patterns that can trigger more bit flips. This, however, assumes Blacksmith has already found effective patterns.

In situations where Blacksmith does not find effective patterns, reverse-engineering can provide an alternative. As adequate reverse-engineering of a DIMM is time-consuming and does not scale, an interesting approach could be to use automated reverse engineering to guide Blacksmith in a grey-box manner. As an example, reverse-engineering can provide the distance between TRRs (Section VI-C). This information, in turn, can be used by Blacksmith to significantly reduce the size of the search space.

Making TRR more secure. Blacksmith enables scalable and effective fuzzing of a given DRAM device. Since our initial disclosure, major companies have already started using Blacksmith to test their devices and evaluate the effectiveness of their mitigations. We are confident that this adoption will directly result in improved future mitigations.

The properties of effective Blacksmith patterns can also guide the design of better mitigations. Blacksmith can trigger bit flips on our DRAM devices since their TRR implementations do not accurately capture aggressor rows. In deterministic mitigations with strong security guarantees, every access needs to be considered, unlike in existing in-DRAM mitigations. Recent work [31] shows how this can be achieved with a reasonably small number of counters. Our measurements show that currently deployed mitigations keep track of significantly fewer aggressors than needed for complete protection. Probabilistic mitigations (e.g., PARA [13]) can also be used as secure in-DRAM mitigation, but recent work shows that additional refreshes have become prohibitively expensive in recent devices [15].
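To illustrate why a probabilistic mitigation such as PARA [13] does not depend on tracking aggressors, the following sketch models it in a deliberately simplified way. It assumes that every aggressor activation independently causes the victim row to be refreshed with probability p, so the chance that a victim survives N activations without any extra refresh is (1 − p)^N. The function names and the parameter values are illustrative assumptions, not taken from the paper or from any PARA implementation.

```python
import random


def para_escape_probability(p: float, activations: int) -> float:
    """Closed form: probability that none of the activations
    triggers a refresh of the victim row."""
    return (1.0 - p) ** activations


def simulate_para(p: float, activations: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    escapes = 0
    for _ in range(trials):
        refreshed = False
        for _ in range(activations):
            if rng.random() < p:  # this activation refreshes the victim
                refreshed = True
                break
        if not refreshed:
            escapes += 1
    return escapes / trials


# Illustrative values only: p = 0.001 and a 10 k activation threshold.
analytic = para_escape_probability(0.001, 10_000)
# Smaller numbers so the simulation finishes quickly.
simulated = simulate_para(0.01, 100, trials=20_000, seed=1)
print(f"analytic escape probability: {analytic:.2e}")
print(f"simulated escape probability (p=0.01, N=100): {simulated:.3f}")
```

The trade-off mentioned in the text shows up directly in this model: lowering the escape probability for a smaller Rowhammer threshold requires a larger p, and therefore more extra refreshes.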
VIII. RELATED WORK
In this section, we provide an overview of existing work on Rowhammer attacks and defenses.

Attacks. While initially considered an exotic attack vector, Rowhammer has since emerged as an effective means to build a plethora of exploits [32] on a great variety of platforms: on personal computers [1]–[3], [14], mobile platforms [6]–[8], and co-located cloud servers [4], [5], [25]. Attacks were not only demonstrated using native code [1], [4]–[7], [14], [25] but also from the restricted JavaScript sandbox running in modern browsers [2], [3], [8], [11] and even over the network [9], [10]. While TRRespass [12] showed the Rowhammer issue still affects some DDR4 systems, the patterns generated by Blacksmith expose how every DDR4 system is still vulnerable to it, even more so in the case of LPDDR4. Such results make the case for better mitigations more significant.

Typical Rowhammer attacks consist of three phases [4]: (i) memory templating, (ii) memory massaging, and (iii) exploitation. During (i) memory templating, an attacker aims to find a pattern that triggers a bit flip at an attack-dependent offset of a page (template). This is where Blacksmith comes into play and can help to find an effective pattern. Thereafter, (ii) memory massaging is used to trick the victim into mapping the target data into one of the attacker’s templates in which the attacker can trigger a bit flip during the (iii) exploitation.

Concurrent work [21] uses a new reverse engineering technique based on data retention failures for studying mitigations and crafting patterns that effectively bypass TRR. The methodology leads to very effective patterns but is time-consuming as it is not automated. Similar to our insights on mitigations (see Section VI), recent work [30] studied the Rowhammer sensitivities such as DRAM chip temperature and the Rowhammer effects of keeping aggressor rows active for a longer time. Among others, they make a similar observation regarding the different Rowhammer bit flip distributions across different DRAM chips on the same device as shown in Section VI-A.

Defenses. In the past, systems vendors have made several attempts to mitigate Rowhammer practically, such as an increased (e.g., doubled) DRAM refresh rate [33], [34] to reduce the available time to hammer. Besides being insufficient [12], [35], it also increases power consumption and lowers system performance [32]. It has long been believed that servers with integrity-protected error checking and correction (ECC) DRAM are safe against Rowhammer, until Cojocar et al. [25] showed that this is not always the case.

More recent proposals use tailored solutions against Rowhammer. One example is Intel’s proprietary MC-based implementation pseudo-TRR (pTRR) [36], which is available on selected server systems [12] and requires pTRR-compliant DIMMs. Little is known about its implementation, but it promises a negligible performance impact [37]. There have been ongoing standardization efforts for mitigations, such as in the latest generation of LPDDR (LPDDR5), where TRR is replaced by Refresh Management [38], [39], a mechanism that keeps track of activations in a bank and issues selective refreshes to highly activated rows once a threshold has been reached. However, this requires supported DRAM modules and coordination between DRAM and memory controllers [38].

There has been extensive research on novel software- and hardware-based defenses that try to implement a more effective TRR. Software-based defenses may be deployed on systems with DRAM modules that are already in production [7], [35], [40], [41]. However, they require support by the OS, do not always provide complete protection [14], [42], can waste memory [7], [35], and potentially impact performance more negatively [41]. In comparison, hardware-based solutions have a lower performance overhead [13], [31], [43]–[46], but they require hardware adoption that can take many years.

IX. CONCLUSION
Deployed in-DRAM TRR mitigations against Rowhammer estimate hammered rows and aim to prevent bit flips by issuing extra refreshes to their neighbors. Motivated by the observation that all existing Rowhammer patterns hammer their aggressors uniformly, and given that this is likely an easy case to catch by TRR, we explored the novel class of non-uniform Rowhammer access patterns by randomizing parameters in the frequency domain, obtained using a number of carefully crafted experiments. Our scalable Rowhammer fuzzer Blacksmith is capable of crafting complex non-uniform patterns that trigger bit flips on all 40 recently acquired DDR4 DIMMs, 2.6× more than state-of-the-art Rowhammer patterns. We used results obtained by Blacksmith to gain insight into the properties of effective patterns and existing mitigations. Our findings highlight an urgent need for the deployment of more principled mitigations against Rowhammer.

ACKNOWLEDGMENTS
We thank the anonymous reviewers for their valuable feedback. This research was supported by the Swiss National Science Foundation under NCCR Automation, grant agreement 51NF40 180545, and in part by the Netherlands Organisation for Scientific Research through grant NWO [Link].192.262.

REFERENCES
[1] M. Seaborn and T. Dullien, “Exploiting the DRAM Rowhammer Bug to Gain Kernel Privileges: How to cause and exploit single bit errors,” [Link] Black Hat USA, Las Vegas, NV, Aug. 2015.
[2] D. Gruss, C. Maurice, and S. Mangard, “Rowhammer.js: A Remote Software-Induced Fault Attack in JavaScript,” in DIMVA ’16. Berlin, Heidelberg: Springer-Verlag, Jul. 2016, pp. 300–321, [Link] 1007/978-3-319-40667-1 15.
[3] E. Bosman, K. Razavi, H. Bos, and C. Giuffrida, “Dedup Est Machina: Memory Deduplication as an Advanced Exploitation Vector,” in S&P ’16. San Jose, CA: IEEE, May 2016, pp. 987–1004, [Link] [Link]/document/7546546/.
[4] K. Razavi, B. Gras, C. Giuffrida, E. Bosman, B. Preneel, and H. Bos, “Flip Feng Shui: Hammering a Needle in the Software Stack,” in USENIX Security ’16, 2016, [Link] technical-sessions/presentation/razavi.
[5] Y. Xiao, X. Zhang, Y. Zhang, and R. Teodorescu, “One Bit Flips, One Cloud Flops: Cross-VM Row Hammer Attacks and Privilege Escalation,” in USENIX Security ’16, Austin, TX, Aug. 2016, pp. 19–35, [Link] sessions/presentation/xiao.
[6] V. van der Veen, Y. Fratantonio, M. Lindorfer, D. Gruss, C. Maurice, G. Vigna, H. Bos, K. Razavi, and C. Giuffrida, “Drammer: Deterministic Rowhammer Attacks on Mobile Platforms,” in CCS ’16. Vienna Austria: ACM, Oct. 2016, pp. 1675–1689, [Link] 2978406.
[7] V. van der Veen, M. Lindorfer, Y. Fratantonio, H. Padmanabha Pillai, G. Vigna, C. Kruegel, H. Bos, and K. Razavi, “GuardION: Practical Mitigation of DMA-Based Rowhammer Attacks on ARM,” in DIMVA, Jun. 2018, [Link] 2 5.
[8] P. Frigo, C. Giuffrida, H. Bos, and K. Razavi, “Grand Pwning Unit: Accelerating Microarchitectural Attacks with the GPU,” in IEEE S&P ’18, May 2018, pp. 195–210, [Link] 8418604.
[9] A. Tatar, R. K. Konoth, C. Giuffrida, H. Bos, E. Athanasopoulos, and K. Razavi, “Throwhammer: Rowhammer Attacks over the Network and Defenses,” in USENIX ATC ’18, 2018, p. 14, [Link] conference/atc18/presentation/tatar.
[10] M. Lipp, M. Schwarz, L. Raab, L. Lamster, M. T. Aga, C. Maurice, and D. Gruss, “Nethammer: Inducing Rowhammer Faults through Network Requests,” in EuroS&P Workshops ’20, Sep. 2020, pp. 710–719, https://[Link]/abstract/document/9229701/.
[11] F. de Ridder, P. Frigo, E. Vannacci, H. Bos, C. Giuffrida, and K. Razavi, “SMASH: Synchronized Many-Sided Rowhammer Attacks From JavaScript,” in USENIX Security ’21, Aug. 2021, [Link] [Link]/conference/usenixsecurity21/presentation/ridder.
[12] P. Frigo, E. Vannacci, H. Hassan, V. van der Veen, O. Mutlu, C. Giuffrida, H. Bos, and K. Razavi, “TRRespass: Exploiting the many sides of target row refresh,” in IEEE S&P ’20, 2020, pp. 747–762, [Link] [Link]/abstract/document/9152631.
[13] Y. Kim, R. Daly, J. Kim, C. Fallin, J. H. Lee, D. Lee, C. Wilkerson, K. Lai, and O. Mutlu, “Flipping Bits In Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors,” in ISCA ’14. Minneapolis, MN, USA: IEEE, Jun. 2014, pp. 361–372, http://[Link]/document/6853210/.
[14] D. Gruss, M. Lipp, M. Schwarz, D. Genkin, J. Juffinger, S. O’Connell, W. Schoechl, and Y. Yarom, “Another Flip in the Wall of Rowhammer Defenses,” in IEEE S&P ’18, May 2018, pp. 245–261, [Link] [Link]/abstract/document/8418607.
[15] J. S. Kim, M. Patel, A. G. Yaglikci, H. Hassan, R. Azizi, L. Orosa, and O. Mutlu, “Revisiting RowHammer: An Experimental Analysis of Modern DRAM Devices and Mitigation Techniques,” in ISCA ’20. Valencia, Spain: IEEE, May 2020, pp. 638–651, [Link] org/document/9138944/.
[16] Y. Kim, V. Seshadri, D. Lee, J. Liu, and O. Mutlu, “A Case for Exploiting Subarray-Level Parallelism (SALP) in DRAM,” in ISCA ’12, p. 12, [Link]
[17] “JEDEC Standard: DDR4 SDRAM (JESD79-4B),” [Link] org/sites/default/files/docs/[Link], Jun. 2017.
[18] J. Liu, B. Jaiyen, Y. Kim, C. Wilkerson, and O. Mutlu, “An Experimental Study of Data Retention Behavior in Modern DRAM Devices: Implications for Retention Time Profiling Mechanisms,” in ACM SIGARCH ’13, p. 12, [Link]
[19] W.-K. Cheng, P.-Y. Shen, and X.-L. Li, “Retention-Aware DRAM Auto-Refresh Scheme for Energy and Performance Efficiency,” Micromachines, vol. 10, no. 9, p. 590, Sep. 2019, [Link] 9/590.
[20] K. S. Bains, J. B. Halbert, C. P. Mozak, T. Z. Schoenborn, and Z. Greenfield, “Row Hammer Refresh Command,” Patent, Jun., 2012, [Link]
[21] H. Hassan, Y. C. Tugrul, J. S. Kim, V. van der Veen, K. Razavi, and O. Mutlu, “Uncovering In-DRAM RowHammer Protection Mechanisms: A New Methodology, Custom RowHammer Patterns, and Implications,” in MICRO ’21. Virtual Event Greece: ACM, Oct. 2021, pp. 1198–1213, [Link]
[22] P. Frigo, E. Vannacci, H. Hassan, O. Mutlu, C. Giuffrida, H. Bos, and K. Razavi, “TRRespass,” [Link] 2020.
[23] P. Pessl, D. Gruss, C. Maurice, M. Schwarz, and S. Mangard, “DRAMA: Exploiting DRAM Addressing for Cross-CPU Attacks,” in USENIX Security ’16, p. 18, [Link] technical-sessions/presentation/pessl.
[24] L. Cojocar, J. Kim, M. Patel, L. Tsai, S. Saroiu, A. Wolman, and O. Mutlu, “Are We Susceptible to Rowhammer? An End-to-End Methodology for Cloud Providers,” in IEEE S&P. San Francisco, CA, USA: IEEE, May 2020, pp. 712–728, [Link]
[25] L. Cojocar, K. Razavi, C. Giuffrida, and H. Bos, “Exploiting Correcting Codes: On the Effectiveness of ECC Memory Against Rowhammer Attacks,” in IEEE S&P ’19. San Francisco, CA, USA: IEEE, May 2019, pp. 55–71, [Link]
[26] P. Kobalicek, “AsmJit: A lightweight library for X86/X64 machine code generation written in C++,” [Link] 2011.
[27] N. Lohmann, “JSON for Modern C++,” [Link] 2011.
[28] A. Tatar, C. Giuffrida, H. Bos, and K. Razavi, “Defeating Software Mitigations against Rowhammer: A Surgical Precision Hammer,” in RAID ’18, Sep. 2018, [Link] 030-00470-5 3.
[29] “JESD209-4A: Low Power Double Data Rate 4 (LPDDR4),” https://[Link]/sites/default/files/docs/[Link], Aug. 2014.
[30] L. Orosa, A. G. Yaglikci, H. Luo, A. Olgun, J. Park, H. Hassan, M. Patel, J. S. Kim, and O. Mutlu, “A Deeper Look into RowHammer’s Sensitivities: Experimental Analysis of Real DRAM Chips and Implications on Future Attacks and Defenses,” in MICRO ’21. Virtual Event Greece: ACM, Oct. 2021, pp. 1182–1197, [Link] 3480069.
[31] Y. Park, W. Kwon, E. Lee, T. J. Ham, J. H. Ahn, and J. W. Lee, “Graphene: Strong yet Lightweight Row Hammer Protection,” in MICRO ’20, 2020, p. 13, [Link] org/abstract/document/9251863.
[32] O. Mutlu and J. S. Kim, “RowHammer: A Retrospective,” IEEE TCADICS ’19, vol. 39, no. 8, pp. 1555–1571, Aug. 2020, [Link] TCAD.2019.2915318.
[33] “About the security content of Mac EFI Security Update 2015-001,” [Link]
[34] “Row Hammer Privilege Escalation - Lenovo Support CH,” https://[Link]/ch/en/product security/row hammer.
[35] Z. B. Aweke, S. F. Yitbarek, R. Qiao, R. Das, M. Hicks, Y. Oren, and T. Austin, “ANVIL: Software-Based Protection Against Next-Generation Rowhammer Attacks,” in ASPLOS ’16. Atlanta, Georgia, USA: ACM Press, 2016, pp. 743–755, [Link] 2872390.
[36] M. Kaczmarski, “Thoughts on Intel® Xeon® E5-2600 v2 Product Family Performance Optimisation – component selection guidelines,” [Link] [Link], Aug. 2014.
[37] S. Mandava, B. S. Morris, S. Sah, R. M. Stevens, T. Rossin, M. W. Stefaniw, and J. H. Crawford, “Techniques for determining victim row addresses in a volatile memory,” US Patent US9 824 754B2, Nov., 2017, [Link]
[38] A. Hastings and S. Sethumadhavan, “WaC: A new doctrine for hardware security,” in ASHES ’20, ser. ASHES’20. New York, NY, USA: Association for Computing Machinery, 2020, pp. 127–136, [Link]
[39] “JEDEC Standard: LPDDR5 (JESD209-5),” [Link] default/files/docs/[Link], 2019.
[40] F. Brasser, L. Davi, D. Gens, C. Liebchen, and A.-R. Sadeghi, “CAn’t touch this: Software-Only mitigation against rowhammer attacks targeting kernel memory,” in USENIX Security ’17. Vancouver, BC: USENIX Association, Aug. 2017, pp. 117–130, [Link] usenixsecurity17/technical-sessions/presentation/brasser.
[41] R. K. Konoth, M. Oliverio, A. Tatar, D. Andriesse, H. Bos, C. Giuffrida, and K. Razavi, “ZebRAM: Comprehensive and Compatible Software Protection Against Rowhammer Attacks,” in USENIX Security ’18, p. 15, [Link]
[42] Z. Zhang, Y. Cheng, D. Liu, S. Nepal, Z. Wang, and Y. Yarom, “PThammer: Cross-User-Kernel-Boundary Rowhammer through Implicit Accesses,” in MICRO ’20, Oct. 2020, pp. 28–41, [Link] org/abstract/document/9251982.
[43] E. Lee, I. Kang, S. Lee, G. E. Suh, and J. H. Ahn, “TWiCe: Preventing row-hammering by exploiting time window counters,” in ISCA ’19. Phoenix Arizona: ACM, Jun. 2019, pp. 385–396, [Link] 10.1145/3307650.3322232.
[44] A. G. Yağlikçi, M. Patel, J. S. Kim, R. Azizi, A. Olgun, L. Orosa, H. Hassan, J. Park, K. Kanellopoulos, T. Shahroodi, S. Ghose, and O. Mutlu, “BlockHammer: Preventing RowHammer at Low Cost by
Table VI: Data of the DIMMs in our testpool. DIMMs are grouped by their vendor
Blacklisting Rapidly-Accessed DRAM Rows,” in HPCA ’21, 2021, pp. (A − D ). If a DIMM’s SPD chip does not report a manufacturing date (†), we
345–358, [Link] instead report its purchase date.
[45] M. Son, H. Park, J. Ahn, and S. Yoo, “Making DRAM Stronger Against
Row Hammering,” in DAC ’17. Austin TX USA: ACM, Jun. 2017, pp.
1–6, [Link] Date Freq. Size Organization
Module
[46] J. M. You and J. Yang, “MRLoc: Mitigating Row-Hammering based on (yy-ww) (MHz) (GB) #Ranks #Banks #Pins
memory Locality,” in DAC ’19, Jun. 2019, pp. 1–6, [Link]
[Link]/abstract/document/8806778. A0 20-03 2666 8 1 16 ×8
[47] “4Gb: X16 DDR4 SDRAM Features, EDY4016A - 256Mb A1 20-07† 2400 8 1 16 ×8
X16,” [Link]
data-sheet/dram/ddr4/4gb ddr4 dram [Link].
A2 20-06 2666 32 2 16 ×8
A3 20-10 2400 8 1 16 ×8
Table V: Synchronized n-sided patterns. Number of effective patterns (|P+ |) A4 16-51 2132 4 1 16 ×8
and bit flips (|Ftotal A5−10 20-07† 2132 8 1 16 ×8
fuzz |) found during fuzzing using n-sided patterns with REFRESH
synchronization (TRRespass + Sync.) compared to regular n-sided patterns A11 20-07† 2400 8 1 16 ×8
(TRRespass). DIMMs where we could not find any patterns with REFRESH A12−15 20-07† 2132 16 2 16 ×8
synchronization are omitted.
A16 20-23 2666 32 2 16 ×8
A17 20-08 2666 32 2 16 ×8
TRRespass + Sync. A18 20-07† 2666 8 1 16 ×8
TRRespass
DIMM (SMASH)
A19 20-16 2666 16 2 16 ×8
|P+ | |Ftotal
fuzz | |P| |Ftotal
fuzz |
B0 19-38 2400 16 2 16 ×8
A2 2,233 8,131 777 3,279 B1 20-07† 2132 8 1 16 ×8
A3 24 77 53 79
A4 5 15 5 5 B2 19-34 2400 4 1 16 ×8
A9 40 121 47 65 B3 20-05 2666 8 1 16 ×8
A10 54 165 57 72 B4 20-07 2400 8 1 16 ×8
A11 16 48 26 27 B5 19-51 2400 16 2 16 ×8
A16 25 87 310 499 B6 20-07† 2132 32 2 16 ×8
A17 1,312 4,299 593 1,574
B7 20-09 2134 8 2 16 ×8
B2 5 16 0 –
B8 20-07† 2400 4 1 16 ×8
B9 20-07† 2400 8 1 16 ×8
A PPENDIX A C0 20-07† 2132 16 2 16 ×8
S YNCHRONIZED n- SIDED H AMMERING C1−4 20-38 2400 8 1 16 ×8
C5 17-48 2400 4 1 16 ×8
Recent work [11] showed that synchronizing with the
REFRESHes while hammering facilitates bypassing Rowham- D0 20-15 2400 8 1 16 ×8
mer mitigations. To investigate whether adding synchronization D1 20-19 2400 16 2 16 ×8
to n-sided patterns is enough to find effective patterns on more D2 20-20 2400 16 2 16 ×8
D3 20-20 2400 8 1 16 ×8
devices than previous work [12] did, we added synchronization to the open-source Rowhammer fuzzer TRRespass.
In Table V, we present the results of a 30-minute run: we found effective patterns on only 9 of 40 DIMMs (22.5%) of our test pool (see Appendix B), which indicates that synchronization alone is insufficient to find effective patterns on more DIMMs. Our results show that although we do not always find more effective patterns, the effective patterns we found trigger a higher number of bit flips. This matches observations from previous work [11].

APPENDIX B
PC-DRAM TEST POOL
In Table VI, we provide an overview of the DIMMs in our test pool based on each DIMM's reported Serial Presence Detect (SPD) data. We group DIMMs by their DRAM chip vendor (e.g., A) and assign each a sequential number (e.g., A0, A1, ..., A19) to uniquely identify it.
We check whether DIMMs report being Rowhammer-safe by reading out their maximum activate count (MAC): the maximum number of ACTIVATEs that a row can resist in an interval no longer than the maximum activate window (MAW) without causing flips in neighboring rows [47]. All modules claim to be safe against Rowhammer (unlimited MAC value).

APPENDIX C
SEARCH SPACE ESTIMATION
In the following, we present a simple back-of-the-envelope calculation showing the number of possible combinations for the simplest case of a Rowhammer pattern. We assume the standard DDR4 parameters: a tREFI of 7.8125 µs and a retention time (refresh window) of 64 ms; see Section II-A for details. This gives us 64 ms / 7.8125 µs = 8192 refresh intervals, in each of which we can issue roughly 100 activations, and thus around 819,200 activations in total per refresh window.
Next, we derive the number of distinct patterns that we can build given these constraints. We assume a Rowhammer threshold of 10 k activations based on the findings in previous work [15]. This means that a total of 10 k accesses to aggressors are needed to trigger a bit flip. To simplify the calculation, we allow aggressor accesses to be intermixed with other accesses of the pattern (e.g., accesses required to bypass TRR). In the case of a double-sided aggressor pair, this translates to $\binom{819200}{10000} \approx 6.79322 \times 10^{23447}$ possible combinations to distribute these 10 k accesses.
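The estimate in Appendix C can be checked numerically. The following is a minimal sketch, not part of the original artifact: the constant names and the helper `log10_binomial` are our own, and the binomial coefficient is evaluated through the log-gamma function because the number itself has more than 23,000 digits.

```python
import math

# DDR4 parameters assumed in Appendix C, in picoseconds to stay in integers.
REFRESH_WINDOW_PS = 64_000_000_000  # 64 ms retention time (refresh window)
T_REFI_PS = 7_812_500               # 7.8125 us refresh interval (tREFI)
ACTS_PER_INTERVAL = 100             # rough number of activations per interval
HAMMER_THRESHOLD = 10_000           # assumed Rowhammer threshold (10 k)


def log10_binomial(n: int, k: int) -> float:
    """Return log10 of the binomial coefficient C(n, k) via log-gamma."""
    ln_value = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return ln_value / math.log(10)


intervals = REFRESH_WINDOW_PS // T_REFI_PS      # refresh intervals per window
total_acts = intervals * ACTS_PER_INTERVAL      # activations per window

log10_count = log10_binomial(total_acts, HAMMER_THRESHOLD)
exponent = math.floor(log10_count)
mantissa = 10 ** (log10_count - exponent)

print(f"{intervals} refresh intervals, {total_acts} activations per window")
print(f"C({total_acts}, {HAMMER_THRESHOLD}) is about {mantissa:.3f} x 10^{exponent}")
```

Working in the log domain keeps the computation in ordinary floating point; the printed mantissa and exponent should land close to the value quoted in the text.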
This shows that by considering only basic constraints, we end up with an impractically large pattern design space that cannot be explored in a reasonable time. Therefore, we need to define properties that allow us to reduce the search space while still being general enough to generate patterns that are effective on many different DIMMs. As discussed in Section III, examples of such properties are the number of aggressors to hammer, how often we hammer each of them, and over how many intervals we spread our hammering effort.

Iteration   Period: 1       2       3       4       5       6        7       8
1                   a1 a2   -       -       -       a1 a2   -        -       -
2                   a1 a2   a3 a4   -       -       a1 a2   -        -       -
3                   a1 a2   a3 a4   a5 a6   -       a1 a2   -        a5 a6   -
4                   a1 a2   a3 a4   a5 a6   a7 a8   a1 a2   -        a5 a6   -
5                   a1 a2   a3 a4   a5 a6   a7 a8   a1 a2   a9 a10   a5 a6   -
6                   a1 a2   a3 a4   a5 a6   a7 a8   a1 a2   a9 a10   a5 a6   a11 a12
Fig. 18: Pattern generation over time. Example showing the pattern generation over six iterations, where an iteration i is marked by a circled i.

Table VII: Results of our non-uniform accesses experiment. We compare common n-sided patterns (n-sided) with n-sided patterns where non-uniform accesses are injected (n-sided + Rnd.) and fully random patterns (Fully Rnd.). We report for each DIMM if any effective patterns were found (✓) or not (✗). DIMMs without any effective patterns in all three experiments are omitted for brevity.

Module   n-sided   n-sided + Rnd.   Fully Rnd.
A1       ✓         ✓                ✗
A2       ✓         ✓                ✓
A3       ✓         ✓                ✓
A4       ✓         ✓                ✓
A6       ✗         ✓                ✗
A7       ✗         ✗                ✓
A9       ✓         ✓                ✓
A10      ✓         ✓                ✓
A11      ✓         ✓                ✓
A14      ✗         ✗                ✓
A16      ✓         ✓                ✓
A17      ✓         ✓                ✓
A18      ✓         ✗                ✗
B1       ✓         ✗                ✗
B2       ✓         ✗                ✗
B9       ✗         ✗                ✓
C0       ✓         ✗                ✓
D0       ✓         ✗                ✓
D1       ✗         ✓                ✓
D3       ✗         ✗                ✓
Total    35%       27.5%            37.5%

APPENDIX D
RANDOM ACCESSES EXPERIMENT
We assess three different approaches to generate Rowhammer access patterns: n-sided patterns from previous work [12], n-sided patterns with random accesses in between, and patterns where all except aggressor accesses are fully random. The results of this experiment are presented in Table VII.

APPENDIX E
PATTERN GENERATION
In this appendix, we explain the technicalities involved in building patterns. We describe how we determine harmonic frequencies that respect pattern repetitions, we explain how matching frequencies can fill up a pattern, and finally, we present our algorithm for combining different aggressors with different parameters into a single pattern.

Harmonic Frequencies. There are constraints in the choice of an aggressor tuple's frequency. As the whole pattern is repeated during hammering, we must design it in a way that maintains frequencies over repetitions. For example, given a pattern of 4 periods (like the one in Figure 10), choosing $f = \frac{1}{3}$ for an aggressor tuple $A_2$ would lead to accessing the tuple in the first and fourth period. However, repeating the pattern leads to accessing $A_2$ again in the subsequent (5th) period, thus deviating from its defined frequency.
As a solution, we define a subset of compatible frequencies, namely harmonic frequencies. For that, we first define $F = \{\frac{1}{i} : i \in \mathbb{N}\}$ as the set of all frequencies. Then, let $N$ be the number of periods the pattern is composed of and let $P = \{2^x : x \in \mathbb{N}_0\}$ be the powers of two. Next, we determine the largest $p_i \in P$ such that $p_i$ divides $N$. Consequently, all elements $\{p_0, \ldots, p_{i-1}\} \subset P$, smaller than $p_i$, must also divide $N$. Then we define the set of harmonic frequencies as $F' = \{\frac{1}{p_0}, \frac{1}{p_1}, \ldots, \frac{1}{p_i}\}$. For example, for $N = 40$ we obtain the set of harmonic frequencies $F' = \{\frac{1}{1}, \frac{1}{2}, \frac{1}{4}, \frac{1}{8}\}$, where all $s^{-1}$ for $s \in F'$ are divisors of $N$.

Matching Frequencies. Another difficulty is that we cannot arbitrarily combine different frequencies in a pattern because this may lead to overlapping accesses. By way of illustration, consider a pattern of length 16 and period $T = 2$, as given in Figure 18. Before starting to fill up the pattern, we compute the harmonic frequencies $F' = \{\frac{1}{1}, \frac{1}{2}, \frac{1}{4}, \frac{1}{8}\}$. In iteration ①, we can choose any frequency $s \in F'$, for example, $f = \frac{1}{4}$ for the aggressor tuple $(a_1, a_2)$. For the next aggressor tuple $(a_3, a_4)$, to be placed in the second period, there are fewer options available as we cannot choose $f = 1$ anymore without overlapping with aggressors $(a_1, a_2)$. As an example, we choose $f = \frac{1}{8}$ for this tuple (see ②), resulting in two remaining compatible frequencies (i.e., $f \in \{\frac{1}{4}, \frac{1}{8}\}$) for the next aggressor tuple $(a_5, a_6)$. If we choose $f = \frac{1}{4}$, as shown in Figure 18, we end up in iteration ③ with three unfilled periods (4, 6, 8) that only support the frequency $f = \frac{1}{8}$. In iterations ④ to ⑥, we show how these accesses would be filled up by other aggressor tuples.
To automate selecting only from suitable frequencies, we proceed as follows: a random frequency $s_0 \in F'$ is picked in iteration ①. In following iterations ($i > 0$), where we add another tuple, we restrict ourselves to frequencies $F'' = \{s \in F' : s \le s_{i-1}\}$, i.e., the current frequency is the upper bound for the available frequencies in the next iteration.

Building Patterns. The next step is to combine multiple aggressor tuples in a pattern (at different phases) to increase the probability that one of the aggressor tuples can successfully bypass the mitigation. We could include only one aggressor tuple with a randomly picked set of parameters $(f, \phi, \hat{u})$ and randomize all other accesses. Our chosen approach, however, allows us to simultaneously try out different parameter sets ($\{(f_1, \phi_1, \hat{u}_1), (f_2, \phi_2, \hat{u}_2), \ldots\}$) and DRAM locations; as such, we expect it to find effective patterns more quickly.
However, combining aggressor tuples with different parameters in a pattern brings up new challenges: we need to ensure accesses do not overlap and the parameters of each aggressor tuple are respected. Additionally, we want to make sure that we exhaust but not exceed the possible number of accesses in
each period to stay synchronized with the REFRESH.
To solve this, we implemented a pattern building algorithm. It uses the fact that a pattern can be expressed as an $A \times B$ matrix, where $A$ refers to the pattern's number of periods and $B$ to the base period $T$. Each index $(a, b) \in A \times B$ refers to a single access in the pattern, which we will refer to as a slot. The algorithm fills up the free slots of an access pattern by adding an aggressor tuple to the first period (lines 3 to 8 of Alg. 2) with a randomly picked set of $(f, \phi, \hat{u})$, and then (lines 10 to 16) fills up the same phase $\phi$ in all other periods with another aggressor tuple with the same amplitude $\hat{u}$ but a second, randomly picked, compatible frequency $f$. Reusing the same amplitude for aggressor tuples in all other periods (at the same phase only) is a limitation that facilitates pattern construction. However, we believe that it does not impose a severe limitation because our patterns already consist of aggressor tuples with potentially many different amplitudes.

Alg. 2: Frequency-based pattern generation.
Input : period T, pattern's length L
Output: access pattern P
 1  F′ ← ComputeHarmonicFrequencies();
 2  P ← CreatePattern(L);
 3  for ϕ ← 0 to T do                        // fill 1st period at phase ϕ
 4      n ← PickRandomN(T − ϕ);
 5      A ← PickRandomAggressors(n);
 6      û ← PickRandomAmplitude(⌊(T − ϕ)/n⌋);
 7      f ← PickRandomFrequency(F′);
 8      FillPatternByAggressors(A, f, ϕ, û, ϕ, P);
 9      F′′ ← F′;                            // copy F′ to preserve its value
        /* fill remaining periods at offset ϕ using same values for n, û, ϕ */
10      while not every slot at ϕ in period i > 1 is filled do
11          i ← GetNextUnfilledPeriod(ϕ);
12          Φ ← (i × T) + ϕ;
13          A ← PickRandomAggressors(n);
14          F′′ ← RemoveFrequenciesLargerThan(F′′, f);
15          f ← PickRandomFrequency(F′′);
16          FillPatternByAggressors(A, f, ϕ, û, Φ, P);
17      ϕ ← n + û;                           // update iteration variable
18  return P

Pattern Example. In Figure 19, we provide a complete example of a pattern with 96 accesses and a period of 6.

Period   Accesses (in execution order)
1        a1 a2    a1 a2    a15 a16
2        a3 a4    a3 a4    a17 a18
3        a5 a6    a5 a6    a15 a16
4        a7 a8    a7 a8    a19 a20
5        a1 a2    a1 a2    a15 a16
6        a3 a4    a3 a4    a21 a22
7        a8 a9    a8 a9    a15 a16
8        a10 a11  a10 a11  a23 a24
9        a1 a2    a1 a2    a15 a16
10       a3 a4    a3 a4    a17 a18
11       a5 a6    a5 a6    a15 a16
12       a7 a8    a7 a8    a19 a20
13       a1 a2    a1 a2    a15 a16
14       a3 a4    a3 a4    a25 a26
15       a12 a13  a12 a13  a15 a16
16       a14 a15  a14 a15  a27 a28
Fig. 19: A frequency-based pattern. Example of a frequency-based pattern with $T = 6$, pattern length 96, and 16 periods. Aggressors are colored based on their frequency $f$: $\frac{1}{2}$, $\frac{1}{4}$, $\frac{1}{8}$, $\frac{1}{16}$.

APPENDIX F
TEMPORAL PROPERTIES
We extended our Blacksmith fuzzer with a parameter-tracking mode, as described in Section IV-C. Using this mode, we want to answer whether a correlation between specific parameter values and vendors exists. In Figure 20, we show for each DRAM vendor how the temporal properties converged to certain values. The properties we consider include (a) period, (b) phase, (c) amplitude, and (d) pattern length.
Looking at the frequencies (Figure 20a) shows that A tends towards a high frequency (i.e., low period). We can see for A a period of around 110 (about one refresh interval) up to ≈ 400 (about four refresh intervals). D is similar, although both have a few outliers. The phase plot (Figure 20b) shows for A and D a strong preference towards a very low phase, i.e., hammering at the beginning of a refresh interval. However, this does not contradict our earlier findings in Section III-B: Blacksmith also found effective patterns with effective aggressors at the end of a refresh interval (i.e., high offset). We can see in the amplitude plot (Figure 20c) that there is a clear preference for A to an amplitude of one, whereas B favors an amplitude in the range 18-25. Lastly, the length of effective patterns (Figure 20d) shows that there is a tendency towards shorter patterns (≤ 500 accesses) for all vendors. Three ranges clearly stand out: the peak by A with pattern length 190-220, the peak by C with 100-130 accesses, and the two instances with very long patterns (≈ 1500 by D and ≈ 1800 by B).
We can summarize that for some parameters, there is a clear preference for DIMMs of the same vendor. It is possible to use this knowledge in future work to tweak the fuzzer's parameter search space further. We discuss this more in Section VII.

[Figure 20: four histograms of the number of DIMMs per vendor (A, B, C, D).]
(a) Learned period for each DIMM, grouped by vendor. x-axis: Frequency (Period).
(b) Learned phase for each DIMM, grouped by vendor. x-axis: Phase.
(c) Learned amplitude for each DIMM, grouped by vendor. x-axis: Amplitude.
(d) Learned pattern length for each DIMM, grouped by vendor. x-axis: Pattern Length.
Fig. 20: Temporal properties. We show for each DIMM, grouped by DRAM vendor, the learned values of the temporal properties frequency, phase, amplitude, and pattern length.
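The harmonic and matching frequencies of Appendix E can be illustrated with a short sketch that replays the first iterations of Figure 18. This is our own minimal illustration under stated assumptions, not Blacksmith's implementation: all function names are hypothetical, and the wrap-around overlap check in `occupied_periods` is our reading of the requirement that frequencies be maintained when the pattern repeats.

```python
from fractions import Fraction


def harmonic_frequencies(num_periods: int) -> list[Fraction]:
    """Return F' = {1/p : p is a power of two that divides num_periods}."""
    if num_periods < 1:
        raise ValueError("a pattern needs at least one period")
    freqs, p = [], 1
    while num_periods % p == 0:
        freqs.append(Fraction(1, p))
        p *= 2
    return freqs


def occupied_periods(start: int, freq: Fraction, num_periods: int) -> set[int]:
    """1-based periods hit by a tuple first placed in `start`.

    The pattern repeats, so positions wrap around modulo num_periods.
    """
    step = freq.denominator  # harmonic frequencies have the form 1/step
    return {(start - 1 + k * step) % num_periods + 1
            for k in range(num_periods // step)}


def non_overlapping(freqs, start, taken, num_periods):
    """Frequencies a new tuple at `start` can use without hitting `taken`."""
    return [f for f in freqs
            if not occupied_periods(start, f, num_periods) & taken]


def upper_bounded(freqs, previous):
    """The automated rule: F'' = {s in F' : s <= previous}."""
    return [f for f in freqs if f <= previous]


# Replay Figure 18: pattern length 16 with T = 2, i.e., 8 periods.
N = 8
F = harmonic_frequencies(N)
taken = occupied_periods(1, Fraction(1, 4), N)   # (a1, a2) with f = 1/4
print(non_overlapping(F, 2, taken, N))           # options for (a3, a4)
taken |= occupied_periods(2, Fraction(1, 8), N)  # (a3, a4) with f = 1/8
print(non_overlapping(F, 3, taken, N))           # options for (a5, a6)
print(upper_bounded(F, Fraction(1, 4)))          # rule applied after f = 1/4
```

Using `Fraction` keeps the comparison `s <= previous` exact, which avoids floating-point rounding when frequencies such as 1/3 are involved.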
APPENDIX G
PORTABILITY OF BLACKSMITH PATTERNS
Our data analysis raised the question of whether effective patterns are portable, i.e., can be transferred between DIMMs. Because a pattern inherently encodes information to bypass the mitigation, a pattern working on different DIMMs would suggest that their deployed mitigations work similarly. From an attacker's perspective, portability is of interest as it allows performing templating on another machine (offline) and later, during the attack, using the golden patterns found offline on the victim's host. This can drastically reduce the attack execution time as templating is the most time-consuming step.
We aggregated the best patterns from all DIMMs and performed a sweep with each of these patterns on each module over 8 MB of contiguous memory. We report the results of this experiment in a heatmap in Figure 21. The plot shows that the effective patterns from A are portable: for 17 of 20 DIMMs we could even find a better pattern by taking an effective pattern that we found previously on another DIMM. Given that Blacksmith performs a randomized search, some executions likely do not find the best possible access patterns. This explains why patterns discovered on certain DIMMs trigger more bit flips than on others. We observed that effective patterns from vendor A are more efficient on D1 and D3 than the best one we found on these DIMMs. Based on that, we believe that these DIMMs, for which we cannot tell the DRAM chip vendor, have chips from vendor A. The DIMMs from vendors B and C generally show low portability. This could be because mitigations use DIMM-specific properties that are not adjusted when porting a pattern from another DIMM.

[Figure 21: heatmap. x-axis: Best Pattern from DIMM; y-axis: Applied on DIMM; both axes grouped by Manufacturer A, B, C, D. Color scale: × Bit Flips Relative to a DIMM's Best Pattern (0.0, 0.1, 0.9, 1.1, >1.1).]
Fig. 21: Portability results. We run each DIMM's best pattern (x-axis) on every other DIMM (y-axis) and report the factor of more observed bit flips compared to the DIMM's best pattern (e.g., 3× for 3 times more bit flips).

APPENDIX H
BIT FLIP REPRODUCIBILITY
We investigated the reproducibility of bit flips for exploitation. For this, we considered the best pattern of each DIMM and tried to retrigger bit flips ten times while measuring the number of trials needed and the elapsed time. To limit the total time of our experiment, we limited the maximum number of trials in each round to 1000, i.e., in total 10 × 1000 trials. As the target DRAM location in the experiment, we use the location where the best pattern triggered bit flips during fuzzing.
While validating our experiment, we observed that the starting time of hammering plays a crucial role. In some cases, our data suggested that our pattern was only effective in bypassing TRR because we started executing it at the right REFRESH. Hence, we improve the chance of reproducing a bit flip by waiting between 0 ms and 1 ms in between retries. We argue that this is negligible as it adds at most 1 s (for 1000 repetitions) to the total time needed to retrigger a bit flip. We do not try to optimize for the optimal REFRESH at which we start hammering during fuzzing as there are too many possibilities. We think it is more efficient to try out more different patterns considering the limited fuzzing time.

[Figure 22: per-DIMM plot of Avg. time-to-flip (s) and Avg. number of tries.]
Fig. 22: Bit flip reproducibility. The average time-to-flip (in seconds) and the average number of hammering repetitions needed (limit: 1000) to retrigger bit flips with a DIMM's best pattern. We omit DIMMs where we could not retrigger bit flips successfully (A3, A10, B0, C2, C3).

Figure 22 presents the results of our measurements. We can see that for DIMMs of A a very small number of repetitions (1-2) is needed to retrigger a bit flip successfully. Other DIMMs (e.g., A2, A11, B4-B7), in particular those of vendor B, require many more repetitions (e.g., up to 198 for B2) until we succeed. However, there are 5 out of 40 DIMMs (A3, A10, B0, C2, C3) where we could not retrigger any bit flips over all repetitions. On B0 we succeeded by increasing the number of hammering repetitions to 10 (i.e., we hammer longer). We think that the reason for non-reproducibility on the remaining four DIMMs is that they require special conditions to retrigger bit flips (e.g., proper REFRESH alignment), which are hard to reproduce.
For the DIMMs where we could retrigger bit flips, their reproducibility allows practical exploitation. Assuming a bit flip in an exploitable page offset, since retriggering of the bit flip happens after the memory massaging step in all presented attacks, and given that the retriggering is successful on 90% of our DIMMs, the only impact on the end-to-end attack time is an increase by the average time-to-flip as shown in Figure 22.
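The retry procedure of Appendix H (up to 1000 trials per round, with a random wait of 0 ms to 1 ms between retries) can be sketched as follows. This is a hedged illustration only: `hammer_once` is a hypothetical callback standing in for one execution of a DIMM's best pattern, and the function and constant names are our own rather than Blacksmith's.

```python
import random
import time
from typing import Callable, Optional, Tuple

MAX_TRIALS = 1000   # per-round limit stated in Appendix H
MAX_WAIT_S = 0.001  # upper end of the 0 ms to 1 ms wait between retries


def retrigger(hammer_once: Callable[[], bool],
              max_trials: int = MAX_TRIALS,
              max_wait_s: float = MAX_WAIT_S,
              sleep: Callable[[float], None] = time.sleep
              ) -> Tuple[Optional[int], float]:
    """Hammer until a flip is observed.

    Returns (number of trials needed, elapsed seconds); the trial count
    is None if no flip was observed within max_trials.
    """
    start = time.monotonic()
    for trial in range(1, max_trials + 1):
        if hammer_once():  # hypothetical: True if a bit flip was observed
            return trial, time.monotonic() - start
        # Random delay so that the next attempt starts at a different
        # offset relative to the REFRESH commands.
        sleep(random.uniform(0.0, max_wait_s))
    return None, time.monotonic() - start


# Upper bound on the delay added by waiting: 1000 retries x 1 ms.
worst_case_wait_s = MAX_TRIALS * MAX_WAIT_S
```

Passing `sleep` as a parameter is a design choice for testability: a stub can replace `time.sleep` so that the loop logic can be exercised without real delays or DRAM access.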
