See discussions, stats, and author profiles for this publication at: [Link]
net/publication/323243320

Foundations of Cybersecurity, Volume I: An Applied Introduction to Cryptography

Preprint · April 2020

Author: Amir Herzberg, University of Connecticut

All content following this page was uploaded by Amir Herzberg on 30 September 2020.


Foundations of Cybersecurity, Volume I:
An Applied Introduction to Cryptography

© Amir Herzberg
Comcast Professor of Security Innovations
Department of Computer Science and Engineering
University of Connecticut

September 30, 2020

The draft versions of the Foundations of Cybersecurity (both volumes), as
well as presentations (lectures) for each chapter, are available for download
from: https://[Link]/project/Foundations-of-Cyber-Security.
Comments, and especially corrections and suggestions, are appreciated; send
email to the author or use the comment mechanism in the website.

Contents

Contents ii

1 Introduction 1
1.1 About the Foundations of Cybersecurity and this Volume . . . 1
1.1.1 First volume: an applied introduction to cryptography. 2
1.1.2 Online and lecturer resources . . . . . . . . . . . . . . . 4
1.2 Brief History of Computers, Cryptology and Cyber . . . . . . . 4
1.2.1 From classical to modern Cryptology . . . . . . . . . . . 7
1.2.2 Cryptology is not just about secrecy! . . . . . . . . . . . 8
1.3 Cybersecurity Goals and Attack Models . . . . . . . . . . . . . 10
1.4 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2 Encryption and Pseudo-Randomness 13


2.1 From Ancient Ciphers to Kerckhoffs’ Principle . . . . . . . . . 15
2.1.1 Ancient Keyless Monoalphabetic Ciphers . . . . . . . . 16
2.1.2 Shift Cipher: a Keyed Variant of the Caesar Cipher . . 19
2.1.3 Mono-alphabetic Substitution Ciphers . . . . . . . . . . 20
2.1.4 Kerckhoffs’ known-design principle . . . . . . . . . . . . 22
2.2 Cryptanalysis Attack Models: CTO, KPA, CPA and CCA . . . 24
2.3 Generic attacks and Effective Key-Length . . . . . . . . . . . . 25
2.3.1 Exhaustive search . . . . . . . . . . . . . . . . . . . . . 25
2.3.2 Table look-up and time-memory tradeoff attacks: a Generic
CPA attack . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3.3 Effective key length . . . . . . . . . . . . . . . . . . . . 28
2.4 Unconditional security and the One Time Pad (OTP) . . . . . 29
2.4.1 OTP is a Stateful Cryptosystem / Stream Cipher . . . . 31
2.5 Pseudo-Randomness, Indistinguishability and Asymptotic Secu-
rity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.5.1 Pseudo-Random Generators and their use for Bounded
Key-length Stream Ciphers . . . . . . . . . . . . . . . . 32
2.5.2 The Turing Indistinguishability Test . . . . . . . . . . . 34
2.5.3 PRG indistinguishability test . . . . . . . . . . . . . . . 34
2.5.4 Defining security: in general and for PRG . . . . . . . . 35
2.5.5 Secure PRG Constructions . . . . . . . . . . . . . . . . 39

2.5.6 Random functions . . . . . . . . . . . . . . . . . . . . . 42
2.5.7 Pseudo-Random Functions (PRFs) . . . . . . . . . . . . 47
2.5.8 PRF: Constructions and Robust Combiners . . . . . . . 51
2.5.9 The key-separation principle and application of PRF . . 52
2.5.10 Random and Pseudo-Random Permutations . . . . . . . 53
2.6 Defining secure encryption . . . . . . . . . . . . . . . . . . . . . 56
2.6.1 Attacker model . . . . . . . . . . . . . . . . . . . . . . . 57
2.6.2 The Indistinguishability-Test for Shared-Key Cryptosys-
tems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.6.3 The Indistinguishability-Test for Public-Key Cryptosys-
tems (PKCs) . . . . . . . . . . . . . . . . . . . . . . . . 63
2.7 The Cryptographic Building Blocks Principle . . . . . . . . . . 64
2.8 Block Ciphers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
2.8.1 Constructing PRP from PRF: the Feistel Construction . 68
2.9 Secure Encryption Modes of Operation . . . . . . . . . . . . . . 70
2.9.1 The Electronic Code Book (ECB) mode . . . . . . . . . . 72
2.9.2 The Per-Block Random Mode (PBR) . . . . . . . . . . 74
2.9.3 The Output-Feedback (OFB) Mode . . . . . . . . . . . 76
2.9.4 The Cipher Feedback (CFB) Mode . . . . . . . . . . . . 79
2.9.5 The Cipher-Block Chaining (CBC) mode . . . . . . . . 80
2.9.6 Ensuring CCA Security . . . . . . . . . . . . . . . . . . 82
2.10 Case study: the (in)security of WEP . . . . . . . . . . . . . . . 82
2.10.1 CRC-then-XOR does not ensure integrity . . . . . . . . 84
2.11 Encryption: Final Words . . . . . . . . . . . . . . . . . . . . . 85
2.12 Encryption and Pseudo-Randomness: Additional exercises . . . 86

3 Authentication: Message Authentication Code (MAC) and


Signature Schemes 101
3.1 Encryption for Authentication? . . . . . . . . . . . . . . . . . . 101
3.2 Message Authentication Code (MAC) schemes . . . . . . . . . 102
3.3 MAC and Signature Schemes: definitions . . . . . . . . . . . . 103
3.3.1 Definition of Message Authentication Code (MAC) Scheme 103
3.3.2 Signature Schemes . . . . . . . . . . . . . . . . . . . . . 104
3.4 Applying MAC and Signatures Schemes . . . . . . . . . . . . . 107
3.4.1 Applying MAC functions . . . . . . . . . . . . . . . . . 107
3.4.2 Applications of Signature Schemes . . . . . . . . . . . . 109
3.5 Constructing MAC, part I: constructions from PRF . . . . . . 112
3.5.1 Every PRF is a MAC . . . . . . . . . . . . . . . . . . . 113
3.5.2 CBC-MAC: ℓn-bit MAC (and PRF) from n-bit PRF . . 114
3.5.3 Constructing Secure VIL MAC from PRF . . . . . . . . 116
3.6 Other MAC constructions . . . . . . . . . . . . . . . . . . . . . 117
3.6.1 MAC design ‘from scratch’ . . . . . . . . . . . . . . . . 117
3.6.2 Robust combiners for MAC . . . . . . . . . . . . . . . . 119
3.6.3 MAC constructions from other cryptographic mechanisms 120
3.7 Combining Authentication and Encryption . . . . . . . . . . . 120
3.7.1 Authenticated Encryption (AE) and AEAD schemes . . 120

3.7.2 EDC-then-Encrypt Schemes . . . . . . . . . . . . . . . . 122
3.7.3 Generic Authenticated Encryption Constructions . . . . 122
3.7.4 Single-Key Generic Authenticated-Encryption . . . . . . 126
3.8 Message Authentication: Additional exercises . . . . . . . . . . 128

4 Cryptographic Hash Functions 132


4.1 Introducing Crypto-hash functions, their goals and applications 132
4.1.1 Warmup: hashing for efficiency . . . . . . . . . . . . . . 133
4.1.2 Goals and requirements for crypto-hashing . . . . . . . 136
4.1.3 Applications of crypto-hash functions . . . . . . . . . . 137
4.1.4 Standard cryptographic hash functions . . . . . . . . . . 138
4.2 Collision Resistant Hash Function (CRHF) . . . . . . . . . . . 139
4.2.1 Keyless Collision Resistant Hash Function (Keyless-CRHF) 139
4.2.2 Are there Keyless CRHFs? . . . . . . . . . . . . . . . . 140
4.2.3 Keyed Collision Resistance . . . . . . . . . . . . . . . . 143
4.2.4 Birthday and exhaustive attacks on CRHFs . . . . . . . 145
4.2.5 CRHF Applications (1): File Integrity . . . . . . . . . . 146
4.2.6 CRHF Applications (2): Hash-then-Sign (HtS) . . . . . 147
4.3 Second-preimage resistance (SPR) . . . . . . . . . . . . . . . . 149
4.3.1 The Chosen-Prefix Vulnerability and its HtS exploit . . 151
4.4 One-Way Functions, aka Preimage Resistance . . . . . . . . . . 153
4.4.1 Using OWF for One-Time Passwords (OTP) and OTP-
chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
4.4.2 Using OWF for One-Time Signatures . . . . . . . . . . 155
4.5 Randomness extraction . . . . . . . . . . . . . . . . . . . . . . 156
4.6 The Random Oracle Model . . . . . . . . . . . . . . . . . . . . 159
4.6.1 HMAC and other constructions of a MAC from a Hash
function . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
4.7 The Digest-Chain and Merkle-Damgård Construction . . . . . 164
4.7.1 The Merkle-Damgård Construction of collision resistant
digest function . . . . . . . . . . . . . . . . . . . . . . . 164
4.7.2 The Extend Function and Validation of Entries and Ex-
tensions . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
4.8 The Merkle Digest Scheme . . . . . . . . . . . . . . . . . . . . . 171
4.8.1 The Merkle digest scheme: Definitions . . . . . . . . . . 172
4.8.2 Extending the sequence: Proofs of Consistency . . . . . 173
4.8.3 Merkle Digest scheme and Privacy . . . . . . . . . . . . 175
4.8.4 2lMT : the two-layered Merkle Tree construction . . . . 175
4.8.5 The Merkle tree MT construction . . . . . . . . . . . . 177
4.9 Blockchains, Proof-of-Work (PoW) and Bitcoin . . . . . . . . . 180
4.9.1 The blockchain digest scheme . . . . . . . . . . . . . . . 181
4.9.2 Controlled blockchains: permissioned and permissionless 182
4.9.3 Proof-of-Work (PoW) schemes . . . . . . . . . . . . . . 183
4.9.4 The Bitcoin Blockchain and Cryptocurrency . . . . . . . 185
4.10 Cryptographic hash functions: additional exercises . . . . . . . 186

5 Shared-Key Protocols 191
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
5.1.1 Inputs and outputs signals . . . . . . . . . . . . . . . . . 192
5.1.2 Focus: two-party shared-key handshake and session pro-
tocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
5.1.3 Adversary Model . . . . . . . . . . . . . . . . . . . . . . 193
5.1.4 Secure Session Protocols . . . . . . . . . . . . . . . . . . 194
5.2 Shared-key Entity-Authenticating Handshake Protocols . . . . 198
5.2.1 Shared-key entity-authenticating handshake protocols: sig-
nals and requirements . . . . . . . . . . . . . . . . . . . 198
5.2.2 The (insecure) SNA entity-authenticating handshake pro-
tocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
5.2.3 2PP: Three-Flows authenticating handshake protocol . 203
5.3 Session-Authenticating Handshake Protocols . . . . . . . . . . 204
5.3.1 Session-authenticating handshake: signals, requirements
and variants . . . . . . . . . . . . . . . . . . . . . . . . . 205
5.3.2 Session-authenticating 2PP . . . . . . . . . . . . . . . . 206
5.3.3 Nonce-based request-response authenticating handshake
protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
5.3.4 Two-Flows Request-Response Authenticating Handshake,
assuming Synchronized State . . . . . . . . . . . . . . . 208
5.4 Key-Setup Handshake . . . . . . . . . . . . . . . . . . . . . . . 209
5.4.1 Key-Setup Handshake: Signals and Requirements . . . . 210
5.4.2 Key-setup 2PP extension . . . . . . . . . . . . . . . . . 211
5.4.3 Key-Setup: Deriving Per-Goal Keys . . . . . . . . . . . 211
5.5 Key Distribution Protocols and GSM . . . . . . . . . . . . . . . 212
5.5.1 Case study: the GSM Key Distribution Protocol . . . . 214
5.5.2 Replay attacks on GSM . . . . . . . . . . . . . . . . . . 217
5.5.3 Cipher-agility and Downgrade Attacks . . . . . . . . . . 218
5.6 Resiliency to key exposure: forward secrecy, recover secrecy and
beyond . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
5.6.1 Forward secrecy handshake . . . . . . . . . . . . . . . . 222
5.6.2 Recover-Security Handshake Protocol . . . . . . . . . . 224
5.6.3 Stronger notions of resiliency to key exposure . . . . . . 225
5.6.4 Per-goal Keys Separation. . . . . . . . . . . . . . . . . . 227
5.6.5 Resiliency to exposures: summary . . . . . . . . . . . . 228
5.7 Shared-Key Session Protocols: Additional Exercises . . . . . . 230

6 Public Key Cryptology 235


6.1 Introduction to PKC . . . . . . . . . . . . . . . . . . . . . . . . 235
6.1.1 Public key cryptosystems . . . . . . . . . . . . . . . . . 236
6.1.2 Digital signature schemes . . . . . . . . . . . . . . . . . 236
6.1.3 Key exchange protocols . . . . . . . . . . . . . . . . . . 236
6.1.4 Advantages of Public Key Cryptography (PKC) . . . . 238
6.1.5 The price of PKC: assumptions, computation costs and
key-length . . . . . . . . . . . . . . . . . . . . . . . . . . 239

6.1.6 Hybrid Encryption . . . . . . . . . . . . . . . . . . . . . 242
6.1.7 The Factoring and Discrete Logarithm Hard Problems . 243
6.2 The DH Key Exchange Protocol . . . . . . . . . . . . . . . . . 247
6.2.1 Physical key exchange . . . . . . . . . . . . . . . . . . . 247
6.2.2 Some candidate key exchange protocol . . . . . . . . . 249
6.2.3 The Diffie-Hellman Key Exchange Protocol and Hard-
ness Assumptions . . . . . . . . . . . . . . . . . . . . . . 252
6.3 Key Derivation Functions (KDF) and the Extract-then-Expand
paradigm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
6.4 Using DH for Resiliency to Exposures: PFS and PRS . . . . . 257
6.4.1 Authenticated DH: Perfect Forward Secrecy (PFS) . . . 257
6.4.2 The Synchronous-DH-Ratchet protocol: Perfect Forward
Secrecy (PFS) and Perfect Recover Secrecy (PRS) . . . 258
6.4.3 The Asynchronous-DH-Ratchet protocol . . . . . . . . . 259
6.4.4 The Double Ratchet Key-Exchange protocol [to be com-
pleted] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
6.5 Discrete-Log based public key cryptosystems: DH and El-Gamal 262
6.5.1 The DH PKC . . . . . . . . . . . . . . . . . . . . . . . . 262
6.5.2 The El-Gamal PKC . . . . . . . . . . . . . . . . . . . . 264
6.5.3 Homomorphic encryption and re-encryption. . . . . . . 265
6.6 The RSA Public-Key Cryptosystem . . . . . . . . . . . . . . . 267
6.6.1 RSA key generation. . . . . . . . . . . . . . . . . . . . . 267
6.6.2 Textbook RSA: encryption, decryption and correctness. 268
6.6.3 The RSA assumption and security . . . . . . . . . . . . 270
6.6.4 RSA with OAEP (Optimal Asymmetric Encryption Padding) 271
6.7 Public key signature schemes . . . . . . . . . . . . . . . . . . . 272
6.7.1 RSA-based signatures . . . . . . . . . . . . . . . . . . . 274
6.7.2 Discrete-Log based signatures . . . . . . . . . . . . . . . 276
6.8 Additional Exercises . . . . . . . . . . . . . . . . . . . . . . . . 277

7 The TLS/SSL protocols for web-security and beyond 285


7.1 Introducing TLS/SSL . . . . . . . . . . . . . . . . . . . . . . . 286
7.1.1 TLS/SSL: High-level Overview . . . . . . . . . . . . . . 286
7.1.2 TLS/SSL: security goals . . . . . . . . . . . . . . . . . 287
7.1.3 SSL/TLS: Engineering goals . . . . . . . . . . . . . . . . 289
7.1.4 TLS/SSL and the TCP/IP Protocol Stack . . . . . . . . 290
7.1.5 The SSL/TLS record protocol . . . . . . . . . . . . . . . 291
7.2 The beginning: the handshake protocol of SSLv2 . . . . . . . . 292
7.2.1 SSLv2: the ‘basic’ handshake . . . . . . . . . . . . . . . 293
7.2.2 SSLv2: ID-based Session Resumption . . . . . . . . . . 295
7.2.3 SSLv2: ciphersuite negotiation and downgrade attack . 296
7.2.4 Client authentication in SSLv2 . . . . . . . . . . . . . . 298
7.3 The Handshake Protocol: from SSLv3 to TLSv1.2 . . . . . . . 298
7.3.1 SSLv3 to TLSv1.2: improved derivation of keys . . . . . 299
7.3.2 Crypto-agility, backwards compatibility and downgrade
attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . 300

7.3.3 Secure extensibility principle and TLS extensions . . . . 305
7.3.4 SSLv3 to TLSv1.2: DH-based key exchange . . . . . . . 307
7.3.5 SSLv3 to TLSv1.2: session resumption . . . . . . . . . . 309
7.3.6 Client authentication . . . . . . . . . . . . . . . . . . . . 312
7.4 State-of-Art: TLS 1.3 . . . . . . . . . . . . . . . . . . . . . . . 314
7.5 TLS/SSL: Additional Exercises . . . . . . . . . . . . . . . . . . 315

8 Public Key Infrastructure (PKI) 320


8.1 Introduction: PKI Concepts and Goals . . . . . . . . . . . . . . 320
8.1.1 Requirements from PKI schemes. . . . . . . . . . . . . . 322
8.1.2 The Web PKI . . . . . . . . . . . . . . . . . . . . . . . . 323
8.1.3 PKI failures . . . . . . . . . . . . . . . . . . . . . . . . . 324
8.2 Basic X.509 PKI Concepts . . . . . . . . . . . . . . . . . . . . . 325
8.2.1 The X.500 Global Directory Standard . . . . . . . . . . 325
8.2.2 The X.500 Distinguished Name . . . . . . . . . . . . . . 325
8.2.3 X.509 Public Key Certificates . . . . . . . . . . . . . . . 330
8.2.4 The X.509v3 Extensions Mechanism . . . . . . . . . . . 333
8.3 Certificate Validation and Standard Extensions . . . . . . . . . 335
8.3.1 Trust-Anchor-signed Certificate Validation . . . . . . . . 336
8.3.2 Standard Alternative-name Extensions . . . . . . . . . . 337
8.3.3 Standard key-usage and policy extensions . . . . . . . . 338
8.3.4 Certificate path validation . . . . . . . . . . . . . . . . . 339
8.3.5 The certificate path constraints extensions . . . . . . . . 341
8.3.6 The basic constraints extension . . . . . . . . . . . . . . 342
8.3.7 The name constraint extension . . . . . . . . . . . . . . 344
8.3.8 The policy constraints extension . . . . . . . . . . . . . 347
8.4 Certificate Revocation . . . . . . . . . . . . . . . . . . . . . . . 348
8.4.1 Certificate Revocation List (CRL) . . . . . . . . . . . . 349
8.4.2 Optimized Prefetch (‘push’) Revocation Mechanisms . . 351
8.4.3 Online Certificate Status Protocol (OCSP) . . . . . . . 352
8.4.4 OCSP Stapling and the Must-Staple Extension . . . . . 358
8.4.5 Optimized variants of OCSP . . . . . . . . . . . . . . . 364
8.5 X.509/PKIX Web-PKI CA Failures and Defenses . . . . . . . . 367
8.5.1 Weaknesses of X.509/PKIX Web-PKI . . . . . . . . . . 367
8.5.2 X.509/PKIX Defenses against Corrupt/Negligent CAs . 369
8.6 Certificate Transparency (CT) . . . . . . . . . . . . . . . . . . 371
8.6.1 CT: concepts, entities and goals . . . . . . . . . . . . . . 372
8.6.2 High level CT design with inefficient cryptographic func-
tions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
8.6.3 Efficient CT Functions Design and the Gossip Process . 378
8.7 PKI: Additional Exercises . . . . . . . . . . . . . . . . . . . . . 380

9 Usable Security and User Authentication 384


9.1 Password-based Login . . . . . . . . . . . . . . . . . . . . . . . 384
9.1.1 Hashed password file . . . . . . . . . . . . . . . . . . . . 384
9.1.2 One-time Passwords with Hash-Chain . . . . . . . . . . 384

9.2 Phishing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
9.3 Usable Defenses against Phishing . . . . . . . . . . . . . . . . . 384
9.4 Usable End-to-End Security . . . . . . . . . . . . . . . . . . . . 384
9.5 Usable Security and Authentication: Additional Exercises . . . 384

10 Review exercises and solutions 387


10.1 Review exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 387
10.2 Solutions to selected exercises . . . . . . . . . . . . . . . . . . . 388

Index 397

Bibliography 405

Acknowledgments
While this is still a work-in-progress, I realize that I should begin acknowledging the people who have helped me, to minimize the unavoidable errors of omission where I forget to thank and acknowledge help from friends and colleagues. So I am beginning now to write down these acknowledgments; please let me know if you notice any omissions or mistakes, and accept my apologies in advance.
I received a lot of help from my teaching assistants (TAs) at the University of Connecticut (UConn): Sam Markelon (2019) and Justin Furuness (2018). I am also grateful to my TAs for the similar course I taught earlier at Bar-Ilan University: Hemi Leibowitz, Haya Shulman and Yigal Yoffe.
I also received great comments, many corrections, suggestions and help from professors and colleagues who have read and/or used the text. Anna Gorbenko and Jerry Shi taught the course together in Spring 2020, and gave me an incredible amount of feedback on both text and lectures; Anna also collected an amazing number of comments and suggestions from many students - it's amazing how she gets students so engaged! Shaanan Cohney kindly sent me the LaTeX source code for Figure 2.33, taken from [45].
I am also grateful to the many students and other readers, mainly at the University of Connecticut, who have been tolerant of my innumerable mistakes and pointed many of them out to me, directly or (more commonly) via Anna. This really helps. Sara Wrotniak gave especially helpful feedback and suggestions, including some useful text (already formatted in LaTeX, too!); this was so helpful that I asked Sara to do a more in-depth review, and she helped even more. I'm very grateful.
I have no doubt that I have already forgotten to mention some important contributors; please let me know and I'll fix this error... and probably introduce another one. Oh well.
Many thanks to all of you, and further help would be greatly appreciated!

Chapter 1

Introduction

Cybersecurity refers to security aspects of systems involving communication


and computation mechanisms. Cybersecurity is a wide area, which involves
the security of many different applications and types of infrastructure.
Security is a tricky area, where intuition often misleads: attacks exploit subtle vulnerabilities which our intuition fails to consider. This makes the area challenging and interesting; here, intuition is a dangerous guide, and careful, adversarial thinking is crucial. Indeed, for such an applied, systems area, precise definitions and proofs are surprisingly important. In many areas of engineering, designs are evaluated under typical, expected scenarios and failures. When this approach is adopted to evaluate security solutions, the designers often evaluate the system under what they consider the expected adversarial attacks. However, this is a mistake: security systems should be evaluated against arbitrary adversarial strategies, as much as possible. Of course, this does not mean that we should assume an omnipotent adversary, against whom every reasonable defense would fail; but our defenses should be designed assuming only limitations on the capabilities of the adversary, not on the adversarial strategy.
In this introduction, we begin with a discussion of the ‘foundations of cybersecurity’ and of this volume, an introduction to cryptography for cybersecurity practitioners. We then present some historical perspective on computer science in general, focusing on cryptology and cybersecurity. Next, we introduce one of the basic principles of cybersecurity: the attack model and security requirements principle. Finally, we present the notations used in this textbook.

1.1 About the Foundations of Cybersecurity and this


Volume
The goal of the Foundations of Cybersecurity is to provide solid foundations for the area of applied, technical cybersecurity, accessible to students learning this topic for the first time, and possibly also helpful to experts in the area.
The textbook can be used either as a companion to a course with lectures,

or for self-study; in both cases, a set of presentations is available and may
be useful for lecturers and students. The textbook also contains a significant
number of exercises, solutions and examples, although adding even more is
clearly desirable. Lecturers may receive some more exercises and solutions by
writing to the author.
An important goal of the foundations is to try to help readers develop the
adversarial thinking that allows the cybersecurity expert to avoid and identify
vulnerabilities - even subtle ones - in practical systems. To achieve this, we
combine discussion of informal design principles and of practical systems and
their vulnerabilities, with presentation of key theoretical tools and definitions.
There is a wide variety of tools and techniques used to ensure cybersecurity, and obviously, we will not be able to cover all of them. The current plan is two volumes, each focusing on central topics. In this, first, volume, we focus on applied cryptography. Volume 2 focuses on network security [93], specifically, Internet and web security, with much of the discussion focusing on non-cryptographic aspects.
Some other important topics, which we currently do not plan to cover, include: software security (malware, secure coding, and more); privacy and anonymity; operating systems and computer security; security of cyber-physical systems and the Internet-of-Things (IoT); usable security and social engineering; machine learning security; information flow analysis; and more.

1.1.1 First volume: an applied introduction to cryptography.


Cybersecurity and cryptography are vast, fascinating fields. The first volume of the Foundations of Cybersecurity is an applied introduction to cryptography, introducing cybersecurity principles and approaches. The goal is to make the text self-contained and limited to a reasonable scope, yet provide sufficient background in cryptography for cybersecurity practitioners.
In this volume, I cover important applied and basic cryptographic schemes,
including shared and public key cryptosystems, digital signatures, pseudo-
randomness, message authentication codes, cryptographic hash functions, au-
thentication and key distribution protocols, and more. I also discuss several
applications of cryptography for other areas of cybersecurity, including public
key infrastructure, web and transport-layer security, privacy and anonymity,
and user authentication, as well as relevant topics related to secure usability
and social engineering.
There are a few reasons for dedicating the first volume to applied cryptography. One reason is that cryptography is important, ancient and mature - yet still exciting and fun to learn.

Furthermore, the study of cryptography is a great way to develop the precise and ‘adversarial’ thinking which is so important for the cybersecurity practitioner. In particular, I believe that every cybersecurity practitioner must be familiar with the precise way of defining and proving security, based on well-defined assumptions on the adversary's capabilities, rather than the outdated,

insecure practices of basing security only on intuition and/or against specific
adversary strategies.
Another reason is that cryptography is used in almost every area of cyber-
security - and definitely in network security, the focus of the next volume.
This allows us, in this volume, to introduce aspects from some of these other
areas, mainly network security, privacy, usable security and social engineering.
Indeed, in the second volume, we often take advantage of the first volume.
In fact, many of the principles we present in the first volume also apply to network security and other areas of cybersecurity.
The final reason is scheduling and prerequisites. While modern cryptography is based on both mathematics and the theory of computing, I believe that the most important and applied aspects do not require extensive, in-depth background in these (or other) areas.

My goal is to make this volume useful even to a student or reader who has not taken prior math or computer-science courses, beyond what is learned in good high-school programs. I am far from sure that I will be able to meet this goal, but I try; in particular, the text includes limited background on relevant topics such as algebra, probability, and complexity theory.
Indeed, this approach may help introduce a student to these areas of math and theory of computer science. Some people find it easier to learn abstract, theoretical topics after a taste of their practical applications and importance. This holds for me personally; as an undergrad, I studied computer engineering, with an emphasis on practical areas such as networking, systems and software engineering, and took only the mandatory theory and math classes. This changed only later, as I learned the importance of theory and math - often ‘the hard way’...
Furthermore, the use of math and theory is limited to what I consider
essential for understanding - at least for some readers and lecturers. Each
specific lecturer using this textbook, or readers using it on their own, should
decide how ‘deep’ to go, e.g., whether to study each of the limited number of
proofs provided.
I hope and expect that students interested in cryptography will follow up, possibly with one of the many excellent textbooks on cryptography; some of my favorites are [3, 79, 80, 98, 138, 159]. However, I believe that for many cybersecurity practitioners, these textbooks go too deep into cryptography and theory - and do not cover enough applied aspects.
Note that, in contrast, in volume 2 (network security), I assume that the reader has, by then, learned networking from one of the many excellent textbooks, e.g., my favorite, [112]. Indeed, one of the reasons I do not feel the need to teach network security with precise proofs and definitions is exactly the reliance on this first volume to prepare the reader in this way. On top of that, the theory of network security is much more complex - and mostly does not yet exist; it is a subject for further study.
Ultimately, my goal is to combine practice with theory, applicability with precision, and breadth with depth. Let me know how well I have achieved it - and where I can improve!

1.1.2 Online and lecturer resources
These notes have a corresponding set of PowerPoint presentations. These presentations, like the manuscript itself, are available for download from the ResearchGate website. If a presentation is unavailable or appears outdated, or if these notes appear outdated, please alert me by email or by a ResearchGate message - I may have forgotten to update them.
Lecturers are welcome to use these notes and presentations; I would appreciate it, however, if you let me know. I can also provide some additional resources to lecturers, such as (additional) exercises and solutions.
I appreciate feedback, especially corrections and other suggestions for improvements and places where clarifications are needed. Don't hesitate: if you didn't understand something, it is my fault, not yours - tell me and I will improve. Finally, I also welcome contributions, especially additional exercises and/or solutions. Thanks!

1.2 Brief History of Computers, Cryptology and Cyber


We now present a brief history of computers, cryptology and cyber, including some of the key terms such as ‘cybersecurity’.
Computers have quickly become a basic component of life, with a huge impact on society and the economy. However, their beginning was slow and modest. Babbage conceived of a computer in 1833, but the first working computer was Konrad Zuse's Z1, completed in 1938 - a full 105 years later. Furthermore, the Z1 was an unreliable, slow mechanical computer, and had no useful applications, except as a proof of feasibility. In particular, special-purpose calculating devices were far better at performing applied calculations.
Really useful computers appeared only during World War II - as a result of massive investment by the armies of both sides. The British effort was by far the more important and more successful, since they identified a very important military application for which programmable computers had a large advantage over special-purpose devices: cryptanalysis.
Cryptology, which literally means the ‘science of secrets’, has been applied to protect sensitive communication since ancient times, and is therefore much more ancient than computers. Originally, cryptology focused on protecting confidentiality, as provided by encryption; see Figure 1.1. Encryption transforms sensitive information, referred to as plaintext, into a form called ciphertext, which allows the intended recipients to decrypt it back into the plaintext; the ciphertext should not expose any information to an attacker. The focus on encryption and confidentiality is evident in the term cryptography, i.e., ‘secret writing’, which is often used as a synonym for cryptology and, for some reason, seems to be the more common term in recent years.
Cryptology, and in particular cryptanalysis, i.e., ‘breaking’ the security of
cryptographic schemes, played a key role throughout history. Possibly most
notably - and most well known - is the role cryptanalysis played during the
second world war (WWII). Both sides used multiple types of encryption devices,
where the most well known are the Enigma and Lorenz devices used by the

[Figure 1.1 diagram: Alice applies Encrypt, with the key, to the plaintext, and sends the resulting ciphertext to Bob, who applies Decrypt, with the same key, to recover the plaintext; Eve (eavesdropper) observes the channel.]

Figure 1.1: Encryption: terms and typical use. Alice needs to send sensitive information (plaintext) to Bob, so that the information will reach Bob - but remain confidential from Eve, who can eavesdrop on the communication between Alice and Bob. To do this, Alice encrypts the plaintext; the encrypted form is called ciphertext, and Eve cannot learn anything from it (except its size).
However, Bob can decrypt the ciphertext, which recovers the plaintext.

Germans; and both sides invested in cryptanalysis - with much greater successes
to the allies, luckily.
The well-known Enigma machine was used throughout the war. Early models of the Enigma machine, captured and smuggled from Poland to Britain, were used to construct complex, special-purpose devices called Bombes, which helped the cryptanalysis of Enigma traffic.
The somewhat-less-known Lorenz encryption devices were introduced later during the war, and no complete device was available to cryptanalysts. This motivated the design of Colossus, the first fully functional computer, by Tommy Flowers. Namely, Colossus was the first practical device that could be programmed for arbitrary tasks, rather than only perform a predefined set of tasks or computations.
The fact that Colossus was the first applied, programmable (or 'real') computer is important historically - but it was also of immense importance during the War. Since Colossus was programmable, it was possible to test many possible attacks and to successfully cryptanalyze (different versions of) Lorenz and other cryptosystems.
One critical difference between the Colossus and more modern computers is that the Colossus did not read a program from storage. Instead, setting up the
program for the Colossus involved manual setting of switches and jack-panel
connections. This is much less convenient, of course; but it was acceptable
for the Colossus, since there were only a few such machines and only a few,
simple programs, and the simplicity of design and manufacture was more im-
portant than making it easier to change programs. Even this crude form of
‘programming’ was incomparably easier than changing the basic functionality
of the machine, as required in special-purpose devices - including the Bombe
used to decipher the Enigma.

5
The idea of a stored program was proposed as early as 1936/1937 - by two independent efforts. Konrad Zuse briefly mentioned such a design in a patent on floating-point calculations published in 1936 [174], and Alan Turing defined and studied a formal model of stored-program computers - now referred to as the Turing machine model - as part of his seminal paper, 'On Computable Numbers' [161]. However, practical implementations of stored-program computers appeared only after WWII. Stored-program computers were much easier
to use, and allowed larger and more sophisticated programs as well as the use
of the same hardware for multiple purposes (and programs). Hence, stored-program computers quickly became the norm - to the extent that some people argue that earlier devices were not 'real computers'.
Stored-program computers also created a vast market for programs. It now
became feasible for programs to be created in one location, and then shipped
to and installed in a remote computer. For many years, this was done by
physically shipping the programs, stored in media such as tapes, discs and
others. Once computer networks became available, program distribution was often - in fact, usually - done by sending the program over the network.
Easier distribution of software meant also that the same program could be
used by many computers; indeed, today we have programs that run on billions
of computers. The ability of a program to run on many computers created an
incentive to develop more programs; and the availability of a growing number
of programs increased the demand for computers - and the impact on society
and the economy.
The impact of computers further dramatically increased when computer
networks began to facilitate inter-computer communication. The introduction
of personal computers (1977-1982), and later of the Internet, web, smartphones
and IoT devices, caused a further dramatic increase in the impact of comput-
ers and ‘smart devices’ - which are also computers, in the sense of running
programs.
This growing importance of cyberspace also resulted in growing interest in
the potential social implications, and a growing number of science-fiction works
focused on these aspects. One of these was the short story 'Burning Chrome', published by William Gibson in 1982. This story introduced the term cyberspace, to refer to these interconnected networks, computers, devices and humans.
This term is now widely used, often focusing on the impact on people and so-
ciety. The cyber part of the term is taken from cybernetics, a term introduced
in [169] for the study of communication and control systems in machines, humans
and animals.
There was also increased awareness of the risks of abuse and attacks on dif-
ferent components of cyberspace, mainly computers, software, networks, and
the data and information carried and stored in them. This awareness sig-
nificantly increased as attacks on computer systems became widespread, esp.
attacks exploiting software vulnerabilities, and/or involving malicious software,
i.e., malware. This awareness resulted in the study, research and development
of threats and corresponding security mechanisms: computer security, software
security, network security and data/information security.

6
The awareness of security risks also resulted in important works of fiction.
One of these was the 1983 short story 'Cyberpunk', by Bruce Bethke. Bethke coined this term for individuals who are socially inept yet technologically savvy.
Originally a derogatory term, cyberpunk was later adopted as a name of
a movement, with several manifestos, e.g., [106]. In these manifestos, as well
as in works of fiction, cyberpunks and hackers are still mostly presented as socially inept yet technology-savvy; indeed, they are often presented as possessing incredible abilities to penetrate systems. However, all this is often
presented in positive, even glamorous ways, e.g., as saving human society from
oppressive, corrupt regimes, governments and agencies.
Indeed, much of the success of the Internet is due to its decentralized na-
ture, and many of the important privacy tools, such as the Tor anonymous
communication system [57], are inherently decentralized - which, one may hope, helps defend against potentially malicious governments. Furthermore, some of
these tools, such as the PGP encryption suite [76], were developed in spite of
significant resistance by much of the establishment.

1.2.1 From classical to modern Cryptology


Cryptology has been studied and used for ages; we discuss some truly ancient schemes in § 2.1. In this section, we discuss a few basic facts about the long and fascinating history of cryptology; many more details are provided in several excellent books, including [102, 121, 155].
One of the basic goals of cryptography is confidentiality, i.e., to hide the meaning of sensitive information from attackers, when the information is visible to the attacker. Encryption is the topic of the following chapter; in particular, we present there Kerckhoffs' principle [104], which states that the security of encryption should always depend on the secrecy of a key, and not on the secrecy of the encryption method. Kerckhoffs' principle had a fundamental impact on cryptology, and is now universally accepted. One implication of it is that security should not rely on obscurity, i.e., on making the security or cryptographic mechanism itself secret. Note that this applies not just to cryptology, but also to other aspects of cybersecurity.
Kerckhoffs' publication [104], in 1883, marks the beginning of a revolution in cryptology. Until then, the design of cryptosystems was kept secret; Kerckhoffs realized that it is better to assume - realistically, too - that the attacker may be able to capture encryption devices and reverse-engineer them to learn the algorithm. This holds even more for modern cryptosystems, often implemented in software, or in chips accessible to potential attackers. It is also important that Kerckhoffs published his work; there were very few previously published works in cryptology, again due to the belief that cryptology is better served by secrecy.
However, even after Kerckhoffs' publication, for many years there was limited published research in cryptology, or even development, except - in secret - by intelligence and defense organizations. The design and cryptanalysis of the Enigma, discussed above, are a good example; although the cryptanalysis

work involved prominent researchers like Turing, and had significant impact on
development of computers, it was kept classified for many years.
This changed quite dramatically in the 1970s, with the beginning of what
we now call modern cryptology, which involves extensive academic research,
publication, products and standards, and has many important applications.
Two important landmarks seem to mark the beginning of modern cryp-
tology. The first is the development and publication of the Data Encryption
Standard (DES) [134]. The second is the publication of the seminal paper New
directions in cryptography [56], by Diffie and Hellman. This paper introduced
the radical, innovative concept of Public Key Cryptology (PKC), where the key
to encrypt messages may be public, and only the corresponding decryption key
must be kept private, allowing easy distribution of encryption keys. The paper
also presented the Diffie-Hellman Key Exchange protocol; we discuss both of
these in chapter 6.
In [56], Diffie and Hellman did not yet present an implementation of public
key cryptography. The first published public-key cryptosystem was RSA, by Rivest, Shamir and Adleman [144], and it is still one of the most popular public-key cryptosystems. In fact, the same design had already been discovered a few years earlier, by the British intelligence organization GCHQ. GCHQ kept this achievement secret until 1997, long after the corresponding designs were re-discovered independently and published in [144]. See [121] for more
details about the fascinating history of the discovery - and re-discovery - of
public key cryptology.
This repeated discovery of public-key cryptology illustrates the dramatic
change between ‘classical’ cryptology, studied only in secrecy and with impact
mostly limited to the intelligence and defense areas, and modern cryptology,
with extensive published research and extensive impact on society and the
economy.

1.2.2 Cryptology is not just about secrecy!


Our discussion so far focused on the use of cryptology to ensure secrecy of in-
formation, typically by encryption. However, modern cryptology is not limited
to encryption and the goal of confidentiality; it covers other threats and goals
related to information and communication. These include goals such as authenticity and integrity, which deal with detecting tampering with information by the attacker (integrity), or with detecting impersonation by the attacker (authentication).
In particular, one of the most important mechanisms of modern cryptogra-
phy, crucial to many of its applications and much of its impact, is the design
of digital signature schemes. We discuss signature schemes in chapter 6; let us
provide here a grossly simplified discussion.
Consider, first, ‘classical’, handwritten signatures. Everyone should be able
to validate a signed document by comparing the signature on it to a sample
signature, known to be of that signer; and yet, nobody but the signer should be

[Figure 1.2 diagram: Alice applies the signing algorithm Sign, using her private signing key A.s, to message m, producing σ; Bob applies the verification algorithm Ver, using Alice's public verification key A.v, to (m, σ).]

Figure 1.2: Digital signature and verification algorithms: terms and typical use. Alice signs message m by applying the signing algorithm Sign using her private signing key A.s, resulting in the signature σ = Sign_A.s(m). Alice sends the signature σ and the message m to Bob, who verifies the signature by applying the verification algorithm Ver using Alice's public verification key A.v, namely, computes Ver_A.v(m, σ). If the result is Invalid, this implies that σ is not a valid signature of m; if the result is Ok, then σ is a valid signature, i.e., m was indeed signed by Alice.

able to sign a document using her signature, even if the other person has seen
the sample signature, as well as signatures over other (even similar) documents.
Intuitively, digital signatures provide similar functionality to handwritten
signatures; but, of course, the documents and the signatures are strings (files)
rather than ink on paper, and the processes of signing and validating are done
by applying appropriate functions (or algorithms), as illustrated in Figure 1.2.
Specifically, the signing algorithm Sign uses a private signing key s known only to the signer, and the validating algorithm Ver uses a corresponding public validation key v. If σ = Sign_s(m), i.e., σ is a signature of message m using signing key s, then Ver_v(m, σ) = Ok; and for an attacker who does not know s, it should be infeasible to find any unsigned message m′ and any 'signature' σ′ s.t. Ver_v(m′, σ′) = Ok. The use of digital signatures allows computer-based signing
and validation of digital documents, which is crucial to secure communication
and transactions over the Internet.
We discuss digital signatures extensively in § 6.7, and their application to
the SSL/TLS protocol in chapter 7, and to Public Key Infrastructure (PKI) in
chapter 8.

1.3 Cybersecurity Goals and Attack Models
In a broad sense, secure systems should protect 'good' parties, who use the systems legitimately, from damage due to attackers (also known as adversaries).
Attackers could be 'bad' users, or 'outsiders' with some access to the systems or their communication. Note that attackers may often also control a (corrupted) device which they don't 'own'.
There are two basic ways of protecting against attackers, prevention and
deterrence:
Prevention is a proactive approach: we design and implement the system so that the attacker cannot cause damage (or can only cause reduced damage). Encryption is an example of a cryptographic means to prevent attacks, as it is usually used to prevent an attacker from learning sensitive information.
Deterrence is a reactive approach: we design mechanisms that will cause damage to the attacker if she causes harm, or even when we detect an attempt to cause harm. Effective deterrence requires the ability to detect the attack, to attribute the attack to the attacker, and to penalize the attacker sufficiently. Furthermore, deterrence can only be effective against a rational adversary; no penalty is guaranteed to suffice to deter an irrational adversary, e.g., a terrorist. The use of digital signatures is one
important deterrence mechanism. Signatures are used to deter attacks
in several ways; in particular, a signature verified using the attacker’s
well-known public key, over a given message, provides evidence that the
attacker signed that message. Such evidence can be used to punish or
penalize the attacker in different ways - an important deterrent. Signa-
tures may also be provided by users, as in reviews - to deter bad services
or products, to motivate the provision of good services and products, and
to allow users to choose a good service/product based on evaluations by
peer users.

Note that deterrence is only effective against a rational adversary, who would refrain from attacking when her expected profit (from the attack) is less than her expected penalty.
An obvious challenge in designing and evaluating security is that we must
‘expect the unexpected’; attackers are bound to behave in unexpected ways.
As a result, it is critical to properly define the system and to identify and
analyze any risks. In practice, deployment of security mechanisms has costs,
and risk analysis would consider these costs against the risks, taking into ac-
count probabilities and costs of different attacks and their potential damages;
however, we do not consider these aspects, and only focus on ensuring specific
security goals against specific, expected kinds of attackers.

Cybersecurity goals. Cybersecurity is often associated with three high-level goals, i.e., ensuring Confidentiality, Integrity/authenticity and Availability
(CIA). Note that integrity/authenticity and availability are separate from con-
fidentiality, and often do not involve encryption; however, they often involve
other cryptographic mechanisms, such as digital signatures, as we discussed
above (subsection 1.2.2). Furthermore, note that these three goals are very
broad, as they apply to most cybersecurity systems; when we study the secu-
rity of any given system, we should first define specific security goals for that
particular system, which will usually elaborate on these three high-level goals.
One of the fundamentals of modern cryptology, which already appears
in [56], is an attempt to understand and define a clear model of the attacker
capabilities and clear goals/requirements for the scheme/system. We believe
that not only in cryptology, but in general in security, the articulation of the
attack model and of the security requirements is fundamental to the design and
analysis of security. Indeed, we consider this the first principle of cybersecurity.
This principle applies also in areas of cybersecurity where it may not be feasi-
ble to have completely rigorous models and proofs. Yet, precise articulation of
attacker model and capabilities, as well as of the security requirements, is very
important, and helps identify and avoid vulnerabilities.
A well-articulated description of the attacker model and capabilities, and of
the security requirements and assumptions, is necessary to evaluate and ensure
security for arbitrary interactions with the adversary. The adversary is limited
in its capabilities, not in its strategy.

Principle 1 (Security Goals and Attack Model). Design and evaluation of system security should include a clear, well-defined model of the attacker capabilities (attack model) and of the exact criteria for a system, function or algorithm to be considered secure vs. vulnerable (security requirements).

1.4 Notations
Notations are essential for precise, efficient technical communication, but it can be frustrating to read text which uses unfamiliar or forgotten notations.
This may be a special challenge for readers of this text who have not been (much) exposed to the theory of computer science; furthermore, unfortunately, there are often multiple notations for the same concept, as well as multiple conflicting interpretations of the same notation. The author tried to choose the most widely used and least conflicting notations, but that involved difficult tradeoffs. For example, such conflicting usage 'forced' the author to adopt the less-widely-used symbol ++ to denote string concatenation1.
To try to help the reader to follow the notations in this text, Table 1.1
presents notations which we use extensively. Please refer to it whenever you
see some unclear notation, and let me know of any missing or incorrect notation.

1 In cryptographic literature, the symbol || is usually used as the string concatenation operator; however, this symbol is used extensively elsewhere for the logical-OR operator. I therefore use ++ for concatenation; this is an existing notation, but not one much used in cryptography.

Table 1.1: Notations used in this manuscript.

S = {a, b, c}          A set S with three elements: a, b and c. Sets are denoted by capital letters.
{x ∈ X | f(x) = 0}     The subset of elements x ∈ X s.t. f(x) = 0.
(∀x ∈ X)(f(x) > 1)     For all elements x in the set X, f(x) > 1 holds. The set X is omitted when 'obvious'.
(∃x ∈ X | f(x) > 1)    There is (exists) some x in X s.t. f(x) > 1.
Π_{x∈S} V_x            Multiplication of V_x for every x ∈ S, e.g., Π_{x∈{a,b,c}} V_x = V_a · V_b · V_c. Similar to the use of Σ_{x∈S} for addition.
C ∪ B                  Union of sets C and B.
A × B                  Cross-product of sets A and B, i.e., the set {(a, b) | a ∈ A ∧ b ∈ B}.
{a, b}^l               The set of strings of length l over the alphabet {a, b}.
{a, b}*                The set of strings of any length over the alphabet {a, b}.
++ (or ||)             Concatenation1; abc ++ def = abcdef.
a^n (and 1^n)          String consisting of n concatenations of the string a; 1^n is n concatenations of the digit 1, i.e., the number n in unary notation.
|b|                    The length of string b; hence, |a^n| = n · |a| and |0^n| = n.
a[i]                   The i-th bit of string a.
a[i...j] or a_{i...j}  The string a[i] ++ ... ++ a[j].
a^R                    The 'reverse' of string a, e.g., (abcde)^R = edcba.
x ∧ y                  Bitwise logical AND; 0111 ∧ 1010 = 0010.
x ∨ y                  Bitwise logical OR; 0111 ∨ 1010 = 1111.
x ⊕ y                  Bitwise exclusive OR (XOR); 0111 ⊕ 1010 = 1101.
x̄                      The inverse of bit x, or the bitwise inverse of binary string x.
x ←$ X                 Select element x from set X with uniform distribution.
Pr_{x←$X}(F(x))        The probability of F(x), when x is selected uniformly from set X.
A^{B_k(·)}             Algorithm A with oracle access to algorithm B, instantiated with key k; see Definition 2.7.
PPT                    The set of efficient (Probabilistic Polynomial Time) algorithms; see Definition 2.3.
NEGL(n)                The set of 'negligible functions'; see Def. 2.5.
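The bitwise and string operators of Table 1.1 can be checked directly in code; the following sketch (in Python, used here purely for illustration) reproduces the table's own examples.

```python
# The bitwise operators of Table 1.1, checked on the table's own
# 4-bit examples, using Python's &, | and ^ on integers.

x, y = 0b0111, 0b1010

print(format(x & y, '04b'))       # AND: 0010
print(format(x | y, '04b'))       # OR : 1111
print(format(x ^ y, '04b'))       # XOR: 1101
print(format(x ^ 0b1111, '04b'))  # bitwise inverse of 0111 -> 1000

# Concatenation (++) and reversal (a^R) on strings:
print("abc" + "def")              # abcdef
print("abcde"[::-1])              # edcba
```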

Chapter 2

Encryption and
Pseudo-Randomness

Encryption deals with protecting the confidentiality of sensitive information,


which we refer to as plaintext message m, by encoding (encrypting) it into
ciphertext c. The ciphertext c should hide the contents of m from the adversary,
yet allow recovery of the original information by legitimate parties, using a
decoding process called decryption. Encryption is one of the oldest applied
sciences; some basic encryption techniques were already used thousands of
years ago.
One result of the longevity of encryption is the use of different terms. The cryptographic encoding operation is referred to as either encryption or encipherment, and the decoding operation is referred to as decryption or decipherment. Encryption schemes are often referred to as cryptosystems or as ciphers; in particular, we will discuss two specific types of cryptosystems referred to as block ciphers1 and stream ciphers. We use the terms 'encryption scheme' and 'cryptosystem' interchangeably.
Fig. 2.1 shows the encryption of plaintext message m (top of figure) and
decryption of ciphertext c (middle of figure), both using the same shared key
k. The message is from some message-space M , and the key is from some key-
space K; both M and K are typically sets of binary strings, e.g., all strings or
all strings of specific length. We define a shared-key cryptosystem in Def. 2.1.

Definition 2.1 (Shared-key cryptosystem). A shared-key cryptosystem is a pair of keyed algorithms, ⟨E, D⟩, ensuring correctness, i.e., for every message m ∈ M and key k ∈ K holds: Decrypt_k(Encrypt_k(m)) = m, as illustrated in the bottom of Fig. 2.1.
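As a minimal illustration of Definition 2.1, consider the cryptosystem E_k(m) = m ⊕ k with M = K = {0,1}^l. The sketch below checks only the correctness requirement; whether such a scheme is secure is a separate question, discussed later in this chapter.

```python
# A minimal shared-key cryptosystem <E, D> satisfying the correctness
# requirement of Definition 2.1: E_k(m) = m XOR k, D_k(c) = c XOR k,
# with M = K = {0,1}^l (here: byte strings of equal length).
# Correctness holds since (m XOR k) XOR k = m.

import os

def encrypt(k: bytes, m: bytes) -> bytes:
    assert len(k) == len(m)        # key and message of matching length
    return bytes(mb ^ kb for mb, kb in zip(m, k))

def decrypt(k: bytes, c: bytes) -> bytes:
    return encrypt(k, c)           # XOR is its own inverse

m = b"attack at dawn"
k = os.urandom(len(m))             # uniformly random key
c = encrypt(k, m)
assert decrypt(k, c) == m          # correctness: D_k(E_k(m)) = m
```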

Definition 2.1 is quite general. In particular, it does not restrict the encryp-
tion and decryption algorithms, e.g., they may be randomized; and it allows
arbitrary message space M and key space K. In the rest of this chapter, we will
1 A block is a term for a string of bits of fixed length, the block length.

[Figure 2.1 diagram: top row - plaintext m and key k enter Encrypt E, producing ciphertext c = E_k(m); middle row - ciphertext c and key k enter Decrypt D, producing plaintext m = D_k(c); bottom row - the composition of the two.]

Figure 2.1: High-level shared-key encryption process. Top: encryption of plaintext message m using key k, resulting in ciphertext c = E_k(m). Middle: decryption of ciphertext c using key k, resulting in plaintext message m = D_k(c).
Bottom: the correctness property, i.e., decryption of ciphertext c = E_k(m) returns back the original plaintext m = D_k(c) = D_k(E_k(m)).

see a variety of shared-key cryptosystems, some deterministic, some randomized, and with different message and key spaces. Some schemes even require a further generalization, namely, the use of state, e.g., a counter; we discuss stateful encryption schemes later in this chapter.
Shared-key cryptosystems are also referred to as symmetric cryptosystems,
referring to their use of the same key k for encryption and decryption. This
is in contrast to public-key cryptosystems, also referred to as asymmetric cryp-
tosystems, which use separate keys for encryption (e.g., denoted e) and for
decryption (d). The two keys are related, to allow the use of d to decrypt mes-
sages encrypted using e; but it should be infeasible to derive the decryption
key d given the encryption key e, hence, the encryption key can be made public
(not secret). We illustrate a public-key cryptosystem in Fig. 2.2, and discuss
them in chapter 6.
Note also that Definition 2.1 focuses on correctness; it does not require the
encryption scheme to ensure any security property. The reason for that is that
defining ‘security’ is more complex than one may initially expect. Intuitively,
there is a common goal: confidentiality, in a strong sense, against powerful
adversaries; however, there are subtle issues, as well as multiple variants which
differ in their exact requirements and assumptions about the adversary capa-
bilities. Later on, we discuss the security requirements and present definitions.

[Figure 2.2 diagram: the key-generation algorithm KG, given key length l, outputs the keypair (e, d); Encrypt E uses the public key e, and Decrypt D uses the private key d.]

Figure 2.2: Public-Key Cryptosystem (KG, E, D). The key-generation algorithm KG outputs a (public, private) keypair (e, d) of given length l. The encryption algorithm E uses the public encryption key e to compute the ciphertext c = E_e(m), given plaintext message m. The decryption algorithm D uses the private decryption key d to compute the plaintext; correctness holds if for every pair of matching keys (e, d) = KG(l) and every message m, holds m = D_d(E_e(m)).

2.1 From Ancient Ciphers to Kerckhoffs’ Principle


Cryptology is one of the most ancient sciences. We begin our discussion of encryption schemes by discussing a few ancient ciphers, and some simple variants.
An important property that one has to keep in mind is that the design of these ciphers was usually kept secret; even when using a published design, users typically kept their choice secret. Indeed, it is harder to cryptanalyze a scheme which is not even known; see the discussion in subsection 2.1.4, where we present Kerckhoffs' principle, which essentially says that the security of a cipher should not depend on the secrecy of its design.
Since the designs of the ancient ciphers were kept secret, some of these designs did not use secret keys at all; we discuss such keyless ciphers in subsection 2.1.1.
Besides the historical perspective, discussing these simple, ancient ciphers helps
us introduce some of the basic ideas and challenges of cryptography and crypt-
analysis. Readers interested in more knowledge about the fascinating history of
cryptology should consult some of the excellent manuscripts such as [102, 155].
The very ancient ciphers were mono-alphabetic substitution ciphers. Monoalphabetic substitution ciphers use a fixed mapping from each plaintext character to a corresponding ciphertext character (or some other symbol). Namely, these ciphers are stateless and deterministic, and defined by a permutation from the plaintext alphabet to a set of ciphertext characters or symbols. We further discuss mono-alphabetic substitution ciphers in subsection 2.1.3.
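A monoalphabetic substitution cipher can be sketched in a few lines: the key is simply a permutation of the alphabet, applied letter by letter. The following toy implementation (restricted, for illustration, to the uppercase Latin alphabet) shows the stateless, deterministic mapping.

```python
# A monoalphabetic substitution cipher over A-Z: the (secret) key is a
# permutation of the alphabet, and encryption applies this fixed,
# stateless mapping letter by letter.

import random
import string

ALPHABET = string.ascii_uppercase

def keygen(rng=random):
    letters = list(ALPHABET)
    rng.shuffle(letters)                      # a random permutation
    return dict(zip(ALPHABET, letters))       # plaintext -> ciphertext

def encrypt(key, plaintext):
    return "".join(key[ch] for ch in plaintext)

def decrypt(key, ciphertext):
    inverse = {c: p for p, c in key.items()}  # invert the permutation
    return "".join(inverse[ch] for ch in ciphertext)

key = keygen()
c = encrypt(key, "HELLO")
assert decrypt(key, c) == "HELLO"
```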

2.1.1 Ancient Keyless Monoalphabetic Ciphers
In this section, we discuss several well-known, simple, weak and ancient ciphers.
These ciphers also share two important additional properties: they do not
utilize a secret key, i.e., are ‘keyless’; and they are monoalphabetic. A cipher is
monoalphabetic if it is defined by a single, fixed mapping from plaintext letter
to ciphertext letter or symbol.

The At-Bash cipher The At-Bash cipher may be the earliest cipher whose use is documented; specifically, it is believed to have been used, three times, in the Book of Jeremiah in the Old Testament. The cipher maps each of the letters of the Hebrew alphabet to a different letter. Specifically, the letters are mapped in 'reverse order': the first letter to the last letter, the second letter to the second-to-last letter, and so on; this mapping is reflected in the name 'At-Bash'2. See the illustration in Fig. 2.3; even if you are not familiar with the letters of the Hebrew alphabet, the mapping may still be identified from the visual appearance. If you still find it hard to match, that's Ok; we next describe an adaptation of the At-Bash cipher to the Latin alphabet.

[Figure 2.3 diagram: the Hebrew alphabet (Aleph, Beth, Gimel, ..., Resh, Shin, Taf) mapped to its reverse (Taf, Shin, Resh, ..., Gimel, Beth, Aleph).]

Figure 2.3: The At-Bash Cipher.

The Az-By cipher The Az-By cipher is the same as the At-Bash cipher,
except using the Latin alphabet. It is convenient to define the Az-By cipher, the
At-Bash cipher, and similar ciphers, by a formula. For that purpose, we define
the cipher as a function of the input letter, where each letter is represented by
its distance from the beginning of the alphabet; i.e., we represent the letter ‘A’
by the number 0, the letter ‘B’ by 1, and so on, until letter ‘Z’, represented by
25. The Az-By cipher is now defined by the following encryption function:
2 The name 'At-Bash' reflects the 'reverse mapping' of the Hebrew alphabet. The 'At' refers to the mapping of the first letter ('Aleph') to the last letter ('Taf'), and of the second letter ('Beth') to the second-to-last letter ('Shin').

E_AzBy(m) = 25 − m    (2.1)
We illustrate the Az-By cipher in the top part of Fig. 2.4; below it, we
present two similar ancient ciphers, which we discuss next - the Caesar and ROT13 ciphers.

AzBy:
  plaintext:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  ciphertext: Z Y X W V U T S R Q P O N M L K J I H G F E D C B A

Caesar:
  plaintext:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  ciphertext: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

ROT13:
  plaintext:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  ciphertext: N O P Q R S T U V W X Y Z A B C D E F G H I J K L M

Figure 2.4: The AzBy, Caesar and ROT13 Ciphers.

The Caesar cipher. We next present the well-known Caesar cipher. The Caesar cipher has been used, as the name implies, by Julius Caesar. It is also a mono-alphabetic cipher, operating on the set of the 26 Latin letters, from A to Z. In the Caesar cipher, the encryption of plaintext letter p is found by simply 'shifting' it three positions forward, when the letters are organized in a circle (with A following Z). The Caesar cipher is illustrated in the middle part of Figure 2.4.
We next write the formula for the Caesar cipher. As above, we represent each letter by its distance from the beginning of the alphabet; i.e., we represent the letter 'A' by the number 0, and so on; 'Z' is represented by 25. The encryption can be conveniently written as a formula, using modular arithmetic notation; see Note 2.1 for a brief summary of basic modular arithmetic. Namely, the Caesar encryption of plaintext letter m is given by:

E_Caesar(m) = m + 3 mod 26    (2.2)

Note 2.1: Basic modular arithmetic

Modular arithmetic is based on the modulo operation, denoted mod , and defined
as the residue in integer division between its two operands. Namely, a mod n is an
integer b s.t. 0 ≤ b < n and for some integer i holds a = b + i · n. Note that a and
i may be negative, but the value of a mod n is unique.
The mod operation is applied after all ‘regular’ arithmetic operations such as
addition and multiplication; i.e., (a + b mod n) = [(a + b) mod n]. On the other
hand, we can apply the mod operation also to the operands of addition and
multiplication, often simplifying the computation of the final result. Namely, the
following useful rules are easy to derive, for any integers a, b, c, n, r, with n > 0,
0 ≤ r < n, b ≥ 0 in rule (2.7), and gcd(r, n) = 1 in rule (2.9):

r mod n = r    (2.3)
[(a mod n) + (b mod n)] mod n = (a + b) mod n    (2.4)
[(a mod n) − (b mod n)] mod n = (a − b) mod n    (2.5)
[(a mod n) · (b mod n)] mod n = (a · b) mod n    (2.6)
a^b mod n = (a mod n)^b mod n    (2.7)
[(a + c) mod n = (b + c) mod n] ⇔ [a mod n = b mod n]    (2.8)
[(a · r) mod n = (b · r) mod n] ⇔ [a mod n = b mod n]    (2.9)

For simplicity, we often write the mod n operation only at the right hand side of
an equation, but it still applies to both sides. To avoid confusion, in such situations,
it is preferable to use the congruence symbol ≡, instead of the ‘regular’ equality
symbol =, e.g., a + b ≡ c · d mod n.
We apply - and discuss - more advanced modular arithmetic in chapter 6, where we
discuss public key cryptography.
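These rules can also be spot-checked numerically; note that Python’s `%` operator already follows the definition above, returning a value in [0, n) for positive n even when the left operand is negative (a sketch with illustrative values):

```python
# Spot-check the modular-arithmetic rules of Note 2.1 with sample values.
# Python's % returns a result in [0, n) for n > 0, matching the definition.
a, b, n = 123, -45, 26

assert (a + b) % n == ((a % n) + (b % n)) % n   # rule (2.4)
assert (a - b) % n == ((a % n) - (b % n)) % n   # rule (2.5)
assert (a * b) % n == ((a % n) * (b % n)) % n   # rule (2.6)
assert pow(a, 5, n) == pow(a % n, 5, n)         # rule (2.7), with b = 5
print("all rules hold for a=%d, b=%d, n=%d" % (a, b, n))
```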

Exercise 2.1. Write the formulas for the decryption of the Caesar and Az-By
ciphers.

Exercise 2.2. Use Equations (2.4) and (2.6) to prove that a number is divisible
by 3 if and only if the sum of its decimal digits is divisible by 3; prove the similar
rule for division by 9.

Exercise 2.3. Use Eq. (2.7) to show that for any integers a > 2 and b ≥ 0, holds
a^b mod (a − 1) = 1.

The ROT13 cipher ROT13 is a popular variant of the Caesar cipher, with the
minor difference that the ‘shift’ is by 13 rather than by 3, i.e., E_ROT13(p) = p + 13
mod 26. Note that in this special case, encryption and decryption are exactly
the same operation: E_ROT13(E_ROT13(p)) = p. The ROT13 cipher is illustrated
by the bottom row in Figure 2.4. We encourage the readers to write down the
decryption formula for ROT13 themselves, and to verify that it is indeed identical
to the encryption formula.
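For concreteness, here is a minimal sketch of the three keyless ciphers of Fig. 2.4 in Python (uppercase Latin letters only; the function names are ours):

```python
A = ord('A')

def azby(m: str) -> str:
    """Az-By: each letter x maps to 25 - x, per Eq. (2.1)."""
    return ''.join(chr(A + 25 - (ord(c) - A)) for c in m)

def caesar(m: str) -> str:
    """Caesar: shift by 3 modulo 26, per Eq. (2.2)."""
    return ''.join(chr(A + (ord(c) - A + 3) % 26) for c in m)

def rot13(m: str) -> str:
    """ROT13: shift by 13 modulo 26; applying it twice recovers the plaintext."""
    return ''.join(chr(A + (ord(c) - A + 13) % 26) for c in m)

print(azby('ABC'))            # ZYX
print(caesar('XYZ'))          # ABC
print(rot13(rot13('HELLO')))  # HELLO
```

Note that `azby` and `rot13` are involutions: applying either of them twice returns the plaintext, so each is its own decryption function.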

The Masonic cipher. A final example of a historic, keyless, mono-alphabetic
cipher is the Masonic cipher, from the 18th century, illustrated in Fig. 2.5.
This cipher uses a ‘key’ to map from plaintext to ciphertext and back, but the
key only assists in the mapping: it has a regular structure and is considered
part of the cipher.

Figure 2.5: The Masonic Cipher, written graphically and as a mapping from
the Latin alphabet to graphic shapes.

2.1.2 Shift Cipher: a Keyed Variant of the Caesar Cipher


Keyless ciphers have limited utility; in particular, the design of the cipher be-
comes a critical secret, whose exposure completely breaks security. Therefore,
every modern cipher, and even most historical ciphers, use secret keys.
Readers who are interested in these historical (yet keyed) ciphers should
consult manuscripts on the history of cryptology, e.g. [102, 155]. We merely
focus on the shift cipher, which is a simple keyed variant of the Caesar cipher,
E(p) = p+3 mod 26. We find the shift cipher helpful in introducing important
concepts and principles later in this chapter.
The first variant of the shift cipher encrypts a single Latin character at a
time, just like the Caesar cipher; it simply uses an arbitrary shift rather than
the fixed shift of three as in the original Caesar cipher. This variable shift can
be considered as a key, denoted k. Namely, this shift cipher is defined by
E_k(p) = p + k mod 26.
Obviously, this variant of the shift cipher is about as insecure as the original
Caesar cipher; the attacker only needs to exhaustively search the 26 possible
key values. To prevent this simple attack, one can further extend Caesar to
use longer blocks of plaintext (instead of single characters) and longer keys,
allowing the use of many shift values rather than just one of 26 possible shifts.
For convenience, assume that the input is now encoded as a binary string.
The l-bit shift cipher operates on plaintext blocks of l bits, and also uses an l-bit
key. To encrypt plaintext block p ∈ {0, 1}^l using key k ∈ {0, 1}^l (both
interpreted as integers), we compute E_k(p) = p + k mod 2^l.
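A minimal sketch of the l-bit shift cipher in Python, treating blocks and keys as integers in [0, 2^l) (the parameter values below are illustrative):

```python
def shift_encrypt(p: int, k: int, l: int) -> int:
    """l-bit shift cipher: E_k(p) = p + k mod 2^l."""
    assert 0 <= p < 2 ** l and 0 <= k < 2 ** l
    return (p + k) % (2 ** l)

def shift_decrypt(c: int, k: int, l: int) -> int:
    """Decryption subtracts the key: D_k(c) = c - k mod 2^l."""
    return (c - k) % (2 ** l)

l, k = 16, 0x1F2E                    # illustrative block size and key
p = 0xBEEF
c = shift_encrypt(p, k, l)
assert shift_decrypt(c, k, l) == p   # decryption recovers the plaintext
```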
By using sufficiently large l (length of keys and blocks), exhaustive search
becomes impractical. However, this cipher is still insecure. For example, as

the following exercise shows, it is easily broken by an attacker who has access
to one known-plaintext pair, i.e., a pair (plaintext m, ciphertext E_k(m)). An
attack exploiting such pairs is called a known-plaintext attack (KPA).

Exercise 2.4 (Known-plaintext attack (KPA) on the shift cipher). Consider the
l-bit shift cipher E_k(p) = p + k mod 2^l, where p, k ∈ {0, 1}^l. Suppose an
attacker can get the encryption E_k(p) of one known block p ∈ {0, 1}^l. Show
that this allows the attacker to decrypt an arbitrary ciphertext c′, i.e., to find p′
s.t. c′ = E_k(p′).

2.1.3 Mono-alphabetic Substitution Ciphers


The Masonic, Az-By, Caesar and shift ciphers are all mono-alphabetic substi-
tution ciphers. Mono-alphabetic substitution ciphers are deterministic, stateless
mappings from plaintext characters to ciphertext characters or symbols; in-
deed, the use of any other set of symbols instead of letters does not make any
real change in the security of such ciphers, hence we will assume a permutation
from characters to characters. For the 26 letters of the Latin alphabet, there
are 26! > 2^88 such permutations, clearly ruling out exhaustive search for the
key.
Of course, specific mono-alphabetic substitution ciphers may use only a
small subset of these permutations. This is surely the case for all of the naive
ciphers we discussed above - the Masonic, Az-By, Caesar and shift ciphers.
However, what about using a general permutation over the alphabet - where
the permutation itself becomes a key? We refer to this cipher as the general
mono-alphabetic substitution cipher. The key for this cipher may be written
as a simple two-row table, with one row containing the plaintext letters and
the other row containing the corresponding ciphertext letters (or symbols).
For example, see such a key in Figure 2.6, where the plaintext alphabet is the
Latin alphabet plus four special characters (space, dot, comma and exclamation
mark), for a total of 30 characters. The reader may very well recall the use of
similar ‘key tables’ from mono-alphabetic ciphers often used by kids.
As we already concluded, exhaustive search for the key is impractical -
too many keys. However, even when we select the permutation completely at
random, this cipher is vulnerable to a simple attack, which does not even require
encryption of known plaintext. Instead, this attack requires some knowledge
about the possible plaintexts. In particular, assume that the attacker knows
that the plaintext is text in the English language. An attack of this type,
which (only) assumes some known properties of the distribution of the input
plaintext, is called a ciphertext-only attack (CTO).
In our example, the CTO attacker makes use of known, measurable facts
about English texts. The most basic fact is the distribution of letters used
in English texts, as shown in Figure 2.7. Using this alone, it is usually quite
easy to identify the letter E, as it is considerably more common than any other
letter. Further cryptanalysis can use the guess for E to identify other letters,
possibly using also the distributions of two-letter and three-letter sequences;

A B C D E F G H I J
Y * X K L D B C Z F

K L M N O P Q R S T
G E R U + J W ! H M

U V W X Y Z ␣ . , !
P N I O Q S - V T A

Figure 2.6: Example of a possible key for a general mono-alphabetic substitu-
tion cipher, for an alphabet of 30 characters (‘␣’ denotes the space character).

if the plaintext also contains spaces and punctuation marks, this can help the
attacker significantly. This type of ciphertext-only attack (CTO) is called the
letter-frequency attack.

Figure 2.7: Distribution of letters in typical English texts.

Note that for the letter-frequency and other CTO attacks to be effective
and reliable, the attacker requires a significant amount of ciphertext.
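The first step of the letter-frequency attack can be sketched as follows. The sample ciphertext below is the ROT13 encryption of a short English sentence, so the most frequent ciphertext letter is ‘R’, the image of ‘E’ (a sketch only; a real attack needs much more ciphertext and further statistics):

```python
from collections import Counter

def most_frequent_letter(ciphertext: str) -> str:
    """The letter-frequency attack's first guess: the most frequent
    ciphertext letter likely corresponds to plaintext 'E'."""
    counts = Counter(c for c in ciphertext.upper() if c.isalpha())
    return counts.most_common(1)[0][0]

# ROT13 of "MEET ME AT SEVEN IN THE EVENING NEAR THE GREEN TREE":
sample = "ZRRG ZR NG FRIRA VA GUR RIRAVAT ARNE GUR TERRA GERR"
print(most_frequent_letter(sample))  # R: suggests E -> R, i.e., a shift of 13
```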

Exercise 2.5. Write a program that computes the distribution of letters in
given texts, and run it over various English texts of different sizes. Compare the
resulting distributions to Fig. 2.7 and to each other (for consistency). You will
see that longer texts tend to be much more consistent (and closer to Fig. 2.7).

In fact, this phenomenon exists for other attacks too; cryptanalysis often
requires a significant amount of ciphertext encrypted using the same encryption
key. This motivates limiting the amount of plaintext (and ciphertext) processed
with each cryptographic key, with the hope of making cryptanalysis harder or,
ideally, infeasible.

Principle 2 (Limit usage of each key). Systems deploying ciphers/cryptosystems
should limit the amount of usage of each key, changing keys as necessary,
to foil cryptanalysis attacks.

An extreme example of this is the One-Time Pad cipher, which we discuss
later. The one-time pad is essentially a one-bit substitution cipher - but with
a different random mapping for each bit. This turns the insecure substitution
cipher into a provably secure cipher!

2.1.4 Kerckhoffs’ known-design principle


The confidentiality of keyless ciphers is completely based on the secrecy of the
scheme itself, since it is enough to know the decryption process in order to
decrypt - no key is required. However, even for keyed cryptosystems, it seems
harder to attack without knowing the design; see Exercise 2.6. Therefore, in
‘classical’ cryptography, cryptosystems were kept secret and not published, to
make cryptanalysis harder.
Exercise 2.6. Table 2.1 shows six ciphertexts, all using very simple substitu-
tion ciphers. The ciphers used to encrypt the top three ciphertexts are indicated,
but the ciphers used to encrypt the bottom three ciphertexts are not indicated.
Decipher, in random order, all six ciphertexts and measure the time it took you
to decipher each of them. Fill in the blanks in the table: the plaintexts, the time
it took you to decipher each message, and the ciphers used for ciphertexts D,
E and F. Did the knowledge of the cipher significantly ease the cryptanalysis
process?
One of the recent and quite famous examples of this policy is the en-
cryption algorithms of the GSM network, which were kept secret - until they
were eventually leaked. Indeed, soon after this leakage, multiple attacks were
published; possibly the most important and interesting is a practical ciphertext-
only (CTO) attack [7]. One may conclude from this that, indeed, ciphers
should remain secret; however, most experts believe that the opposite is true,
i.e., that the GSM designers should have used a published cryptosystem. In fact,
newer cellular networks indeed use cryptosystems with published specifications.
The idea that ciphers should be designed for security even when known to
attackers was presented already in 1883, by the Dutch cryptographer Auguste
Kerckhoffs. This is now known as Kerckhoffs’ principle and considered one of
the basic principles of cryptography:

Identifier Cipher Ciphertext Plaintext Time
A Caesar JUHDW SDUWB
B AzBy ILFMW GZYOV
C ROT13 NYBAR NTNVA
D BLFMT OLEVI
E FZNEG UBHFR
F EUHDN UXOHV

Table 2.1: Ciphertexts for Exercise 2.6. All plaintexts are pairs of simple
five-letter words. The three upper examples state the cipher used; the three
lower examples hide it (‘obscurity’). Hiding the cipher does not make them
secure, but decryption may take a bit longer.

Principle 3 (Kerckhoffs’ known-design principle). When designing or evaluating
the security of (cryptographic) systems, assume the adversary knows the
design - everything except the secret keys.
We intentionally put the word ‘cryptographic’ in parentheses, since
the principle is mostly accepted today also with regard to non-cryptographic
security systems such as operating systems and network security devices.
There are several reasons to adopt Kerckhoffs’ principle. Kerckhoffs’ orig-
inal motivation was apparently the realization that cryptographic devices are
likely to be captured by the enemy; if their security depends on the secrecy of
the design, capture renders them unusable - exactly in conflict situations, when
they are most needed. The GSM scenario, as described above, fits this motivation;
indeed, the GSM designers did not even plan a proper ‘migration plan’ for changing
from the exposed ciphers to new, hopefully secure ciphers.
Indeed, it appears that one reason to adopt Kerckhoffs’ principle when de-
signing a system is simply that this makes the designers more aware of possible
attacks - and usually, results in more secure systems.
However, the best argument in favor of Kerckhoffs’ principle is that it allows
public, published, standard designs, used for multiple applications. Such stan-
dardization and multiple applications have the obvious advantage of efficiency
of production and support. However, in the case of cryptographic systems,
there is an even greater advantage. Namely, a public, published design facilitates
evaluation and cryptanalysis by many experts, which is the best possible
security guarantee for cryptographic designs - except for provably-secure
designs, of course. In fact, even ‘provably-secure’ designs were often found to
have vulnerabilities, discovered only during careful review by experts, due to a
mistake in the proof, or to a modeling or other assumption, often made implicitly.
Sometimes it is feasible to combine the benefits of open design (following
Kerckhoffs’ principle) and of secret design (placing another challenge on at-
tackers), by combining two candidate schemes, one of each type, using a robust
combiner construction, which ensures security as long as one of the two schemes
is not broken [92]. For example, see Lemma 2.3 in subsection 2.5.8 below.

2.2 Cryptanalysis Attack Models: CTO, KPA, CPA and
CCA
As discussed in § 1.3 and in particular in Principle 1, security should be defined
and analyzed with respect to a clear, well defined model of the attacker capa-
bilities, which we refer to as the attack model. In this section, we introduce
four cryptanalysis attack models: CTO, KPA, CPA and CCA. These define the
capabilities of attackers trying to ‘break’ an encryption scheme.
We discussed above the letter-frequency attack, which relied only on access
to a sufficient amount of ciphertext - and on knowing the letter distribution of
plaintext messages. In subsection 2.3.1, we present exhaustive search, an attack
requiring the ability to identify correctly-decrypted plaintext (with significant
probability). Both of these attacks require only access to (sufficient) cipher-
text, and some knowledge about the plaintext distribution (letter frequencies,
or the ability to identify possible plaintexts). We refer to attacks which require
only these ‘minimal’ attacker capabilities as ciphertext-only (CTO) attacks.
In particular, ciphertext-only attacks do not require a (plaintext, ciphertext)
pair.
To facilitate CTO attacks, the attacker must have some knowledge about
the distribution of plaintexts. In practice, such knowledge is typically implied
by the specific application or scenario. For example, when it is known that the
message is in English, then the attacker can apply known statistics such as the
letter-distribution histogram of Figure 2.7. For a formal, precise definition, we
normally allow the adversary to pick the plaintext distribution. Note that this
requires defining security carefully, to prevent absurd ‘attacks’, which would
clearly fail in practice, from appearing to fall under the definition.
In subsection 2.3.2, we discuss the table look-up and time-memory tradeoff
attacks; in both of these generic attacks, the adversary must be able to obtain
the encryption of one or few specific plaintext messages - the messages used to
create the precomputed table. We refer to an attack of this type as a chosen-
plaintext attack (CPA). We say that the CPA attack model is stronger than
the CTO attack model, meaning that every cryptosystem vulnerable to CTO
attack is also vulnerable to CPA.
Let us briefly mention two additional common attack models for encryption
schemes, which represent two different variants of the CPA attack. The first
is the known-plaintext attack (KPA) model, where the attacker receives one or
multiple pairs of plaintext and the corresponding ciphertext, but the plaintext
is chosen randomly.
Exercise 2.7 (CPA > KPA > CTO). Explain (informally) why every cryp-
tosystem vulnerable to a CTO attack is also vulnerable to KPA, and every cryp-
tosystem vulnerable to KPA is also vulnerable to CPA. We say that the KPA
model is stronger than the CTO model and weaker than the CPA model.

Finally, in the chosen-ciphertext attack (CCA) model, the attacker
has the ability to receive the decryption of arbitrary ciphertexts, chosen by
the attacker. We adopt the common definition, where CCA attackers are
also allowed to perform CPA attacks, i.e., the attacker can obtain the en-
cryptions of attacker-chosen plaintext messages. With this definition, trivially,
CCA > CPA. Combining this with the previous exercise, we have the com-
plete ordering: CCA > CPA > KPA > CTO.

2.3 Generic attacks and Effective Key-Length


We discussed several ciphers and attack models; how can we evaluate the se-
curity of different ciphers under a given attack model? This is a fundamental
challenge in cryptography; we discuss it in this section, and again later, espe-
cially when we introduce our first definition of a cryptographic mechanism, the
Pseudo-Random Generator (PRG), in subsection 2.5.2.
One way in which non-experts often compare the security of different ci-
phers is by their key length. Indeed, as we already mentioned, ciphers using
short keys are insecure. In subsection 2.3.1, we present exhaustive search, an
attack which essentially tries out all the keys until finding the right one.
Exhaustive search is a generic attack. Generic attacks work for all (or many)
schemes, without depending on the specific design of the attacked scheme, but
only on general properties, such as key length and attack model. Exhaustive
search works on most ciphers and scenarios; it requires the ability to test
candidate keys. Such ability usually exists, but not always, as we explain.
In subsection 2.3.2 we discuss two other generic attacks: table look-up and
time-memory tradeoff. These attacks further demonstrate that the attacker’s
success does not depend only on the key length; it also depends on the attack
model and attacker capabilities, e.g., storage capacity.
Finally, in subsection 2.3.3 we discuss additional challenges in evaluating
security of cryptographic mechanisms, and introduce the effective key length
concept and principle.

2.3.1 Exhaustive search


Following Kerckhoffs’ principle, we assume henceforth that all designs are
known to the attacker; defense is only provided via the secret keys. There-
fore, one simple attack strategy is to try to decrypt ciphertext messages using
all possible keys, detect the key - or (hopefully few) keys - where the de-
cryption seems to result in possible plaintext, and discard keys which result
in clearly-incorrect plaintext. For example, if we know that the plaintext is
text in English, we can test candidate plaintexts, e.g., by checking whether
their letter distributions are reasonably-close to the distribution of letters in
English (Figure 2.7). This attack is called exhaustive key search, exhaustive
cryptanalysis or brute force attack; we will use the term exhaustive search.

Requirements of Exhaustive Search. Exhaustive search may not work for
stateful ciphers, since decryption depends on the state and not only on the key.

Furthermore, it should be possible to efficiently identify correct decryption by
checking the output of decrypting the ciphertext. Namely, the decryption
of an arbitrary ciphertext with an incorrect key should result in clearly-invalid
plaintext, with significant probability. Exhaustive search is applicable to any
stateless encryption provided that such validation of plaintext is possible; this
is why we refer to it as a generic CTO attack.
Exhaustive search is practical only when the key space is not too large. For
now, we focus on symmetric ciphers, where the key is usually an arbitrary binary
string of a given length, i.e., the key space is of size 2^l where l is the key length;
the problem is therefore the use of insufficiently-long keys. Surprisingly, designers have
repeatedly underestimated the risk of exhaustive search and used ciphers with
insufficiently long keys, i.e., insufficiently large key spaces.
Let T_S be the sensitivity period, i.e., the duration for which secrecy must be
maintained, and T_D be the time it takes to test each candidate key, by performing
one or more decryptions. Hence, the attacker can test T_S/T_D keys out of the
key space containing 2^l keys. If T_S/T_D > 2^l, then the attacker can test all
keys, and find the key for certain (with probability 1); otherwise, the attacker
succeeds with probability T_S/(T_D · 2^l). By selecting a sufficient key length, we can
ensure that the success probability is as low as desired.

For example, consider the conservative assumption of testing a billion keys
per second, i.e., T_D = 10^-9 seconds, and requiring secrecy for about three thousand
years, i.e., T_S = 10^11 seconds, with a probability of the attack succeeding of at
most 0.1%. To ensure security with these parameters against the brute-force attack,
we need keys of length l ≥ log2(10^3 · T_S/T_D) = log2(10^23) < 77 bits.
The above calculation assumed a minimal time to test each key. Of course,
attackers will often be able to test many keys in parallel, by using multiple com-
puters and/or parallel processing, possibly with hardware acceleration. Such
methods were used during 1994-1999 in multiple demonstrations of the vulner-
ability of the Data Encryption Standard (DES) to different attacks. The final
demonstration was exhaustive search completing in 22 hours, testing many
keys in parallel using a $250,000 dedicated-hardware machine (‘deep crack’)
together with [Link], a network of computers contributing their idle
time.
However, the impact of such parallel testing, as well as improvements in
processing time, is easily addressed by a reasonable extension of the key length.
Assume that an attacker is able to test 100 million keys in parallel during the
same 10^-9 second, i.e., T_D = 10^-17 seconds. With the same goals and calculation as
above, we find that we need keys of length l ≥ log2(10^3 · T_S/T_D) = log2(10^31) < 103 bits.
This is still well below the minimal key length of 128 bits supported by the
Advanced Encryption Standard (AES). Therefore, exhaustive search is not a
viable attack against AES, or against other ciphers with comparably long keys.
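The key-length bound, including the 0.1% success-probability target, can be computed directly; `min_key_length` below is an illustrative helper of ours, computing the smallest l with T_S/(T_D · 2^l) ≤ p_max:

```python
from math import ceil, log2

def min_key_length(T_S: float, T_D: float, p_max: float) -> int:
    """Smallest key length l such that the exhaustive-search success
    probability T_S / (T_D * 2^l) is at most p_max."""
    return ceil(log2(T_S / (T_D * p_max)))

# Sequential attacker: 10^9 keys/second, secrecy for 10^11 seconds, p_max = 0.1%.
print(min_key_length(1e11, 1e-9, 1e-3))   # 77 bits suffice
# Massively parallel attacker: T_D = 10^-17 seconds per key.
print(min_key_length(1e11, 1e-17, 1e-3))  # 103 bits suffice
```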

Testing candidate keys. Recall that we assume that decrypting arbitrary
ciphertext with an incorrect key should usually result in clearly-invalid plain-
text. Notice our use of the term ‘usually’; surely there is some probability that
decryption with the wrong key will result in seemingly-valid plaintext. Hence,
exhaustive search may not return only the correct secret key; quite often, it
returns multiple candidate keys, which all resulted in seemingly-valid decryption.
In such cases, the attacker must eliminate some of these candidate keys by
trying to decrypt additional ciphertexts, discarding a key when its decryption
of some ciphertext appears to result in invalid plaintext.
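An exhaustive search with candidate-key testing can be sketched on the single-letter shift cipher; the validity test below is a crude stand-in for the statistical tests discussed above (names and sample words are ours):

```python
def shift_decrypt_text(c: str, k: int) -> str:
    """Decrypt a shift-cipher ciphertext (uppercase letters) with key k."""
    return ''.join(chr(ord('A') + (ord(x) - ord('A') - k) % 26) for x in c)

def looks_valid(p: str, words=('THE', 'AND', 'KEY')) -> bool:
    # Crude plaintext-validity test; a real attack would compare the
    # letter distribution against English statistics (Figure 2.7).
    return any(w in p for w in words)

ciphertext = 'WKHNHB'   # 'THEKEY' encrypted with shift k = 3
candidates = [k for k in range(26)
              if looks_valid(shift_decrypt_text(ciphertext, k))]
print(candidates)       # [3]: only the correct key survives here
```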

2.3.2 Table look-up and time-memory tradeoff attacks: a generic CPA attack
Exhaustive search is very computation-intensive; it finds the key, on average,
after testing half of the key space. On the other hand, its storage requirements
are very modest, and almost independent of the key space.3
In contrast, the table look-up attack, which we explain next, uses O(2^l)
memory, where l is the key length, but only table look-up time. However, it
requires the ciphertext of some pre-defined plaintext message, which we denote p∗.
This can be achieved by an attacker with chosen-plaintext attack (CPA) ca-
pabilities, or whenever the attacker can obtain encryptions of some well-known
message p∗. Many communication protocols use predictable, well-known mes-
sages at specific times, often upon connection initialization, which provides the
attacker with encryptions of this predictable, known plaintext message p∗ - and
suffices for this attack.
In the table look-up attack, the attacker first precomputes T(k) = E_k(p∗) for
every key k. Later, the attacker obtains the encryption of the same plaintext
p∗, under the unknown secret key k∗; let c∗ = E_k∗(p∗) denote the received
ciphertext. The attacker now looks up c∗ in the table T and identifies all the
keys k such that c∗ = T(k). The number of matching keys is usually one or
very small, allowing the attacker to quickly rule out the incorrect keys, usually
by decrypting some additional ciphertext messages.
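A sketch of the table look-up attack, using the 8-bit shift cipher as a toy stand-in for E (for a real l-bit cipher the precomputed table has 2^l entries):

```python
def E(p: int, k: int) -> int:
    # Toy stand-in cipher for illustration: the 8-bit shift cipher.
    return (p + k) % 256

P_STAR = 0x41                      # the fixed, well-known plaintext p*
table = {}                         # precompute T: E_k(p*) -> candidate keys
for k in range(256):
    table.setdefault(E(P_STAR, k), []).append(k)

secret_key = 0x5A
c_star = E(P_STAR, secret_key)     # attacker observes E_{k*}(p*)
print(table[c_star])               # [90]: the single candidate is the secret key
```

The precomputation is done once; afterwards, each observed c∗ costs only one dictionary lookup.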
The table look-up attack requires O(2^l) storage to ensure O(1) computation,
while the exhaustive search attack uses O(1) storage and O(2^l) computations.
Several more advanced generic attacks allow different tradeoffs between the
computing time and the amount of storage (memory) required for the attack.
The first and most well known time-memory tradeoff attack was presented
by Martin Hellman [91]. Later works presented other tradeoff attacks, such
as the time/memory/data tradeoff of [29] and the rainbow tables technique
of [136]. Unfortunately, we will not be able to cover these interesting attacks,
and the readers are encouraged to read these (and other) papers presenting
them. Note that these tradeoffs use cryptographic hash functions, which we
discuss in chapter 4.
3 Exhaustive search needs storage for the key guesses.

2.3.3 Effective key length
Cryptanalysis, i.e., developing attacks on cryptographic mechanisms, is a large
part of the research in applied cryptography; it includes generic attacks such
as those presented earlier in this section, as well as numerous attacks which are
tailored to a specific cryptographic mechanism. This may look surprising; why
publish attacks? Surely the goal is not to help attackers break cryptographic
systems?
Cryptanalysis facilitates two critical decisions facing designers of security
systems which use cryptography: which cryptographic mechanism to use, and
what parameters to use, in particular, which key length to use.
Let us focus first on the key length. All too often, when cryptographic prod-
ucts and protocols are mentioned in the popular press, the key-length in use
is mentioned as an indicator of their security. Furthermore, this is sometimes
used to argue for the security of the cryptographic mechanism, typically by
presenting the number of different key values possible with a given key length.
The number of different keys determines the time required for the exhaustive search
attack, and has a direct impact on the resources required by the other generic
attacks we discussed. Hence, clearly, keys must be sufficiently long to ensure
security. But how long?
It is incorrect to compare the security of two different cryptographic sys-
tems, which use different cryptographic mechanisms (e.g., ciphers), by compar-
ing the key length used in the two systems. Let us give two examples:

1. We saw that the general mono-alphabetic substitution cipher (subsec-
tion 2.1.3) is insecure, although its key space is relatively large. We
could easily increase the key length, e.g., by adding more symbols such as
digits, but this would not significantly improve security.
2. The key length used by symmetric cryptosystems, as discussed in this
chapter, rarely exceeds 300 bits, and is usually much smaller - 128 bits
is common; more bits are simply considered unnecessary. In contrast,
asymmetric, public-key cryptography is usually used with longer keys -
often much longer, depending on the specific public-key cryptosystem;
see Table 6.1.

It is useful to compare the security of different cryptosystems, when each is


used with a specific key-length - e.g., with comparable efficiency. As explained
above, using the key-length alone would be misleading. One convenient, widely
used measure for the security of a given cryptosystem, used with a specific key
length, is called the effective key length; essentially, this uses exhaustive search
as a measure to compare against.
We say that a cipher using k-bit keys has effective key length l if the most
effective attack known against it takes about 2^l operations, where k ≥ l. We
expect the effective key length of good symmetric ciphers to be close to their
real key length, i.e., l should not be much smaller compared to k. For important
symmetric ciphers, any attack which increases the gap between k and l would

be of great interest, and as the gap grows, there will be increasing concern with
using the cipher. The use of key lengths which are 128 bits or more leaves a
‘safety margin’ against potential better future attacks, and gives time to change
to a new cipher when a stronger, more effective attack is found.
Note that, as shown in Table 6.1, for asymmetric cryptosystems, there is
often a large gap between the real key length k and the effective key length l.
This is considered acceptable, since the design of asymmetric cryptosystems is
challenging, and it seems reasonable to expect attacks with performance much
better than exhaustive search. In particular, in most public-key systems, the
secret key is not an arbitrary, random binary string.
Note that the evaluation of the effective key length depends on the attack
model; there are often attacks implying a much smaller effective key length
when assuming a stronger attack model, e.g., CPA compared to KPA or CTO. One
should therefore also take into account the expected attack model.
Also, notice that the effective key length measure compares attacks based on
the time they require; it does not account for other resources, for example, the
memory used in a time-memory tradeoff.
Normally, we select a sufficient key length to ensure security against any
conceivable adversary, e.g., leaving a reasonable margin above an effective key
length of, say, 100 bits; a larger margin is required when the sensitivity period
of the plaintext is longer. The cost of using longer keys is often justified,
considering the damage from loss of security, and from having to change, in a
hurry, to a cipher with a longer effective key length or to longer keys.
In some scenarios, however, the use of longer keys may have significant
costs; for example, doubling the key length in the RSA cryptosystem increases
the computational costs by a factor of about six. We therefore may also consider
the risk from exposure, as well as the resources that a (rational) attacker may
deploy to break the system. This is summarized by the following principle.

Principle 4 (Sufficient effective key length). Deployed cryptosystems should
have sufficient effective key length to foil feasible attacks, considering the max-
imal expected adversary resources and the most effective yet feasible attack model,
as well as cryptanalysis and speed improvements expected over the sensitivity
period of the plaintext.

Experts, as well as standardization and security organizations, publish es-
timates of the required key length for different cryptosystems (and other cryp-
tographic schemes); we present a few estimates in Table 6.1.

2.4 Unconditional security and the One Time Pad (OTP)
The exhaustive search and table look-up attacks are generic - they do not de-
pend on the specific design of the cipher: their complexity is merely a function
of key length. This raises the natural question: is every cipher breakable, given

enough resources? Or, can encryption be secure unconditionally - even against
an attacker with unbounded resources (time, computation speed, storage)?
We next present such an unconditionally secure cipher, the One Time Pad
(OTP). The One Time Pad is often attributed to a 1919 patent by Gilbert
Vernam [166], although some of the critical aspects may have been due to
Mauborgne [26], and in fact, the idea was already proposed by Frank Miller in
1882 [25]; we again refer readers to the many excellent references on history of
encryption, e.g., [102, 155].
The One Time Pad is not just unconditionally secure - it is also an exceed-
ingly simple and computationally efficient cipher. Specifically:
Encryption: To encrypt a message, compute its bitwise XOR with the key.
Namely, the encryption of each plaintext bit, say mi , is one ciphertext
bit, ci , computed as: ci = mi ⊕ ki , where ki is the ith bit of the key.
Decryption: Decryption simply reverses the encryption, i.e., the ith decrypted
bit would be ci ⊕ ki .
Key: The key k = {k1 , k2 , . . .} should consist of independently drawn fair
coins, and its length must be at least as long as that of the plaintext.
Notice that the key should be - somehow - shared between the parties,
which may be a challenge in many scenarios.
See illustration in Figure 2.8.

c = m ⊕ k

Figure 2.8: The One Time Pad (OTP) cipher - an unconditionally-secure
stream cipher: c = m ⊕ k (bit-wise XOR).

To see that decryption recovers the plaintext, observe that given ci = mi ⊕ ki,
the corresponding decrypted bit is ci ⊕ ki = (mi ⊕ ki) ⊕ ki = mi, as required.
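In code, OTP encryption and decryption are one XOR each; here is a minimal Python sketch (an illustration of ours; the function names are not part of any standard):

```python
import secrets

def otp_encrypt(plaintext: bytes, key: bytes) -> bytes:
    # The key must be at least as long as the plaintext and must consist
    # of independently drawn random bits (here: bytes, for convenience).
    assert len(key) >= len(plaintext)
    return bytes(m ^ k for m, k in zip(plaintext, key))

def otp_decrypt(ciphertext: bytes, key: bytes) -> bytes:
    # Decryption is the same XOR: (m XOR k) XOR k == m.
    return otp_encrypt(ciphertext, key)

message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # fresh, uniformly random key
ciphertext = otp_encrypt(message, key)
assert otp_decrypt(ciphertext, key) == message
```

Note that each key must be used to encrypt only once; hence the name one-time pad.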
The unconditional secrecy of OTP was recognized early on, and established
rigorously in a seminal paper published in 1949 by Claude Shannon [152]. In
that paper, Shannon also proved that every unconditionally-secure cipher must
have keys as long as the plaintext; namely, as long as unconditional secrecy is
required, this aspect cannot be improved. For proofs and details, please consult
a textbook on cryptology, e.g., [159].
The cryptographic literature has many beautiful results on unconditional
security. However, it is rarely practical to use such long keys, and in practice,

adversaries - like everyone else - have limited computational abilities. There-
fore, in this course, we focus on computationally-bounded adversaries.
While the key required by OTP makes its use rarely practical, we next show
a computationally-secure variant of OTP, where the key can be much smaller
than the plaintext, and which can be used in practical schemes. Of course, this
variant is - at best - secure only against computationally-bounded attackers.

2.4.1 OTP is a Stateful Cryptosystem / Stream Cipher


The attentive reader may have noticed that OTP does not conform to our
definition of shared-key cryptosystem in Definition 2.1. Specifically, that def-
inition did not allow the encryption process to maintain a synchronized state
between the parties, such as the counter i identifying the bits already used in
the OTP design - which is, therefore, a stateful design. We therefore extend the
definition so that the encryption and decryption operations may have another
input and another output. The additional input is the current state, and the
additional output is the next state; both are from a set S of possible states.

Definition 2.2 (Stateful shared-key cryptosystem). A stateful shared-key
cryptosystem is a pair of keyed algorithms, ⟨E, D⟩. Algorithm E (and D)
receives as input plaintext (ciphertext for D), and input state s ∈ S, where S
is the set of possible states. Algorithm E (and D) outputs ciphertext (plaintext
for D), and output state s′ ∈ S. We say that ⟨E, D⟩ ensures correctness if
for every message m ∈ M, key k ∈ K and input state s ∈ S it holds that if
(c, s′) = Ek(m, s) then (m, s′) = Dk(c, s).
Exercise 2.8. Define the stateful encryption and decryption functions ⟨E, D⟩
for the OTP cipher.
Solution: We use the index i of the next bit to be encrypted as the state,
initialized with i = 1. Namely, encryption Ek (mi , i) returns (mi ⊕ ki , i + 1),
and decryption Dk (ci , i) returns (ci ⊕ ki , i + 1).
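This stateful ⟨E, D⟩ can be sketched directly in Python (our illustration; we index key bits from 0, while the text starts the state at i = 1):

```python
def E(key, m_bit, i):
    # Encrypt one bit with key bit i; output the ciphertext bit and next state.
    return m_bit ^ key[i], i + 1

def D(key, c_bit, i):
    # Decrypt one bit; the same XOR and the same state update.
    return c_bit ^ key[i], i + 1

key = [1, 0, 1, 1, 0, 1]
se = sd = 0                      # synchronized initial states
for m in [1, 1, 0, 1]:
    c, se = E(key, m, se)
    p, sd = D(key, c, sd)
    assert p == m                # correctness per Definition 2.2
assert se == sd == 4             # both parties advanced their state in step
```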

Stream ciphers. The one-time pad (OTP) is often referred to as a stream
cipher. We use the term stream cipher to refer to stateful cryptosystems that
use a bit-by-bit encryption process, i.e., a stream cipher is a process of mapping
each plaintext bit mi to a corresponding ciphertext bit ci.4 Since stream
ciphers map each plaintext bit to a ciphertext bit, and we require decryption
to be correct (recover the plaintext), the mapping from plaintext to
ciphertext cannot be randomized; but obviously, it also cannot be the same for
all bits, or decryption by an attacker would be trivial. It follows that stream
ciphers must be stateful. This certainly holds for the One-Time Pad, where not
only must the parties share a key as long as all plaintext bits, they must also maintain
4 Other authors use the term stream cipher also for cryptosystems that use byte-by-byte
or block-by-block encryption, essentially as a synonym to stateful encryption.

an exact, synchronized count of the number of key bits used so far, to prevent
reuse of key bits (leading to exposure) and incorrect decryption. Hence,
stream ciphers are inherently stateful.
Stream ciphers are often used in applied cryptography, especially in hardware,
mainly due to their simple and efficient hardware implementation. Rarely do we
use the OTP (or another unconditionally-secure cipher); much more commonly, we
use a stream cipher with a bounded-length key, providing 'only' computational
security. In the following section we introduce pseudo-random generators (PRG)
and pseudo-random functions (PRF), and show how to use either of them to design
a bounded key length stream cipher.

2.5 Pseudo-Randomness, Indistinguishability and Asymptotic Security
Randomness is widely used in cryptography - for example, the One Time Pad
cipher (§ 2.4) uses random keys to ensure unconditional secrecy. In this section,
we introduce pseudo-randomness, a central concept in cryptography, and three
types of pseudo-random schemes: pseudo-random generators (PRG), pseudo-random
functions (PRF) and pseudo-random permutations (PRP). We also introduce the
technique of the indistinguishability test, which is central to the definitions
of these three pseudo-random schemes - as well as to the definition of secure
encryption, which we present later.

2.5.1 Pseudo-Random Generators and their use for Bounded Key-length Stream Ciphers
In this subsection we introduce the Pseudo-Random Generator (PRG). A PRG
has one of the simpler cryptographic definitions, and hence we consider it a good
choice for the first definition; however, it is still not that easy. Hence, in
this subsection we only introduce PRGs informally, focusing on their classical
application: a stream cipher with bounded key length.
We have already seen one stream cipher: the One Time Pad (OTP). The
OTP is unconditionally-secure - but requires the parties to share a secret key
as long as all the plaintext bits they may need to encrypt. Stream ciphers
with bounded key-length cannot be unconditionally-secure; instead, they will
be ‘only’ computationally secure - i.e., secure only assuming that the attacker
has limited computational capabilities.
In this section, we will see one method to implement a bounded-key-length
stream cipher, based on a cryptographic function called a pseudo-random
generator (PRG). PRGs have other important applications, and are among the
cryptographic mechanisms whose definition is least complex; furthermore, they
are a good way to introduce the more complex - and even more important -
cryptographic mechanisms of pseudo-random functions (PRF), pseudo-random
permutations (PRP) and block ciphers, and the important subjects of
pseudorandomness and the indistinguishability test.

PRG: intuitive definition. In this subsection, we only introduce PRGs
informally, focusing on their use in the construction of bounded-key-length
stream ciphers. Given any input string k, a PRG fPRG outputs a
longer string, i.e., (∀k ∈ {0,1}*) (|fPRG(k)| > |k|). Furthermore, if k is a
random string (of |k| bits), then fPRG(k) is pseudo-random. Intuitively, this
means that fPRG(k) cannot be efficiently distinguished from a true random
string of the same length |fPRG(k)|. We define these concepts of efficient
distinguishing and PRG precisely quite soon, in subsection 2.5.2.

Building a stream cipher from a PRG. To obtain a stream cipher, we
require a PRG which produces a pseudo-random string as long as the plaintext.5
We then XOR each plaintext bit with the corresponding pseudo-random bit,
as shown in Figure 2.9.

pad = fPRG(k); |pad| = |c| = |m| > |k|; c = m ⊕ fPRG(k)

Figure 2.9: PRG-based Stream Cipher. The input to the PRG is usually called
either key or seed; if the input is random, or pseudo-random, then the (longer)
output string is pseudo-random. The state includes the current bit index i.

The pseudo-random generator stream cipher is very similar to the OTP;
the only difference is that instead of using a truly random sequence of bits
to XOR with the plaintext bits, we use the output of a Pseudo-Random Generator
(PRG). If we denote the ith output bit of fPRG(k) by fPRG(k)i, then the
ith ciphertext bit ci is defined as: ci = mi ⊕ fPRG(k)i. This is a shared-key
stream cipher in which k is the shared key, quite similar to the OTP.
Specifically, the state is the index of the bit i, encryption Ek(mi, i) returns
(mi ⊕ fPRG(k)i, i + 1), and decryption Dk(ci, i) returns (ci ⊕ fPRG(k)i, i + 1).
Note that this may require us to compute the value of fPRG(k) each time we
need a specific bit, or to store it (all or parts of it).
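As a concrete sketch, the following Python code expands a short key into a long pad by hashing the key together with a counter; this is only a stand-in for fPRG, chosen for readability, and we do not claim it is a provably secure PRG:

```python
import hashlib

def prg_pad(key: bytes, nbytes: int) -> bytes:
    # Stand-in for f_PRG(k): stretch the key into nbytes pseudo-random
    # bytes by hashing key || counter (illustration only, not a proven PRG).
    out = b""
    counter = 0
    while len(out) < nbytes:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:nbytes]

def stream_xor(key: bytes, data: bytes) -> bytes:
    # c = m XOR f_PRG(k); applying the same XOR again decrypts.
    pad = prg_pad(key, len(data))
    return bytes(d ^ p for d, p in zip(data, pad))

key = b"short shared key"   # only 16 bytes
msg = b"a plaintext that is much longer than the shared sixteen-byte key"
ct = stream_xor(key, msg)
assert stream_xor(key, ct) == msg
```

In contrast to the OTP, here the shared key is much shorter than the plaintext; the price is that security can only be computational.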
5 The output of some PRGs may be only slightly longer than their input, e.g., one bit
longer. However, we can use such a PRG to construct another PRG, whose output length is
longer (as a function of its input length). The details and proof are beyond our scope; see,
e.g., [79].

In subsection 2.5.2 below, we discuss PRGs, whose definition is based on
the PRG indistinguishability test. Let us first introduce the concept of indis-
tinguishability test, one of the ingenious concepts introduced by Alan Turing.

2.5.2 The Turing Indistinguishability Test


Intuitively, a PRG is an efficient algorithm, whose input is a binary string s,
called seed (or sometimes key); if the input is either random or pseudo-random,
then the (longer) output string is pseudo-random. In order to turn this intuitive
description into a definition, we first define clearly the notions of ‘efficient’
and ‘pseudorandom’. We discuss these notions in the following subsections;
in this subsection, we first present the ingenious but non-trivial concept of
indistinguishability test, which is key to the notion of pseudorandomness - and
to many definitions in cryptography.
The first indistinguishability test was the Turing Indistinguishability test,
proposed by Alan Turing in 1950, in a seminal paper [162], which laid the
foundations for artificial intelligence. Turing proposed the test, illustrated in
Figure 2.10, as a possible definition of an intelligent machine. Turing referred
to this test as the imitation test; another name often used for this test is simply
the Turing test.

Figure 2.10: The Turing Indistinguishability Test. A machine is considered
intelligent if a distinguisher (judge) cannot determine in which box is the
machine and in which is a human. Turing stipulated that communication
between the distinguisher and the boxes would only be in printed form, to
avoid what he considered 'technical' challenges such as voice recognition.

Many cryptographic mechanisms are defined using indistinguishability tests.
These tests are similar, in their basic concept, to the Turing indistinguishability
test. The following subsection presents the first such test, which tests for the
important property of pseudorandomness.

2.5.3 PRG indistinguishability test


We now return to the discussion of pseudorandomness, and define the PRG
indistinguishability test, illustrated in Figure 2.11. The pseudorandom test is
similar to the Turing indistinguishability test in Figure 2.10, in the sense that
a distinguisher is asked to identify which is the ‘true’ (intelligent person in

Turing test, and random sequences here) and who is the ‘imitation’ (machine
in Turing test, and sequences output by function f here).
Intuitively, a pseudo-random generator is a function f whose input is a
'short' random bit string x ←$ {0,1}^n, and whose output is a longer string
f(x) ∈ {0,1}^{l_n}, s.t. l_n > n, which is pseudo-random - i.e.,
indistinguishable from a random string (of the same length l_n).
But what does it mean for the output to be indistinguishable from random?
This is defined by the PRG indistinguishability test, which we next define -
and which, in concept, quite resembles the Turing indistinguishability test,
although the details are different. The similarity can be seen from comparing
the illustration of the PRG indistinguishability test in Figure 2.11, to that of
the Turing test in Figure 2.10.

Figure 2.11: Intuition for the Pseudo-Random Generator (PRG) Indistinguishability
Test. Intuitively, f : {0,1}* → {0,1}* is a (secure) pseudo-random
generator (PRG), if an efficient distinguisher D can't effectively distinguish
between f(x), for a random input x, and a random string of the same length
|f(x)|.

In order to turn this intuition into a definition of a (secure) Pseudo-Random
Generator (PRG), we must specify precisely the capabilities of the distinguisher
and the criteria for the outcome of the experiment, i.e., when we would say that
f is indeed a (good/secure) PRG. We discuss these two aspects in the
following subsection, where we finally present definitions for a (secure) PRG.

2.5.4 Defining security: in general and for PRG


We now finally define a (secure) PRG. We first define the distinguisher
capabilities; next, we define the advantage ε^PRG_{D,f}(n) of D for function f
and inputs of length n; and finally we define a (secure) PRG.

Distinguisher capabilities: Efficient algorithm (PPT). We model the
distinguisher as an algorithm, denoted D, which receives a binary string
- either a random string or the 'pseudorandom' output of the PRG f - and
outputs its evaluation, which should be 0 if given a truly random string, and 1
otherwise, i.e., if the input is not truly random. The distinguisher algorithm
D has to be efficient (or PPT). The terms efficient algorithm and PPT (Probabilistic
Polynomial Time) algorithm are crucial to definitions of asymptotic
security, which we use in this textbook.6
Definition 2.3. We say that an algorithm A is efficient if its running time is
bounded by some polynomial in the length of its inputs. An efficient algorithm
may be randomized, in which case we say that it is in the set of Probabilistic
Polynomial Time (PPT) algorithms (A ∈ P P T ).
These definitions are widely used in computer science, in particular in the
theory of complexity. A useful property is that the set of efficient/PPT
algorithms remains the same for different 'reasonable' computational models, such
as different sets of basic operations and different sizes and types of storage.

PRG: the advantage of D for f. Before we define the criteria for a
function f to be considered a (secure) Pseudo-Random Generator (PRG), we
notice that by simply randomly guessing, the distinguisher may succeed with
probability 1/2. Succeeding with probability 1/2 is, therefore, meaningless;
and if the distinguisher succeeds with probability less than 1/2, then it is
better to invert its conclusions.
We therefore define the advantage ε^PRG_{D,f}(n) of D for function f and
inputs of length n, as the probability that D outputs 1 (correctly) when given
f(x), where x ←$ {0,1}^n is a random string of length n, minus the probability
that D outputs 1 (incorrectly) when given a truly random string of the same length.

Length-uniform assumption. We simplify the definitions by assuming that
f is a length-uniform function, i.e., for every input of length n, the output
would be of the same length l_n.
Definition 2.4. Let f : {0,1}* → {0,1}* be a length-uniform function, i.e., if
|x| = n then |f(x)| = l_n, and let D be an algorithm. The PRG-advantage of D
for f is denoted ε^PRG_{D,f}(n) and defined as:

    ε^PRG_{D,f}(n) ≡ Pr_{x ←$ {0,1}^n}[D(f(x))] − Pr_{y ←$ {0,1}^{l_n}}[D(y)]    (2.10)

The probabilities in Equation 2.10 are computed over the uniformly-random
n-bit binary string x (the seed), the uniformly-random l_n = |f(1^n)|-bit binary
string y, and the uniformly-random coin tosses of the distinguisher D, if D uses
random bits. Note that the length of the output of f depends only on the length
of the input; hence our use of l_n.
6 We also briefly discuss concrete security.

Negligible functions. Intuitively, f is a (secure) PRG, if all efficient (PPT)
distinguishers D have only negligible advantage ε^PRG_{D,f}(n). But what does it
mean for the advantage to be negligible? And how can we hope for the advantage
to be negligible, or even small, for a very short input (seed)? If n is small,
then the distinguisher could compute the outputs for all possible inputs, i.e.,
the set F[n] = {f(x)}_{x∈{0,1}^n}; and whenever given a string which is not in
F[n], the distinguisher can safely output 0, since it knows that this cannot be
an output of f (for any n-bit input).
There are two main approaches to dealing with this challenge in the
cryptographic literature, usually referred to as asymptotic security and
concrete security. We will focus on asymptotic security, since we believe that
the concrete-security definitions are a bit harder to understand and work with;
as with other advanced topics, readers are encouraged to learn them in more
advanced courses and textbooks, as well as directly in the relevant papers,
e.g., [12].
Asymptotic security requires the advantage function ε^PRG_{D,f}(n) to be
negligible. What does it mean when we say that a function ε : N → R is
negligible? Clearly, we expect such a function to converge to zero for large
input, i.e.: lim_{n→∞} ε(n) = 0. But a negligible function converges to zero
in a very strong sense - faster than the inverse of any polynomial.7

Definition 2.5 (Negligible function). A function ε : N → R is negligible, if
for every non-zero polynomial p(n) holds:

    lim_{n→∞} p(n) · ε(n) = 0    (2.11)

We use NEGL to denote the set of all negligible functions; we sometimes write
NEGL(n) to emphasize that the function is negligible in the input parameter n.

Note that any non-zero polynomial function ε(n) - even if its values appear
to us tiny - cannot fit this definition of a negligible function: the definition
requires p(n) · ε(n) → 0 as n → ∞ for every non-zero polynomial p(n), but already
for the constant polynomial p(n) = 1 we have p(n) · ε(n) = ε(n), which does not
converge to zero for any non-zero polynomial ε(n).

Equivalent definition using n^{-c} instead of p(n). An equivalent definition
is to require that for every c > 0 and all sufficiently large n holds:
|ε(n)| < n^{-c}; this is sometimes easier to use. The interested reader should
be able to prove that the two definitions are equivalent.
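A small numeric illustration of the equivalent definition (ours, not from the text): for the function ε(n) = 2^{-n/2}, and each exponent c, we can find a threshold beyond which ε(n) < n^{-c}:

```python
def eps(n: int) -> float:
    # eps(n) = 2^(-n/2): drops below any inverse polynomial, eventually.
    return 2.0 ** (-n / 2)

for c in [1, 2, 5, 10]:
    n = 2
    while not eps(n) < n ** (-c):
        n += 1
    print(f"eps(n) < n^(-{c}) first holds at n = {n}")

# By contrast, a function like n^(-10) looks tiny for large n, but it is
# not negligible: it never drops below n^(-11), so the condition fails
# for c = 11, no matter how large n is.
```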
Exercise 2.9. Which of the following functions are negligible? Prove your
responses: (a) f_a(n) = 10^{-8} · n^{-10}, (b) f_b(n) = 2^{-n/2},
(c) f_c(n) = 1/n!, (d) f_d(n) = (−1)^n / n.
7 Of course, we exclude the trivial, always-zero polynomial, p(n) = 0. Also, note that we

find it more convenient and natural to consider the absolute value of the function and poly-
nomial, i.e., we do not consider a function with large negative values to be negligible; other
authors require only the positive value of the function to be small, but the two definitions
are equivalent anyway for our purposes.

Working with negligible functions is a useful simplification; here is one
convenient property, which shows that if an algorithm has a negligible
probability to 'succeed', then running it a polynomial number of times will not
help - the probability of success will remain negligible.

Lemma 2.1. Consider a negligible function ε : N → R, i.e., ε ∈ NEGL. Then
for any polynomial p(n), the functions g(n) = ε(p(n)) and f(n) = p(n) · ε(n)
are also negligible, i.e., f(n), g(n) ∈ NEGL.

Asymptotic-secure definition of PRG. Finally, let's define a (secure)
PRG. The definition assumes both the PRG and the distinguisher D are
polynomial-time, meaning that their running time is bounded by a polyno-
mial in the input length. Polynomial-time algorithms are often referred to
simply by the term efficient algorithms. Applied cryptography and most of the
theoretical cryptography only deal with polynomial-time (efficient) algorithms,
schemes and adversaries. Note that the PRG must be deterministic, but the
distinguisher D may be probabilistic (randomized).

Definition 2.6 (Pseudo-Random Generator (PRG)). A length-uniform function
f : {0,1}* → {0,1}*, s.t. (∀x ∈ {0,1}^n) l_n = |f(x)|, is a (Secure)
Pseudo-Random Generator (PRG), if it is efficiently-computable (f ∈ PPT),
length-increasing (l_n > n) and ensures indistinguishability, i.e., for every
distinguisher D ∈ PPT, the advantage of D for f is negligible:
ε^PRG_{D,f}(n) ∈ NEGL, where ε^PRG_{D,f}(n) is defined as in Equation 2.10.

The term ‘secure’ is often omitted; i.e., when we say that algorithm f is a
pseudo-random generator (PRG), this implies that it is a secure PRG.

Exercise 2.10. Let x ∈ {0,1}^n. Show that the following are not PRGs:
(a) f_a(x) = x ++ parity(x), (b) f_b(x) = 3x mod 2^{n+1} (using standard
binary encoding).

Solution for part (b): Notice that here we view x as a number encoded in
binary, whose value can be between 0 and 2^n − 1.
A simple distinguisher D_b for f_b is: D_b(y) outputs 1 (i.e., pseudo-random)
if y = 0 mod 3; otherwise, it outputs 0 (i.e., random). Why does this work?
If y is a random (n+1)-bit string, then the probability of y = 0 mod 3 is
1/3 (actually, a tiny bit less).
And what if y = f_b(x), i.e., y = 3x mod 2^{n+1}? If x < 2^{n+1}/3, then
3x < 2^{n+1}, i.e., y = 3x, and hence y = 0 mod 3. And
Pr_{x ←$ {0,1}^n}[x < 2^{n+1}/3] ≥ 2/3. Therefore, if y = f_b(x), i.e.,
y = 3x mod 2^{n+1}, then y = 0 mod 3 with probability at least 2/3.
Therefore, ε^PRG_{D_b,f_b}(n) ≥ 2/3 − 1/3 = 1/3, which is clearly non-negligible.
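We can also check the distinguisher D_b empirically; the sketch below (ours) estimates its advantage for n = 16 by sampling:

```python
import random

random.seed(1)   # for reproducibility of the estimate
n = 16

def f_b(x: int) -> int:
    # The candidate "PRG" of part (b): 3x mod 2^(n+1).
    return (3 * x) % 2 ** (n + 1)

def D_b(y: int) -> int:
    # Output 1 ("pseudo-random") iff y is divisible by 3.
    return 1 if y % 3 == 0 else 0

trials = 20000
hits_prg = sum(D_b(f_b(random.randrange(2 ** n))) for _ in range(trials))
hits_rnd = sum(D_b(random.randrange(2 ** (n + 1))) for _ in range(trials))
advantage = (hits_prg - hits_rnd) / trials
assert advantage > 0.25          # close to the proven bound of 1/3
```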

2.5.5 Secure PRG Constructions
Note that we did not present a construction of a secure PRG. In fact, if we
could have presented a provably-secure construction of a PRG, satisfying
Def. 2.6, this would have immediately proven that P ≠ NP, solving one of the
most well-known open problems in the theory of complexity. Put differently, if
P = NP, then there cannot be any secure PRG algorithm (satisfying Def. 2.6).
Since the P =? NP question is believed to be very hard, it is unlikely to be
resolved as a side-product of proving that some function is a PRG; hence,
proving that a given construction is a (secure) PRG must also be a very hard
problem. Similar arguments apply to most of the cryptographic mechanisms we
will learn in this course, including secure encryption, when messages may be
longer than the key. (The One Time Pad (OTP) is secure encryption, but its key
is as long as the plaintext.)
What is possible is to present a reduction-based construction of a PRG,
namely, a construction of a PRG from some other cryptographic mechanism, along
with a proof that the PRG is secure if that other mechanism is 'secure'. For
example, see [79] for a construction of a PRG f from a different cryptographic
mechanism called a one-way function (which we discuss in § 4.4), and a reduction
proof, showing that if the construction of f uses a OWF f_OWF, then the
resulting function f would be a PRG. We will also present a few reduction
proofs; for example, later in this section we prove reductions which construct
a PRG from other cryptographic mechanisms such as a Pseudo-Random Function
(PRF), see Exercise 2.13, and a block cipher. Courses, books and papers dealing
with cryptography are full of reduction proofs, e.g., see [79, 80, 159].
Unfortunately, there is no proof of the existence of any of these - one-way
functions, PRFs, block ciphers or most other cryptographic schemes. Indeed,
such a proof would imply P ≠ NP. Still, reduction proofs are the main method
of ensuring the security of most cryptographic mechanisms - by showing that
they are 'at least as secure' as another cryptographic mechanism, typically
a mechanism whose security is well established (e.g., by the failure of
extensive cryptanalysis efforts).
For example, one-way functions appear 'weaker' than PRGs, so may be
easier to design securely - and then use the reduction to obtain a PRG. As a
more practical example, block ciphers are standardized, with lots of
cryptanalysis efforts; therefore, block ciphers are a good basis for building
other cryptographic functions.
Let us give an important example of a reduction-based proof which is specific
to PRGs. This is a construction of a PRG whose output is significantly longer
than its input, from a PRG whose output is only one bit longer than its input.
Unfortunately, the construction and proof are beyond our scope; see [79].
However, the following exercise (Exercise 2.11) proves a related - albeit much
simpler - reduction, showing that a PRG from n bits to (n+1) bits yields a PRG
from (n+1) bits to (n+2) bits, simply by exposing one bit. In other words, this
shows that a PRG may expose one (or more) bits - but remain a PRG.

Exercise 2.11. Let f : {0,1}^n → {0,1}^{n+1} be a secure PRG. Is
f′ : {0,1}^{n+1} → {0,1}^{n+2}, defined as f′(b ++ x) = b ++ f(x), where
b ∈ {0,1}, also a secure PRG?
Solution: Yes, if f is a PRG then f′ is also a PRG. First, recall the
PRG-advantage (Equation 2.10) for distinguisher D, using l_n = n + 1:

    ε^PRG_{D,f}(n) ≡ Pr_{x ←$ {0,1}^n}[D(f(x))] − Pr_{r ←$ {0,1}^{n+1}}[D(r)]    (2.12)

Next, rewrite Equation 2.10 for f′ and distinguisher D′, by substituting f
by f′, x by x′, n by n + 1 and l_n by n + 2:

    ε^PRG_{D′,f′}(n + 1) ≡ Pr_{x′ ←$ {0,1}^{n+1}}[D′(f′(x′))] − Pr_{r′ ←$ {0,1}^{n+2}}[D′(r′)]    (2.13)

We next present a simple construction of a distinguisher D (for f), using,
as a subroutine (oracle), a given distinguisher D′ (for f′):

    D(y) ≡ { Return D′(b ++ y), where b ←$ {0,1} }    (2.14)

Clearly D is efficient (PPT) if and only if D′ is efficient (PPT).
We prove that ε^PRG_{D′,f′}(n + 1) = ε^PRG_{D,f}(n); therefore, f′ is a PRG
if and only if f is a PRG. We begin by developing the first component of
Equation 2.12:

    Pr_{x ←$ {0,1}^n}[D(f(x))] = Pr_{x ←$ {0,1}^n}[D′(b ++ f(x)) | b ←$ {0,1}]    (2.15)
                               = Pr_{x ←$ {0,1}^n}[D′(f′(b ++ x)) | b ←$ {0,1}]    (2.16)
                               = Pr_{x′ ←$ {0,1}^{n+1}}[D′(f′(x′))]    (2.17)

We now develop the other component of Equation 2.12:

    Pr_{r ←$ {0,1}^{n+1}}[D(r)] = Pr_{r ←$ {0,1}^{n+1}}[D′(b ++ r) | b ←$ {0,1}]    (2.18)
                                = Pr_{r′ ←$ {0,1}^{n+2}}[D′(r′)]    (2.19)

Now substitute the two components in Equation 2.12:

    ε^PRG_{D,f}(n) ≡ Pr_{x ←$ {0,1}^n}[D(f(x))] − Pr_{r ←$ {0,1}^{n+1}}[D(r)]    (2.20)
                   = Pr_{x′ ←$ {0,1}^{n+1}}[D′(f′(x′))] − Pr_{r′ ←$ {0,1}^{n+2}}[D′(r′)]    (2.21)
                   ≡ ε^PRG_{D′,f′}(n + 1)    (2.22)

Hence, ε^PRG_{D,f}(n) = ε^PRG_{D′,f′}(n + 1), namely, f′ is a PRG if and only
if f is a PRG.

Feedback Shift Registers (FSR). There are many proposed designs for
PRGs. Many of these are based on Feedback Shift Registers (FSRs), with
a known linear or non-linear feedback function f, as illustrated in Fig. 2.12.
For Linear Feedback Shift Registers (LFSRs), the feedback function f is simply
the XOR of some of the bits of the register. Given the values of the initial
bits r1, r2, ..., rl of an FSR, the value of the next bit rl+1 is defined as
rl+1 = f(r1, ..., rl); the following bits are defined similarly:
(∀i > l) ri = f(ri−l, ..., ri−1).
FSRs are well-studied with many desirable properties. However, by defini-
tion, their state is part of their output. Hence, they cannot directly be used
as cryptographic PRGs. Instead, there are different designs of PRGs, often
combining multiple FSRs (often LFSRs) in different ways. For example, the
A5/1 and A5/2 stream ciphers defined in the GSM standard combine three
linear shift registers.


Figure 2.12: Feedback Shift Register, with (linear or non-linear) feedback
function f().
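A Fibonacci-style LFSR can be sketched in a few lines of Python (our illustration; the tap positions and bit ordering follow one common convention):

```python
def lfsr_bits(seed, taps, count):
    # Each step outputs the last register bit, then shifts in the feedback
    # bit: the XOR of the tapped cells (a linear feedback function f).
    state = list(seed)
    out = []
    for _ in range(count):
        out.append(state[-1])
        fb = 0
        for t in taps:
            fb ^= state[t]
        state = [fb] + state[:-1]
    return out

# 4-bit register with taps on the last two cells: the state cycles through
# all 15 non-zero values, so the output sequence has period 15.
bits = lfsr_bits([1, 0, 0, 0], taps=[2, 3], count=30)
assert bits[:15] == bits[15:]
```

Note how the register contents appear directly in the output stream; this is exactly why an FSR on its own cannot serve as a cryptographic PRG.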

Feedback shift registers are convenient for hardware implementations. Other
PRG designs are intended for software implementations, such as the RC4 stream
cipher.
There are many cryptanalysis attacks on different stream ciphers, including
the three mentioned above (RC4 and the two GSM ciphers, A5/1 and A5/2);
presenting the details of these 'classical' ciphers, and of the attacks on them,
is beyond our scope; see, e.g., [7] for GSM's A5/1 and A5/2, and [107, 124] for
RC4.
Instead of presenting a cryptanalytical attack on a PRG, we next give an
example of an attack against a vulnerable deployment of a PRG in a system.
Specifically, we present an attack against MS-Word 2002, which exploits a
vulnerability in the usage of the RC4 PRG, rather than in its design. Namely,
this attack could have been carried out with any PRG used incorrectly as in
MS-Word 2002.

Example 2.1. MS-Word 2002 used RC4 for document encryption, in the following
way. The user provided a password for the document; that password was used as a
key to the RC4 PRG, producing a long pseudo-random string which is referred to
as Pad, i.e., Pad = RC4(password). When the document is saved or restored from
storage, it is XORed with Pad. This design is vulnerable; can you spot why?
The vulnerability is not specific in any way to the choice of RC4; the problem
is in how it is used. Namely, this design re-uses the same pad whenever the
document is modified - a 'multi-time pad' rather than a one-time pad. Plaintext
Word documents contain sufficient redundancy to allow decryption. See details
in [126].
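The 'multi-time pad' failure is easy to demonstrate; in the Python sketch below (ours, with made-up document contents), XORing the two ciphertexts cancels the pad and reveals the XOR of the two plaintext versions:

```python
import secrets

pad = secrets.token_bytes(32)                 # same pad reused for both versions
v1 = b"Salary: 100000. Do not disclose."
v2 = b"Salary: 250000. Do not disclose."
c1 = bytes(m ^ p for m, p in zip(v1, pad))
c2 = bytes(m ^ p for m, p in zip(v2, pad))

leak = bytes(a ^ b for a, b in zip(c1, c2))   # attacker computes c1 XOR c2
assert leak == bytes(a ^ b for a, b in zip(v1, v2))
assert leak[0] == 0                           # equal plaintext bytes leak as zeros
```

Combined with the redundancy of real documents, such XOR-of-plaintexts leakage is typically enough to recover both versions.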

The fact that the vulnerability is due to the way RC4 was used, and not to
cryptanalysis of RC4, is very typical of vulnerabilities in systems involving
cryptography. In fact, cryptanalysis is rarely the cause of vulnerabilities -
system, configuration and software vulnerabilities are more common.

2.5.6 Random functions


One practical drawback of stream ciphers is the fact that they require state,
to remember how many bits (or bytes) were already output. What happens if
state is lost? Can we eliminate or reduce the use of state? It would be great to
allow recovery from loss of state, or to avoid the need to preserve state when
encryption is not used, e.g., between one message and the next. In the next
section, we introduce another pseudo-random cryptographic mechanism, called
a Pseudo-Random Function (PRF), which has many applications in cryptog-
raphy - including stateless, randomized shared-key cryptosystems. However,
before we introduce pseudo-random functions, let us first discuss the ‘real’ ran-
dom functions.

Process for selecting a random function. Consider an arbitrary domain
D and range R; how can we select a random function from D to R? One way
is as follows: for each input x ∈ D, select a random element in R to be f(x),
namely:

    (∀x ∈ D) f(x) ←$ R    (2.23)
In a typical case, both D and R are sets of binary strings, i.e., for some
integers n, m, we have R = {0,1}^n, D = {0,1}^m. In this case, there are 2^m
elements in the domain D = {0,1}^m, i.e., |D| = 2^m, and each random selection
of an element in the range R = {0,1}^n requires n random bits; in total, to
select a random function we need n · 2^m coin flips. This process can easily
be done for a small domain and range, by randomly choosing the mapping and
writing it in a table, e.g., as in Table 2.2 and Exercise 2.12.

Exercise 2.12. Using a coin, randomly select the functions below; count your
coin flips.

    Function | Domain  | Range   | 00 | 01 | 10 | 11 | coin-flips
    f1       | {0,1}^2 | {0,1}   |    |    |    |    |
    f2       | {0,1}^2 | {0,1}^3 |    |    |    |    |

Table 2.2: Do-it-yourself table for selecting f1, f2 randomly, in Exercise 2.12.

1. f1 : {0,1}^2 → {0,1} (use a copy of Table 2.2)

2. f2 : {0,1}^2 → {0,1}^3 (use a copy of Table 2.2)

3. f3 : {0,1}^3 → {0,1}^2 (create your own table)

How many coin flips were required for each function?
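The table-selection process of Exercise 2.12 can also be sketched in code. The following Python sketch (an illustration, not part of the book's material) selects a random function f : {0,1}^m → {0,1}^n by flipping a coin for each bit of each table entry, and counts the coin flips, confirming the n · 2^m total:

```python
# Sketch: selecting a random function f: {0,1}^m -> {0,1}^n by 'coin flips',
# one table entry per domain element, as in Exercise 2.12.
import secrets

def select_random_function(m, n):
    """Return (table, flips): a random f: {0,1}^m -> {0,1}^n and the flip count."""
    table = {}
    flips = 0
    for x in range(2 ** m):                              # every input in {0,1}^m
        bits = [secrets.randbelow(2) for _ in range(n)]  # n coin flips for f(x)
        flips += n
        table[x] = ''.join(map(str, bits))
    return table, flips

f1, flips1 = select_random_function(2, 1)   # f1: {0,1}^2 -> {0,1}
f2, flips2 = select_random_function(2, 3)   # f2: {0,1}^2 -> {0,1}^3
print(flips1, flips2)                       # 4 12, i.e., n * 2^m flips each
```

Note how quickly the flip count grows: for a realistic domain such as {0,1}^128, the loop (and the table) would be astronomically large, which is exactly the efficiency problem discussed below.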

Notations. We denote the set of all functions from domain D to range R by
{D → R}, and the choice of a random function f from D to R by f ←$ {D → R}.
In the typical case where the domain is large, the choice of a random function requires an excessive number of random bits (coin-flip operations). Selecting and storing such a function is difficult, as would be sending it - which is required if we want multiple parties to use the same random function, as is very useful for cryptographic applications such as a stateless stream cipher. This motivates the use of Pseudo-Random Functions, which we discuss in the next subsection. However, let us first discuss random functions a bit more, to improve our understanding of this important but subtle concept.

Stream cipher using a random function. Figure 2.13a presents the design of a stream cipher using a randomly-chosen function f which is shared by the two parties and kept secret. The design could be used for bit-by-bit encryption, with the random function mapping each input i to a single bit f(i), which is then XORed with the corresponding message bit mi to form the ciphertext ci = mi ⊕ f(i). Alternatively, both the input messages mi and the output of the random function could be strings of some length, e.g., n; then each invocation of the random function produces n pad bits, XORed with the n message bits to produce n ciphertext bits.
One drawback of the use of stream ciphers is the need to maintain synchronized state between sender and recipient. This refers to the (typical) case where the input is broken into multiple messages, each provided in a separate call to the encryption device. To encrypt all of these messages using a stream cipher - OTP or the design in Figure 2.13a - the two parties must maintain the number i of calls (bits or fixed-length strings) so far. To avoid this requirement, we can use randomized encryption, as we explain next.

Stateless, randomized encryption using a random function f. An
even more interesting application of a random function is to avoid the need
[Figure 2.13 appears here.]

(a) Using random function f to construct a stream cipher for stateful encryption, with limited storage (a counter): ci = mi ⊕ f(i). Does not require randomization, and is communication-optimal (|ciphertext| = |plaintext|).

(b) Using random function f to construct stateless, randomized encryption: ci = (mi ⊕ f(ri), ri), with ri ←$ {0,1}^n. This construction has high communication overhead: n (random) bits per plaintext bit.

Figure 2.13: Bit-wise encryption using random function f(·). We later improve: replace the random function by a Pseudo-Random Function (PRF), and use block-wise operations to reduce overhead; e.g., counter mode (stateful) and OFB mode (randomized).

for the two parties to maintain state (of the message/bit counter i). To do this,
we use the random function to construct randomized encryption, as shown in
Figure 2.13b.
To encrypt each plaintext message mi, we choose a string ri of n random bits, i.e., ri ←$ {0,1}^n. The ciphertext is the pair Ef(mi) ≡ (mi ⊕ f(ri), ri). Note our use of the function f as the key to the encryption; indeed, we can think of the table containing our random mapping of each of the 2^n strings in the domain {0,1}^n to a bit, as the function - and as the key to the encryption process.
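This randomized scheme can be sketched in a few lines of Python (an illustration only; the tiny bit-length n = 16 and the table that is filled in on demand are our choices, made to keep the example small and runnable):

```python
# Sketch: stateless randomized encryption E_f(m) = (m XOR f(r), r), with the
# shared 'random function' f simulated by a table filled in on first use.
import secrets

N = 16                      # bit-length of r (tiny; for illustration only)
_table = {}                 # the shared random function f, sampled on demand

def f(r):
    if r not in _table:                   # first use of r: flip a fresh coin
        _table[r] = secrets.randbelow(2)  # range is a single bit
    return _table[r]

def encrypt(m_bit):
    r = secrets.randbelow(2 ** N)         # fresh random string r
    return (m_bit ^ f(r), r)              # ciphertext: (m XOR f(r), r)

def decrypt(c):
    pad_xor_m, r = c
    return pad_xor_m ^ f(r)               # recipient recomputes f(r)

c = encrypt(1)
assert decrypt(c) == 1
```

Note that encryption keeps no counter or other state between calls: all the recipient needs is the shared function f and the ri value sent alongside each ciphertext bit.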

Security. As with every cryptographic mechanism, we ask: are the designs in Figure 2.13a and Figure 2.13b secure? Intuitively, the design of Figure 2.13a is secure as long as we never re-use the same counter value, and the design of Figure 2.13b is secure as long as we use a sufficient number of random bits. This is actually correct; but it isn't trivial to understand why. Let us focus on the slightly more complex case of randomized encryption (the design of Figure 2.13b); the argument for the counter-based, stateful stream cipher design (Figure 2.13a) follows similarly.
An obvious concern is that an attacker may try to predict the value of f(ri) used to encrypt a message (or bit) mi, from previously-observed ciphertexts {cj}j<i. Let us assume, to be safe, that the attacker knows all the corresponding plaintexts mj, allowing the attacker to find all the corresponding mappings {f(rj)}j<i. Using this information, can the attacker guess f(ri)?
Now, clearly, if ri ∈ {rj}j<i, then the attacker knows f(ri) and can expose mi. To address this, we clearly must use sufficiently-long random strings. But what if ri ∉ {rj}j<i?
To answer this, reconsider the process of selecting a random function, as
you did in Exercise 2.12. What we did was to select the entire table - mapping
from every element in the domain to a random element in the range - before
we applied the random function. However, notice that it does not matter if,
instead, we chose the mapping for each element ri ∈ D in the domain only on
the first time we need to compute f (ri ). Think it over!
This means that if ri ∉ {rj}j<i, then the attacker does not learn anything about f(ri), even if it is given all of the {f(rj)}j<i values. Until we select (randomly) the value of f(ri), the attacker cannot know anything about it.
Therefore, the only concern we have is with the case that ri ∈ {rj}j<i. Let us return to this issue; what is the probability of that happening? Well, since each mapping is selected randomly, it is simply (i−1)/|D|. Focusing on the typical case where the input domain is {0,1}^n, this is (i−1)/2^n.
Therefore, if n is 'sufficiently large', then the maximal number i of observations by the attacker would still be negligible compared to 2^n - and (i−1)/2^n would be negligible. For example, if the attacker can observe a million encryptions, we 'just' need 2^n to be way larger than one million; and considering that a million is less than 2^20, using n significantly larger than 20 - definitely n = 120, or even less - seems safe enough.
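The (i−1)/2^n bound is easy to evaluate numerically; this short computation (an illustration, with the million-encryption attacker of the text) shows how the collision probability collapses as n grows:

```python
# Sketch: the probability that the i-th random string collides with one of the
# i-1 earlier strings is at most (i-1)/2^n (a union bound over i-1 strings).
i = 10 ** 6                      # a million observed encryptions
for n in (20, 40, 120):
    p = (i - 1) / 2 ** n
    print(n, p)                  # near-certain collision for n=20; tiny for n=120
```

For n = 20 the bound is close to 1 (collisions are essentially guaranteed), while for n = 120 it is far below any practical threat.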

Efficiency. So the scheme is secure - provided n is 'sufficiently large', e.g., 80 or more. However, is it also efficient? To implement the scheme, we need to compute the random function f; since we want a recipient to decipher our messages, we need to compute and send all of f before we begin sending ciphertexts. However, this requires us to flip and (securely) share 2^n bits - for n = 80 (or more). Unfortunately, that's clearly impossible. Fortunately, we can use a Pseudo-Random Function (PRF) instead of the random function, providing an efficient solution which is still secure against computationally-limited adversaries.
Note that there is another efficiency concern with the scheme: is it really necessary to send a new random string for each bit? Of course not. We can address this concern in two ways:

Large range R = {0,1}^l : this allows us to use the same random string r or counter i to encrypt a block of l plaintext bits, by bitwise XOR of the l message bits with the corresponding l-bit output of f(r) (or f(i)). In this way, the n bits of r allow encryption of l bits of plaintext. See Figure 2.14.

Use f(r) as seed of a PRG: if we use a sufficiently large range, a PRG could 'expand' f(r) into as many bits as required to bit-wise XOR with the plaintext: Ef(m) = (r, PRG(f(r)) ⊕ m). In this way, the n bits of r allow encryption of an arbitrarily long plaintext m - requiring n new random bits only when encrypting a new plaintext, and only if the state (of the PRG) was not retained. This is essentially what is done by the Output Feedback (OFB) mode of operation, which we see later on, except that the OFB mode also implements the PRG using the PRF, instead of using two separate functions (a PRF and a PRG). Figure 2.15 shows this design, using a PRF Fk.
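The 'f(r) as a PRG seed' variant can be sketched as follows. In this Python illustration (our assumptions, not the book's construction), the shared random function f is again simulated by a lazily-filled table, and SHAKE-128 merely stands in for the PRG that expands f(r) to the plaintext length:

```python
# Sketch: E_f(m) = (r, PRG(f(r)) XOR m). The table-based f and the use of
# SHAKE-128 as a PRG stand-in are illustrative assumptions only.
import hashlib
import secrets

_table = {}
def f(r):                                    # simulated shared random function
    if r not in _table:
        _table[r] = secrets.token_bytes(16)  # range R = {0,1}^128
    return _table[r]

def prg(seed, length):                       # SHAKE-128 as a PRG stand-in
    return hashlib.shake_128(seed).digest(length)

def encrypt(m: bytes):
    r = secrets.token_bytes(8)               # fresh random string r
    pad = prg(f(r), len(m))                  # expand f(r) to |m| pad bytes
    return (r, bytes(a ^ b for a, b in zip(pad, m)))

def decrypt(c):
    r, body = c
    pad = prg(f(r), len(body))
    return bytes(a ^ b for a, b in zip(pad, body))

msg = b"arbitrarily long plaintext, one short r"
assert decrypt(encrypt(msg)) == msg
```

Observe that only the short r travels with each ciphertext, regardless of the plaintext length - the efficiency gain described above.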

[Figure 2.14 appears here.]

(a) Stateful block encryption with random function f(·): ci = mi ⊕ f(i).

(b) Stateless, randomized block encryption with random function f(·): ci = (mi ⊕ f(ri), ri), with ri ←$ {0,1}^n.

Figure 2.14: Block (n-bit) encryption using a random function f(·), using only one function application for n plaintext bits.

Finally, we present a useful property of random functions, which may help to further clarify this important, subtle notion.

Lemma 2.2. Let f′, f″ : D → R be two functions from domain D to range R. Define f′ ⊕ f″ to be the function from D to R s.t. for every x ∈ D holds (f′ ⊕ f″)(x) = f′(x) ⊕ f″(x). Then, if either f′ or f″ is random, then, for every choice of the other function (f″ or f′, respectively), the function f′ ⊕ f″ is also a random function from D to R.

Proof: We prove that if f′ is a random function, then f′ ⊕ f″ is also a random function; the proof for f″ is essentially the same.
Since f′ is a random function, we know that:

    (∀x ∈ D)(∀y ∈ R)  Pr_{f′}[f′(x) = y] = 1/|R|    (2.24)

Consider an arbitrary (given) function f″ : D → R. Substituting y′ = y ⊕ f″(x), we have that:

    (∀x ∈ D)(∀y ∈ R)  Pr_{f′}[(f′ ⊕ f″)(x) = y] = Pr_{f′}[f′(x) ⊕ f″(x) = y]
                                                = Pr_{f′}[f′(x) = y ⊕ f″(x)]
                                                = Pr_{f′}[f′(x) = y′]
                                                = 1/|R|

Hence, f′ ⊕ f″ is a random function.
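Lemma 2.2 is also easy to check empirically. The following Python sketch (an illustration with a tiny domain and an arbitrary fixed f″ of our choosing) estimates the output distribution of (f′ ⊕ f″)(x) over fresh random choices of f′, and finds it close to uniform:

```python
# Sketch: empirical check of Lemma 2.2 over a tiny domain/range. XORing a
# random function f' with a fixed function f'' yields (close to) uniform output.
import random
from collections import Counter

D = range(4)                       # domain {0,1}^2
R = [0, 1]                         # range {0,1}
f2 = {0: 1, 1: 1, 2: 0, 3: 1}      # an arbitrary fixed function f''

random.seed(1)                     # reproducible illustration
counts = Counter()
trials = 20000
for _ in range(trials):
    f1 = {x: random.choice(R) for x in D}      # fresh random f'
    counts[f1[2] ^ f2[2]] += 1                 # observe (f' XOR f'')(2)
print(counts[0] / trials, counts[1] / trials)  # both close to 1/2
```

The same near-uniform frequencies appear for any input x and any choice of f″, as the lemma guarantees.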

2.5.7 Pseudo-Random Functions (PRFs)


A Pseudo-Random Function (PRF) is an efficient substitute for a random function, which ensures similar properties while requiring the generation and sharing of only a short key. The main limitation is that PRFs are secure only against computationally-bounded adversaries.
A PRF scheme has two inputs: a secret key k and a 'message' m; we denote it as PRFk(m). Once k is fixed, the PRF becomes a function of the message alone. The basic property of a PRF is that this function (PRFk(·)) is indistinguishable from a truly random function. Intuitively, this means that a PPT adversary cannot tell if it is interacting with PRFk(·) with domain D and range R, or with a random function f from D to R. Hence, PRFs can be used in many applications, providing an efficient, easily-deployable alternative to the impractical truly random functions.
For example, PRFs can be used to construct shared-key cryptosystems, as
illustrated in Figure 2.15. The figure presents two designs of a cryptosystem
from a PRF: a stateful encryption, as a stream cipher, in Figure 2.15a, and a
stateless randomized encryption, in Figure 2.15b.
Both designs simply use a PRF instead of the random function used in the corresponding designs in Figure 2.13. The security of the PRF designs follows from the security of the corresponding random-function-based designs - and from the indistinguishability of a PRF and a random function. Indeed, this is one case of a very useful technique, which we refer to as the random function design principle.
Principle 5 (Random function design). Design cryptographic protocols and
mechanisms using a random function, to make the security analysis easier.
Once secure, implement using a pseudorandom function; security would follow
since a PRF is indistinguishable from a random function.
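As a concrete illustration of this principle, here is a minimal Python sketch of the stateless, randomized design of Figure 2.15b, with HMAC-SHA256 playing the role of the PRF Fk. HMAC is a standard and widely-trusted PRF candidate, but note that this specific instantiation is our assumption for illustration, not a construction given in the book:

```python
# Sketch: stateless randomized encryption c = (m XOR F_k(r), r), Fig. 2.15b
# style, with HMAC-SHA256 as a stand-in PRF candidate (our assumption).
import hashlib
import hmac
import secrets

def F(k: bytes, x: bytes) -> bytes:          # PRF candidate: HMAC-SHA256
    return hmac.new(k, x, hashlib.sha256).digest()

def encrypt(k: bytes, m: bytes):
    assert len(m) <= 32                      # one 32-byte PRF output of pad
    r = secrets.token_bytes(16)              # fresh randomness per message
    pad = F(k, r)[:len(m)]
    return (r, bytes(a ^ b for a, b in zip(pad, m)))

def decrypt(k: bytes, c):
    r, body = c
    pad = F(k, r)[:len(body)]
    return bytes(a ^ b for a, b in zip(pad, body))

k = secrets.token_bytes(32)
assert decrypt(k, encrypt(k, b"hello PRF")) == b"hello PRF"
```

Replacing the impractical random-function table of the earlier sketches with the short-keyed F is exactly the step the design principle licenses: first analyze with a random function, then substitute a PRF.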
We now, finally, properly define a PRF; but first we must define the oracle-access notation, since we use it to define the PRF. Oracle access is a central concept in complexity theory and cryptography; see [79]. Basically, it means that A may provide input to the corresponding oracle and receive back the response, without knowing which implementation of the oracle it interacts

with (in this case, the truly-random function or the PRF). Other terms for oracle access are 'black-box access' or 'subroutine access'.

[Figure 2.15 appears here.]

(a) Stateful block encryption with a PRF fk(·): ci = mi ⊕ fk(i).

(b) Stateless, randomized block encryption using a PRF fk(·): ci = (mi ⊕ fk(ri), ri), with ri ←$ {0,1}^n.

Figure 2.15: Block (n-bit) encryption using a Pseudo-Random Function (PRF) fk(·), using only one PRF application for n plaintext bits.

Definition 2.7 (Oracle notation). Let F be a function (or an algorithm implementing a function). We use the notation A^F to denote an algorithm A which can, as part of its operation, provide inputs to F and receive the corresponding outputs. We refer to F as an oracle and to A^F as algorithm A with oracle F. We may specify some of the inputs to the oracle, allowing A to specify only the others; e.g., A^{Fk(·,b)} implies that A may receive the results of Fk(a,b) for values a chosen by A, but only for the specified values of k, b.
We next present a formal definition for a Pseudo-Random Function (PRF). In this definition, the adversary A has oracle access to one of two functions: a random function from domain D to range R, i.e., f ←$ {D → R}, or the PRF keyed with a random n-bit key, i.e., Fk with k ←$ {0,1}^n. We denote these two cases by A^f and A^{Fk}, respectively. The adversary should try to distinguish between these two cases, e.g., by outputting 0 (or 'false') if given oracle access to the random function f, and outputting 1 (or 'true') if given access to the PRF Fk. The idea of the definition is illustrated in Fig. 2.16.
Note that the definition allows arbitrary length of the key (n), since indis-
tinguishability is only defined asymptotically - for sufficiently long keys.

Definition 2.8. A pseudo-random function (PRF) is a polynomial-time computable function Fk(x) : {0,1}^* × D → R s.t. for all PPT algorithms A, ε^{PRF}_{A,F}(n) ∈ NEGL, i.e., is negligible, where the advantage ε^{PRF}_{A,F}(n) of the
Figure 2.16: The Pseudo-Random Function (PRF) Indistinguishability Test. We say that a function Fk(x) : {0,1}^* × D → R is a (secure) pseudo-random function (PRF) if no distinguisher D can efficiently distinguish between Fk(·) and a random function f from the same domain D to the same range R, when the key k is a randomly-chosen, sufficiently-long binary string.

PRF F against adversary A is defined as:

    ε^{PRF}_{A,F}(n) ≡ | Pr_{k ←$ {0,1}^n}[A^{Fk}(1^n)] − Pr_{f ←$ {D→R}}[A^f(1^n)] |    (2.25)

The probabilities are taken over the random coin tosses of A, and the random choices of the key k ←$ {0,1}^n and of the function f ←$ {D → R}.

Overview of the PRF indistinguishability test. The basic idea of this definition is the use of an indistinguishability test, much like in the definition of a secure PRG (Definition 2.6), and even the Turing indistinguishability test (Figure 2.10). Namely, a PRF (Fk) is secure if no PPT algorithm A can have significant advantage in identifying the pseudorandom function. We define the advantage as the probability that A outputs 1 ('true', i.e., pseudorandom) when given oracle access to the pseudorandom function Fk, minus the probability that A outputs 1 ('true', i.e., pseudorandom) when given oracle access to the random function f, where both functions are over the same domain D and range R. 'Significant' here means at least the inverse of some positive polynomial in n, the length of the key k, often referred to as the security parameter.

The oracle notation. Both A^{Fk} and A^f use the oracle notation introduced in Definition 2.7. Namely, they mean that A is given 'oracle' access to the respective function (Fk(·) or f(·)). Oracle access means that the adversary can give any input x and get back that function applied to x, i.e., Fk(x) or f(x), respectively.
Why allow arbitrary key length (security parameter)? The definition allows arbitrarily-long keys, although in practice, cryptographic standards often have a fixed key length, or only a few options. The reason that the definition allows arbitrary length is that it requires the success probability to be negligible - smaller than any inverse polynomial in the key length - which is meaningless if the key length is bounded.

Why is 1n given as input to the adversary? A subtle, yet important,


aspect of the definition is the fact that in the two calls to the adversary A, we
provide the adversary with the value 1n as input, where n is the key length
(security parameter). The value 1n simply signifies a string of n consecutive
bits whose value is 1, i.e., it is the value of n encoded in unary. But why
provide 1n as input? It makes sense that the adversary should be informed of
the key length n, but why use unary encoding? Why not provide n using the
‘standard’ binary encoding?
To understand the reason, first recall that we focus on efficient (PPT) algorithms; namely, the running time of both the pseudorandom function F and the adversary A is bounded by a polynomial in the size of their inputs. The inputs to the PRF include the key, and hence consist of at least n bits; hence, the running time of the PRF is (at least) allowed to be polynomial in n. It is therefore 'only fair' that the running time of the adversary A is also allowed to be polynomial in n, which requires the input to be of length O(n). However, if we provide n as input, then its length, using the common binary encoding, would only be log(n); the adversary would remain PPT as a function of its input length, but its running time would be limited to a polynomial in log(n)!

Additional PRF applications. PRFs have many additional applications:

Message Authentication. In chapter 3, we show how a PRF may be used


for message authentication.
Derive independently-random keys/values. In many scenarios, we need
to share between parties multiple, independently-random keys or other
values, but the parties only share one key k. This is easily achieved using
a PRF f: if g1, g2, ... are distinct identifiers, one for each required value,
then we can derive the keys as k1 = fk(g1), k2 = fk(g2), and so on. As a
concrete example, to derive a separate key for each day d from the same
k, we can use kd = fk(d); exposure of k2 and k4 will not expose any other
key, e.g., k3!
Pseudo-random permutation or block cipher. We discuss the use of PRF
to construct a pseudo-random permutation in the following subsection;
later, in § 2.8, we show how to extend this to construct a block cipher.

2.5.8 PRF: Constructions and Robust Combiners
The concept of PRFs was proposed in a seminal paper by Goldreich, Goldwasser
and Micali [82]; the paper also presents a provably-secure construction of PRF,
given a PRG. That is, if there is a successful attack on the constructed PRF,
this attack can be used as a ‘subroutine’ to construct a successful attack on
the underlying PRG. However, the construction requires many applications of
the PRG for a single application of the PRF. Therefore, this construction is
not applied in practice. In contrast, Exercise 2.13 shows one of many simple
and efficient constructions of a PRG from a PRF.

Exercise 2.13. Let F be a PRF with domain and range {0,1}^n, and let k ∈ {0,1}^n. Prove that f(k) = Fk(1) ++ Fk(2), i.e., the concatenation of Fk(1) and Fk(2), is a PRG.

Practical PRF designs, therefore, do not build a PRF from a PRG - they use simpler and more efficient constructions. The most well-known is the construction of a PRF from a block cipher, which we will discuss soon after introducing block ciphers.
Another option is to construct candidate pseudo-random functions directly,
without assuming and using any other ‘secure’ cryptographic function, basing
their security on failure to ‘break’ them using known techniques and efforts by
expert cryptanalysts. In fact, pseudo-random functions are among the crypto-
graphic functions that seem good candidates for such ‘ad-hoc’ constructions; it
is relatively easy to come up with a reasonable candidate PRF, which will not
be trivial to attack. See Exercise 2.35.
Finally, it is not difficult to combine two candidate PRFs F′, F″, over the same domain and range, into a combined PRF F which is secure as long as either F′ or F″ is a secure PRF. We refer to such a construction as a robust combiner. Constructions of robust combiners are known for many cryptographic primitives. The following lemma, from [92], presents a trivial yet efficient robust combiner for PRFs, and also allows us to give a simple example of a typical cryptographic proof of security based on reduction to the security of underlying modules.

Lemma 2.3 (Robust combiner for PRFs). Let F′, F″ : {0,1}^* × D → R be two polynomial-time computable functions, and let:

    F_{(k′,k″)}(x) ≡ F′_{k′}(x) ⊕ F″_{k″}(x)    (2.26)

If either F′ or F″ is a PRF, then F is a PRF. Namely, this construction is a robust combiner for PRFs.

Proof. WLOG assume F′ is a PRF; we show that F is also a PRF. Suppose to the contrary, i.e., that there exists some adversary A that can efficiently distinguish between F_{k′,k″}(·) and a random function f ←$ {D → R}.
We use A as a subroutine in the design of an adversary A′. Adversary A′ is given oracle access to a function y′_b(·), with b ←$ {0,1}, where y′_0(x) = F′_{k′}(x) and y′_1(x) = f′(x), where f′(·) is a random function from domain D to range R. We will show that A′ efficiently distinguishes between F′_{k′}(·) and a random function f′(·) from domain D to range R, in contradiction to the assumption that F′ is a PRF.
A′ first generates a random key k″ for F″; this allows A′ to compute the value of F″_{k″}(x) for any input x. A′ now runs A. When A asks for the computation of the oracle over input x, A′ returns the value y_b(x) = F″_{k″}(x) ⊕ y′_b(x), where y′_b(x) is the value it receives from its own oracle (on input x). Namely,

    y_0(x) = F″_{k″}(x) ⊕ y′_0(x) = F″_{k″}(x) ⊕ F′_{k′}(x) = F_{k′,k″}(x)

And

    y_1(x) = F″_{k″}(x) ⊕ y′_1(x) = F″_{k″}(x) ⊕ f′(x)

From Lemma 2.2, f′(x) ⊕ F″_{k″}(x), like f′(x), is also a random function, which we denote f(x). Hence, y_1(·) is a random function from domain D to range R. Hence, if b = 1, i.e., A′ is given oracle access to the random function f′(x), then A is also given oracle access to a random function f(x); otherwise, if b = 0, i.e., A′ is given oracle access to F′_{k′}(·), then A is given oracle access to F_{k′,k″}. Finally, A′ outputs the same value as returned to it by A, and hence distinguishes between F′_{k′} and the random function f′ with the same advantage with which A distinguishes F from a random function.

Considering that it is quite easy to design candidate PRFs (see Exercise 2.35), and that it is very easy to robustly combine candidate PRFs (Lemma 2.3), it follows that PRFs are a good basis for more complex cryptographic schemes. In particular, later in this section we show how to use a PRF to construct a secure encryption scheme.
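The combiner of Lemma 2.3 is simple enough to sketch directly. In the following Python illustration (our instantiation, not from the book), the two candidate PRFs are HMAC-SHA256 and HMAC-SHA3-256 - two independent designs, so a break of one hash family need not affect the other - combined under separate keys, exactly as Eq. 2.26 requires:

```python
# Sketch: the Lemma 2.3 robust combiner F_{(k',k'')}(x) = F'_{k'}(x) XOR F''_{k''}(x),
# instantiated with two independent PRF candidates (illustrative choice).
import hashlib
import hmac
import secrets

def F1(k: bytes, x: bytes) -> bytes:          # candidate F': HMAC-SHA256
    return hmac.new(k, x, hashlib.sha256).digest()

def F2(k: bytes, x: bytes) -> bytes:          # candidate F'': HMAC-SHA3-256
    return hmac.new(k, x, hashlib.sha3_256).digest()

def combined(k1: bytes, k2: bytes, x: bytes) -> bytes:
    a, b = F1(k1, x), F2(k2, x)               # note: two *separate* keys
    return bytes(u ^ v for u, v in zip(a, b))

k1, k2 = secrets.token_bytes(32), secrets.token_bytes(32)
y = combined(k1, k2, b"input")
print(len(y))                                 # 32: both share a 256-bit range
```

The separate keys k1, k2 are essential; as Exercise 2.14 (next subsection) shows, reusing one key for both candidates can destroy security entirely.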

2.5.9 The key-separation principle and application of PRF


In the PRF robust combiner (Eq. 2.26), we used separate keys for the two candidate-PRF functions F′, F″. In fact, this is necessary, as the following exercise shows.

Exercise 2.14 (Independent keys are required for PRF robust combiners). Let F′, F″ : {0,1}^* × D → {0,1}^* be two polynomial-time computable functions, and let Fk(x) = F′_k(x) ⊕ F″_k(x). Demonstrate that the fact that one of F′, F″ is a PRF may not suffice to ensure that F would be a PRF.

Solution: Suppose F′ = F″. Then for every k, x holds: Fk(x) = F′_k(x) ⊕ F″_k(x) = F′_k(x) ⊕ F′_k(x) = 0^{|F′_k(x)|}. Namely, for any input x and any key k, the output of Fk(x) is an all-zeros string (Fk(x) ∈ 0^*). Hence F is clearly not a PRF.
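The failure in the solution above is easy to see concretely. In this sketch (HMAC-SHA256 is merely a stand-in for F′; any function exhibits the same collapse), sharing one key between identical candidates yields the all-zeros output for every input:

```python
# Sketch: Exercise 2.14 made concrete. With a single shared key,
# F_k(x) = F'_k(x) XOR F''_k(x) collapses to all-zeros whenever F' = F''.
import hashlib
import hmac

def Fprime(k: bytes, x: bytes) -> bytes:   # any function; HMAC as a stand-in
    return hmac.new(k, x, hashlib.sha256).digest()

def bad_combined(k: bytes, x: bytes) -> bytes:
    a, b = Fprime(k, x), Fprime(k, x)      # same key, same function: F' = F''
    return bytes(u ^ v for u, v in zip(a, b))

out = bad_combined(b"key", b"any input")
print(out == bytes(32))                    # True: always the all-zeros string
```

A distinguisher needs a single query to tell this 'combined' function from a random one, so the combiner is completely broken despite F′ being a fine PRF candidate.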
This is an example of the general key-separation principle below. In fact,
the study of robust combiners often helps to better understand the properties
of cryptographic schemes and to learn how to write cryptographic proofs.

Principle 6 (Key-separation). Use separate, independently-pseudorandom keys
for each different cryptographic scheme, as well as for different types/sources
of plaintext and different periods.

The principle combines three main motivations for the use of separate,
independently-pseudorandom keys:

Per-goal keys: use separate keys for different cryptographic schemes. A sys-
tem may use multiple different cryptographic functions or schemes, often
for different goals, e.g., encryption vs. authentication. In this case, se-
curity may fail if the same or related keys are used for multiple different
functions. Exercise 2.14 above is an example.
Limit information for cryptanalysis. By using separate, independently-
pseudorandom keys, we reduce the amount of information available to
the attacker (ciphertext, for example).
Limit the impact of key exposure. Namely, by using separate keys, we
ensure that exposure of some of the keys will not jeopardize the secrecy
of communication encrypted with the other keys.

Pseudo-random Functions (PRFs) have many applications in cryptography,


including encryption, authentication, key management and more. One impor-
tant application is derivation of multiple separate keys from a single shared
secret key k. Namely, a PRF, say f , is handy whenever two parties share one
secret key k and need to derive multiple separate, independently pseudorandom
keys k1 , k2 , . . . from k. A common way to achieve this, is for the two parties to
use some set of identifiers γ1 , γ2 , . . ., a distinct identifier for each derived key,
and compute each key ki as: ki = fk (γi ).
As another example, system designers often want to limit the impact of key
exposure due to cryptanalysis or to system attacks. One way to reduce the
damage from key exposures is to change the keys periodically, e.g., use key kd
for day number d:
Example 2.2 (Using a PRF for independent per-period keys). Assume that Alice and Bob share one master key kM. They may derive a shared secret key for day d as kd = PRF_{kM}(d). Even if all the daily keys are exposed except the key for one day d̂, the key for day d̂ remains secure as long as kM is kept secret.
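The per-day derivation of Example 2.2 can be sketched in a few lines of Python, with HMAC-SHA256 again standing in for the PRF (an illustrative assumption; the day's decimal encoding is likewise our choice):

```python
# Sketch: per-day keys k_d = PRF_{kM}(d), Example 2.2 style, with HMAC-SHA256
# as the PRF candidate and the day number encoded as a decimal string.
import hashlib
import hmac

def derive_day_key(k_master: bytes, day: int) -> bytes:
    return hmac.new(k_master, str(day).encode(), hashlib.sha256).digest()

kM = b"shared master key (example only)"
k2, k3, k4 = (derive_day_key(kM, d) for d in (2, 3, 4))
# Exposure of k2 and k4 reveals nothing useful about k3: each daily key is an
# independently-pseudorandom PRF output under the secret master key kM.
print(k2 != k3 and k3 != k4)   # True
```

Both parties, holding the same kM, derive identical daily keys without any communication - a common pattern for limiting the impact of key exposure.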

2.5.10 Random and Pseudo-Random Permutations


After discussing random functions and PRFs, we now introduce two related
concepts: a random permutation and a pseudo-random permutation (PRP).

Random permutations. A permutation is a function π : D → D mapping a domain D onto itself, where every element is mapped to a distinct element, namely:

    (π : D → D) is a permutation ⇐⇒ (∀x, x′ ∈ D, x ≠ x′) (π(x) ≠ π(x′))    (2.27)

Note that a permutation may map an element onto itself, i.e., π(x) = x is perfectly legitimate.
We use Perm(D) to denote the set of all permutations over domain D. Selection of a random permutation over D, i.e., selecting ρ ←$ Perm(D), is similar to selection of a random function (Equation 2.23) - except for the need to avoid collisions. A collision is a pair of distinct elements (x, x′), both mapped to the same element: y = ρ(x) = ρ(x′).
One natural way to think about this selection is as being done incrementally, mapping one input at a time. Let D′ ⊆ D be the set of range elements already assigned; given any not-yet-mapped element x ∈ D, select ρ(x) ←$ D \ D′.
Using this process, for a small domain, e.g., D = {0,1}^n for small n, the selection of a random permutation ρ is easy and can be done manually - similarly to the process for selecting a random function (over a small domain and range). The process requires O(2^n) coin tosses, time and storage. For example, use Table 2.3 to select two random permutations over domain D = {0,1}^2, and notice the number of coin-flips required.

    Function | Domain  | 00 | 01 | 10 | 11 | coin-flips
    ρ1       | {0,1}^2 |    |    |    |    |
    ρ2       | {0,1}^2 |    |    |    |    |

Table 2.3: Do-it-yourself table for selecting random permutations ρ1, ρ2 over domain D = {0,1}^2.
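The incremental, collision-avoiding process can be sketched directly: each new input is mapped to a uniformly random element among those not yet used as an output. Here is a small Python illustration (the seed is fixed only to make the run reproducible):

```python
# Sketch: selecting a random permutation over a small domain incrementally -
# each new input x is mapped to a uniform choice from D \ D' (unused outputs).
import random

def select_random_permutation(domain):
    unused = list(domain)            # D \ D': range elements still available
    pi = {}
    for x in domain:
        y = random.choice(unused)    # uniform over the remaining elements
        unused.remove(y)             # never reuse an output: no collisions
        pi[x] = y
    return pi

random.seed(7)                       # reproducible illustration
pi = select_random_permutation(range(4))     # domain {0,1}^2
print(sorted(pi.values()) == [0, 1, 2, 3])   # True: one-to-one and onto D
```

Each step down the table has fewer choices left, which is the only difference from selecting a random function; with all |D| outputs distinct, the result is always a valid permutation.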

Pseudo-Random Permutation (PRP). Similarly to a PRF, a Pseudo-Random Permutation (PRP) over domain D, denoted Ek(·), is an efficient algorithm which cannot be distinguished efficiently from a random permutation ρ ←$ Perm(D), provided that the key k is 'sufficiently long' and chosen randomly.
In the definition, the adversary A has oracle access to one of two functions: either Ek(·), keyed with a random n-bit key k, or a random permutation over domain D, i.e., ρ ←$ Perm(D). We denote these two cases by A^{Ek(·)} and A^{ρ(·)}, respectively. The adversary should try to distinguish between these two cases, e.g., by outputting the string 'Rand' if given access to the random permutation ρ(·), and outputting, say, 'not random', if given access to the PRP Ek(·). The idea of the definition is illustrated in Fig. 2.17.
Note that the definition allows arbitrary length of the key (n), since indistinguishability is only defined asymptotically - for sufficiently long keys.

Figure 2.17: The Pseudo-Random Permutation (PRP) Indistinguishability Test. We say that a function Ek(x) : {0,1}^* × D → D is a (secure) pseudo-random permutation (PRP) if no distinguisher D can efficiently distinguish between Ek(·) and a random permutation ρ ←$ Perm(D) over domain D, when the key k is a randomly-chosen, sufficiently-long binary string.

Definition 2.9. A pseudo-random permutation (PRP) is a polynomial-time computable function Ek(x) : {0,1}^* × D → D s.t. for all PPT algorithms A, ε^{PRP}_{A,E}(n) ∈ NEGL(n), i.e., is negligible, where the advantage ε^{PRP}_{A,E}(n) of the PRP E against adversary A is defined as:

    ε^{PRP}_{A,E}(n) ≡ | Pr_{k ←$ {0,1}^n}[A^{Ek}(1^n)] − Pr_{ρ ←$ Perm(D)}[A^ρ(1^n)] |    (2.28)

The probabilities are taken over the random coin tosses of A, and the random choices of the key k ←$ {0,1}^n and of the permutation ρ ←$ Perm(D).

Block cipher as an invertible PRP (or: PRP and its inverse). The reason we used E to denote the candidate-PRP function is that one of the most important applications of a PRP is as a block cipher - one of the most important cryptographic mechanisms, with multiple standards and many implementations, and often used as a 'building block' to construct other mechanisms. Block ciphers actually use a pair of functions (E, D), where E is a PRP and D is the inverse function, i.e.: (∀x ∈ D, k ∈ {0,1}^*) x = Dk(Ek(x)). Intuitively, such an 'invertible PRP' is a pair of two keyed functions (E, D) which, for a random key, cannot be distinguished from a random permutation and its inverse (over the same domain). We will not present the definition, which is a simple extension of Definition 2.9.
The symbols (E, D) are often used for the functions of the block cipher,
since one basic application of a block cipher is for encryption; intuitively, E is

‘encryption’ and D is ‘decryption’. However, as we will soon see, a block cipher
does not meet the (strong) definition of secure encryption, which we present
next.
One natural - and important - question is the relation between a PRP over
domain D, and a PRF over the same domain and range (both D). Somewhat
surprisingly, it turns out that a PRP over D is indistinguishable from a PRF
over D. This important result is called the PRP/PRF Switching Lemma, and
has multiple proofs; we recommend the proof in [156]. Note that the lemma
provides a relation between the advantage functions; this is an example of
concrete security.
Lemma 2.4 (The PRP/PRF Switching Lemma). Let E be a polynomial-time computable function Ek(x) : {0,1}^* × D → D, and let A be a PPT adversary which is limited to at most q oracle queries. Then:

    | ε^{PRF}_{A,E}(n) − ε^{PRP}_{A,E}(n) | < q^2 / (2 · |D|)    (2.29)

where the advantage functions are as defined in Equation 2.28 and Equation 2.25.
In particular, if the size of the domain D is exponential in the security parameter n (the length of the key and of the input to A), e.g., D = {0,1}^n, then ε^{PRF}_{A,E}(n) − ε^{PRP}_{A,E}(n) ∈ NEGL(n). In this case, E is a PRP over D if and only if it is a PRF over D.
Proof idea: In a polynomial number of queries to a random function over a large domain, there is only negligible probability of two inputs being mapped to the same value. Hence, it is infeasible to efficiently distinguish between a random function and a random permutation. The proof follows since a PRF (PRP) is indistinguishable from a random function (resp., permutation).
The PRP/PRF switching lemma is somewhat counter-intuitive since, for large D, there are many more functions than permutations. Focusing on D = {0,1}^n for convenience, there are (2^n)^{2^n} = 2^{n·2^n} functions over D, and 'only' (2^n)! permutations.
Note that the loss of (concrete) security bounded by the switching lemma is a
disadvantage of using a block cipher directly as a PRF: it would be an
(asymptotically) secure PRF, but the advantage against the PRF definition may
be larger than the advantage against the PRP definition. Therefore, we would
prefer to use one of several constructions of a PRF from a block cipher/PRP
that are efficient and simple, yet avoid this loss in security; see [18, 87].
See Table 2.4 for a summary and comparison of random function, random
permutation, PRG, PRF and Pseudo-random Permutation (PRP).

2.6 Defining secure encryption

In the previous section, we defined PRG, PRF and PRP; in this section, we
finally take the next step and define secure encryption.
Function                       | Key          | Input          | Output property
Random function f : D → R      | None         | x ∈ D          | f(x) ←$ R
Random permutation π : D → D   | None         | x ∈ D          | π(x) ←$ D, not assigned as π(y) for y ≠ x
PRG f : {0,1}* → {0,1}*        | None         | x ←$ {0,1}^n   | f(x) indistinguishable from r ←$ {0,1}^{|f(x)|}
PRF f : {0,1}* × D → R         | k ←$ {0,1}^n | x ∈ D          | f_k(·) indistinguishable from r(·) ←$ {D → R}
PRP f : {0,1}* × D → D         | k ←$ {0,1}^n | x ∈ D          | f_k(·) indistinguishable from a random permutation over D

Table 2.4: Comparison between random function, random permutation, PRG, PRF,
and PRP

The definition of secure encryption is quite subtle. In fact, people have been
designing - and attacking - cryptosystems for millennia, without a precise
definition of the security goals! This only changed with the seminal paper of
Goldwasser and Micali [83], which presented the first precise definition of
secure encryption, along with a design which was proven secure (under
reasonable assumptions); this paper is one of the cornerstones of modern
cryptography.
It may be surprising that defining secure encryption is so challenging; we
therefore urge you to attempt the following exercise, where you are essentially
challenged to define secure encryption on your own, before reading the rest of
this section and comparing with the definition we present.

Exercise 2.15 (Defining secure encryption). Define secure symmetric
encryption, as illustrated in Figure 2.1. Refer separately to the two aspects
of security definitions: (1) the attacker model, i.e., the capabilities of the
attacker, and (2) the success criteria, i.e., what constitutes a successful
attack and what constitutes a secure encryption scheme.

2.6.1 Attacker model


The first aspect of a security definition is a precise attacker model, defining
the maximal expected capabilities of the attacker. We discussed already some
of these capabilities. In particular, we already discussed the computational
limitations of the attacker: in § 2.4 we discussed the unconditional security
model, where attackers have unbounded computational resources, and from
subsection 2.5.2 we focus on Probabilistic Polynomial Time (PPT) adversaries,
whose computation time is bounded by some polynomial in their input size.
Attacker capabilities also include their possible interactions with the
attacked scheme and the environment; in the case of encryption schemes, we
refer to these as types of cryptanalysis attacks. We mentioned above some of
these, mainly ciphertext-only (CTO), known-plaintext attack (KPA), chosen-
plaintext attack (CPA) and chosen-ciphertext attack (CCA). Specifically, in a
chosen-plaintext attack, the adversary can choose plaintext and receive the
corresponding ciphertext (the encryption of that plaintext), and in a chosen-
ciphertext attack, the adversary can choose ciphertext and receive the
corresponding plaintext (its decryption), or an error message if the ciphertext
does not correspond to well-encrypted plaintext. We summarize these four basic
types of cryptanalysis in Table 2.5. You will find more refined variants of
these basic attacks, as well as additional types of cryptanalysis attacks, in
cryptography courses and books, e.g., [80, 159].

Attack type                    | Cryptanalyst knowledge
Ciphertext Only (CTO)          | Plaintext distribution (possibly noisy/partial)
Known Plaintext Attack (KPA)   | Set of (ciphertext, plaintext) pairs
Chosen Plaintext Attack (CPA)  | Ciphertext for arbitrary plaintext chosen by attacker
Chosen Ciphertext Attack (CCA) | Plaintext for arbitrary ciphertext chosen by attacker

Table 2.5: Types of Cryptanalysis Attacks. In all attack types, the cryptanalyst
knows the cipher design and a body of ciphertext.
It is desirable to allow for attackers with maximal capabilities. Therefore,
when we evaluate cryptosystems, we are interested in their resistance to all
types of attacks, and especially the stronger ones - CCA and CPA. On the other
hand, when we design systems using a cipher, we try to limit the attacker’s
capabilities.
For example, one approach to foil CCA attacks is to apply some simple padding
function pad to add redundancy to the plaintext before encryption; the padding
function may be as simple as appending a fixed string. For example, given
message m, key k, encryption scheme (E, D) and a simple padding function,
e.g., pad(m) = m ++ 0^l, i.e., concatenating l zeros, we now encrypt by
computing c = E_k(pad(m)) = E_k(m ++ 0^l). This allows the decryption process
to identify invalid ciphertexts. Namely, given c = E_k(pad(m)) = E_k(m ++ 0^l),
we have D_k(c) = m ++ 0^l, and we output m as usual; but if the output of D_k
does not contain l trailing zeros, then we identify a faulty ciphertext. This
approach often helps to make it hard or infeasible for the attacker to apply a
chosen-ciphertext attack; in particular, a random ciphertext would almost
always be detected as faulty.
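As an illustration (our own sketch, not code from the text; the redundancy length L is an arbitrary choice), the redundancy check performed at decryption can look like this in Python:

```python
L = 4  # number of redundancy (zero) bytes appended; illustrative choice of l

def pad(m: bytes) -> bytes:
    """pad(m) = m ++ 0^l: append L zero bytes to the message before encryption."""
    return m + b"\x00" * L

def unpad(p: bytes):
    """Return the message if the trailing zeros check out,
    else None (the ciphertext is identified as faulty)."""
    if len(p) >= L and p[-L:] == b"\x00" * L:
        return p[:-L]
    return None  # reject: likely a forged or random ciphertext
```

A decryptor would compute p = D_k(c) and refuse to return anything whenever unpad(p) is None, which foils attackers who submit random or mauled ciphertexts.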
Note, however, that adding redundancy to the plaintext may make it easier
to perform ciphertext-only attacks; see Principle 9.
Also, some encryption and padding functions may still allow CCA attacks,
as we show in the following exercise.
Exercise 2.16. Show that the simple padding function pad(m) = m ++ 0^l fails
to prevent CCA attacks against the PRG-stream-cipher (Fig. 2.9), and against
at least one or two other ciphers we discussed so far.

2.6.2 The Indistinguishability-Test for Shared-Key Cryptosystems

Intuitively, the security goal of encryption is confidentiality: to transform
plaintext into ciphertext in such a way that allows specific parties
(‘recipients’) - and only them - to perform decryption, transforming the
ciphertext back to the original plaintext. However, the goal as stated may be
interpreted to only forbid recovery of the exact, complete plaintext; but what
about recovery of partial plaintext?
For example, suppose an eavesdropper can decipher half of the characters
from the plaintext - is this secure? We believe most readers would not agree.
What if she can decipher less, say one character? In some applications, this may
be acceptable; in others, even exposure of one character may have significant
consequences.
Intuitively, we require that an adversary cannot learn anything given the
ciphertext. This may be viewed as extreme; for example, in many applications
the plaintext includes known fields, and their exposure may not be a concern.
However, it is best to minimize assumptions and use definitions and schemes
which are secure for a wide range of applications.
Indeed, in general, when we design a security system, cryptographic or
otherwise, it is important to clearly define both aspects of security: the
attacker model (e.g., types of attacks ‘allowed’ and any computational
limitations), as well as the success criteria (e.g., ability to get merchandise
without paying for it). Furthermore, it is difficult to predict the actual
environment in which a system would be used; this motivates the conservative
design principle, as follows.

Principle 7 (Conservative design). Cyber-security mechanisms, and in
particular cryptographic schemes, should be specified and designed with minimal
assumptions, for a maximal range of applications and for maximal, well-defined
attacker capabilities. On the other hand, systems should be designed to
minimize the attackers’ abilities, in particular, limiting the attacker’s
ability to attack the cryptographic schemes in use.
Okay, so hopefully you now agree that it would be best to require that an
adversary cannot learn anything from the ciphertext. But how do we even define
this? This is not so easy. The seminal paper by Goldwasser and Micali [83]
presented two definitions and showed them to be equivalent: semantically secure
encryption and indistinguishability. We will only present the latter, since we
find it easier to understand and use, and it resembles the PRF, PRG and Turing
indistinguishability tests (Figure 2.16, Figure 2.11 and Figure 2.10,
respectively).
Intuitively, an encryption scheme ensures indistinguishability if an attacker
cannot distinguish between encryption of any two given messages. But, again,
turning this into a ‘correct’ and precise definition requires care.
The concept of indistinguishability is reminiscent of disguises; it may help
to consider the properties we can hope to find in an ‘ideal disguise service’:

• Any two disguised persons are indistinguishable: we cannot distinguish
  between any two well-disguised persons. Yes, even Rachel from Leah!⁸

• Except, the two persons should have the ‘same size’: assuming that a
  disguise is of ‘reasonable size’ (overhead), a giant can’t be disguised to
  be indistinguishable from a dwarf!

• Re-disguises should be different: if we see Rachel in disguise, and then
  she disappears and we see a new disguise, we should not be able to tell
  if it is Rachel again, in a new disguise - or any other disguised person!
  This means that disguises must be randomized or stateful, i.e., every two
  disguises of the same person (Rachel) will be different.

⁸ See: Genesis 29:23, King James Bible.

We will present corresponding properties for indistinguishable encryption:

• Encryptions of any two messages are indistinguishable: to allow arbitrary
  applications, we allow the attacker to choose the two messages. However,
  there is one restriction: the two messages should be of the same length.

• Re-encryptions should be different: the attacker should not be able to
  distinguish encryptions based on previous encryptions of the same messages.
  This means that encryption must be randomized or stateful, so that two
  encryptions of the same message will be different. (A weaker notion of
  ‘deterministic encryption’ allows detection of re-encryption of a message,
  and is sometimes used for scenarios where state and randomization are to
  be avoided.)

We are finally ready to formally present the indistinguishability-based
definition of secure encryption. We only present the definition for
chosen-plaintext attack (CPA) indistinguishability (IND-CPA), and only for
stateless encryption (Definition 2.10).

The IND-CPA test receives two inputs: the ‘challenge bit’ b (that A tries to
find), and the security parameter, which in this case is also the key length,
n. The adversary is given oracle access to E_k(·), which we denote by
A^{E_k(·)}; namely, it may give any message m and receive its encryption
E_k(m) - and possibly repeat this for more messages. At some point, A gives a
pair of messages m_0, m_1, and receives c* = E_k(m_b). As we discussed above,
the two messages must be of equal length, |m_0| = |m_1|. Finally, A outputs
b*, which is the output of the test. Intuitively, A ‘wins’ if b* = b.
We present the IND-CPA test informally in Figure 2.18, and using pseu-
docode in Figure 2.19.

Oracle notation A^{E_k(·)}. In the IND-CPA test, we use the oracle notation
A^{E_k(·)}, defined in Def. 2.7. Namely, A^{E_k(·)} denotes calling the A
algorithm with ‘oracle access’ to the (keyed) PPT algorithm E_k(·), i.e., A
can provide an arbitrary plaintext string m and receive E_k(m).

Figure 2.18: The IND-CPA test for encryption, T^{IND-CPA}_{A,⟨E,D⟩}(b, n).
Throughout the test, the adversary A may ask for encryption of one or many
messages m. At some point, A sends two same-length messages (|m_0| = |m_1|),
and receives the encryption of m_b, i.e., E_k(m_b). Finally, A outputs its
guess b*, and ‘wins’ if b = b*. The encryption is IND-CPA if
Pr(b* = 1 | b = 1) − Pr(b* = 1 | b = 0) is negligible.

T^{IND-CPA}_{A,⟨E,D⟩}(b, n) {
    k ←$ {0,1}^n
    (m_0, m_1) ← A^{E_k(·)}(‘Choose’, 1^n) s.t. |m_0| = |m_1|
    c* ← E_k(m_b)
    b* ← A^{E_k(·)}(‘Guess’, c*)
    Return b*
}

Figure 2.19: The IND-CPA test for encryption (E, D). The two calls to the
adversary are often referred to as the ‘Choose’ phase and the ‘Guess’ phase.
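The IND-CPA test can be rendered directly in Python; the following harness is our own illustrative sketch (the function name, the adversary interface with choose/guess methods, and the byte-oriented key are assumptions, not from the text):

```python
import secrets

def ind_cpa_test(E, n, adversary, b):
    """Run the IND-CPA test T(b, n): pick a random n-byte key, let the
    adversary choose (m0, m1) with oracle access to E_k, encrypt m_b,
    and return the adversary's guess b*."""
    k = secrets.token_bytes(n)
    oracle = lambda m: E(k, m)              # the encryption oracle E_k(.)
    m0, m1 = adversary.choose(oracle)       # 'Choose' phase
    assert len(m0) == len(m1)               # challenge messages must be same length
    c_star = oracle(m1 if b else m0)        # c* = E_k(m_b)
    return adversary.guess(oracle, c_star)  # 'Guess' phase: output b*
```

An encryption scheme is IND-CPA if, for every efficient adversary, Pr[test outputs 1 | b = 1] − Pr[test outputs 1 | b = 0] is negligible in n.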

Adversary A chooses challenge messages. The IND-CPA test allows A to choose
the two challenge messages m_0, m_1, and then receive c* = E_k(m_b), where
b ∈ {0,1}. Allowing A to select the two messages may make it easier for A; in
many applications, the adversary only has very limited knowledge about the
possible plaintext messages. This follows the conservative design principle -
the encryption should be appropriate for any application, including one in
which there are only two possible plaintext messages, known to the attacker -
who ‘just’ needs to know which of them was encrypted. One classical example is
when the messages are ‘attack’ or ‘retreat’; another would be ‘sell’ or ‘buy’.

Encryption must be randomized or stateful. IND-CPA encryption must either be
randomized or stateful. The reason is simple: the adversary is allowed to make
queries for arbitrary messages - including the ‘challenges’ m_0, m_1. If the
encryption scheme is deterministic - and stateless - then all encryptions of a
message, e.g., m_0, will return a fixed ciphertext; this allows the attacker to
trivially ‘win’ the IND-CPA experiment. Furthermore, Exercise 2.42 shows that
limiting the number of random bits per encryption may lead to vulnerability.
Using the IND-CPA test, we now define IND-CPA encryption, similarly to
how we defined PRG and PRF, in Definition 2.6 and Definition 2.8, respectively.
We present only the definition for stateless cryptosystems, since we mostly focus
on this case.
Definition 2.10 (IND-CPA for (stateless) shared-key cryptosystems). Let
⟨E, D⟩ be a stateless shared-key cryptosystem. We say that ⟨E, D⟩ is IND-CPA,
if every efficient adversary A ∈ PPT has negligible advantage
ε^{IND-CPA}_{⟨E,D⟩,A}(n) ∈ NEGL(n), where:

    ε^{IND-CPA}_{⟨E,D⟩,A}(n) ≡ Pr[T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 1] − Pr[T^{IND-CPA}_{A,⟨E,D⟩}(0, n) = 1]        (2.30)

where the probability is over the random coin tosses in the IND-CPA test
(including those of A and E).

Exercise 2.17. Consider the following alternative advantage functions:

    ε̃^{IND-CPA}_{⟨E,D⟩,A}(n) ≡ Pr[T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 1] − Pr[T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 0]

    ε̂^{IND-CPA}_{⟨E,D⟩,A}(n) ≡ Pr[T^{IND-CPA}_{A,⟨E,D⟩}(1, n) = 1] − Pr[T^{IND-CPA}_{A,⟨E,D⟩}(0, n) = 0]

Show that both are not reasonable definitions for an advantage function, by
presenting (simple) adversaries which achieve significant advantage for any
cryptosystem.

Indistinguishability for the CTO, KPA and CCA attack models. Definition 2.10
focuses on the Chosen-Plaintext Attack (CPA) model.
Modifying this definition for the case of chosen-ciphertext (CCA) attacks
requires a further (quite minor) change and extension, to prevent the attacker
from ‘abusing’ the decryption oracle to decrypt the challenge ciphertext.
Modifying the definition for Ciphertext-Only (CTO) attacks and Known-Plaintext
Attacks (KPA) is more challenging. For KPA, the obvious question is which
plaintext-ciphertext messages are known; this may be solved by using random
plaintext messages; however, in reality, the known plaintext is often quite
specific.
It is similarly challenging to modify the definition so it covers CTO attacks,
where the attacker must know some information about the plaintext
distribution. This information may be related to the specific application,
e.g., when the plaintext is English. In other cases, information about the
plaintext distribution may be derived from the system design, e.g., text is
often encoded using ASCII, where one of the bits in every character is the
parity of the other bits. An even more extreme example is in GSM, where the
plaintext is the result of the application of an Error-Correcting Code (ECC),
providing significant redundancy which even allows a CTO attack on GSM’s A5/1
and A5/2 ciphers [7]. In such a case, the amount of redundancy in the
plaintext can be compared to that provided by a KPA attack. We consider it a
CTO attack, as long as the attack does not require knowledge of all or much of
the plaintext corresponding to the given ciphertext messages.
Some systems, including GSM, allow the attacker to guess all or much of the
plaintext for some of the ciphertext messages, e.g., when sending a
predictable message at a specific time. Such systems violate the Conservative
Design Principle (Principle 7), since a KPA-vulnerability of the cipher
renders the system vulnerable. A better system design would limit the
adversary’s knowledge about the distribution of plaintexts, requiring a CTO
vulnerability to attack the system.

2.6.3 The Indistinguishability-Test for Public-Key Cryptosystems (PKCs)

We next define CPA-indistinguishability for public-key cryptosystems (PKCs;
see Figure 2.2). The definition is a minor variation of the
indistinguishability-test for shared-key cryptosystems (Definition 2.10), and
even a bit simpler. In fact, let us first present the definition, as well as
the IND-CPA test for PKCs (Figure 2.20), and only then point out and explain
the differences; this allows the reader to play ‘find the differences’,
comparing to Definition 2.10.
T^{IND-CPA}_{A,⟨KG,E,D⟩}(b, n) {
    (e, d) ←$ KG(1^n)
    (m_0, m_1) ← A(‘Choose’, e) s.t. |m_0| = |m_1|
    c* ← E_e(m_b)
    b* ← A(‘Guess’, (c*, e))
    Return b*
}

Figure 2.20: The IND-CPA test for public-key encryption (KG, E, D).

Definition 2.11 (PKC IND-CPA). Let ⟨KG, E, D⟩ be a public-key cryptosystem.
We say that ⟨KG, E, D⟩ is IND-CPA, if every efficient adversary A ∈ PPT has
negligible advantage ε^{IND-CPA}_{⟨KG,E,D⟩,A}(n) ∈ NEGL(n), where:

    ε^{IND-CPA}_{⟨KG,E,D⟩,A}(n) ≡ Pr[T^{IND-CPA}_{A,⟨KG,E,D⟩}(1, n) = 1] − Pr[T^{IND-CPA}_{A,⟨KG,E,D⟩}(0, n) = 1]        (2.31)

where the probability is over the random coin tosses in the IND-CPA test
(including those of A and E).

In the PKC definition of IND-CPA (Definition 2.11), the adversary is given the
public key e. Hence, A can encrypt at will, without the need to make
encryption queries as enabled by the oracle calls in Definition 2.10, and we
removed the oracle. Another change is purely syntactic: the cryptosystem
includes an explicit key generation algorithm KG, while for the shared-key
cryptosystem, we assumed the (typical) case where the keys are just random
n-bit strings.

We discuss three specific public-key cryptosystems, all in chapter 6: DH and
El-Gamal in § 6.5, and RSA in § 6.6.

2.7 The Cryptographic Building Blocks Principle

We next discuss the design of secure symmetric encryption schemes. Ideally, we
would use a scheme that is efficient and proven secure, e.g., proven to be
IND-CPA (Definition 2.10), without assumptions on the computational hardness
of some underlying functions. However, IND-CPA implies that there is no
efficient (PPT) algorithm that can distinguish between encryptions of two
given messages, i.e., the IND-CPA test is not in the polynomial-complexity
class P, containing problems which have a polynomial-time algorithm. On the
other hand, it is surely easy to ‘win’ the test given the key, which implies
that the IND-CPA test is in the non-deterministic polynomial complexity class
NP, containing problems which have a polynomial-time algorithm - if given a
hint (in our case, the key). Taken together, this would show that the
complexity class P is strictly smaller than the complexity class NP, i.e.,
P ≠ NP. Now, that would be a solution to the most fundamental open question in
the theory of computational complexity!
It is not practical to require the encryption algorithm to have a property
whose existence implies a solution to such a basic, well-studied open question.
Therefore, both theoretical and applied cryptography consider designs whose
security relies on failed attempts in cryptanalysis. The big question is: should
we rely on failed cryptanalysis of the scheme itself, or on failed cryptanalysis
of underlying components of the scheme?
It may seem that the importance of encryption schemes should motivate the
first approach, i.e., relying on failed attempts to cryptanalyze the scheme.
Surely this was the approach in ancient and ‘classical’ cryptology.
However, in modern applied cryptography, it is much more common to use
the second approach, i.e., to construct encryption using ‘simpler’ underlying
primitives, and to base the security of the cryptosystem on the security of these
component modules. We summarize this approach in the following principle,
and then give some justifications.

Principle 8 (Cryptographic Building Blocks). The security of cryptographic
systems should only depend on the security of a few basic building blocks.
These blocks should be simple, with well-defined and easy-to-test security
properties. More complex schemes should be proven secure by reduction to the
security of the underlying blocks.
The advantages of following the cryptographic building blocks principle
include:
Efficient cryptanalysis: by focusing cryptanalysis effort on few schemes, we
obtain much better validation of their security. The fact that the building
blocks are simple and are selected to be easy to test makes cryptanalysis
even more effective.
Replacement and upgrade: by using simple, well-defined modules, we can
replace them for improved efficiency - or to improve security, in particular
after being broken or when doubts arise.
Flexibility and variations: complex systems and schemes naturally involve
many options, tradeoffs and variants; it is better to build all such variants
using the same basic building blocks.
Robust combiners: there are known, efficient robust-combiner designs for
the basic cryptographic building blocks [92]. If desired, we can use these
as the basic blocks for improved security.
The cryptographic building blocks principle is key to both applied and theo-
retical modern cryptography. From the theoretical perspective, it is important
to understand which schemes can be implemented given another scheme. There
are many results exploring such relationships between different cryptographic
schemes and functions, with many positive results (constructions), few nega-
tive results (proofs that efficient constructions are impossible or improbable),
and very few challenging open questions.
In modern applied cryptography, the principle implies the need to define a
small number of basic building blocks, which would be very efficient, simple
functions - and convenient for many applications. The security of these building
blocks would be established by extensive (yet unsuccessful) cryptanalysis efforts
- instead of relying on provably-secure reductions from other cryptographic
mechanisms.
In fact, most cryptographic libraries contain four such widely-used building
blocks: the shared-key block cipher, which we discuss next; the keyless
cryptographic hash function (chapter 4); and public-key encryption and
signature schemes (chapter 6). Cryptographic hash functions and block ciphers
are much more efficient than the public-key schemes (see Table 6.1) and hence
are preferred, and used in most practical systems - whenever public-key
operations may be avoided.

PRF as a building block. Pseudo-Random Functions (PRFs) are also widely used
in applied cryptography; however, they are not defined as a standard, and not
all cryptographic libraries contain an explicit implementation of them.
Instead, they are usually implemented using another mechanism - most often,
using a block cipher. The PRP/PRF switching lemma (Lemma 2.4) shows that a
block cipher is indistinguishable from a PRF, and hence, every block cipher is
also a PRF; however, this may involve some loss in (concrete) security, which
is bounded by the lemma - but may be significant. Instead, one should use one
of several constructions of a PRF from a block cipher, which are simple,
almost as efficient as the block cipher - and avoid the loss in (concrete)
security; see [18, 87].

2.8 Block Ciphers


Modern symmetric encryption schemes are built in modular fashion, using a
basic building block - the block cipher. A block cipher is defined by a pair of
keyed functions, Ek , Dk , such that the domain and the range of both Ek and
Dk are {0, 1}n , i.e., binary strings of fixed length n; for simplicity, we (mostly)
use n for the length of both keys and blocks, as well as the security parameter,
although in some ciphers, these are different numbers.

Figure 2.21: High-Level view of the NIST standard block ciphers: AES (cur-
rent) and DES (obsolete).

Block ciphers may be the most important basic cryptographic building blocks.
Block ciphers are in wide use in many practical systems and constructions, and
two of them were standardized by NIST - the DES (1977-2001) [134], the first
standardized cryptographic scheme, and the AES (2002-????) [49], its successor
(Fig. 2.21). DES was replaced since it was no longer considered secure; the
main reason was simply that improvements in hardware made exhaustive search
feasible, due to the relatively short, 56-bit key. Another reason for reduced
confidence in DES - even in longer-key variants - was the presentation of
differential cryptanalysis and linear cryptanalysis, two strong cryptanalytical
attacks which are quite generic, namely, effective against many cryptographic
designs - including DES. Indeed, AES was designed for resiliency against these
and other known attacks, and so far, no published attack against AES appears
to justify concerns.

We present a simplified explanation and example of differential cryptanalysis
below, and encourage interested readers to follow up in the extensive
literature on cryptanalysis in general and these attacks in particular, e.g.,
[63, 101, 108]; [108] also gives an excellent overview of block ciphers.
Unfortunately, there is no universally-accepted definition of the exact cryp-
tographic requirements for block ciphers. Hence, their required security proper-
ties are not very well defined. We will adopt here one popular approach, which
models a block cipher as a reversible Pseudo-Random Permutation (PRP). A
(secure) block cipher is a reversible PRP, i.e., a pair of PRPs E_k, D_k over
{0,1}^n, s.t. for every message m in the domain, m = D_k(E_k(m)) holds.
Namely, a block cipher is indistinguishable from a random pair of a
permutation and its inverse permutation. The fact that m = D_k(E_k(m)) for
every k, m ∈ {0,1}^n is called the correctness requirement - note that this is
essentially the same as presented in Def. 2.1 for encryption schemes. The fact
that both E and D are PRPs is the security requirement.

Example 2.3. Let E_k(m) = m ⊕ k and E'_k(m) = m + k mod 2^n. Show the
corresponding D, D' functions such that both (E, D) and (E', D') satisfy the
correctness requirement; and show that neither of them satisfies the security
requirement.

Solution: D_k(c) = c ⊕ k, D'_k(c) = c − k mod 2^n. Correctness follows from
the arithmetic properties. Let us now show that (E, D) is insecure;
specifically, let us show that E_k is not a PRP. Recall that we need to
provide a PPT adversary A^{E_k(·)} s.t. ε^{PRP}_{E,A} is not negligible. We
present a simple adversary A that only makes two queries, and whose advantage
is almost 1. The first query of A will be for input 0^n; if we denote the
oracle response by f(·), then A receives f(0^n). If the oracle is for E, A
receives E_k(0^n) = 0^n ⊕ k = k, i.e., the key k. Intuitively, this clearly
‘breaks’ the system; let us show exactly how, but from this point, our
solution holds for the general case where the adversary found k (if the
oracle is for f(·) = E_k(·)).
Our second query can be for any other value (not 0^n); e.g., let’s make the
query 1^n, so we now receive f(1^n), where f is either E_k or a random
permutation. Adversary A checks if f(1^n) (which it received from the oracle)
is the same as E_k(1^n) (which A computes, since it believes it knows k). If
the two are identical, then probably f(·) = E_k(·), i.e., A returns 1 (PRP);
otherwise, f is surely a random permutation (and A returns 0). So, the
advantage of A is almost 1, specifically:

    ε^{PRP}_{A,E} = Pr_k[A^{E_k(·)} = 1] − Pr_{f ←$ Perm({0,1}^n)}[A^{f(·)} = 1]
                  = Pr_k[E_k(1^n) = E_k(1^n)] − Pr_{f ←$ Perm({0,1}^n)}[f(1^n) = E_k(1^n)]
                  = 1 − 1/2^n ≈ 1
Now, notice that the same adversary A also distinguishes E'; we leave it to
the reader to substitute the values as necessary; these minimal changes are
only required until A ‘finds’ k; from that point, the solution is exactly
identical.
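The attack in this solution is easy to run in code. The following Python sketch (our own; function names, the byte-level representation and the block size are illustrative choices) implements A against E_k(m) = m ⊕ k, using the first query to extract the key and the second to verify consistency:

```python
import secrets

N = 16  # block length in bytes (illustrative stand-in for n bits)

def make_xor_oracle():
    """Oracle for the insecure E_k(m) = m XOR k, with a hidden random key k."""
    k = secrets.token_bytes(N)
    return lambda m: bytes(a ^ b for a, b in zip(m, k))

def adversary(f):
    """A's two queries: f(0^n) equals k if f = E_k; then check consistency on 1^n.
    Returns 1 ('this is E_k') or 0 ('this is a random permutation')."""
    k_guess = f(bytes(N))                                    # E_k(0^n) = k
    ones = b"\xff" * N                                       # the all-ones block 1^n
    expected = bytes(a ^ b for a, b in zip(ones, k_guess))   # E_{k_guess}(1^n)
    return 1 if f(ones) == expected else 0
```

Against the XOR oracle, the adversary always outputs 1; against an unrelated random oracle, it outputs 1 only with probability about 2^{-n}, matching the advantage computed above.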
Note that block ciphers, in particular DES and AES, are often referred to
as encryption schemes, although they do not satisfy the requirements of most
definitions of encryption, e.g., the IND-CPA test of Def. 2.10.
Exercise 2.18. Explain why a PRP and a block cipher fail the IND-CPA test
(Def. 2.10).
Solution: Consider E_k(m), which is either a PRP or the ‘encryption’ operation
of a block cipher (i.e., a pair (E, D) of a PRP and its inverse). Then E_k(m)
is a function; whenever we apply it to the same message m, with the same key
k, we will receive the same output E_k(m). The attacker A would choose any two
different messages as (m_0, m_1), confirm that c_0 = E_k(m_0) ≠ c_1 = E_k(m_1),
and then use these as a challenge, to receive c* = E_k(m_b). It then outputs
b* s.t. c_{b*} = c*.
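As an illustrative Python sketch (our own code; the toy deterministic keyed function merely stands in for a block cipher), the attack in this solution wins every time:

```python
import secrets

def toy_deterministic_enc(k, m):
    """Toy deterministic, stateless keyed function (stand-in for a block cipher):
    the same (k, m) always yields the same ciphertext."""
    return bytes(a ^ b for a, b in zip(m, k))

def ind_cpa_attack_wins(k):
    """Query both challenge messages first, then match c* against the stored
    answers; returns True iff the attacker's guess b* equals the secret bit b."""
    oracle = lambda m: toy_deterministic_enc(k, m)
    m0, m1 = b"attack!!", b"retreat!"        # equal-length challenge messages
    c0, c1 = oracle(m0), oracle(m1)          # deterministic => repeatable
    b = secrets.randbelow(2)                 # challenger's secret bit
    c_star = oracle(m1 if b else m0)         # c* = E_k(m_b)
    guess = 0 if c_star == c0 else 1         # b*: match c* against c0/c1
    return guess == b                        # always True for this scheme
```

This is exactly why IND-CPA forces encryption to be randomized or stateful: any deterministic, stateless scheme loses to this two-query-and-match attacker with probability 1.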
On the other hand, we will soon see multiple constructions of secure
encryption schemes based on block ciphers; these constructions are often
referred to as modes of operation. Indeed, block ciphers are widely used as
cryptographic building blocks, as they satisfy most of the requirements of the
Cryptographic Building Blocks principle. They are simple, deterministic
functions with Fixed Input Length (FIL), which is furthermore identical to
their output length. This should be contrasted with ‘full-fledged encryption
schemes’, which are randomized (or stateful) and have Variable Input Length
(VIL). Indeed, block ciphers are even easier to evaluate than PRFs, since
PRFs may have different input and output lengths.
Another desirable property of block ciphers is that they have a simple robust
combiner, i.e., a method to combine two or more candidate block ciphers into
one ‘combined’ function which is a secure block cipher provided one or more of
the candidates is a secure block cipher. This is shown in [92] and the following
exercise.
Exercise 2.19. Design a robust combiner for PRPs and for reversible PRPs
(block ciphers).
Hint: cf. Lemma 2.3, and/or read [92].

2.8.1 Constructing PRP from PRF: the Feistel Construction

In Ex. 2.35 we show that it is not too difficult to construct a reasonable
candidate PRF. However, constructing a candidate PRP ‘directly’ seems harder,
since we need to ensure that every input is mapped to a distinct output.
Constructing a block cipher (i.e., a reversible PRP) ‘directly’ seems even
harder - we also need to design the inverse permutation, without causing
vulnerability. It’s easy to get it wrong! This motivates constructing a PRP
given a PRF - preferably, an ‘easier’ PRF, e.g., with a smaller domain
(shorter inputs). This is the subject of this subsection.
However, let us first recall the important case of designing a PRP from a PRF
whose range is identical to its domain D - which will also be the domain (and
range) of the PRP. We already discussed this case, and presented the PRP/PRF
switching lemma (Lemma 2.4), which shows that every PRF over a domain D is
also a PRP over D, and vice versa. This holds although, for a given key k, a
PRF F_k may very well have some collisions F_k(x) = F_k(y), while a PRP cannot
have collisions. However, the switching lemma shows that no
computationally-bounded (PPT) adversary is likely to distinguish between a PRP
and a PRF (over domain D).
Constructing a PRP from a PRF is not trivial; see the following two exer-
cises.

Exercise 2.20. Let f be a PRF from n-bit strings to n-bit strings. Show that
g_{kL,kR}(mL ++ mR) = f_{kL}(mL) ++ f_{kR}(mR) is not a PRF or a PRP (over
2n-bit strings).

Hint: given a black box containing either g or a random permutation over
2n-bit strings, design a distinguishing adversary A as follows. A makes two
queries, one with input x = 0^{2n} and the other with input x' = 0^n ++ 1^n.
Denote the corresponding outputs by y = y_{0,...,2n-1} and y' = y'_{0,...,2n-1}.
If the box contained g, then y_{0,...,n-1} = y'_{0,...,n-1}, since the left
half of the output depends only on the left half of the input. In contrast, if
the box contained a random function, then the probability that
y_{0,...,n-1} = y'_{0,...,n-1} is very small - only 2^{-n}. The probability is
about as small if the box contained a random permutation.
The next exercise presents a slightly more elaborate scheme, which is
essentially a reduced version of the Feistel construction (presented next).

Exercise 2.21. Let f be a PRF from n-bit strings to n-bit strings. Show that
g_{kL,kR}(mL ++ mR) = (mL ⊕ f_{kR}(mR)) ++ (mR ⊕ f_{kL}(mL)) is not a PRP
(over 2n-bit strings).

We next present the Feistel construction, the best-known and simplest
construction of a PRP - in fact, of a reversible PRP (block cipher) - from a
PRF. As shown in Fig. 2.22, the Feistel cipher transforms an n-bit PRF into a
2n-bit reversible PRP.

Formally, given a function y = F_k(x) with n-bit keys, inputs and outputs,
the three-round Feistel cipher g_k(m) is defined as:

    L_k(m) = m_{0,...,n-1} ⊕ F_k(m_{n,...,2n-1})
    R_k(m) = F_k(L_k(m)) ⊕ m_{n,...,2n-1}
    g_k(m) = (L_k(m) ⊕ F_k(R_k(m))) ++ R_k(m)

Note that we consider only a three-round Feistel cipher, and use the same
underlying function F_k in all three rounds, but neither aspect is mandatory.
In fact, the Feistel cipher is used in the design of DES and several other
block ciphers, typically using more rounds (e.g., 16 in DES), and often using
different functions in different rounds.

Figure 2.22: Three ‘rounds’ of the Feistel Cipher, constructing a block cipher
(reversible PRP) from a PRF Fk (·). The Feistel cipher is used in DES (but not
in AES). Note: most publications present the Feistel cipher a bit differently,
by ‘switching sides’ in each round.
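To make the three-round construction concrete, here is a minimal Python
sketch. A word of hedging: the keyed function F below is just a truncated
keyed hash, used as a stand-in for a PRF (for illustration only; it is not the
round function of any real cipher). The point is that g_k is invertible, and
that inversion uses only F itself - never an inverse of F:

```python
import hashlib

N = 16  # half-block size in bytes; the permutation acts on 2N-byte inputs

def F(k: bytes, x: bytes) -> bytes:
    # Stand-in keyed function (illustration only): truncated SHA-256 of key||input.
    return hashlib.sha256(k + x).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def feistel_encrypt(k: bytes, m: bytes) -> bytes:
    # Three Feistel rounds, using the same round function F_k in each round.
    L0, R0 = m[:N], m[N:]
    L = xor(L0, F(k, R0))        # L_k(m) = m_left  xor F_k(m_right)
    R = xor(F(k, L), R0)         # R_k(m) = F_k(L_k(m)) xor m_right
    return xor(L, F(k, R)) + R   # g_k(m) = (L_k(m) xor F_k(R_k(m))) ++ R_k(m)

def feistel_decrypt(k: bytes, c: bytes) -> bytes:
    # Run the rounds backwards; each round is undone by XORing the same F output again.
    CL, R = c[:N], c[N:]
    L = xor(CL, F(k, R))
    R0 = xor(F(k, L), R)
    L0 = xor(L, F(k, R0))
    return L0 + R0

k = b"some 16-byte key"
m = bytes(range(32))
assert feistel_decrypt(k, feistel_encrypt(k, m)) == m
```

Each round is undone by XORing the same F output again; this is exactly why
the Feistel structure yields a permutation even when F itself is not
invertible.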

Luby and Rackoff [123] proved that a Feistel cipher of three or more ‘rounds’,
using a PRF as Fk (·), is a reversible PRP, i.e., a block cipher.
One may ask: why use the Feistel design rather than directly designing a
reversible PRP? Indeed, this is done in AES, which does not follow the Feistel
cipher design. An advantage of the Feistel design is that it allows the
designer to focus on the pseudo-randomness requirements when designing the
PRF, without simultaneously having to ensure that the design is also an
invertible permutation. Try to design a PRP directly, let alone a reversible
PRP, and compare the effort to using the Feistel cipher!

2.9 Secure Encryption Modes of Operation


Finally we get to design symmetric encryption schemes; following the Crypto-
graphic Building Blocks principle, the designs are based on the much simpler
block ciphers.

Mode                Encryption                                 Properties
------------------  -----------------------------------------  ----------------------
Electronic Code     c_i = E_k(m_i)                             Insecure
Book (ECB)
Per-Block Random    r_i ←$ {0,1}^n,                            Nonstandard,
(PBR)               c_i = (r_i, m_i ⊕ E_k(r_i))                long ciphertext
Output Feedback     pad_0 ←$ {0,1}^n, pad_i = E_k(pad_{i-1}),  Fast online, PRF,
(OFB)               c_0 ← pad_0, c_i ← pad_i ⊕ m_i             1-localization
Cipher Feedback     c_0 ←$ {0,1}^n,                            Parallel decrypt, PRF,
(CFB)               c_i ← m_i ⊕ E_k(c_{i-1})                   (n+1)-localization
Cipher-Block        c_0 ←$ {0,1}^n,                            Parallel decrypt,
Chaining (CBC)      c_i ← E_k(m_i ⊕ c_{i-1})                   (n+1)-localization
Counter (CTR)       T_1 ← nonce ++ 0^{n/2}, T_i ← T_{i-1}+1,   Parallel, fast online,
                    c_i = m_i ⊕ E_k(T_i)                       PRF, 1-localization,
                                                               stateful (nonce)

Table 2.6: Standard encryption modes of operation using an n-bit block cipher
(standardized by NIST [66, 134]). The plaintext is given as a concatenation of
n-bit blocks m_1 ++ m_2 ++ ..., where each m_i ∈ {0,1}^n; similarly, the
ciphertext is produced as a sequence of blocks c_0 ++ c_1 ++ ..., where
c_i ∈ {0,1}^n (for PBR, c_i ∈ {0,1}^{2n}). PRF: improved security if E is a
PRF. Nonce: must differ for every message, e.g., a message counter.

The term modes of operation refers to constructions of more complex
cryptographic mechanisms from block ciphers, for different purposes. See the
DES specifications [134], which described 'standard modes of operation for
DES', later updated in [66] for AES, with the addition of the CTR mode. The
modes are summarized in Table 2.6; we have added one 'mode', the Per-Block
Random (PBR) mode, only as an aid to understanding.
Exercise 2.22. Table 2.6 specifies only the encryption process. Write the
decryption process for each mode and show that correctness is satisfied, i.e.,
m = Dk (Ek (m)) for every k, m ∈ {0, 1}∗ .
The 'modes of operation' in Table 2.6 are designed to turn block ciphers
into more complete cryptosystems, handling goals such as:

Variable length: allow encryption of arbitrary, variable input length (VIL)
messages.
Randomization/state: Most modes use randomness to ensure independence
between two encryptions of the same (or of related) messages, as required
for indistinguishability-based security definitions. The exceptions are the
CTR mode, which uses state instead of randomization, and the ECB
mode, which uses neither - and, therefore, is not IND-CPA secure.

PRF: most modes (PBR, OFB, CFB and CTR) use only the encryption function E -
even for decryption. This has an important implication: they may be
implemented using a PRF instead of a block cipher. This may imply better
security, especially when the same key is used for an extensive number of
messages, due to improved concrete security (smaller advantage to the
attacker). However, notice that there is no such advantage if we simply use a
block cipher as a PRP, relying on the PRP/PRF switching lemma (Lemma 2.4); we
should use one of the simple and efficient constructions of a PRF from a block
cipher, which avoid an increase in the adversary's advantage; see [18, 87],
and also [15, 147].
Efficiency is important - and multi-faceted. All of the modes we present use
one block-cipher operation per message-block, and all except OFB allow
parallel decryption. The CTR and PBR modes also allow parallel encryption and
'random access' decryption - decryption of only specific blocks of the
plaintext. Another efficiency consideration is offline precomputation: in the
CTR mode, we may perform all the block-cipher computations offline; once the
plaintext or ciphertext is available, we only need a single XOR operation per
block. The OFB mode has a similar property for encryption; for decryption,
the pad can be precomputed once the first ciphertext block (the IV) is
received.
Integrity/authentication: Some modes, which unfortunately we do not discuss,
ensure both confidentiality and integrity, preventing an attacker from
modifying intercepted messages to mislead the recipient, or from forging
messages as if they were sent by a trusted sender. These include the Counter
with CBC-MAC (CCM) mode and the (more efficient) Galois/Counter Mode (GCM).
Other modes ensure only authenticity; we discuss one such mode, the CBC-MAC
mode, in subsection 3.5.2.
Error localization and weak integrity: In the OFB and CTR modes, corruption
of any number of ciphertext bits results in corruption of only the
corresponding plaintext bits. This may help to recover from some bit
corruptions during communication, since no additional bits are lost, but it
also implies that the attacker may 'flip' plaintext bits by 'flipping' the
corresponding ciphertext bits. In CFB and CBC, flipping a bit in one
ciphertext block flips the corresponding bit in one plaintext block and
'corrupts' an adjacent block - with some exceptions. This is sometimes
considered a weak form of integrity protection, but the defense is very
fragile, and relying on it has resulted in several vulnerabilities. See
details below.

2.9.1 The Electronic Code Book (ECB) mode


ECB is a naïve mode, which isn't really a proper 'mode': it simply applies the
block cipher separately to each block of the plaintext. Namely, to encrypt the
plaintext string m = m_1 ++ m_2 ++ ..., where each m_i is a block (i.e.,
|m_i| = n), we simply compute c_i = E_k(m_i). Decryption is equally trivial:
m_i = D_k(c_i),

and correctness of encryption, i.e., m = Dk (Ek (m)) for every k, m ∈ {0, 1}∗ ,
follows immediately from the correctness of the block cipher Ek (·).

[Figure: each plaintext block m_1, ..., m_l is separately encrypted as
c_i = Enc_k(m_i)]

Figure 2.23: Electronic Code Book (ECB) mode encryption. Adapted from [100].

[Figure: each ciphertext block c_1, ..., c_l is separately decrypted as
m_i = Dec_k(c_i)]

Figure 2.24: Electronic Code Book (ECB) mode decryption. Adapted from [100].

The reader may have already noticed that ECB is simply a mono-alphabetic
substitution cipher, as discussed in subsection 2.1.3. The ‘alphabet’ here is
indeed large: each 'letter' is a whole n-bit block. For typical block ciphers,
the block size is significant, e.g., n = 64 bits for DES and n = 128 bits for
AES; this definitely improves security, and may make it challenging to decrypt
ECB-mode messages in many scenarios.
However, obviously, this means that ECB may expose some information
about plaintext, in particular, all encryptions of the same plaintext block will
result in the same ciphertext block. Even with relatively long blocks of 64 or
128 bits, such repeating blocks are quite likely in practical applications and sce-
narios, since inputs are not random strings. Essentially, this is a generalization
of the letter-frequency attack of subsection 2.1.3 (see Fig. 2.7).
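This leakage is easy to demonstrate in code. The sketch below uses a truncated
keyed hash as a stand-in for the block cipher (an assumption for illustration
only; any deterministic E_k behaves the same way): identical plaintext blocks
always map to identical ciphertext blocks.

```python
import hashlib

BLOCK = 16  # block size in bytes

def E(k: bytes, block: bytes) -> bytes:
    # Stand-in deterministic 'block cipher' (illustration only):
    # truncated SHA-256 of key||block.
    return hashlib.sha256(k + block).digest()[:BLOCK]

def ecb_encrypt(k: bytes, m: bytes) -> bytes:
    assert len(m) % BLOCK == 0
    return b"".join(E(k, m[i:i+BLOCK]) for i in range(0, len(m), BLOCK))

k = b"key"
m = b"ATTACK AT DAWN!!" * 3 + b"RETREAT AT DUSK!"   # four 16-byte blocks
c = ecb_encrypt(k, m)
blocks = [c[i:i+BLOCK] for i in range(0, len(c), BLOCK)]
# The three identical plaintext blocks yield three identical ciphertext blocks:
assert blocks[0] == blocks[1] == blocks[2] != blocks[3]
```

An eavesdropper who sees the ciphertext learns, without any key, which
plaintext blocks repeat - exactly the weakness exploited in the 'penguin'
demonstration below.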
This weakness of ECB is often illustrated graphically by the example illus-
trated in Fig. 2.25, using the ‘Linux Penguin’ image [72, 170].

Figure 2.25: The classical visual demonstration of the weakness of the ECB
mode. The middle and the right 'boxes' are encryptions of the bitmap image
shown on the left; can you identify which of the two is 'encrypted' using ECB,
and which is encrypted with one of the secure encryption modes?

2.9.2 The Per-Block Random Mode (PBR)


We next present the Per-Block Random Mode (PBR). PBR is a non-standard
and inefficient mode; we find it worth discussing, since it is a simple way to
construct a secure encryption scheme from a PRF, PRP or block cipher.
Let m = m_1 ++ m_2 ++ ... ++ m_M be a plaintext message, where each
m_i ∈ {0,1}^n is one block. The PBR-mode encryption of m is denoted
c = PBR^E_k(m), and computed as follows:

    c = PBR^E_k(m) = c_1 ++ ... ++ c_M,
    where r_i ←$ {0,1}^n and c_i ← (r_i, m_i ⊕ E_k(r_i))          (2.32)

Namely, each message block m_i is encrypted with its own fresh random block
r_i. Note that we can encode each c_i simply as c_i = r_i ++ (m_i ⊕ E_k(r_i)),
i.e., as a string of 2n bits; the pairwise notation is equivalent, and a bit
easier to work with.
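The following Python sketch implements the PBR mode of Eq. (2.32), again with
a truncated keyed hash as a stand-in PRF (illustration only). Note that
decryption uses only E, never an inverse, and that the ciphertext is twice as
long as the plaintext:

```python
import hashlib
import secrets

N = 16  # block size in bytes

def E(k: bytes, x: bytes) -> bytes:
    # Stand-in PRF (illustration only): truncated SHA-256 of key||input.
    return hashlib.sha256(k + x).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def pbr_encrypt(k: bytes, m: bytes) -> bytes:
    c = b""
    for i in range(0, len(m), N):
        r = secrets.token_bytes(N)         # fresh random block r_i per block
        c += r + xor(m[i:i+N], E(k, r))    # c_i = r_i ++ (m_i xor E_k(r_i))
    return c

def pbr_decrypt(k: bytes, c: bytes) -> bytes:
    m = b""
    for i in range(0, len(c), 2 * N):
        r, x = c[i:i+N], c[i+N:i+2*N]
        m += xor(x, E(k, r))               # m_i = x xor E_k(r_i); only E is used
    return m

k = b"key"
m = secrets.token_bytes(4 * N)
c = pbr_encrypt(k, m)
assert pbr_decrypt(k, c) == m and len(c) == 2 * len(m)
```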
Note that the PBR mode does not use the 'decryption' function D of the
underlying block cipher (reversible PRP) at all - for either encryption or
decryption. This is the reason that PBR can be deployed using a PRF or a PRP,
and not just using a block cipher. As can be seen from Table 2.6 and
Exercise 2.22, this also holds for the OFB and CFB modes.
PBR is not a standard mode, and rightfully so, since it is wasteful: it
requires one block of random bits for each block of the plaintext, and all
these random blocks also become part of the ciphertext, as they are needed for
decryption; i.e., the length of the ciphertext is double the length of the
plaintext. However, PBR is secure - allowing us to discuss a simple
provably-secure construction of a symmetric cryptosystem, based on the
security of the underlying block cipher (reversible PRP).

Theorem 2.1. If E is a PRF or PRP, or (E, D) is a block cipher (reversible
PRP), then (P BRE , P BRD ) is a CPA-indistinguishable symmetric encryption.

Proof. We present the proof for the case where E is a PRF; the other cases are
similar. We also focus, for simplicity, on encryption of a single-block
message, m = m_1 ∈ {0,1}^n.

Denote by (PBR^Ê, PBR^D̂) the same construction, except using, instead of E_k,
a 'truly' random function f ←$ {{0,1}^n → {0,1}^n}. In this case, for any pair
of plaintext messages m_0, m_1 selected by the adversary, and for the random
block r used in encrypting the challenge, the probability of the challenge
ciphertext c* = (r, m_0 ⊕ f(r)) is exactly the same as the probability of
c* = (r, m_1 ⊕ f(r)), from the symmetry of the random choice of f. Hence, the
attacker's success probability when 'playing' the IND-CPA game (Def. 2.1)
'against' (PBR^Ê, PBR^D̂) is exactly half. Note that this holds even for a
computationally-unbounded adversary.

Assume, to the contrary, that there is some PPT adversary A that gains a
non-negligible advantage against (PBR^E, PBR^D) when E is a PRF. As argued
above, the same A succeeds with probability exactly half, i.e., with exactly
zero advantage, against (PBR^Ê, PBR^D̂), i.e., when E is replaced by a truly
random function.

We can therefore use A to distinguish between E_k(·) and a random function,
with significant probability, contradicting the assumption that E is a PRF;
see Def. 2.8. Namely, we run A against the PBR construction instantiated with
either a truly random function or with E_k(·), resulting in (PBR^Ê, PBR^D̂) or
(PBR^E, PBR^D), respectively. Since A wins with significant advantage against
(PBR^E, PBR^D), and with no advantage against (PBR^Ê, PBR^D̂), this allows
distinguishing, establishing the contradiction.

Error propagation, integrity and CCA security

Since the PBR mode encrypts the plaintext by bitwise XOR, i.e.,
c_i = (r_i, m_i ⊕ E_k(r_i)), flipping a bit in the second part results in
flipping the corresponding bit of the decrypted plaintext, with no other
change. We say that such bit errors are perfectly localized, or have no error
propagation. On the other hand, bit errors in the random part r_i corrupt the
entire corresponding plaintext block, i.e., propagate to the entire block. In
any case, errors are somewhat localized - the other plaintext blocks are
decrypted intact.
This property may seem useful for limiting the damage of such errors, but its
value is very limited. On the other hand, this property has two negative
security implications. The first is obvious: an attacker can flip specific bits
in the plaintext, i.e., PBR provides no integrity protection. Of course, we did
not require cryptosystems to ensure integrity; in particular, the situation is
identical for other bitwise-XOR ciphers such as OTP.
The other security drawback is that PBR is not IND-CCA secure. This
directly results from the error localization property. Since all the modes we
show in this chapter localize errors (perfectly, or to a single or two blocks), it
follows that none of these modes is IND-CCA secure.

Exercise 2.23 (Error localization conflicts with IND-CCA security). Show
that every cryptosystem where errors in one ciphertext block are localized to
one or two corresponding blocks is not IND-CCA secure.
Hint: Consider the case of three-block plaintexts: after the attacker corrupts
one ciphertext block, at least one plaintext block is still decrypted intact.
This suffices for the attacker to succeed in the IND-CCA game.

2.9.3 The Output-Feedback (OFB) Mode


We now proceed to discuss standard modes, which ensure provably-secure
encryption, with randomization, for multiple-block messages - yet are more
efficient than the PBR mode.

We begin with the simple Output Feedback (OFB) mode. In spite of its
simplicity, this mode ensures provably-secure encryption - and requires the
generation and exchange of only a single block of random bits, compared to one
block of random bits per plaintext block, as in PBR.

[Figure: pad_0 = IV; pad_i = E_k(pad_{i-1}); c_0 = pad_0, c_i = m_i ⊕ pad_i]

Figure 2.26: Output Feedback (OFB) mode encryption. Adapted from [100].

[Figure: pad_0 = c_0; pad_i = E_k(pad_{i-1}); m_i = c_i ⊕ pad_i]

Figure 2.27: Output Feedback (OFB) mode decryption. Adapted from [100].

The OFB mode is illustrated in Figs. 2.26 (encryption) and 2.27 (decryption).
OFB is a variant of the PRF-based stream cipher discussed in subsection 2.5.1
and illustrated in Fig. 2.15, and, like it, operates on input which consists
of l blocks of n bits each. The difference is that OFB uses a PRP (block
cipher) E_k instead of the PRF.
We use a random Initialization Vector (IV) as a ‘seed’ to generate a long
sequence of pseudo-random n−bit pad blocks, pad1 , . . . , padl , to encrypt plain-
text blocks m1 , . . . , ml . We next compute the bitwise XOR of the pad blocks
pad1 , . . . , padl , with the corresponding plaintext blocks m1 , . . . , ml , resulting
in the ciphertext which consists of the random IV c0 and the results of the
XOR operation, i.e. c1 = m1 ⊕ pad1 , c2 = m2 ⊕ pad2 , . . ..
Let us now define OFB^E_k(m), the OFB mode for a given block cipher (E, D).
For simplicity, we define OFB^E_k(m) for messages m which consist of some
number l of n-bit blocks, i.e., m = m_1 ++ ... ++ m_l, where
(∀i ≤ l) |m_i| = n. Then OFB^E_k(m) is defined as:

    OFB^E_k(m_1 ++ ... ++ m_l) = c_0 ++ c_1 ++ ... ++ c_l      (2.33)

where:

    pad_0 ←$ {0,1}^n             (2.34)
    pad_i ← E_k(pad_{i-1})       (2.35)
    c_0 ← pad_0                  (2.36)
    c_i ← pad_i ⊕ m_i            (2.37)
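A Python sketch of OFB, following Eqs. (2.33)-(2.37), with a truncated keyed
hash standing in for the block cipher E (illustration only; a real deployment
would use a standard block cipher such as AES):

```python
import hashlib
import secrets

N = 16  # block size in bytes

def E(k: bytes, x: bytes) -> bytes:
    # Stand-in keyed function (illustration only).
    return hashlib.sha256(k + x).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def ofb_encrypt(k: bytes, m: bytes) -> bytes:
    pad = secrets.token_bytes(N)       # pad_0 = random IV
    c = pad                            # c_0 = pad_0
    for i in range(0, len(m), N):
        pad = E(k, pad)                # pad_i = E_k(pad_{i-1})
        c += xor(m[i:i+N], pad)        # c_i = m_i xor pad_i
    return c

def ofb_decrypt(k: bytes, c: bytes) -> bytes:
    pad, m = c[:N], b""
    for i in range(N, len(c), N):
        pad = E(k, pad)                # same pad sequence, from c_0
        m += xor(c[i:i+N], pad)        # m_i = c_i xor pad_i
    return m

k = b"key"
m = secrets.token_bytes(3 * N)
assert ofb_decrypt(k, ofb_encrypt(k, m)) == m
```

Note that decryption regenerates exactly the same pad sequence from c_0 and
XORs it with the ciphertext, and that only the 'forward' function E is used -
never D.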

Offline pad precomputation The OFB mode allows both the encryption
process and the decryption process to precompute the pad, ‘offline’ - i.e., before
the plaintext and ciphertext, respectively, are available. Offline pad precompu-
tation is possible since the pad does not depend on the plaintext (or ciphertext).
This can be important, e.g., when a CPU with limited computation speed needs
to support a limited number of ‘short bursts’, without adding latency. Once the
plaintext/ciphertext is available, we only need one XOR operation per block.

Parallelism. The pad is computed sequentially; there does not appear to be


a way to speed up its computation using parallelism.

Error localization, correction and integrity

Another important property of encryption schemes is error localization,
namely, how many bits of the deciphered plaintext change as a result of
corruption of a single ciphertext bit. Since OFB operates as a bit-wise stream
cipher, it is 1-localized (or perfectly localized): a change in any ciphertext
bit (other than in the IV block c_0) simply causes a change in the
corresponding plaintext bit - and in no other bit.
The ‘perfect bit error localization’ property implies that error correction
works equally well if applied to the plaintext (before encryption, with correction
applied to plaintext after decryption) or applied to the ciphertext. Without
localization, a single bit error in the ciphertext could translate to many bit
errors in the plaintext, essentially implying that error correction can only be
effectively applied to the ciphertext.

This motivated some designers to use OFB or a similar XOR-based stream cipher,
so that they could apply error correction to the plaintext. However, this is
often a bad idea, as it results in structured redundancy in the plaintext,
which may make ciphertext-only (CTO) attacks easier!
Let us give two examples. Our first example is a scenario where the parties
share only a short key k, say of 20 bits. Normally, this would imply that the
communication is vulnerable - an attacker may simply perform an exhaustive
search to find the key. However, what if the parties only encrypt a (single)
message containing a new, longer random key, say k' ←$ {0,1}^n? Exhaustive
search is then useless, since each of the 2^20 possible values of k would
yield some value for k' - and all values would be equally likely. However, the
attack becomes trivial if the plaintext includes not only the key, but also an
error correction or detection code, denoted code(·), applied to the key. In
this case, the ciphertext would be E_k(k' ++ code(k')), and the adversary can
easily check which guess of k is correct. Note that this particular attack may
also be possible if we simply pad the plaintext using some known string, e.g.,
send E_k(k' ++ 0^l).
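A toy version of this first example can be sketched in Python. We scale down
to an 8-bit key k, use simple repetition of k' as the redundancy code, and use
a hash-derived pad as a stand-in for the cipher (all of these are assumptions
for illustration only). Without the redundancy, every guess of k would yield
an equally plausible k'; with it, the exhaustive search succeeds:

```python
import hashlib
import secrets

def pad(k: bytes, n: int) -> bytes:
    # Stand-in key-derived pad (illustration only).
    return hashlib.sha256(k).digest()[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Parties share a short key k (8 bits) and send a fresh 16-byte key k',
# 'protected' by appending a redundancy code - here simply code(k') = k'.
k = secrets.token_bytes(1)
k_new = secrets.token_bytes(16)
plaintext = k_new + k_new                       # k' ++ code(k')
ciphertext = xor(plaintext, pad(k, len(plaintext)))

# Exhaustive search: only guesses whose redundancy is consistent survive.
for guess in range(256):
    p = xor(ciphertext, pad(bytes([guess]), len(ciphertext)))
    if p[:16] == p[16:]:                        # redundancy check passes
        recovered = p[:16]
        break
assert recovered == k_new
```

The redundancy gives the attacker exactly the 'recognizable plaintext' needed
to verify each key guess.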
Our second example is a very important vulnerability: a ciphertext-only (CTO)
attack on the A5/1 and A5/2 stream ciphers [7], both used in the GSM protocol.
This attack exploits the known relationship between ciphertext bits, due to
the fact that an error-correcting code was applied to the plaintext. This
redundancy suffices to attack the ciphers, using techniques that normally can
be applied only in known-plaintext attacks. Unfortunately, the complete
details of this beautiful and important result are beyond our scope; for
details, see [7].
One may wonder: why would these designers prefer to apply error correction to
the plaintext rather than to the ciphertext? One motivation may be the hope
that this makes cryptanalysis harder, e.g., by corrupting plaintext statistics
such as letter frequencies. This may hold for some codes; but it is better to
design such defenses explicitly into the cryptosystem, rather than rely on
such a fuzzy property of the encoding.
Another motivation may be the hope that applying error correction/detection to
the plaintext may provide integrity. Note that due to the perfect bit-error
localization of OFB, an attacker can easily flip a specific plaintext bit - by
flipping the corresponding ciphertext bit. If we applied error detection to
the plaintext, then corruption of even a single bit would invalidate the
entire plaintext. However, since the attacker can flip multiple ciphertext
bits, thereby flipping the corresponding plaintext bits, there are cases where
the attacker can modify the ciphertext in such a way as to flip specific bits
in the plaintext while also 'fixing' the error detection/correction code,
making the message appear correct. We conclude with the following principle.

Principle 9 (Minimize plaintext redundancy). Plaintext should preferably


have minimal redundancy. In particular, plaintext should preferably not contain
Error Correction or Detection codes.

Namely, applying error correction to the plaintext is a bad idea - certainly
when using a stream-cipher design such as OFB. This raises the obvious
question: can an encryption mode of a block cipher also protect the integrity
of the decrypted plaintext? Both of the following modes, CFB and CBC, provide
some defense of integrity - by ensuring that errors do propagate.

Provable security of OFB

The weaknesses discussed above are due to incorrect deployments of OFB;
correctly used, OFB is secure. The proof of security of OFB follows along
lines similar to Theorem 2.1, except that, in order to deal with multi-block
messages, we need a more elaborate proof technique, called a 'hybrid
argument'; we leave that for courses and books focusing on cryptology, e.g.,
[80, 159].

2.9.4 The Cipher Feedback (CFB) Mode


We now present the Cipher Feedback (CFB) Mode. Like most standard modes,
it uses a random first block (‘initialization vector’, IV). In fact, CFB resembles
OFB; it uses the IV to generate a first pseudo-random pad pad1 = Ek (IV ),
and it computes ciphertext blocks by bitwise XOR between the pad blocks and
the corresponding plaintext blocks, i.e., ci = padi ⊕ mi .
The difference between CFB and OFB is in the ‘feedback’ mechanism,
namely, the computation of the pads padi (for i > 1). In CFB mode, this is
done using the ciphertext rather than the previous pad, i.e., padi = Ek (ci−1 ) =
Ek (padi−1 ⊕ mi−1 ). See Fig. 2.28.

[Figure: c_0 = IV; pad_i = E_k(c_{i-1}); c_i = m_i ⊕ pad_i]

Figure 2.28: Cipher Feedback (CFB) mode encryption. Adapted from [100].
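A Python sketch of CFB, with a truncated keyed hash standing in for E
(illustration only; note that, as in OFB, only the forward function E is
used):

```python
import hashlib
import secrets

N = 16  # block size in bytes

def E(k: bytes, x: bytes) -> bytes:
    # Stand-in keyed function (illustration only).
    return hashlib.sha256(k + x).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cfb_encrypt(k: bytes, m: bytes) -> bytes:
    c = secrets.token_bytes(N)               # c_0 = random IV
    prev = c
    for i in range(0, len(m), N):
        prev = xor(m[i:i+N], E(k, prev))     # c_i = m_i xor E_k(c_{i-1})
        c += prev
    return c

def cfb_decrypt(k: bytes, c: bytes) -> bytes:
    m = b""
    for i in range(N, len(c), N):
        # m_i = c_i xor E_k(c_{i-1}): each block depends only on two
        # ciphertext blocks, so all blocks can be decrypted in parallel.
        m += xor(c[i:i+N], E(k, c[i-N:i]))
    return m

k = b"key"
m = secrets.token_bytes(3 * N)
assert cfb_decrypt(k, cfb_encrypt(k, m)) == m
```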

Optimizing implementations: parallel decryption, but no precomputation

Unlike OFB, the CFB mode does not support offline precomputation of the pad,
since the pad depends on the ciphertext (of the previous block). One
optimization that is possible is to parallelize the decryption operation:
decryption may be performed for all blocks in parallel, since the decryption
of block i is m_i = c_i ⊕ pad_i = c_i ⊕ E_k(c_{i-1}), i.e., it can be
computed from the ciphertext blocks of this block and of the previous block
alone.

[Figure: pad_i = E_k(c_{i-1}); m_i = c_i ⊕ pad_i]

Figure 2.29: Cipher Feedback (CFB) mode decryption. Adapted from [100].

Error localization and integrity

Error localization in CFB is not perfect: a single bit error in ciphertext
block c_i flips the corresponding bit in plaintext block m_i, and completely
corrupts the following plaintext block m_{i+1}.
As we discussed for OFB, this reduction in error localization may be viewed
as an advantage in ensuring integrity. Like OFB mode, the CFB mode allows
the attacker to flip specific bits in the decrypted plaintext, by flipping corre-
sponding bits in the ciphertext. However, as a result of such bit flipping, say in
block i, the decrypted plaintext of the following block is completely corrupted.
Intuitively, this implies that applying an error-detection code to the plaintext
would allow detection of such changes, in contrast to the situation with OFB
mode.
However, this dependency on an error detection code applied to the plaintext
raises some concerns. First, it is an assumption about the way that CFB is
used; can we provide some defense of integrity that does not depend on such
additional mechanisms as an error detection code? Second, it seems challenging
to prove that the above intuition is really correct, and this is likely to
depend on the specifics of the error detection code used. Finally, adding an
error detection code to the plaintext increases its redundancy, in
contradiction to Principle 9. We next present the CBC mode, which provides a
different defense of integrity, addressing these concerns.

2.9.5 The Cipher-Block Chaining (CBC) mode


We now present one last encryption mode, the Cipher-Block Chaining (CBC)
mode. CBC is a very popular mode for encryption of multiple-block messages;
like CFB, it allows parallel decryption but not offline pad precomputation.
The CBC mode, like the OFB and CFB modes, uses a random Initialization Vector
(IV) as the first block of the ciphertext, c_0 ←$ {0,1}^n. However, in
contrast to OFB and CFB, to encrypt the i-th plaintext block m_i, CBC XORs
m_i with the previous ciphertext block c_{i-1}, and then applies the block
cipher. Namely, c_i = E_k(c_{i-1} ⊕ m_i). See Fig. 2.30.

[Figure: c_0 = IV; c_i = E_k(m_i ⊕ c_{i-1})]

Figure 2.30: Cipher Block Chaining (CBC) mode encryption. Adapted from [100].

[Figure: m_i = D_k(c_i) ⊕ c_{i-1}]

Figure 2.31: Cipher Block Chaining (CBC) mode decryption. Adapted from [100].

More precisely, let (E, D) be a block cipher, and let m = m_1 ++ ... ++ m_l
be a message (broken into n-bit blocks). Then the CBC encryption of m using
key k and a random initialization vector c_0 = IV ∈ {0,1}^n is defined as:

    CBC^E_k(m_1 ++ ... ++ m_l) = c_0 ++ c_1 ++ ... ++ c_l       (2.38)

where:

    c_0 ←$ {0,1}^n                                    (2.39)
    (∀i ∈ {1, ..., l})  c_i ← E_k(c_{i-1} ⊕ m_i)      (2.40)
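A Python sketch of CBC. Since CBC decryption needs the inverse D of the block
cipher, this sketch uses a toy three-round Feistel permutation as the block
cipher, built as in subsection 2.8.1 (an illustration only; it is not a
secure cipher):

```python
import hashlib
import secrets

N = 8            # half-block size; the toy cipher permutes 2N-byte blocks
BLOCK = 2 * N

def F(k: bytes, x: bytes) -> bytes:
    # Stand-in round function (illustration only).
    return hashlib.sha256(k + x).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def E(k: bytes, m: bytes) -> bytes:
    # Toy three-round Feistel block cipher.
    L, R = m[:N], m[N:]
    for _ in range(3):
        L, R = R, xor(L, F(k, R))
    return L + R

def D(k: bytes, c: bytes) -> bytes:
    # Inverse permutation: run the rounds backwards.
    L, R = c[:N], c[N:]
    for _ in range(3):
        L, R = xor(R, F(k, L)), L
    return L + R

def cbc_encrypt(k: bytes, m: bytes) -> bytes:
    c = secrets.token_bytes(BLOCK)                    # c_0 = random IV
    for i in range(0, len(m), BLOCK):
        c += E(k, xor(m[i:i+BLOCK], c[-BLOCK:]))      # c_i = E_k(m_i xor c_{i-1})
    return c

def cbc_decrypt(k: bytes, c: bytes) -> bytes:
    m = b""
    for i in range(BLOCK, len(c), BLOCK):
        m += xor(D(k, c[i:i+BLOCK]), c[i-BLOCK:i])    # m_i = D_k(c_i) xor c_{i-1}
    return m

k = b"key"
m = secrets.token_bytes(3 * BLOCK)
assert cbc_decrypt(k, cbc_encrypt(k, m)) == m
```

Note that, unlike the previous modes, decryption here really does invoke the
inverse permutation D; this is why CBC requires a reversible PRP and cannot be
instantiated with a plain PRF.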
The CBC mode, like the other modes (except ECB), ensures IND-CPA security,
i.e., security against CPA attacks, provided that the underlying block cipher
is a secure reversible PRP. However, it is not secure against CCA attacks.
Exercise 2.24. Demonstrate that CBC mode does not ensure security against
CCA attacks.
Hint: the solution is quite similar to that of Exercise 2.23.

Error propagation and integrity Any change in the CBC ciphertext, even
of one bit, results in unpredictable output from the block cipher’s ‘decryption’
operation, and hence unpredictable decryption. Namely, flipping a bit in the
ciphertext block i does not flip the corresponding bit in plaintext block i, as it
did in the OFB and CFB modes.
However, flipping a bit in ciphertext block c_{i-1}, without change to block
c_i, results in flipping of the corresponding bit in the i-th decrypted
plaintext block. Namely, bit-flipping is still possible in CBC, just a bit
different: in order to flip a bit in decrypted-plaintext block i, the
adversary has to flip the corresponding bit in the previous ciphertext block
(i-1), which results in corruption of the decryption of block i-1. Indeed,
this kind of tampering is used in several attacks on systems deploying CBC,
such as the POODLE attack [132]. Note also that bit flipping in the first
decrypted-plaintext block only requires flipping the corresponding bit of the
IV (c_0) - and hence does not corrupt any plaintext block.

2.9.6 Ensuring CCA Security


We already observed, in Exercise 2.23, that any cryptosystem (and mode) that
ensures error localization to some extent cannot be IND-CCA secure. This
implies that none of the modes we discussed is IND-CCA secure. Such failure
can occur even for the much weaker - and more common - case of Feedback-only
CCA attacks, where the attacker does not receive the decrypted plaintext, but
only an indication of whether the plaintext was ‘valid’ or not.
How can we ensure security against CCA attacks? One intuitive defense is to
avoid giving any feedback on invalid-plaintext failures. However, this is
harder than it may seem. For example, often, after (successful) decryption, a
response is immediately sent, which may be hard to emulate when the plaintext
is invalid - we may even be unable to identify the sender, e.g., if the
sender's identity is encrypted for anonymity. By observing whether a response
is sent, or the timing of the response, an attacker may obtain feedback on the
attack. Such unintentional indications are referred to as side channels; for
example, when the feedback is based on the time at which the response is sent,
this is a timing side channel.
A better approach may be to detect the chosen-ciphertext queries, without
decrypting them. One simple way to do this is by authenticating the ciphertext,
i.e., appending to the ciphertext an authentication tag, which allows secure
detection of any modification in the ciphertext. Such authentication is the
subject of the next chapter.

2.10 Case study: the (in)security of WEP


We conclude this chapter, and further motivate the next, by discussing a
case study: vulnerabilities of the Wired Equivalent Privacy (WEP)
standard [46]. Notice that these critical vulnerabilities were discovered long
ago, mostly in [37], relatively soon after the standard was published; yet,
products and networks supporting WEP still exist. This is an example of the
fact that once a standard is published and adopted, it is often very difficult
to fix its security. Hence, it is important to carefully evaluate security in
advance, in an open process that encourages researchers to find
vulnerabilities, and, where possible, with proofs of security. To address
these vulnerabilities, WEP was replaced - possibly in too much haste - with a
new standard, Wi-Fi Protected Access (WPA); vulnerabilities were also found in
WPA, see [164].
WEP stands for Wired Equivalent Privacy; it was developed as part of the
IEEE 802.11b standard, to provide some protection of data over wireless local
area networks (also known as WiFi networks). As the name implies, the original
goals aimed at a limited level of privacy (meaning confidentiality), which was
deemed 'equivalent' to the (physically limited) security offered by a wired
connection. WEP includes encryption for confidentiality, a CRC-32
error-detection code ('checksum') for integrity, and authentication to prevent
injection attacks.
WEP assumes a symmetric key between the mobile device and an access
point. Many networks share the same key with all mobiles, which obviously
allows each device to eavesdrop on all communication. However, other vul-
nerabilities exist even if a network uses good key-management mechanisms to
share a separate key with each mobile.
Confidentiality in WEP is protected using the RC4 stream-cipher, used
as a PRG-based stream cipher as in subsection 2.5.1. The PRG is initiated
with a secret shared key, which is specified to be only 40 bits long. This
short key size was chosen intentionally, to allow export of the hardware, since
when the standard was drafted there were still export limitations on longer-key
cryptography. Some implementations also support longer, 104-bit keys.
The PRG is also initiated with a 24-bit per-packet random Initialization
Vector (IV). We use RC4_{IV,k} to denote the string output by RC4 when
initialized using a given (IV, k) pair. More specifically, we use RC4_{IV,k}[l]
to denote the first l bits in RC4_{IV,k}, i.e., in the output of RC4 when
initialized using the given (IV, k) pair.
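As an illustration, here is a minimal Python sketch of the RC4 keystream generation (byte-oriented, while the text counts bits); as in WEP, the generator is seeded with the concatenation of the IV and the key:

```python
def rc4_keystream(iv: bytes, key: bytes, l: int) -> bytes:
    """First l bytes of the RC4 output when seeded with IV || key,
    i.e., a byte-level version of RC4_{IV,k}[l] (WEP prepends the 24-bit IV)."""
    seed = iv + key
    S = list(range(256))
    j = 0
    for i in range(256):                       # key-scheduling (KSA)
        j = (j + S[i] + seed[i % len(seed)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = [], 0, 0
    for _ in range(l):                         # keystream generation (PRGA)
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)
```

With an empty IV this reduces to plain RC4; e.g., seeding with the key `b"Key"` yields the well-known keystream prefix `eb9f7781b734ca72a719`.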
WEP packets use the CRC-32 error detection/correction code, computed
over the plaintext message m. CRC-32 is a popular error detection/correction
code, which can detect most corruption of the message, and even allow correc-
tion of messages with up to 3 or 5 corrupted bits [109]. Specifically, to send
message m using secret key k, WEP implementations select a random 24-bit
IV, and transmit IV together with WEP_k(m; IV), defined as:⁹

    WEP_k(m; IV) ≡ RC4_{IV,k}[|m ++ CRC(m)|] ⊕ (m ++ CRC(m))        (2.41)

⁹In an earlier version of this text, we used different notations: WEP′_k(m; IV) ≡ RC4_{IV,k} ⊕
(m ++ CRC(m)) and WEP_k(m; IV) = [IV, WEP′_k(m; IV)]. This footnote is left here in case
some of the old notation still persists.

2.10.1 CRC-then-XOR does not ensure integrity
CRC-32 is a quite good error detection/correction code. By encrypting the
output of CRC, specifically by XORing it with the pseudo-random pad gener-
ated by RC4, the WEP designers hoped to protect message integrity. However,
like many other error-correcting codes, CRC-32 is linear; namely, for any two
strings α, α′ ∈ {0, 1}* of equal length (|α| = |α′|), holds:

    CRC(α ⊕ α′) = CRC(α) ⊕ CRC(α′)        (2.42)

We now show that this allows an attacker to change the message m sent in a
WEP packet, by flipping any desired bits and appropriately adjusting the CRC
field.
Specifically, let Δ be a string of length |m| containing 1 exactly in the bit
locations that the attacker wishes to flip. Given α = WEP_k(m; IV), the
attacker can compute a valid WEP_k(m ⊕ Δ; IV) as follows:

WEP_k(m ⊕ Δ; IV) = RC4_{IV,k}[|m ⊕ Δ ++ CRC(m ⊕ Δ)|] ⊕ (m ⊕ Δ ++ CRC(m ⊕ Δ))
                 = RC4_{IV,k}[|m ++ CRC(m)|] ⊕ (m ⊕ Δ ++ CRC(m) ⊕ CRC(Δ))
                 = RC4_{IV,k}[|m ++ CRC(m)|] ⊕ (m ++ CRC(m)) ⊕ (Δ ++ CRC(Δ))
                 = WEP_k(m; IV) ⊕ (Δ ++ CRC(Δ))
                 = α ⊕ (Δ ++ CRC(Δ))

where the second equality uses |m ⊕ Δ| = |m| and the linearity of CRC (Eq. (2.42)).

Namely, the CRC mechanism, XOR-encrypted, does not provide any mean-
ingful integrity protection.
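The attack can be demonstrated in a few lines of Python. One caveat: zlib's CRC-32 is affine rather than strictly linear (due to its initialization and final XOR), so the tag adjustment below includes a CRC-of-zeros term; for the idealized linear CRC of Eq. (2.42) that term vanishes. The keystream is just a stand-in random pad, not RC4:

```python
import os
import zlib

def crc_bytes(data: bytes) -> bytes:
    return zlib.crc32(data).to_bytes(4, "little")

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key_stream = os.urandom(20)  # stand-in for RC4_{IV,k}; the attacker never sees it
m = b"PAY $0001 TO BOB"
c = xor(key_stream, m + crc_bytes(m))            # WEP ciphertext, as in Eq. (2.41)

# The attacker flips plaintext bits without knowing the keystream:
delta = xor(b"PAY $0001 TO BOB", b"PAY $9999 TO BOB")
# zlib's CRC-32 is affine, hence the extra crc(0...0) term in the adjustment:
tag_delta = xor(crc_bytes(delta), crc_bytes(bytes(len(delta))))
c_forged = xor(c, delta + tag_delta)

plain = xor(key_stream, c_forged)                # the receiver decrypts...
assert plain[:-4] == b"PAY $9999 TO BOB"
assert plain[-4:] == crc_bytes(plain[:-4])       # ...and the CRC check passes
```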

WEP authentication-based vulnerabilities   WEP defines two modes of
authentication, although one of them, called open-system authentication,
simply means that there is no authentication.
The other mode is called shared-key authentication. It works very simply:
the access point sends a random challenge R, and the mobile sends back
WEP_k(R; IV), i.e., a proper WEP packet containing the ‘message’ R.
This authentication mode is currently rarely used, since it allows attacks
on the encryption mechanism. First, notice that it provides a trivial way for
the attacker to obtain ‘cribs’ (known plaintext - ciphertext pairs). Of course,
encryption systems should be protected against known-plaintext attacks; how-
ever, following the conservative design principle (principle 7), system designers
should try to make it difficult for attackers to obtain cribs. In the common,
standard case of 40-bit WEP implementations, a crib is deadly - an attacker
can now do a trivial exhaustive search to find the key.
Even when using longer keys (104 bits), the shared-key authentication ex-
poses WEP to a simple cryptanalysis attack. Specifically, since R is known,
the attacker learns RC4_{IV,k} for a given, random IV. Since the length of the
IV is just 24 bits, it is feasible to obtain a collection of most IV values and
the corresponding RC4_{IV,k} pads, allowing decryption of most messages.
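A toy simulation of this keystream-recovery attack (simplified: the CRC field is omitted, and a tiny IV space of random pads stands in for RC4_{IV,k}):

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Stand-in for RC4_{IV,k}: one fixed random pad per IV, unknown to the attacker.
pads = {iv: os.urandom(32) for iv in range(4)}   # tiny IV space for the demo

# Shared-key authentication: the AP sends R; the mobile returns IV, pad XOR R.
R = os.urandom(32)
iv = 2
response = xor(pads[iv], R)

# The eavesdropper knows R, so it recovers the per-IV pad...
dictionary = {iv: xor(R, response)}
assert dictionary[iv] == pads[iv]

# ...and can decrypt any later frame that reuses the same IV:
c = xor(pads[iv], b"secret payload, 32 bytes long!!!")
assert xor(dictionary[iv], c) == b"secret payload, 32 bytes long!!!"
```

With a real 24-bit IV space, repeating this per IV yields the dictionary of pads described above.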

As a result of these concerns, most WEP systems use only open-system
authentication mode, i.e., do not provide any authentication.

Further WEP encryption vulnerabilities   We briefly mention two further
vulnerabilities of the WEP encryption mechanism.
The first vulnerability exploits the integrity vulnerability discussed earlier.
As explained there, the attacker can flip arbitrary bits in the WEP payload
message. WEP is a link-layer protocol; the payload is usually an Internet
Protocol (IP) packet, whose header contains, in known position, the destination
address. An attacker can change the destination address, causing forwarding
of the packet directly to the attacker!
The second vulnerability is the fact that WEP uses ‘plain’ RC4, which has
been shown in [124] to be vulnerable.

2.11 Encryption: Final Words


Confidentiality, as provided by encryption, is the oldest goal of cryptology, and
is still critical to the entire area of cybersecurity. Encryption has been studied
for millennia, but for many years, the design of cryptosystems was kept secret,
in the hope of improving security. Kerckhoffs’ principle, however, has been
widely adopted and caused cryptography to be widely studied and deployed,
in industry and academia.
Cryptography was further revolutionized by the introduction of precise def-
initions and proofs of security by reduction, based on the theory of complex-
ity. In particular, modern study of applied cryptography makes extensive use
of provable security, especially computational security, i.e., ensuring security
properties with high probability, against probabilistic polynomial time (PPT)
adversaries. We have seen a small taste of such definitions and proofs in this
chapter; we will see a bit more later on, but for a real introduction to the
theory of cryptography, see appropriate textbooks, such as [79, 80].

2.12 Encryption and Pseudo-Randomness: Additional
exercises
Exercise 2.25. ConCrypt Inc. announces a new symmetric encryption scheme,
CES. ConCrypt announces that CES uses 128-bit keys, is five times faster
than AES, and is the first practical cipher to be secure against computationally-
unbounded attackers. Is there any method, process or experiment to validate or
invalidate these claims? Describe one, or explain why none exists.
Exercise 2.26. ConCrypt Inc. announces a new symmetric encryption scheme,
CES512. ConCrypt announces that CES512 uses 512-bit keys, and as a result,
is proven to be much more secure than AES. Can you point out any concerns
with using CES512 instead of AES?
Exercise 2.27. Compare the following pairs of attack models. For each pair
(A, B), state whether every cryptosystem secure under attack model A is also
secure under attack model B, and vice versa. Prove your answers (if you
cannot prove, at least give a compelling argument). The pairs are:
1. (Ciphertext only, Known plaintext)
2. (Known plaintext, Chosen plaintext)
3. (Known plaintext, Chosen ciphertext)
4. (Chosen plaintext, Chosen ciphertext)
Exercise 2.28. Alice is communicating using the GSM cellular standard, which
encrypts all calls between her phone and the access tower. Identify the attacker
model corresponding to each of the following cryptanalysis attack scenarios:
1. Assume that Alice and the tower use a different shared key for each call,
and that Eve knows that a specific, known message is sent from Bob to
Alice at given times.
2. Assume (only) that Alice and the tower use a different shared key for
each call.
3. Assume all calls are encrypted using a (fixed) secret key kA shared between
Alice’s phone and the tower, and that Eve knows that specific, known
control messages are sent, encrypted, at given times.
4. Assume (only) that all calls are encrypted using a (fixed) secret key kA
shared between Alice’s phone and the tower.
Exercise 2.29. We covered several encryption schemes in this chapter, in-
cluding At-Bash (AzBy), Caesar, Shift-cipher, general monoalphabetic substi-
tution, OTP, PRG-based stream cipher, RC4, block ciphers, and the ‘modes’
in Table 2.6. Which of these is: (1) stateful, (2) randomized, (3) FIL, (4)
polynomial-time?

Exercise 2.30. Consider the use of AES with a key length of 256 bits and a
block length of 128 bits, for two different 128-bit messages, A and B (i.e., one
block each). Bound, or compute precisely if possible, the probability that the
encryption of A will be identical to the encryption of B, in each of the following
scenarios:
1. Both messages are encrypted with the same randomly-chosen key, using
ECB mode.
2. Both messages are encrypted with two keys, each of which is chosen ran-
domly and independently, and using ECB mode.
3. Both messages are encrypted with the same randomly-chosen key, using
CBC mode.
4. Compute now the probability that the same message is encrypted to the
same ciphertext, using a randomly-chosen key and CBC mode.
Exercise 2.31. Present a very efficient CPA attack on the mono-alphabetic
substitution cipher, which allows complete recovery of arbitrary messages, using
the encryption of one short plaintext string.
Exercise 2.32 (PRG constructions). Let G : {0, 1}^n → {0, 1}^{n+1} be a secure
PRG. Is G′, as defined in each of the following items, a secure PRG? Prove.
1. G′(s) = G(s^R), where s^R denotes the reverse of s.
2. G′(r ++ s) = r ++ G(s), where r, s ∈ {0, 1}^n.
3. G′(s) = G(s ⊕ G(s)_{1...n}), where G(s)_{1...n} denotes the n most-significant
bits of G(s).
4. G′(s) = G(π(s)), where π is a (fixed) permutation.
5. G′(s) = G(s + 1).
6. (harder!) G′(s) = G(s ⊕ s^R).
A. Solution to G′(r ++ s) = r ++ G(s):
B. Solution to G′(s) = G(s ⊕ G(s)_{1...n}): may not be a PRG. For example, let
g be a PRG from any number m of bits to m + 1 bits, i.e., the output is a
pseudorandom string just one bit longer than the input. Assume n is even; for
every x ∈ {0, 1}^{n/2} and y ∈ {0, 1}^{n/2} ∪ {0, 1}^{1+n/2}, let G(x ++ y) = x ++ g(y).
If g is a PRG, then G is also a PRG (why?). However, when used in the above
construction (writing g(y)_{1...n/2} for the n/2 most-significant bits of g(y)):

    G′(x ++ y) = G[(x ++ y) ⊕ G(x ++ y)_{1...n}]
               = G[(x ++ y) ⊕ (x ++ g(y)_{1...n/2})]
               = G[(x ⊕ x) ++ (y ⊕ g(y)_{1...n/2})]
               = G[0^{n/2} ++ (y ⊕ g(y)_{1...n/2})]
               = 0^{n/2} ++ g(y ⊕ g(y)_{1...n/2})

As this output always begins with n/2 zero bits, it can be trivially distinguished
from random. Hence G′ is clearly not a PRG.
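The distinguisher implied by this solution can be sketched in Python; this is a byte-level toy, where outputs of the flawed construction always begin with a run of zero bytes (mirroring the n/2 zero bits above), which almost never happens for a uniform string:

```python
import os

N = 16  # stands in for n/2, counted in bytes here for readability

def g_prime_output() -> bytes:
    """Toy stand-in for the flawed G': every output starts with N zero bytes."""
    return bytes(N) + os.urandom(N + 1)

def looks_like_g_prime(s: bytes) -> bool:
    """The trivial distinguisher: check for the leading run of zeros."""
    return s[:N] == bytes(N)

assert looks_like_g_prime(g_prime_output())
# A uniform string passes this test only with probability 2^{-8N}:
assert not looks_like_g_prime(os.urandom(2 * N + 1))
```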
Exercise 2.33. Let G1 , G2 : {0, 1}n → {0, 1}2n be two different candidate
PRGs (over the same domain and range). Consider the function G defined in
each of the following sections. Is it a secure PRG - assuming both G1 and G2
are secure PRGs, or assuming only that one of them is secure PRG? Prove.

1. G(s) = G1 (s) ⊕ G2 (s).


2. G(s) = G1(s) ⊕ G2(s ⊕ 1^{|s|}).
3. G(s) = G1(s) ⊕ G2(0^{|s|}).
Exercise 2.34. Let G : {0, 1}^n → {0, 1}^m be a secure PRG, where m > n.

1. Let m = n + 1. Use G to construct a secure PRG G′ : {0, 1}^n → {0, 1}^{2n}.
2. Let m = 2n, and consider G′(x) = G(x) ++ G(x + 1). Is G′ a secure PRG?
3. Let m = 2n. Use G to construct a secure PRG G′ : {0, 1}^n → {0, 1}^{4n}.
4. Let m = 4n. Use G to construct a secure PRG Ĝ : {0, 1}^n → {0, 1}^{64n}.

Exercise 2.35 (Ad-Hoc PRF competition project). In this exercise, you will
experiment in trying to build directly a cryptographic scheme - in this case, a
PRF - as well as in trying to ‘break’ (cryptanalyze) it. Do this exercise with
others, in multiple groups (each containing one or multiple persons).

1. In the first phase, each group will design a PRF, whose input, key and
output are all 64 bits long. The PRF should be written in Python (or some
other agreed programming language), and only use the basic mathematical
operations: modular addition/subtraction/multiplication/division/remainder,
XOR, max and min. You may also use comparisons and conditional
code. The length of your program should not exceed 400 characters, and
it must be readable. You will also provide (separate) documentation.
2. All groups will be given the documentation and code of the PRFs of all
other groups, and try to design programs to distinguish these PRFs from a
random function (over same input and output domains). A distinguisher
is considered successful if it is able to distinguish in more than 1% of the
runs.
3. Each group, say G, gets one point for every PRF that G succeeded to
distinguish, and one point for every group that failed to distinguish G’s
PRF from random function. The group with the maximal number of
points wins.

Exercise 2.36. Let f be a secure Pseudo-Random Function (PRF) with n bit
keys, domain and range, and let k be a secret, random n bit key. Derive from
k, using f , two pseudorandom keys k1 , k2 , e.g., one for encryption and one
for authentication. Each of the derived keys k1 , k2 should be 2n-bits long, i.e.,
twice the length of k. Note: the two keys should be independent, i.e., each
of them (e.g., k1 ) should be pseudorandom, even if the adversary is given the
other (e.g., k2 ).
1. k1 =
2. k2 =
Exercise 2.37 (PRF constructions). Let F^{n,b,l} : {0, 1}^n × {0, 1}^b → {0, 1}^l
be a secure PRF; for brevity, we write simply F_k(x) for F_k^{n,b,l}(x). Is F̂, as
defined in each of the following items, a secure PRF? Prove.

1. F̂_k(m) = F_k(m ⊕ 1).
2. F̂_k(m) = F_m(k).
3. F̂_k(m) = F_k(m^R), where m^R denotes the reverse of m.
4. F̂_k(m_L ++ m_R) = F_k(m_L) ++ F_k(m_R).
5. F̂_k(m_L ++ m_R) = (m_L ⊕ F_k(m_R)) ++ (F_k(m_L) ⊕ m_R).
6. (harder!) F̂_k(m_L ++ m_R) = F_{F_k(1^b)}(m_L) ++ F_{F_k(0^b)}(m_R). Assume l = n.
7. F̂_k(m) = LSb(F_k(m)), where LSb returns the least-significant bit of the
input.
8. (harder!) F̂_k(m_L ++ m_R) = F_{F_k(1^b)}(m_L) ++ F_{F_k(0^b)}(m_R ⊕ F_k(m_L)).
Assume l = n = b.
Solution of F̂_k(m) = F_k(m ⊕ 1): yes, if F_k(m) is a secure PRF, then F̂_k(m) =
F_k(m ⊕ 1) is also a secure PRF. Assume, to the contrary, that there is a PPT
algorithm Â that ‘breaks’ F̂, i.e., there is some (strictly-positive) polynomial
p(n) s.t. for sufficiently large n holds:

    | Pr[Â^{F̂_k(·)}(1^n) = 1] − Pr[Â^{f̂(·)}(1^n) = 1] | > 1/p(n)        (2.43)

where the probabilities are computed over uniformly-random coin tosses by
Â and uniformly-random choice of k ∈ {0, 1}^n and of a function f̂ : {0, 1}^b →
{0, 1}^l.
We use the adversary Â as a ‘subroutine’ to implement a PPT algorithm
A, as illustrated in Fig. 2.32. Namely, the value of A^g(1^n), i.e., the output
of A, given oracle access to an unknown function g, and applied to security
parameter 1^n, is defined as:

    A^{g(m)}(1^n) ≡ Â^{g(m⊕1)}(1^n)        (2.44)

A^g(1^n): {
    Return Â^ĝ(1^n),
    where ĝ(m) ≡ g(m ⊕ 1);
}

Figure 2.32: Design of adversary A for the solution of item 1 of Exercise 2.37.

This trivially implies that:

    Â^{g(m)}(1^n) ≡ A^{g(m⊕1)}(1^n)        (2.45)

By applying this equation together with the fact that, by design, F̂_k(m) =
F_k(m ⊕ 1), we have:

    Â^{F̂_k(m)}(1^n) = A^{F_k(m⊕1⊕1)}(1^n) = A^{F_k(m)}(1^n)        (2.46)

Namely, we can rewrite Eq. (2.43) as:

    | Pr[A^{F_k(m)}(1^n) = 1] − Pr[A^{f̂(m⊕1)}(1^n) = 1] | > 1/p(n)        (2.47)

where the probabilities are computed over uniformly-random coin tosses by
Â and uniformly-random choice of k ∈ {0, 1}^n and of a function f̂ : {0, 1}^b →
{0, 1}^l. Let f(m) ≡ f̂(m ⊕ 1); a uniform choice of f̂ implies a uniform choice
of f (since the mapping f̂ ↦ f is a bijection over this set of functions). Hence
we have:

    | Pr[A^{F_k(m)}(1^n) = 1] − Pr[A^{f(m)}(1^n) = 1] | > 1/p(n)        (2.48)

However, Eq. (2.48) implies that F is not a secure PRF, which contradicts the
assumption that F is a secure PRF. Hence no such Â can exist, i.e., F̂ is a
secure PRF.

Exercise 2.38 (Key dependent message security). Several works design cryp-
tographic schemes such as encryption schemes, which are secure against a ‘key
dependent message attack’, where the attacker specifies a function f and re-
ceives encryption Ek (f (k)), i.e., encryption of the message f (k) where k is the
secret key. See [30].

1. Extend the definition of secure pseudo-random function for security against
key-dependent message attacks.
2. Suppose that F is a secure pseudo-random function. Show a (‘weird’)
function F′ which is also a secure pseudo-random function, but is not
secure against key-dependent message attacks.
Exercise 2.39 (ANSI X9.31 PRG and the DUHK attack). The ANSI X9.31
is a well-known PRG design, illustrated in Fig. 2.33. In this question we inves-
tigate a weakness in it, presented in [103]; it was recently shown to be still rel-
evant for some devices using this standard, in the so-called DUHK attack [45].
Our presentation is a slight simplification of the X9.31 design but retains the
important aspects of the attack. The design uses a PRF (or block cipher)
F : {0, 1}n × {0, 1}n → {0, 1}n , with a randomly-chosen and then fixed key k;
the attacks we discuss assume that the key is then known to the attacker. Let
f (x) = Fk (x).
1. Let g(x) = f(x) ++ f(f(x)) ++ · · ·. Is g a secure PRG?
2. Let g(x) = g_1(x) ++ g_2(x) ++ · · ·, where g_1(x) = f(x ⊕ f(t)), g_2(x) =
f(f(t) ⊕ f(g_1(x) ⊕ f(t))), and t is a known value (representing the time).
Is g a secure PRG?
Hint: Solution to first part is almost trivial; indeed this part is mostly there to
aid you in solving the second part (which may not be much harder, after solving
the first part).

[Diagram omitted: each round applies AES_K three times, combining the
timestamp input T_i and the previous state V_{i−1} with XORs to produce the
output R_i and the next state V_i.]

Figure 2.33: A single round of the ANSI X9.31 generator, instantiating F_k(x)
by AES_k(x) (i.e., using AES as the block cipher or PRF).

Exercise 2.40 (Cascade is not a robust combiner for PRFs). Let F′, F″ :
{0, 1}* × D → D be two polynomial-time computable functions, and let
F_{(k′,k″)}(x) = F′_{k′}(F″_{k″}(x)) be their cascade. Give an example of F′, F″
s.t. one of them is a PRF, yet F is not a PRF. Namely, cascade is not a robust
combiner for PRFs.
Exercise 2.41. A message m of length 256 bytes is encrypted using a 128-bit
block cipher, resulting in ciphertext c. During transmission, the 200th bit was
flipped due to noise. Let c0 denote c with the 200th bit flipped, and m0 denote
the result of decryption of c0 .
1. Which bits in m0 would be identical to the bits in m, assuming the use
of each of the following modes: (1) ECB, (2) CBC, (3) OFB, (4) CFB?
Explain (preferably, with diagram).

2. For each of the modes, specify which bits are predictable as a function of
the bits of m and the known fact that the 200th bit was flipped.

Exercise 2.42. Consider a scenario where randomness is scarce, motivating


attempts to design encryption schemes that use less randomization. Specifically,
consider the following two variants of CBC mode, both using, per message,
only twenty random bits, rather than n random bits (block size) in standard
CBC. Both variants are identical to CBC mode, except for the choice of c0 (the
initialization vector), which is as specified below; both use a twenty-bit random
string r ←$ {0, 1}^{20}. Show that neither variant suffices to ensure IND-CPA.
Append zeros: c0 = r || 0^{n−20}.
Pseudorandomly: c0 = E_k(r).
Note: your solution may require up to a few million queries; just make sure the
number of queries is polynomial in n.
Exercise 2.43. Hackme Bank protects money-transfer orders digitally sent
between branches, by encrypting them using a block cipher. Money transfer
orders have the following structure: m = f ++ r ++ t ++ x ++ y ++ p, where f, r
are each 20-bit fields representing the payer (from) and the payee (recipient),
t is a 32-bit field encoding the time, x is a 24-bit field representing the amount,
y is a 128-bit comment field defined by the payer, and p is a 32-bit parity field,
computed as the bitwise-XOR of the preceding 32-bit words. Orders with
incorrect parity, outdated or repeating time field, or unknown payer/payee are
ignored.
Mal captures a ciphertext c containing a money-transfer order of 1$ from
Alice to his account. You may assume that Mal can ‘trick’ Alice into including
a comment field y selected by Mal. Assume a 64-bit block cipher. Can Mal
cause a transfer of a larger amount to his account, and how, assuming use of
the following modes:

1. ECB
2. CBC
3. OFB
4. CFB

Solution: The first block contains f, r (20 bits each) and the top 24 bits of
the time t; the second block contains 8 more bits of the time, x (24 bits), and
32 bits of the comment; block three contains 64 bits of comment, and block
four contains 32 bits of comment and 32 bits of parity. Denote these four
plaintext blocks by m1 ++ m2 ++ m3 ++ m4.
Denote the ciphertext blocks captured by Mal as c0 ++ c1 ++ c2 ++ c3 ++ c4,
where c0 is the IV.

1. ECB: the attacker selects the third block (entirely comment) to be identical
to the second block, except for containing the maximal value in the 24
bits from bit 8 to bit 31. The attacker then switches the second and third
ciphertext blocks before giving the message to the bank. The parity bits
do not change.
2. CBC: the attacker chooses y s.t. m3 = m2. Now the attacker sends to the
bank the manipulated message z0 ++ c3 ++ c3 ++ c3 ++ c4, where
z0 = m1 ⊕ m3 ⊕ c2. As a result, decryption of the first block retrieves m1
correctly (as m1 = z0 ⊕ m3 ⊕ c2), and decryption of the last block similarly
retrieves m4 correctly (no change in the pair c3, c4). However, both the
second and the third blocks decrypt to the value c3 ⊕ c2 ⊕ m3. Hence, the
32-bit XOR of the message does not change. The decryption of the second
block (to c3 ⊕ c2 ⊕ m3) is likely to leave the time value valid, and to
increase the amount considerably.
3. OFB: the solution is trivial, since Mal can flip arbitrary bits in the
decrypted plaintext (by flipping the corresponding bits in the ciphertext).
4. CFB: as in CBC, the attacker chooses y s.t. m3 = m2. The attacker sends
to the bank the manipulated message c0 ++ c1 ++ c1 ++ c1 ++ z4, where
z4 = m4 ⊕ c2 ⊕ m2.
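The CBC manipulation (item 2) can be checked concretely. The sketch below uses a lazily-sampled random permutation as a stand-in block cipher, and random blocks for m1, m2, m4 (in the exercise these have specific formats); it verifies that the forged ciphertext decrypts to a message with the correct first and last blocks and an unchanged overall XOR, hence a valid parity field:

```python
import os

BLOCK = 8
_enc, _dec = {}, {}

def E(b: bytes) -> bytes:          # lazily-sampled random permutation (stand-in E_k)
    if b not in _enc:
        c = os.urandom(BLOCK)
        while c in _dec:
            c = os.urandom(BLOCK)
        _enc[b], _dec[c] = c, b
    return _enc[b]

def D(c: bytes) -> bytes:
    return _dec[c]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_enc(iv, blocks):
    out, prev = [iv], iv
    for m in blocks:
        prev = E(xor(m, prev))
        out.append(prev)
    return out

def cbc_dec(cts):
    return [xor(D(cts[i]), cts[i - 1]) for i in range(1, len(cts))]

m1, m2, m4 = os.urandom(BLOCK), os.urandom(BLOCK), os.urandom(BLOCK)
m3 = m2                                     # Mal tricked Alice: comment makes m3 = m2
c0, c1, c2, c3, c4 = cbc_enc(os.urandom(BLOCK), [m1, m2, m3, m4])

z0 = xor(xor(m1, m3), c2)                   # forged IV from the solution
forged = cbc_dec([z0, c3, c3, c3, c4])
assert forged[0] == m1 and forged[3] == m4
assert forged[1] == forged[2] == xor(xor(c3, c2), m3)

def xall(bs):                               # XOR of all blocks (the parity computation)
    r = bytes(BLOCK)
    for b in bs:
        r = xor(r, b)
    return r

assert xall(forged) == xall([m1, m2, m3, m4])   # parity still verifies
```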
Exercise 2.44 (Affine block cipher). Hackme Inc. proposes the following
highly-efficient block cipher, using two 64-bit keys k1, k2, for 64-bit blocks:
E_{k1,k2}(m) = (m ⊕ k1) + k2 (mod 2^{64}).
1. Show that E_{k1,k2} is an invertible permutation (for any k1, k2), and
present the inverse permutation D_{k1,k2}.
2. Show that (E, D) is not a secure block cipher (invertible PRP).
3. Show that encryption using (E, D) is not IND-CPA secure, when used in
the following modes: (1) ECB, (2) CBC, (3) OFB, (4) CFB.
Exercise 2.45 (How not to build PRP from PRF). Suppose F is a secure PRF
with input, output and keyspace all of length n bits. For x_L, x_R ∈ {0, 1}^n, let
F′_k(x_L ++ x_R) = F_k(x_L) ++ F_k(x_R) and F″_k(x_L ++ x_R) = F_k(x_L ⊕ x_R) ++
F_k(x_L ⊕ F_k(x_L ⊕ x_R)). Prove that neither F′_k nor F″_k is a PRP.
Exercise 2.46 (Building PRP from a PRF). Suppose you are given a secure
PRF F , with input, output and keyspace all of length n bits. Show how to use
F to construct:
1. A PRP, with input and output length 2n bit and key length n bits,
2. A PRP, with input, output and key all of length n bits.
Exercise 2.47. Show that the simple padding function pad(m) = m ++ 0^l fails
to prevent CCA attacks against most of the modes-of-operation (Fig. 2.6), when
l ≤ n. The attacker may perform CPA and CCA queries, and the plaintext
contains multiple blocks.

Exercise 2.48 (Indistinguishability definition). Let (E, D) be a stateless shared-
key encryption scheme, and let p1 , p2 be two plaintexts. Let x be 1 if the most
significant bits of p1, p2 are identical and 0 otherwise, i.e., x = {1 if MSb(p1) =
MSb(p2), else 0}. Assume that there exists an efficient algorithm X that com-
putes x given the ciphertexts, i.e., x = X(Ek (p1 ), Ek (p2 )). Show that this
implies that (E, D) is not IND-CPA secure, i.e., there is an efficient algo-
rithm ADV which achieves significant advantage in the IND-CPA experiment.
Present the implementation of ADV by filling in the missing code below:
ADV^{E_k}(‘Choose’, 1^n) : {
}
ADV^{E_k}(‘Guess’, s, c*) : {
}

Exercise 2.49 (BEAST vulnerability). Versions of SSL/TLS before TLS 1.1
use CBC encryption in the following way: they select the IV randomly only
for the first message m0 in a connection; for subsequent messages, say mi, the
IV is simply the last ciphertext block of the previous message. This creates a
vulnerability exploited, e.g., by the BEAST attack and a few earlier works [6, 64].
In this question we explore a simplified version of these attacks. For simplicity,
In this question we explore a simplified version of these attacks. For simplicity,
assume that the attacker always knows the next IV to be used in encryption,
and can specify plaintext message and receive its CBC encryption using that
IV. Assume known block length, e.g., 16 bytes.

1. Assume the attacker sees ciphertext (c0 , c1 ) resulting from CBC encryp-
tion with c0 being the IV, of a single-block message m, which can have
only two known values: m ∈ {m0 , m1 }. To find if m was m0 or m1 , the
adversary uses the fact that it knows the next IV to be used, which we
denote c′0, and asks for CBC encryption of a specially-crafted single-block
message m′; denote the returned ciphertext by the pair (c′0, c′1), where
c′0 is the (previously known) IV, as indicated earlier. The adversary can
now determine m from c′1:
a) What is the value of m′ that the adversary will ask to encrypt?
b) Fill in the missing parts in the solution of the adversary:
   m = { m0 if ________ ; m1 if ________ }
2. Show pseudo-code for the attacker algorithm used in the previous item.
3. Show pseudo-code for an attack that finds the last byte of message m.
Hint: use the previous solution as a routine in your code.
4. Assume now that the attacker tries to find a long secret plaintext string
x of length l bytes. Assume attacker can ask for encryption of messages
m = p ++ x, where p is a plaintext string chosen by the attacker. Show
pseudo-code for an attack that finds x. Hint: use the previous solution as
a routine; it may help to begin by considering a fixed-length x, e.g., four bytes.

Sketch of solution to second part (to be updated): The attacker makes a
query for the encryption of some one-block message y, and receives α0, α1,
where α1 = E_k(α0 ⊕ y). Suppose now the attacker knows the next message
will be encrypted with IV I. The attacker picks m0 = I ⊕ y ⊕ α0, and m1 as
some random message. If the game picks bit b = 0, then the attacker receives
the encryption of m0, which is I and E_k(I ⊕ m0) = E_k(I ⊕ I ⊕ y ⊕ α0) =
E_k(y ⊕ α0) = α1; otherwise, it receives some other string.

Sketch of solution to third part: The solution to the previous part allowed
the attacker to check whether the plaintext was a given string; we now simply
repeat this for the 256 different strings corresponding to all possible values of
the last byte of m.
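The predictable-IV attack from the sketch above can be simulated in a few lines of Python, with a memoized random function standing in for the block cipher E_k:

```python
import os

BLOCK = 16
_enc = {}

def E(b: bytes) -> bytes:            # memoized random function: stand-in for E_k
    if b not in _enc:
        _enc[b] = os.urandom(BLOCK)
    return _enc[b]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc1(iv: bytes, m: bytes):       # one-block CBC encryption with explicit IV
    return iv, E(xor(iv, m))

# Attacker's warm-up query: learns a1 = E_k(a0 XOR y) for a chosen y.
y = os.urandom(BLOCK)
a0, a1 = cbc1(os.urandom(BLOCK), y)

I = os.urandom(BLOCK)                # the next IV: predictable in pre-TLS1.1 CBC
m0 = xor(xor(I, y), a0)              # the crafted guess from the sketch

# If the victim's message is m0, its ciphertext block equals a1:
_, c1 = cbc1(I, m0)
assert c1 == a1                      # E_k(I XOR m0) = E_k(a0 XOR y) = a1
# ...while any other message yields a different block (w.h.p.):
assert cbc1(I, os.urandom(BLOCK))[1] != a1
```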

Exercise 2.50 (Robust combiner for PRG).
1. Given two candidate PRGs, say G1 and G2, design a robust combiner,
i.e., a ‘combined’ function G which is a secure PRG if either G1 or G2 is
a secure PRG.
2. In the design of the SSL protocol, there were two candidate PRGs, one
(say G1) based on the MD5 hash function and the other (say G2) based
on the SHA-1 hash function. The group decided to combine the two; a
simplified version of the combined PRG is G(s) = G2(s ++ G1(s)). Is this
a robust combiner, i.e., a secure PRG provided that either G1 or G2 is a
secure PRG?

Hint: Compare to Lemma 2.3. You may read on hash functions in chapter 4,
but the exercise does not require any knowledge of that; you should simply
consider the construction G(s) = G2(s ++ G1(s)) for arbitrary functions G1, G2.

Exercise 2.51 (Using PRG for independent keys). In Example 2.2, we saw
how to use a PRF to derive multiple pseudo-random keys from a single
pseudo-random key.
1. Show how to derive two pseudo-random keys, using a PRG, say from n
bits to 2n bits.
2. Show how to extend your design to derive four keys from the same PRG,
or any fixed number of pseudo-random keys.

Exercise 2.52. Let (E, D) be a block cipher which operates on 20-byte blocks;
suppose that each computation of E or D takes 10^{−6} seconds (one microsecond)
on given chips. Using (E, D), you are asked to implement a secure high-speed
encrypting/decrypting gateway. The gateway receives packets at a line speed of
10^8 bytes/second, but with a maximum of 10^4 bytes received at any given second.
The goal is to have minimal latency, using a minimal number of chips. Present
an appropriate design, and argue why it achieves minimal latency and why it
is secure.

Exercise 2.53. Consider the AES block cipher, with 256-bit keys and 128-bit
blocks, and two random one-block (128-bit) messages, m1 and m2, and two
random (256-bit) keys, k1 and k2. Calculate (or approximate/bound) the
probability that E_{k1}(m1) = E_{k2}(m2).
Exercise 2.54 (PRF→PRG). Present a simple and secure construction of a
PRG, given a secure PRF.
Exercise 2.55 (Independent PRGs). Often, a designer has one random or
pseudo-random ‘seed/key’ binary string k ∈ {0, 1}∗ , from which it needs to
generate two or more independently pseudorandom strings k0 , k1 ∈ {0, 1}∗ ;
i.e., each of these is pseudorandom, even if the other is given to the (PPT)
adversary. Let P RG be a pseudo-random generator, which on input of arbitrary
length of arbitrary length l bits, produces 4l pseudorandom output bits. For
each of the following designs, prove its security (if secure) or its insecurity (if
insecure).
1. For b ∈ {0, 1}, let k_b = PRG(b ++ k).
2. For b ∈ {0, 1}, let k_b = PRG(k)[(b · 2 · |k|) . . . ((2 + b) · |k| − 1)].
Solution:
1. Insecure, since it is possible for a secure PRG to ignore its first input
bit, i.e., PRG(0 ++ s) = PRG(1 ++ s), resulting in k_0 = PRG(0 ++ k) =
PRG(1 ++ k) = k_1. We skip the (simple) proof that such a PRG may
still be secure.
2. Secure, since each of the derived keys is a (non-overlapping) subset of
the output of the PRG.
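The secure variant can be illustrated in Python. The PRG here is an ad-hoc hash-counter stretch, used only so there is something concrete to split; the point is the disjoint, non-overlapping slices of the output:

```python
import hashlib

def prg(seed: bytes) -> bytes:
    """Toy PRG stretching |seed| bytes to 4|seed| bytes by hashing a counter.
    A stand-in for illustration only, not a proven PRG."""
    out = b""
    i = 0
    while len(out) < 4 * len(seed):
        out += hashlib.sha256(seed + bytes([i])).digest()
        i += 1
    return out[:4 * len(seed)]

k = b"0123456789abcdef"                  # the single shared seed/key
stream = prg(k)
# The secure design: k_0 and k_1 are disjoint, non-overlapping slices.
k0, k1 = stream[: 2 * len(k)], stream[2 * len(k):]
assert len(k0) == len(k1) == 2 * len(k)
```

Since the slices never overlap, giving an adversary one of the keys reveals nothing about the bits forming the other.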

Exercise 2.56 (Indistinguishability hides partial information). In this exercise
we give an example of the fact that a cryptosystem that ensures indistin-
guishability (IND-CPA) is guaranteed not to leak partial information about
the plaintext, including relationships between the plaintexts corresponding to
different ciphertexts. Let (E, D) be an encryption scheme which leaks some
information about the plaintexts; specifically, we assume that there exists an
efficient adversary A s.t. for two ciphertexts c1, c2 of E, holds A(c1, c2) = 1 if
and only if the plaintexts share a common prefix, e.g., c1 = E_k(ID ++ m1) and
c2 = E_k(ID ++ m2) (same prefix, ID). Show that this implies that (E, D) is
not IND-CPA secure. See the question illustrated in Fig. 2.34.
Solution: see sketch in Fig. 2.35.
Exercise 2.57 (Encrypted cloud storage). Consider a set P of n sensitive
(plaintext) records P = {p1 , . . . , pn } belonging to Alice, where n < 106 . Each
record pi is l > 64 bits long ((∀i)(pi ∈ {0, 1}l )). Alice has very limited mem-
ory, therefore, she wants to store an encrypted version of her records in an
insecure/untrusted cloud storage server S; denote these ciphertext records by
C = {c1 , . . . , cn }. Alice can later retrieve the ith record, by sending i to S, who
sends back ci , and then decrypting it back to pi .

Figure 2.34: Figure for Exercise 2.56 (to be done).

1. Alice uses some secure shared key encryption scheme (E, D), with l bit
keys, to encrypt the plaintext records into the ciphertext records. The goal
of this part is to allow Alice to encrypt and decrypt each record i using
a unique key ki , but maintain only a single ‘master’ key k, from which
it can easily compute ki for any desired record i. One motivation for
this is to allow Alice to give keys to specific record(s) ki to some other
users (Bob, Charlie,...), allowing decryption of only the corresponding
ciphertext ci , i.e., pi = Dki (ci ). Design how Alice can compute the key ki
for each record (i), using only the key k and a secure block cipher (PRP)
(F, F −1 ), with key and block sizes both l bits. Your design should be as
efficient and simple as possible. Note: do not design how Alice gives ki
to relevant users - e.g., she may do this manually; and do not design
(E, D).

Figure 2.35: Figure for solution of Exercise 2.56 (to be done).

Solution: ki =
2. Design now the encryption scheme to be used by Alice (and possibly by
other users to whom Alice gave keys ki ). You may use the block cipher
(F, F −1 ), but not other cryptographic functions. You may use different
encryption scheme (E i , Di ) for each record i. Ensure confidentiality of
the plaintext records from the cloud, from users (not given the key for
that record), and from eavesdroppers on the communication. Your design
should be as efficient as possible, in terms of the length of the ciphertext
(in bits), and in terms of number of applications of the secure block cipher
(PRP) (F, F −1 ) for each encryption and decryption operation. In this
part, assume that Alice stores P only once, i.e., never modifies records
pi . Your solution may include a new choice of ki , or simply use the same
as in the previous part.
Solution: ki = ,
Eki i (pi ) = ,
Dki i (ci ) = .
3. Repeat, when Alice may modify each record pi a few times (say, up to 15
times); let ni denote the number of modifications of pi . The solution should
allow Alice to give (only) her key k, and then Bob can decrypt all records,
using only the key k and the corresponding ciphertexts from the server.
Note: if your solution is the same as before, this may imply that your
solution to the previous part is not optimal.
Solution: ki = ,
Eki i (pi ) = ,
Dki i (ci ) = .
4. Design an efficient way for Alice to validate the integrity of records re-
trieved from the cloud server S. This may include storing additional

information Ai to help validate record i, and/or changes to the encryp-
tion/decryption scheme or keys as designed in previous parts. As in pre-
vious parts, your design should only use the block cipher (F, F −1 ).
Solution: ki = ,
Eki i (pi ) = ,
Dki i (ci ) = ,
Ai = .
5. Extend the keying scheme from the first part, to allow Alice to also
compute keys ki,j , for integers i, j ≥ 0 s.t. 1 ≤ i · 2^j + 1 and (i + 1) ·
2^j ≤ n, where ki,j would allow (efficient) decryption of ciphertext records
c_{i·2^j +1} , . . . , c_{(i+1)·2^j} . For example, k0,3 allows decryption of records c1 , . . . , c8 ,
and k3,2 allows decryption of records c13 , . . . , c16 . If necessary, you may
also change the encryption scheme (E^i , D^i ) for each record i.
Solution: ki,j = ,
Eki i (pi ) = ,
Dki i (pi ) = .

Exercise 2.58 (Modes vs. attack models.). For every mode of encryption we
learned (see Table 2.6):
1. Is this mode always secure against any of the attack models we discussed
(CTO, KPA, CPA, CCA)?
2. Assume this mode is secure against KPA. Is it then also secure against
CTO? CPA? CCA?
3. Assume this mode is secure against CPA. Is it then also secure against
CTO? KPA? CCA?
Justify your answers.
Exercise 2.59. Recall that WEP encryption is defined as: W EPk (m; IV ) =
[IV, RC4IV,k ⊕ (m ++ CRC(m))], where IV is a random 24-bit initialization
vector, and CRC is an error-detection code which is linear, i.e., CRC(m ⊕
m′) = CRC(m) ⊕ CRC(m′). Also recall that WEP supports a shared-key authen-
tication mode, where the access point sends a random challenge r, and the mobile
responds with W EPk (r; IV ). Finally, recall that many WEP implementations
use a 40-bit key.
1. Explain how an attacker may efficiently find the 40-bit WEP key, by
eavesdropping on the shared-key authentication messages between the mo-
bile and the access point.
2. Present a hypothetical scenario where WEP would have used a fixed value
of IV to respond to all shared-key authentication requests, say IV=0.
Show another attack, that also finds the key using the shared-key authenti-
cation mechanism, but requires less time per attack. Hint: the attack may
use (reasonable) precomputation process, as well as storage resources; and

the attacker may send a ‘spoofed’ challenge which the client believes was
sent by the access point.
3. Identify the attack models exploited in the two previous items: CTO,
KPA, CPA or CCA?
4. Suppose now that WEP is deployed with a long key (typically 104 bits).
Show another attack which will allow the attacker to decipher (at least
part) of the encrypted traffic.

Chapter 3

Authentication: Message
Authentication Code (MAC) and
Signature Schemes

Modern cryptography addresses different goals related to threats to information


and communication accessible to an attacker. In this chapter, we focus on
the goal of authentication of information and communication. Specifically,
we discuss cryptographic schemes and protocols to detect when an attacker
impersonates somebody else, or modifies information from another agent.
In most of this chapter, we discuss message authentication code (MAC)
schemes, which ensure that information was created by a known entity - with-
out any modification. People often expect that encryption will ensure this
property; we discuss the use of encryption for authentication, and show that
this is quite tricky, although possible if done correctly. We also discuss how to
combine MAC and encryption, to ensure both confidentiality and authenticity,
and to improve system security.
MAC schemes use the same key to generate the authentication tag, and
to validate the authentication tag. We also discuss, albeit briefly, signature
schemes that use a private key to authenticate (sign) messages, and a distinct
public key to validate signatures over messages.

3.1 Encryption for Authentication?


As we discussed in the previous chapter, encryption schemes ensure confiden-
tiality, i.e., an attacker observing an encrypted message (ciphertext) cannot
learn anything about the plaintext (except its length). Sometimes, people
expect encryption to be non-malleable; intuitively, a non-malleable encryption
scheme prevents the attacker from modifying the message in a ‘meaningful way’.
See definition and secure constructions of non-malleable encryption schemes
in [59]. However, be warned: achieving, and even defining, non-malleability is
not as easy as it may seem!

In fact, many ciphers are malleable; often, an attacker can easily modify a
known ciphertext c, to c′ ≠ c s.t. m′ = Dk (c′) ≠ m (and also m′ ≠ ERROR).
Furthermore, often the attacker can ensure useful relations between m′ and m.
An obvious example is when using the (unconditionally-secure) one-time-pad
(OTP), as well as using Output-Feedback (OFB) mode.

Example 3.1. Suppose an attacker, Mal, eavesdrops on ciphertext c sent from


Alice to her bank, where c is OTP-encryption of the known plaintext message
m =‘‘Transfer 10$ to Bob. From: Alice. Password: ‘IluvBob’.’’,
encoded in ASCII. Show how Mal can modify the message so the bank will trans-
fer money to his account rather than to Bob. How much money can Mal steal
by sending the message?
Explain why your solution also works when using OFB mode encryption -
or any PRG-based stream cipher.
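The attack behind this example is a few lines of code: since any XOR-based stream cipher computes c = pad ⊕ m, an attacker who knows m can flip exactly the bits that turn m into a forged m′, without knowing the pad. The following sketch uses a toy one-time pad (the variable names and forged message are ours, for illustration):

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Toy OTP; the same malleability applies to OFB mode and any PRG-based
# stream cipher, since all of them XOR a keystream into the plaintext.
m = b"Transfer 10$ to Bob. From: Alice. Password: 'IluvBob'."
pad = os.urandom(len(m))          # Alice's secret keystream
c = xor(pad, m)                   # ciphertext that Mal eavesdrops on

# Mal knows m, so c' = c XOR m XOR m' decrypts to m' -- no key needed.
m_forged = b"Transfer 99$ to Mal. From: Alice. Password: 'IluvBob'."
c_forged = xor(c, xor(m, m_forged))

# The bank decrypts c' with the same pad and obtains Mal's message.
decrypted = xor(pad, c_forged)
assert decrypted == m_forged
```

Note that Mal only needs to change bit positions where m and m′ differ; the rest of the ciphertext, including the encrypted password, is left untouched.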

We conclude that encryption schemes may not suffice to ensure authentica-


tion. This motivates us to introduce, in the next section, another symmetric-
key cryptographic scheme, which is designed explicitly to ensure authentication
and integrity: the Message Authentication Code (MAC).

3.2 Message Authentication Code (MAC) schemes


Message Authentication Code (MAC) schemes are simple, symmetric-key
cryptographic functions, designed to verify the authenticity and integrity of
information (messages). A MAC function M ACk (m) has two inputs: an
n-bit secret (symmetric) key k, and a message m. Note that MAC schemes are
deterministic, i.e., do not have access to random bits.
The output M ACk (m) is often referred to as the tag. MAC functions are
used to detect (unauthorized) messages or changes to messages. Intuitively,
given m, M ACk (m) for a secret, random key k, it is hard for a (computationally-
bounded) attacker to find another message m0 6= m together with the value of
M ACk (m0 ).
Typically, as shown in Fig. 3.1, a secret, symmetric MAC key k is shared
between two (or more) parties. Each party can use the key to authenticate a
message m, by computing an authentication tag M ACk (m). Given a message
m together with a previously-computed tag T , a party verifies the authenticity
of the message m by re-computing M ACk (m) and comparing it to the tag T ;
if equal, the message is valid, i.e., the tag must have been previously computed
by the same party or another party, using the same secret key k.
In a typical use, one party, say Alice, sends a message m to a peer, say Bob,
authenticating m by computing and attaching the tag T = M ACk (m). Bob
confirms that T = M ACk (m), thereby validating that Alice sent the message,
since he shares k only with Alice. See Fig. 3.1.
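This compute-and-verify flow can be sketched as follows, using HMAC-SHA256 from Python's standard library as a stand-in for MACk(·) (HMAC, discussed later in the book, is just one possible secure MAC; the key and message here are illustrative):

```python
import hmac, hashlib

def mac(k: bytes, m: bytes) -> bytes:
    # HMAC-SHA256 standing in for MAC_k(m); any secure MAC would do.
    return hmac.new(k, m, hashlib.sha256).digest()

def verify(k: bytes, m: bytes, tag: bytes) -> bool:
    # Recompute MAC_k(m) and compare to the received tag (constant-time).
    return hmac.compare_digest(mac(k, m), tag)

k = b"secret key shared by Alice and Bob only"
m = b"Pay 100$ to Charlie"
tag = mac(k, m)                      # Alice sends the pair (m, tag)

assert verify(k, m, tag)             # Bob accepts the authentic pair
assert not verify(k, m + b"0", tag)  # any modification of m is rejected
```

The constant-time comparison (`compare_digest`) avoids leaking, via timing, how many tag bytes matched.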

Sender identification Consider two - or more - parties that use the same
MAC key to send authenticated messages among them. By validating the tag

Figure 3.1: Message Authentication Code (MAC).

received, recipients know that the tag was computed by one of the key holders
- but not which key holder computed the MAC. Adding the identity of the
sender to the input to the MAC, in addition to the message itself, ensures
correct identification of the sender, if all the parties are trusted to add their
identity.

Repudiation/deniability To validate that a given tag T correctly validates


a message m, i.e., T = M ACk (m), requires the ability to compute M ACk (·),
i.e., knowledge of the secret key k. However, this implies the ability to compute
(valid) tags from any other message. This allows the entity that computed the
tag to later deny having done it, since it could have been computed also by
other entities. We later discuss digital signature schemes, which use a secret
key to compute the signature (tag), and a public key to validate it, which can
be used to prevent senders from denying/repudiating messages.

3.3 MAC and Signature Schemes: definitions


3.3.1 Definition of Message Authentication Code (MAC)
Scheme
A MAC scheme is a function F , with the following unforgeability property:
an attacker, which does not know the key k and is not given Fk (m) for any
given message m, is unable to find the value of Fk (m), with better chance than
a random guess. Similar to the definitions of chosen-plaintext attack and of
pseudo-random functions and permutations, we allow the adversary to obtain
the MAC values for any other message. The formal definition follows. For
concreteness, we will focus on MAC whose output is an l-bit binary string.
Definition 3.1 (MAC). An l-bit Message Authentication Code (MAC) over
domain D, is a function F : {0, 1}^* × D → {0, 1}^l , such that for all PPT
algorithms A, the advantage ε^{MAC}_{F,A}(n) is negligible in n, i.e., smaller than any
positive polynomial for sufficiently large n (as n → ∞), where:

ε^{MAC}_{F,A}(n) ≡ Pr_{k ←$ {0,1}^n} [ (m, Fk (m)) ← A^{Fk (·|except m)}(1^n) ] − 2^{−l}    (3.1)

Where the probability is taken over the random choice of an n-bit key, k ←$
{0, 1}^n , as well as over the coin tosses of A.

Oracle. The expression A^{Fk (·|except m)} refers to the output of the adversary
A, where during its run, the adversary can give arbitrary inputs x ≠ m and
receive the corresponding values of the function, Fk (x). We say that the
adversary A has an oracle to the MAC function Fk (·) (excluding the message
m). See Definition 2.7.

The advantage function ε^{MAC}_{F,A}(n) and key length n. The definition is
for an l-bit MAC, i.e., the output is always a binary string of length l. Hence, a
random guess at the MAC of any input message m would be correct with
probability 2^{−l}. Therefore, we defined the advantage ε^{MAC}_{F,A}(n) as the probability
that the adversary finds a correct MAC value for a message m (not input to
the oracle), minus the ‘base success probability’ of 2^{−l}. The function F is a
(secure) MAC, if this advantage ε^{MAC}_{F,A}(n) is negligible.
The key length is denoted n, and is not bounded. The ‘advantage’ of the
adversary over a random guess should be negligible in n, i.e., converge to zero as
n grows. In practice, MAC functions are used with a specific key length, which
is believed to be ‘long enough’ to foil attacks (by attackers with reasonable
resources and time).

Output length - fixed (l) or as key length (n). In some other definitions
of MAC schemes, the output length is also n, i.e., the same as the key. In this case,
the 2^{−l} term becomes 2^{−n} , which is negligible in n, and hence can be ignored.
(Readers are encouraged to prove this last statement, as an exercise.)

Input domain. Notice that the definition allows an arbitrary input do-
main D to the MAC function. The two most commonly used domains are
D = {0, 1}^* , i.e., the set of all binary strings (of unbounded length), and
D = {0, 1}^{l_in} , i.e., the set of all binary strings of some fixed length l_in. Of
course, l_in may also be the same as l. A MAC function whose input is the set
of binary strings of fixed length is called a FIL-MAC, i.e., Fixed Input Length
MAC. In contrast, a MAC function whose input is the set of all binary strings
is called a VIL-MAC, i.e., Variable Input Length MAC.

3.3.2 Signature Schemes


A digital signature scheme consists of three algorithms, (KG, S, V ), for Key
Generation, Signing and Verifying, respectively, and of a limited set M of

messages (which we can sign by applying S). The three algorithms and their
basic operations are illustrated in Figure 3.2.

Figure 3.2: Digital signature scheme (KG, S, V ). The Key-Generation algo-
rithm KG receives as input the security parameter 1^n , and outputs a (pri-
vate, public) key pair (s, v) (e.g., of length n). The signing algorithm S uses the
private signature key s to compute the signature σ = Ss (m), given plaintext
message m. The validation algorithm V uses the public validation key v to
validate the plaintext m; validation returns T (true) if and only if σ = Ss (m).
In the figure, we denote the keys as (A.s, A.v) (instead of just (s, v)), to emphasize that
they are associated with Alice.

Key Generation (KG) is a randomized algorithm whose input is the security
parameter 1^n , and whose output is a pair of correlated keys (s, v) ←$ KG(1^n );
the security parameter is specified in unary (as a string of n bits with the value
1). It is often convenient to use dot notation to associate the (public, private)
key pair with their ‘owner’, e.g., (A.s, A.v) for Alice, (B.s, B.v) for Bob.
The key s is called the private signing key, and is used to sign messages
m ∈ M , using the signing algorithm S; we use σ = Ss (m) to denote that σ
is the signature of message m using signing key s. The key v is called the
public validation key, and is used for validating the correlation between a given
message m and a given signature σ; namely, Vv (m, σ) is true if and only if σ
is the result of signing m with the private signing key s corresponding to v. For
simplicity, we will require the signing and verifying algorithms (S, V , resp.)
to be deterministic; extending to allow randomized algorithms is not difficult -
but a bit hairy.
We say that the signature scheme (KG, S, V ) is correct, if for every security
parameter 1^n holds:

(∀ (s, v) ←$ KG(1^n ), m ∈ M )  Vv (m, Ss (m))    (3.2)
Digital signatures provide message authentication, like a Message Authen-
tication Code (MAC) schemes. There is, however, a critical difference:

• MAC schemes use the same secret key k for authenticating a message m, by com-
puting the authenticator M ACk (m), and for verifying the match between
a given pair of message m and authenticator a.

• Signature schemes use a (private, public) key pair (s, v), where the private sign-
ing key s is required for signing, and the public verification key v suffices
for verification. Knowledge of the public verification key v should not
help in signing.

This separation between the signing functionality and the verification func-
tionality is very useful, and is the reason that we use the name signature
schemes; they correspond to the property associated with handwritten sig-
natures. Bob may use Alice’s public verification key A.v to recognize that
σ = SA.s (m) is Alice’s signature on m, but that would not allow Bob to forge
Alice’s signature on other messages. Following the conservative design prin-
ciple (Principle 7), Bob should not be able to forge Alice’s signature on any
message not signed by Alice; this is referred to as existential unforgeability.
The definition is quite similar to that of MAC functions, except for providing
the adversary with the public verification key v.

Definition 3.2 (Signature scheme). Let S = (KG, S, V ) be a signature scheme.


We say that S is an existentially unforgeable signature scheme over domain D, if
for all PPT algorithms A, the advantage ε^{eu-Sign}_{S,A}(n) is negligible in n, i.e.,
smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

ε^{eu-Sign}_{S,A}(n) ≡ Pr_{(s,v) ←$ KG(1^n)} [ (m, σ) ← A^{Ss (·)}(1^n ); Vv (m, σ) ∧ A didn’t request Ss (m) ]    (3.3)

Where the probability is taken over the random coin tosses of A and KG, and
the message m ∈ D is not one of the messages for which A used the oracle
Ss (·|except m) to sign.

Domain, efficiency and the Hash-then-Sign construction. Notice that


signature schemes, like MAC, are defined with respect to some input domain
D. In practice, messages are usually much longer; and ‘breaking’ them into
many short messages to apply the signature, would be extremely inefficient,
as signature computation and validation are computationally-intensive
operations. As we explain in subsection 4.2.6, the standard solution is to apply
a hash function h(·) to the ‘long’ message, and sign the (short) output h(m);
this is called the Hash-then-Sign (HtS) construction.¹ Namely, given a signature
scheme S defined for domain {0, 1}^n and a hash function h with domain {0, 1}^*
¹ We focus here on a keyless hash h(m); see subsection 4.2.6 for discussion of HtS using
a keyed hash hk (m).

and range {0, 1}^n (i.e., h : {0, 1}^* → {0, 1}^n ), we define the HtS scheme S^h_HtS
as follows:

S^h_HtS.KG(1^n ) ≡ S.KG(1^n )    (3.4)
S^h_HtS.Ss (m) ≡ S.Ss (h(m))    (3.5)
S^h_HtS.Vv (m, σ) ≡ S.Vv (h(m), σ)    (3.6)

The HtS scheme S^h_HtS may be applied to any binary string, i.e., its domain
is {0, 1}^* . The reader may confirm that it is a correct signature scheme over
{0, 1}^* (Exercise 3.17). Theorem 4.1 shows that if h : {0, 1}^* → {0, 1}^n is a
collision-resistant hash function (CRHF) h(·), as in Definition 4.1, and S is an
existentially unforgeable signature scheme over {0, 1}^n , then the HtS scheme
S^h_HtS is an existentially unforgeable signature scheme over {0, 1}^* , i.e., applicable
to arbitrary-length binary strings. Of course, the HtS method may fail if using
an insecure hash function h; see Exercise 3.18.
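The structure of equations (3.4)-(3.6) can be illustrated with a toy example. Below, a deliberately insecure 'textbook RSA' with tiny parameters stands in for the underlying scheme S, and SHA-256 (reduced modulo n, for illustration only) stands in for h; all names and parameters are ours, purely to show how HtS composes h with S:

```python
import hashlib

# Toy 'textbook RSA' signature scheme S over the small domain Z_n.
# WARNING: illustrative only -- these parameters are far too small to be secure.
p, q = 61, 53
n = p * q                 # modulus n = 3233
e = 17                    # public verification key v
d = 413                   # private signing key s; e*d = 1 mod lcm(p-1, q-1)

def S_sign(s: int, x: int) -> int:                 # S.Ss(x), for x in Z_n
    return pow(x, s, n)

def S_verify(v: int, x: int, sig: int) -> bool:    # S.Vv(x, sig)
    return pow(sig, v, n) == x % n

def h(m: bytes) -> int:
    # Maps arbitrary strings into the scheme's small domain. A real HtS
    # deployment would use a CRHF into {0,1}^n, not a reduction mod n.
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n

def hts_sign(s: int, m: bytes) -> int:             # S^h_HtS.Ss(m) = S.Ss(h(m))
    return S_sign(s, h(m))

def hts_verify(v: int, m: bytes, sig: int) -> bool:  # S^h_HtS.Vv(m, sig) = S.Vv(h(m), sig)
    return S_verify(v, h(m), sig)

msg = b"an arbitrarily long message ... " * 10
sig = hts_sign(d, msg)
assert hts_verify(e, msg, sig)    # correct signatures validate (Exercise 3.17)
```

Note that only the short digest h(m) is ever passed to the expensive signing operation, which is exactly the efficiency motivation given above.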

3.4 Applying MAC and Signatures Schemes


3.4.1 Applying MAC functions
A MAC function is a simple cryptographic mechanism, which is quite easy
to use; however, it should be applied correctly - with an understanding of
its properties and without expecting it to provide other properties. We now
discuss a few aspects of the usage of MAC schemes, and give a few examples
of common mistakes.

Confidentiality A MAC function is a great tool to ensure integrity and au-


thenticity; however, MAC does not ensure confidentiality. Namely, M ACk (m)
may expose information about the message m. This is sometimes overlooked
by system designers; for example, early versions of the SSH protocol used the
so-called ‘Encrypt and Authenticate’ method, where to protect message m,
the system sent Ek (m) + + M ACk (m); one problem with this design is that
M ACk (m) may expose information about m.
Notice that while obviously confidentiality is not a goal of MAC schemes,
one may hope that it is derived from the authentication property. To refute such
false hopes, it is best to construct a counterexample - a very useful technique
to prove that claims about cryptographic schemes are incorrect. The counter-
examples are often very simple - and often involve ‘stupid’ or ‘strange’ designs,
which are especially designed to meet the requirements of the cryptographic
definitions - while demonstrating the falseness of the false assumptions. Here
is an example showing that MAC schemes may expose the message.

Example 3.2 (MAC does not ensure confidentiality.). To show that MAC may
not ensure confidentiality, we construct a Non-confidential MAC function
F^{NM} (where NM stands for ‘Non-confidential MAC’). Our construction uses
an arbitrary secure MAC scheme F (which may or may not ensure confiden-
tiality). Specifically:

F^{NM}_k(m) = F_k(m) ++ LSb(m)

where LSb(m) is the least-significant bit of m. Surely, F^{NM} does not ensure
confidentiality, since it exposes a bit of the message (we could have obviously
exposed more bits - even all bits!).
On the other hand, we now show that F^{NM} is a secure MAC. Assume,
to the contrary, that there is some adversary A^{NM} that succeeds (with signifi-
cant probability) against F^{NM}. We use A^{NM} to construct an attacker A that
succeeds with the same probability against F . Attacker A works as follows:

1. When A^{NM} makes a query q to F^{NM}, then A makes the same query to
F , receiving F_k(q); it then returns F^{NM}_k(q) = F_k(q) ++ LSb(q), as expected
by A^{NM}.

2. When A^{NM} outputs its guess (m, x), where x is its guess for F^{NM}_k(m) =
F_k(m) ++ LSb(m), and m was not used in any of A^{NM}’s queries, then A
outputs (m, x′), where x′ is x without its least-significant bit; namely, if
x = F^{NM}_k(m) = F_k(m) ++ LSb(m), then x′ = F_k(m).

It follows that F^{NM} is a secure MAC if and only if F is a secure MAC.
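The counterexample is easy to sketch in code, instantiating the arbitrary secure MAC F with HMAC-SHA256 (any secure MAC would do; the key and 'vote' messages are illustrative):

```python
import hmac, hashlib

def F(k: bytes, m: bytes) -> bytes:
    # An arbitrary secure MAC; HMAC-SHA256 stands in for F_k(m).
    return hmac.new(k, m, hashlib.sha256).digest()

def F_NM(k: bytes, m: bytes) -> bytes:
    # F^NM_k(m) = F_k(m) ++ LSb(m): still unforgeable, but it leaks a
    # plaintext bit -- so MAC security does not imply confidentiality.
    lsb = m[-1] & 1
    return F(k, m) + bytes([lsb])

k = b"some secret key"
tag0 = F_NM(k, b"vote: 0")   # '0' has even ASCII code 48, so LSb = 0
tag1 = F_NM(k, b"vote: 1")   # '1' has odd ASCII code 49, so LSb = 1
assert tag0[-1] == 0 and tag1[-1] == 1   # the tag alone reveals the last bit
```

An eavesdropper who sees only the tag can thus distinguish the two votes, even though forging a tag for a new message remains as hard as against F itself.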

We show, later on, that every PRF is a MAC. The following exercise shows
that the reverse is not true: a MAC is not necessarily a PRF. This exercise is
similar to the example above.

Exercise 3.1 (Non-PRF MAC). Show that a MAC function is not necessarily
a Pseudo-Random Function (PRF).

Solution outline: Let F be an arbitrary secure MAC scheme that outputs
n-bit tags. Construct a MAC scheme F′, which outputs 2n-bit tags, as follows:

F′_k(m) = F_k(m) ++ 0^n

Clearly, F′ is not a PRF, because an adversary has a significant chance of distinguishing
between an output of F′ and a random 2n-bit string (since the second half of
the output of F′ is all zeros). Yet, you can show that F′ is a secure MAC if and
only if F is a secure MAC, using a similar method to the one in Example 3.2.
Therefore, a MAC function is not necessarily a PRF.
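The distinguisher implicit in this argument can be sketched as follows (again instantiating F with HMAC-SHA256 purely for illustration):

```python
import hmac, hashlib, os

def F(k: bytes, m: bytes) -> bytes:
    return hmac.new(k, m, hashlib.sha256).digest()   # 32-byte secure MAC

def F_prime(k: bytes, m: bytes) -> bytes:
    # F'_k(m) = F_k(m) ++ 0^n: still a secure MAC, but trivially not a PRF.
    return F(k, m) + bytes(32)

def distinguisher(oracle) -> bool:
    # Outputs True ('looks like F'') iff the trailing half is all zeros;
    # correct with probability 1 on F', and ~2^-256 on a random function.
    return oracle(b"any query")[32:] == bytes(32)

k = os.urandom(32)
assert distinguisher(lambda m: F_prime(k, m))        # always detects F'
assert not distinguisher(lambda m: os.urandom(64))   # random output has no zero half
```

A single query thus suffices to break the PRF property, while the MAC property is untouched.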

Key separation Another problem with the SSH ‘Encrypt and Authenticate’
design, Ek (m) ++ M ACk (m), is the fact that the same key is used for both
encryption and MAC. This can cause further vulnerability; an example is shown
in the following simple exercise.

Exercise 3.2. Show that the use of the same key for encryption and MAC in
Ek (m) ++ M ACk (m) can allow an attacker to succeed in forgery of messages -
in addition to the potential loss of confidentiality shown above - even when E
and M AC are secure (encryption and MAC, respectively).

Solution outline: Let E′, MAC′ be secure encryption and MAC schemes,
respectively. Define E_{kE,kM}(m) = E′_{kE}(m) ++ kM and MAC_{kE,kM}(m) = kE ++
MAC′_{kM}(m). Obviously, the use of E_{kE,kM}(m) ++ MAC_{kE,kM}(m) exposes both
keys and is therefore insecure. However, using the method of Example 3.2,
you can show that E and MAC are also secure encryption and MAC schemes,
respectively. See also Exercise 3.14.
This is a good example of the principle of key separation.

Principle 10 (Key Separation). Keys used for different purposes and crypto-
graphic schemes should be independent from each other - ideally, each chosen
randomly; if necessary, pseudo-random derivation is also OK.
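One simple way to follow this principle is to derive purpose-specific subkeys from a single master key using a PRF with distinct labels. A minimal sketch, with HMAC-SHA256 standing in for the PRF and label strings chosen by us for illustration:

```python
import hmac, hashlib

def prf(k: bytes, label: bytes) -> bytes:
    # HMAC-SHA256 used as a PRF to derive purpose-specific subkeys.
    return hmac.new(k, label, hashlib.sha256).digest()

master = b"one randomly chosen master key"
k_enc = prf(master, b"encryption")       # used only for encryption
k_mac = prf(master, b"authentication")   # used only for MAC

# Distinct labels yield (pseudorandomly) independent keys, per Principle 10.
assert k_enc != k_mac
```

Since the PRF outputs are indistinguishable from independent random strings, exposing one subkey (or misusing it in one scheme) does not endanger the other.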

Freshness and sender authentication A valid MAC received with a mes-


sage shows that the message was properly authenticated by an entity holding
the secret key. In many applications involving authentication, it is necessary to
ensure further properties. We already commented above, that MAC does not
ensure sender authentication, unless the design ensures that only the specific
sender will compute the MAC using the specific key over the given message.
This is usually ensured by including the sender identity as part of the payload
being signed, although another way to ensure this is for each sender to use its
own authentication key. Of course, this does not prevent one entity holding a
shared key, from impersonating as another entity using the same key.
Another important property is freshness, namely, ensuring that the message
was not already handled previously. Again, to ensure freshness, the sender
should include appropriate indication in the message, or use a different key.
This is usually achieved by including, in the message, a timestamp, a counter
or a random number (‘nonce’) selected by the party validating freshness. Each
of these options has its drawbacks: the need for synchronized clocks, the need
to keep a state, or the need for the sender to receive the nonce from the recipient
(additional interaction).
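Counter-based freshness, combined with sender identification, might be sketched as follows (the message format, separator and helper names are illustrative, not a standard):

```python
import hmac, hashlib

def mac(k: bytes, m: bytes) -> bytes:
    return hmac.new(k, m, hashlib.sha256).digest()

def send(k: bytes, sender: bytes, ctr: int, m: bytes):
    # Authenticate the sender identity and a monotonic counter along with m.
    payload = sender + b"|" + str(ctr).encode() + b"|" + m
    return ctr, m, mac(k, payload)

def receive(k: bytes, sender: bytes, last_ctr: int,
            ctr: int, m: bytes, tag: bytes) -> bool:
    # Accept only fresh messages (ctr above the last seen) with a valid tag.
    payload = sender + b"|" + str(ctr).encode() + b"|" + m
    return ctr > last_ctr and hmac.compare_digest(mac(k, payload), tag)

k = b"shared key"
ctr, m, tag = send(k, b"Alice", 7, b"hello")
assert receive(k, b"Alice", 6, ctr, m, tag)        # fresh: accepted
assert not receive(k, b"Alice", 7, ctr, m, tag)    # replay: rejected
assert not receive(k, b"Bob", 6, ctr, m, tag)      # wrong sender: rejected
```

The recipient must keep state (the last counter seen per sender), which is exactly the drawback of counters mentioned above.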

3.4.2 Applications of Signature Schemes


Signature schemes have two critical properties beyond MAC schemes; these
two properties make signatures a critical tool of applied cryptography. First,
signatures facilitate non-repudiation, i.e., can provide evidence. Second, sig-
natures are necessary for certificates, an essential element of the Public Key
Infrastructure (chapter 8). Let us discuss these two properties and related
applications.

Signatures facilitate non-repudiation and evidences (‘digital signa-
tures’). The use of the private signing key is required to digitally sign a
message, but validation of a signature only requires the corresponding public
verification key. This allows a recipient of a signed message to know that once
she validated a signature, using the verification key v, she would be able to
‘convince’ other parties that the message was, in fact, signed by the use of the
private key corresponding to v; and these parties also only need to know v. We
refer to this property as non-repudiation, since the owner of the private key
cannot claim that the ability to verify messages allowed another party to forge
a signature, i.e., compute a seemingly-valid signature of a message, without
access to the private signing key.
Note that non-repudiation does not hold for (shared-key) MAC schemes,
where one computes the MAC, using the key k, both to verify a given MAC
authenticator, and to compute the MAC authenticator in the first place.
The non-repudiation property allows a digitally-signed document to provide
an evidence for the agreement of the signer, much like the classical use of hand-
written signatures. Indeed, the use of digital signatures to prove agreement,
has significant advantages compared to the use of hand-written signatures:
Security. Handwritten signatures are prone to forging of the signature it-
self, as well as to modification of the signed document. If the signature
scheme is secure (i.e., existentially unforgeable, see Definition 3.2), then
production of a valid signature over a document m practically requires
the application of the private signing key to sign exactly m.
Convenience. Digital signatures can be sent over a network easily, and their
verification only requires running of an algorithm. Admittedly, signature
verification does involve some non-negligible overhead, but is much easier
than the manual process and expertise required to confirm handwritten
signatures. Later on, digital signatures may be easily archived, backed-up
and so on.
Non-repudiation is essential for many important applications, such as sign-
ing an agreement or a payment order, or for validation of recommendations
and reviews; they are also applied extensively in different cryptographic sys-
tems and protocols. One especially important application is their use to im-
plement a public key certificate, linking between an entity and its public key,
which is central to the public key infrastructure (PKI). We discuss PKI mainly
in chapter 8.

Legal interpretation of signatures and digitized handwritten signature. Digital signatures are covered by legislation in some jurisdictions;
however, their legal definition and implications vary significantly between juris-
dictions, and often differ considerably from what you may expect based on
the cryptographic definitions and properties. For example, many web services
use the term ‘digital signature’ to refer to agreement by a user in a web form,
sometimes accompanied by a visual representation of a handwritten signature.

Other systems and organizations, consider as a ‘digital signature’ the scanned
or scribbled version of a person’s signature, which we refer to as digitized hand-
written signatures.
These interpretations are related to handwritten signatures rather than to
cryptographic signatures. In particular, since digitized handwritten signatures
are merely digitally-represented images, they definitely cannot prevent an at-
tacker from modifying the ‘signed’ document in arbitrary way, or even reusing
the signature to ‘sign’ a completely unrelated document. From the security
point of view, these digitized handwritten signatures are quite insecure - not
only compared to cryptographic signatures, but even compared to ‘real’ hand-
written signatures, since ‘real’ handwritten signatures may be verified with
some precision by careful inspection (often by experts).

Signatures facilitate public key infrastructure (PKI) and certificates.


Most applied cryptographic systems involve public key cryptosystems (PKCs),
e.g. RSA, and key-exchange protocols, e.g. the Diffie-Hellman (DH) protocol,
both presented in chapter 6. In particular, PKCs and key-exchange are cen-
tral to the TLS/SSL protocol (chapter 7), which is probably the most widely-
used and important cryptographic protocol, and the main cryptographic web-
security mechanism. However, all of these depend on the use of authentic public
keys for remote entities, using only public information (keys). This still leaves
the question of establishing the authenticity of the public information (keys).
If the adversary is limited in its abilities to interfere with the communication
between the parties, then it may be trivial to ensure the authenticity of the
information received from the peer. In particular, many works assume that the
adversary is passive, i.e., can only eavesdrop to messages; this is also the basic
model for the DH key exchange protocol. In this case, it suffices to simply send
the public key (or other public value).
Some designs assume that the adversary is inactive or passive during the
initial exchange, and use this exchange to establish information, such as keys, between the
two parties. This is called the trust on first use (TOFU) adversary model.
In a few scenarios, the attacker may inject fake messages, but cannot eaves-
drop on messages sent between the parties; in this case, parties may easily
authenticate a message from a peer, by previously sending a challenge to the
peer, which the peer includes in the message. We refer to this as an off-path
adversary, and study attacks and defenses for this attack model in Volume 2,
focusing on network security [93].
However, all these methods fail against the stronger Monster-in-the-Middle
(MitM) adversary, who can modify and inject messages as well as eavesdrop
on messages. Furthermore, there are many scenarios where attackers may
obtain MitM capabilities; and even when this seems harder to believe, it is
always better to ensure security against such powerful attackers, following the
conservative design principle (Principle 7). To ensure security against a MitM
attacker, we must use strong, cryptographic authentication mechanisms. A
message authentication code (MAC) requires the parties to share a secret key
in advance; if that’s the case, the parties could use this shared key to establish
secure communication directly.
Signature schemes provide a solution to this dilemma. Namely, a party
receiving signed information from a remote peer, can validate that informa-
tion, using only the public signature-validation key of the signer. Furthermore,
signatures also allow the party performing the signature-validation, to first
validate the public signature-validation key, even when it is delivered by an in-
secure channel which is subject to a MitM attack, such as email. This solution
is called public key certificates.

Figure 3.3: Public key certificate issuing and usage processes.

As illustrated in Fig. 3.3, a public key certificate is a signature by an entity
called the issuer or certificate authority (CA), over the public key of the subject,
e.g., Alice. In addition to the public key of the subject, subject.v, the signed
information in the certificate contains attributes such as the validity period,
and, usually, an identifier and/or name for the subject (Alice).
Once Alice receives her signed certificate Cert, she can deliver it to the
relying party (e.g., Bob), possibly via insecure channels such as email or the
Internet Protocol (IP). This allows the relying party (Bob) to use Alice’s public
key, i.e., rely on it, e.g., to validate Alice’s signature over a message m, as
shown in Fig. 3.3. Note that this requires Bob to trust this CA and to have its
validation key, CA.v.
This discussion of certificates is very basic; more details are provided in
chapter 8, which discusses public-key infrastructure (PKI), and in chapter 7,
which discusses the important TLS/SSL protocol.
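The issuing and usage flow of Fig. 3.3 can be sketched in a few lines of Python. Everything below is illustrative: the helper names (toy_rsa_keypair, sign, verify) are ours, and the 'textbook RSA' they implement — tiny hard-coded primes, no padding — is wholly insecure; the sketch only demonstrates the data flow between the CA, the subject (Alice), and the relying party (Bob).

```python
import hashlib

def toy_rsa_keypair(p, q, e=65537):
    # Textbook RSA with tiny primes: for illustrating data flow ONLY.
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)        # modular inverse (Python 3.8+)
    return (n, d), (n, e)      # (signing key, validation key)

def sign(sk, msg: bytes) -> int:
    n, d = sk
    return pow(int.from_bytes(hashlib.sha256(msg).digest(), 'big') % n, d, n)

def verify(vk, msg: bytes, sig: int) -> bool:
    n, e = vk
    return pow(sig, e, n) == int.from_bytes(hashlib.sha256(msg).digest(), 'big') % n

# Issuing: the CA signs the subject's identifier and public key (in practice,
# also a validity period and other attributes).
ca_sk, ca_vk = toy_rsa_keypair(1009, 1013)
alice_sk, alice_vk = toy_rsa_keypair(1019, 1021)
cert_body = f"subject=Alice;key={alice_vk}".encode()
cert_sig = sign(ca_sk, cert_body)

# Usage: Bob, who trusts the CA and holds CA.v, validates the certificate,
# extracts Alice's key from it, and then validates Alice's signature over m.
assert verify(ca_vk, cert_body, cert_sig)
m = b"pay 10 to Bob"
assert verify(alice_vk, m, sign(alice_sk, m))
```

Note that the certificate itself travels over an insecure channel; only the CA validation key CA.v must be obtained authentically in advance.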

3.5 Constructing MAC, part I: constructions from PRF


In this and the next section, we discuss constructions of a MAC function. In
this section, we focus on construction of a MAC from a PRF, and mainly on
the CBC-MAC construction.

We note that, based on the PRP/PRF switching lemma (Lemma 2.4), we
could use a block cipher instead of the PRF in these constructions, since a
block cipher is indistinguishable from a PRF. Since a PRF is often not in-
cluded in cryptographic libraries, it may be tempting to use instead, in these
constructions, a block cipher, which is part of most cryptographic libraries. How-
ever, recall that this is not advisable, since the use of a block cipher instead of a PRF
involves a loss in security. Instead, use one of the efficient, simple constructions
of a PRF from a block cipher, which avoid the loss of security, e.g., [18, 87].
The section contains three subsections. In the first, we observe that given
a PRF, we can actually use it directly as a MAC, i.e., every PRF is also a
MAC. There is a caveat: the input domain of the MAC is the same as that of
the PRF, which, in turn, is the same as that of the underlying block cipher (if the
PRF is implemented from a block cipher as explained above). Namely, if we
use n-bit blocks, i.e., the domain of the block cipher (and PRF) is {0, 1}^n, then
the MAC function also applies only to n-bit messages. However, typical messages
are longer.
The second subsection presents the CBC-MAC construction, which con-
structs an l · n-bit PRF from an n-bit PRF, for a given constant number of
blocks l. This allows efficient and secure use of an n-bit-input PRF (or block
cipher) to authenticate longer, l · n-bit messages.
Finally, in the third subsection we discuss extensions that allow a MAC for
messages of arbitrary length.

3.5.1 Every PRF is a MAC


In this subsection, we take the first step toward the CBC-MAC construction.
This step is the observation that every PRF whose range is {0, 1}^l is also an
l-bit MAC, with the same input and output domains. This is formalized in the
following lemma, which we call the PRF-is-MAC lemma.

Lemma 3.1 (PRF-is-MAC lemma). Let F be a PRF from input domain D
to the range {0, 1}^l. Then F is also an l-bit MAC, with input domain D and
output domain {0, 1}^l.

Proof: Assume that F is not a MAC (for the same domain D and range {0, 1}^l).
Namely, assume that there exists some adversary A_MAC s.t. ε^{MAC}_{A_MAC,F}(n) is
non-negligible in n (as defined in Equation 3.1). We use A_MAC to construct
another adversary, A_PRF, s.t. ε^{PRF}_{A_PRF,F}(n) is non-negligible in n (as defined in
Equation 2.25); this shows that F is (also) not a PRF, which proves the claim.
Let us now define A_PRF. First, recall that in Equation 2.25, adversary
A_PRF is given an oracle either to a random function f ←$ {D → {0, 1}^l}, or
to the pseudo-random function F_k : D → {0, 1}^l for some random n-bit key
k ←$ {0, 1}^n. Adversary A_PRF runs A_MAC, letting it use the same oracle.
Namely, whenever A_MAC asks its oracle with input x ∈ D, adversary A_PRF
calls its oracle with the same input x; and when it receives a result ξ, it returns
that result to A_MAC.

When A_MAC terminates, it should return some pair, which we denote by
(m, σ). Upon receiving (m, σ), adversary A_PRF provides m as input to its
oracle; denote the output by σ'. If σ ≠ σ', then A_PRF returns 'Rand'; other-
wise, i.e., if σ = σ', then A_PRF returns 'Pseudo'. Essentially, A_PRF outputs
'Rand' (i.e., guesses it was given a random function) when A_MAC fails to
correctly predict the output of the oracle for the input m.
Let us consider what happens if A_PRF is given an oracle to a random
function f ←$ {D → {0, 1}^l}. In this case, when running A_MAC, the values returned
from the oracle were for that random function f; clearly, A_MAC cannot be
expected to perform as well as when given an oracle to the function F_k(·). In
fact, A_MAC has to return a pair (m, σ) without giving input m to the oracle.
But if the oracle is to a random function f, then f(m) is chosen independently
of f(x) for any other input x ≠ m; learning other outputs cannot help you guess
the output when the input is m! Hence, the probability of a match is (only)
2^{-l}, the probability of a random match between two random l-bit strings.
Namely, Pr_{f ←$ {D → {0,1}^l}} [ A_PRF^f(1^n) = 'Pseudo' ] = 2^{-l}.
Now consider what happens if A_PRF is given an oracle to a pseudo-random
function F_k(·). The claim is that F is also a MAC, but we assumed, to the
contrary, that it is not; so A_MAC is able to return a pair (m, F_k(m)) with
probability significantly larger than 2^{-l}. In these cases, A_PRF will return
'Pseudo'. Namely, Pr_{k ←$ {0,1}^n} [ A_PRF^{F_k}(1^n) = 'Pseudo' ] = 2^{-l} + p(n),
where p(n) is a significant (non-negligible) function.
It follows that ε^{PRF}_{A_PRF,F}(n) is non-negligible, and hence, F is not a PRF.

3.5.2 CBC-MAC: l·n-bit MAC (and PRF) from n-bit PRF


Lemma 3.1 shows that every n-bit PRF is also an n-bit MAC. But how can
we deal with longer messages? In this subsection, we present the CBC-MAC
construction, which produces an l · n-bit PRF, using a given n-bit PRF. Since
every PRF is a MAC, this gives also an l · n-bit MAC. The CBC-MAC con-
struction is a standard from 1989 [1], i.e., prior to the PRF-is-MAC lemma
(from [17]), which is why it refers to construction of MAC (from block cipher)
and not to construction of l · n-bit PRF from n-bit PRF.
Before we present the CBC-MAC construction, let us discuss some insecure
constructions. First, consider performing MAC to each block independently,
similar to the ECB-mode (§ 2.9). One drawback is that this would result in
a long MAC. An even worse drawback is that this is insecure; an attacker
may obtain a MAC for a different message, which contains re-ordered and/or
duplicated blocks.
Next, consider adding a counter to the input, to which we refer as CTR-
MAC. This prevents the trivial attack - but not simple variants, as shown in
the following exercise. For simplicity, the exercise is given for l = 2. Of course,
this design also has the disadvantage of a longer output tag.

Exercise 3.3 (CTR-MAC is insecure). Let E be a secure (n + 1)-bit block
cipher, and define the following 2n-bit domain function: F_k(m_0 ++ m_1) =
E_k(0 ++ m_0) ++ E_k(1 ++ m_1) (CTR-MAC). Present a counterexample showing
that F is not a secure 2n-bit MAC.

Solution: See chapter 10.


Finally, we present the CBC-MAC construction, also known as the CBC-
MAC mode. This is a widely used, standard construction of an (l·n)-bit MAC
from an n-bit block cipher. The CBC-MAC mode, illustrated in Fig. 3.4, is a
variant of the CBC mode used for encryption, see § 2.9. Given a block cipher
E, we define CBC-MAC^E as in Eq. 3.7, for an l-block input message (i.e.,
of length l·n bits), m = m_1 ++ m_2 ++ ... ++ m_l:

CBC-MAC^E_k(m) = { c_0 ← 0^n; for i = 1...l: c_i ← E_k(m_i ⊕ c_{i-1}); output c_l }   (3.7)

See Fig. 3.4. When E is obvious, we may simply write CBC-MAC_k(·).

Figure 3.4: CBC-MAC: construction of an l·n-bit PRF (and MAC) from an n-bit
PRF. The message blocks m_1, m_2, ..., m_l are chained: each block is XORed with
the previous output and fed through E_k, starting from c_0 = 0^n; the final output
c_l is the tag.
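Eq. (3.7) translates almost line-for-line into code. Since standard libraries rarely expose a PRF directly, the sketch below builds a stand-in PRF from HMAC-SHA256 truncated to one block; this is an illustrative choice of ours, not a standardized construction, and a real deployment would use one of the block-cipher-based PRF constructions mentioned above.

```python
import hmac
import hashlib

BLOCK = 16  # block size in bytes (n = 128 bits)

def prf(key: bytes, block: bytes) -> bytes:
    # Stand-in for the n-bit PRF E_k: HMAC-SHA256, truncated to one block.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    # Eq. (3.7): c_0 = 0^n; c_i = E_k(m_i XOR c_{i-1}); output c_l.
    assert msg and len(msg) % BLOCK == 0, "input must be l full blocks"
    c = bytes(BLOCK)                      # c_0 = 0^n
    for i in range(0, len(msg), BLOCK):
        block = msg[i:i + BLOCK]
        c = prf(key, bytes(x ^ y for x, y in zip(block, c)))
    return c                              # the tag is the last chaining value

tag = cbc_mac(b"k" * 16, b"two blocks go in this message...")  # 32 bytes = 2 blocks
```

Note that only the last chaining value is output; outputting the intermediate values c_i (as CBC-mode encryption does) would break the MAC.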

There are other constructions of secure MAC from PRFs (and block ci-
phers), including more efficient constructions, e.g., supporting parallel compu-
tation. However, CBC-MAC is the most widely used MAC based on block
ciphers, and possibly the simplest; hence, we focus on it.
We next present Lemma 3.2 which shows that CBC-MAC constructs a
secure PRF (and hence also MAC), provided that the underlying function E
is a PRF.

Lemma 3.2. If E is an n-bit PRF, then CBC-MAC^E_k(·) is a secure n·l-bit
PRF and MAC, for any constant integer l > 0.

Proof: See [17].

CBC-MAC is not a VIL-MAC. The CBC-MAC construction is defined
for input which is an integral number of blocks, i.e., n·l bits. Would it work for
inputs of arbitrary length, or how can we extend it so it does support input of
arbitrary length, i.e., a variable input length (VIL) PRF (and MAC), defined
for input domain {0, 1}^*?
One obvious problem is that an arbitrary binary string may not even consist
of an integral number of blocks, while CBC-MAC is defined only for inputs
which are of length n·l, i.e., an integral number of blocks. However, let us ignore
that problem for now, and focus on inputs whose length is an integral number
of blocks, i.e., the inputs in the VIBC domain, defined as:

VIBC ≡ { m ∈ {0, 1}^{n·l} | l ∈ Z^+ }   (3.8)

where VIBC stands for variable input block-count.


However, CBC-MAC is not a PRF, or even a MAC, even for input domain
VIBC. We show this in the following exercise.
Exercise 3.4 (CBC-MAC is not a VIL MAC). Show that CBC-MAC is not a
MAC for the domain VIBC (Equation 3.8), and hence is definitely not a MAC
for {0, 1}^*, or a PRF for either VIBC or {0, 1}^*.
Solution: Let f_k(·) = CBC-MAC^E_k(·) be the CBC-MAC using an under-
lying n-bit block cipher E_k. Namely, for a single-block message a ∈ {0, 1}^n, we
have f_k(a) = E_k(a); and for a two-block message a ++ b, where a, b ∈ {0, 1}^n,
we have f_k(a ++ b) = E_k(b ⊕ E_k(a)).
We present a simple adversary A^{f_k}, with oracle access to f_k, i.e., A is able
to make an arbitrary query x ∈ VIBC to f_k and receive the result f_k(x). Let X
denote all the queries made by A during its run. We show that A^{f_k} generates
a pair x, f_k(x), where x ∉ X, which shows that f_k (i.e., CBC-MAC) is not a
MAC for domain VIBC (and hence also not a {0, 1}^*-MAC, i.e., VIL MAC).
Specifically, the adversary A first makes an arbitrary single-block query, for
arbitrary a ∈ {0, 1}^n. Let c denote the result, i.e., c = f_k(a) = E_k(a). Then,
A computes b = a ⊕ c and outputs the pair of message a ++ b and tag c.
Note that c = f_k(a ++ b), since f_k(a ++ b) = E_k(b ⊕ E_k(a)) = E_k((a ⊕ c) ⊕ c) =
E_k(a) = c. Namely, c is indeed the correct tag for a ++ b. Obviously, A did not
make a query to receive f_k(a ++ b). Hence, A succeeds in the MAC game against
CBC-MAC.
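The forgery above is easy to check by running it; the sketch below uses truncated HMAC-SHA256 as a stand-in for the underlying PRF E_k (an illustrative choice of ours — the attack is independent of which PRF is used).

```python
import hmac
import hashlib

BLOCK = 16

def E(key, block):
    # PRF stand-in for the block cipher E_k (truncated HMAC-SHA256).
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def cbc_mac(key, msg):
    # Eq. (3.7), for any whole number of blocks.
    c = bytes(BLOCK)
    for i in range(0, len(msg), BLOCK):
        c = E(key, bytes(x ^ y for x, y in zip(msg[i:i + BLOCK], c)))
    return c

key = b"0123456789abcdef"                 # unknown to the attacker
a = b"A" * BLOCK                          # arbitrary single-block query
c = cbc_mac(key, a)                       # the one oracle answer the attacker gets
b = bytes(x ^ y for x, y in zip(a, c))    # b = a XOR c
# Forgery: tag c is valid for the never-queried two-block message a ++ b,
# since E_k(b XOR E_k(a)) = E_k((a XOR c) XOR c) = E_k(a) = c.
assert cbc_mac(key, a + b) == c
```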

3.5.3 Constructing Secure VIL MAC from PRF


Lemma 3.2 shows that CBC-MAC is a secure l·n-bit FIL PRF (and MAC);
however, Exercise 3.4 shows that it is not a VIL MAC (and hence surely not
a VIL PRF). The crux of the example was that we used the CBC-MAC of a one-
block string, and presented it as the MAC of a 2-block string. This motivates
a minor change to the construction, where we prepend the block-encoded length
L(m) of the input m to the input, before applying CBC-MAC. We define L(m)
as an n-bit binary string (i.e., a block), whose binary value is the length
|m| of the input m. Lemma 3.3 shows that this construction is indeed a secure
VIL MAC. We refer to this variant as length-prepending CBC-MAC.

Lemma 3.3 (Length-prepending CBC-MAC is a VIL PRF). Let f_k(m) =
CBC-MAC^E_k(L(m) ++ m), where L(m) is the block-encoded length of m (as
defined above). Then f_k(·) is a PRF (and MAC) over the set of all binary
strings.

Proof: See [17].


Note that the block-encoded length L(m) can only support messages up to
the maximal length encoded by n bits, i.e., |m| < 2^n. In practice, this isn't an
issue, and it is not difficult to extend the construction to avoid this limitation,
if you really want to. It is hard to imagine a practical scenario in which you
will have to do this, however.
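The fix is small enough to sketch in code: prepend L(m), the bit-length of m encoded as one block, then apply plain CBC-MAC. As before, truncated HMAC-SHA256 stands in for the PRF (an illustrative choice of ours, not the standardized one).

```python
import hmac
import hashlib

BLOCK = 16

def E(key, block):
    # PRF stand-in (truncated HMAC-SHA256).
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def cbc_mac(key, msg):
    c = bytes(BLOCK)
    for i in range(0, len(msg), BLOCK):
        c = E(key, bytes(x ^ y for x, y in zip(msg[i:i + BLOCK], c)))
    return c

def lp_cbc_mac(key, msg):
    # Length-prepending CBC-MAC: prepend L(m), the bit-length of m encoded
    # as a single n-bit block, then apply plain CBC-MAC (Lemma 3.3).
    L = (8 * len(msg)).to_bytes(BLOCK, 'big')
    return cbc_mac(key, L + msg)

# The Exercise 3.4 forgery no longer verifies: the tag of the one-block query
# a is computed over L(a) ++ a, so it is not a valid tag for a ++ b.
key, a = b"0123456789abcdef", b"A" * BLOCK
c = lp_cbc_mac(key, a)
b = bytes(x ^ y for x, y in zip(a, c))
assert lp_cbc_mac(key, a + b) != c
```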

3.6 Other MAC constructions


In the previous section, we presented the well-known - and widely used - CBC-
MAC construction, which constructs a MAC from a block cipher. There are
some other constructions of MAC from block ciphers, whose goal is to im-
prove efficiency, e.g., by avoiding the need to know the input length in advance
(CMAC [68]) or allowing parallel computation and verification (e.g., XOR-
MAC [16]).
Other constructions try to construct a MAC using other cryptographic mechanisms,
which may be more efficient than a block cipher. In this section we discuss
briefly some approaches trying to achieve that. We discuss the following
approaches: (1) design a MAC ‘from scratch’, i.e., without provable reduction
to the security of some other cryptographic scheme, (2) combine multiple can-
didate MAC functions (robust combiner), and (3) construct a MAC from a
cryptographic hash function.

3.6.1 MAC design ‘from scratch’


This approach attempts to design a candidate MAC function without requiring
a reduction to the security of some cryptographic scheme; typically, the design
simply does not involve any other, known cryptographic function. Instead,
we may use some problems which are considered computationally hard. The
security of such a design is based on the failure of significant cryptanalysis efforts
against the MAC function. This used to be the main method of designing new
cryptographic mechanisms.
However, following the cryptographic building block principle (Principle 8),
MAC functions are rarely designed 'from scratch'. We give an example: vul-
nerabilities in EDC-based MAC designs.

A (failed) attempt to construct MAC from EDC Let us consider a spe-


cific design, which, intuitively, may look promising: constructing a MAC from
(good) Error Detection Code (EDC), such as one of the good CRC schemes.

EDC schemes are designed to ensure integrity - albeit, their design model as-
sumes random errors, rather than intentional modifications. However, can we
extend them, using a secret key, to provide also authentication?
Notice that in § 2.10 we showed that encryption of CRC may not suffice to
ensure authentication. Still, this does not rule out their use for authentication
by using a secret key in a different way, in particular, unrelated to encryption.
In the following exercise, we show that two specific, natural constructions,
MAC_k(m) = EDC(k ++ m) and MAC'_k(m) = EDC(m ++ k), are both insecure.

Exercise 3.5 (Insecure EDC-based MACs). Show that both MAC_k(m) =
EDC(k ++ m) and MAC'_k(m) = EDC(m ++ k) are insecure, even when us-
ing a 'strong' EDC such as CRC.

Solution: We first note that the insecurity is obvious for simple EDCs such
as bit-wise XOR of all bits of the message, and appending the result as an
EDC. This weak EDC detects single bit errors, but fails to detect any error
involving an even number of bits. This holds equally well with and without a
secret key, concatenated before or after the message.
Let us now show that these designs are insecure, also when using a 'strong
EDC' such as CRC. Specifically, consider CRC-MAC_k(m) = CRC(k ++ m); we
next show this is not a secure MAC.
Recall that the CRC function is linear, namely CRC(m ⊕ m') = CRC(m) ⊕
CRC(m'). Hence, for any message m,

CRC-MAC_k(m) = CRC(k ++ m)                                        (3.9)
             = CRC( (0^{|k|} ++ m) ⊕ (k ++ 0^{|m|}) )             (3.10)
             = CRC(0^{|k|} ++ m) ⊕ CRC(k ++ 0^{|m|})              (3.11)
             = CRC-MAC_{0^{|k|}}(m) ⊕ CRC-MAC_k(0^{|m|})          (3.12)

Namely, to forge the MAC for any message m, the attacker makes a query
for q = 0^{|m|}, and receives CRC-MAC_k(0^{|m|}). The adversary then computes
CRC-MAC_{0^{|k|}}(m) = CRC(0^{|k|} ++ m) by itself, and finally computes
CRC-MAC_k(m) = CRC-MAC_{0^{|k|}}(m) ⊕ CRC-MAC_k(0^{|m|}).
We conclude that these EDC-based MACs are, as expected, insecure.
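The attack runs as-is against Python's zlib.crc32. One detail: standard CRC-32, due to its initial-value and final-XOR constants, is affine rather than strictly linear over XOR: for equal-length inputs, CRC(x ⊕ y) = CRC(x) ⊕ CRC(y) ⊕ CRC(0...0). So one extra correction term, CRC(0^{|k|+|m|}), which the attacker computes offline, is needed. The key and message below are arbitrary illustrative values.

```python
import zlib

def crc_mac(key: bytes, msg: bytes) -> int:
    # The insecure construction: CRC-MAC_k(m) = CRC(k ++ m).
    return zlib.crc32(key + msg)

key = b"sixteen byte key"             # unknown to the attacker
m = b"pay Mallory $999"               # target message, same length as the query
# One 'known-MAC' value: the tag of the all-zero message of |m| bytes.
t0 = crc_mac(key, bytes(len(m)))
# Offline terms, computed without the key (affine correction included):
forged = (t0
          ^ zlib.crc32(bytes(len(key)) + m)          # CRC(0^{|k|} ++ m)
          ^ zlib.crc32(bytes(len(key) + len(m))))    # CRC(0^{|k|+|m|})
assert forged == crc_mac(key, m)      # valid tag for m, never queried
```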
Note that in Exercise 3.5 above, the attack assumes that the attacker can
obtain the MAC for the specific message (query) q = 0^{|m|}. Obtaining the MAC for
this specific ('chosen') message may be infeasible in many scenarios, i.e., the
attack may appear impractical. However, as the following exercise shows, it is
quite easy to modify the attack so that it works for any ('known') message for
which the attacker can receive the MAC value.

Exercise 3.6 (Realistic attack on CRC-MAC). Show how an attacker can
calculate CRC-MAC_k(m) for any message m, given the value CRC-MAC_k(m')
for any message m' s.t. |m'| = |m|.

Guidance: The attack is a slight modification of the one in Exercise 3.5,
exploiting the linearity, very much like in Eq. 3.9, except for choosing a dif-
ferent message (query) q, not q = 0^{|m|} as before. The main challenge is to
select the message (query) q so that CRC-MAC_k(m) = CRC-MAC_{0^{|k|}}(q) ⊕
CRC-MAC_k(m'); you can find the required value of q by essentially solving
this equation, which, using the linearity of CRC, is actually a simple linear
equation.

3.6.2 Robust combiners for MAC


A robust combiner for MAC combines two (or more) candidate MAC functions
to create a new composite function, which is proven secure provided that one
(or a sufficient number) of the underlying functions is secure. There is actually
a very simple robust combiner for MAC schemes: concatenation (denoted ++).
In the following exercise we show that concatenation is a robust combiner.
Exercise 3.7. Show that concatenation is a robust combiner for MAC func-
tions.

Solution (from [92]): Let F', F'' be two candidate MAC schemes, and define
F_{k',k''}(m) = F'_{k'}(m) ++ F''_{k''}(m). We should show that it suffices that either F'
or F'' is a secure MAC, for F to be a secure MAC scheme as well. Without loss
of generality, assume F' is secure; and assume, to the contrary, that F is not
a secure MAC. Namely, assume an attacker A^{F_{k',k''}(µ)|µ≠m} that can output a
pair m, F_{k',k''}(m), given access to an oracle that computes F_{k',k''} on any value
except m. We use A to construct an adversary A' which succeeds against F'.
Adversary A' operates by running A, as well as selecting a key k'' and
running F''_{k''}(·); this is needed to allow A' to provide the oracle service to
A^{F_{k',k''}(µ)|µ≠m}, computing F_{k',k''}(µ) for any given input µ. Whenever A makes
a query q, then A' makes the same query to the F'_{k'}(·) oracle, to receive F'_{k'}(q).
Then, A' computes by itself F''_{k''}(q), and combines it with F'_{k'}(q) to produce
the required response F'_{k'}(q) ++ F''_{k''}(q).
When A finally returns the pair (m, F_{k',k''}(m)) = (m, F'_{k'}(m) ++ F''_{k''}(m)),
then A' simply returns the pair (m, F'_{k'}(m)), i.e., omitting the second part of
the MAC that A returned.
However, concatenation is a rather inefficient construction for a robust com-
biner of MAC schemes, since it doubles the length of the output. The
following exercise shows that exclusive-or is also a robust combiner
for MAC; and since the output length is the same as that of the component MAC
schemes, it is efficient.
Exercise 3.8. Show that exclusive-or is a robust combiner for MAC functions.
Namely, that MAC_{(k',k'')}(x) = MAC'_{k'}(x) ⊕ MAC''_{k''}(x) is a secure MAC, if
one or both of {MAC', MAC''} is a secure MAC.
Guidance: Similar to the solution of Ex. 3.7; see chapter 10.
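The XOR combiner is a one-liner given two candidate MACs. In the sketch below, the two candidates are HMAC instances over different hash functions; this is an arbitrary illustrative choice of ours, and the point is only that the combined tag is no longer than each component tag, unlike concatenation.

```python
import hmac
import hashlib

def mac1(k, m):
    # First candidate MAC: HMAC-SHA256, truncated to 16 bytes.
    return hmac.new(k, m, hashlib.sha256).digest()[:16]

def mac2(k, m):
    # Second candidate MAC: HMAC-MD5 (assume it may be broken).
    return hmac.new(k, m, hashlib.md5).digest()[:16]

def xor_mac(keys, m):
    # Robust combiner: secure as long as at least one candidate is secure,
    # with no increase in tag length.
    k1, k2 = keys
    return bytes(a ^ b for a, b in zip(mac1(k1, m), mac2(k2, m)))

tag = xor_mac((b"key-one!", b"key-two!"), b"some message")
assert len(tag) == 16
```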

3.6.3 MAC constructions from other cryptographic
mechanisms
Finally, we consider constructions of MAC functions from other cryptographic
mechanisms, following the cryptographic building blocks principle (§ 2.7); these
type of constructions of one cryptographic scheme from another, are also the
most widely studied in cryptographic literature.
The most well-known and widely used constructions of MAC functions from
other cryptographic schemes, are from two ‘standard’ cryptographic building-
blocks: block ciphers and cryptographic hash functions.
Constructions of MAC functions from block ciphers need to address the ob-
vious challenge of input length: we defined MAC functions for arbitrary input
length, usually abbreviated as VIL (for Variable Input Length), while block ci-
phers have Fixed Input Length (FIL). This makes these constructions somewhat
more complex; we discussed them in subsection 3.5.2.
Cryptographic hash functions, in contrast, usually allow for variable input
length (VIL), and therefore, constructions based on them are mostly simpler.
We discuss cryptographic hash functions in chapter 4, and their use to construct
MAC functions in subsection 4.6.1.

3.7 Combining Authentication and Encryption


Message authentication combines authentication (sender identification) and in-
tegrity (detection of modification). However, when transmitting messages, we
often have additional goals. These include security goals such as confidential-
ity, as well as fault-tolerance goals such as error-detection/correction, and even
efficiency goals such as compression. In this section, we focus on the combina-
tion of the two basic security goals: encryption and authentication. Later on, in
subsection 5.1.4, we discuss the complete secure session transmission protocol,
which addresses additional goals involving security, reliability and efficiency,
for a session (connection) between two parties.
There are two main options for ensuring the confidentiality and authentica-
tion/integrity requirements together: (1) by correctly combining an encryption
scheme with a MAC scheme, or (2) by using a combined authenticated encryp-
tion scheme. In the first subsection below, we discuss authenticated encryp-
tion schemes, including the security requirements and adversary model for the
combination of confidentiality (encryption) and authentication. In the follow-
ing subsections, we discuss specific generic constructions, combining arbitrary
MAC and encryption schemes.

3.7.1 Authenticated Encryption (AE) and AEAD schemes


Since the combination of confidentiality and authenticity is often required, there
are also constructions of combined Authenticated Encryption (AE) schemes.
AE schemes, like encryption schemes, consist of two functions: encrypt-and-
authenticate and decrypt-and-verify. The decrypt-and-verify returns ERROR

if the ciphertext is found to be non-authentic; a similar verification property can be
implemented by a MAC scheme, by comparing the 'tag' received with a message
to the result of computing the MAC on the message. AE schemes may also
have a key-generation function; in particular, this is necessary when the keys
are not uniformly random.
The use of such a combined scheme allows simpler, less error-prone implemen-
tations, with calls to only one function (encrypt-and-authenticate or decrypt-
and-verify) instead of requiring the correct use of both encryption/decryption
and MAC functions. Many constructions are generic, i.e., built by combin-
ing arbitrary implementation of cryptographic schemes, following the ‘crypto-
graphic building blocks’ principle. The combinations of encryption scheme and
MAC scheme that we study later in this subsection are good examples for such
generic constructions.
Other constructions are ‘ad-hoc’, i.e., they are designed using specific func-
tions. Such ad-hoc constructions may have better performance than generic
constructions, however, that may come at the cost of requiring more complex or
less well-tested security assumptions, contrary to the Cryptographic Building
Blocks principle.
In many applications, some of the data to be authenticated should not
be encrypted, since it is used also by agents which do not have the secret
(decryption) key; for example, the identity of the destination. Such data is often
referred to as associated data, and authenticated encryption schemes supporting
it are referred to as AEAD (Authenticated Encryption with Associated Data)
schemes [146]. AEAD schemes have the same three functions (key-generation,
encrypt-and-authenticate, decrypt-and-verify); the change is merely in adding
an optional ‘associated-data’ field as input to the encrypt-and-authenticate
function and as output of the decrypt-and-verify function.

Authenticated encryption: attack model and success/fail criteria


We now briefly discuss the attack model (attacker capabilities) and the goals
(success/fail criteria) for the combination of authentication and confidentiality
(encryption), as is essential for any security evaluation (principle 1). Essen-
tially, this combines the corresponding attack model and goals of encryption
schemes (indistinguishability test) and of message authentication code (MAC)
schemes (forgery test).
As in our definitions for encryption and MAC, we consider a computationally-
limited (PPT) adversary. We also allow the attacker to have similar capabilities
as in the definitions of secure encryption / MAC. In particular, we allow cho-
sen plaintext queries, where the attacker provides input messages (plaintext)
and receives their authenticated-encryption, as in the chosen-plaintext attack
(CPA) we defined for encryption.

Exercise 3.9. Present precise definitions for IND-CPA and security against
forgery for AE and AEAD schemes.

3.7.2 EDC-then-Encrypt Schemes
Several practical secure communication systems first apply an Error-Detecting-
Code (EDC) to the message, and then encrypt it, i.e.: c = E_k(m ++ EDC(m)).
We believe that the motivation for this design is the hope to ensure authenti-
cation as well as confidentiality, i.e., the designers were (intuitively) trying to
develop an authenticated-encryption scheme. Unfortunately, such designs are
often insecure; in fact, often, the application of EDC/ECC before encryption
allows attacks on the confidentiality of the design. We saw one example, for
WEP, in § 2.10. Another example of such vulnerability is in the design of GSM,
which employs not just an Error Detecting Code but even an Error Correcting
Code, with very high redundancy. In both WEP and GSM, the encryption
was performed by XORing the plaintext (after EDC/ECC) with the keystream
(output of PRG).
However, EDC-then-Encrypt schemes are often vulnerable, also when using
other encryption schemes. For example, the following exercise shows such vul-
nerability, albeit against the authentication property, when using CBC-mode
encryption.
Exercise 3.10 (EDC-then-CBC does not ensure authentication). Let E be
a secure block cipher and let CBC^E_k(m; IV) be the CBC-mode encryption of
plaintext message m, using underlying block cipher E, key k and initializa-
tion vector IV, as in Eq. (2.38). Furthermore, let EDCtCBC^E_k(m; IV) =
CBC^E_k(m ++ h(m); IV), where h is a function outputting one block (error de-
tecting code). Show that EDCtCBC^E is not a secure authenticated encryption;
specifically, that authentication fails.
Hint: the attacker asks for the EDCtCBC^E encryption of the message m' = m ++
h(m); the output also gives the encryption of m.

3.7.3 Generic Authenticated Encryption Constructions


We now discuss ‘generic’ constructions, combining arbitrary MAC and encryp-
tion schemes to ensure both confidentiality and authentication/integrity. As
discussed above, these constructions can be used to construct a single, com-
bined ‘authenticated encryption’ scheme, or to ensure both goals (confidential-
ity and authenticity) in a system.
Different generic constructions were proposed, but not all are secure. Let
us consider three constructions, all applied in important, standard applica-
tions. For each of the designs, we present the process of authenticating and
encrypting a message m, using two keys: k' used for encryption, and k'' used
for authentication.

Authenticate and Encrypt (A&E), e.g., used in early versions of the SSH
protocol: C = Enc_{k'}(m), A = MAC_{k''}(m); send (C, A).
Authenticate then Encrypt (AtE), e.g., used in the SSL and TLS stan-
dards: A = MAC_{k''}(m), C = Enc_{k'}(m, A); send C.
Encrypt then Authenticate (EtA), e.g., used by the IPsec standard: C =
Enc_{k'}(m), A = MAC_{k''}(C); send (C, A).

Exercise 3.11 (Generic AE and AEAD schemes). Above we described only the
‘encrypt-and-authenticate’ function of the authenticated-encryption schemes for
the three generic constructions, and even that, we described informally, with-
out the explicit implementation. Complete the description by writing explicitly,
for each of the three generic constructions above, the implementation for the
encrypt-and-authenticate (EnA) and the decrypt-and-verify (DnV) functions.
Present also the AEAD (Authenticated Encryption with Associated Data) ver-
sion.
Partial solution: we present only the solution for the A&E construction.
The AE implementations are:

A&E.EnA_{(k',k'')}(m) ← (Enc_{k'}(m), MAC_{k''}(m))

A&E.DnV_{(k',k'')}(c, a) ← { m ← Dec_{k'}(c); output ERROR if m = ERROR or a ≠ MAC_{k''}(m); otherwise, output m }

The AEAD implementations are very similar, except also with Associated
Data (wAD); we present only the EnA function:

A&E.EnAwAD_{(k',k'')}(m, d; r) = (Enc_{k'}(m; r), d, MAC_{k''}(m ++ d))
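A runnable sketch of one of these constructions, EtA, may help make the data flow concrete. All names below are ours; since the standard library offers no block cipher, a toy stream cipher (SHA-256 in counter mode) stands in for the IND-CPA encryption scheme — it illustrates the structure only and is not a vetted cipher.

```python
import hmac
import hashlib
import secrets

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy stream cipher: SHA-256 in counter mode, standing in for Enc.
    out = b""
    for ctr in range((n + 31) // 32):
        out += hashlib.sha256(key + nonce + ctr.to_bytes(4, 'big')).digest()
    return out[:n]

def eta_encrypt(ke: bytes, km: bytes, m: bytes):
    # EtA: first encrypt, then MAC the ciphertext (including the nonce).
    nonce = secrets.token_bytes(16)
    c = nonce + bytes(x ^ y for x, y in zip(m, _keystream(ke, nonce, len(m))))
    a = hmac.new(km, c, hashlib.sha256).digest()
    return c, a

def eta_decrypt(ke: bytes, km: bytes, c: bytes, a: bytes):
    # Verify first; reject (ERROR, here None) before attempting decryption.
    if not hmac.compare_digest(a, hmac.new(km, c, hashlib.sha256).digest()):
        return None
    nonce, body = c[:16], c[16:]
    return bytes(x ^ y for x, y in zip(body, _keystream(ke, nonce, len(body))))

ke, km = secrets.token_bytes(16), secrets.token_bytes(16)
c, a = eta_encrypt(ke, km, b"attack at dawn")
assert eta_decrypt(ke, km, c, a) == b"attack at dawn"
assert eta_decrypt(ke, km, c[:-1] + bytes([c[-1] ^ 1]), a) is None  # tamper
```

Note the order in decrypt-and-verify: the MAC is checked before any decryption, so tampered ciphertexts are rejected without processing them.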

Some of these three generic constructions are insecure, as we demonstrate


below for particular pairs of encryption and MAC functions. Can you identify
- or guess - which? The answers were given, almost concurrently, by two
beautiful papers [19, 110]; the main points are in the following exercises.
Exercise 3.12 shows that A&E is insecure; this is quite straightforward, and
hence readers should try to solve it alone before reading the solution.
Exercise 3.12 (Authenticate and Encrypt (A&E) is insecure). Show that a
pair of secure encryption scheme Enc and secure MAC scheme M AC may
be both secure, yet their combination using the A&E construction would be
insecure.
Solution: given any secure MAC scheme MAC, let

MAC'_{k''}(m) = MAC_{k''}(m) ++ m[1]

where m[1] is the first bit of m.
If MAC is a secure MAC then MAC' is also a secure MAC. However, MAC'
exposes a bit of its input; hence, its use in A&E would allow the adversary to
distinguish between encryptions of two messages, i.e., the resulting, combined
scheme is not IND-CPA secure, even when the underlying encryption scheme
Enc is secure.
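This counterexample is easy to run: take any secure MAC (below, truncated HMAC-SHA256 as an illustrative stand-in) and append the plaintext's first bit. The modified MAC is still unforgeable, but in A&E the tag travels in the clear next to the ciphertext, so an eavesdropper learns that bit regardless of how strong the encryption is.

```python
import hmac
import hashlib

def mac(k, m):
    # Underlying secure MAC stand-in: HMAC-SHA256, truncated.
    return hmac.new(k, m, hashlib.sha256).digest()[:16]

def mac_prime(k, m):
    # MAC'_{k''}(m) = MAC_{k''}(m) ++ m[1]: still unforgeable, but the
    # last byte of the tag is the first bit of the message.
    first_bit = (m[0] >> 7) & 1
    return mac(k, m) + bytes([first_bit])

k = b"authenticity-key"
m0, m1 = b"\x00 low first bit", b"\xff high first bit"
# In A&E, the tag is sent alongside the ciphertext, so an eavesdropper
# distinguishes Enc(m0) from Enc(m1) by the tag's last byte alone.
assert mac_prime(k, m0)[-1] == 0
assert mac_prime(k, m1)[-1] == 1
```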

Exercise 3.13 shows that AtE is also insecure. The argument is more elabo-
rate than the A&E argument from Exercise 3.12, and it may not be completely
necessary to understand it for a first reading; however, it is a nice example
of a cryptographic counterexample, so it may be worth investing the effort.
Readers may also consult [110] for more details.
Exercise 3.13 (Authenticate then Encrypt (AtE) is insecure). Show that a
pair of secure encryption scheme Enc and secure MAC scheme M AC may be
both secure, yet their combination using the AtE construction would be insecure.
Solution: Consider the following simplified version of the Per-Block Random
(PBR) mode presented in subsection 2.9.2, defined for single block messages:
Enck (m; r) = m ⊕ Ek (r) + + r, where E is a block cipher; notice that this is also
essentially OFB and CFB mode encryption, applied to single block messages.
When the random bits are not relevant, i.e., simply selected uniformly, then
we do not explicitly write them and use the simplified notation Enck (m).
As shown in Theorem 2.1, if E is a secure block cipher (or even merely a
PRF or PRP), then Enc is an IND-CPA secure encryption scheme. Denote
the block length by 4n, i.e., assume it is a multiple of 4. Hence, the output of
Enc is 8n-bits long.
We next define a randomized transform Split : {0, 1} → {0, 1}2 , i.e.,
from one bit to a pair of bits. The transform always maps 0 to 00, and
randomly transforms 1 to {01, 10, 11} with the corresponding probabilities
{49.9%, 50%, 0.1%}. We extend the definition of Split to 2b-bit long strings, by
applying Split to each input block, i.e., given 2n-bit input message m = m1 + +
. . .+
+m2n , where each mi is a bit, let Split(m) = Split(m1 )+ +. . .+
+Split(m2n ).
We use Split to define a ‘weird’ variant of Enc, which we denote Enc',
defined as: Enc'k(m) = Enck(Split(m)). The reader should confirm that,
assuming E is a secure block cipher, Enc' is an IND-CPA secure encryption
scheme (for 2n-bit-long plaintexts).
Consider now AtEk,k'(m) = Enc'k(m ++ MACk'(m)) = Enck(Split(m ++
MACk'(m))), where m is an n-bit-long string, and where MAC has n-bit-long
inputs and outputs. Hence, the input to Enc' is 2n bits long, and the input
to Enc is 4n bits long - as we defined above.
However, AtE is not a secure authenticated-encryption scheme. In fact,
given c = AtEk,k0 (m), we can decipher m, using merely feedback-only CCA
queries.
Let us demonstrate how we find the first bit m1 of m. Denote the 8n bits of c
as c = c1 ++ … ++ c8n. Perform the query c' = c̄1 ++ c̄2 ++ c3 ++ c4 ++ … ++ c8n, i.e.,
inverting the first two bits of c. Recall that c = AtEk,k'(m) = Enck(Split(m ++
MACk'(m))) and that Enck(m; r) = (m ⊕ Ek(r)) ++ r. Hence, by inverting c1, c2,
we invert the two bits of Split(m1) upon decryption.
The impact depends on the value of m1. If m1 = 0, then Split(m1) = 00;
by inverting these bits, we get 11, whose ‘unsplit’ transform returns 1 instead
of 0, causing the MAC validation to fail, providing the attacker with an
‘ERROR’ feedback. However, if m1 = 1, then Split(m1) is either 01 or 10
(with probability 99.9%), and inverting both bits does not impact the
‘unsplit’ result, so that the MAC validation does not fail. This allows the
attacker to determine the first bit m1, with very small (0.1%) probability
of error (in the rare case where Split(m1) returned 11).
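To make the counterexample concrete, here is a small Python simulation of the attack. This is only a sketch under simplifying assumptions: a toy SHA-256-based keystream stands in for the block cipher E, a truncated HMAC stands in for MAC, and all function names are illustrative, not from the book.

```python
import hashlib, hmac, secrets

def prf_bits(key, r_bits, nbits):
    # Toy keystream from SHA-256(key || r); stands in for Ek(r) (illustrative only).
    digest = hashlib.sha256(key + bytes(r_bits)).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(nbits)]

def mac(key, bits, n):
    # Toy n-bit MAC: truncated HMAC-SHA256 (illustrative only).
    d = hmac.new(key, bytes(bits), hashlib.sha256).digest()
    return [(d[i // 8] >> (i % 8)) & 1 for i in range(n)]

def split_bit(b):
    # 0 -> 00; 1 -> 01, 10 or 11 with probabilities ~49.9%, 50%, 0.1%.
    if b == 0:
        return [0, 0]
    x = secrets.randbelow(1000)
    return [1, 1] if x == 0 else ([0, 1] if x % 2 else [1, 0])

def split(bits):
    return [v for b in bits for v in split_bit(b)]

def unsplit(bits):
    # 00 -> 0; 01, 10 or 11 -> 1.
    return [0 if bits[2*i:2*i+2] == [0, 0] else 1 for i in range(len(bits) // 2)]

def enc(key, pt_bits):
    # Enc_k(m; r) = (m XOR Ek(r)) ++ r, on a single 'block'.
    r = [secrets.randbelow(2) for _ in range(len(pt_bits))]
    return [p ^ k for p, k in zip(pt_bits, prf_bits(key, r, len(pt_bits)))] + r

def dec(key, ct_bits):
    half = len(ct_bits) // 2
    body, r = ct_bits[:half], ct_bits[half:]
    return [c ^ k for c, k in zip(body, prf_bits(key, r, half))]

def ate_encrypt(ke, km, m, n):
    # AtE: authenticate, then encrypt the 'split' plaintext.
    return enc(ke, split(m + mac(km, m, n)))

def ate_verify(ke, km, ct, n):
    # Feedback-only CCA oracle: True iff the MAC validates after decryption.
    pt = unsplit(dec(ke, ct))
    return pt[n:] == mac(km, pt[:n], n)

# The attack: flip the first two ciphertext bits; they mask Split(m1).
n = 16
ke, km = secrets.token_bytes(16), secrets.token_bytes(16)
m = [secrets.randbelow(2) for _ in range(n)]
c = ate_encrypt(ke, km, m, n)
c_flipped = [1 - c[0], 1 - c[1]] + c[2:]
# If m[0] = 1, Split(m[0]) is 01 or 10 (w.h.p.); flipping both bits leaves the
# 'unsplit' result unchanged, so the MAC still validates. If m[0] = 0, the
# flipped 00 -> 11 decodes to 1, and validation fails.
guess = 1 if ate_verify(ke, km, c_flipped, n) else 0
print(guess == m[0])  # True, except with probability ~0.1%
```

Running the attack repeatedly recovers each plaintext bit with the same ~99.9% reliability, using only the verification feedback.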
Note that the AtE construction may be secure for specific encryption and MAC
schemes. However, it is not secure for arbitrary secure encryption and MAC
schemes, i.e., as a generic construction. This leaves Encrypt-then-Authenticate
(EtA) as the only remaining candidate generic construction. Fortunately, EtA
is secure for any secure encryption and MAC schemes, as the following lemma
states.

Lemma 3.4 (EtA is secure [110]). Given an IND-CPA encryption scheme Enc
and a secure MAC scheme MAC, their EtA construction ensures both IND-CPA
security and MAC security.

Proof sketch: We first show that the IND-CPA property holds. Suppose, to
the contrary, that there is an efficient (PPT) adversary A that ‘wins’ against
EtA in the IND-CPA game with significant probability. We construct an adver-
sary A' that ‘wins’ in the IND-CPA game against the encryption scheme Enc,
employed as part of the EtA scheme. Specifically, A' generates a key k'' for the
MAC function, and runs A. Whenever A chooses the two challenge messages
m0, m1, and should be provided with the authenticated-encryption of mb, then
A' chooses the same two messages and receives c* = Enck'(mb). Then A'
uses the key k'' it generated to compute a* = MACk''(c*) and returns the pair
(c*, a*), which is the authenticated-encryption of mb, as required.
Similarly, whenever A asks for the encryption of a message m, A' uses its
oracle to compute c = Enck'(m), and uses k'' to compute a = MACk''(c). A' then
returns the pair (c, a) to A, which is exactly the required EtA.EnAk',k''(m).
Finally, when A guesses a bit b, A' guesses the same bit. If A ‘wins’,
i.e., guesses correctly, then A' also ‘wins’. Since Enc is assumed to be
IND-CPA secure, no such A' can exist; it follows that there is no efficient
(PPT) adversary A that ‘wins’ against EtA in the IND-CPA game.
We next show that EtA also ensures security against forgery, as in Def. 3.1,
adjusted for AE/AEAD schemes as in Ex. 3.9. Suppose there is an effi-
cient (PPT) adversary A that succeeds in forging against the EtA scheme with
significant probability. Namely, A produces a ciphertext c and tag a s.t. m =
EtA.DnVk',k''(c, a) for some message m, without making a query to EtA.EnAk',k''(m).
By construction, this implies that a = MACk''(c).
However, from the definition of encryption (Def. 2.1), specifically the cor-
rectness property, there is no other message m' ≠ m whose encryption would
result in the same ciphertext c. Hence, A did not make any query to EtA.EnAk',k''
that returned MACk''(c) as the tag - yet A obtained MACk''(c) somehow - in
contradiction to the assumed security of MAC.

Additional properties of EtA: efficiency and ability to foil DoS, CCA


Not only is EtA secure given any secure encryption and MAC schemes - it also
has three additional desirable properties:

Efficiency: Any corruption of the ciphertext, intentional or benign, is detected
immediately by the verification process (comparing the received tag to
the MAC of the ciphertext). This verification is much more efficient than
performing decryption.
Foil DoS: This improved efficiency implies that it is much harder, and rarely
feasible, to exhaust the resources of the recipient by sending corrupted
messages (ciphertext).
Foil CCA: By validating the ciphertext before decrypting it, EtA schemes
prevent CCA attacks against the underlying encryption scheme, where
the attacker provides specially-crafted ciphertext messages, receives the
corresponding plaintext (or a failure indication if the ciphertext was not
a valid encryption), and uses the resulting plaintext and/or failure indi-
cation to attack the underlying encryption scheme. If the attacker sends
such a crafted ciphertext to the EtA scheme, it should fail the MAC
validation, and would not even be input to the decryption process.
Therefore, as long as the attacker cannot forge a legitimate MAC tag,
they can only attack the MAC component of EtA, and the encryption
scheme is protected from this threat.
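A minimal sketch of the EtA composition in Python, assuming a toy SHA-256-based stream cipher in place of a real encryption scheme and HMAC-SHA256 as the MAC. All names are illustrative; a production design would use a standard, vetted AEAD scheme.

```python
import hashlib, hmac, secrets

def keystream(key, nonce, length):
    # Toy CTR-style keystream built from SHA-256; a stand-in for a real
    # IND-CPA encryption scheme (illustrative only).
    out = b""
    ctr = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:length]

def eta_encrypt(k_enc, k_mac, m):
    nonce = secrets.token_bytes(16)
    ct = nonce + bytes(a ^ b for a, b in zip(m, keystream(k_enc, nonce, len(m))))
    tag = hmac.new(k_mac, ct, hashlib.sha256).digest()  # MAC over the ciphertext
    return ct, tag

def eta_decrypt(k_enc, k_mac, ct, tag):
    # Validate BEFORE decrypting: corrupted ciphertexts are rejected cheaply
    # and never reach the decryption routine (foils CCA, limits DoS exposure).
    expected = hmac.new(k_mac, ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    nonce, body = ct[:16], ct[16:]
    return bytes(a ^ b for a, b in zip(body, keystream(k_enc, nonce, len(body))))

k_enc, k_mac = secrets.token_bytes(16), secrets.token_bytes(16)
ct, tag = eta_encrypt(k_enc, k_mac, b"attack at dawn")
assert eta_decrypt(k_enc, k_mac, ct, tag) == b"attack at dawn"
# Any ciphertext corruption is caught by the tag check, before decryption:
corrupted = ct[:-1] + bytes([ct[-1] ^ 1])
assert eta_decrypt(k_enc, k_mac, corrupted, tag) is None
```

Note that the sketch checks the tag with a constant-time comparison and returns before any decryption work, reflecting the three properties listed above.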

3.7.4 Single-Key Generic Authenticated-Encryption


All three constructions above used two separate keys: k' for encryption and
k'' for authentication. Sharing two separate keys may be harder than sharing
a single key. Can we use a single key k for both the encryption and the
MAC functions used in the generic authenticated-encryption constructions (or,
specifically, in the EtA construction, since it is always secure)? Note that this
excludes the obvious naive ‘solution’ of using a ‘double-length’ key, split into
an encryption key and a MAC key. The following exercise shows that such ‘key
re-use’ is insecure.

Exercise 3.14 (Key re-use is insecure). Let E', MAC' be secure encryption
and MAC schemes. Show (contrived) examples of secure encryption and MAC
schemes, built using E', MAC', demonstrating vulnerabilities for each of the
three generic constructions, when using the same key for authentication and
for encryption.

Partial solution:
A&E: Let Ek',k''(m) = E'k'(m) ++ k'' and MACk',k''(m) = k' ++ MAC'k''(m).
Obviously, when combined using the A&E construction, the result is com-
pletely insecure - both authentication and confidentiality are completely
lost, since each scheme exposes the key used by the other.
AtE: To demonstrate loss of authenticity, let Ek',k''(m) = E'k'(m) ++ k'' as
above.
EtA: To demonstrate loss of confidentiality, let MACk',k''(m) = k' ++ MAC'k''(m)
as above. To demonstrate loss of authentication, with a hint to one el-
egant solution: combine Ek',k''(m) = E'k'(m) ++ k'' as above, with a
(simple) extension of Example 3.2.

The reader is encouraged to complete the missing details, and in particular,
to show that all the encryption and MAC schemes used in the solution are
secure (albeit contrived) - only their combined use, with a shared key, in the
three generic constructions is insecure.
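The A&E key-reuse failure can be demonstrated concretely. The following Python sketch builds toy, illustrative versions of the contrived E and MAC from the partial solution (using SHA-256 and HMAC as stand-ins for E' and MAC') and shows that re-using the same key pair leaks both keys.

```python
import hashlib, hmac, secrets

def enc_prime(k, m):
    # Toy IND-CPA-style encryption E' (illustrative; messages up to 32 bytes).
    r = secrets.token_bytes(16)
    ks = hashlib.sha256(k + r).digest()[:len(m)]
    return r + bytes(a ^ b for a, b in zip(m, ks))

def dec_prime(k, c):
    r, body = c[:16], c[16:]
    ks = hashlib.sha256(k + r).digest()[:len(body)]
    return bytes(a ^ b for a, b in zip(body, ks))

# The contrived schemes of the partial solution, over a key pair (k1, k2):
def enc_contrived(k1, k2, m):
    # Appends k2 to the ciphertext -- harmless while k2 is used nowhere else.
    return enc_prime(k1, m) + k2

def mac_contrived(k1, k2, m):
    # Prepends k1 to the tag -- harmless while k1 is used nowhere else.
    return k1 + hmac.new(k2, m, hashlib.sha256).digest()

# Key re-use: the SAME pair (ka, kb) keys both schemes in an A&E combination.
ka, kb = secrets.token_bytes(16), secrets.token_bytes(16)
m = b"wire $100 to Bob"
c = enc_contrived(ka, kb, m)
t = mac_contrived(ka, kb, m)

leaked_kb = c[-16:]   # the ciphertext exposes the MAC key
leaked_ka = t[:16]    # the tag exposes the encryption key
assert dec_prime(leaked_ka, c[:-16]) == m                    # confidentiality lost
forged = mac_contrived(leaked_ka, leaked_kb, b"wire $999 to Mal")
assert forged == mac_contrived(ka, kb, b"wire $999 to Mal")  # authenticity lost
```

Each contrived scheme is secure on its own, since the key material it exposes is unused; the failure appears only when the same pair keys both schemes.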
Since we know that we cannot re-use the same key for both encryption and
MAC, the next question is: can we derive two separate keys k', k'' from a single
key k, and if so, how? We leave this as a (not too difficult) exercise.

Exercise 3.15 (Generating two keys from one key). Given a secure n-bit-
key shared-key encryption scheme (E, D), a secure n-bit-key MAC scheme
MAC, and a single random, secret n-bit key k, show how we can derive two
keys (k', k'') from k, s.t. the EtA construction is secure, when using k' for
encryption and k'' for MAC, given:

1. A secure n-bit-key PRF f .


2. A secure n-bit-key block cipher (Ê, D̂).
3. A secure PRG from n bits to 2n bits.

Solution: see chapter 10.

3.8 Message Authentication: Additional exercises
Exercise 3.16. Mal intercepts a message sent from Alice to her bank,
instructing the bank to transfer 10$ to Bob. Assume that the communication
is protected by OTP encryption, using a random key shared between Alice and
her bank, and by including Alice’s password as part of the plaintext, validated
by the bank. Assume Mal knows that the message is an ASCII encoding of the
exact string Transfer 10$ to Bob. From: Alice, PW: xxxxx, except that xxxxx is
replaced by Alice’s password (unknown to Mal). Show how Mal can change the
message so that, upon receiving it, the bank will instead transfer 99$ to Mal.
Exercise 3.17. Let S be a correct signature scheme over domain {0,1}^n, and
let h : {0,1}* → {0,1}^n be a hash function whose output is n bits long. Prove
that the HtS construction S^h_HtS, defined as in Equation 3.5, is correct.

Exercise 3.18. Let S be an existentially unforgeable signature scheme over
domain {0,1}^n, and let h(x ++ y) = x ⊕ y be a hash function whose input
is 2n bits long, and whose output is the n-bit string resulting from the bit-
wise exclusive-OR of the most-significant n input bits with the least-significant
n input bits. Show an attacker A that demonstrates that the HtS construction
S^h_HtS, defined as in Equation 3.5, is not an existentially unforgeable
signature scheme.

Exercise 3.19. Hackme Inc. proposes the following highly-efficient MAC,
using two 64-bit keys k1, k2, for 64-bit blocks: MACk1,k2(m) = (m ⊕ k1) + k2
(mod 2^64). Show that this is not a secure MAC.

Hint: Compare to Exercise 2.44.

Exercise 3.20. Let F : {0,1}^n → {0,1}^l be a secure PRF, from n-bit strings
to l < n bit strings. Define F' : {0,1}^n → {0,1}^2l as: F'k(m) = Fk(m) ++ Fk(m̄),
i.e., concatenate the results of Fk applied to m and to the bitwise complement
m̄ of m. Present an efficient algorithm ADV^F'k which demonstrates that F' is
not a secure MAC, i.e., outputs a tuple (x, t) s.t. x ∈ {0,1}^n and t = F'k(x).
Algorithm ADV^F'k may provide input m ∈ {0,1}^n and receive F'k(m), as long
as x ≠ m. You can present ADV^F'k by ‘filling in the blanks’ in the ‘template’
below, modifying and/or extending the template if desired, or simply write your
own code if you like.

ADV^F'k: { t0 = F'k(________); Return (________); }

Exercise 3.21. Consider CFB-MAC, defined below, similarly to the defi-
nition of CBC-MAC (Eq. (3.7)):

CFB-MAC^E_k(m1 ++ m2 ++ … ++ mη) = { c0 ← 0^l; (i = 1…η) ci = mi ⊕ Ek(ci−1); output cη }    (3.13)
1. Show an attack demonstrating that CFB-MAC^E_k is not a secure l·η-
bit MAC, even when E is a secure l-bit block cipher (PRP). Your attack
should consist of:
a) Up to three ‘queries’, i.e., messages m = ________, m' = ________
and m'' = ________, each of one or more blocks, to which
the attacker receives CFB-MAC^E_k(m), CFB-MAC^E_k(m') and
CFB-MAC^E_k(m''). Note: actually, one query suffices.
b) A forgery, i.e., a pair of a message mF = ________ and its
authenticator a = ________ such that mF ∉ {m, m', m''}
and a = CFB-MAC^E_k(mF).
2. Would your attack also work against the ‘improved’ variant ICFB-
MAC^E_k(m) = Ek(CFB-MAC^E_k(m))? If not, present an attack against
ICFB-MAC^E_k(m): m = ________, m' = ________,
m'' = ________, mF = ________ and a = ________.

Exercise 3.22. 1. Alice sends to Bob the 16-byte message ‘I love you Bobby’,
where each character is encoded using one-byte (8-bit) ASCII encoding.
Assume that the message is encrypted using the (64-bit) DES block ci-
pher, in OFB mode. Show how an attacker can modify the ciphertext
message to result in the encryption of ‘I hate you Bobby’.
2. Can you repeat for CFB mode? Show how, or explain why not.
3. Can you repeat for CBC mode? Show how, or explain why not.
4. Repeat the previous items, if we append to the message its CRC, and verify
it upon decryption.

Exercise 3.23. 1. Our definition of FIL CBC-MAC assumed that the input
consists of a whole number of blocks. Extend the construction to allow input of
arbitrary length, and prove its security.
2. Repeat, for VIL CBC-MAC.
Exercise 3.24. Consider a variant of CBC-MAC, where the value of the IV
is not a constant, but instead the value of the last plaintext block, i.e.:

CBC-MAC^E_k(m1 ++ m2 ++ … ++ mη) = { c0 ← mη; (i = 1…η) ci = Ek(mi ⊕ ci−1); output cη }    (3.14)

Is this a secure MAC? Prove, or present a convincing argument.
Exercise 3.25. Let E be a secure PRF. Show that the following are not secure
MAC schemes.

1. ECB-encryption of the message.


2. The XOR of the output blocks of ECB-encryption of the message.

Exercise 3.26 (MAC from a PRF). In Exercise 2.35 you were asked to
construct a PRF, with input, output and keyspace all of 64 bits. Show how to
use such a (candidate) PRF to construct a VIL MAC scheme.

Exercise 3.27. This question discusses a (slightly simplified) vulnerability in
a recently proposed standard. The goal of the standard is to allow a server
S to verify that a given input message was ‘approved’ by a series of filters,
F1, F2, …, Ff (each filter validates certain aspects of the message). The server
S shares a secret ki with each filter Fi. To facilitate this verification, each
message m is attached with a tag; the initial value of the tag is denoted T0,
and each filter Fi receives the pair (m, Ti−1) and, if it approves of the message,
outputs the next tag Ti. The server S will receive the final pair (m, Tf) and use
Tf to validate that the message was approved by all filters (in the given order).
A proposed implementation is as follows. The length of the tag is the
same as that of the message and of all secrets ki, and the initial tag T0 is
set to the message m. Each filter Fi signals approval by setting Ti = Ti−1 ⊕ ki.
To validate, the server receives (m, Tf) and computes m' = Tf ⊕ k1 ⊕ k2 ⊕ … ⊕ kf.
The message is considered valid if m' = m.

1. Show that in the proposed implementation, if the tag Tf is computed as
planned (i.e., as described above), then the message is considered valid if
and only if all filters approved of it.
2. Show that the proposed implementation is insecure.
3. Present a simple, efficient and secure alternative design for the validation
process.
4. Present an improvement to your method, with much improved, good per-
formance even when messages are very long (and having tag as long as
the message is impractical).

Note: you may combine the solutions to the two last items; but separating
the two is recommended, to avoid errors and minimize the impact of errors.
Exercise 3.28 (Single-block authenticated encryption?). Let E be a block ci-
pher (or PRP or PRF), for input domain {0,1}^l, and let l' < l. For input
domain m ∈ {0,1}^(l−l'), let fk(m) = Ek(m ++ 0^l').

1. Prove or present a counterexample: f is a secure MAC scheme.

2. Prove or present a counterexample: f is an IND-CPA symmetric encryp-
tion scheme.

Exercise 3.29. Let F : {0,1}^κ × {0,1}^(l+1) → {0,1}^(l+1) be a secure PRF,
where κ is the key length, and both inputs and outputs are l+1 bits long. Let
F' : {0,1}^κ × {0,1}^2l → {0,1}^(2l+2) be defined as: F'k(m0 ++ m1) = Fk(0 ++
m0) ++ Fk(1 ++ m1), where |m0| = |m1| = l.

1. Explain why it is possible that F 0 would not be a secure 2l-bit MAC.
2. Present an adversary and/or counter-example, showing F 0 is not a secure
2l-bit MAC.
3. Assume that, indeed, it is possible for F 0 not to be a secure MAC. Could
F 0 then be a secure PRF? Present a clear argument.

Exercise 3.30. Given a keyed function fk (x), show that if there is an efficient
operation ADD such that fk (x + y) = ADD(fk (x), fk (y)), then f is not a
secure MAC scheme. Note: a special case is when ADD(a, b) = a + b.

Exercise 3.31 (MAC from other block cipher modes). In subsection 3.5.2 we
have seen that, given an n-bit block cipher (E, D), the CBC-MAC, as defined
in Eq. (3.7), is a secure n·η-bit PRF and MAC, for any integer η > 0; and in
Ex. 3.3 we have seen this does not hold for CTR-mode MAC. Does this
property hold for...

ECB-MAC, defined as: ECB-MAC^E_k(m1 ++ … ++ mη) = Ek(m1) ++ … ++ Ek(mη)

PBC-MAC, defined as: PBC-MAC^E_k(m1 ++ … ++ mη) = (m1 ⊕ Ek(1)) ++ … ++ (mη ⊕ Ek(η))

OFB-MAC, defined as: OFB-MAC^E_k(m1 ++ … ++ mη) = pad0, (m1 ⊕ Ek(pad0)) ++ … ++ (mη ⊕ Ek(padη−1)), where pad0 is random (and, as in OFB mode, padi = Ek(padi−1)).

CFB-MAC, defined as: CFB-MAC^E_k(m1 ++ … ++ mη) = c0, c1, …, cη, where c0 is random and ci = mi ⊕ Ek(ci−1) for i ≥ 1.

XOR-MAC, defined as: XOR-MAC^E_k(m1 ++ … ++ mη) = ⊕_{i=1..η} Ek(i ⊕ Ek(mi))

Justify your answers, by presenting a counterexample (for incorrect claims)
or by showing how, given an adversary against the MAC function, you can
construct an adversary against the block cipher.

Chapter 4

Cryptographic Hash Functions

In this chapter, we discuss cryptographic hash functions. Cryptographic hash


functions are some of the most widely-used cryptographic schemes, with many
diverse properties, uses and applications. In fact, in the previous chapter we
mentioned one application: construction of a MAC scheme from a crypto-
graphic hash function; in this chapter, we will discuss such constructions of
MAC from hash - as well as other applications, such as blockchains.

4.1 Introducing Crypto-hash functions, their goals and


applications
Hash functions map from variable input length (VIL) strings to a fixed-length
output of n bits; the output is often referred to as the digest. Since the input
may be arbitrarily long, and the output is always n bits, the basic property
of hash functions is compression; of course, compression may be done trivially,
e.g., by truncating; hash functions are expected to satisfy additional proper-
ties, as we will discuss.
As illustrated in Figure 4.1, hash functions may be keyed (hk(·) : {0,1}^n ×
{0,1}* → {0,1}^n) or keyless (h(·) : {0,1}* → {0,1}^n). The key k is (usually)
non-secret; you will find a few works referring to the use of hashes with secret
keys, but usually they actually refer to constructions of MAC functions from
hash functions, a topic we will discuss; we do not discuss other uses of hash
with secret keys. For keyed hash, assume, for simplicity, that the key is of the
same length n as the digest.

Digest length as a parameter n and the h^(n)(m) notation. We defined
the digest (output of the hash) to be a string of n bits. Is n a fixed parameter of
the hash function? In practice, using published standards for hash functions,
the length is indeed fixed, e.g., 160 bits; see subsection 4.1.4. However, our
security measures are defined as a function of n, i.e., view n as a parameter.
Formally, we are supposed, therefore, to include the digest length n as part
of the input to h, e.g., h^(n)(m) (keyless) or h^(n)_k(m) (keyed). However, such
Figure 4.1: Hash functions: mapping from a variable-length input to an n-bit
output. (a) Keyless hash function, h(·) : {0,1}* → {0,1}^n. (b) Keyed hash
function, hk(·) : {0,1}^n × {0,1}* → {0,1}^n.

notation is cumbersome; therefore, we omit the (n) superscript, except in the
security definitions, which are based on the use of n as a parameter.
For simplicity, when discussing keyed hash functions, we use n to denote
the length of both digest and key.

4.1.1 Warmup: hashing for efficiency


Before we focus on crypto-hash functions, we first discuss briefly the use of
hash functions for randomly mapping data, as used (also) for load-balancing
and other ‘classical’, non-adversarial scenarios. Our goal is to provide intuition
for the required security properties - and awareness of some of the challenges.
A common application of hash functions, including non-cryptographic hash
functions, is to map the inputs into the possible digests values (‘bins’) in a
‘random’ manner, i.e., with a roughly equal probability of assignment to each
bin (digest value). For the typical case of n-bit digest values, there would be
N = 2n bins. This property is used in many algorithms and data structures,
to improve efficiency and fairness. For non-security applications, the exact
properties required from this ‘random’ mapping are often vague, or defined in
terms of statistical tests that the function should pass.
A typical application of non-cryptographic hash functions is illustrated
in Fig. 4.2. Here, a hash function h maps from the set of names (given as
unbounded-length strings), to a smaller set, say the set of n-bit binary strings.
The goal of applying a hash function here is load balancing of the number of
entries assigned to each bin, i.e., that each ‘bin’ will be assigned roughly the
same number of names; in particular, if the number of names mapped is much
less than the number of bins (2^n), we can expect very few or no collisions,
i.e., two names mapped to the same bin. Such load-balancing is important for
efficiency and fairness.
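To illustrate, here is a small Python sketch (names and parameters are illustrative) that uses a cryptographic hash to map names into bins and checks that the resulting load is roughly balanced:

```python
import hashlib
from collections import Counter

def bin_of(name, n_bits=8):
    # Map a name to one of 2^n_bits bins via a cryptographic hash (illustrative).
    digest = hashlib.sha256(name.encode()).digest()
    return int.from_bytes(digest[:4], "big") % (1 << n_bits)

names = [f"user{i}" for i in range(10_000)]
counts = Counter(bin_of(name) for name in names)
# With 256 bins and 10,000 names, we expect roughly 39 names per bin.
print(len(counts), max(counts.values()), min(counts.values()))
```

With ‘natural’ inputs like these, the loads cluster tightly around the expected 39 names per bin; the adversarial case, discussed next, is where the inputs are chosen maliciously.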
Of course, in cryptography, and cybersecurity in general, we mainly consider
adversarial settings. In the context of load-balancing applications as shown in

Figure 4.2: Typical load-balancing application of a non-cryptographic hash
function h

Figure 4.3: Algorithmic Complexity Denial-of-Service Attack exploiting an
insecure hash function h to cause many collisions

Fig. 4.2, this refers to an adversary who can manipulate some of the input
names, and whose goal is to cause imbalanced allocation of names to bins, i.e.,
many collisions - which can cause bad performance.
Consider an attacker whose goal is to degrade the performance for a particu-
lar name, say Bob. This attacker may provide to the system a list of deviously-
crafted names x1 , x2 , . . ., especially selected such that all of them ‘collide’ - i.e.,
are mapped to the same bin as Bob, i.e., h(‘Bob’) = h(x1 ) = h(x2 ) = . . .. The
attack is illustrated in Fig. 4.3. This method is sufficient to significantly impair
the performance of many algorithms and systems, in a so-called Algorithmic
Complexity Denial-of-Service Attack. One way in which attackers may exploit
such an attack is to cause excessive overhead for network security devices, such
as malware/virus scanners, Intrusion-Detection Systems (IDS) and Intrusion-

Figure 4.4: With a secure cryptographic hash function and N bins, the at-
tacker needs to compute the hash for about N inputs until finding one that
has the same hash as Bob. Typically the digest is n bits long, and then N = 2^n.

Prevention Systems (IPS), causing these systems to ‘give up’ and allowing at-
tacks to penetrate undetected. We will discuss this and other Denial-of-Service
(DoS) attacks in volume II, which focuses on network security.
Such vulnerability of ‘classical’ hash functions, which were not designed
for security, is not surprising. Instead, it is indicative of the need for well-
defined security requirements for (cryptographic) hash functions, following the
attack model and security requirements principle (Principle 1). In particular,
using a secure cryptographic hash function, with the correct security property,
should foil the Algorithmic Complexity Denial-of-Service Attack of Fig. 4.3.
Specifically, each name that the adversary chooses is mapped to a ‘random’
bin; only one in roughly 2^n names will match the ‘target’, i.e., h(xi) = h(‘Bob’).
Note that some hash functions may seem to provide sufficiently-randomized
mapping when the inputs are ‘natural’ but may still allow an attacker to easily
select inputs that will hash to a specific bin (e.g., the one Bob is mapped to).
See the following exercise.

Exercise 4.1. Given an alphabetic string x, let num(x, i) be the alphabetical
position of the ith letter in x; e.g., if x = ‘abcdef’, then num(x, i) = i. Consider
the hash function h(x) = Σ_{i=1}^{|x|} num(x, i) mod 26, i.e., the sum of all the
letters (mod 26). Show how an attacker may easily generate a set C of strings
colliding with ‘bob’, i.e., (∀c ∈ C) h(c) = h(‘bob’). The attacker should not
need to compute the hash value for many different strings. Give 3 examples
of such strings. Note: the strings do not have to be ‘real names’ - any
alphabetic string is allowed.

Solution: see chapter 10.


When the number of bins 2^n is large enough, finding such a collision
(match), with the given value ‘Bob’, becomes infeasible - assuming that the
adversary can only randomly select values and test each of them by computing
the hash. More precisely, given a random preimage of the hash function, in this
case ‘Bob’, it should be infeasible for the attacker to find a second preimage
(collision) m s.t. h(m) = h(‘Bob’). This property is called second preimage
resistance (SPR). See Fig. 4.4.
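The cost of such a brute-force search grows as 2^n. The following Python sketch makes this tangible by using SHA-256 truncated to a deliberately tiny 16-bit digest (illustrative only; real digests are far longer), so the search is feasible to run:

```python
import hashlib
from itertools import count

def h16(data):
    # Deliberately tiny 16-bit digest (truncated SHA-256), so that the
    # brute-force search below is feasible; real digests are far longer.
    return hashlib.sha256(data).digest()[:2]

target = h16(b"Bob")
for trials in count(1):
    candidate = b"x%d" % trials  # the attacker just tries arbitrary strings
    if candidate != b"Bob" and h16(candidate) == target:
        break
print(trials)  # on the order of 2^16 = 65536 attempts, on average
```

Doubling the digest length squares the expected number of attempts, which is why practical digests of 160 or more bits put such searches far out of reach.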
Note that if the number of bins is not sufficiently large, the adversary may
still be able to find collisions - on average, 1/N of the inputs will match the
desired bin. One may try to address this by using a secret mapping, e.g., a
pseudo-random function (PRF), instead of relying on hash functions. Note
that we consider only keyed hash where the key is not secret; hence, such hash
functions cannot be used for this role (as a replacement for the PRF).

4.1.2 Goals and requirements for crypto-hashing


The usefulness of hash functions stems from their diverse security properties.
Roughly, these properties fall into three broad goals: integrity, i.e., ensuring
uniqueness of the message; confidentiality, i.e., ‘hiding’ the contents of the mes-
sage; and randomness, i.e., ensuring that unknown values (of input or output)
are pseudorandom.
These goals are broad, and motivate different security requirements. Defin-
ing the security requirements for cryptographic hash functions is quite tricky;
in particular, there are several distinct requirements, required and motivated
by the many different applications for cryptographic hash functions. Some
of the definitions may appear similar on first sight - but the differences are
meaningful and critical.

Goal             Requirement                    Abridged description
Integrity        Collision resistance           Can’t find a collision (m, m'), i.e.,
                 (CRHF; Definition 4.1)         m ≠ m' yet h(m) = h(m').
Integrity        Second-preimage resistance     Can’t find a collision to random m:
                 (SPR; Definition 4.6)          m' ≠ m yet h(m) = h(m').
Confidentiality  One-way function               Given h(m) for random m,
                 (OWF; Definition 4.8)          can’t find m' s.t. h(m) = h(m').
Randomness       Randomness extracting          Choose m except n random bits;
                 (Definition 4.9)               then the output is pseudorandom.
All              Random oracle model            Consider h as a random function.
                 (ROM; §4.6)

Table 4.1: Goals and requirements for cryptographic hash functions. The
abridged descriptions are presented for a keyless crypto-hash h : {0,1}* →
{0,1}^n. For a keyed hash h : {0,1}^n × {0,1}* → {0,1}^n, use hk instead of
h, with a random key k. In SPR and OWF, the message m is random.

We discuss four security requirements, which we consider to be most impor-


tant for applied cryptography: collision resistance, one-way function (also re-
ferred to as preimage resistance), second-preimage resistance, and randomness
extraction. Table 4.1 maps the three goals to the corresponding requirements,
with a brief description of each requirement (and reference to the relevant sec-
tion). We also include the random oracle model (§ 4.6), where we analyze

the security of a system using crypto-hash functions as if we were using a
randomly-chosen function instead; the ROM is often adopted to allow analysis
of designs using cryptographic hash functions, without identifying a feasible
security requirement which ensures security of the design.
Here is an exercise which will strengthen your (still intuitive) understanding
of the different security requirements in Table 4.1. A solution is provided for
part (a); if you have difficulties solving part (b), try again after reading the
following sections, which discuss each of these requirements in detail.

Exercise 4.2 (Insecure hash example). Show that the following hash functions
fail to provide the security properties defined above: (a) h(x) = x mod 2^n - all
properties; (b) h'(x) = x^2 mod 2^n - all except OWF.

Solution for item (a): We first show that h is not SPR (and hence surely
not collision resistant). Namely, assume we are given a random input x; let
x' = x + 2^n. Clearly x' ≠ x, and yet h(x') = (x + 2^n) mod 2^n = x mod 2^n =
h(x); namely, x' is a collision (second preimage) with x.
We next show that h is not a one-way function (OWF). Specifically, given
h(x) for any preimage x, let x' = h(x); clearly:

h(x') = x' mod 2^n = h(x) mod 2^n = (x mod 2^n) mod 2^n = x mod 2^n = h(x)

Namely, x' is a preimage of h(x), and hence h is not an OWF.
Finally, we show that h is not a randomness-extracting hash function. Specif-
ically, let x = r ++ 0^n, where r is a random n-bit string. Then h(x) = (r ++ 0^n)
mod 2^n = 0^n, which is obviously not a random string.
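The three failures shown in the solution can also be checked mechanically; a small Python sketch (with an illustrative choice of n):

```python
import secrets

n = 32
def h(x):
    # The insecure hash of Exercise 4.2(a): h(x) = x mod 2^n.
    return x % (1 << n)

x = secrets.randbits(64)
# Not SPR: x + 2^n is always a second preimage of x.
assert h(x + (1 << n)) == h(x)
# Not one-way: the digest h(x) is its own preimage.
assert h(h(x)) == h(x)
# Not randomness-extracting: r ++ 0^n (as the integer r * 2^n) hashes to 0.
r = secrets.randbits(n)
assert h(r << n) == 0
print("all checks passed")
```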

4.1.3 Applications of crypto-hash functions


The broad security requirements of cryptographic hash functions facilitate their
use in many systems and for an extensive variety of applications. These differ-
ent applications and systems rely on different security requirements. As in any
security system, it is important to identify the exact security requirements and
assumptions; however, published designs, and even standards, do not always
identify the requirements, or state them imprecisely. Important applications
of cryptographic hash functions, which we map to the corresponding necessary
requirements in Table 4.2, include:

Integrity, digests and blockchain : the (short) digest h(m) allows valida-
tion of the integrity of the (long) message (or file) m. Digest schemes
and blockchains allow efficient validation of sequences of multiple mes-
sages (or files), optionally collected into blocks, with efficient validation
of only specific message(s)/file(s).
Hash-then-Sign : possibly the most well-known use of hash functions: facil-
itate signatures over long (VIL) documents. This uses the hash-then-sign

Application                                Sufficient requirements
Integrity, Merkle-tree, blockchain (§4.2)  Collision resistance
Hash-then-sign (§4.2.6)                    Collision resistance
One-time password (§4.4.1)                 OWF
Password hashing, OTP-chain (§9.1)         ROM
One-time signatures (§4.4.2)               OWF (for VIL, also CRHF)
Proof-of-Work (§4.6)                       ROM
Key/random generation (§4.5)               Randomness extracting; ≥ n random input bits
Random map (§4.1.1)                        SPR (for large n)

Table 4.2: Applications of crypto-hashing, and the corresponding requirements.

paradigm, i.e., applying RSA or another FIL signing function to a digest
h(m) of the message m being signed; see subsection 4.2.6.
Login when server file may be compromised : Hash functions are used
to improve the security of password-based login authentication, in several
ways. The most widely deployed method is using a hashed password file,
which makes exposure of the server’s password file less risky - since it
contains only the hashed passwords. Another approach is to use a hash-
based one-time password, which is a random number allowing the server
to authenticate the user, with the drawbacks of single use and of having to
remember or store this random number. One-time passwords are improved
into OTP-chain (aka hash-chain), to allow multiple login sessions with
the same credential. See §9.1 and §4.4.1.
Proof-of-Work : cryptographic hash functions are often used to provide
Proof-of-Work (PoW), i.e., to prove that an entity performed a con-
siderable amount of computation. This is used by Bitcoin and other
cryptocurrencies, and for other applications. See subsection 4.9.3.
Key derivation and randomness generation : hash functions are used to
extract random, or pseudorandom, bits, given input with ‘sufficient ran-
domness’. In particular, this is used to derive secret shared keys. See
§4.5.
Random map : as discussed above (§4.1.1), hash functions are often used,
usually for efficiency, for ‘random’ mapping of inputs into ‘bins’, i.e., the
2^n possible digests.

4.1.4 Standard cryptographic hash functions


Due to their efficiency, simplicity and wide applicability, cryptographic hash
functions are probably the most commonly used ‘cryptographic building blocks’,

as discussed in the cryptographic building blocks principle (Principle 8). This
implies the importance of defining and adopting standard functions, which can
be widely evaluated for security - mainly by cryptanalysis - and the need for
definitions of security.
There have been many proposed cryptographic hash functions; however,
since security is based on failed efforts for cryptanalysis, designers usually avoid
less-well-known (and hence less tested) designs. The most well-known cryptographic hash functions include the MD4 and MD5 functions proposed by RSA Inc., the SHA-1, SHA-2¹ and SHA-3 functions standardized by NIST ( [67, 141]), the RIPEMD and RIPEMD-160 standards, and others, e.g., BLAKE2.
Several of these, however, were already ‘broken’, i.e., shown to fail some of
the requirements (discussed next). In particular, collisions - and specifically,
chosen-prefix collisions - were found for RIPEMD, MD4 and MD5 in [158], and
later also for SHA-1 [120]. As a result, these functions should be avoided and
replaced, at least in the (many) applications which depend on the collision-
resistance property.
Note that existing standards define only keyless cryptographic hash func-
tions. However, as we later explain, there are strong motivations to use keyed
cryptographic hash functions, which use a random, public key (without a pri-
vate key). In particular, one of the security requirements, collision-resistance,
cannot be achieved by any keyless function. We later discuss constructions of
keyed hash functions from (standard) keyless hash functions.

4.2 Collision Resistant Hash Function (CRHF)


4.2.1 Keyless Collision Resistant Hash Function
(Keyless-CRHF)
A keyless hash function h(m) maps unbounded-length binary strings m ∈ {0, 1}*, to n-bit binary strings h(m) ∈ {0, 1}^n; we often refer to h(m) as the digest of m. Hence, there are infinitely many collisions, i.e., messages m ≠ m′ s.t. h(m) = h(m′). However, finding such collisions may not be easy when the domain of digests is large enough, i.e., for large n. Intuitively, we say that a hash function is collision resistant, if it is computationally-hard to find any collision, as illustrated in Figure 4.5.
The definition follows. Notice that we use the explicit notation h^(n)(m), to emphasize that the hash function is defined for arbitrary digest (output) length n, where n is the security parameter, i.e., the adversary is limited to run in time polynomial in n, and the adversary's advantage can be a function of n. The common notation h(m), where n is not explicitly expressed, is only 'shorthand' for this 'real' notation.

¹ The SHA-2 specification defines six variants, with digest lengths of 224, 256, 384, or 512 bits; these variants are named SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256.

Figure 4.5: Keyless collision resistance: for sufficient digest length n = |h(·)|, it is infeasible to efficiently find a collision, i.e., a pair of inputs x, x′ ∈ Domain which are mapped by hash function h to the same output, h(x) = h(x′), except with negligible probability.

Definition 4.1 (Keyless Collision Resistant Hash Function (CRHF)). A keyless hash function h^(n)(·) : {0, 1}* → {0, 1}^n is collision-resistant if for every efficient (PPT) algorithm A, the advantage ε^CRHF_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

$$\varepsilon^{\mathrm{CRHF}}_{h,A}(n) \equiv \Pr\left[ (x, x') \leftarrow A(1^n)\ \text{s.t.}\ (x \neq x') \wedge \left(h^{(n)}(x) = h^{(n)}(x')\right) \right] \tag{4.1}$$

where the probability is taken over the random coin tosses of A.

4.2.2 Are there Keyless CRHFs?


Standard cryptographic hash functions, discussed in subsection 4.1.4, are all keyless; and practical deployments almost always use these designs. By now, the readers should not be surprised to learn that none of these were proven secure; we discussed in § 2.7 the fact that 'real', unconditional proofs of security for (most) cryptographic schemes would imply that P ≠ NP and therefore are not to be expected, and that cryptographic hash functions are among the basic cryptographic building blocks which are typically validated by accumulated evidence of failed attempts to cryptanalyze them.

However, the following lemma may be surprising: all keyless hash functions fail to satisfy the keyless-CRHF definition (Definition 4.1). Namely, a keyless CRHF, under this definition, simply does not exist; the answer to the question in the title of this subsection is simply no, there are no keyless CRHFs. We present and prove this, and then discuss the implications.
The reader is quite right to be surprised and suspect some 'trick' here - after all, we just explained that all standard cryptographic hash functions are keyless! Well, that is correct: the proof is very simple - but uses what some may consider an 'unfair trick' against the poor victim keyless hash function. The proof will show that for any given keyless hash function h, there exists a trivial, efficient adversarial algorithm A_h, that outputs a collision for h, i.e., a pair (m, m′) s.t. m ≠ m′ but h(m) = h(m′). Furthermore, A_h is not just efficient in n: its time complexity is basically the time to print out the collision; it does almost nothing else. Ah, and more: A_h does not just succeed with a 'significant' probability; it always succeeds. Namely, h is very, very far from satisfying the requirements of Definition 4.1!

Ok, so now we hope you are already intrigued and want to see the trick - or did you already detect it? In fact, we have essentially already done the trick - 'hidden in plain sight' exactly in the paragraph above. Can you find it? Try to, before you read the proof, then compare. This exercise should be fun and instructive. We explain the 'trick' right after the proof.
As in Definition 4.1, the lemma refers to a keyless hash function as h^(n)(·), where n is the length of the range (output) of h^(n)(·), i.e., n = |h^(n)(x)| for every string x ∈ {0, 1}*. The output length n must be an explicit parameter, since the keyless CRHF definition requires the probability of a collision to be negligible in n (for an attacker whose runtime is bounded by a polynomial in n); see discussion of Definition 4.1.

Lemma 4.1 (Keyless CRHFs do not exist). Let h^(n) : {0, 1}* → {0, 1}^n be a keyless hash function. Then h^(n)(·) is not a keyless CRHF (as defined in Definition 4.1).

Proof: Given h^(n)(·), we prove that there exists an efficient adversary algorithm A_{h^(n)} that always finds a collision, i.e., ε^CRHF_{h,A}(n) = 1 - clearly showing that h^(n)(·) does not satisfy the definition of a keyless CRHF.

Before we prove that such A_{h^(n)} exists, first note that, since the domain of h^(n) is unbounded while the range is the bounded set {0, 1}^n, then h^(n) must have collisions, i.e., pairs of binary messages m ≠ m̂ s.t. h(m) = h(m̂). Let m^(n), m̂^(n) denote one such collision; we have such a pair for each digest length n. It does not matter how we pick the particular collision, and we do not assume an efficient algorithm for picking it; for example, given n, we can pick the collision m^(n), m̂^(n) where m^(n) has the smallest value (among all collisions of h^(n)), and, if m^(n) collides with multiple other strings, pick m̂^(n) to be the smallest string that collides with m^(n). So these values m^(n), m̂^(n) are well defined (even if we did not show any efficient algorithm to find them).

Define the adversarial algorithm simply as: A_{h^(n)}(1^n) = (m^(n), m̂^(n)). Namely, this algorithm has 'burned in' a table of collisions (m^(n), m̂^(n)) for every value of the security parameter n. Clearly, there is such an algorithm - we just defined it, in fact, when we defined the contents of this 'table of collisions' and the trivial algorithm that looks up this table and outputs a collision. Furthermore, this algorithm is polynomial-time (in PPT), since it only looks up the collision and outputs it, as readers familiar with the basics of the theory of computation should be able to confirm (see [81]). Therefore, h^(n)(·) is not a keyless CRHF (as defined in Definition 4.1).
Ok, so the 'trick' was that we proved that such an attacker A_h exists - but in a non-constructive way, i.e., we did not present such an adversary or show an efficient way to find it. We only showed that such an adversary exists.
This possibly-surprising lemma shows that there is no keyless CRHF, as de-
fined in Definition 4.1. Of course, there may be some other reasonable definition

for keyless CRHFs, to which the lemma does not apply; indeed, we later define Second-Preimage Resistant (SPR) hash, which is essentially a weaker collision-resistance property, and there are other related definitions in the cryptographic literature. However, all definitions we are aware of are significantly different from CRHF.
Keyless CRHF remain an important and useful concept - although there
are no keyless CRHFs. In particular, all existing standard hash functions are
keyless. Cryptanalysts spend significant efforts to find collisions; where found,
e.g., for MD5 and SHA-1, we migrate to supposedly-stronger standards, e.g.,
SHA-2 and SHA-3. So, finding collisions to these functions appears to be ‘hard’
- but only in the intuitive, everyday sense of the word, and not as a well-defined
concept.
However, cryptographic hash functions are one of the few cryptographic
building blocks, and per Principle 8, we want to use such building blocks to
construct other cryptographic mechanisms, and prove the security of such con-
structions by reduction to the security of the building block. That cannot be
achieved when we do not have a precise definition of the building block, or, in
this case, when we know that there are no keyless CRHFs.
In this textbook, we deal with relatively simple applications of CRHF, and
in most of the follow-up chapters, our discussion is less formal. Therefore, for
simplicity, we will usually use keyless CRHF; in all of the constructions and
designs, it is easy to add the ‘missing’ keys, when desired (e.g., for provably
secure reductions). Namely, we consider the use of keyless CRHFs as a conve-
nient simplification. A justification may be that if the system using the hash
is insecure, then we may be able to use the attack to find the collision - which
is assumed hard; and anyway in ‘real’ implementations, the output length is
fixed. In fact, many cryptographic designs use an even stronger simplification
- the random oracle model (ROM), which we discuss in §4.6.
Another approach is to design the application without assuming CRHF at
all, and instead, rely on other properties, which may exist for keyless hash,
such as one-way function (OWF), second-preimage resistance (SPR), or other
properties (which we do not cover). SPR is especially relevant, as it is, essen-
tially, a weak form of collision resistance; in fact, it is sometimes even referred
to as weak collision resistance. Of course, care must be taken to ensure that SPR is really sufficient for the application; there could be subtle vulnerabilities due to the use of SPR in an application requiring 'real' collision-resistance. We discuss SPR in § 4.3.
A final alternative is to use a keyed CRHF instead of a keyless CRHF. Considering that existing standards define only keyless hash, a common approach is to use a construction of a keyed hash from a keyless hash. Often, this would be the HMAC construction, originally designed as a construction of a MAC function from hash; we discuss HMAC in subsection 4.6.1. We discuss keyed collision resistance next.

Figure 4.6: Keyed collision resistant hash function (CRHF): given random key k, it is hard to find a collision for h_k, i.e., a pair of inputs x, x′ ∈ {0, 1}* s.t. x ≠ x′ yet h_k(x) = h_k(x′).

4.2.3 Keyed Collision Resistance


We next discuss keyed collision resistant hash functions (keyed CRHF). The
definition for keyed CRHF seems very similar; the only difference is that the
probability is also taken over the key, and the key is provided as input to the
adversary. Recall that, for simplicity, we use n as the length of both the digest
and the key; hence, we do not need to provide n as an additional input (since
it is equal to the key length). We next define keyed collision resistance, which
we illustrate in Figure 4.6.
Definition 4.2 (Keyed Collision Resistant Hash Function (CRHF)). A keyed hash function h_k(·) : {0, 1}* × {0, 1}* → {0, 1}* is collision-resistant if for every efficient (PPT) algorithm A, the advantage ε^CRHF_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

$$\varepsilon^{\mathrm{CRHF}}_{h,A}(n) \equiv \Pr_{k \leftarrow \{0,1\}^n}\left[ (x, x') \leftarrow A(k)\ \text{s.t.}\ (x \neq x') \wedge (h_k(x) = h_k(x')) \right] \tag{4.2}$$

where the probability is taken over the random coin tosses of A and the random choice of k.

Target Collision Resistant (TCR) vs. ACR / Keyed CRHF. Definition 4.2 uses the term keyed CRHF, following Damgård [51]. Another term for this definition is any collision resistance (ACR hash), proposed by Bellare and Rogaway in [24]. They preferred this term, to emphasize that this definition allows the attacker to choose the specific collision as a function of the key, since the key is given to the attacker before the attacker outputs the entire collision (both x and x′ s.t. h_k(x) = h_k(x′)).

Bellare and Rogaway preferred the term ACR to the term 'keyed CRHF', to emphasize the difference from a weaker notion of collision-resistance
Figure 4.7: Target collision resistant (TCR) hash: the adversary cannot find a target x for which it would be able to find a collision x′ once it is given the random key k.

that they (and we) call² Target Collision Resistant (TCR) hash. The term TCR emphasizes that, to 'win against' the TCR definition, the attacker has to first select the target x, i.e., one of the two colliding strings, before it receives the (random) key k. Only then is the attacker given the random key k, and it has to output the colliding string x′ s.t. h_k(x) = h_k(x′). Intuitively, this makes sense: it seems that in most applications, a collision between two 'random' strings x, x′ may not help the attacker; the attacker often needs to match some specific 'target' string x. The TCR definition still allows the attacker to choose the target - but at least not as a function of the key!
We next define target collision resistance, which we illustrate in Figure 4.7.

Definition 4.3 (Target collision resistant (TCR) hash). A keyed hash function h_k(·) : {0, 1}* × {0, 1}* → {0, 1}* is called a target collision-resistant (TCR) hash, if for every efficient (PPT) algorithm A, the advantage ε^TCR_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

$$\varepsilon^{\mathrm{TCR}}_{h,A}(n) \equiv \Pr_{k \leftarrow \{0,1\}^n}\left[ \begin{array}{l} x \leftarrow A(1^n); \\ x' \leftarrow A(x, k) \end{array}\ \text{s.t.}\ (x \neq x') \wedge (h_k(x) = h_k(x')) \right] \tag{4.3}$$

where the probability is taken over the random coin tosses of A and the random choice of k.

TCR is a useful variant, since it is 'strong enough' for many applications; yet, it may be easier to construct a TCR hash, as the challenge to the attacker is harder. Furthermore, TCR hash is not subject to the 'birthday attack', discussed in the next subsection. This may allow a TCR hash to be safely used with a shorter digest - a potentially important benefit.
² TCR is a new name for a notion defined earlier by Naor and Yung in [133], under the name universal one-way hash functions (UOWHF).

4.2.4 Birthday and exhaustive attacks on CRHFs

Find collisions in exponential time. Both Definition 4.2 and Definition 4.1 may appear to be overly-permissive, in two ways: we restrict the adversary to run in polynomial time (PPT) in the length n of the digest, and we allow the adversary to have a negligible probability of finding a collision. However, without these two 'relaxations', the definition would not be feasible. Namely, we show that for every hash h, we can find collisions in exponential time; or, if we prefer, we can find collisions in polynomial time - but with exponentially-small probability. We will focus here on keyless hash, but the argument holds almost unchanged for keyed hash.

Let us first argue that an adversary which can run in time exponential in n would be able to find a collision. Consider a hash function h : {0, 1}* → {0, 1}^n, and a set X containing 2^n + 1 distinct input binary strings. The output of h is the set of n-bit strings, which contains 2^n elements; hence, there must be at least two elements x ≠ x′ in the set X which collide, i.e., h(x) = h(x′). An adversary that runs in time exponential in n can surely compute h(x) for every element in X and find this collision. Hence, the definitions restrict the adversary to run in time polynomial in n. This argument clearly holds for keyed-hash (and TCR hash) as well.

Find collisions with exponentially-small probability. We next extend the argument and show, for any hash function h : {0, 1}* → {0, 1}^n, a PPT algorithm (i.e., a probabilistic algorithm that runs in time polynomial in n) with non-zero probability to find a collision. Consider the same set X as before, and an algorithm that selects two random elements in X; with small probability, this algorithm would output a collision x ≠ x′ s.t. h(x) = h(x′). Therefore, the definitions allow the adversary to have negligible probability of finding a collision. Again, we expressed the argument for a keyless hash h : {0, 1}* → {0, 1}^n, but it can easily be adapted for keyed hash.

We next show that finding a collision actually requires only about √(2^n) = 2^(n/2) attempts; this is due to the birthday paradox.

The birthday paradox and attack on collision resistance. The above argument presented an algorithm that finds a collision by computing at most 2^n + 1 hash values. However, the expected number of hash-computations required to find a collision is only O(2^(n/2)) = O(√(2^n)), not O(2^n). This is due to the birthday paradox: in a room containing 23 persons, the probability of a collision, i.e., two people having a birthday on the same date, is about half - much more than intuitively expected. To understand why this is true, notice that when a person is added to a room currently containing i persons (with no collisions), the probability of a collision with some person in the room is i/365, not 1/365!
More precisely, the expected number q of messages {m_1, m_2, ..., m_q} which should be hashed before finding a collision h(m_i) = h(m_j) is approximately:

$$q \approx 2^{n/2} \cdot \sqrt{\pi/2} \approx 1.254 \cdot 2^{n/2} \tag{4.4}$$
Hence, to ensure collision-resistance against an adversary who can do 2^q computations, e.g., 2^80 hash calculations (q = 80), we need the digest length n to be roughly twice that size, e.g., 160 bits. Namely, the effective key length of a CRHF is only q = n/2. This motivates the fact that hash functions often have digest length twice the key length of shared-key cryptosystems used in the same system. Using longer digest length and/or longer key length does not harm security, but may have performance implications.
Note that the birthday attack applies to both keyed CRHF and keyless
CRHF; however, it does not apply to Target Collision Resistant (TCR) hash.
Can you see why? Carefully compare Definition 4.2 vs. Definition 4.3, and
you’ll find out!
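The birthday attack is easy to demonstrate on a toy hash with a deliberately short digest (here, SHA-256 truncated to n = 24 bits - an illustration only; real digests are far longer). We hash sequential inputs and watch for a repeated digest:

```python
import hashlib
from itertools import count

def h24(m: bytes) -> bytes:
    """A toy 24-bit hash: truncated SHA-256 (for demonstration only)."""
    return hashlib.sha256(m).digest()[:3]

seen = {}
for i in count():
    m = i.to_bytes(8, "big")
    d = h24(m)
    if d in seen:                 # collision found
        m1, m2 = seen[d], m
        break
    seen[d] = m

assert m1 != m2 and h24(m1) == h24(m2)
print(f"collision after {i + 1} hashes (vs. ~1.254 * 2^12 expected for n = 24)")
```

The number of hashes typically lands near 1.254 · 2^(n/2), matching Equation 4.4; exhaustively searching the full 2^24 digest space is never needed.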

4.2.5 CRHF Applications (1): File Integrity

Figure 4.8: Example of use of hash function h to validate integrity of file m


downloaded by a user in NY, from an untrusted repository in DC. To validate
integrity, the user downloads the (short) digest directly from the website of the
producer, in LA. This reduces network overhead - and load on the producer’s
website - compared to downloading the entire file from the producer’s website.

Collision resistance is a great tool for ensuring integrity. One common ap-
plication is to distribute a (large) object m, e.g., a file containing the executable
code of a program. Suppose the file m is distributed from its producer in LA,
to a user or repository in Washington DC (step 1 in Fig. 4.8). Next, a user in
NY is downloading the file from the repository (or peer user) in DC (step 3),
and, to validate the integrity, also the digest h(m) of the file, directly from the

producer in LA (step 2). By downloading the large file m from (nearby) DC,
the transmission costs are reduced; by checking integrity using the digest h(m),
we avoid the concern that the file was modified in DC or in transit between
LA, DC and NY.
A potential remaining concern is modification of the digest h(m) received
directly from producer in LA, by a Monster-in-the-Middle (MitM)3 attacker.
This may be addressed in different ways, including the use of a secure web
connection for retrieving h(m), as discussed in chapter 7, and/or receiving the
digest from multiple independent sources.
This method of download validation is deployed manually, by savvy users, or in an automated way, by the operating system, an application, or even a script running within a browser, e.g., see [77].
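A minimal sketch of the digest-based integrity check (our illustration, using SHA-256 via Python's hashlib; the file name and contents are hypothetical):

```python
import hashlib, os, tempfile

def file_digest(path: str) -> str:
    """SHA-256 digest of a file, computed incrementally in chunks."""
    hsh = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            hsh.update(chunk)
    return hsh.hexdigest()

# Simulate the scenario: the producer publishes the digest of the file;
# the user recomputes the digest of the downloaded copy and compares.
path = os.path.join(tempfile.mkdtemp(), "program.bin")
with open(path, "wb") as f:
    f.write(b"original executable contents")
published = file_digest(path)            # digest obtained directly from the producer

assert file_digest(path) == published    # intact copy: check passes
with open(path, "ab") as f:              # a tampered copy ...
    f.write(b"!")
assert file_digest(path) != published    # ... fails the check
```

Any modification of the downloaded file that preserves the digest would be a collision for the hash, which collision resistance rules out.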

4.2.6 CRHF Applications (2): Hash-then-Sign (HtS)


Collision-resistance is a powerful property; in particular, it facilitates one of the
most important applications of cryptographic hash functions - the hash-then-
sign (HtS) paradigm. The hash-then-sign paradigm is essential for efficient
deployment of public-key digital signatures, which we introduced in subsec-
tion 3.3.2. We present constructions for signature schemes based on public key
cryptosystems in § 6.7; and in subsection 4.4.2 we discuss one-time signatures
and present their constructions, based on one-way functions (OWFs). How-
ever, both approaches result in signatures for limited-length inputs, and for any
realistic application, must be applied using the Hash-then-Sign construction.

Keyless HtS is secure. Let us first recall the keyless Hash-then-Sign construction, which we discussed in subsection 3.3.2. This construction uses a keyless CRHF h : {0, 1}* → {0, 1}^n. Given a signature scheme (KG, S, V) whose input domain is (or includes) n-bit strings, i.e., {0, 1}^n, the keyless hash-then-sign signature of any message m ∈ {0, 1}* is defined as S^h_s(m) ≡ S_s(h(m)), where we use s for the private signing key of S (and of S^h_s).

We next show that provided that h is a Collision-Resistant Hash Function (CRHF), and that the signature scheme (S, V) is secure (existentially unforgeable, see Definition 3.2), then (S^h, V^h) is also secure.
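As a toy illustration of the wiring S^h_s(m) ≡ S_s(h(m)), the sketch below uses 'textbook' RSA with tiny, hypothetical parameters (p = 61, q = 53) as the FIL signature; it is deliberately insecure - the SHA-256 digest is even reduced mod N, only because the toy modulus is so small - and serves purely to show the construction, not as a real implementation:

```python
import hashlib

# Toy 'textbook RSA' signature (hypothetical tiny parameters; totally insecure):
# p = 61, q = 53, N = p*q = 3233; e*d = 17*2753 = 1 (mod phi(N) = 3120).
N, e, d = 3233, 17, 2753

def h(m: bytes) -> int:
    # Digest reduced mod N only because N is tiny; a real scheme signs the full digest.
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % N

def hts_sign(m: bytes) -> int:
    """Hash-then-Sign: S^h_s(m) = S_s(h(m)), here with S_s(x) = x^d mod N."""
    return pow(h(m), d, N)

def hts_verify(m: bytes, sigma: int) -> bool:
    return pow(sigma, e, N) == h(m)

sigma = hts_sign(b"Pay $5 to Bob")
assert hts_verify(b"Pay $5 to Bob", sigma)
```

Forging a signature on a different message would require either breaking the underlying signature or finding m′ with h(m′) = h(m) - exactly the two cases analyzed in the proof of Theorem 4.1.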

Theorem 4.1 (Keyless Hash-then-Sign is secure (existentially unforgeable)). Let (KG, S, V) be an existentially unforgeable signature scheme over the domain {0, 1}^n, and let h : {0, 1}* → {0, 1}^n be a keyless CRHF. Let S^h_HtS be the hash-then-sign signature as defined in Equation 3.5. Then S^h_HtS is an existentially unforgeable signature scheme over domain {0, 1}*.

Proof: Assume that the claim does not hold, i.e., that there is an efficient adversary A ∈ PPT s.t. ε^{eu-Sign}_{S^h_HtS,A}(n) ∉ NEGL(n), as defined in Equation 3.3. Namely, with significant probability, A outputs a pair (m, σ) s.t. A
3 Also called Man-in-the-Middle

did not provide m as input to the S^h_HtS.S_s(·) oracle, yet S^h_HtS.V_v(m, σ) holds. From Equation 3.6, S^h_HtS.V_v(m, σ) = S.V_v(h(m), σ). Let φ ≡ h(m); now, either there was another message m′ s.t. φ = h(m′) which A did provide as input to the S^h_HtS.S_s(·) oracle, or not. Let us consider both cases; at least one of the two must occur with significant probability.

If there is another message m′ s.t. φ = h(m′) which A provided as input to S^h_HtS.S_s(·), then the pair (m, m′), both produced by A, is a collision for h. This should be impossible to find efficiently, since h is assumed to be a CRHF.

If there was no such m′, then we can use A to construct an adversary A′ that will find a forgery for the original signature scheme (KG, S, V). The adversary A′ runs A, and whenever A makes an oracle query m, then A′ computes φ = h(m), makes a query for S_s(φ) and returns the result to A - obviously, this would be the expected value. Finally, when A returns the forgery (m, σ) for S^h_HtS, then A′ computes h(m) and returns (h(m), σ), which is the corresponding forgery for S.

Two HtS Constructions using keyed-hash. We now present two Hash-then-Sign (HtS) constructions from a keyed-hash function. We begin with a very simple construction by Damgård [51], which we refer to as Keyed HtS, as it is based on the use of a keyed-CRHF (also called any collision resistance (ACR) hash). This construction is identical to the keyless Hash-then-Sign construction, except for the use of a keyed hash h_k(·) : {0, 1}* × {0, 1}* → {0, 1}*. The hash key is selected once, during the key-generation process, and becomes part of the public verification key and of the private signing key. We define the Keyed-HtS construction S^h_HtS as follows.
Definition 4.4 (The Keyed-HtS construction). Given a signature scheme S with domain {0, 1}^n and a keyed hash (CRHF) h_k(·) : {0, 1}* × {0, 1}* → {0, 1}*, the Keyed-HtS signature using signature S and keyed-hash h is defined as follows:

$$S^h_{HtS}.KG(1^n) \equiv \left[\; (s, v) \xleftarrow{\$} S.KG(1^n);\quad k \xleftarrow{\$} \{0,1\}^n;\quad \text{return}\ ((s, k), (v, k)) \;\right] \tag{4.5}$$
$$S^h_{HtS}.S_{(s,k)}(m) \equiv S.S_s(h_k(m)) \tag{4.6}$$
$$S^h_{HtS}.V_{(v,k)}(m, \sigma) \equiv S.V_v(h_k(m), \sigma) \tag{4.7}$$
The Keyed HtS construction has two drawbacks: it requires distribution of a longer public key, which may necessitate changes in key-distribution mechanisms; and it requires the underlying keyed hash function to be a keyed-CRHF, also known as any-collision resistant (ACR). Bellare and Rogaway show, in [24], an almost-as-simple construction that requires only a weaker, target-collision resistant (TCR) keyed-hash function. A significant advantage is that TCR hash functions are not vulnerable to the birthday attack, and hence may be able to use significantly shorter digests (about half of the bits).
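The shape of the HtS-TCR construction - a fresh hash key per signature, transmitted with the signature - can be sketched as follows. This is an illustration only, with two loudly-flagged assumptions: the keyed hash h_k(m) = SHA-256(k || m) is a heuristic stand-in for a TCR hash, and an HMAC under the signer's secret stands in for the underlying FIL signature S.S_s (an HMAC is a MAC, not a public-key signature).

```python
import hashlib, hmac, os

# Hypothetical stand-in for the underlying FIL signature: an HMAC under the
# signer's secret s plays the role of S.S_s (only to show the wiring).
s = os.urandom(32)

def S_sign(digest: bytes) -> bytes:
    return hmac.new(s, digest, hashlib.sha256).digest()

def h_keyed(k: bytes, m: bytes) -> bytes:
    # Heuristic keyed hash h_k(m) = SHA-256(k || m), assumed TCR for this sketch.
    return hashlib.sha256(k + m).digest()

def hts_tcr_sign(m: bytes):
    k = os.urandom(32)                  # fresh hash key, chosen at signing time
    return (k, S_sign(h_keyed(k, m)))   # the key travels with the signature

def hts_tcr_verify(m: bytes, sig) -> bool:
    k, tag = sig
    return hmac.compare_digest(tag, S_sign(h_keyed(k, m)))

sig = hts_tcr_sign(b"attack at dawn")
assert hts_tcr_verify(b"attack at dawn", sig)
assert not hts_tcr_verify(b"attack at noon", sig)
```

Since the attacker only learns k after the message is fixed, a forgery corresponds to a TCR break rather than a full (birthday-attackable) collision.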

Figure 4.9: Second-preimage resistance (SPR): given keyless hash function h : {0, 1}* → {0, 1}^n, and a random first preimage x ∈ {0, 1}^l, for length l ← A(1^n) chosen by the adversary, it is hard to find a collision with x, i.e., a second preimage x′ ∈ {0, 1}* s.t. x′ ≠ x yet h(x) = h(x′).

We refer to this construction as the HtS-TCR construction. It is also similar to the keyless Hash-then-Sign construction, except for the use of a keyed hash h_k(·) : {0, 1}* × {0, 1}* → {0, 1}* - and for using and transmitting the hash-key with each message. Here, the hash key is selected each time as part of the signing operation, and is sent together with the output of the underlying signature function. We define the HtS-TCR construction S^h_HtS-TCR as follows.

Definition 4.5 (The HtS-TCR construction). Given a signature scheme S with domain {0, 1}^n and a keyed hash h_k(·) : {0, 1}* × {0, 1}* → {0, 1}*, the HtS-TCR signature using signature S and keyed-hash h is defined as follows:

$$S^h_{HtS\text{-}TCR}.KG(1^n) \equiv S.KG(1^n) \tag{4.8}$$
$$S^h_{HtS\text{-}TCR}.S_s(m) \equiv (k, S.S_s(h_k(m))),\ \text{where}\ k \xleftarrow{\$} \{0,1\}^n \tag{4.9}$$
$$S^h_{HtS\text{-}TCR}.V_v(m, (k, \sigma)) \equiv S.V_v(h_k(m), \sigma) \tag{4.10}$$

The keyed-HtS construction, as well as the HtS-TCR construction, ensure


existential unforgeability, if the underlying keyed-hash has the respective prop-
erty. For the theorems and proofs, see [24, 51].

4.3 Second-preimage resistance (SPR)


The second property we introduce is second-preimage resistance (SPR), illus-
trated in Figure 4.9. We define SPR only for keyless hash functions, although
it can also be defined for keyed hash [148].
We define a Second-Preimage Resistant (SPR) hash function as follows. An adversary is given a specific, randomly-chosen first preimage x ∈ {0, 1}^l, for some l (see below), and 'wins' if it outputs a colliding second preimage, i.e., x′ ∈ {0, 1}* s.t. x′ ≠ x yet h(x′) = h(x).

Why and how do we fix the length of the first preimage (l)? We can only select a value with uniform probability from a finite set; for example, given the set {0, 1}^l, i.e., binary strings of length l, we can select a random string by flipping l fair coins - giving probability 1/2^l to each of the 2^l strings in {0, 1}^l. But we cannot select a value with uniform probability from an infinite set - e.g., from {0, 1}*.

Therefore, in order to be able to select a random first preimage x, with uniform distribution, we take it from the finite set {0, 1}^l. But what is this l? Intuitively, we want the property to hold for 'any' l; but since the string is given as input to the adversary A, we cannot allow it to be of exponential length. We solve this by allowing the adversary to pick l. The definition follows; note that we found it more convenient to let the adversary directly specify the length of x, without explicitly setting this length into a variable l.
Definition 4.6 (Second-preimage resistance (SPR) Hash Function). A (keyless) hash function h : {0, 1}* → {0, 1}^n is second-preimage resistant (SPR) if for every efficient algorithm A ∈ PPT, the advantage ε^SPR_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

$$\varepsilon^{\mathrm{SPR}}_{h,A}(n) \equiv \Pr_{x \xleftarrow{\$} \{0,1\}^{A(1^n)}}\left[\, x' \leftarrow A(x)\ \text{s.t.}\ x \neq x' \wedge h(x) = h(x') \,\right] \tag{4.11}$$

where the probability is taken over the choice of x and the random coin tosses of A.
SPR is sometimes referred to as weak collision resistance, and indeed, almost trivially, every CRHF is also an SPR hash function. However, the reverse is not true; in particular, it is widely believed that (keyless) SPR hash functions exist, while, as argued above, keyless CRHFs cannot exist. In practice, collision attacks are known against some standard hash functions such as SHA-1 and MD5, but not second-preimage attacks. See the next exercise.
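The gap between breaking collision resistance and breaking SPR can be seen on a toy 16-bit hash (truncated SHA-256, our illustration only): finding some collision costs about 2^8 hashes (a birthday attack), while finding a second preimage of a given x costs about 2^16 hashes, since each guess matches the fixed target only with probability 2^(-16):

```python
import hashlib
from itertools import count

def h16(m: bytes) -> bytes:
    """Toy 16-bit hash (truncated SHA-256), used only to count work factors."""
    return hashlib.sha256(m).digest()[:2]

# (1) Finding *some* collision (breaks CRHF): birthday attack, ~2^8 hashes for n = 16.
seen = {}
for i in count():
    m = i.to_bytes(8, "big")
    d = h16(m)
    if d in seen:
        c1, c2, coll_work = seen[d], m, i + 1
        break
    seen[d] = m
assert c1 != c2 and h16(c1) == h16(c2)

# (2) Finding a *second preimage* of a given x (breaks SPR): ~2^16 hashes expected.
x = b"the given first preimage"
for j in count():
    m2 = b"sp" + j.to_bytes(8, "big")   # the prefix guarantees m2 != x
    if h16(m2) == h16(x):
        sp_work = j + 1
        break
assert m2 != x and h16(m2) == h16(x)

print(f"collision: {coll_work} hashes; second preimage: {sp_work} hashes")
```

The collision search can match any pair of its own guesses, while the second-preimage search must hit one fixed digest; this is why generic attacks cost roughly 2^(n/2) vs. 2^n.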
Exercise 4.3. Let h be an SPR hash function. Use h to construct another
hash function, h0 , which you will show to be (1) an SPR (like h), but (2) not
a CRHF.
Solution: See chapter 10.
We can use an SPR hash function for the 'random mapping' application, discussed in subsection 4.1.1. In this application, the goal is to prevent the attacker from selecting many inputs to collide with an incoming message or name (e.g., with 'Bob'). Of course, the attacker can try different values, and each guess has probability 2^(-n) of being a collision; but this attack fails if we select a sufficiently large digest size n. Hence, for this case, SPR is a sufficient requirement.
There are other applications which require only the SPR property and not the stronger collision-resistance property, e.g., see the following exercise. However, we usually prefer to use only keyless cryptographic hash functions for which no collisions have been found, i.e., where CRHF 'seems' to hold (recall that we know that, actually, no function can satisfy the CRHF goal). In particular, this protects against the common case where a designer incorrectly believes that an SPR hash suffices, while the system is actually vulnerable to non-SPR collision attacks. Importantly, SPR is not sufficient for Hash-then-Sign applications, as we discuss next.
Exercise 4.4. Identify an additional application, discussed earlier in this chapter, for which an SPR hash function seems, intuitively, to suffice for security. Explain this intuition (why the SPR property seems sufficient for security of that application). Is there any simplification of reality required for this intuition to hold?

4.3.1 The Chosen-Prefix Vulnerability and its HtS exploit


Theorem 4.1 shows that Hash-then-Sign is secure, when used with a CRHF.
But would the weaker SPR property suffice? Even if the attacker can find
some collision h(m) = h(m0 ), for some ‘random’ strings m, m0 , how would the
attacker convince the signer to sign m, and why should the alternative message
m0 be of (significant) value to the attacker? In short: is there a realistic attack,
which may be possible against an SPR hash h (although it must fail against
a CRHF)? We next show that this is indeed the case by presenting such an
attack, exploiting the chosen-prefix vulnerability.
Definition 4.7 (Chosen-prefix vulnerability). Hash function h is said to suffer from the chosen-prefix vulnerability if there is an efficient collision-finding algorithm CF s.t., given any (prefix) string p ∈ {0, 1}*, the algorithm CF efficiently outputs a (collision) pair of strings x_p, x′_p ∈ {0, 1}*, s.t. for any (suffix) string s ∈ {0, 1}* holds h(p ++ x_p ++ s) = h(p ++ x′_p ++ s). Namely, let (x_p, x′_p) ≡ CF(p); then for every prefix p ∈ {0, 1}* holds:

$$(\forall s \in \{0,1\}^*)\quad h(p \,{+\!+}\, x_p \,{+\!+}\, s) = h(p \,{+\!+}\, x'_p \,{+\!+}\, s) \tag{4.12}$$

Chosen-prefix attacks are practical. The chosen-prefix vulnerability is a realistic concern; in fact, such vulnerabilities were found for widely-used standard hash functions including RIPEMD, MD4, and MD5 in [158], and later
also for SHA1 [120], which motivates avoiding them and replacing them with
other cryptographic hash functions. We next show how the vulnerability facil-
itates a realistic attack on the Hash-then-Sign paradigm, allowing an attacker
to trick users into signing what appears to a third party to be a statement
(e.g., money transfer) that the user never intended to sign. See [120, 158] for a
more elaborate attack, which allows also forgery of public key certificates; we
discuss public key certificates in chapter 8.
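To make the vulnerability concrete, here is a minimal Python sketch using a toy hash (XOR of 8-byte blocks), which is hypothetical and obviously not collision resistant; it is chosen only because its collision-finder CF is trivial to write. Real chosen-prefix attacks, e.g., on MD5 [158], require massive computation, but offer the same interface.

```python
import os

BLK = 8  # toy block size, in bytes

def toy_hash(m: bytes) -> bytes:
    """Toy hash: XOR of 8-byte blocks (zero-padded). Obviously insecure."""
    if len(m) % BLK:
        m += b'\x00' * (BLK - len(m) % BLK)
    d = bytes(BLK)
    for i in range(0, len(m), BLK):
        d = bytes(a ^ b for a, b in zip(d, m[i:i + BLK]))
    return d

def CF(p: bytes):
    """Collision finder: (A ++ B, B ++ A) collides for ANY prefix p and suffix s,
    since the XOR-difference (A^B) ++ (B^A) folds to zero at any alignment."""
    A, B = os.urandom(BLK), os.urandom(BLK)
    return A + B, B + A

p = b'Pay $'
x, x2 = CF(p)
for s in (b'', b'100 to Mal', os.urandom(33)):
    assert toy_hash(p + x + s) == toy_hash(p + x2 + s)
print('chosen-prefix collisions verified')
```

Note that this toy CF did not even look at p; for a real Merkle-Damgård hash, CF must first absorb p into the internal state, which is why the colliding pair depends on the prefix.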

Chosen-prefix attack on Hash-then-Sign: simplified version We begin by presenting a simplified version of the chosen-prefix attack on the hash-then-sign paradigm. In this version, an attacker, say Mal, uses the chosen-prefix attack to find two strings which collide following the prefix ‘Pay $’. Let the colliding pair be (x, x′). Assume that x << x′. Mal now sets up an online shop and offers for sale an item whose market value is y : x < y << x′. Mal offers the item for only
x$ - a real bargain! Oh well, you already know what is really going to happen,
right?
Alice comes along and happily buys the item, by sending to Mal a signed payment order to her bank, e.g.: (PO, Sign_{A.s}(h(PO))), where PO = ‘Pay $x to Mal’, ready to be deposited at the bank. Mal indeed deposits a payment order signed by Alice - but not PO, the payment order received... Instead, Mal deposits a fake payment order PO′ for $x′, attaching the same signature that Alice sent for PO. Specifically, Mal sends to the bank the pair (PO′, Sign_{A.s}(h(PO))), where PO′ = ‘Pay $x′ to Mal’. Now, due to the chosen-prefix vulnerability, we know that:

h(PO) = h(‘Pay $x to Mal’) = h(‘Pay $x′ to Mal’) = h(PO′)

Hence, the same signature generated by Alice appears to the bank to be a valid signature over the ‘fake’ payment order PO′ - and the bank transfers to Mal the larger amount x′ >> x!
Of course, the attack was described in a simplified way and is not really re-
alistic. In particular, people do not sign plain-text (ASCII) messages, but more
readable formats such as PDF documents. We next discuss a more realistic
variant of the attack, which works for PDF files.

Realistic Chosen-prefix Attack: Signing PDF documents We now improve the chosen-prefix attack to allow forgery of signatures over documents
formatted in ‘rich’ markup languages like PDF, postscript, and HTML. The
attacker, Mal, exploits the fact that these (and similar) languages allow docu-
ments to contain conditional rendering statements, allowing the document to
display different content depending on different conditions.
In the attack, Mal uses the conditional rendering capability, to create
two documents D1 , DM that have the same hash value, h(D1 ) = h(DM ),
but when rendered by the correct viewer, e.g., PDF viewer, the two docu-
ments are rendered very differently. Namely, viewing D1 , the reader displays
text t1 = ‘Pay $1 to Amazon’, while viewing DM , the reader displays text
tM = ‘Pay $1,000,000 to Mal’. The rest of the contents, and even the de-
tails of the markup language used, do not materially change the attack, so we
ignore them.
Mal creates these two documents as follows. First, the documents share a common prefix and suffix: D1 = p ++ x1 ++ s, DM = p ++ xM ++ s. The prefix p consists of headers and preliminaries as required by the markup language, e.g., %PDF for PDF, or <!DOCTYPE html> for HTML, followed by the ‘if’ statement in the appropriate syntax. Simplifying, let’s say that p = ‘if’.
Mal next applies the collision-finding algorithm CF (Definition 4.7), to find a collision for prefix p, namely: (x1, xM) ← CF(p). For every suffix s holds: h(p ++ x1 ++ s) = h(p ++ xM ++ s).
To complete D1 , DM , Mal sets the suffix s to:

s ← ‘=’ ++ x1 ++ ‘then display’ ++ t1 ++ ‘, else display’ ++ tM

Figure 4.10: One-Way Function, aka Preimage-Resistant hash function: given (sufficiently long) random preimage x, it is hard to find x, or any other preimage x′ of h(x), i.e., s.t. h(x′) = h(x).

Mal is now ready to launch the attack on Alice, similarly to the simplified attack above. Namely, Mal first sends D1 to Alice, who views it and sees the rendering t1. Let us assume that Alice agrees to pay $1 to Amazon, and hence signs D1, i.e., computes σ = S_{A.s}(h(D1)) and sends (D1, σ) back to Mal.
Mal forwards to the bank the modified message (DM, σ). The bank validates the signature, which would be OK since h(D1) = h(p ++ x1 ++ s) = h(p ++ xM ++ s) = h(DM). The bank then views DM, sees tM = ‘Pay $1,000,000 to Mal’, and transfers one million dollars from Alice to Mal.
Exercise 4.5. Consider the hash function h(x1 ++ x2 ++ ... ++ xl) = (x1 + x2 + ... + xl) mod p, where each xi is 64 bits and p is a 64-bit prime. (a) Is h an SPR hash function? A CRHF? (b) Present a collision-finding algorithm CF for h. (c) Create two HTML files D1, DM as above, i.e., s.t. h(D1) = h(DM), yet when they are viewed in a browser, they display texts t1, tM as above.
Solution: See chapter 10.

4.4 One-Way Functions, aka Preimage Resistance


The third security property we define is called Preimage resistance or One-Way Function (OWF), and is illustrated in Fig. 4.10. We prefer the term one-way function, which emphasizes the ‘one-way’ property: computing h(x) is easy, but ‘inverting’ it to find x, or a colliding preimage x′ s.t. h(x′) = h(x), is hard.
Definition 4.8 (Preimage resistance / One-Way Function). An efficient function h with domain {0, 1}∗ is called preimage resistant, or a one-way function, if for every efficient algorithm A ∈ PPT, the advantage ε^{OWF}_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

ε^{OWF}_{h,A}(n) ≡ Pr_{x ←$ {0,1}^n} [x′ ← A(h(x)) s.t. h(x) = h(x′)]   (4.13)

where the probability is taken over the choice of x and over the random coin tosses of A.

Note that the definition does not require one-way functions to be hash func-
tions, i.e., we do not require bounded output length, or even that the output
length be shorter than the input length. In fact, one particularly interesting
case is a one-way permutation, which is a OWF that is also a permutation.
The correct way to use a OWF is by ensuring that its input is random. We next present two related applications: one-time passwords and one-time signatures.

4.4.1 Using OWF for One-Time Passwords (OTP) and OTP-chain
A one-way function can be used to facilitate one-time password (OTP)4 authentication, allowing a user, say Alice, to prove her identity to a server, say Bob, with Bob using only a non-secret validation token for Alice.
To implement one-time passwords, Alice first selects a random string s to
serve as the (secret) one-time password. Next, Alice computes the non-secret
validation token as v = h(s). Alice shares the validation token v with Bob (and
possibly others - v is not secret!). Later, to prove her identity, or to make some
other agreed-upon signal, Alice sends s. Bob can easily confirm that v = h(s).
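A minimal sketch of this protocol, using SHA-256 as a stand-in for the OWF h (all names are illustrative):

```python
import hashlib, os

def h(x: bytes) -> bytes:          # OWF stand-in
    return hashlib.sha256(x).digest()

# Setup: Alice picks a secret one-time password s and publishes v = h(s).
s = os.urandom(32)
v = h(s)                           # v is NOT secret; Bob may store it openly

# Later: Alice authenticates by revealing s; Bob checks it against v.
def verify(v: bytes, candidate: bytes) -> bool:
    return h(candidate) == v

assert verify(v, s)
assert not verify(v, os.urandom(32))
print('one-time password accepted')
```

The randomness of s is essential: if s were a guessable string, an attacker could invert v by exhaustive search, without ‘breaking’ the OWF.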
Unfortunately, the One-Way Function property - maybe due to the catchy
name - is sometimes misunderstood. In particular, designs sometimes specify
the use of a one-way function (OWF) - when, in fact, if the hash function is
(only) a OWF, the design may be vulnerable.
In particular, consider OTP-chain, a widely-deployed generalization of one-time password authentication. In an OTP-chain5, Alice pre-computes a whole ‘chain’ of values, xi = h(xi−1), beginning with a random value x0 and up to xl for some l (the ‘chain length’). This allows Alice to authenticate l times: in the i-th authentication, Alice sends xl−i. Everyone who knows xl can easily validate, by applying h.
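The chain can be sketched as follows, again with SHA-256 standing in for h; the chain length l = 10 is arbitrary, and the verifier here uses one common strategy of checking each revealed value against the previously accepted one.

```python
import hashlib, os

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

l = 10                           # chain length: allows l authentications
x0 = os.urandom(32)              # Alice's secret seed
chain = [x0]
for _ in range(l):
    chain.append(h(chain[-1]))   # x_i = h(x_{i-1})
xl = chain[l]                    # public validation value

# i-th authentication: Alice reveals x_{l-i}; the verifier, holding the
# previously accepted value (initially x_l), applies h once to check.
current = xl
for i in range(1, l + 1):
    token = chain[l - i]
    assert h(token) == current   # verifier's check
    current = token              # remember for the next round
print('all', l, 'authentications verified')
```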
Is OTP-chain authentication secure? Actually, if our only assumption is
that h is a OWF, it may not be secure. The problem is that for i > 0, the values
xi are not guaranteed to distribute uniformly; see example in next exercise.
However, OTP-chain is secure - if the function is not only one-way, but also a
permutation; see the following exercise.
Exercise 4.6. [OTP-chain using OWF may be vulnerable] Let h : {0, 1}∗ → {0, 1}^n, and define g : {0, 1}∗ → {0, 1}^{2n} as:

g(x) = 0^{2n} if x = 0 mod 2^n, otherwise h(x) ++ 0^n


Show that if h is a OWF, then g is also a OWF; yet, show that f (x) = g(g(x)) is not a OWF, and in particular, that an OTP-chain using g is completely insecure.
4 It is unfortunate that the acronym OTP is used for both One Time Pad and One Time Password, but the use of this acronym for these two different purposes seems firmly entrenched.
5 OTP-Chain is often referred to as hash-chain, but we prefer the term OTP-chain, to avoid confusion with digest-chain and blockchain schemes, which have different goals, properties and designs; see § 4.7 and § 4.9.

Figure 4.11: A one-time signature scheme, limited to a single bit (as well as to
a single signature).

Figure 4.12: A one-time signature scheme, for l-bit string (denoted d).

Solution: See chapter 10.


Exercise 4.7. Show that if h is a OWF and a permutation over any given
input length m, then OTP-chain using h is secure. For simplicity, it suffices
to prove this for a chain of length two (like function f in Exercise 4.6).

4.4.2 Using OWF for One-Time Signatures


We next show how to use one-way functions to implement public key signa-
tures - limited to signing only a single (one) time, i.e., a one-time signature
scheme. In spite of their obvious limitation, one-time signatures are sometimes
useful, due to their negligible computational overhead. Note, however, that
they require rather long public keys and signatures.
We present the construction in three steps, each with a separate figure.
Figure 4.11 presents a one-time signature scheme which is defined only for the
case of a single-bit message. This is a simple extension of one-time passwords.
The private key s simply consists of two random strings s0 , s1 , while the public
key consists of the hashes of these strings: v0 = h(s0 ), v1 = h(s1 ). To sign a
bit b, we simply send σ = sb ; to validate incoming bit b and signature σ, we
validate that vb = h(σ). Note that v = (v0 , v1 ) is not secret; only s = (s0 , s1 )
is secret - and we can disclose sb upon signing bit b.
We next extend the scheme, to allow one-time signature of an l-bit string
d, as illustrated in Figure 4.12. (We use the symbol d for ‘digest’, as this value

Figure 4.13: A one-time signature scheme, for variable-length message m, using
‘hash-then-sign’.

will become the digest of a longer message in the next extension.) This is simply an application of the one-bit signature scheme of Figure 4.11, over each bit d_i of d, for i = 1, . . . , l. We now use multiple key pairs (s_b^i, v_b^i), where b ∈ {0, 1} is the bit value (as before) and i is the index of the bit within d, and v_b^i = h(s_b^i).
Finally, we extend the scheme further, to efficiently sign an arbitrary-length (VIL) input string m. This extension, illustrated in Figure 4.13, simply applies
the hash-then-sign paradigm. Namely, we first use a CRHF, which we denote
by h0 (since it does not have to be the same as h), to compute the l-bit digest
d of the message: d = h0 (m). Then, we simply apply the scheme of Figure 4.12
to sign the digest d.
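The three steps above (Figures 4.11 to 4.13) can be sketched together in Python; SHA-256 stands in both for the OWF h and for the CRHF h′ of the hash-then-sign step, and the key sizes are illustrative.

```python
import hashlib, os

def h(x: bytes) -> bytes:          # OWF stand-in (also used as the CRHF h')
    return hashlib.sha256(x).digest()

L = 256                            # digest length of h', in bits

def keygen():
    s = [[os.urandom(32), os.urandom(32)] for _ in range(L)]  # private s_b^i
    v = [[h(si[0]), h(si[1])] for si in s]                    # public v_b^i = h(s_b^i)
    return s, v

def bits(d: bytes):
    return [(d[i // 8] >> (7 - i % 8)) & 1 for i in range(L)]

def sign(s, m: bytes):
    d = h(m)                       # hash-then-sign: d = h'(m)
    return [s[i][b] for i, b in enumerate(bits(d))]

def verify(v, m: bytes, sig) -> bool:
    d = h(m)
    return all(h(sig[i]) == v[i][b] for i, b in enumerate(bits(d)))

s, v = keygen()
m = b'Pay $1 to Amazon'
sig = sign(s, m)
assert verify(v, m, sig)
assert not verify(v, b'Pay $1,000,000 to Mal', sig)
print('one-time signature verified')
```

Note the cost: the public key consists of 2 · 256 hash values and a signature of 256 preimages - exactly the ‘rather long public keys and signatures’ mentioned above. Signing a second message would disclose preimages for both values of some bits, enabling forgeries; hence ‘one-time’.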

4.5 Randomness extraction


For the last security property we define, we use the term randomness extrac-
tion. Intuitively, a function is randomness extracting if its output is uniformly
random, provided that its input has ‘sufficient randomness’. This is impor-
tant in security and cryptography, since randomness is necessary for many
mechanisms; in particular, randomness is essential for both encryption and
key-generation, as well as for challenge-response based authentication. How-
ever, cryptographic systems rarely have available sources of ‘true, perfect ran-
domness’; existing sources of randomness, such as measurements of delays of
different physical actions, are definitely not perfectly random. Another appli-
cation of randomness extraction is for key derivation; indeed, the keyed-variant
of randomness extraction hash functions is usually referred to as Key Deriva-
tion Function (KDF). The HMAC construction, which we present in subsec-
tion 4.6.1, is often used for randomness extraction (as a KDF); we discuss this
application further in §6.3. Randomness extraction also received considerable
attention in the research of the theory of cryptography, e.g., see [58].
In spite of its importance for practical applications, randomness extraction
is less well known to practitioners than the previous properties, and not listed
among the requirements of standard hash functions. This may be because
it is more subtle and harder to define; in particular, what does it mean for

the input to have ‘sufficient randomness’ ? There are different possible formal
definitions for this intuitive requirement. Unfortunately, the definitions used
in the theoretical cryptography publications appear to be too complex for our
modest goals. Instead, we discuss two simple models for randomness extraction.

Von Neumann’s Biased-Coin Model We first discuss the classical biased-coin model, proposed already in 1951 by von Neumann [167], one of the pioneers of computer science. In the von Neumann model, each of the input bits is the result of an independent toss of a coin with fixed bias. Namely, every generated bit has the value 1 with probability 0 < p < 1 and the value 0 with probability 1 − p - with no relation to the values of other bits.
Von Neumann proposed the following method to extract perfect randomness
from these biased bits. First, arrange these sampled bits in pairs {(xi , yi )}.
Then, remove pairs where both bits are identical, i.e., leave only pairs of the
form {(xi , 1−xi )}. Finally, output the sequence {xi }. This simple - if somewhat
‘wasteful’ in input bits - algorithm outputs uniformly random bits.
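A quick simulation of the extractor (the bias p = 0.8 and the sample size are arbitrary):

```python
import random

def von_neumann_extract(bits):
    """Pair up the biased input bits; keep the first bit of each unequal pair."""
    out = []
    for x, y in zip(bits[0::2], bits[1::2]):
        if x != y:
            out.append(x)
    return out

rng = random.Random(1)
p = 0.8                                   # heavily biased coin
biased = [1 if rng.random() < p else 0 for _ in range(100_000)]
unbiased = von_neumann_extract(biased)
frac = sum(unbiased) / len(unbiased)
print(f'kept {len(unbiased)} of {len(biased)} bits, fraction of 1s = {frac:.3f}')
assert abs(frac - 0.5) < 0.02             # output is very close to uniform
```

Observe the ‘waste’: with bias p, only a 2p(1 − p) fraction of the input pairs yields an output bit.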

Exercise 4.8. [Von Neumann extractor] Show that, if the input satisfies the assumption of the von Neumann model, then the output is uniformly random, i.e., each bit xi is 1 with probability exactly half - independently of all other bits.

Solution: See chapter 10.


The von Neumann extractor is simple, and the output is proven uniform without any computational assumption. However, the assumption is hard to justify for most typical security applications of randomness extraction. In particular, consider the goal of key derivation, e.g., as applied to the CDH problem in the DH protocol (§6.3); surely there is no justification to assume that every bit is the result of a biased coin flip. We next discuss a different model, which seems to be applicable to more scenarios.

Bitwise Randomness Extracting Function We now present a different simple model for randomness extraction, which we call bitwise randomness
extracting. We believe that bitwise randomness extraction is helpful for under-
standing this important property of cryptographic hash functions.
Intuitively, a hash function is bitwise-randomness extracting if the output is
pseudorandom, even if the adversary can select the input message - except for
a ‘sufficient number’ of the bits of the message. This intuition is illustrated in
Fig. 4.14, which defines a ‘game’ where an adversarial algorithm Adv tries to
defeat the randomness extraction - by selecting some input message, except for
some number µ(n) of random bits, and then distinguishing between the output
and a random string (of the same length).
In Fig. 4.15, we further clarify how the adversary selects the input (except
for µ(n) random bits). The adversary Adv first outputs two binary strings,
denoted m and M , of the same length |m| = |M |. The string m is the input

Figure 4.14: Bitwise-Randomness Extraction (simplified).

Figure 4.15: Bitwise-Randomness Extraction (more precise).

to the hash function, selected by the adversary - except for (at least) µ(n) of
the bits in x, which are ‘randomized’, i.e., replaced with random bits.
The string M is a bitmask; each of its bits corresponds to one bit of m. When a bit in M is turned on (1), then the corresponding bit of m would be randomized. At least µ(n) bits in M contain 1, i.e., µ(n) ≤ Σ_i M[i]. Technically, we do this by selecting randomly a string m′ of the same length as m, and then using m ⊕ (m′ ∧ M) as the input to the hash function h.
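The masking operation m ⊕ (m′ ∧ M) can be sketched over byte strings as follows (the particular message and mask values are illustrative):

```python
import os

def randomize_bits(m: bytes, M: bytes) -> bytes:
    """Replace the bits of m selected by mask M with fresh random bits:
    returns m XOR (m' AND M) for a random m' of the same length."""
    assert len(m) == len(M)
    m_rand = os.urandom(len(m))
    return bytes(mi ^ (ri & Mi) for mi, ri, Mi in zip(m, m_rand, M))

m = b'\x00' * 4                  # adversary-chosen message
M = b'\xff\x00\x00\x00'          # mask: randomize only the first 8 bits
x = randomize_bits(m, M)
assert x[1:] == m[1:]            # unmasked bits are unchanged
print('masked input:', x.hex())
```

Where M has a 1-bit, the result is m ⊕ (random bit), which is uniform; where M has a 0-bit, the adversary’s choice survives unchanged.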
Finally, we perform an ‘indistinguishability test’, much like the one used to define (IND-CPA) encryption (Definition 2.10), or, even more, the (simpler) definition of a pseudorandom generator (PRG, Definition 2.6). Namely, we select a random bit b, and let y_b = h(m ⊕ (m′ ∧ M)) and y_{1−b} ←$ {0, 1}^n; the adversary Adv ‘wins’ if it correctly guesses the value of b.

Definition 4.9 (Bitwise-Randomness Extracting (BRE) function). An efficient hash function h : {0, 1}∗ → {0, 1}^n is called µ-bitwise-randomness extracting (BRE) if for every efficient algorithm A ∈ PPT, the advantage ε^{µ-BRE}_{h,A}(n) is negligible in n, i.e., smaller than any positive polynomial for sufficiently large n (as n → ∞), where:

ε^{µ-BRE}_{h,A}(n) ≡ Pr[1 = IND-BRE^µ_{A,h}(1, n)] − Pr[1 = IND-BRE^µ_{A,h}(0, n)]   (4.14)

where IND-BRE^µ_{A,h}(·, ·) is defined in Algorithm 1 and the probability is taken over the random coin tosses of A and of IND-BRE^µ_{A,h}(·, ·).

Algorithm 1 Bitwise-Randomness Extraction Indistinguishability experiment IND-BRE^µ_{A,h}(b, n).
(m, M) ← A(1^n)
If |m| ≠ |M| or Σ_{i=1}^{|M|} M(i) < µ(n) return ⊥
m′ ←$ {0, 1}^{|m|}
y_b ← h(m ⊕ (m′ ∧ M))
y_{1−b} ←$ {0, 1}^n
return A(y0, y1, m, M)

It is interesting to note that it is easy to design a bitwise-randomness extracting function h : {0, 1}∗ → {0, 1}, i.e., for the very special case of n = 1, extracting a single bit; but extracting more bits is not as simple.
Exercise 4.9. Present a bitwise-randomness extracting function h whose out-
put is a single bit, and prove that it satisfies the definition.
Randomness extraction is closely related to PRGs; in particular, in the
common case where we have a partially-random input, but need more than n
(pseudo)random bits, we can use a randomness extractor to get n pseudoran-
dom bits and then use a PRG f to expand it to a longer pseudorandom string.
This is called the extract-then-expand model. Note that f may expand its input
by much more than one bit.
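A sketch of extract-then-expand, with SHA-256 standing in for the extractor h and a simple counter-mode hash standing in for the PRG f (both are stand-ins chosen for readability, not proven constructions):

```python
import hashlib, os

def extract(x: bytes) -> bytes:
    """Randomness-extractor stand-in (SHA-256): n = 32 bytes."""
    return hashlib.sha256(x).digest()

def expand(seed: bytes, out_len: int) -> bytes:
    """PRG stand-in f: expand a 32-byte seed to out_len pseudorandom bytes
    by hashing the seed together with a counter."""
    out = b''
    ctr = 0
    while len(out) < out_len:
        out += hashlib.sha256(seed + ctr.to_bytes(4, 'big')).digest()
        ctr += 1
    return out[:out_len]

# Partially-random input: attacker-chosen bytes plus 16 truly random bytes.
x = b'attacker-chosen-part' + os.urandom(16)
key_material = expand(extract(x), 100)   # g(x) = f(h(x))
assert len(key_material) == 100
print('derived', len(key_material), 'pseudorandom bytes')
```

This two-stage shape (extract a short uniform-looking seed, then expand it) is exactly the structure used by HKDF-style key derivation, discussed further in §6.3.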
Exercise 4.10 (Extract-then-expand). Let h be a bitwise-randomness-extracting
hash function and f be a secure PRG. Show that the function g(x) = f (h(x))
is a randomness extracting hash function (with longer output length). Namely,
for large enough m, if m (or more) bits in x are random, then g(x) = f (h(x))
is pseudorandom.

4.6 The Random Oracle Model


Often, designers use cryptographic hash functions, but without identifying a
specific, well-defined requirement of the hash function that suffices to ensure
security. However, such constructions are often still ‘secure’ in the practical
sense, that no practical attack against them is found - for many years, and
often, in spite of considerable motivation for cryptanalysis and efforts.
Of course, there are also plenty of constructions which similarly use hash
functions, without a well-defined assumption - but which have been found to
be insecure. However, very often, the attacks are generic, i.e., do not exploit a
weakness of a specific hash function, and apply when the design is implemented
with an arbitrary hash function.
Therefore, even when we cannot identify the specific ‘generic’ property of
hash function on which security relies, it is useful to identify designs which
are vulnerable even for an ‘ideal-security’ hash function. We definitely want
to avoid such a design, which would be clearly insecure for any hash function!

Can we define a property which excludes such vulnerable designs, and then prove that a candidate design is secure in the (limited) sense of being secure for such an ‘ideal’ hash function? Unfortunately, we don’t know how to define an ‘ideal-security’ hash function...
The Random Oracle Model (ROM)6 , proposed by Bellare and Rogaway [21],
is a common approach to this dilemma. Intuitively, Random Oracle Model
(ROM) constructions and protocols are secure in the (impractical) case that
the parties select h() as a truly random function (for the same domain and
range), instead of using a concrete, specific hash function as h(). The function
h() is still assumed to be public, i.e. known to the attacker. Namely, we model
an ‘ideal’ hash function as a random function (over the same domain and range,
i.e., {0, 1}∗ → {0, 1}n ).
Definition 4.10 (ROM-security). Let H be the set of all n-bit hash functions, i.e., functions h : {0, 1}∗ → {0, 1}^n (for some n). Consider a parameterized scheme S^h, where h is a given hash function. Also, for any security definition def, let ε^{def}_{S^h,A^h} → [0, 1] be the def-advantage function, defined for a given scheme S^h and parameterized adversary A^h, where A^h is a PPT algorithm, using a standard computational model (e.g., Turing machine), with the additional ability to provide an arbitrary input x to h and receive back h(x) (as a single operation).
We say that the (parameterized) system S^h is def-ROM-secure, or def-secure under the Random Oracle Model, if the advantage of any PPT adversary A^h for scheme S^h, for a random hash function h ←$ H, is negligible, i.e.:

Pr_{h ←$ H} [ ε^{def}_{S^h,A^h} ] ∈ NEGL(n)   (4.15)

Note about random choice of h (may be skipped). The careful reader may have noticed that it is not well defined how to select an element from an infinite set with uniform probability; hence, h ←$ H is not well defined. This choice should be interpreted as follows. For any l > n, let H^l be the set of all functions from {0, 1}^l to {0, 1}^n, i.e., functions from l-bit binary strings to n-bit binary strings: H^l = {h : {0, 1}^l → {0, 1}^n}. The set H^l is finite, hence we can define random sampling from it: h_l ←$ H^l. When we write h ←$ H, it should be interpreted as choosing h_l ←$ H^l for every integer l > n; and, for any input string m ∈ {0, 1}∗, let h(m) = h_{|m|}(m).

From the definition it follows that to ‘prove security under the ROM’, we
would analyze the construction/protocol as if the function h(·), used by the
protocol, was chosen randomly at the beginning of the execution and then
made available to all parties - legitimate parties running the protocols or using
the scheme/construction, as well as the adversary.
6 Often people use the term with ‘methodology’ or ‘method’ instead of ‘model’, but we use ‘model’ since it is more common.

Security under the ROM vs. Security under the Standard Model.
Analysis of security under the ROM model is so widely used, that papers in
cryptography often use the term ‘secure in the standard model’ to emphasize
that their results are ‘really’ proven secure, rather than only proven under
the ROM (or under another idealized model). Of course, these proofs are still
based on reductions, i.e., on unproven assumptions; however, these assumptions
are ‘standard’ cryptographic assumptions. For many of these standard cryp-
tographic assumptions there are even theoretical results showing that there
exist schemes satisfying these assumptions under a ‘standard’ mathematical
assumption, most often, assuming P ≠ NP.
We emphasize again, that ROM-security does not necessarily imply that
the design is ‘really’ secure, when implemented with any specific hash function;
once the hash function is fixed, there may very well be an adversary that breaks
the system. This is true even when the hash function meets the ‘standard’
security specifications, e.g., CRHF. However, ROM-security is definitely a good
indication of security, since a vulnerability has to use some property of the
specific hash function. Indeed, there are many designs which are only proven
to be ROM-secure. This motivates many works in cryptography, which provide
designs secure in the standard model, as alternatives to known designs which
are secure only under the ROM. The following subsection studies one of the
most widely used applications of hash functions - the construction of a Message
Authentication Code (MAC) scheme from a hash function; we show several
constructions which are secure under ROM but insecure under the standard
model.

4.6.1 HMAC and other constructions of a MAC from a Hash function
One common use of cryptographic hash functions, is for message authentica-
tion, by implementing a MAC function. The common motivation is that some
cryptographic hash functions are extremely efficient, and this efficiency can be
mostly inherited by HMAC. For example, the Blake2b [4] cryptographic hash function achieves speeds of over 10^9 bytes/second, using a rather standard CPU (Intel i5-6600 with 3310 MHz clock).

About the terms: keyed hash vs. MAC. Constructions of MAC func-
tions from hash functions are often referred to as keyed hash, where they assume
that the hash function is ‘keyed’, in some way, using a secret key k. This dif-
fers from the ‘standard’ use of the term ‘keyed hash function’, which we adopt,
where the key k is not secret (i.e., we assume that k is known to the adver-
sary). Indeed, why use ‘keyed hash’ to mean exactly the same thing as MAC?
And surely we can’t only use ‘keyed hash’ to mean MAC, considering that MAC functions may be constructed in different ways, e.g., from a block cipher, as in the CBC-MAC (subsection 3.5.2)!

Security of constructions of MAC from hash. Let us return to the
‘real’ question: how to construct a MAC from a cryptographic hash? Many
heuristic proposals were made, mostly constructing the MAC from a keyless
hash function. Three of the most well known heuristics were presented and
compared by Tsudik [160]. Given keyless hash function h, key k and message
m, these are:

Prepend Key: MAC^{PK}_k(m) = h(k ++ m)
Append Key: MAC^{AK}_k(m) = h(m ++ k)
Message-in-the-Middle: MAC^{MitM}_k(m) = h(k ++ m ++ k)
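In code, the three heuristics are one-liners; SHA-256 is used here only as a concrete stand-in for h. As discussed later in this section, the prepend-key variant is actually insecure for Merkle-Damgård hashes such as SHA-256, due to the extend property.

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def mac_pk(k: bytes, m: bytes) -> bytes:    # Prepend Key
    return h(k + m)

def mac_ak(k: bytes, m: bytes) -> bytes:    # Append Key
    return h(m + k)

def mac_mitm(k: bytes, m: bytes) -> bytes:  # Message-in-the-Middle
    return h(k + m + k)

k, m = b'secret-key------', b'transfer $10 to Bob'
for name, mac in [('PK', mac_pk), ('AK', mac_ak), ('MitM', mac_mitm)]:
    print(name, mac(k, m).hex()[:16])
```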

An obvious question is whether these schemes are secure - assuming that the
cryptographic hash function h satisfies some assumption. Let us first observe
that all three constructions are secure under the ROM.

Exercise 4.11. Prove that (a) MAC^{PK}, (b) MAC^{AK} and (c) MAC^{MitM} are secure under the ROM.

Proof sketch: assume an adversary outputs (m, σ) for a message m which it did not give as input to the ‘oracle’ for h. Then the output of the corresponding h function was never computed yet, i.e., it is still random. For example, for MAC^{PK}_k(m) = h(k ++ m), the value of h(k ++ m), for this m, was not computed yet. In fact, we need to pick it only to check the adversary’s guess σ; at that point, we choose it randomly from the set {0, 1}^n. The probability that our choice will be the same as σ is only 2^{−n}, i.e., negligible. Hence, MAC^{PK}_k(m) = h(k ++ m) is secure under the ROM; a very similar argument holds for (b) MAC^{AK} and (c) MAC^{MitM}.
Just for clarity, let us also give an example of a construction which is not secure - even under the ROM. Specifically, consider MAC^{2AK}_k(m) = h(k ++ m) ++ h(k ++ m ⊕ 1^{|m|}). Namely, MAC^{2AK} was made ‘more complex’ - maybe with the (wrong) hope of making it more secure - by concatenating two hash values, one of k ++ m and the other of k ++ (m ⊕ 1^{|m|}), where m ⊕ 1^{|m|} is just a weird way of writing the negation of m.

Example 4.1. Show that MAC^{2AK} is insecure, (even) under the ROM.
Solution: The adversary asks to receive MAC^{2AK}_k for the message m = 0^l (for any length l); let the value returned be denoted σ_L ++ σ_R, where |σ_L| = |σ_R| = n. Then the adversary returns the ‘guess’ (1^l, σ_R ++ σ_L). Verify that this is the correct pair.
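The attack of Example 4.1 is easy to run concretely; here SHA-256 plays the role of the oracle h (for this particular forgery, any hash function works, since the forged tag is just the two halves of the received tag, swapped):

```python
import hashlib, os

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def neg(m: bytes) -> bytes:                 # m XOR 1^{|m|}
    return bytes(b ^ 0xff for b in m)

def mac_2ak(k: bytes, m: bytes) -> bytes:
    return h(k + m) + h(k + neg(m))

k = os.urandom(16)                          # unknown to the adversary
l = 8
sigma = mac_2ak(k, b'\x00' * l)             # adversary's single legitimate query
sigma_L, sigma_R = sigma[:32], sigma[32:]

# Forgery: the tag of 1^l is just the two halves swapped.
forged = sigma_R + sigma_L
assert forged == mac_2ak(k, b'\xff' * l)
print('forged a valid tag for 1^l without knowing k')
```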
Unfortunately, it is easy to see that even if a hash function h satisfies any of the properties we have defined in the previous sections, its use in any of these constructions may still result in an insecure MAC. This holds even for a hash function h that satisfies all of these assumptions. See the following exercise.
Exercise 4.12. Present a keyless hash function h such that:

1. h is a CRHF, yet (a) MAC^{PK}, (b) MAC^{AK}, (c) MAC^{MitM} is not a secure MAC.
2. h is an SPR, yet (a) MAC^{PK}, (b) MAC^{AK}, (c) MAC^{MitM} is not a secure MAC.
3. h is a OWF-hash, yet (a) MAC^{PK}, (b) MAC^{AK}, (c) MAC^{MitM} is not a secure MAC.
4. h is a BRE-hash, yet (a) MAC^{PK}, (b) MAC^{AK}, (c) MAC^{MitM} is not a secure MAC.
5. h is a CRHF, OWF and BRE, yet (a) MAC^{PK}, (b) MAC^{AK}, (c) MAC^{MitM} is not a secure MAC.

The examples may assume a hash function h′ which has the corresponding property (CRHF, SPR, OWF, BRE or their combination).

While the examples in the solutions to Exercise 4.12 would probably be very ‘artificial’ and irrelevant to any ‘real’ candidate hash function, some weaknesses of these constructions can apply to realistic hash functions. In particular, many hash functions have the following extend property: given h(x), one can compute h(x ++ y), even without knowing anything about x. This property holds for any hash function using the (widely-used) Merkle-Damgård construction, which is used by many hash functions, including the MD5 and SHA-1 standards; see discussion in § 4.7. However, Tsudik showed in [160] that MAC^{PK} is insecure for any hash function with the extend property. Showing this is a nice, not very difficult exercise; see [160].
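The extend property is easy to demonstrate on a toy Merkle-Damgård construction. For simplicity, this sketch omits the standard length padding (MD-strengthening); for real MD5/SHA-1, the same extension works once the attacker accounts for the padding block of x.

```python
import hashlib

BLK = 16  # toy block size (bytes)

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function (truncated SHA-256); the weakness shown is
    in the Merkle-Damgard *structure*, not in the compression function."""
    return hashlib.sha256(state + block).digest()[:16]

IV = b'\x00' * 16

def md_hash(m: bytes, state: bytes = IV) -> bytes:
    """Plain Merkle-Damgard chaining (no length padding, for simplicity);
    assumes |m| is a multiple of BLK."""
    assert len(m) % BLK == 0
    for i in range(0, len(m), BLK):
        state = compress(state, m[i:i + BLK])
    return state

x = b'A' * 32       # message, possibly unknown to the attacker (2 blocks)
y = b'B' * 16       # attacker-chosen extension (1 block)

hx = md_hash(x)     # all the attacker is given
h_ext = md_hash(y, state=hx)   # continue chaining from h(x), without seeing x
assert h_ext == md_hash(x + y)
print('computed h(x ++ y) from h(x) alone')
```

Applied to MAC^{PK}_k(m) = h(k ++ m), this lets an attacker who sees a valid tag for m compute a valid tag for m ++ y - without knowing k.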

HMAC. HMAC [11, 13] is the most widely-used construction of a MAC from a keyless hash function. HMAC is defined as:

HMAC_k(m) = h(k ⊕ OPAD ++ h(k ⊕ IPAD ++ m))   (4.16)

Where OPAD, IPAD are fixed constant strings.
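Eq. (4.16) is slightly simplified: in the actual standard, the key is first padded with zeros (or hashed, if too long) to the block size of h. The following sketch for SHA-256 matches Python's built-in hmac module:

```python
import hashlib, hmac, os

B = 64                                  # SHA-256 block size, in bytes
IPAD = b'\x36' * B                      # the fixed IPAD constant
OPAD = b'\x5c' * B                      # the fixed OPAD constant

def hmac_sha256(k: bytes, m: bytes) -> bytes:
    if len(k) > B:                      # long keys are first hashed...
        k = hashlib.sha256(k).digest()
    k = k.ljust(B, b'\x00')             # ...then zero-padded to the block size
    inner = hashlib.sha256(bytes(a ^ b for a, b in zip(k, IPAD)) + m).digest()
    return hashlib.sha256(bytes(a ^ b for a, b in zip(k, OPAD)) + inner).digest()

k, m = os.urandom(32), b'transfer $10 to Bob'
assert hmac_sha256(k, m) == hmac.new(k, m, hashlib.sha256).digest()
print("matches Python's hmac module")
```

Note how the outer hash call defeats the length-extension weakness of the plain prepend-key construction.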


It is not difficult to see that HMAC is secure under the ROM (Exercise 4.25). However, as we already discussed, this is not a sufficient criterion. Exercise 4.26 is the ‘complementary’ question: it asks you to show that assuming (‘just’) that h() is a CRHF is NOT enough for the HMAC construction to be a secure MAC. For example, we can design a function h() which is collision resistant, yet using it in the HMAC construction would expose the key; using the HMAC construction with this (weird) h would not result in a secure PRF or MAC. Of course, the construction will assume a given CRHF h′.
However, in [11], the authors - Bellare, Canetti and Krawczyk - show that
HMAC is a secure MAC, also in the standard model, i.e., under reasonable
cryptographic assumptions. Due to the importance and wide use of HMAC,
confidence in its security grew over the years, with several additional results
establishing its security (under weaker assumptions), as well as due to the mere
fact that such an important standard has not been ‘broken’ by some cryptanalysis.
In fact, HMAC is also often used for additional goals, such as a candidate PRF,
and as a Key Derivation Function (KDF), which is essentially a keyed variant
of a randomness extraction hash function; see discussion in § 4.5, § 6.3 and [58].

4.7 The Digest-Chain and Merkle-Damgård Construction
In this and the following two sections, we discuss Digest schemes, which are
a generalization of the Collision-Resistant Hash Function (CRHF). In this
section, we begin with the simplest digest scheme, which we refer to as the
Digest-Chain scheme. The terms Digest schemes and Digest-Chain are not
widely used; in particular, the only well-known construction of this type is the
Merkle-Damgård construction, which we present, but whose common defini-
tion only implements the digest function of the digest-chain scheme, requiring
a minor extension to provide the entire functionality of digest-chain schemes.
We see this as a demonstration of the importance of distinguishing between an
abstract scheme (e.g., block cipher or digest-chain) and constructions of it (e.g.,
the AES block cipher, or the Merkle-Damgård digest function, or its extension
into digest-chain construction).
All digest schemes extend the integrity-goal of CRHF, but instead of hash-
ing a single input, the input to a digest scheme is a sequence of input strings
(messages). Similarly to hash functions, one can define either a keyless or
a keyed Digest-Chain scheme; and, since digest schemes extend the goals of
CRHFs, it is easy to see that, for the same arguments presented for CRHF in
§ 4.2, there can actually be no secure keyless digest schemes (at least, for the definitions we study). In spite of that, following the discussion in subsection 4.2.2, we only
define the keyless variants, typically constructed from keyless hash functions.
Also, similarly to keyless hash functions, the functions of the Digest-chain
scheme are defined for different security parameter n, which is also the length
of the digest, but we usually do not explicitly write it as a parameter. If we
do need to explicitly write it, we will use the same notation as for keyless hash
functions, e.g., ∆(n) .

4.7.1 The Merkle-Damgård Construction of a Collision-Resistant Digest Function
In this subsection, we focus on the basic collision-resistance property of digest-
chain schemes. This property is a trivial extension of CRHFs, and requires just
a single function, to which we refer as the digest function and denote by ∆. In
fact, the digest function could be viewed by itself as a basic digest scheme.
The input to the digest function ∆ is a finite sequence of messages M =
{m1, . . . , ml}; and its output, ∆(M), the digest of M, is an n-bit string. We
require the digest function to ensure collision-resistance, i.e., that it would
be infeasible for a PPT adversary A to output a collision, i.e., two different

Figure 4.16: The Merkle-Damgård construction of a collision-resistant digest function from a CRHF. The construction, with minor preprocessing of the input, can also be used to construct a CRHF from a compression function (which is a CRHF restricted to fixed-length inputs); see details in the text.

sequences M ≠ M′ which have the same digest, i.e., ∆(M) = ∆(M′). Let us define this a bit more precisely, although we still present the definition for a fixed digest length of n bits; recall that the more precise interpretation is that we define a whole sequence of such functions ∆^(n) for every integer n > 0; we just omit the (n) for brevity and simplicity - and since in practice, we will use a specific length anyway.

Definition 4.11. An n-bit digest function ∆ is an efficiently computable function (in PPT) that maps finite sequences of binary strings to n-bit binary strings, i.e., ∆ : ({0, 1}^*)^* → {0, 1}^n.
Digest function ∆ is collision resistant if the digest collision-resistance advantage ε^DCR_{A,∆}(n) is negligible (in n), for every efficient adversary A ∈ PPT, where:

ε^DCR_{A,∆}(n) ≡ Pr[ (M, M′) ← A(1^n) s.t. M, M′ ∈ ({0, 1}^*)^* ∧ (M ≠ M′) ∧ ∆(M) = ∆(M′) ]    (4.17)

Obviously, collision-resistant digest functions are very similar to collision-resistant hash functions (CRHFs). It may seem that we can compute the digest of a sequence merely by concatenating all its inputs into a single string and hashing it. However, this is incorrect; intuitively, concatenation of the inputs does not preserve the 'inter-string boundaries'. For example, surely (11, 00) ≠ (1, 10, 0) ≠ (1, 1, 00), yet all three sequences concatenate to the same string, 1100.
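The point is easy to check mechanically; in this Python snippet, character strings stand in for the bit-strings above:

```python
# Three distinct sequences of 'messages' ...
seqs = [["11", "00"], ["1", "10", "0"], ["1", "1", "00"]]
# ... all concatenate to the same string, so a digest that simply hashed
# the concatenation would 'collide' on all of them.
assert len({"".join(s) for s in seqs}) == 1  # all three join to "1100"
```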
However, it is well known, and quite easy, to construct a collision-resistant digest function from a CRHF. The most well-known construction transforming a CRHF into a digest function is the Merkle-Damgård construction, which we present in Figure 4.16 and in Equations 4.18 and 4.19. Given a CRHF h and input M = {m1, . . . , ml}, the value of the Merkle-Damgård digest function based on h, which we denote MD^h.∆(M), is defined as follows:

165
For l = 1:  MD^h.∆({m1}) ≡ h(0^{n+1} ++ m1)    (4.18)
For l > 1:  MD^h.∆({m1, . . . , ml}) ≡ h(MD^h.∆(m1, . . . , m_{l−1}) ++ 1 ++ ml)    (4.19)
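As an illustration (not part of the formal construction), the following Python sketch implements Equations 4.18-4.19, with SHA-256 standing in for the CRHF h (so n = 256); for simplicity, the 0^{n+1} prefix and the separator bit 1 are modeled as whole bytes rather than single bits:

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()      # stand-in CRHF, n = 256 bits

def md_digest(msgs):
    """MD^h.Delta per Equations 4.18-4.19: chain the running digest
    through h, with a prefix/separator marking the first block."""
    d = h(b"\x00" * 33 + msgs[0])          # models 0^{n+1} ++ m_1
    for m in msgs[1:]:
        d = h(d + b"\x01" + m)             # models Delta ++ 1 ++ m_i
    return d

# Distinct sequences get distinct digests, even when their
# concatenations coincide.
assert md_digest([b"11", b"00"]) != md_digest([b"1", b"10", b"0"])
```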

Lemma 4.2. If h is a CRHF, then MD^h.∆ is a collision-resistant digest function.
Proof: assume, to the contrary, that adversary A_MD ∈ PPT has non-negligible advantage against MD^h.∆, i.e., ε^DCR_{A_MD,∆}(n) ∉ NEGL(n). We use A_MD to construct adversary A^h which has non-negligible advantage against h. Specifically, A^h runs A_MD, and whenever A_MD outputs a collision for MD^h.∆, then A^h outputs a collision for h. Let us explain how.

Let the collision be (M, M′), i.e., M, M′ ∈ ({0, 1}^*)^* ∧ (M ≠ M′) ∧ ∆(M) = ∆(M′). Denote the number of messages in M by l and the number of messages in M′ by l′, and, without loss of generality, assume l ≥ l′. The proof is by induction on l.
If l = 1 then l′ must also be one; hence MD^h.∆(M) = h(0^{n+1} ++ m1) and MD^h.∆(M′) = h(0^{n+1} ++ m′1). Since M ≠ M′, and in this case M = {m1} and M′ = {m′1}, it follows that m1 ≠ m′1. Let m̄ = 0^{n+1} ++ m1 and m̄′ = 0^{n+1} ++ m′1; it follows that m̄ ≠ m̄′. Hence, (m̄, m̄′) is a collision that A^h will output, as claimed.
Assume therefore that the claim holds for l = i; we prove it holds also for l = i + 1. First assume l′ > 1. We have

MD^h.∆({m1, . . . , m_{i+1}}) ≡ h(MD^h.∆(m1, . . . , mi) ++ 1 ++ m_{i+1})    (4.20)

and

MD^h.∆({m′1, . . . , m′_{l′}}) ≡ h(MD^h.∆(m′1, . . . , m′_{l′−1}) ++ 1 ++ m′_{l′})

and of course

MD^h.∆({m1, . . . , m_{i+1}}) = MD^h.∆({m′1, . . . , m′_{l′}})

Now, if the inputs to the hash are identical, then this means that MD^h.∆(m′1, . . . , m′_{l′−1}) = MD^h.∆(m1, . . . , mi), contradicting the induction hypothesis. If the inputs to the hash are different, then this is a collision. Hence, in both cases, A^h can output a collision, as claimed.
It remains to consider the case where l′ = 1 (and we prove for l = i + 1 after we proved for l = i). Equation 4.20 still holds, but in this case we have

MD^h.∆({m′1}) ≡ h(0^{n+1} ++ m′1)

and

MD^h.∆({m1, . . . , m_{i+1}}) = MD^h.∆({m′1}) = h(0^{n+1} ++ m′1)


Figure 4.17: A compression function: fixed input length to fixed (and shorter)
output length. For simplicity, assume input length is 2n.

Since the (n+1)-th bit differs between these two inputs to h, but their outputs are the same, it follows that also in this case, A^h outputs a collision, and the proof of the claim is complete.
Exercise 4.13. Show that collisions may be possible for a variant of the construction where Equation 4.19 is replaced by:

MD^h.∆({m1, . . . , ml}) ≡ h(MD^h.∆(m1, . . . , m_{l−1}) ++ 0 ++ ml)

Note: this is not an easy exercise. Hint: Assume some CRHF h′ and use it to design a very unlikely CRHF h for which such a collision is possible. Reading the proof carefully may help.
The Merkle-Damgård construction was proposed independently by Merkle and Damgård, in [52, 129]; I personally find Damgård's text easier to follow. Note, however, that in both works, and in most of the extensive work building on this important construction, it is not presented at all as a construction of a digest function from a hash function. Instead, it is presented as a construction of a CRHF from a compression function, where a compression function7 is defined just like a CRHF, except that its input domain is the set of binary strings of some specific length m > n (where n is still used for the size of the output string).
One motivation to build a cryptographic hash function from a compression function is the 'cryptographic building blocks' principle 8: The security of cryptographic systems should only depend on the security of a few basic building blocks. These blocks should be simple and with well-defined and easy-to-test security properties.
Cryptographic hash functions are one of these few building blocks of applied cryptography, due to their simplicity and wide range of applications. However, compression functions are even simpler, since their input is restricted to fixed-length strings. Let us, therefore, explain how the construction can also be used for this application, i.e., constructing a CRHF from a compression function.
7 The term compression function may not be the best choice; it may have been clearer

to refer to such functions, with Fixed-Input-Length (FIL), as FIL-hash functions. However,


the use of ‘compression functions’ is entrenched in the literature; hence, we think it is better
to use it anyway.

167
We first consider the case of hashing binary strings of length n′ which split exactly into (m−n)-bit 'messages', i.e., assume the input length n′ satisfies n′ = 0 mod (m−n). In this case, we don't need the MD-strengthening preprocessing step; given a collision-resistant compression function h, we can simply define the following CRHF h′:

h′(m′) = MD^h.∆( m′[1 : (m−n)], m′[(m−n+1) : 2·(m−n)], . . . , m′[|m′|−(m−n)+1 : |m′|] )    (4.21)

where we use the notation m′[i : j] for the sub-string of m′, from the i-th bit to the j-th bit.
It is easy to see, from Lemma 4.2, that if h is a collision-resistant compression function, then h′ would be collision-resistant.

MD-strengthening. The Merkle-Damgård construction is usually presented with an additional preprocessing step called MD-strengthening. This step is necessary since h′, as presented in Equation 4.21, is not yet a hash function: it is defined only for strings of length n′ such that n′ = 0 mod (m−n). So we need a further step that will allow us to handle any binary string - without this restriction. This extension is referred to as MD-strengthening8.
The basic idea of MD-strengthening is to pad the message m′ with (m−n) − [n′ mod (m−n)] additional bits, e.g., all zeroes, so that its length becomes an exact multiple of (m−n)-bit 'messages'. However, this introduces a small issue: the hash of m′ would be the same as that of m′ ++ 0 - an extra zero bit would have the same impact as a pad bit! This can be solved in different ways; MD-strengthening simply appends one additional (m−n)-bit 'message', which contains the binary encoding of the number of pad bits, i.e., of (m−n) − [n′ mod (m−n)]. Let bin_l(x) denote the l-bit string containing the encoding of an integer x, 0 ≤ x < 2^l; then we need to append bin_{m−n}((m−n) − [n′ mod (m−n)]). Therefore, the complete MD construction of a CRHF h_MD from a compression function h, including padding and MD-strengthening, is defined as:

h_MD(m′) = h′( m′ ++ 0^{(m−n)−[n′ mod (m−n)]} ++ bin_{m−n}((m−n) − [n′ mod (m−n)]) )    (4.22)

where bin_l(x) is defined above and h′ is defined in Equation 4.21.
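The padding and MD-strengthening steps can be sketched as follows in Python; here SHA-256 chaining stands in for the compression function, and R (in bytes) stands in for the m − n bits of message per block. The final assertion shows exactly the issue that strengthening solves: m′ and m′ ++ 0 no longer collide.

```python
import hashlib

R = 32  # bytes of message per block, standing in for m - n bits

def md_delta(blocks):
    """Digest of a block sequence (Merkle-Damgard chaining, with
    SHA-256 standing in for the compression function)."""
    d = b"\x00" * 32                       # fixed IV
    for b in blocks:
        d = hashlib.sha256(d + b).digest()
    return d

def h_md(m: bytes) -> bytes:
    """h_MD per Equation 4.22: zero-pad m to a multiple of R, then
    append one extra block encoding the pad length (MD-strengthening)."""
    pad = R - (len(m) % R) if len(m) % R else R   # pad count, in (0, R]
    padded = m + b"\x00" * pad + pad.to_bytes(R, "big")
    blocks = [padded[i:i + R] for i in range(0, len(padded), R)]
    return md_delta(blocks)

# Without the length block, these two inputs would pad to the same
# string; with MD-strengthening they digest differently.
assert h_md(b"\x01" * 31) != h_md(b"\x01" * 31 + b"\x00")
```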

Lemma 4.3. If h is a collision-resistant compression function, then h_MD, defined as in Equation 4.22, is a collision-resistant hash function (CRHF).

Proof sketch: follows from Lemma 4.2.


8 The reader may wonder about this term; ‘MD’ clearly stands for Merkle and Damgård,

so that’s fine, but why ‘strengthening’ ? It seems that this term is used in relation to naive
and vulnerable methods of padding the input to the Merkle-Damgård construction. But I
am not sure.

168
Initialization Vector (IV). The string of 0^n bits that we prepend to the first message m1 in Equation 4.18 is often referred to as an Initialization Vector (IV), similarly to the use of this term in some descriptions of the 'modes of operation' of block ciphers (§ 2.9). The choice of 0^n is completely arbitrary; any n-bit string could have been used, as long as it is a fixed string - if we allow the IV to be chosen freely each time, then collisions are possible (see next exercise). Note that it is possible to use a random IV as a way to key the hash function - but, again, only if this IV is used for all messages.
Exercise 4.14. Assume that |m1| = n, and consider a variant of the MD construction where we change Equation 4.18 so that for l = 1, we have MD^h.∆({m1}) ≡ m1. This variant 'saves' a hash operation; however, show that it may allow collisions.

4.7.2 The Extend Function and Validation of Entries and Extensions
Digest schemes provide additional integrity mechanisms beyond collision resistance. These mechanisms are useful for many applications and situations in which the sequence of messages is dynamic, such as in a log scenario. Clearly, in a log, new messages may be added over time. Furthermore, we may want to add messages to the log, to validate that a particular message appears in the log, or to validate that a new digest of a log is consistent with a previous digest, all without requiring the entire set of messages. There are two motivations for not requiring the entire set of messages: improved efficiency - and allowing validation and log-extension by different parties, who may not even possess all the messages in the log. The reader may already see how this will soon bring us to more elaborate digest schemes, such as Merkle digests and Blockchains - the topics of the following two sections.
However, for now, we still continue to discuss the simpler digest-chain scheme. Our discussion so far was limited to the digest function, which can be viewed as a very basic digest-chain scheme; we now extend it, to define a 'proper' digest-chain scheme.
The extension involves only one more function, which we refer to as the extend function, and denote Extend. This function receives the 'current' digest and a sequence of (one or more) additional ('new') messages, and produces the 'new' digest. The only additional requirement we need to make is that the extend function is consistent with the digest function, i.e., that for any given ∆_l = ∆(M_l) and sequence of additional messages M_{l+1,l′}, holds:

Extend(∆_l, M_{l+1,l′}) = ∆(M_l ++ M_{l+1,l′})    (4.23)

The definition of a digest-chain scheme and its security requirements follows.
The definition of a digest-chain scheme and its security requirements follows.
Definition 4.12. A Digest-Chain scheme is a pair (∆, Extend) of PPT-computable functions:
∆ is a digest function as defined in Definition 4.11.
Extend is the extend function, whose inputs are a digest ∆_l and a sequence of 'additional' messages M_{l+1,l′}, and whose output is a 'new' digest ∆_{l′}.
A digest-chain scheme is correct if for any given ∆_l = ∆(M_l) and sequence of additional messages M_{l+1,l′}, Equation 4.23 holds.
A digest-chain scheme is secure if it is correct and its digest function is collision-resistant (see Definition 4.11).
In spite of this simple definition and requirements, there are three different ways to use the Extend function, for different applications and scenarios:
Extend current digest: this is direct use of Extend to extend the sequence of messages, M_l = {m1, . . . , ml}, with additional messages M_{l+1,l′} = {m_{l+1}, . . . , m_{l′}}. The Extend function will receive as input the current digest ∆_l = ∆(M_l) and the sequence of additional messages M_{l+1,l′}, and produce the new digest ∆_{l′}. The basic correctness property is that this would be the digest of the entire sequence, i.e., that ∆_{l′} = ∆(M_l ++ M_{l+1,l′}).
Validate digest consistency: in this use-case, the current digest ∆_l and the new digest ∆_{l′} are computed by one entity, e.g., a bank, and received by a different entity, Val, e.g., a customer. Val may want to validate that ∆_{l′} is consistent with ∆_l, and with a given set of new messages, e.g., transactions, M_{l+1,l′} = {m_{l+1}, . . . , m_{l′}}. Namely, Val needs to know that ∆_{l′} = ∆(M_l ++ M_{l+1,l′}), for some set of messages M_l committed-to by the old digest ∆_l, i.e., ∆_l = ∆(M_l). Notice that in some applications, Val may not even be interested in the specific additional transactions in the set M_{l+1,l′}, but they must be used for validation when using a digest-chain scheme; this will be avoided in the Merkle-digest scheme and Blockchain scheme, presented in the following sections.
Validate inclusion: in this use-case, Val is an entity who has a digest ∆_{l′}, and receives a particular message m_{l+1}. Val wants to validate that m_{l+1} appeared in the sequence whose digest is the known ∆_{l′}, possibly also with its sequence number. To this end, Val must be provided with ∆_l = ∆(M_l) and with the entire sequence of additional messages M_{l+1,l′}, and use these to reproduce ∆_{l′}.

The Merkle-Damgård extend function. As mentioned in subsection 4.6.1, it is well known that the Merkle-Damgård construction allows extension; in fact, for some applications such as MAC, this is not always a welcome feature. However, it is a required feature for a digest-chain scheme. We therefore define the Merkle-Damgård extend function MD^h.Extend, based on a hash function h:

For l = 1:  MD^h.Extend(∆, {m1}) ≡ h(∆ ++ 1 ++ m1)
For l > 1:  MD^h.Extend(∆, {m1, . . . , ml}) ≡ MD^h.Extend(h(∆ ++ 1 ++ m1), {m2, . . . , ml})    (4.24)
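The extend function simply continues the chaining. The following self-contained Python sketch (SHA-256 standing in for h, with the 0^{n+1} prefix and separator bit modeled as whole bytes) demonstrates the consistency requirement of Equation 4.23:

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def md_digest(msgs):                       # MD^h.Delta, Eqs. 4.18-4.19
    d = h(b"\x00" * 33 + msgs[0])
    for m in msgs[1:]:
        d = h(d + b"\x01" + m)
    return d

def md_extend(d, msgs):                    # MD^h.Extend, Equation 4.24
    for m in msgs:
        d = h(d + b"\x01" + m)
    return d

# Equation 4.23: extending the digest of a prefix equals digesting the
# whole sequence -- the old messages themselves are not needed.
assert md_extend(md_digest([b"a", b"b"]), [b"c"]) == \
       md_digest([b"a", b"b", b"c"])
```

Note that the party running md_extend holds only the current digest and the new messages; this is exactly what enables validation and log-extension by parties that do not possess the full log.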

170
Lemma 4.4. If h is a CRHF, then (MD^h.∆, MD^h.Extend) is a secure digest-chain scheme.

Proof: Correctness follows by substitution, using the definitions of the functions, and collision resistance of MD^h.∆ was proven in Lemma 4.2.

4.8 The Merkle Digest Scheme


Like the digest-chain scheme discussed in § 4.7, the Merkle digest scheme extends the integrity-goal of CRHF, mainly by using a digest function ∆ : {mi ∈ {0, 1}^*}_i → {0, 1}^n. Specifically, instead of hashing a single input, the input to the Merkle digest function is a set of objects, often referred to as files or messages. For example, a typical software distribution may consist of multiple messages, e.g., (m1, m2, m3, m4), such as when using large libraries. The Merkle-digest ∆ function also has the same collision-resistance requirement as the digest-chain ∆ function.
There are two main differences between the Merkle digest scheme and the digest-chain scheme. The first difference is that in the Merkle digest scheme, validation of inclusion is a mandatory, important mechanism, whose efficiency is often a main design goal of constructions of Merkle digest schemes. In typical applications of Merkle digest schemes, it is desirable to allow recipients to efficiently validate integrity (inclusion) of one or few messages that they need, using only limited, concise information, for efficiency (and sometimes also for privacy). To support validation of inclusion, Merkle schemes (always) include the PoI function, that computes a Proof-of-Inclusion, and the VerPoI function, that validates a given PoI.
The other main difference between the digest-chain scheme and the Merkle-digest scheme is in the mechanisms and requirements related to sequence extension and validation. Recall that the digest-chain scheme handles extensions of the message-sequence, as well as validation of consistency and of inclusion, all using the same - optional - Extend function, as described in subsection 4.7.2. In the Merkle-digest scheme, validation of inclusion is done by the (mandatory) PoI and VerPoI functions, as explained; while sequence extension and validation are handled with the (optional) Proof-of-Consistency (PoC) and validation of PoC (VerPoC) functions. These functions are often more complex; in fact, while we present two Merkle scheme constructions, we describe the extension and validation-of-consistency mechanisms only for one of the two (with a simple - but inefficient - design).
We next describe the Merkle digest scheme and two constructions; however, let us first say a few words on our naming of the scheme and of the constructions.

On naming: Merkle digest scheme vs. the Merkle tree construction.
The term Merkle tree is widely used in the literature, often in reference to the 'classical' construction proposed by Ralph C. Merkle, one of the pioneers of modern cryptography, in [128]. Merkle's construction, and variants of it, are usually based on a CRHF and on the graph-theory concept of a tree9, and are widely used in cryptography. We present below two of the simpler, classical constructions, and mention a few of their many applications.
However, while different constructions were proposed, they are usually all referred to as Merkle tree or some variant of that name; the term is also used to refer to this type of cryptographic scheme (rather than a specific construction). However, the author believes that it is important to distinguish between an abstract scheme (e.g., block cipher) and constructions of it (e.g., AES and DES).
We think it is better to use different terms for the abstract scheme and for each specific construction. Specifically, we use the term Merkle digest to refer to the abstract scheme. We use the terms Merkle tree (MT) and two-layered hash tree construction (2lMT) to refer to the two specific constructions we present. The MT construction is a minor variant of the one by Merkle (in [128]), and the 2lMT construction is a simpler, 'folklore' construction. We also discuss the extensible Merkle digest scheme, which adds functionality to the Merkle digest scheme, and constructions for it.
Note, also, that our definition of a Merkle digest scheme is not a widely-adopted definition; the author is not aware of any existing widely-deployed definition for such schemes, and believes that this definition is a bit simpler than some of the few definitions that the author did find. Comments on these points (and other aspects of this manuscript) would be highly appreciated.

4.8.1 The Merkle digest scheme: Definitions


Let us begin with a slightly informal description of the Merkle digest scheme. Similarly to hash functions, one can define either a keyless or a keyed Merkle digest scheme. Following the discussion in subsection 4.2.2, we only define the keyless variants. Most practical protocols using Merkle digest schemes (or the specific Merkle tree construction) use the keyless versions, constructed from keyless hash functions. Let us finally get to the definition of the Merkle digest scheme, as well as its correctness and security requirements.
A Merkle digest scheme consists of three functions: a digest function ∆, a Proof-of-Inclusion function PoI and a Proof-of-Inclusion Verification predicate VerPoI. Similarly to keyless hash functions, the three functions of the Merkle digest scheme are defined for different security parameter n, which is also the length of the digest, but we usually do not explicitly write it as a parameter. If we do need to explicitly write it, we will use the same notation as for keyless hash functions, e.g., ∆^(n).
Intuitively, ∆ produces an n-bit digest ∆(M) ∈ {0, 1}^n of a sequence M = {mi ∈ {0, 1}^*}_i of messages10, generalizing the digest functionality of a hash function (which receives a single item rather than a set). Function PoI
9 Readers not familiar with graph theory, and specifically trees, are encouraged to learn this fascinating and useful topic, e.g., see [71, 172]. However, this is not really critical for understanding the constructions.
10 Sometimes each message is restricted in length, e.g., mi ∈ {0, 1}^n.

produces a proof PoI(M, i) for the contents of the i-th message in the set M (i.e., mi), and VerPoI(d, m, i, p) uses proof p to verify that m is the i-th message in some set M whose digest is d (i.e., ∃M s.t. m = M[i] ∧ d = ∆(M)). Merkle digest constructions are optimized not only for computation time, but also to have succinct proofs, e.g., only n · log |M| bits.

Definition 4.13 (Merkle digest scheme). A Merkle digest scheme M is a tuple of three PPT functions (M.∆, M.PoI, M.VerPoI), where:

M.∆ is the Merkle digest function, whose input is a sequence of messages M = {mi ∈ {0, 1}^*}_i and whose output is an n-bit digest: M.∆ : ({0, 1}^*)^* → {0, 1}^n.
M.PoI is the Proof-of-Inclusion function, whose inputs are a sequence of messages M = {mi ∈ {0, 1}^*}_i and an integer i ∈ [1, |M|] (the index of one message in M), and whose output is a Proof-of-Inclusion (PoI): M.PoI : ({0, 1}^*)^* × N → {0, 1}^*.
M.VerPoI is the Verify-Proof-of-Inclusion predicate, whose inputs are digest d ∈ {0, 1}^n, message m ∈ {0, 1}^*, index i ∈ N, proof p ∈ {0, 1}^*, and whose output is a bit (1 for 'true' or 0 for 'false'): M.VerPoI : {0, 1}^n × {0, 1}^* × N × {0, 1}^* → {0, 1}.

A Merkle digest scheme M is correct if for every sequence of messages M = {mi ∈ {0, 1}^*}_i and every index i ∈ [1, |M|], the Proof-of-Inclusion verifies correctly, i.e.:

M.VerPoI(M.∆(M), mi, i, M.PoI(M, i)) = 1    (4.25)

A Merkle digest scheme M is secure if for every efficient (PPT) algorithm A, both the collision advantage ε^Coll_{M,A}(n) and the PoI advantage ε^PoI_{M,A}(n) are negligible in n, i.e., smaller than the inverse of any positive polynomial for sufficiently large n (as n → ∞), where:

ε^Coll_{M,A}(n) ≡ Pr[ (x, x′) ← A(1^n) s.t. (x ≠ x′) ∧ (M.∆(x) = M.∆(x′)) ]
ε^PoI_{M,A}(n) ≡ Pr[ (d, m, i, p) ← A(1^n) s.t. M.VerPoI(d, m, i, p) ∧ (∄x)(d = M.∆(x) ∧ m = x[i]) ]

where the probability is taken over the random coin tosses of A.

4.8.2 Extending the sequence: Proofs of Consistency


In many applications of the Merkle digest scheme, the digest is always computed and validated by applying the ∆ function to the entire sequence of messages. However, there are also applications of Merkle digest schemes where entries may be added to the sequence over time, motivating the use of special operations to extend and validate the consistency of the digest of the extended sequence - like the Extend function of the digest-chain scheme (subsection 4.7.2). For example, this may be desirable in order to allow a recipient to validate new entries and the corresponding new digest, even if the recipient did not maintain all the entries which were included in the sequence so far, e.g., when maintaining a log or ledger of transactions.
The Merkle digest scheme, as defined in Definition 4.13, allows verification of the integrity of the sequence (e.g., log) using just the digest, and verification of a particular entry, using the PoI mechanism. However, to allow the sequence of events to be extended over time, we need an additional mechanism: Proof of Consistency (PoC). A PoC allows one to verify that a given new digest d_N is consistent with the known, 'current' digest d_C; namely, if d_C is the digest of some sequence M_C = {m_1, . . . , m_{l_C}} with l_C entries, then d_N is the digest of a sequence M_N with l_N > l_C entries which is an extension of M_C, i.e., M_N = {m_1, . . . , m_{l_C}, . . . , m_{l_N}}.

Definition 4.14 (Proof-of-Consistency). We say that Merkle digest scheme M supports Proof-of-Consistency given two additional functions, M.PoC for extending a digest, and M.VerPoC, where:

M.PoC is the Extend and Proof-of-Consistency function, whose inputs are two sequences and whose output is a Proof-of-Consistency: M.PoC : ({0, 1}^*)^* × ({0, 1}^*)^* → {0, 1}^*. We usually abbreviate the name of this function and simply refer to it as Proof-of-Consistency (PoC).
M.VerPoC is the Verify-Proof-of-Consistency predicate, whose inputs are two digests d1, d2 and a PoC x, and whose output is a bit: M.VerPoC : ({0, 1}^*)^3 → {0, 1}.

We will use M to refer to the entire scheme, including also these two additional functions.
We say that M has correct PoC if for every two sequences of messages M = {mi ∈ {0, 1}^*}_i, M′ = {m′i ∈ {0, 1}^*}_i, their Proof-of-Consistency verifies correctly, i.e.:

M.VerPoC(M.∆(M), M.∆(M ++ M′), M.PoC(M, M′)) = 1    (4.26)

We say that M has secure PoC if for every efficient (PPT) algorithm A, the PoC-advantage ε^PoC_{M,A}(n) is negligible in n, where:

ε^PoC_{M,A}(n) ≡ Pr[ (d, d′, x) ← A(1^n) s.t. M.VerPoC(d, d′, x) ∧ (∄M, M′ ∈ ({0, 1}^*)^*)(d = M.∆(M) ∧ d′ = M.∆(M ++ M′)) ]

where the probability is taken over the random coin tosses of A.

Merkle Digest PoC vs. Blockchain PoC. Merkle digests and digest-chains are both used in many Blockchain systems. Blockchain systems allow parties to agree on a sequence of blocks of events (entries). Blockchains, like Merkle digests, provide PoC and VerPoC mechanisms to ensure consistency. However, this does not necessarily mean that they use the Proof-of-Consistency of the Merkle scheme; often, blockchains use only the PoC of the digest-chain (using the Extend function). We discuss the blockchain scheme and constructions in § 4.9.

4.8.3 Merkle Digest scheme and Privacy


Using Merkle Digest schemes for privacy. A Merkle Digest scheme may
also be useful for privacy, when some recipients should have access only to
some files, e.g., if each file mi contains data which is private to user i. Note,
however, that CRHFs - and Merkle digests - may not ensure confidentiality.
Collision-resistance does not ensure that the value of h(m) will not expose some
information about m. Namely, for such privacy properties, the hash function
h should also have additional properties, such as other properties we discuss
later on for general-purpose cryptographic hash functions.

Exercise 4.15. Let h be a (keyed or keyless) CRHF. Use h to design another


hash function g, s.t. (1) g is also a CRHF, yet (2) g exposes one or more bits
of its input. Explain why this implies that the two Merkle constructions we
discussed do not guarantee privacy. In particular, explain why the PoI of one
message may expose information about other messages.

4.8.4 2lMT : the two-layered Merkle Tree construction


In this and the next subsection, we present two 'classical' constructions of a Merkle digest scheme, from an underlying CRHF. In this subsection, we begin with the simple Two-layered Merkle Tree (2lMT) construction; this construction is not very efficient - in particular, for input which is a sequence of l messages, the resulting Proof-of-Inclusion requires l · n bits. Nevertheless, for some applications, this construction suffices; and it is surely a good stepping stone to the more complex and efficient Merkle tree (MT) construction.
As the name Two-layered Merkle Tree (2lMT) implies, the construction operates in 'two layers': we first apply the hash function to each of the input messages, and then apply it again - to the concatenation of the digests of all messages. We illustrate the keyless and keyed variants of the Two-layered Merkle Tree (2lMT) construction in Figure 4.18 and Figure 4.19, respectively. The definition for the keyless 2lMT follows; we leave the definitions of the keyed versions, with ACR-hash and with TCR-hash, to the reader.

Definition 4.15 (The Two-layered Merkle Tree (2lMT ) construction). Given


a keyless function h : {0, 1}∗ → {0, 1}n , we define the Two-layered Merkle Tree
(2lMT ) construction as follows, for input containing l messages:

Figure 4.18: The keyless Two-layered Merkle Tree (2lMT ) construction, ap-
plied to input set of four-messages M ≡ {m1 , m2 , m3 , m4 }. Let xi = h(mi )
denote the hash of a single message mi (for i ∈ {1, 2, 3, 4}). The digest
of the set M is the hash of the concatenation of the hashes of each mes-
sage, i.e., y = h [h(m1 ) +
+ h(m2 ) +
+ h(m3 ) +
+ h(m4 )]. We denote the digest
by 2lMT .∆(M ) = y.

Figure 4.19: The keyed Two-layered Merkle Tree (2lMT ) construction, ap-
plied to input set of four-messages M ≡ {m1 , m2 , m3 , m4 }. Let xi = hk (mi )
denote the hash of a single message mi (for i ∈ {1, 2, 3, 4}). The digest of
the set M is the hash of the concatenation of the hashes of each message,
i.e., y = hk [hk (m1 ) +
+ hk (m2 ) +
+ hk (m3 ) +
+ hk (m4 )]. We denote the digest by
2lMT .∆k (M ) = y.

2lMT.∆(m1, . . . , ml) ≡ h[ h(m1) ++ . . . ++ h(ml) ]
2lMT.PoI((m1, . . . , ml), j) ≡ {h(mi)}_{i=1}^{l}
2lMT.VerPoI(d, m, i, {xi}_{i=1}^{l}) ≡ True iff xi = h(m) and d = h(x1 ++ . . . ++ xl)

Theorem 4.2. The Two-layered Merkle Tree (2lMT ) construction is a correct


and secure Merkle digest scheme.
Proof sketch: correctness follows immediately from the definitions. To prove security, first show that any collision of 2lMT .∆ gives a collision of h. Assume now that some Proof-of-Inclusion verifies correctly, i.e., 2lMT .V erP oI(d, m, i, {xi }li=1 ) = 1, but d is not the digest of some sequence M of messages whose ith message is m (i.e., M [i] = m). Show, from the construction, that this also implies a collision for h.
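To make the construction concrete, here is a minimal Python sketch of the three 2lMT functions; SHA-256 stands in for the keyless hash h, and indices are 0-based (both choices are assumptions of the sketch, not part of the definition):

```python
import hashlib

def h(data: bytes) -> bytes:
    """Stand-in for the keyless hash function h (here: SHA-256)."""
    return hashlib.sha256(data).digest()

def tlmt_digest(messages):
    """2lMT.Delta: hash each message, then hash the concatenated digests."""
    return h(b"".join(h(m) for m in messages))

def tlmt_poi(messages, j):
    """2lMT.PoI: the proof is simply the list of all first-layer digests."""
    return [h(m) for m in messages]

def tlmt_verpoi(d, m, j, xs):
    """2lMT.VerPoI: check that xs[j] hashes m and that xs hash to d."""
    return xs[j] == h(m) and d == h(b"".join(xs))

msgs = [b"m1", b"m2", b"m3", b"m4"]
d = tlmt_digest(msgs)
proof = tlmt_poi(msgs, 2)
assert tlmt_verpoi(d, b"m3", 2, proof)      # m3 is the message at index 2
assert not tlmt_verpoi(d, b"m5", 2, proof)  # a different message is rejected
```

Note that the proof is the full list of first-layer digests, i.e., l · n bits.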
The obvious disadvantage of the 2lMT construction is that the Proofs-of-Inclusion are of length l · n bits, which is inefficient; this is the main motivation for the use of the MT construction, which we show in the next subsection.
We next show the P oC for the 2lMT construction. Again, it is very simple - which is our main motivation for showing it - but definitely inefficient, with a long Proof-of-Consistency.
Definition 4.16 (Proof-of-Consistency for the Two-layered Merkle Tree). Given a keyless function h : {0, 1}∗ → {0, 1}n , we define the 2lMT .P oC and 2lMT .V erP oC functions as:

2lMT .P oC((m1 , . . . , ml ), (m01 , . . . , m0l0 )) ≡ ({h(mi )}li=1 , {h(m0i )}l0i=1 )

2lMT .V erP oC(d, d0 , {xi }li=1 , {x0i }l0i=1 ) ≡ (d = h(x1 ++ . . . ++ xl )) ∧ (d0 = h(x01 ++ . . . ++ x0l0 ))

Theorem 4.3. The Two-layered Merkle Tree (2lMT ) construction, with 2lMT .P oC and 2lMT .V erP oC, has a correct and secure P oC.

Proof sketch: Similar to that of Theorem 4.2.
When l, the number of messages/files, is large, then both proof-of-inclusion
and proof-of-consistency may become long. We next present the ‘original’
Merkle tree (MT ) construction, which improves efficiency for this case (large
l).

4.8.5 The Merkle tree MT construction


We now present MT , the ‘classical’ Merkle tree construction, first presented in Merkle’s seminal paper [128]. The MT can be seen as the next logical step from the 2lMT construction; it is a bit more complex: instead of using two layers, it uses dlog(l)e layers, where l is the number of input strings. This allows better efficiency: the P oI contains only dlog(l)e digests (n · dlog(l)e bits), and verifying it requires only dlog(l)e hash operations.
We illustrate the computation of the digest in the MT construction in Figure 4.20, and the computation of the Proof-of-Inclusion (PoI) in Figure 4.21, both for the simple case of a digest of a set of four messages (l = 4). As usual, for simplicity, we show and discuss only the keyless variant; it is not difficult to modify the construction for the case of keyed hash (assuming ACR or TCR, see subsection 4.2.3).
Except for the very bottom ‘layer’ hashing the input messages, all other
hash applications have 2n-bit inputs to each hash function. In fact, Merkle’s
original construction in [128] is the same, except that it did not have this
‘bottom’ layer, and simply assumed that the input messages are each only n

m1 m2 m3 m4

h1 h2 h3 h4

h1−2 h3−4

h1−4 = h(h1−2 ++ h3−4 )

Figure 4.20: The Merkle Tree (MT ) construction, applied to input set of four
messages M ≡ {m1 , m2 , m3 , m4 }. Let hi = h(mi ) denote the hash of a single
message mi , and hi−j denote the MT digest for messages mi , . . . , mj ; the
complete digest is MT .∆(M ) = h1−4 = h(h1−2 ++ h3−4 ). The figure does not indicate the use of the key, as it shows the keyless MT ; the only change required for the keyed version would be that all hashing is done using a common, random key, known to the attacker.

bits - or, that the input is one long string broken down to chunks of n bits.
In any case, let us focus on our variant; in the special case where inputs are
strings of n bits each, one can simply eliminate one layer of hashing (i.e., use
Merkle’s original construction). For simplicity, assume that the number l of
inputs is a power of two, say l = 2L . It is easy to extend the construction to
support any l, e.g., by padding.
This construction has L + 1 ‘layers’, which we number from i = 0 to i = L.
The first layer, i = 0, is unique: it applies the hash function to each of the 2L
inputs. The outputs are 2L strings of n bits each. The next layer hashes each consecutive pair of outputs of the previous layer, i.e., 2n bits, resulting in 2L−1 strings of n bits each. The process then repeats, with each layer i (from i = 1 to i = L) receiving 2L−(i−1) strings from the previous layer (i − 1), and outputting 2L−i digests (n bits each). This continues, each time halving the total number of bits, until we remain with only the final n-bit digest.
The definition for the keyless MT construction follows; we leave the defi-
nition of the keyed versions, using ACR and using TCR hash, to the reader.
The definition is a bit ‘hairy’; most readers may prefer to only understand the
construction from Figure 4.20 and Figure 4.21.

Definition 4.17 (The (keyless) Merkle Tree (MT ) construction). Given a


keyless hash function h : {0, 1}∗ → {0, 1}n , we define the Merkle Tree (MT )
construction (MT .∆, MT .P oI, MT .V erP oI) for l = 2L input messages, as
in Figure 4.22.

m1 m2 m3 m4

h1 h2 h3 h4

h1−2 h3−4

h1−4 = h(h1−2 ++ h3−4 )

Figure 4.21: Example of Proof-of-Inclusion (PoI) in the Merkle Tree (MT ). The PoI for m2 consists of h1 = h(m1 ) and h3−4 = h(h(m3 ) ++ h(m4 )). For simplicity, the figure (and the PoI equations) use keyless hash; the only change required for keyed MT would be to use keyed hash instead.


MT .∆(M ) ≡ If L = 0 : h(m1 )
            Else : h (MT .∆ (m1 , . . . , m2L−1 ) ++ MT .∆ (m2L−1 +1 , . . . , m2L ))

MT .P oI(M, j)1≤j≤l ≡ If L = 0 : (empty string)
                      If j > 2L−1 : MT .∆ (m1 , . . . , m2L−1 ) ++ MT .P oI ((m2L−1 +1 , . . . , m2L ), j − 2L−1 )
                      Else : MT .∆ (m2L−1 +1 , . . . , m2L ) ++ MT .P oI ((m1 , . . . , m2L−1 ), j)

MT .V erP oI(d, m, j, {xi }Li=1 ) ≡ RV erP oI(d, h(m), j, {xi }Li=1 ) (defined below)

RV erP oI(d, x0 , j, {xi }Li=1 ) ≡ If L = 1 ∧ j = 0 : d = h(x1 ++ x0 )
                                 If L = 1 ∧ j = 1 : d = h(x0 ++ x1 )
                                 If j > 2L−1 : RV erP oI(d, h(xL ++ x0 ), bj/2c, {xi }L−1i=1 )
                                 Else : RV erP oI(d, h(x0 ++ xL ), bj/2c, {xi }L−1i=1 )

Figure 4.22: The Merkle Tree (MT ) construction for l = 2L input messages:
M = {m1 , . . . , m2L }.

As expected, the MT construction results in a correct and secure MT


scheme.

Theorem 4.4. The (keyless) Merkle Tree (MT ) construction (Fig. 4.22) is
a correct and secure MT scheme.

Proof sketch: Similar to Theorem 4.2.
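The recursion can be made concrete with a short Python sketch; SHA-256 stands in for h, and leaf indices are 0-based (assumptions of the sketch, not of the definition):

```python
import hashlib

def h(data: bytes) -> bytes:
    """Stand-in for the keyless hash function h (here: SHA-256)."""
    return hashlib.sha256(data).digest()

def mt_digest(msgs):
    """MT.Delta for l = 2**L messages: hash each leaf, then hash pairs up to the root."""
    if len(msgs) == 1:
        return h(msgs[0])
    half = len(msgs) // 2
    return h(mt_digest(msgs[:half]) + mt_digest(msgs[half:]))

def mt_poi(msgs, j):
    """MT.PoI (0-based j): the sibling digests on the path from leaf j to the root."""
    if len(msgs) == 1:
        return []
    half = len(msgs) // 2
    if j < half:
        return mt_poi(msgs[:half], j) + [mt_digest(msgs[half:])]
    return mt_poi(msgs[half:], j - half) + [mt_digest(msgs[:half])]

def mt_verpoi(d, m, j, path):
    """MT.VerPoI: recompute the root from h(m) and the sibling digests."""
    x = h(m)
    for sib in path:
        x = h(x + sib) if j % 2 == 0 else h(sib + x)
        j //= 2
    return x == d

msgs = [b"m1", b"m2", b"m3", b"m4"]
d = mt_digest(msgs)
for j, m in enumerate(msgs):
    assert mt_verpoi(d, m, j, mt_poi(msgs, j))      # every leaf verifies
assert not mt_verpoi(d, b"bogus", 1, mt_poi(msgs, 1))
```

Each proof here contains one sibling digest per layer, i.e., only log(l) digests, in contrast to the l digests of the 2lMT construction.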

MT extensions: arbitrary l and P oC. We simplified the MT construc-
tion by assuming that the number of messages in the input, l, is a power of two
(l = 2L ). Under this assumption, it is also easy to define P oC and V erP oC
functions; we leave this as an exercise.
It is a bit more challenging to extend the MT construction for arbitrary
number of messages l (not a power of two), and to define the appropriate P oC,
V erP oC functions, and we will not do it here. It is not very difficult, so readers
are encouraged to work out the details, and/or see published constructions, e.g.,
in [48, 61, 115].

Storage concerns. Computation of the Merkle-tree hash requires at least storage of one hash value for each layer of the tree, which is particularly inconvenient in hardware. This is one of the reasons that this construction is not used to construct CRHFs from compression functions.

Security concern: fragile security for collisions. There is another concern with the use of Merkle trees, in particular for hashing: the construction is very vulnerable to the discovery of a collision in the underlying hash (or compression) function. Given any collision of the underlying hash/compression function, we can find collisions for the entire Merkle tree - for any given prefix and suffix.

4.9 Blockchains, Proof-of-Work (PoW) and Bitcoin


In this section, we discuss one of the most exciting applications of cryptographic hash functions: the blockchain. A blockchain is yet another integrity mechanism for sequences of entries - highly related to the digest-chain and Merkle digest schemes; so far, it appears that most, or maybe even all, proposed blockchain schemes were based on the combination of a digest-chain scheme and a Merkle digest scheme - or at least on one of the two.
The entries in the blockchain may contain different data, e.g., transactions.
In many applications, entries are signed, although this is not always the case.
One typical use of signed entries is when the blockchain provides an auditable
append-only log of signed transactions.
Often, blockchains are used to track transfer of objects (or payments) be-
tween entities; such log of transfer of objects or funds between parties, is often
referred to as a ledger; blockchains allow maintenance of public ledgers, allowing
everyone to audit and validate the ledger.

Anonymity? Many blockchain applications, e.g. Bitcoin, maintain some


level of anonymity, by identifying the recipient of an object (or payment) only
using their public validation key, rather than by name. For example, Alice,
whose (validation, signing) key pair is (A.v, A.s), may transfer object or pay-
ment x to an entity identified only by their public validation key v, by adding
entry SignA.s (v ++ x).

Figure 4.23: A basic blockchain: sequence of digests ∆1 , ∆2 , . . . of a sequence
of blocks B1 = {m11 , m21 , . . .}, B2 = {m12 , . . .}, . . .

4.9.1 The blockchain digest scheme


Like the digest-chain and the Merkle digest schemes, the blockchain maintains
a digest of the sequence of entries, in a way which ensures immutability: it
is infeasible to add, remove, reorder or modify entries, without changing the
digest. We refer to this as the collision-resistance property of the blockchain.
We focus on the most common construction of a blockchain digest scheme,
which combines a digest-chain scheme with a Merkle digest scheme. Most
properties - and definitely the collision-resistance property - follow from the
corresponding properties of the digest-chain and Merkle-digest schemes, which,
in turn, are due to the collision-resistance of the underlying hash function.

Blocks. In a blockchain, as the name implies, entries are added in groups


called blocks, which are added to the blockchain sequentially (hence, a chain).
The mapping of entries to blocks is also immutable: it is infeasible to move
an entry from one block to another, while retaining the same digest. Even the
last entry of one block cannot be ‘moved’ to become the first entry of the next
block, or vice versa. This property can be useful, e.g., to map all operations in
the block to a particular period.
Often, blockchains use a Merkle scheme - often, the Merkle tree - to compute
the digest within each block, and then use a digest-chain scheme, typically
the Merkle-Damgård scheme, to compute the digest of multiple blocks. See
Figure 4.23.
Blockchain digest schemes focus on ensuring the integrity/immutability of
the blockchain, and of particular entries in it. Like for CRHF and Merkle
digest schemes, integrity is validated using the digest, i.e., integrity is assured
with respect to a blockchain digest.
The blockchain digest scheme, like the other digest schemes (and like a
CRHF), does not include mechanisms to validate the digest itself. If attackers
may provide their own (fabricated) digest, then they can compute a digest
that will fit the entries they want. Namely, any entity may ‘add a block’ to
an existing blockchain, in the sense of computing the new digest, without any
restrictions or ‘controls’. See Figure 4.23.

Blockchain digest schemes, like Merkle digest schemes, support two addi-
tional validation mechanisms: Proof-of-Inclusion (PoI) and Proof-of-Consistency
(PoC).

Proof-of-Inclusion (PoI). A PoI proves that a particular entry is included


in the blockchain, and identifies its block and sequence number. However, there
is a significant difference in the inputs and outputs of the PoI, as defined in
the Merkle digest scheme and in the blockchain digest scheme:

Merkle scheme PoI is produced based on the entire sequence of messages, although validation does not require the entire set of messages. In MT and most other efficient constructions, the PoI is of size O(n · log l) bits and validation requires O(log l) hash operations.

Blockchain PoI requires only the messages in the same block, and the digest of the blockchain before this block is added; and validation requires only a single hash operation.

Proof-of-Consistency(PoC). A PoC is used to validate that a newly re-


ceived digest, is of a sequence whose prefix is a previous digest. Again, there are
important differences in the inputs and outputs of the PoC, between blockchains
and Merkle schemes:
Merkle scheme PoC is produced based on both the original sequence of entries, and the entries extending it. In MT and most other efficient constructions, the PoC is of size O(n · log l) bits and validation requires O(log l) hash operations.

Blockchain PoC requires only the digest of the blockchain before this block, and the entries in the new block. The PoC is typically only n bits (one digest, essentially), and validation requires only one hash operation.

4.9.2 Controlled blockchains: permissioned and


permissionless
In the rest of this section, we present controlled blockchains - permissioned and
permissionless, the Bitcoin blockchain-based cryptocurrency, and finally, Proof-
of-Work (PoW), a crucial component of Bitcoin and many other blockchains,
usually implemented using hash functions.
Our discussion of blockchains so far did not involve any restrictions or
controls over the blocks added to a blockchain. However, many applications,
e.g. Bitcoin and other crypto-currencies, require a mechanism that controls
the addition of new blocks to the blockchain. In fact, when people refer to
‘blockchains’, they usually mean a controlled blockchain. Different control
mechanisms were proposed; they mostly fall into one of the following two cat-
egories:

Figure 4.24: Permissioned Blockchain: blocks added only with permission -
signature which is validated with known authority key vAuthority . If there are
multiple authorities, use consensus to avoid conflicts.

Figure 4.25: The permissionless Bitcoin Blockchain uses Proof-of-Work (PoW)


to control the addition of new blocks, in a process called mining.

Permissioned blockchains, where only specific, authorized parties can add a block to the blockchain. In a typical permissioned blockchain, every blockchain digest must be signed using the private signing key of (one or more) authorities; namely, a new block is accepted only if its digest is accompanied by a signature which is validated using the known public key of the authority, as in Figure 4.24.

Permissionless blockchains, where any party may, in principle, add a block to the blockchain, as long as it ‘wins’ in a fair game which limits the issuing of new blocks. The two main control mechanisms are Proof-of-Work (PoW) and Proof-of-Stake. The Bitcoin blockchain uses PoW, as shown in Figure 4.25, and therefore we focus on PoW.

4.9.3 Proof-of-Work (PoW) schemes


In Bitcoin, issuing a new block requires a solution to a difficult computational problem, namely, a Proof-of-Work (PoW). While this may be the only application of PoW that we will explore in this volume, there are many more applications of PoW in Cybersecurity, and we will discuss some of them in the next volume [93], mainly with respect to Denial-of-Service attacks.
Intuitively, a PoW allows one party, the worker, to solve a challenge, with an
approximately known amount of computational resources, resulting in a proof
of this success, which can be efficiently verified by anyone.
Proof-of-Work schemes belong in this chapter on hash functions, both due to their use in permissionless blockchains, and in particular Bitcoin, and since their most well-known implementation - and the one used in Bitcoin - is based on a hash function.
Notice that we used the general term ‘computational resources’ and not a
more specific term such as computation-time; the reason is that some PoW pro-
posals focus on other resources, e.g., storage, or on a combination of resources,
e.g., time and storage. However, from this point, let us focus on the case where
provision of the PoW requires an approximately known amount of computation
time. This is the most widely used form of PoW - and in particular, the one
used in Bitcoin, which, as we mentioned (and will show), is based on the use
of a cryptographic hash function.
Like our definitions of hash functions and of the schemes in this chapter,
PoW schemes may be keyed or keyless; and, like for the other schemes, we
focus on the keyless variant.
We define a PoW scheme with respect to a specific computational model
M and ‘challenge domain’ CD . As mentioned above, our definition is limited
to keyless PoW based on computational time.
Definition 4.18 (Proof of Work (PoW)). A PoW scheme PoW consists of two efficient algorithms: PoW.Solve and PoW.Validate.

PoW.Solve: The PoW.Solve algorithm receives three inputs: a challenge c ∈ CD , a randomizer r ∈ {0, 1}n , and a work-amount w ∈ N s.t. 1 ≤ w ≤ n. The PoW.Solve function outputs an n-bit binary string, to which we refer as the solution.

PoW.Validate: The PoW.Validate algorithm has four inputs: the challenge c, the randomizer r, the work-amount w (all as described for PoW.Solve), and a purported Proof-of-Work π ∈ {0, 1}∗ . The PoW.Validate algorithm returns true or false.
A PoW scheme PoW is correct, if the runtime of PoW.Solve(c, r, w), using computational model M, is always at most w, and if for every challenge c ∈ CD and randomizer r ∈ {0, 1}n holds:

PoW.Validate(c, r, w, PoW.Solve(c, r, w)) = true     (4.27)

A PoW scheme PoW is α-secure, for α ∈ (0, 1], if for any algorithm A whose average runtime (using computational model M) on input (r, w), where r ←$ {0, 1}n , is less than α · w, holds: εP oW A,PoW (n) ∈ N EGL(n), where:

εP oW A,PoW (n) ≡ Pr r←$ {0,1}n {(∃c ∈ CD ) PoW.Validate(c, r, w, A(c, r, w)) = true}     (4.28)

Proof-of-work mechanisms are often implemented using a cryptographic hash function. A typical implementation, which is a simplification of the one in Bitcoin, is the PoW scheme PoW B , defined as:

PoW B .Validate: On input (c, r, w, π), return true if h(c ++ r ++ π) ≤ 2n−1 /w. Otherwise, return false.

PoW B .Solve: On input (c, r, w), repeatedly compute x ≡ h(c ++ r ++ π) for different π ←$ {0, 1}n , aborting and returning π when x ≤ 2n−1 /w.

We leave the properties of PoW B as an exercise to the reader.
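As an illustration, the following Python sketch instantiates PoW B with SHA-256 (an assumption of the sketch; the scheme is defined for any suitable h). With the threshold 2n−1 /w, each attempt succeeds with probability roughly 1/(2w), so solving takes on the order of w hash computations on average:

```python
import hashlib, os

N = 256  # output length n of the hash, in bits (SHA-256 here)

def h_int(data: bytes) -> int:
    """Hash to an integer in [0, 2**N - 1]."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def pow_validate(c: bytes, r: bytes, w: int, pi: bytes) -> bool:
    """PoW_B.Validate: accept iff h(c ++ r ++ pi) is below the threshold 2**(N-1)/w."""
    return h_int(c + r + pi) <= (2 ** (N - 1)) // w

def pow_solve(c: bytes, r: bytes, w: int) -> bytes:
    """PoW_B.Solve: try random pi values until one falls below the threshold."""
    while True:
        pi = os.urandom(N // 8)
        if pow_validate(c, r, w, pi):
            return pi

c, r, w = b"challenge", os.urandom(32), 200
pi = pow_solve(c, r, w)        # roughly 2*w = 400 hash attempts, on average
assert pow_validate(c, r, w, pi)
```

Note that validation is a single hash computation, while solving requires many; this asymmetry is what makes the scheme useful.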

Exercise 4.16. 1. Show whether PoW B is correct or not.


2. (harder) Present an example of a hash function h, which is collision re-
sistant, and yet for which PoW B is not α-secure, for any constant α > 0.
3. (harder) Show if PoW B is secure under the Random Oracle Model.

4.9.4 The Bitcoin Blockchain and Cryptocurrency


Bitcoin is the most well-known permissionless blockchain - as well as the most well-known cryptocurrency. Bitcoin, and other cryptocurrencies, are complex, interesting and much studied. In particular, Bitcoin is probably the most important application of PoW, which is key to the operation - and economics - of Bitcoin, and in particular, of the Bitcoin mining process, which is the main Bitcoin mechanism we discuss in this subsection.
Bitcoin uses the ‘classical’ combination of a Merkle-tree digest construction for computing a digest of each block, with the Merkle-Damgård digest-chain construction for appending new blocks in a controlled manner, as illustrated in Figure 4.25. Bitcoin is a cryptocurrency; the entries in the blockchain
represent movement of funds. Funds, in Bitcoin, are associated with a public
(signature validation) key, rather than with an account number or other identi-
fier; in that sense, Bitcoin provides some level of anonymity, or, more precisely,
pseudonymity, since this mechanism does not prevent linkage between different
transactions using the same key.

The Bitcoin control mechanism: Mining. One of the innovations of


Bitcoin, is its permissionless mechanism for adding blocks to the blockchain -
which is also the mechanism used to gradually increase the total number of
available Bitcoins.

Bitcoin mining uses the PoW B Proof-of-Work design, presented in subsection 4.9.3. As shown in Figure 4.25, if the previous (current) digest is ∆i−1 ,
then to add a block, the new digest ∆i must be signed by private key si ,
where the corresponding public key is included in a PoW solved by the entity
adding the new block. The PoW is computed over both the public key vi , and
the previous digest ∆i−1 . The inclusion of the previous digest, ensures that
the ‘competition’ for solving the PoW (and mining) may begin only once the
previous digest is known.
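The mining step described above can be sketched as follows (a toy simplification of Figure 4.25: SHA-256 stands in for h, the randomizer is folded into the challenge, and the signing of the new digest is omitted):

```python
import hashlib, os

N = 256  # hash output bits (SHA-256)

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def h_int(data: bytes) -> int:
    return int.from_bytes(h(data), "big")

def mine_block(prev_digest: bytes, pub_key: bytes, block_root: bytes, w: int):
    """Find a PoW solution pi over (pub_key ++ prev_digest), then chain the block.
    Including prev_digest ensures mining can only start once it is known."""
    c = pub_key + prev_digest
    while True:
        pi = os.urandom(32)
        if h_int(c + pi) <= (2 ** (N - 1)) // w:       # PoW_B-style threshold test
            return pi, h(prev_digest + block_root)     # (solution, new chain digest)

prev = h(b"genesis")
pi, new_digest = mine_block(prev, b"miner-pubkey", h(b"block-entries"), w=64)
assert h_int(b"miner-pubkey" + prev + pi) <= (2 ** (N - 1)) // 64
```

The names and parameters here are illustrative only; the real Bitcoin block format and difficulty encoding differ considerably.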
Of course, solving PoW requires considerable computational resources - in
fact, the wasting of these resources - including energy - is one of the criticisms
against Bitcoin (and in favor of permissioned blockchains). To motivate entities
to invest this effort and solve the PoW, Bitcoin offers an incentive: whenever a
key vi is included in the PoW, this key is ‘awarded’ a Bitcoin. This is also the
(only) process of adding new Bitcoins, which is the reason that this process is
referred to as mining, obviously with gold-mines more than, say, coal-mines in
mind. Another motivation is a transaction fee that may be charged from each
transaction in the block, and also given to the ‘block issuing public key’ vi .
The work-amount parameter w of the PoW (not shown in Figure 4.25), is
adjusted automatically by a clever feedback mechanism in Bitcoin, that ensures
that new blocks will be added at a reasonable, but not excessive, rate. The
idea is that as issuing of blocks becomes more valuable, the work-amount is automatically increased, which reduces the economic incentive for mining, and hence is expected to slow down mining.

Tracing funds in Bitcoin. One element of Bitcoin which is less advanced is the method of tracing funds. Namely, in order to identify which public key owns which amount of Bitcoins, it is necessary to trace back all the mining operations to this public key and all the transactions to and from this public key. Bitcoin does not take advantage of more optimized ‘proof’ mechanisms, such as the Proof-of-Inclusion (PoI) presented for the Merkle tree construction.
There are additional concerns and mechanisms of Bitcoin that we will re-
frain from discussing, such as the fact that multiple ‘first blocks’ may exist for
the chain at the same time - or how Bitcoin attempts to manage this concern.
Indeed, obviously, our discussion of Bitcoin, cryptocurrencies and blockchains,
is superficial. There are many publications, including books, that should be
used to study these areas further.

4.10 Cryptographic hash functions: additional exercises


Exercise 4.17 (XOR-hash). Consider messages of n blocks of l bits each, denoted m1 . . . mn . Define hash function h for such l · n bit messages, as: h(m1 . . . mn ) = ⊕ni=1 mi . Show that h does not have each of the following properties, or present a convincing argument why it does:
1. Collision-resistance.

2. Second-preimage resistance.
3. One-wayness (preimage resistance)
4. Randomness extraction.
5. Secure MAC, when h is used in the HMAC construction.

Solution to part 4 (randomness extraction): consider even n, i.e., n = 2µ for some integer µ, and random messages of the form m1 ++ . . . ++ mn where for every i = 1 . . . µ holds m2i−1 = m2i . Clearly, this set of messages has lots of randomness (in fact µ · l random bits); however, the output of h would always be zero, i.e., completely deterministic and predictable. Hence h is not randomness extracting.
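The argument can be checked with a short Python sketch, using bytes instead of bits for block values (an implementation convenience):

```python
import os
from functools import reduce

l_bytes, n = 16, 8   # block length (in bytes, standing in for l bits); n = 2*mu blocks

def xor_hash(blocks):
    """h(m1 .. mn) = XOR of all blocks."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# Random messages with each block value repeated: m_{2i-1} = m_{2i}, as in the solution.
blocks = []
for _ in range(n // 2):      # mu = n/2 independent random block values
    b = os.urandom(l_bytes)
    blocks += [b, b]         # each value appears twice, so the XOR cancels

assert xor_hash(blocks) == bytes(l_bytes)   # the digest is always all-zero
```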

Exercise 4.18. Let h be a ‘compression function’, i.e., a cryptographic hash function whose input is of length 2l and output is of length l. Let h0 : {0, 1}2l·n → {0, 1}l extend h to inputs of length 2l · n, as follows: h0 (m1 ++ . . . ++ mn ) = ⊕ni=1 h(mi ), where (∀i = 1, . . . , n) |mi | = 2l. For each of the following properties, assume h has the property, and show that h0 may not have the same property. Or, if you believe h0 does retain the property, argue why it does. The properties are:

1. Collision-resistance.
2. Second-preimage resistance.
3. One-wayness (preimage resistance)
4. Randomness extraction.

Would any of your answers change, if h and h0 have a random public key as
an additional input?

Exercise 4.19. Consider messages of 2n blocks of l bits each, denoted m1 . . . m2n , and let hc be a secure compression function, i.e., a cryptographic hash function from 2l bits to l bits. Define hash function h for such messages of 2n blocks of l bits, as: h(m1 . . . m2n ) = ⊕ni=1 hc (m2i , m2i−1 ). Show that h does not have each of the following properties, although hc has the corresponding property, or present a convincing argument why it does:
1. Collision-resistance.
2. Second-preimage resistance.
3. One-wayness (preimage resistance)
4. Randomness extraction.
5. Secure MAC, when h is used in the HMAC construction.

Exercise 4.20. It is proposed to combine two hash functions by cascade, i.e., given hash functions h1 , h2 we define h12 (m) = h1 (h2 (m)) and h21 (m) = h2 (h1 (m)). Suppose collisions are known for h1 ; what does this imply for collisions in h12 and h21 ?

Exercise 4.21. Recently, weaknesses were found in a few cryptographic hash functions such as hM D5 and hSHA1 , and as a result, there were many proposals for new functions. Dr. Simpleton suggests to combine the two into a new function, hc (m) = hSHA1 (hM D5 (m)), whose output length is 160 bits. Prof. Deville objects; she argues that hash functions should have longer outputs, and suggests a complex function, h666 , whose output size is 666 bits. A committee set up to decide between these two proposes, instead, to XOR them into a new function: fX (m) = [0506 ++ hc (m)] ⊕ h666 (m).

1. Present counterexamples showing that each of these may not be collision-


resistant.
2. Present a design where we can be sure that finding a collision is definitely
not easier than finding one in hSHA1 and in hc .
3. Repeat both parts for randomness-extraction.

Exercise 4.22. Let h be the result of a Merkle hash tree, using a compression
function comp : {0, 1}2n → {0, 1}n , and let (KG, S, V ) be a secure (FIL)
signature scheme. Let Ssh (m) = Ss (h(m)) follow the ‘hash then sign’ paradigm,
to turn (KG, S, V ) into a VIL signature scheme, i.e., allow signatures over
messages of arbitrary number of blocks. Show that Ssh is not a secure signature
scheme, by presenting an efficient adversary (program) that outputs a forged
signature.
Exercise 4.23. Consider the following slight simplification of the popular
HMAC construction: h0k (m) = h(k ++ h(k ++ m)), where h : {0, 1}∗ → {0, 1}n is
a hash function, k is a random, public n-bit key, and m ∈ {0, 1}∗ is a message.
1. Assume h is a CRHF. Is h0k also a CRHF?
 Yes. Suppose h0k is not a CRHF, i.e., there is some adversary A0 that
finds a collision (m01 , m02 ) for h0 , i.e., h0k (m01 ) = h0k (m02 ). Then at least
one of the following pairs of messages (m1,1 , m2,1 ), (m1,2 , m2,2 ) is a colli-
sion for h, i.e., either h(m1,1 ) = h(m2,1 ) or h(m1,2 ) = h(m2,2 ) (or both).
The strings are: m1,1 = , m1,2 = ,
m2,1 = , m2,2 = .
 No. Let ĥ be some CRHF, and define h(m) = .
Note that h is also a CRHF (you do not have to prove this, just to design h
so this would be true and easy to see). Yet, h0k is not a CRHF. Specifically,
the following two messages m01 = , m02 =
0 0 0
are a collision for hk , i.e., hk (m1 ) = hk (m2 ).

2. Assume h is an SPR hash function. Is h0k also SPR?
 Yes. Suppose h0k is not SPR, i.e., for some l, there is some algorithm
A0 which, given a (random, sufficiently-long) message m0 , outputs a col-
lision, i.e., m01 6= m0 s.t. h0k (m0 ) = h0k (m01 ). Then we define algorithm A
which, given a (random, sufficiently long) message m, outputs a collision,
i.e., m1 6= m s.t. hk (m) = hk (m1 ). The algorithm A is:
Algorithm A(m):
{
Let m0 =
Let m01 = A0 (m0 )
Output
}
 No. Let ĥ be some SPR, and define h(m) = . Note
that h is also an SPR (you do not have to prove this, just to design h so
this would be true and easy to see). Yet, h0k is not an SPR. Specifically,
given a random message m0 , then m01 = is a collision,
i.e., m0 6= m01 yet h0k (m01 ) = h0k (m0 ).
3. Assume h is a OWF. Is h0k also a OWF?
 Yes. Suppose h0k is not OWF, i.e., for some l, there is some algorithm
A0 which, given h0k (m0 ) for a (random, sufficiently-long) message m0 ,
outputs a preimage, i.e., m01 6= m0 s.t. h0k (m0 ) = h0k (m01 ). Then we define
algorithm A which, given h(m) for a (random, sufficiently long) message
m, outputs a preimage, i.e., m1 s.t. hk (m) = hk (m1 ). The algorithm A
is:
Algorithm A(m):
{
Let m0 =
Let m01 = A0 (m0 )
Output
}
 No. Let ĥ be some OWF, and define h(m) = . Note
that h is also a OWF (you do not have to prove this, just to design h so
this would be true and easy to see). Yet, h0k is not a OWF. Specifically,
given h0k (m0 ) for a random message m0 , a preimage m01 =
can be found, i.e., h0k (m01 ) = h0k (m0 ).
4. Repeat similarly for bitwise randomness extraction.
Exercise 4.24. Consider the following construction: h0k (m) = h(k ++ m), where h : {0, 1}∗ → {0, 1}n is a hash function, k is a secret n-bit key, and m ∈ {0, 1}∗ is a message. Assume you are given some SPR hash function ĥ : {0, 1}∗ → {0, 1}n̂ ; you can use n̂ which is smaller than n. Using ĥ, construct hash function h, so that (1) it is ‘obvious’ that h is also SPR (no need to prove), yet (2) h0k (m) = h(k ++ m) is (trivially) not a secure MAC. Hint: design h s.t. it becomes trivial to find k from h0k (m) (for any m).

1. h(x) = .
2. (Justification) h is an SPR since .
3. (Justification) h0k (m) = h(k ++ m) is not a secure MAC since .

Exercise 4.25 (HMAC is secure under ROM). Show that the HMAC con-
struction is secure under the Random Oracle Model (ROM), when used as a
PRF, MAC and KDF.

Exercise 4.26 (HMAC is insecure using CRHF). Show counterexamples showing that even if the underlying hash function h is collision-resistant, its (simplified) HMAC construction hmack (m) = h(k ++ h(k ++ m)) is insecure when used as any of PRF, MAC and KDF.
Exercise 4.27 (Hash-tree with efficient proof of non-inclusion). The Merkle
hash-tree allows efficient proof of inclusion of a leaf (data item) in the tree.
Present a variant of this tree which allows efficient proof of either inclusion
or of non-inclusion of an item with given ‘key’ value (where each item consists of a key and data; there may be multiple items with the same key). Assume all data items (keys and values) are given together - no need to build the tree dynamically. Your solution may ‘expose’ another (key, value) pair beyond the one queried, or two other values, for a ‘proof of non-inclusion’. Note: try to provide a solution which is efficient in the number of hash operations required for verification (the number should be about one more than in the regular Merkle tree).

Chapter 5

Shared-Key Protocols

5.1 Introduction
In the previous two chapters, we discussed cryptographic schemes, which con-
sist of one or more functions, with security criteria. For example, MAC, PRF
and PRG schemes consist of a single function, with criteria such as security
against forgery (Def. 3.1). Similarly, encryption schemes consist of multiple
functions (encryption, decryption and possibly key generation), with criteria
such as CPA-indistinguishability (CPA-IND, Def. ??).
Cryptographic schemes are very useful, but rarely as a single, stand-alone
function. Usually, cryptographic schemes are used as a part of different proto-
cols, which involve multiple exchanges of messages among a set P of parties,
often only two, e.g., P = {Alice, Bob}. There are many types of protocols,
including many related to cryptography, and, more generally, security.

What is a protocol? A protocol is a set of algorithms, one algorithm πp for
each party p ∈ P. Each party p ∈ P is also associated with a state sp, which
may change over time, in discrete events; we denote the state of p ∈ P at time
t by sp(t).
An event is triggered by some input signal x, and results in some output
signal y. We use sp(t−) for the state of p just before an event at time t, and
sp(t+) for the state of p after the event. The output signal and the new state
are the result of applying πp, the algorithm of the protocol at party p, to the
input signal and previous state, i.e.:

(y, sp(t+)) ← πp(x, sp(t−)) (5.1)

An execution of the protocol is a sequence of events, each associated with a
specific participant p ∈ P and a specific time, where the time of events is strictly
increasing - we assume events never occur at exactly the same time.
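The event-processing rule of equation (5.1) can be illustrated with a minimal sketch; the closure-based interface, signal encoding and the toy ‘echo’ algorithm below are illustrative assumptions, not the book’s notation:

```python
# A protocol party as an event-driven state machine: each event applies the
# party's algorithm pi_p to (input signal, previous state), yielding
# (output signal, new state), exactly as in equation (5.1).

def make_party(pi_p, initial_state):
    state = initial_state
    def on_event(signal):
        nonlocal state
        output, state = pi_p(signal, state)  # (y, s_p(t+)) <- pi_p(x, s_p(t-))
        return output
    return on_event

# Toy algorithm pi_p: on an IN(mu) signal, count the message and echo it
# back as an OUT(mu) signal; ignore other signal types.
def pi_echo(signal, state):
    kind, value = signal
    if kind == 'IN':
        return ('OUT', value), {'count': state['count'] + 1}
    return None, state

alice = make_party(pi_echo, {'count': 0})
```

Invoking `alice(('IN', 'hi'))` then returns the output signal `('OUT', 'hi')` while Alice’s state is updated internally, as the event rule requires.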

Note 5.1: Alternatives to global, synchronous time

We focus on applied, simpler models; in particular, we use a simple, synchronous
model for time and protocols. Specifically, to use the notation sp(t) for the state at p
and time t, we assume that events, in all parties, are ordered along some time-line.
Furthermore, in some protocols, we also assume synchronous clocks, where each
party has access to the current value of the time t. There are many academic works
which study more complex and less intuitive models. In particular, in reality, clocks
are not precisely synchronized; indeed, ensuring good clock synchronization is not
trivial, and doing it securely, in spite of different attacks, is even more challenging.
Even the assumption that all events can be mapped along the same ‘universal time-
line’ is arguably incompatible with relativistic physics; there is extensive research
on ‘local time’ models, where events in different locations are only partially ordered.

5.1.1 Input and output signals

It is convenient to define a protocol operation for each type of input, e.g.,
an initialization INIT signal. Most input signals include an associated input
value; for example, INIT(κ) may indicate initialization with shared secret κ.
Another typical input signal is IN(µ), which signals receipt of message µ from
the network.
Similarly, it is often convenient to consider the output of the protocol as one
or more (type, value) pairs of output signals. For example, OUT(µ, d) signals
that the protocol outputs message µ, to be sent by the underlying network to
destination d ∈ P.
Many protocols assume synchronized clocks. This is facilitated by a WAKEUP
input signal, invoked periodically, or upon termination of a ‘sleep’ period
requested by the protocol in a previous SLEEP(t) output signal.
Specific protocols have different additional input and output signals. These
are mostly used for the input and output of the services provided by π. For
example, handshake protocols secure the setup and termination of sessions,
using OPEN and CLOSE input signals to open/close a session, and providing
UP and DOWN output signals to indicate the state of a session; see Table 5.1.
In contrast, session protocols secure the transmission of messages, and hence
have a SEND(m, σ) input signal, invoked to request the protocol to send a
message m over session σ, and a DELIVER(m, σ) output signal, to deliver a
message m received over session σ.
Conventions: We use italics, e.g., m, for protocol-specific inputs and outputs,
such as messages sent and received using the protocol, and Greek letters,
e.g., µ, for ‘lower-layer’ messages, that are output by the protocol (in OUT(µ, d)
events) or that the protocol receives as input from the network (in IN(µ)
events). We write signal names in capital letters.

5.1.2 Focus: two-party shared-key handshake and session
protocols
In this chapter we discuss shared-key protocols, i.e., protocols between two
parties, P = {Alice, Bob}. Both parties are initialized with a shared symmetric
key κ. In the following chapters, we discuss protocols using public keys, in
addition to protocols using shared keys.
A session begins by executing a handshake protocol, which involves authentication
of one or both parties. The handshake protocol often also allows the
parties to agree on a shared session key. Handshake protocols are the main
topic of this chapter. A successful handshake is usually followed by the exchange
of one or more messages between the parties, using a session protocol. We discuss
session protocols in subsection 5.1.4.
Handshake and session protocols are usually quite simple and easy to understand
- but this simplicity can be deceptive, and many attacks have been
found on proposed, and even widely deployed, protocols. Most attacks are surprisingly
simple in hindsight. Furthermore, most attacks - especially on handshake
protocols - do not involve exploiting vulnerabilities of the underlying cryptographic
schemes, such as encryption and MAC. Instead, these attacks exploit
the fact that designers did not use the schemes correctly, e.g., used schemes
for goals they were not designed for - e.g., assuming that encryption provides
authentication. There have been many attacks on such insecure, vulnerable
handshake protocols, in spite of their use of secure cryptographic primitives.
Therefore, it is critical to clearly define the adversary model and conservative
security goals, and to have a proof of security under reasonable assumptions - or,
at least, a careful analysis.

5.1.3 Adversary Model

Most works on the design and analysis of cryptographic protocols, and in
particular of handshake and session protocols, adopt a Monster in the Middle
(MitM) attack model. A MitM¹ attacker has eavesdropper capabilities, i.e., it
is exposed to the contents of every message µ sent over the network by any
party in an OUT(µ, d) signal to any intended recipient d ∈ P.
In addition to the eavesdropper capabilities, a MitM attacker also actively
controls the communication. Namely, the adversary has complete control over
the occurrence and contents of IN(µ) signals. This allows the attacker to drop,
duplicate, reorder or modify messages, including causing receipt of messages µ
which were never sent.
In addition, we usually assume that the attacker has complete or significant
control over other input events, as well as receiving all or much of the output
events. In fact, to define integrity properties, such as message or handshake
authentication, we usually allow the adversary complete control over all the
¹ Many works refer to the MitM model as Man in the Middle, but we prefer the term
Monster in the Middle. This allows us to use the pronoun ‘it’ to refer to the attacker, in
contrast to Alice and Bob.

input to the protocol (e.g., OPEN signals), and provide the adversary with
all the outputs of the protocol. For example, see Definition 5.2, which defines
mutual authentication for handshake protocols.
In contrast, to define confidentiality properties, we slightly restrict the adversary’s
access to output events, and control over input events. For example,
we define IND-CPA security for session protocols by allowing the attacker to
choose arbitrary ‘plaintext’ messages m, as well as to select a pair of special
‘challenge’ messages m0, m1. We then select a random bit b ∈ {0, 1}. The
attacker then observes the encoding of the chosen messages m as well as of
the ‘challenge’ message mb. Finally, the attacker guesses the value of b. This
essentially extends the IND-CPA security definition for encryption schemes
(Definition ??).
Finally, in protocols involving multiple parties, we may also let the adversary
control some of the parties running the protocols; this is less relevant to
two-party protocols, which are our focus in this chapter. We will, however,
discuss scenarios where the attacker may have access to some of the storage,
including keys, used by some of the parties.
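The MitM capabilities can be summarized in a small sketch; the class and method names below are illustrative assumptions, not part of any standard API:

```python
# A MitM-controlled network: the adversary observes every OUT(mu, d) signal
# (eavesdropper capability) and fully decides which IN(mu) signals occur,
# so it can drop, duplicate, reorder, modify - or inject never-sent messages.
class MitMChannel:
    def __init__(self):
        self.observed = []            # transcript of all OUT signals

    def out(self, mu, d):             # a party emits OUT(mu, d)
        self.observed.append((mu, d))
        # No automatic delivery: the adversary decides what, if anything,
        # the intended recipient d actually receives.

    def deliver(self, receive, mu):   # adversary triggers IN(mu) at a party
        receive(mu)

channel = MitMChannel()
bob_inbox = []
channel.out(b'hello Bob', 'Bob')              # Alice sends; the MitM records it
channel.deliver(bob_inbox.append, b'forged')  # Bob receives a never-sent message
```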

5.1.4 Secure Session Protocols

A secure session transmission protocol defines the processing of messages as
they are sent and received between two parties, over an insecure connection;
such protocols are often referred to simply as session protocols or as record protocols
(since they specify handling of a single message, also referred to as a record).
Session transmission protocols use a secret key shared between the parties; in
principle, one could also use public-private key pairs; however, public-key cryptography
is much more computationally expensive, hence the use of public keys
is normally limited to the session-setup (handshake) protocols, used to initiate
the session - and sometimes also run periodically, to refresh the keys and begin a
new instance of the session transmission protocol.
Session transmission protocols are a basic component in any practical system
for secure communication over insecure channels, such as the Transport-
Layer Security (TLS) and Secure Socket Layer (SSL) protocols, used to protect
many Internet applications including web communication, and the IPsec (IP
security) protocol, used to protect arbitrary communication over the Internet
Protocol (IP).
A session consists of three phases: it is opened, then active and finally
closed, with an indication of normal or abnormal (error) termination. Messages
are sent and received only when a session is active. We use these concepts to
(informally) define the main security goals of session protocols.

Definition 5.1 (Goals of session protocols).
Confidentiality. The confidentiality goal is normally similar to the definition
for encryption schemes. Namely, the adversary chooses two challenge
messages m0, m1, and should not be able to identify which of the two was
randomly chosen and sent, with probability significantly better than half -
similarly to the CPA-indistinguishability (CPA-IND, Def. ??) test. The main
difference is that we allow a stateful protocol instead of a stateless scheme.
Message authentication: universal forgery capabilities. To test a protocol
for ensuring message authentication, we allow the adversary
to ask parties to send arbitrary messages. The MitM attacker should not
be able to cause receipt of a message that it never asked a party to send
- similarly to the security against forgery (Def. 3.1) test.
Session mapping. There is a one-to-one mapping from every session where
one party sent or received a message, to a session at the corresponding
other party, s.t. the messages received in each of the two sessions
are identical, in content and order, to the corresponding messages sent
in the corresponding session.
Detecting truncation. If a session at Alice is mapped to a session at Bob,
but Bob sent some messages that Alice did not receive, then the session
at Alice did not yet terminate - or terminated with a failure indication.
The obvious and common way to ensure secure order is by authenticating
a sequence number Seq, or some other mechanism to prevent re-ordering,
typically as part of the authentication of the message. Often, the underlying
communication channel preserves the order of messages, and hence the sender and
recipient can maintain Seq and it is not necessary to include Seq explicitly
in the messages; for example, this is done in SSL/TLS. In other cases, the
sequence number is sent in the clear, i.e., authenticated but not encrypted;
this allows ordering and identification of losses - even by intermediate agents
who cannot decrypt. Other fields that are often sent authenticated but ‘in the
clear’, i.e., not encrypted, include the sender and recipient addresses; usually,
these fields are grouped together, and referred to as the header.
Note that ensuring secure order, by authentication of messages and sequence
numbers, does not suffice to prevent truncation, i.e., loss of the very last messages
from one party to the other. To prevent truncation, the session
protocol usually terminates with a special ‘termination exchange’, allowing a
party to detect when an attacker dropped some message from its peer.
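Authenticating an implicit sequence number, as in SSL/TLS where Seq is maintained by both ends and never sent on the wire, can be sketched as follows; the record layout and the use of HMAC-SHA256 as the MAC are illustrative assumptions:

```python
import hmac, hashlib

TAG_LEN = 32  # HMAC-SHA256 tag length, in bytes

def protect(k, seq, record):
    # The MAC covers the implicit sequence number plus the record itself;
    # only record+tag are transmitted - Seq is maintained by both ends.
    tag = hmac.new(k, seq.to_bytes(8, 'big') + record, hashlib.sha256).digest()
    return record + tag

def accept(k, seq, wire):
    # Recompute the MAC with the receiver's own Seq; a replayed or
    # reordered record fails verification, since its Seq does not match.
    record, tag = wire[:-TAG_LEN], wire[-TAG_LEN:]
    expected = hmac.new(k, seq.to_bytes(8, 'big') + record, hashlib.sha256).digest()
    return record if hmac.compare_digest(tag, expected) else None

k = b'shared-session-key'
wire = protect(k, 0, b'first message')
assert accept(k, 0, wire) == b'first message'   # in order: accepted
assert accept(k, 1, wire) is None               # replayed out of order: rejected
```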
In addition to the security goals, session-transmission protocols have the
following reliability and efficiency goals:
Error detection and/or correction, typically using an Error Detection Code
(EDC) such as a checksum, or an Error Correction Code (ECC), such as
Reed–Solomon codes. These mechanisms are designed against random
errors, and are not secure against intentional, ‘malicious’ modifications.
Note that secure message authentication, such as using a MAC, also ensures
error detection; however, as we explain below, it is often desirable to also
use the (insecure) error detection codes.
Compression, to improve efficiency by reducing message length, usually provided
by applying a compression code. As we explain below, this requirement
may conflict with the confidentiality requirement.

Figure 5.1: Secure Session Sending Process: Security, Reliability and Compression.
[The figure shows the sending pipeline: the Message is compressed into the
Plaintext; the Plaintext is encrypted into the Ciphertext; the MAC, covering the
Header, Ciphertext and Seq, yields the tag; finally, an ECC is computed over the
Header, Ciphertext and tag.]

Fig. 5.1 presents the recommended process for combining these functions,
justified as follows:
• Compression is only effective when applied to data with significant redundancy;
plaintext is often redundant, in which case applying compression
to it could be effective. In contrast, ciphertext would normally not have
redundancy. Hence, if compression is used, it must be applied before encryption.
Note, however, that this may conflict with the confidentiality
requirement, as we explain below; for better confidentiality, avoid compression.
• Encryption is applied next, before authentication (MAC), following the
‘Encrypt-then-Authenticate’ construction. Alternatively, we may use an
authenticated-encryption with associated data (AEAD) scheme, to combine
the encryption and authentication functions. Notice that by applying
authentication after encryption, or using an AEAD scheme, we
facilitate also the authentication of a sequence number or similar field used
to prevent replay/reorder/omission, which is often known to the recipient
(and not sent explicitly); we can also authenticate ‘header’ fields such as
the destination address, which are not encrypted since they are used to process
(e.g., route) the encrypted message; see Fig. 5.1. The Encrypt-then-
Authenticate mode also allows prevention of chosen-ciphertext attacks
and more efficient handling of corrupted messages.
• Finally, we apply an error correction/detection code. This allows efficient
handling of messages corrupted due to noise or other benign reasons.
An important side-benefit is that an authentication failure of a message in
which no errors were detected implies an intentional forgery attack - an
attacker made sure that the error-detecting code would be correct.
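The sending and receiving process of Fig. 5.1 can be sketched end-to-end as follows; the toy XOR ‘cipher’, CRC32 as the error-detecting code (in place of an ECC), and the field layout are all illustrative assumptions, not a real protocol’s wire format:

```python
import hmac, hashlib, zlib

def keystream(k, nonce, n):
    # Toy CTR-style keystream derived via HMAC-SHA256; a stand-in for a
    # real cipher, for illustration only.
    out, ctr = b'', 0
    while len(out) < n:
        out += hmac.new(k, nonce + ctr.to_bytes(4, 'big'), hashlib.sha256).digest()
        ctr += 1
    return out[:n]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def send_record(ke, ka, seq, header, message):
    nonce = seq.to_bytes(8, 'big')
    pt = zlib.compress(message)                   # Compress (see caveat below)
    ct = xor(pt, keystream(ke, nonce, len(pt)))   # then Encrypt
    tag = hmac.new(ka, nonce + header + ct, hashlib.sha256).digest()  # then MAC
    rec = header + ct + tag                       # MAC covers header, Seq, ciphertext
    return rec + zlib.crc32(rec).to_bytes(4, 'big')   # EDC applied last

def recv_record(ke, ka, seq, hdr_len, wire):
    rec, crc = wire[:-4], int.from_bytes(wire[-4:], 'big')
    if zlib.crc32(rec) != crc:
        return None                               # benign corruption, detected by EDC
    header, ct, tag = rec[:hdr_len], rec[hdr_len:-32], rec[-32:]
    nonce = seq.to_bytes(8, 'big')
    good = hmac.new(ka, nonce + header + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, good):
        return None                               # EDC passed but MAC failed: forgery
    return zlib.decompress(xor(ct, keystream(ke, nonce, len(ct))))

ke, ka = b'enc-key', b'mac-key'
wire = send_record(ke, ka, 0, b'HD', b'hello, session world')
assert recv_record(ke, ka, 0, 2, wire) == b'hello, session world'
```

Note how the layering reflects the justification above: a CRC failure is dropped as noise, while a record whose CRC verifies but whose MAC fails is evidence of an intentional forgery.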

Compress-then-Encrypt Vulnerability Note that there is a subtle vulnerability
in applying compression before encryption, since encryption does not
hide the length of the plaintext, while the length of compressed messages depends
on their contents. In particular, a message containing randomly-generated
strings typically does not compress well (the length after compression is roughly as
long as before compression), while messages containing lots of redundancy, e.g.,
strings composed of only one character, compress well (the length after compression
is much shorter). This allows an attacker to distinguish between the encryptions
of two compressed messages, based on the redundancy of the plaintexts;
see the next exercise, as well as Exercise 5.13. This vulnerability was exploited in
several attacks on the TLS protocol, including CRIME, TIME and BREACH.
Exercise 5.1. Let (Enc, Dec) be an IND-CPA secure encryption scheme, and
let Enc′k(m) = Enck(Compress(m)), where Compress is a compression function.
Show that Enc′ is not IND-CPA secure.
Sketch of Solution: See chapter 10.
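A minimal demonstration of the length leak behind such attacks; here zlib stands in for the protocol’s compression function (an assumption), and no encryption is even needed since encryption preserves length:

```python
import os, zlib

# Two equal-length 'challenge' plaintexts: one highly redundant, one random.
m0 = b'A' * 1024            # compresses very well
m1 = os.urandom(1024)       # essentially incompressible

c0_len = len(zlib.compress(m0))
c1_len = len(zlib.compress(m1))

# Encryption hides contents but not length; hence under Enc'_k = Enc_k(Compress(.))
# the two ciphertext lengths differ dramatically, and the attacker can tell
# which challenge was encrypted - breaking IND-CPA.
assert c0_len < c1_len
```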

Organization of the rest of this chapter In § 5.2 we discuss shared-key
authentication-handshake protocols, which provide mutual entity authentication
of the two communicating parties - or only of one of them. In § 5.3 we discuss
session-authenticating handshake protocols, which provide message/session authentication
in addition to mutual authentication, namely, they authenticate
messages exchanged as part of the handshake, in addition to authenticating
the parties.
In § 5.4, we discuss key-setup handshake protocols, which provide, in addition
to authentication, also a shared session key k. The session key k is later
used by a session protocol to protect the communication between the parties.
In § 5.5 we discuss key distribution protocols, where keys are set up with
the help of a third party. We mostly focus on the GSM security protocols,

Table 5.1: Signals of shared-key entity-authenticating handshake protocols, for
session σ. The first three signals are common to most protocols; the rest are
specific to handshake protocols.

Signal        Type    Party      Meaning
INIT(κ)       Input   Both       Initialize with shared key κ
IN(µ)         Input   Both       Message µ received via the network
OUT(µ, d)     Output  Both       Send message µ to d ∈ P via network
OPEN(pR)      Input   Initiator  Open session to pR ∈ P
BIND(p, σ)    Output  Both       Begin handshake to p ∈ P; identifier σ
UPρ(σ)        Output  ρ          Session σ is up at ρ ∈ {I, R} (Initiator, Responder)
CLOSE(σ)      Input   Both       Close session σ
DOWNρ(σ, ξ)   Output  ρ          Session σ terminated; outcome ξ ∈ {OK, Fail}

which provide client-authentication and keying for confidentiality for the GSM
mobile network. We discuss two serious vulnerabilities of the GSM handshake:
replay attacks, facilitated by client-only authentication, and downgrade attacks,
facilitated by insecure ciphersuite negotiation.
In § 5.6, we discuss variants designed to provide (limited) resiliency to
exposures of secret information (keys).

5.2 Shared-key Entity-Authenticating Handshake
Protocols
Shared-key entity-authenticating handshake protocols authenticate one party
to another, or mutually authenticate both parties; in both cases, authentication
is done using a key κ shared between the parties (provided in the INIT(κ) signal).
The handshake is usually initiated at one party by a request from the user
or application; we refer to this party as the initiator (I). At the other party, the
responder (R), the handshake usually begins upon receiving the first message
µ, via the network, from the initiator.

5.2.1 Shared-key entity-authenticating handshake protocols:
signals and requirements
The application at the initiator initiates a new handshake session by invoking
the OPEN(pR) input signal of the handshake protocol. The protocol at the
initiator responds to the OPEN(pR) request with a BIND(pR, σ) output signal,
where σ is a session identifier used to refer to this session.
The initiator also begins the exchange of protocol messages with the responder,
with the initiator pI sending a message µI using the OUT(µI, pR) output
event, and the responder pR sending back a message µR using OUT(µR, pI).
If the adversary allows the exchange, this results in IN(µI) and IN(µR) input
events, at the responder pR and initiator pI, respectively.

If the messages are exchanged correctly, the responder pR will also output a
BIND(pI, σ) signal, indicating the successful initiation of a handshake session
by initiator pI, where σ is the same session identifier as in the corresponding
BIND(pR, σ) output signal previously output by the initiator pI to refer to
this session.
The session identifier σ is used in following input and output signals to refer
to this particular session, similarly to the socket/file handles used in the socket
API. For simplicity, assume that session identifiers are unique, i.e., the protocol
never opens two sessions with the same identifier σ, not even consecutively.
Following the BIND signal, the protocol at both parties continues to exchange
messages, using OUT signals, to perform the authentication. When the
entity authentication is successful at a party ρ ∈ {I, R}, the protocol invokes
the UPρ(σ) output signal, indicating that the session is active. Messages are
sent, usually using a session protocol, only while the session is active.
The protocol outputs the signal DOWNρ(σ, ξ) to indicate termination of session
σ at party ρ ∈ {I, R}. The value ξ ∈ {OK, Fail} signals successful (OK) or
unsuccessful (Fail) termination. Successful termination is always the result of
a previous CLOSE(σ) input signal, requesting the protocol to terminate session
σ; unsuccessful (Fail) termination is due to failure, and does not require a
previous CLOSE input signal.

Security of shared-key authentication-handshake protocols. We next
define security of shared-key authentication-handshake protocols. A protocol
ensures responder authentication if every session which is active (UP) at the
initiator, is at least OPENed at the responder. Initiator authentication requires
the ‘dual’ property for the responder. The following definition attempts
to clarify these notions, hopefully without excessive formalization.
Definition 5.2 (Mutually entity-authenticating shared-key handshake protocol).
An execution of a handshake protocol is said to be responder-authenticating
if whenever any initiator pI ∈ P signals UPI(σ), there was a previous BIND(pI, σ)
output signal at some responder pR ∈ P. Similarly, an execution is said to be
initiator-authenticating if whenever some responder pR ∈ P signals UPR(σ),
there was a previous BIND(pR, σ) signal at some initiator pI ∈ P.
A two-party shared-key handshake protocol ensures responder (initiator)
authentication, if for every PPT adversary ADV, the probability of an execution
not being responder (respectively, initiator) authenticating is negligible (as a
function of the length of the shared key κ).
A handshake protocol is mutually authenticating if it ensures both responder
and initiator authentication.

The rest of this section deals with secure and vulnerable designs of mutual-
authentication handshake protocols. Achieving a secure design is not very
difficult; however, it has to be done carefully, since it is also easy to come up
with a vulnerable design.

Note 5.2: More rigorous treatment and sessions without explicit identifiers

There is a wealth of literature and research on the topics of authentication, key
setup and key distribution protocols, which is far beyond our scope; some of the
basic works include [14, 20, 23, 27, 28]. In particular, many formal works on secure
handshake protocols avoid the assumption of explicit session identifiers. This is
obviously more general, and also somewhat simplifies the definitions of the signals.
However, practical protocols do use explicit session identifiers, which are ‘practically
unique’; we find that requiring the protocol to produce such unique identifiers
simplifies the definitions and discussion of security.

Consecutive vs. concurrent authentication Support for multiple concurrent
handshakes is a critical aspect of handshake protocols. Some protocols
only allow consecutive handshakes between each pair of initiator and responder.
We refer to such protocols as consecutive handshake protocols, and to protocols
that allow concurrent handshakes as concurrent handshake protocols.
One can transform a concurrent handshake protocol into a consecutive
handshake protocol, by simply blocking OPEN requests until the corresponding
DOWN response. We say that a (potentially concurrent) handshake protocol
ensures consecutive mutual authentication if the consecutive version of the
protocol ensures mutual authentication. Consecutive initiator and responder
authentication are defined similarly.
It is easier to ensure consecutive authentication - there are significantly
fewer ways to attack it. However, realistic protocols are usually designed and
used without such a restriction, i.e., allowing concurrent handshakes. This makes
sense; applications sometimes use concurrent sessions between the same
two parties, and furthermore, concurrent handshakes are necessary to prevent
‘lock-out’ due to synchronization errors (e.g., lost state at the initiator), or an
intentional ‘lock-out’ by a malicious attacker, as part of a denial-of-service attack.
In any case, the consecutive-handshakes restriction is not really essential;
therefore, the conservative design principle (Principle 7) indicates we should
always assume concurrent handshakes are possible.
When designers neglect to consider the threats due to concurrent sessions,
yet the protocol allows concurrent sessions, the result is often a vulnerable
protocol. This is a typical example of the results of failure to articulate the
requirements from the protocol and the adversary model. We next present
an example of such a vulnerability: the SNA handshake protocol. We believe
that the designers of the SNA handshake protocol considered only sequential
handshakes, although SNA implementations allow concurrent sessions.

5.2.2 The (insecure) SNA entity-authenticating handshake
protocol
As a simple, yet realistic, example of a two-party, shared-key mutual authentication
handshake protocol, consider the SNA handshake protocol. IBM’s SNA
(Systems Network Architecture) was the primary networking technology from
1974 till the late 1980s, and is still in use by some ‘legacy’ applications. Security
in SNA is based on a ‘master key’ k shared between the two communicating
parties.

Figure 5.2: SNA two-party mutual authentication handshake protocol
We describe the insecure version of the SNA entity-authenticating handshake
protocol, and later its replacement - the 2PP entity-authenticating handshake
protocol. Both protocols use a shared secret key k, to (only) authenticate the
two parties to each other, without deriving a session key. We first explain the
protocol, illustrated in Fig. 5.2, and then discuss its security.
The SNA handshake protocol operates in three simple flows, as illustrated
in Fig. 5.2. The protocol uses a block cipher E. The initiator, say Alice, sends
to her peer, say Bob, her identifier, which we denote A, and NA, a random l-bit
binary string which serves as a challenge (nonce). Here, l is the size of the
inputs and outputs of the block cipher E used by the protocol.
The responder, say Bob, replies with a ‘proof of participation’ Ek(NA),
using the pre-shared key k. Bob also sends his own random l-bit challenge
(nonce) NB. In this description we omit the mechanism by which Alice
matches the response to the request; this may be by Alice including a session
identifier σ, which Bob includes in the response, or by Bob including the nonce
NA, which he received from Alice, in his response.
Upon receiving Bob’s response, Alice validates that the response contains
the correct function Ek(NA) of the nonce that she previously selected and sent. If
so, Alice concludes that she is indeed communicating with Bob. Alice then completes
the handshake by sending her own ‘proof of participation’ Ek(NB). Again we
omit the mechanism matching this message to the rest of the session, such
as a session identifier σ.

Finally, Bob similarly validates that he received the expected function Ek(NB)
of his randomly selected nonce NB, and concludes that this handshake was initiated
by Alice.
Both parties, Alice and Bob, signal successful completion of the handshake,
using UP(σ), (only) upon receiving the expected response (Ek(NA) for Alice
and Ek(NB) for Bob).
Readers are encouraged to transform the description above and in Fig. 5.2
to pseudocode.
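One possible rendering of the three flows follows; here HMAC-SHA256, truncated to l = 64 bits, stands in for the block cipher Ek (an assumption for illustration - the original SNA used DES):

```python
import os, hmac, hashlib

L = 8  # l-bit nonces/blocks, in bytes (64 bits)

def E(k, x):
    # Stand-in for the block cipher E_k; any PRP/PRF suffices to
    # illustrate the message flows.
    return hmac.new(k, x, hashlib.sha256).digest()[:L]

def sna_handshake(k):
    # Flow 1: Alice -> Bob : A, N_A
    N_A = os.urandom(L)
    # Flow 2: Bob -> Alice : E_k(N_A), N_B   (Bob's proof of participation)
    proof_bob, N_B = E(k, N_A), os.urandom(L)
    # Alice validates Bob's proof; on success she signals UP
    if proof_bob != E(k, N_A):
        return False
    # Flow 3: Alice -> Bob : E_k(N_B)        (Alice's proof of participation)
    proof_alice = E(k, N_B)
    # Bob validates Alice's proof; on success he signals UP
    return proof_alice == E(k, N_B)

assert sna_handshake(b'shared master key') is True
```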

SNA handshake ensures (only) consecutive mutual authentication.
The simple SNA handshake of Fig. 5.2 is secure if restricted to consecutive
handshakes - but is vulnerable when concurrent handshakes are allowed, as shown in the
following exercises.
Let us explain why the protocol ensures consecutive mutual authentication,
i.e., mutual authentication when restricted to a single concurrent session.
Suppose, first, that Alice completes the protocol successfully. Namely, Alice
received the expected second flow, Ek(NA). Assume that this happened
without Bob previously receiving NA as the first flow from Alice (and sending
Ek(NA) back). Due to the consecutive restriction, Alice surely did not compute
Ek(NA). Hence, the adversary must have computed Ek(NA), contradicting
the PRP assumption for E. Note that an eavesdropping attacker may
collect such pairs ((NA, Ek(NA)) or (NB, Ek(NB))); however, since NA, NB are
quite long strings (e.g., 64 bits), the probability of such re-use of the same NA, NB
is negligible.

Exercise 5.2 (SNA handshake fails to ensure concurrent mutual authentication).
Show that the SNA handshake protocol does not ensure concurrent
mutual authentication.

Sketch of solution: see Fig. 5.3.

Exercise 5.3. Does the SNA handshake protocol ensure concurrent responder
authentication? Explain, using a sequence diagram.
Hint for solution: recall that the attacker is allowed to instruct an initiator
to OPEN sessions. In the attack, it suffices to show how Eve causes Alice
to think she had two sessions with Bob, while Bob was involved in only one
session. Finding the right sequence to cause this is the main challenge in this
question.
Solution: See chapter 10 for a possible sequence diagram.
Note that the SNA handshake fails even to ensure consecutive authentication,
if E is a (shared-key) encryption scheme rather than a block cipher.
Exercise 5.4. Assume that Alice and Bob each participate in no more than
one session at a given time. Show that the SNA authentication protocol (Fig. 5.2)
may not ensure (even) consecutive mutual authentication, if the function E is
implemented using an (IND-CPA secure) symmetric encryption scheme E.
Figure 5.3: Example of SNA handshake failing to ensure concurrent initiator
authentication. Eve pretends to be Alice, who is not present at all during the
communication. To do this, Eve uses two connections.

Hint: Consider the different modes-of-operation constructions of an encryption
scheme from a block cipher, and the encryption of a single-block message (such
as NA or NB). For one or more of these, the attacker could impersonate
Alice by sending NA, receiving NB and Ek(NA), and then computing Ek(NB)
using only this information - without knowing the key k!
Solution: See chapter 10.
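To illustrate the hint, consider a stream-type mode (a CTR/OFB-style one-block encryption, here modeled with an HMAC-derived keystream - an assumption for illustration): such encryption is malleable, so given the encryption of NA, the attacker converts it into a valid encryption of NB without the key.

```python
import os, hmac, hashlib

L = 8  # one l-bit block, in bytes

def block(k, iv):
    # One-block keystream 'E_k(iv)', modeled with HMAC-SHA256 (assumption).
    return hmac.new(k, iv, hashlib.sha256).digest()[:L]

def enc(k, m):                       # CTR/OFB-style: c = E_k(iv) XOR m
    iv = os.urandom(L)
    return iv, bytes(a ^ b for a, b in zip(block(k, iv), m))

def dec(k, iv, c):
    return bytes(a ^ b for a, b in zip(block(k, iv), c))

k = os.urandom(16)                   # known only to Alice and Bob
N_A, N_B = os.urandom(L), os.urandom(L)

iv, c = enc(k, N_A)                  # Eve observes the 'proof' for N_A
# Malleability: without k, Eve turns it into a valid encryption of N_B,
# since c XOR N_A XOR N_B = keystream XOR N_B:
forged_c = bytes(x ^ a ^ b for x, a, b in zip(c, N_A, N_B))
assert dec(k, iv, forged_c) == N_B   # Bob would accept this as the proof for N_B
```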
Before we present secure protocols, it is useful to identify the weaknesses of the
SNA protocol, exploited in the attacks on it, and derive some design principles:

• Encryption and block ciphers (PRPs) do not ensure authenticity. To authenticate
messages, use a MAC function!
• The SNA protocol allowed redirection: giving to Bob a value from a
message which was originally sent - in this case, by Bob himself - to
a different entity (Alice). To prevent redirection, we should identify the
party in the challenge, or even better, use separate keys for each direction.
• Prevent replay and reorder. The SNA attack sent to Bob a first flow
which contained part of a message sent (by Bob) to Alice, during the
second flow. To prevent this, the protocol should identify the flow.

5.2.3 2PP: Three-Flows authenticating handshake protocol

We next present 2PP, a secure two-party shared-key authenticating handshake
protocol; the name 2PP simply stands for two-party protocol. The 2PP protocol
was a replacement for the SNA handshake protocol, proposed in [28].
Following the weaknesses identified above in the SNA handshake protocol,
the 2PP handshake follows these design principles:
Figure 5.4: 2PP: Three-Flows Authenticating Handshake Protocol.

Use the correct cryptographic mechanism needed: Specifically, 2PP uses


Message Authentication Code (MAC) for authentication, rather than re-
lying on block ciphers or encryption, as done by the SNA authentication
protocol.
Prevent redirection: Include identities of the parties (A,B) as part of the
input to the authentication function (MAC).
Prevent reorder: Separate 2nd and 3rd flows: 3 vs. 2 input blocks.
Prevent replay and do not provide ‘oracle’ to attacker: Authenticated
data (input to MAC) in a flow always includes a random value (nonce)
selected by the recipient in a previous flow.
The flows of the 2PP handshake are presented in Figure 5.4. The values
NA and NB are called nonces; each is an l-bit string, where l is the length
of the shared key κ, often referred to as the security parameter. The nonces
NA, NB are selected randomly by Alice (initiator) and Bob (responder),
respectively.
The 2PP handshake typically uses σ = NA ++ NB as the session identifier;
this ensures the uniqueness of the session identifier σ. The protocol, at both
ends, outputs BIND once it receives the first flow from the peer (so it can
construct σ = NA ++ NB), and OPEN upon receiving the (correct) last flow
from the peer. By validating this last flow, 2PP ensures mutual authentication
(Def. 5.2). If interested, see Lemma 5.1 and its proof, both in Note 5.3.
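As a concrete illustration, the following Python sketch runs the three 2PP flows in a single process, using HMAC-SHA256 as the MAC. Only the second flow's MAC input, MACκ(2 ++ A←B ++ NA ++ NB), is taken from the text; the field encoding and the exact contents of the third flow are our assumptions (the text only requires that its format differs from the second flow).

```python
import hashlib
import hmac
import os

def mac(key: bytes, *fields: bytes) -> bytes:
    # Illustrative MAC: HMAC-SHA256 over length-prefixed fields
    h = hmac.new(key, digestmod=hashlib.sha256)
    for f in fields:
        h.update(len(f).to_bytes(4, "big") + f)
    return h.digest()

def run_2pp(kappa: bytes) -> bytes:
    """Run all three flows in one process; return the agreed session
    identifier sigma = NA ++ NB (raises on verification failure)."""
    A, B = b"Alice", b"Bob"
    NA = os.urandom(16)                      # flow 1 (Alice -> Bob): A, NA
    NB = os.urandom(16)                      # Bob picks NB for flow 2
    # Flow 2 (Bob -> Alice): MAC over (2 ++ A<-B ++ NA ++ NB), as in the text
    tag2 = mac(kappa, b"2", A + b"<-" + B, NA, NB)
    if not hmac.compare_digest(tag2, mac(kappa, b"2", A + b"<-" + B, NA, NB)):
        raise ValueError("flow 2 rejected")  # Alice's check
    sigma = NA + NB                          # Alice BINDs the session
    # Flow 3 (Alice -> Bob): a differently-formatted MAC (our assumed encoding)
    tag3 = mac(kappa, b"3", A + b"->" + B, NB)
    if not hmac.compare_digest(tag3, mac(kappa, b"3", A + b"->" + B, NB)):
        raise ValueError("flow 3 rejected")  # Bob's check; Bob then OPENs
    return sigma
```

Note how each MAC input includes the flow number, the direction of the flow, and a nonce chosen by the recipient, directly reflecting the design principles above.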

5.3 Session-Authenticating Handshake Protocols


Works on cryptographic protocols, including handshake and session protocols,
usually adopt the Monster-in-the-Middle (MitM) adversary model (see subsec-
tion 5.1.3). In this case, entity authentication is often insufficient; the actual
messages exchanged between the parties should be protected. In this section,

Note 5.3: Proof that 2PP ensures mutual authentication

Lemma 5.1 (2PP ensures mutual authentication). The 2PP protocol, as in
Figure 5.4, is mutually authenticating.

Proof: Recall that in 2PP, σ = NA ++ NB. Suppose that there exists a PPT
algorithm ADV that results, with significant probability, in executions which
are not responder-authenticating, i.e., where some initiator, e.g., Alice (A ∈ P),
signals UP_A(NA ++ NB) without a previous BIND(Alice, NA ++ NB) at
responder Bob.
Alice signals UP_A(NA ++ NB) only upon receiving MACκ(2 ++ A←B ++ NA ++ NB).
When does Bob send this? Only in runs in which Bob received the first flow
A, NA, supposedly from Alice, and then selected NB randomly. In this case,
Bob would also signal BIND(Alice, NA ++ NB); but we assumed that (with
significant probability) Bob did not signal BIND(Alice, NA ++ NB).
We conclude that (with significant probability), Bob didn't send the message
2 ++ A←B ++ NA ++ NB (and MACκ(2 ++ A←B ++ NA ++ NB)). Notice that
Alice definitely never computed MACκ over 2 ++ A←B ++ NA ++ NB, simply
since, in 2PP, Alice only computes MAC over messages containing A→B, never
over messages containing A←B.
Yet, our assumption was that ADV, with significant probability, creates
executions where Alice signals UP_A(NA ++ NB), implying that at some point
during the execution, Alice received MACκ(2 ++ A←B ++ NA ++ NB), although
neither she nor Bob ever invoked MACκ over this message. Since ADV is a
MitM, surely this message is available to ADV.
We can now use ADV as a subroutine of another PPT adversary ADV_MAC,
which is able to output this MACκ value without calling an oracle with this
particular value, hence contradicting the assumption that MAC is secure
against forgery (Definition 3.1). Adversary ADV_MAC simply runs ADV until
ADV outputs MACκ(2 ++ A←B ++ NA ++ NB) - with significant probability.
The contradiction shows that our assumption of the existence of such ADV
must be false.

we study a minor extension, where the handshake protocol not only
authenticates the entities, but also the session, including the messages
exchanged between the parties.

5.3.1 Session-authenticating handshake: signals, requirements and variants

A session-authenticating handshake protocol includes additional signals,
allowing initiator and responder to exchange one or a few messages in a
specific session σ. Specifically, the protocol at both parties has an additional
input signal SEND(σ, m), requesting to send message m to the peer over
session σ, and the additional output signal RECEIVE(σ, m), to deliver message
m received from the peer over σ. For simplicity, the message-authenticating
handshake protocols we discuss in this subsection send at most one message
over each session per participant; session protocols, discussed later, extend
this to transmission of multiple messages.
A handshake protocol ensures session authentication if it ensures mutual
entity-authentication (Definition 5.2), and furthermore, a message is received
at one party, say Bob, at session σ, only if it was sent by the peer party, say
Alice, in the same session σ, and after the session σ was OPENed at Bob.
Furthermore, if a session σ terminates successfully at one party, e.g., with
DOWN_I(σ, OK), after a message was sent, i.e., a SEND(σ, m) input signal,
then that message was previously successfully received by the peer in this
session σ, i.e., a RECEIVE(σ, m) output signal occurred earlier.
Note that this definition builds on the mapping that exists between sessions
at both ends, as required from a mutual entity-authenticating protocol (Def. 5.2).
The session authentication property implies message authentication, i.e.,
messages received by one party were indeed sent by the other party. It also im-
plies freshness, which means that messages are received in FIFO order, without
any reordering or duplications; however, omissions may be permitted. Fresh-
ness also implies that when a message is received, say by Bob, in some ses-
sion, that message was sent by Alice during the same session, after the session
OPENed in Bob.

Session-authenticating handshake: three variants. We present three
slightly different session-authenticating handshake protocols. We begin with a
session-authenticating variant of 2PP; like ‘regular’ 2PP, the session-authenticating
2PP variant also involves three flows. This protocol can authenticate one
message from responder to initiator and one message from initiator to responder
- however, the responder has to send its message before receiving the message
from the initiator.
However, in the very common and important case where a client device
initiates a session and sends a request, and the server sends back a response, we
need the initiator (client) to send the first message and the responder (server)
to send back a response, which is not handled by message-authenticating 2PP.
We present two other variants that support this; the first simply extends 2PP
to an additional (fourth) flow, and the other assumes synchronized clocks,
allowing the handshake to involve only one message from client to server and
one response from server to client.

5.3.2 Session-authenticating 2PP


We first discuss the three-flows session-authenticating handshake protocol; this
is a minor extension to the base 2PP, as shown in Figure 5.5. In fact, the only
change is adding the messages (mR from responder to initiator, and mI from
initiator to responder) to the second and third flows, respectively.
The three-flows session-authenticating 2PP has, however, a significant
drawback, which makes it ill-suited for many applications. Specifically, in this
protocol, the first message is sent by the responder, and only then the second
message is sent, by the initiator. This does not fit the common ‘request-response’
interaction, where the initiator contacts the responder and sends to it some
request, and the responder sends back a response to the request.

Figure 5.5: Three-flows session-authenticating 2PP handshake
In the next two subsections, we present two variants of the message-authenticating
handshake that support sending a request by the initiator and receiving a
response from the responder; we refer to such protocols as request-response
authenticating. Note that each of these two variants also has a drawback, com-
pared to the three-flows session-authenticating 2PP: either a fourth flow, or
relying on synchronized time and/or state.

5.3.3 Nonce-based request-response authenticating handshake protocol

We now present another minor variant of 2PP, which also provides session
authentication for a short message - a request message req from Alice to Bob,
and a response message resp from Bob to Alice, using nonces. This protocol
allows the initiator (Alice) to send its message - the request - first, and the
responder sends its message - the response - only upon receiving the request.
The protocol does not require the parties to use synchronized clocks, and while
it does require each party to remember the (random) values of the nonces that
this party picked, there is no need to maintain this state once the session ended.
See Figure 5.6.
The four-flows handshake involves two simple extensions of the basic 2PP
protocol. The first extension is an additional, fourth flow, from responder back
to initiator, which carries the response of the responder to the request from the
initiator, which is sent as part of the third flow (which is the second flow from
initiator to responder). The second extension is simply the inclusion of these
messages - request and response - in the corresponding flows, and as inputs to
the Message Authentication Code (MAC) applied and sent in both flows.
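The two extensions can be sketched as follows; the precise field encoding is our assumption, following the 2PP pattern of covering the flow number, the identities, and both nonces in each MAC input, now together with the request and response.

```python
import hashlib
import hmac
import os

def mac(key: bytes, *fields: bytes) -> bytes:
    # HMAC-SHA256 over length-prefixed fields (illustrative encoding)
    h = hmac.new(key, digestmod=hashlib.sha256)
    for f in fields:
        h.update(len(f).to_bytes(4, "big") + f)
    return h.digest()

def request_response(kappa: bytes, request: bytes, response: bytes) -> bool:
    A, B = b"Alice", b"Bob"
    NA, NB = os.urandom(16), os.urandom(16)  # exchanged in flows 1 and 2
    # Flow 3 (Alice -> Bob): the request, authenticated with both nonces
    tag3 = mac(kappa, b"3", A, B, NA, NB, request)
    assert hmac.compare_digest(tag3, mac(kappa, b"3", A, B, NA, NB, request))
    # Flow 4 (Bob -> Alice): the response, also bound to both nonces
    tag4 = mac(kappa, b"4", B, A, NA, NB, response)
    assert hmac.compare_digest(tag4, mac(kappa, b"4", B, A, NA, NB, response))
    return True
```

Because both nonces are covered by both MACs, a replayed request or response fails verification in any later session, whose nonces are (with overwhelming probability) different.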
The disadvantage of this protocol is, obviously, the need for one additional
flow, which also implies an additional ‘round trip’ delay until sending the
request and response. We next present another request-response authenticating
handshake, which requires only two flows, at the price of requiring synchronized
clocks or state.

Figure 5.6: Four-flows, request-response authenticating handshake protocol

Figure 5.7: Timestamp-based Authenticated Handshake, assuming Synchronized
Clocks

5.3.4 Two-Flows Request-Response Authenticating Handshake, assuming Synchronized State

The 2PP protocol requires three flows, and the request-response authenticat-
ing variant presented above requires an additional flow. In contrast, Fig. 5.7
presents a simple alternative request-response authenticating handshake pro-
tocol, which requires only two flows.
The challenge in this protocol is for the responder to verify the freshness of
the request, i.e., that the request is not a replay of a request already received
in the past. Freshness also implies no reordering; for example, a responder
pR should reject request x from pI, if pR already received another request
x′ from pI, where x′ was sent after x. Freshness prevents an attacker from
replaying information from previous exchanges. For example, consider the
request-response authentication of Figure 5.6; if NB is removed (or fixed), then
an eavesdropper to the flows between Alice and Bob in one request-response
session can copy these flows and cause Bob to process the same request again.
For some requests, e.g., Transfer 100$ from my account to Eve, this can be a
concern.
To ensure freshness without requiring the extra flows, one may use Times-
tamps instead of exchanging nonces. Notice that even if clocks are synchro-
nized, there is some delay for messages, forcing recipients to allow messages
which contain timestamps earlier than the time at which they were received.
To deal with this, the recipient (e.g., Bob) typically remembers the last times-
tamp received, and processes only requests with increasing timestamps. This
memory can be erased after there has been no new request for enough time,
allowing the recipient to confirm that a new request is not a replay. Still, when
the delay is large, timestamps have the additional advantage that the recipient
does not need to remember all ‘pending nonces’ (sent and not yet received),
assuming synchronized clocks.
Timestamp-based authenticated handshake may operate as follows. The
initiator includes with each request, say of session σ, the timestamp value
TA(σ), which is guaranteed to exceed the value TA(σ′) sent with any request
previously sent from this initiator to this responder. One way to implement
TA is to set it, at time t, to TA = timeA(t), where timeA(t) gives the current
value of a clock maintained by Alice, which increases monotonically with
time t. Another way for the initiator, e.g. Alice, to implement TA is using
a counter maintained by Alice and incremented upon every OPEN or SEND
event. However, the requirement to maintain a persistent state is often harder
to meet than the requirement of keeping a clock.
The responder, e.g. Bob in Fig. 5.7, needs to maintain the last-received
value of TA and ignore or reject requests which arrive with TA values smaller
than the last value it received. Even this state may be reduced if the responder
also has a clock, synchronized - to some extent - with the initiator, as we show
in the following exercise.

Exercise 5.5. Assume that the initiator (Alice) and responder (Bob) have
synchronized clocks, timeA () and timeB (), whose values never differ by more
than ∆ seconds, i.e. for every time t holds |timeA (t) − timeB (t)| ≤ ∆, and
that the delay for a message sent between Alice and Bob is at most D. Explain
how to modify the protocol to reduce the persistent storage requirements, while
maintaining security (and in particular, ensuring freshness).
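A minimal sketch of the responder's basic freshness check described above, accepting only strictly increasing timestamps per initiator (names are illustrative; MAC verification and the clock-based storage reduction of the exercise are elided):

```python
class Responder:
    """Accept only requests whose timestamp strictly exceeds the last
    accepted one from the same initiator (MAC verification elided)."""

    def __init__(self):
        self.last_seen = {}  # initiator -> last accepted timestamp

    def accept(self, initiator: str, t_a: int) -> bool:
        if t_a <= self.last_seen.get(initiator, -1):
            return False     # replayed or reordered request: reject
        self.last_seen[initiator] = t_a
        return True
```

For example, after accepting a request with timestamp 5 from Alice, the responder rejects any later arrival carrying timestamp 5 or less.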

5.4 Key-Setup Handshake


In this chapter, as in most of this course, and, in fact, in most of the work
on cryptographic protocols, we adopt the MitM adversary model, as presented
in subsection 5.1.3. A MitM adversary is able to eavesdrop on messages as well
as to modify them. In a typical use of a handshake protocol, the handshake

is used only to securely set up a following exchange of messages - a session.
Therefore, if we consider a MitM adversary, we normally have to consider not
just attacks on the handshake phase, but also following attacks on the session
itself.
Later, in subsection 5.1.4, we discuss the session protocols that protect the
communication over a session from a MitM attacker; these protocols rely on
a secret key shared between the participants in the session. In principle, the
parties could use a fixed shared secret key for all sessions between them - e.g.,
one could simply use the shared initialization (‘master’) key κ. However, there
are significant security advantages in limiting the use of each key, following
Principle 2. Specifically:

• By changing the key periodically, we reduce the amount of ciphertext
using the same key that is available to the cryptanalyst, which may make
cryptanalysis harder or infeasible.
• By changing session keys periodically, and making sure that each session
key remains secret (pseudo-random) even if all other session keys are
exposed, we limit or reduce the damage due to exposure of some of
the keys.
• The separation between session keys and master key allows some or all
security to be preserved even after attacks which expose the entire storage
of a device. One way to achieve this is when the master key is confined
to a separate Hardware Security Module (HSM), protecting it even when
the device storage is exposed. We later discuss solutions which achieve
limited preservation of security following exposure, even without an HSM.
Handshake protocols are often used to setup a new, separate key for each
session; when sessions are of limited length, and/or are limited in the amount of
information carried over them, this provides the desired reduction in ciphertext
associated with any single key. While such protocols usually also ensure mutual
authentication, we now focus on the key-setup aspect.

5.4.1 Key-Setup Handshake: Signals and Requirements


Key-setup handshake protocols require only a minor change to the UP signal,
as described in subsection 5.2.1. Specifically, we only need to add the session
key as part of the UP signal, as in: UP_ρ(σ, k), to signal that session σ, at party
ρ ∈ {I, R}, would use shared key k.
There are two security requirements from key-setup handshake protocols.
The first requirement is that the two parties agree on the same key. For
simplicity, we require this to hold in executions where the responder (pR)
outputs UP_R(σ, k); this usually suffices, since in most protocols, e.g., 2PP, a
UP_R event implies a previous UP_I event.
The second requirement from key-setup handshakes is that each session key
would be secret. More precisely, each session key should be pseudo-random, i.e.,
indistinguishable from a random string of the same length, even if the adversary
is given all the other session keys.
We say that a two-party shared-key handshake protocol ensures secure key-
setup if it ensures both requirements, i.e., it ensures synchronized key-setup as
well as pseudo-random session keys.

5.4.2 Key-setup 2PP extension


We next explain the key-setup 2PP extension, a simple extension to the 2PP
protocol, that ensures secure key-setup. This is achieved by outputting, in the
UP_I and UP_R signals, the session key as:

k = PRFκ(NA ++ NB)    (5.2)

In Eq. (5.2), κ denotes the long-term shared secret key (provided to both
parties in the INIT(κ) input signal), and NA, NB are the values exchanged in
the protocol.
Since both parties compute the key in the same way from NA ++ NB, it
follows that they will receive the same key, i.e., the key-setup 2PP extension
ensures synchronized key-setup. Furthermore, since NA and NB are chosen
randomly for each session, each session identifier - the pair NA ++ NB - is
used, with high probability, in no more than a single session. Since the session
keys are computed using a pseudo-random function, k = PRFκ(NA ++ NB),
it follows that the key of each session is pseudo-random (even given all other
session keys). Namely, the key-setup 2PP extension ensures secure key setup.
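A sketch of Eq. (5.2), using HMAC-SHA256 as the PRF (the choice of PRF is ours; any secure PRF works):

```python
import hashlib
import hmac
import os

def prf(key: bytes, data: bytes) -> bytes:
    # HMAC-SHA256 standing in for the PRF (our choice; any secure PRF works)
    return hmac.new(key, data, hashlib.sha256).digest()

kappa = os.urandom(32)                   # long-term shared key, from INIT(kappa)
NA, NB = os.urandom(16), os.urandom(16)  # nonces exchanged in the handshake
k_initiator = prf(kappa, NA + NB)        # Alice: k = PRF_kappa(NA ++ NB)
k_responder = prf(kappa, NA + NB)        # Bob computes the same value
assert k_initiator == k_responder        # synchronized key-setup
```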

5.4.3 Key-Setup: Deriving Per-Goal Keys


Following the key-separation principle (Principle 6), session protocols often use
two separate keyed cryptographic functions, one for encryption and one for
authentication (MAC); the key used for each of the two goals should be pseudo-
random, even given the key for the other goal. We refer to such keys as per-goal
keys. The next exercise explains how we can use a single shared key, from the
2PP or another key-setup protocol, to derive such per-goal keys.

Exercise 5.6 (Per-goal keys).

1. Show why it is necessary to use separately pseudorandom keys for
encryption and for authentication (MAC), i.e., per-goal keys.
2. Show how to securely derive one key kE for encryption and a key kA for
authentication, both from the same session key k, yet each key (e.g., kE)
is pseudo-random even given the other key (resp., kA).
3. Show a modification of the key-setup 2PP extension, which also derives
a secure, pseudo-random pair of keys kE, kA, but a bit more efficiently
than by deriving both of them from k.

Explain the security of your solutions.
Hints for solution:

1. Assume you are given a secure encryption and MAC and ‘corrupt’ them
into the required counter-example.
2. Let kE = PRFk(‘E’), kA = PRFk(‘A’).
3. Instead of deriving k as in Eq. (5.2), derive kE, kA ‘directly’ using:

kE = PRFκ(‘E’ ++ NA ++ NB)    (5.3)
kA = PRFκ(‘A’ ++ NA ++ NB)    (5.4)

To further improve security of the session protocol, we may use two separate
pairs of per-goal keys: one pair (kE^{A→B}, kA^{A→B}) for (encryption,
authentication) of messages from Alice to Bob, and another pair (kE^{B→A},
kA^{B→A}) for (encryption, authentication) of messages from Bob to Alice.
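The direct derivation of all four keys can be sketched as follows, extending Eqs. (5.3)-(5.4) with direction labels; the label strings are illustrative (any fixed, distinct labels work):

```python
import hashlib
import hmac
import os

def prf(key: bytes, data: bytes) -> bytes:
    # HMAC-SHA256 standing in for the PRF (our choice of PRF)
    return hmac.new(key, data, hashlib.sha256).digest()

kappa = os.urandom(32)                   # long-term shared key
NA, NB = os.urandom(16), os.urandom(16)  # nonces from the handshake
# Derive the four per-goal, per-direction keys directly from the master key;
# distinct labels yield independent-looking (pseudo-random) keys.
labels = (b"E:A->B", b"A:A->B", b"E:B->A", b"A:B->A")
keys = {label: prf(kappa, label + NA + NB) for label in labels}
assert len(set(keys.values())) == 4      # four distinct keys
```

Since each key is obtained by applying the PRF to a distinct input, each remains pseudo-random even if the other three are exposed.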

Exercise 5.7.

1. How could the use of separate, pseudo-random pairs of per-goal keys for
the two ‘directions’ improve security?
2. Show how to securely derive all four keys (both pairs) from the same
session key k.
3. Show a modification of the key-setup 2PP extension, which securely
derives all four keys (both pairs) ‘directly’, a bit more efficiently than by
deriving them from k.

Explain the security of your solutions.

Hints for solution of parts 2 and 3: follow a similar approach as for Ex. 5.6,
but derive directly from the ‘master key’, saving a cryptographic operation.

5.5 Key Distribution Protocols and GSM


In this section, we expand a bit beyond our focus on two party protocols, to
briefly discuss shared-key, three-party Key Distribution Protocols. In general,
key distribution protocols establish a shared key between two or more entities.
We focus on Key Distribution Protocols which use only symmetric cryptogra-
phy (shared keys), and involve only three parties: Alice, Bob - and a trusted
third party (TTP), often referred to as the Key Distribution Center (KDC),
whose goal is to establish a shared key between the other parties. The KDC
shares a key with each party: kA with Alice and kB with Bob; using these keys,
the KDC helps Alice and Bob to share a symmetric key kAB between them.
There are many types of Key Distribution Protocols. We present one
typical, simple protocol in Figure 5.8. Later in this section, we will focus on a
different key distribution protocol, which is used in the GSM cellular network
standard - and which is notoriously insecure.

Figure 5.8: Key Distribution Center Protocol. The flows are:

Alice → KDC:  ‘Bob’, time, MAC_{kA^M}(time ++ ‘Bob’)
KDC → Alice:  cB = E_{kB^E}(kAB), T = [cB, MAC_{kB^M}(time ++ ‘Alice’ ++ cB)],
              cA = E_{kA^E}(kAB), MAC_{kA^M}(time ++ ‘Bob’ ++ cA ++ T)
Alice:        kAB^M ← PRF_{kAB}(‘MAC’), kAB^E ← PRF_{kAB}(‘Enc’)
Alice → Bob:  T, cReq = E_{kAB^E}(Request), MAC_{kAB^M}(1 ++ A→B ++ time ++ cReq)
Bob:          derives kAB from T, and kAB^E, kAB^M from kAB
Bob → Alice:  cResp = E_{kAB^E}(Response), MAC_{kAB^M}(2 ++ A←B ++ time ++ cResp)

The protocol assumes that the KDC shares two keys with each party; with
Alice, it shares kA^E for encryption and kA^M for authentication (MAC), and
with Bob, it shares kB^E for encryption and kB^M for authentication (MAC).
In this protocol, the KDC selects a shared key kAB to be used by Alice and
Bob for the specific request-response. Alice and Bob use kAB and a pseudo-
random function PRF to derive two shared keys, kAB^E = PRF_{kAB}(‘Enc’)
(for encryption) and kAB^M = PRF_{kAB}(‘MAC’) (for authentication, i.e.,
MAC).
The process essentially consists of two exchanges. The first exchange is
between Alice and the KDC. In this exchange, the KDC sends to Alice the key
kAB that will be shared between Alice and Bob. In addition, Alice receives a
ticket T , which is, essentially, the key kAB , encrypted and authenticated - for
Bob.
In the second phase, Alice sends her request to Bob, together with the ticket
T, allowing Bob to retrieve kAB. Alice and Bob both derive from kAB the
shared encryption and authentication (MAC) keys, kAB^E and kAB^M,
respectively.
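The ticket handling on Bob's side can be sketched as follows, with HMAC-SHA256 as both the MAC and the PRF; the toy XOR-pad ‘encryption’ and the explicit nonce are our simplifications, not part of Figure 5.8:

```python
import hashlib
import hmac
import os

def prf(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

def xor_enc(key: bytes, data: bytes, nonce: bytes) -> bytes:
    # Toy XOR-pad cipher for the sketch only - NOT a production cipher
    pad = prf(key, b"pad" + nonce)
    return bytes(a ^ b for a, b in zip(data, pad))

# Long-term keys the KDC shares with Bob (encryption and MAC), per Fig. 5.8
kB_E, kB_M = os.urandom(32), os.urandom(32)
time_now = b"2020-09-30T12:00"

# KDC: select k_AB and build the ticket T = [c_B, MAC(time ++ 'Alice' ++ c_B)]
k_AB = os.urandom(32)
nonce = os.urandom(16)                    # our addition; Fig. 5.8 elides it
c_B = xor_enc(kB_E, k_AB, nonce)
T = (nonce, c_B, prf(kB_M, time_now + b"Alice" + c_B))

# Bob: verify the ticket, recover k_AB, and derive the per-goal keys
n, c, tag = T
assert hmac.compare_digest(tag, prf(kB_M, time_now + b"Alice" + c))
k_AB_bob = xor_enc(kB_E, c, n)            # XOR decryption = re-encryption
assert k_AB_bob == k_AB
kE_AB = prf(k_AB_bob, b"Enc")             # kAB^E = PRF_{kAB}('Enc')
kM_AB = prf(k_AB_bob, b"MAC")             # kAB^M = PRF_{kAB}('MAC')
```

Note that Bob never talks to the KDC directly: everything he needs arrives inside the ticket, protected under the keys he already shares with the KDC.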
Note that in the above protocol, the KDC never initiates communication,
but only responds to an incoming request; this communication pattern, where
a server machine (in this case, the KDC) only responds to incoming requests,
is referred to as client-server. It is often preferred, since it relieves the server
(e.g., KDC) from the need to maintain state for different clients, which makes
it easier to implement an efficient service, especially when clients may access
different servers.

In many applications, the TTP has an additional role: access control.
Namely, the TTP would control the ability of the client (Alice) to contact
the service (Bob). In this case, the ticket not only transports the key but
also serves as a permit for the use of the server. One important example for
such a service is the Kerberos system [135], adopted in Windows and other
systems.
This protocol requires and assumes synchronized clocks (time) between all
parties. This allows Alice and Bob to validate that the key KAB they receive is
‘fresh’ and not a replay; when tickets also serve as permits, this also allows Bob
to validate that Alice was allowed access to the service (at the given time). If
there are no synchronized clocks, then we must use a different flow than shown
in Figure 5.8; in particular, Bob should be able to provide a nonce NB to the
KDC, and then to validate that nonce NB exists in the ticket that he receives.
Such protocols were also studied and deployed, e.g., see [27].

Exercise 5.8. Extend the KDC protocol of Figure 5.8 to provide also access-
control functionality by the KDC. Specifically, Alice should send her request also
to the KDC, in an authenticated and private manner; and the KDC should in-
clude a secure signal for Bob within the ticket T , letting Bob know that the
KDC approves that request. Your solution should only use the flows in Fig-
ure 5.8, adding and/or modifying the information sent as necessary. The so-
lution should avoid exposure of the request to the attacker (eavesdropper); the
KDC, of course, should be aware of the request (to approve it).

Hint: notice that an extension to the protocol of Figure 5.8 is required,
since in that protocol, the KDC does not even receive Alice's request, and
surely cannot ‘approve’ it.

Exercise 5.9. Present an alternative design for a KDC protocol, which avoids
the assumption of synchronized clocks. Your solution should maintain client-
server communication, i.e., the KDC (as a server) should only send responses
to incoming requests, and never initiate communication with a client.

Hint: you may solve this by defining the contents of the flows in Fig. 5.9;
and you may use ideas from the 2PP protocol. Or, a simpler solution may use
additional state in the KDC, essentially, for counters.
The rest of this section focuses on a very specific, important case-study of
a three-party key distribution protocol, namely, that of the GSM network, and
some of its (very serious) vulnerabilities.

5.5.1 Case study: the GSM Key Distribution Protocol


We next discuss the GSM security protocol, another important-yet-vulnerable
shared-key authentication and key-setup protocol. The GSM security protocol
is performed at the beginning of each connection between a Mobile and a Base.
The base is the cellular network provider used in this connection; the base
typically does not maintain the mobile's key ki, which is known only to the
mobile itself and to the Home, which is the cellular network provider to whom
the mobile is subscribed. The home retrieves ki from its clients' keys table CK,
but does not send it to the base. Instead, the home uses ki to derive and send
to the base per-session credentials, allowing the base to authenticate the client
and to set up a shared key with it, to be used to encrypt the communication
for confidentiality. We now provide some more details on this process, and
later discuss some significant vulnerabilities.

Figure 5.9: Key Distribution Center Protocol, without use of synchronized
clocks (Exercise 5.9). The figure shows only the three parties - Alice, KDC
and Bob - as the flows are to be designed in the exercise.
Fig. 5.10 shows a simplification of the flows of the GSM handshake pro-
tocol. The handshake begins with the mobile sending its identifier IMSI (In-
ternational Mobile Subscriber Identity). The base forwards the IMSI to the
home, which retrieves the key of the user owning this IMSI, which we denote
ki . The home then selects a random value r, and computes the pair of values
(K, s) as (K, s) ← A38(ki , r). Here, K is a symmetric session key, and s stands
for secret result, a value used to authenticate the mobile. The function² A38
should be a Pseudo Random Function (PRF).
² The spec actually computes K, s using two separate functions: s ← A3(ki, r)
and K ← A8(ki, r), and does not specify the required properties of A3 and A8.
However, for security, the functions should be ‘independently pseudorandom’ -
for example, using A3 = A8 is clearly insecure. Indeed, in practice, both are
computed by a joint function - usually a specific function called COMP128. We
use the abstract notation A38 for this function; COMP128 is a possible
instantiation.

Figure 5.10: The GSM key distribution protocol (simplified). Time proceeds
from top to bottom.

The home sends the resulting GSM authentication triplet (r, K, s) to the
base. For efficiency, the home usually sends multiple triplets to the base (not
shown). Another way for the base to avoid requesting another authentication
triplet from the home is to reuse a triplet in multiple connections with the
same mobile. Note that the GSM spec uses the terms KC, rand and sres for
what we denote by K, r and s, respectively.
Fig. 5.10 also shows an example of a message m sent from mobile to base,
encrypted using the connection's key K. Of course, in typical real use, the
mobile and the base exchange a session consisting of many messages encrypted
with EK.
Contrary to Kerckhoffs' principle (Principle 3), all of GSM's cryptographic
algorithms, including the stream ciphers as well as the A3 and A8 algorithms,
were kept secret, apparently in the (false) hope that this would improve
security. We believe that the choice of keeping GSM algorithms secret
contributed to the fact that these algorithms proved to be vulnerable - and
furthermore, that serious, and pretty obvious, vulnerabilities exist also in the
GSM security protocols. We discuss two of these in the next two subsections.
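The triplet computation and verification in the handshake above can be sketched as follows; HMAC-SHA256 stands in for A38 (the real COMP128 and the GSM field sizes differ):

```python
import hashlib
import hmac
import os

def a38(ki: bytes, r: bytes):
    # Stand-in for A38: HMAC-SHA256 output split into session key K and
    # secret result s (real COMP128 and GSM field sizes differ)
    out = hmac.new(ki, r, hashlib.sha256).digest()
    return out[:16], out[16:20]     # (K, s)

ki = os.urandom(16)                 # subscriber key: known to mobile and home
r = os.urandom(16)                  # random challenge chosen by the home
K_home, s_home = a38(ki, r)         # home computes the triplet (r, K, s)
K_mobile, s_mobile = a38(ki, r)     # mobile recomputes from ki and r
assert s_mobile == s_home           # base compares s to authenticate mobile
assert K_mobile == K_home           # both sides now share session key K
```

The base never sees ki: it only receives (r, K, s), sends r to the mobile, and checks the returned s.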

Figure 5.11: Fake-Base Replay attack on GSM. Time proceeds from left to
right.

5.5.2 Replay attacks on GSM


We now describe a simple attack against the GSM handshake, as in Figure 5.10.
The attack involves a false base provider, i.e., an attacker impersonating a
legitimate base. Note that GSM designers seem to have assumed that such
an attack is infeasible, as building such a device is ‘too complex’; this proved
short-sighted, and fake-base attacks have been, and still are, common. This
assumption - and, in particular, the implication that GSM is insecure against
a MitM adversary - was not stated clearly, and certainly was not really
necessary: security against a MitM attacker requires only minor changes and
no significant overhead; i.e., GSM design violated both Principle 1, clear attack
model, and Principle 7, conservative design.
One typical, simple variant of the attack is the fake-base replay attack,
shown in Figure 5.11. Note that both Figure 5.10 and Figure 5.11 are schedule
diagrams, but they are drawn differently: in Figure 5.11, time proceeds
horizontally from left to right, while in Figure 5.10 time proceeds vertically,
from top to bottom.
The attack has three phases:

Eavesdrop: in the first phase, the attacker eavesdrops on a legitimate
connection between the mobile client and a legitimate base. The handshake
between the client and the base is exactly as in Figure 5.10, except
that Figure 5.11 does not show the home (and the messages sent to it).
Cryptanalyze: in the second phase, the attacker cryptanalyzes the
ciphertexts collected during the eavesdrop phase. Assume that the attacker is
successful in finding the session key K shared between client and base;
this is reasonable, since multiple effective attacks are known on the GSM
ciphers A5/1 and A5/2.
Impersonate: finally, once cryptanalysis has exposed the session key K, the
attacker sets up a fake base station and uses that key to communicate
correctly with the client. This allows the attacker to eavesdrop and mod-
ify the communication. The attacker may also relay the messages from
the client to the ‘real’ base; one reason to do this is so that the client is
unlikely to detect that she is actually connected via a rogue base.

5.5.3 Cipher-agility and Downgrade Attacks


Practical cryptographic protocols should be designed in a modular manner,
and in particular, should allow the use of different cryptographic systems, as
long as they fulfill the requirements, e.g., encryption, PRF or MAC; this prop-
erty is usually referred to as cipher-agility. The set of cryptographic schemes
used in a particular execution of the protocol is referred to as the ciphersuite,
and the process of negotiating the ciphersuite is called ciphersuite negotiation.
Although ciphersuite negotiation is not complex, it is all too often done inse-
curely, allowing different downgrade attacks, which allow an attacker to trick
the parties into using a particular ciphersuite chosen by the attacker, typically a
vulnerable one. These attacks usually involve a Monster-in-the-Middle (MitM)
attacker.
GSM supports cipher-agility, in the sense that the base and the client (mo-
bile) negotiate the stream-cipher to use; GSM defines three stream-ciphers,
denoted A5/1, A5/2 and A5/3 (also called Kasumi), and also the ‘null’ A5/0
which simply means that no encryption is applied. However, the GSM cipher-
suite negotiation is not protected at all, allowing trivial downgrade attacks.
Worse, GSM has a very unusual property, making downgrade attacks much
worse than with most systems/protocols: the same key is used by all encryp-
tion schemes, allowing an attacker to find a key used with one (weak) scheme,
and use it to decipher communication protected with a different (stronger) ci-
pher. In this subsection, we describe the GSM ciphersuite negotiation process
and the related vulnerabilities and attacks.

GSM ciphersuite negotiation. The GSM ciphersuite negotiation process is shown in Figure 5.12. In the first message of the handshake, containing
the mobile client’s identity (IMSI), the client also sends the list ciphers of
the stream-ciphers it supports. The base selects the stream cipher it prefers,
among those in the list ciphers. Usually, the base would select the stream
cipher considered most secure among those that this base supports. In the
case of GSM, the A5/2 cipher is well known to be relatively weak; in fact, the
A5/2 cipher was designed to be weak. A weak cipher was considered necessary
to allow export of GSM devices to countries to which it was not allowed, at the
time, to export devices with (secure) encryption. Indeed, very effective attacks
were found against A5/2, and quite effective attacks were also found against

Figure 5.12: Cipher-suite negotiation in GSM.

A5/1, making A5/3 the preferred algorithm (with some attacks against it too).
We denote encryption with stream-cipher A5/i and key K as E_K^{A5/i}.

A simplified, unrealistic downgrade attack. We first present a simplified, unrealistic downgrade attack against GSM in Figure 5.13. In this attack,
the client supports A5/1 and A5/2, but the MitM attacker ‘removes’ A5/1 and
only offers A5/2 to the base. As a result, the entire session between base and
client is only protected using the (extremely vulnerable) A5/2 cipher.

A ‘real’ downgrade attack. The simplified attack in Figure 5.13 usually fails in practice. The reason is that GSM specifies that all clients
should support the A5/1 cipher, and, furthermore, that a base supporting
A5/1 should refuse to use A5/2 (or A5/0, which means no encryption at all).
However, a minor variant of the attack circumvents this problem, by modifying
the message from the base to the client, rather than the message from the client
to the base, as shown in Figure 5.14.
In the ‘real’ downgrade attack on GSM, as in Figure 5.14, the base still uses
the ‘stronger’ cipher, e.g., A5/1. The attacker modifies the ciphersuite selec-
tion, sent from the base to the client, causing the client to use the vulnerable
cipher, e.g., A5/2. Since A5/2 is very weak, it may be broken in a few seconds, or even in under a second.
However, note that the GSM base waits significantly less than a second for
the first encrypted message, from the time it sends the Start message. If the

Figure 5.13: A simplified, unrealistic Downgrade Attack on GSM. This version usually fails, since GSM specifies that all clients should support the A5/1 cipher, so the Base should refuse to use A5/2.

MitM only began the cryptanalysis of A5/2 after receiving the Start message, the base would time out and ‘break’ the connection.
This issue is easily circumvented, however, by exploiting the fact that GSM bases allow the client much more time to compute the authenticator s - more
than 10 seconds, typically. The MitM attacker exploits this fact by delaying
the transmission of the authenticator s, until it finishes cryptanalysis of the
message encrypted with A5/2 and finds the key K. Only at that point, the
MitM forwards s to the base; when receiving the Start command, the MitM
attacker can easily forward the messages from the client after re-encrypting
them with A5/1. See Figure 5.14.
Several additional variants of this attack are possible; see, for example, the
following exercise.

Exercise 5.10 (GSM combined replay and downgrade attack). Consider an attacker who eavesdrops and records the entire communication between mo-
bile and base during a connection which is encrypted using a ‘strong’ cipher,
say A5/3. Present a sequence diagram, like Figure 5.11, showing a ‘combined
replay and downgrade attack’, allowing this attacker to decrypt all of that ci-
phertext communication by later impersonating as a base, and performing a
downgrade attack.

Hint: the attacker will resend the value of r from the eavesdropped-upon
communication (encrypted using a ‘strong’ cipher) to cause the mobile to re-
use the same key - but with a weak cipher, allowing the attacker to expose the
key.

Figure 5.14: GSM Downgrade Attack - the ‘real’ version. This version works,
since GSM clients are willing to use the weak A5/2 cipher.

Protecting against downgrade attacks. Downgrade attacks involve modification of information sent by the parties - specifically, the possible and/or
chosen ciphers. Hence, the standard method to defend against downgrade
attacks is to authenticate the exchange, or at least, the ciphersuite-related in-
dicators.
Note that this requires the parties to agree on the authentication mech-
anism, typically, a MAC scheme. It may be desirable to also negotiate the
authentication mechanism. In such a case, the negotiation should be bounded to a reasonable time, and the use of the authentication scheme and key limited to a
few messages, to foil downgrade attacks on the authentication mechanism. Ev-
ery authentication mechanism supported should be secure against this (weak)
attack.
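A standard instantiation of this defense is to MAC the full negotiation transcript under the shared key, so that any tampering with the offered or chosen ciphersuites is detected. The following Python sketch illustrates the idea; HMAC-SHA256 stands in for the MAC, and all names and values are illustrative assumptions, not part of GSM or any standard.

```python
import hmac, hashlib

def transcript_mac(key: bytes, offered: list, chosen: str) -> bytes:
    # MAC over the entire negotiation transcript: the offered list plus the choice.
    transcript = ("|".join(offered) + "||" + chosen).encode()
    return hmac.new(key, transcript, hashlib.sha256).digest()

key = b"shared-mac-key"                 # illustrative shared key
offered_by_client = ["A5/3", "A5/1"]    # what the client actually sent
seen_by_base = ["A5/1"]                 # what the base received, after MitM tampering
chosen = "A5/1"

# Each side MACs the transcript as it saw it; the client compares the base's
# tag against a tag over what it actually sent, detecting the downgrade.
tag_base = transcript_mac(key, seen_by_base, chosen)
tag_client = transcript_mac(key, offered_by_client, chosen)
downgrade_detected = not hmac.compare_digest(tag_base, tag_client)
```

If the MitM does not tamper with the offers, the two tags match; since the MitM does not know the shared key, it cannot produce a valid tag over the modified transcript.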
It is also necessary to avoid the use of the same key for different encryption
schemes, as done in GSM, and exploited, e.g., by the attacks of Figure 5.14
and Exercise 5.10. This is actually easy, and does not require any significant
resources - it seems that there was no real justification for this design choice
in GSM, except for the fact that this allows the home to send just one key
K before knowing which cipher would be selected by the mobile and base.
Indeed, the next exercise shows that a significant improvement in security may
be possible even using the existing protocol, by an appropriately-designed new
cipher option.

Exercise 5.11 (Fix GSM key reuse). Show a fix to the vulnerability caused
by reuse of the same key for different ciphers in GSM, as exploited in Exer-
cise 5.10. The fix should not require any change in the messages exchanged
by the protocol, and should simply appear as a new cipher, say A5/9, using
only information available to the client and base. You may assume that the client has an additional secret key, k_{9,i}, and that the home sends to the base an additional key K_9.
Hint: Note that your fix is not required to prevent the attack of Figure 5.14,
although this is also easy - if the Base may assume that all clients support
A5/9.
Note that the GSM response to these vulnerabilities was simply to abolish
the use of the insecure A5/2 in mobiles. This prevents downgrade attacks to
A5/2, but still allows downgrading from A5/3 to A5/1.

5.6 Resiliency to key exposure: forward secrecy, recover secrecy and beyond
One of the goals of deriving pseudorandom keys for each session was to reduce
the damage due to exposure of one or some of the session keys. A natural
question is, can we reduce risk from exposure of the entire state of the parties,
including, in particular, exposure of the ‘initialization/master’ key κ?
One approach to this problem was already mentioned: place the master
key κ within a Hardware Security Module (HSM), so that it is assumed not
to be part of the state exposed to the attacker. However, often, the use of an
HSM is not a realistic, viable option. Furthermore, cryptographic keys may
be exposed even when using an HSM - by some weakness of the HSM, such as
side-channels allowing (immediate or gradual/partial) exposure of keys, or the
keys may be exposed via cryptanalysis.
In this section, we discuss a different approach to handle key exposures:
design the handshake protocol to ensure security, even in scenarios where the
attacker may obtain some of the secret information, e.g., private (master and
session) keys. We mostly focus on two notions of resiliency to key exposure:
forward secrecy and recover secrecy. We explain these two notions and present
handshake protocols satisfying them. We also briefly discuss additional, even
stronger notions of resiliency to key exposures, mainly, an extension for each
of the two notions: perfect forward secrecy (PFS) and perfect recover secrecy
(PRS).

5.6.1 Forward secrecy handshake


We use the term forward secrecy to refer to key setup protocols where exposure of all keys at some future time (and session), including the master and session keys which are kept at that future time, does not expose the keys or contents
of already-completed sessions, even if the ciphertext exchanges were recorded
by the attacker. Note that this implies that each period i must use a separate master key MK_i, and at the beginning of session i, we must erase any previous

master key (e.g., MK_{i−1}).

Figure 5.15: (Weak-)Forward-secrecy Handshake, implemented as in Eq. 5.5, i.e., MK_i = PRG(MK_{i−1}). Exposure of all the information of an entity at time t does not compromise the confidentiality of past sessions. Information sent by the entity or sent to the entity, in any session which ended before time t, remains secure even after the keys kept at time t are exposed. Session keys may be derived from the current master key MK_i as in Eq. 5.6, similarly to the derivation of session keys from the (fixed) master key in Eq. 5.2.

A definition follows. Note that some authors refer
to this notion as weak forward secrecy, and we often adopt this, to emphasize
the distinction from the stronger notion of perfect forward secrecy (which we
present later).

Definition 5.3 (Handshake with (Weak) Forward Secrecy). A handshake protocol π ensures forward secrecy if exposure of the entire state of an entity at
time t, including the current master (and session) keys, does not compromise
the confidentiality of information sent by the entity or sent to the entity in any
session which ended before time t.

We next present the (weak) forward-secrecy key-setup handshake, a forward-secrecy variant of the key-setup 2PP extension, which we discussed and pre-
sented earlier, in subsection 5.4.2. The difference is that instead of using a
single master key κ, received during initialization, the forward-secrecy hand-
shake uses a sequence of master keys MK_0, MK_1, . . .; for simplicity, assume that each master key MK_i is used only for the i-th handshake, with MK_0 received during initialization.
The key to achieving the (weak) forward secrecy property is to allow easy derivation of the future master keys MK_{i+1}, . . . from the current master key MK_i, but prevent the reverse, i.e., ensure that the previous master keys MK_{i−1}, MK_{i−2}, . . . , MK_0 remain pseudorandom, even for an adversary who knows MK_i, MK_{i+1}, . . .. A simple
way to achieve this is by using a PRG, namely:

MK_i = PRG(MK_{i−1})        (5.5)


The session key k_i for the i-th session can be derived using the corresponding minor change to Eq. 5.2, namely:

k_i = PRF_{MK_i}(N_A ++ N_B)        (5.6)

Figure 5.16: Forward-secrecy key setup: derivation of per-session master keys
and session keys

The resulting protocol is illustrated in Fig. 5.16. The use of N_A and N_B in Eq. (5.6) is not really necessary, since each master key is used only for a
single handshake.
Note that it is easy to implement a PRG using a PRF (Ex. 2.54). Hence, we can use a PRF to derive both k_i and MK_i.
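The derivations of Eq. (5.5) and Eq. (5.6) can be sketched as follows in Python; HMAC-SHA256 serves as a stand-in PRF, and, as noted, the PRG is implemented by applying the PRF to a fixed label. The labels and key values are illustrative assumptions, not part of any standard.

```python
import hmac, hashlib

def prf(key: bytes, msg: bytes) -> bytes:
    # HMAC-SHA256 as a stand-in PRF.
    return hmac.new(key, msg, hashlib.sha256).digest()

def next_master_key(mk_prev: bytes) -> bytes:
    # Eq. (5.5): MK_i = PRG(MK_{i-1}); PRG via the PRF on a fixed label.
    return prf(mk_prev, b"next-master-key")

def session_key(mk_i: bytes, n_a: bytes, n_b: bytes) -> bytes:
    # Eq. (5.6): k_i = PRF_{MK_i}(N_A ++ N_B), with ++ as concatenation.
    return prf(mk_i, n_a + n_b)

mk0 = bytes(32)              # MK_0 from initialization (illustrative all-zero key)
mk1 = next_master_key(mk0)   # derive MK_1, then erase MK_0
k1 = session_key(mk1, b"nonce-A", b"nonce-B")
```

Since the PRG is forward-only, an attacker who later learns mk1 cannot recover the erased mk0; this is exactly the (weak) forward secrecy property.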

5.6.2 Recover-Security Handshake Protocol


We use the term recover security³, or weak recover security, to refer to key setup protocols where a single session without eavesdropping or other attacks suffices to recover security from previous key exposures. A definition follows.

Definition 5.4 ((Weak) Recover security handshake). A handshake protocol π ensures (weak) recover security if security (secrecy and/or authenticity) is
ensured for messages exchanged during session i, provided that there exists
some previous session i′ < i such that:

1. There is no exposure of keys from session i′ to session i; all keys and other state prior to session i′ are exposed (known to the attacker).

2. During session i′ itself, all messages are delivered correctly, without eavesdropping, injection or modification. We allow MitM attacks on communication in other sessions (including session i).

The protocols of Figure 5.16 and Figure 5.18 ensure forward secrecy - but not recover secrecy. This is because the attacker can use one exposed master key, say MK_{i′−1}, to derive all the following master keys, including MK_i, using Eq. 5.5; in particular, MK_{i′} = PRG(MK_{i′−1}).
However, a simple extension suffices to ensure recover secrecy, as well as
forward secrecy. The extension is simply to use the random values exchanged
in each session, i.e., NA , NB , in the derivation of the next master key, i.e.:
³ We may sometimes use the term ‘recover secrecy’ instead of ‘recover security’, due to the similarity with ‘forward secrecy’. However, ‘recover security’ is a more appropriate term, since this concept allows recovery not only of secrecy/confidentiality, but also of authenticity and integrity.

Figure 5.17: RS-Ratchet key setup protocol, ensuring both (weak) Forward
Secrecy and Recover Secrecy

MK_i = PRG(MK_{i−1}) ⊕ N_A ⊕ N_B        (5.7)


Figure 5.17 illustrates the resulting key-setup handshake protocol, the RS-
Ratchet protocol.
Since the new master key is computed as the exclusive-OR of these three values, it is secret as long as at least one of the three values is secret - this could be viewed as one-time-pad encryption (OTP, see § 2.4). Since the recover secrecy requirement assumes at least one session where the attacker does not eavesdrop or otherwise interfere with the communication, both N_A and N_B are secret in that session, hence the new master key MK_i is secret. Indeed, XOR with one of the two nonces would suffice; by XOR-ing with both of them, we ensure secrecy of the master key even if the attacker is able to capture one of the two flows, i.e., even stronger security.
Note that the RS-Ratchet protocol requires the parties to have a source of randomness which remains secure even after a party was broken into (and its keys exposed). In reality, many systems rely only on pseudo-random generators (PRGs), whose future values are computable from a past value. In such a case, it becomes critical to also use the input from the peer (N_A or N_B), and these values should also be used to re-initialize the PRG, so that new nonces (N_A, N_B) are pseudorandom and not predictable.
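The RS-Ratchet derivation of Eq. (5.7) can be sketched as follows; the PRG is again implemented via HMAC-SHA256 on a fixed label, and all concrete values are illustrative assumptions.

```python
import hmac, hashlib, secrets

def prg(mk: bytes) -> bytes:
    # Stand-in PRG: HMAC-SHA256 on a fixed label.
    return hmac.new(mk, b"ratchet", hashlib.sha256).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def next_master_key(mk_prev: bytes, n_a: bytes, n_b: bytes) -> bytes:
    # Eq. (5.7): MK_i = PRG(MK_{i-1}) XOR N_A XOR N_B
    return xor(xor(prg(mk_prev), n_a), n_b)

mk_prev = bytes(32)            # assume this key was exposed to the attacker
n_a = secrets.token_bytes(32)  # fresh nonces from a single un-eavesdropped
n_b = secrets.token_bytes(32)  # session act as a one-time pad
mk_new = next_master_key(mk_prev, n_a, n_b)
```

An attacker who knows mk_prev but missed the nonces of this one session learns nothing about mk_new, recovering security; and knowing mk_new still does not reveal mk_prev, so forward secrecy is preserved.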

5.6.3 Stronger notions of resiliency to key exposure


Forward secrecy and recover secrecy significantly improve the resiliency against
key exposure. There are additional and even stronger notions of resiliency to
key exposure, which are provided by more advanced handshake protocols; we
only cover a few of these in this course - specifically, the ones in Table 5.2.
Protocols for many of the more advanced notions of resiliency use public
key cryptology, and in particular, key-exchange protocols such as the Diffie-
Hellmen (DH) protocol. Indeed, it seems plausible that public-key cryptog-
raphy is necessary for many of these notions. This includes the important
notions of Perfect Forward Secrecy (PFS) and Perfect Recover Secrecy (PRS),
which, as the names imply, are stronger variants of forward and recover se-
crecy, respectively. We now briefly discuss PFS and PRS, to understand their

advantages and why they require more than the protocols we have seen in this
chapter; we discuss these notions further, with implementations, in § 6.4.

Perfect Forward Secrecy (PFS). We now define an even stronger security notion than forward-secrecy, which we call Perfect Forward Secrecy (PFS).
PFS, like Forward Secrecy, also requires resiliency to exposures of state, includ-
ing keys, occurring in the future. However, on top of that, PFS also requires
resilience to exposure of the previous state, again including keys, as long as the
attacker can only eavesdrop on the session. We next define this notion.
Definition 5.5 (Perfect Forward Secrecy (PFS)). A handshake protocol π ensures perfect forward secrecy (PFS) if exposure of the state during all sessions except session i does not expose the keys of session i, provided that during session i, the attacker can only eavesdrop on messages (and the keys of session i are not otherwise disclosed).

Other definitions of PFS. Unfortunately, different experts use the term PFS for different notions, including merely considering it to be synonymous with Forward-Secrecy. This includes the first use of this term (to our knowledge), by Gunther [86] - although we believe that our definition captures the intuitive notion as described (informally) by Gunther. We discuss some
other definitions for PFS in ??.
We discuss some PFS handshake protocols in the next chapter, which deals
with asymmetric cryptography (also called public-key cryptography, PKC).
Indeed, all known PFS protocols, as we defined PFS, are based on PKC.

Exercise 5.12 (Forward Secrecy vs. Perfect Forward Secrecy (PFS)). Present a sequence diagram showing that the forward-secrecy key-setup handshake protocol presented in subsection 5.6.1 does not ensure Perfect Forward Secrecy (PFS).

Perfect Recover Secrecy (PRS). We use the term perfect recover secrecy
to refer to key setup protocols where a single session without MitM attacks
suffices to recover secrecy from previous key exposures. Definition follows.
Definition 5.6 (Perfect Recover Secrecy (PRS) handshake). A handshake pro-
tocol π ensures perfect recover secrecy (PRS), if secrecy is ensured for messages
exchanged during session i, provided that there exists some previous session
i′ < i such that:

1. There is no exposure of keys from session i′ to session i; all keys and other state prior to session i′ are exposed (known to the attacker).

2. During session i′ itself, all messages are delivered correctly, without MitM attacks (injection or modification); however, eavesdropping is allowed. We allow MitM attacks on communication in other sessions (including session i).

Figure 5.18: Forward-secrecy key setup: derivation of per-session master keys
and per-goal session keys

Note the similarity to PFS, in allowing only eavesdropping during the ‘target’ session (i for PFS, i′ for PRS). Also similarly to PFS, we discuss some
PRS handshake protocols in the next chapter, which deals with asymmet-
ric cryptography. Indeed, all known PRS protocols are based on asymmetric
cryptography.

Additional notions of resiliency. The research in cryptographic protocols includes additional notions of resiliency to key and state exposures, which we
do not cover in this course. These include threshold security [53], which ensures
that the entire system remains secure even if some of its modules (up to some threshold) are exposed or corrupted, proactive security [39], which deals with recovery
of security of some modules after exposures, and leakage-resiliency [69], which
ensures resiliency to gradual leakage of parts of the storage. These - and other
- notions are beyond our scope.

5.6.4 Per-goal Keys Separation


Our discussion focused on deriving a single session key. However, for security, we should use a separate key for encryption and a different key for authentication. Furthermore, the attacker may be more likely to attack traffic in one direction, say Alice to Bob, than traffic in the other direction, e.g., since the attacker can control the traffic in one direction (chosen plaintext attack) or predict it (known plaintext attack). See the key-separation principle (principle 6) and the discussion in subsection 5.4.3.
Figure 5.19: Relations between notions of resiliency to key exposures. An arrow from notion A to notion B implies that notion A implies notion B. For example, a protocol that ensures Perfect Forward Secrecy (PFS) also ensures Forward Secrecy.

We can implement such separation using a Pseudo-Random Function (PRF). Specifically, derive (k_{i,E}^{A→B}, k_{i,A}^{A→B}) and (k_{i,E}^{B→A}, k_{i,A}^{B→A}), the four session keys for session i, as follows (see also Figure 5.18):


k_{i,E}^{A→B} = PRF_{κ_i}(A→B ++ ‘E’)
k_{i,A}^{A→B} = PRF_{κ_i}(A→B ++ ‘A’)
k_{i,E}^{B→A} = PRF_{κ_i}(B→A ++ ‘E’)
k_{i,A}^{B→A} = PRF_{κ_i}(B→A ++ ‘A’)

Since κ_i is secret, and PRF is a pseudo-random function, PRF_{κ_i} is indistinguishable from a random function (over the same domain and range). Since
the inputs for deriving each of the four keys are different, the outputs of a ran-
dom function would be all uniformly random (and independent of each other).
Since P RFκi is indistinguishable from a random function, it follows that the
four keys are independently pseudo-random.
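This four-key derivation can be sketched as follows; HMAC-SHA256 plays the role of the PRF, and the direction and goal labels mirror the equations above (the concrete label strings are illustrative assumptions).

```python
import hmac, hashlib

def prf(key: bytes, msg: bytes) -> bytes:
    # HMAC-SHA256 as a stand-in PRF.
    return hmac.new(key, msg, hashlib.sha256).digest()

def derive_session_keys(session_master: bytes) -> dict:
    # One key per (direction, goal) pair; distinct PRF inputs yield
    # independently pseudorandom outputs.
    keys = {}
    for direction in ("A->B", "B->A"):
        for goal in ("E", "A"):   # 'E' = encryption, 'A' = authentication
            keys[(direction, goal)] = prf(session_master, (direction + goal).encode())
    return keys

keys = derive_session_keys(bytes(32))   # illustrative session master key
```

Because the four PRF inputs are distinct, the four outputs behave like independent uniform keys, matching the key-separation principle.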

5.6.5 Resiliency to exposures: summary


In this section, we discussed several notions of resiliency of handshake protocols
to exposure of secret information, including cryptographic keys. We mostly fo-
cused on two notions of resiliency to key exposure: forward secrecy and recover
secrecy. We explained these two notions, and presented handshake protocols
satisfying them. We also briefly discussed additional, even stronger notions of
resiliency to key exposures, mainly, an extension for each of the two notions:
perfect forward secrecy (PFS) and perfect recover secrecy (PRS). Designs for
PFS and PRS involve public key cryptology; we cover these designs in chap-
ter 6. Forward secrecy and recover secrecy, the two notions on which we focused
in this section, are achievable by shared-key key-setup protocols.
We compare the four notions, along with the ‘regular’ secure key-setup
handshake, in Table 5.2, and present the relationships between them in Fig. 5.19.

Terminology for resiliency properties. Many different notions of resiliency are considered in the literature; we focused only on a few of them, which we consider most basic and important. Even for these notions, there are some different interpretations and definitions, and sometimes inconsistent usage of terms. In particular, some experts use the term forward secrecy as a synonym for perfect forward secrecy (PFS). Furthermore, while break-in recovery is often mentioned as a goal for ratchet protocols, we introduced the terms recover secrecy in subsection 5.6.2 and perfect recover secrecy (PRS) in subsection 6.4.2.

Table 5.2: Notions of resiliency to key exposures of key-setup handshake protocols. See implementations of forward and recover secrecy in subsection 5.6.1 and subsection 5.6.2, respectively, and of the corresponding ‘perfect’ notions (PFS and PRS) in subsection 6.4.1 and subsection 6.4.2, respectively.

5.7 Shared-Key Session Protocols: Additional Exercises
Exercise 5.13 (Vulnerability of Compress-then-Encrypt). One of the important goals for encryption of web communication is to hide the value of ‘cookies’, which are (often short) strings sent automatically by the browser to a website,
and often used to authenticate the user. Attackers often cause the browser to
send requests to a specific website, to try to find out the value of the cookie
attached by the browser. Assume, for simplicity, that the attacker controls the
entire contents of the request, before and after the cookie, except for the cookie
itself (which the attacker tries to find out). Furthermore, assume that the length
of the cookie is known. Further assume that the browser uses compression be-
fore encrypting requests sent to the server.
Finally, assume a very simple compression scheme, that only replaces: (1)
sequences of three or more identical characters with two characters, and (2)
repeating strings (of two or more characters, e.g., xyxy), with a single string
plus one character (e.g., ⊥xy; assume that the ⊥ does not appear in the request
except to signal compression).

1. Present an efficient attack which exposes the first character of the cookie.
2. Extend your attack to expose the entire cookie.

Exercise 5.14. Some applications require only one party (e.g., a door) to au-
thenticate the other party (e.g., Alice); this allows a somewhat simpler protocol.
We describe in the two items below two proposed protocols for this task (one
in each item), both using a key k shared between the door and Alice, and a
secure symmetric-key encryption scheme (E, D). Analyze the security of the
two protocols.
1. The door selects a random string (nonce) n and sends E_k(n) to Alice; Alice decrypts it and sends back n.

2. The door selects and sends n; Alice computes and sends back E_k(n).

Repeat the question, when E is a block cipher rather than an encryption scheme.

Exercise 5.15. Consider the following mutual-authentication protocol, using shared key k and a (secure) block cipher (E, D):
1. Alice sends N_A to Bob.
2. Bob replies with N_B, E_k(N_A).
3. Alice completes the handshake by sending E_k(N_B ⊕ E_k(N_A)).

Show an attack against this protocol, and identify the design principles which
were violated by the protocol, and which, if followed, should have prevented such
attacks.

Exercise 5.16 (GSM). In this exercise we study some of the weaknesses of
the GSM handshake protocol, as described in subsection 5.5.1. In this exer-
cise we ignore the existence of multiple types of encryption and their choice
(‘ciphersuite’).

1. In this exercise, as usual, we ignore the fact that the functions A8, A3 and the ciphers E_i were kept secret; explain why.
2. Present functions A3, A8 such that the protocol is insecure when using
them, against an eavesdropping-only adversary.
3. Present functions A3, A8 that ensure security against a MitM adversary, assuming E is a secure encryption. Prove (or at least argue) security. (Here and later, you may assume a given secure PRF function, f.)
4. To refer to the triplet of a specific connection, say the j-th connection, we use the notation (r(j), sres(j), k(j)). Assume that during connection j′ the attacker received the key k(ĵ) of a previous connection ĵ < j′. Show how a MitM attacker can use this to expose, not only messages sent during connection ĵ, but also messages sent in future connections (after j′) of this mobile.
5. Present a possible fix to the protocol, as simple and efficient as possible, to prevent exposure of messages sent in future connections (after j′). The fix should only involve changes to the mobile and the base, not to the home.

Exercise 5.17. Fig. 5.20 illustrates a simplification of the SSL/TLS session-security protocol; this simplification uses a fixed master key k which is shared
in advance between the two participants, Client and Server. This simplified
version supports transmission of only two messages, a ‘request’ MC sent by
the client to the server, and a ‘response’ MS sent from the server. The two
messages are protected using a session key k′, which the server selects randomly at the beginning of each session, and sends to the client, protected using the
fixed shared master key k.
The protocol should protect the confidentiality and integrity (authenticity) of the messages (M_C, M_S), and should also prevent ‘replay’ of messages, e.g., the client sends M_C in one session and the server receives M_C in two sessions.

1. The field cipher suite contains a list of encryption schemes (‘ciphers’) supported by the client, and the field chosen cipher contains the cipher
in this list chosen by the server; this cipher is used in the two subsequent
messages (a fixed cipher is used for the first two messages). For simplicity
consider only two ciphers, say E1 and E2, and suppose that both client
and server support both, but that they prefer E2 since E1 is known to be
vulnerable. Show how a MitM attacker can cause the parties to use E1
anyway, allowing it to decipher the messages MC , MS .

Figure 5.20: Simplified SSL

2. Suggest a minor modification to the protocol to prevent such ‘downgrade attacks’.
3. Ignore now the risk of downgrade attacks, e.g., assume all ciphers sup-
ported are secure. Assume that MC is a request to transfer funds from
the clients’ account to a target account, in the following format:
Date (3 bytes) | Operation type (1 byte) | Comment (20 bytes) | Amount (8 bytes) | Target account (8 bytes)
Assume that E is CBC-mode encryption using an 8-byte block cipher. The solution should not rely on replay of the messages (which will not work, since only one message is sent in each direction on each usage). Mal is a (malicious) client of the bank, and eavesdrops on a session where Alice is sending a request to transfer 10$ to him (Mal). Show how Mal can abuse his Man-in-the-Middle abilities to cause a transfer of a larger amount. Explain a simple fix to the protocol to prevent this attack.

Exercise 5.18. Consider the following protocol for server-assisted group-shared-key setup. Assume that every client, say i, shares a key k_i with the server, and
let G be a set of users s.t. i ∈ G. Client i may ask the server to provide it with a key k_G, shared with all the users in G, by sending the server a request specifying the set G, authenticated using k_i. The server replies by sending x_G = k_G + Π_{j∈G} PRF_{k_j}(t) to the clients in G, where t is the time⁴. User i then computes PRF_{k_i}(t) and then finds k_G = x_G mod PRF_{k_i}(t). Every other user in the set G can similarly receive x_G and compute k_G from it.
⁴ The Π notation denotes multiplication of the elements, much like Σ denotes addition; e.g., Π_{j∈{1,2,3}} a_j = a_1 · a_2 · a_3.

Present an attack allowing a malicious client, e.g., Mal, to learn the key k_G for a group it does not belong to, i.e., Mal ∉ G; e.g., assume G = {a, b, c}. Mal may eavesdrop on all messages, and may request and receive k_{G′} for any group G′ s.t. Mal ∈ G′.

Exercise 5.19. In the GSM protocol, the home sends to the base one or more
authentication triplets (r, K, s). The base and the mobile are to use each triplet
only for a single handshake; this is somewhat wasteful, as often the mobile has
multiple connections (and handshakes) while visiting the same base.

1. Suppose a base decides to re-use the same triplet (r, K, s) in multiple handshakes, for efficiency (fewer requests to the home). Present a message se-
quence diagram showing that this may allow an attacker to impersonate a client; namely, that client authentication fails.
2. Suggest an improvement to the messages sent between mobile and base, that will allow the base to reuse the (r, K, s) triplet received from the home, for multiple secure handshakes with the mobile. Your improvement should
consist of a single additional challenge rB which the base selects randomly
and sends to the mobile, together with the challenge r received in the
triplet from the home; and a single response s_B which the mobile returns to the base, instead of sending the response s as in the original protocol. Show the computation of s_B by mobile and base: s_B = ______.
Your solution may use an arbitrary pseudo-random function P RF .
3. GSM sends frames (messages) of 114 bits each, by bit-wise XORing the n-th plaintext frame with 114 bits of output from A5/i_K(n). Here, A5/i, for i = 1, 2, . . ., is a cryptographic function, n is the frame number, and K
was a key received from the home. A5/1 and A5/2 are described in the specifications - and both are known to be vulnerable; other functions can be agreed between mobile and base. For this question, assume the use of a secure cipher, say A5/5. Suppose,
again, that a base decides to re-use the same triplet (r, K, s) in multiple handshakes. Present a message sequence diagram in which a mobile has two connections to the base, sending message m_1 in the first connection and message m_2 in the second connection. Assume that the base re-uses the same triplet (r, K, s) in both connections, and that the attacker knows the contents of m_1. Show how the attacker can find m_2.
Note: the improvement suggested in the previous item (r_B, s_B) does not have a significant impact on this item - you can solve it with or without it.
4. To prevent the threat presented in the previous item, the mobile and base can use a different key K′ = ______ (instead of using K).
5. Design a base-only forward secrecy improvement to A5/5. Namely, even
if an attacker is given access to all of the base's memory after the j-th
handshake using the same r, the attacker would still not be able to decipher
information exchanged in past connections. Your design may send the
value of j together with r from base to mobile, and may change the stored
value of s at the end of every handshake; let sj denote the value of s
at the j th handshake, where the initial value is s received from the home
(i.e., s1 = s). Your solution consists of defining the value of sj given
sj−1, namely: sj = ______.
Exercise 5.20. The GSM protocol is very vulnerable to downgrade attacks;
let’s design a method to prevent downgrade attacks on new ciphers, when both
client (mobile) and base support (the same) new cipher.
1. An attacker impersonating a base may try to behave as if it were a base
supporting only an older cipher (without protection against downgrade
attacks).
Exercise 5.21. Consider the following key establishment protocol between any
two users with an assistance of a server S, where each user U shares a secret
key KU S with a central server S.
A → B : (A, NA)
B → S : (A, NA, B, NB, EKBS(A ++ NA))
S → A : (A, NA, B, NB, EKAS(NA ++ sk), EKBS(A ++ sk), NB)
A → B : (A, NA, B, NB, EKBS(A ++ sk))
Assume that E is an authenticated encryption. Show an attack which allows
an attacker to impersonate one of the parties to the other, while exposing the
secret key sk.
Exercise 5.22 (Hashing vs. Forward Secrecy). We discussed in §5.6.1 the
use of PRG or PRF to derive future keys, ensuring Forward Secrecy. Could
a cryptographic hash function be securely used for the same purpose, as in
κi = h(κi−1)? Evaluate whether such a design is guaranteed to be secure, when
h is (1) a CRHF, (2) an OWF, (3) bitwise-randomness extracting.
Exercise 5.23 (PFS definitions). Below are informal definitions for PFS from
the literature. Compare them to our definitions for PFS: are they equivalent?
Are they ‘weaker’ - a protocol may satisfy them yet not be PFS as we define, or
the other way around? Or are they incomparable (neither is always weaker)?
Can you give an absurd example of a protocol meeting the definition, which
'clearly' should not be considered PFS? Any other issue?
From Wikipedia, [171] An encryption system has the property of forward
secrecy if plain-text (decrypted) inspection of the data exchange that oc-
curs during key agreement phase of session initiation does not reveal the
key that was used to encrypt the remainder of the session.
From [127, 138] A protocol has Perfect Forward Secrecy (PFS) if the compro-
mise of long-term keys does not allow an attacker to obtain past session
keys.
Chapter 6

Public Key Cryptology

In chapters 2 and 3, we studied symmetric (shared key) cryptographic schemes


- for encryption and for message-authentication, respectively. Indeed, until the
1970s, all of cryptology was based on the use of symmetric keys - and almost
all of it was done only within defense and intelligence organizations, with very
few publications, academic research or commercial products.
This changed quite dramatically in the 1970s, with the beginning of what we
now call modern cryptology, as we described in subsection 1.2.1. In particular,
in their seminal paper [56] from 1976, Diffie and Hellman observed that there
could be significant advantages in cryptographic schemes that used different
keys - a public one (e.g. for encryption) and a private one (for decryption). This
revolutionary idea has developed into the huge area of public key cryptology,
also referred to as asymmetric cryptology. Diffie and Hellman also identified
three types of public-key schemes: public-key encryption, digital signatures
and key exchange, and presented the first public-key construction: the DH key
exchange protocol.
In this chapter, we introduce public key cryptology, beginning with key-
exchange protocols. We discuss public-key encryption and signature schemes
mostly in the later sections, except for a brief introduction in the next section.

6.1 Introduction to PKC


The basic observation underlying asymmetric cryptology is quite simple, at
least in hindsight: security requirements are asymmetric. For example, an
encryption scheme should prevent a MitM attacker from decrypting ciphertext,
when it does not have the intended-receiver’s key; but then we may not care if
the attacker may encrypt messages. Notice, indeed, that the security definition
we presented for encryption, Def. 2.10, does not require preventing the attacker
from encrypting plaintexts, only from decrypting ciphertexts.
Specifically, Diffie and Hellman defined three basic goals for public key
cryptosystems; and these still remain the three most important types of public
key cryptology: public key cryptosystem (PKC), digital signature schemes, and
key exchange protocols.

6.1.1 Public key cryptosystems


Public key cryptosystems (PKC) are encryption schemes consisting of three
algorithms, (KG, E, D), and using a pair of keys: a public key e for encryption,
and a private key d for decryption. Both keys are generated by the key genera-
tion algorithm KG. The encryption key is not secret; namely, we assume that
it is known to the attacker. The correctness requirement of PKCs is similar to
the one for shared-key cryptosystems, namely:

(∀m ∈ M, (e, d) ← KG(l)) Dd (Ee (m)) = m (6.1)

See illustration of public key cryptosystem in Figure 2.2.


Note that we only define a stateless version of PKC. The reason is that,
typically, the public key e may be shared and used by multiple parties; we
obviously cannot assume that these parties share a state variable among them,
and therefore, they also cannot be synchronized with the ‘recipient’ decrypting
the ciphertexts.
We further discuss PKC in sections ?? to 6.6.

6.1.2 Digital signature schemes


Digital signature schemes, introduced in subsection 3.3.2, consist of three al-
gorithms, (KG, S, V ), for Key Generation, Signing and Verifying, respectively.
Key Generation (KG) is a randomized algorithm, and it outputs a pair of
correlated keys: a private signing key s for ‘signing’ a message, and a public
validation key v for validating a given signature for a given message. The vali-
dation key v is not secret: it should only allow validation of authenticity, and
should not facilitate signing.
Digital signatures serve a similar function to Message Authentication Code
(MAC) schemes, except that rather than using a single function M AC and a
single key k for authenticating and for validating, they use a distinct key s and
function S for signing (authenticating), and a distinct key v and function V
for validating authenticity. The correctness requirement of signature schemes
is similar to the one for MAC schemes, namely:

(∀m ∈ M, (s, v) ← KG(l)) Vv (m, Ss (m)) = T RU E (6.2)

We illustrate digital signature schemes in Figure 3.2, and discuss them in § 6.7.

6.1.3 Key exchange protocols


Key exchange protocols are defined by a pair of probabilistic algorithms, (KG, F )
(for key-generation and key-combining, respectively). A run of the protocol in-
volves two parties, typically referred to as Alice (or A) and Bob (or B). The key
generation function KG receives as input a security parameter 1l , and outputs
a pair of keys: a public key and a private key. The key-combining function F
receives as input a public value (from one party) and a private value (from the
other party), and produces a (shared) key. Each party, e.g., Alice, uses KG to
generate a pair (a, PA) of private value a, kept by Alice, and a public value PA,
which Alice sends to its peer, Bob. Each party also receives public information
from the peer: Alice receives PB from Bob, and Bob receives PA from Alice.
Each party now uses F to derive a key; Alice derives kA = F(a, PB) and Bob
derives kB = F(b, PA). See Figure 6.1.


Figure 6.1: Key Exchange protocol. Each party, Alice and Bob, runs the Key-
Generation algorithm KG, which outputs a (private, public) key-pair: a, PA for
Alice and b, PB for Bob. The parties exchange their public keys (PA and PB ).
Then, each party applies a key-combining function F to its own private key (a
for Alice and b for Bob), and to the public key received from the peer (PA from
Alice and PB from Bob). A key-exchange protocol should ensure correctness
(kA = F (a, PB ) = F (b, PA ) = kB ) and key-indistinguishability (kA = kB is
pseudorandom), allowing use of kA = kB as a shared key.
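To make the (KG, F) interface concrete, here is a minimal Python sketch, instantiated with the Diffie-Hellman protocol mentioned above. The tiny group parameters (safe prime p = 2039 with assumed generator g = 7) are for illustration only; a real deployment uses groups of the sizes recommended in Table 6.1.

```python
import secrets

# Toy parameters, for illustration only: 2039 = 2*1019 + 1 is a safe
# prime, and 7 is assumed to generate Z*_2039 for this sketch.
p, g = 2039, 7

def KG():
    """Key generation: output a (private, public) pair, e.g., (a, P_A)."""
    a = 1 + secrets.randbelow(p - 2)     # private value a
    return a, pow(g, a, p)               # public value P_A = g^a mod p

def F(priv, pub_peer):
    """Key combining: F(a, P_B) = (P_B)^a = g^(ab) mod p."""
    return pow(pub_peer, priv, p)

a, PA = KG()                             # Alice's run of KG
b, PB = KG()                             # Bob's run of KG
kA = F(a, PB)                            # Alice combines a with P_B
kB = F(b, PA)                            # Bob combines b with P_A
assert kA == kB                          # correctness, Eq. (6.3)
```

The assertion checks exactly the correctness requirement of Eq. (6.3): both parties derive the same shared key g^(ab) mod p.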

A key exchange protocol should ensure correctness and key-secrecy. The


correctness requirement is that both parties will derive the same key, namely
that for every security parameter 1l holds:

(∀(a, PA) ← KG(1^l), (b, PB) ← KG(1^l))   F(a, PB) = F(b, PA)        (6.3)

Key-indistinguishability requires, intuitively, that an eavesdropping adversary,


who ‘sees’ PA and PB , cannot learn anything about the shared key; equiva-
lently, it requires that the adversary cannot distinguish between being given
randomly-generated PA , PB and the key derived from them, versus being given
randomly-generated PA , PB and a random string of the same length as the key.
The following definition states this requirement more precisely.

Definition 6.1 (The key indistinguishability requirement). Let (KG, F) be a
key-exchange protocol. We say that (KG, F) ensures key-indistinguishability
if for every PPT adversary A and for
sufficiently-large security parameter 1l , holds:

|Pr[A(PA, PB, F(a, PB)) = 1] − Pr[A(PA, PB, r) = 1]| ∈ NEGL(1^l)        (6.4)

where, in both probabilities, (a, PA) ← KG(1^l) and (b, PB) ← KG(1^l) are
independently-generated key pairs, and r ← {0,1}^|F(a,PB)| is a uniformly-random
string of the same length as the key.

6.1.4 Advantages of Public Key Cryptography (PKC)


Public key cryptography is not just a cool concept; it is very useful, allowing
solutions to problems which symmetric cryptology fails to solve, and making
it easier to solve other problems.
We first identify three important challenges which require the use of asym-
metric cryptology:
Signatures provide evidence. Only the owner of the private key can digitally
sign a message, but everyone can validate this signature. A recipient
who has validated a signed message can therefore convince other parties
that the message was indeed signed by the sender. This is impossible
using (shared-key) MAC schemes, and enables many applications, such
as signing an agreement, payment order or recommendation/review. An
important special case is signing a public key certificate, linking between
an entity and its public key.
Establishing security. Using public key cryptology, we can establish secure
communication between parties, without requiring them to previously
exchange a secret key or to communicate with an additional party (such
as a KDC, see § 5.5). We can send the public key signed by a trusted
party (in a certificate); or, if the attacker is only an eavesdropper, use a
key-exchange protocol (this is not secure against a MitM attacker). Finally,
in the common case where one party (the client) knows the public key
of the other party (the server), the client can encrypt a shared key and
send it to the server.
Stronger resiliency to exposure. In § 5.6 we discussed the goal of resiliency
to exposure of secret information, in particular, of the ‘master key’ of
shared-key key-setup protocols, and presented the forward secrecy key-
setup handshake. In subsection 5.6.3, we also briefly discussed some
stronger resiliency properties, including Perfect Forward Secrecy (PFS),
Threshold security and Proactive security. Designs for achieving such
stronger resiliency notions are all based on public key cryptology; see
later in this chapter.
Public key cryptology (PKC) also makes it easier to design and deploy
secure systems. Specifically, public keys are easier to distribute, since they
can be given in a public forum (such as directory) or in an incoming message;
note that the public keys still need to be authenticated, to be sure we are
receiving the correct public keys, but there is no need to protect their secrecy.
Distribution is also easier since each party only needs to distribute one (public)
key to all its peers, rather than setting up different secret keys, one per each
peer.
Furthermore, public keys are easier to maintain and use, since they may
be kept in non-secure storage, as long as they are validated before use - e.g.,
using MAC with a special secret key. Finally, only one public key is required
for each party, compared to a different shared key for each pair of parties,
i.e., O(n^2) keys overall for n parties; namely, we need to maintain - and
refresh - fewer keys.

6.1.5 The price of PKC: assumptions, computation costs and key-length
With all the advantages listed above, it may seem that we should always use
public key cryptology. However, PKC has three significant drawbacks: com-
putation time, key-length and potential vulnerability. We discuss these in this
subsection.
All of these drawbacks are due to the fact that when attacking a PKC
scheme, the attacker has the public key which corresponds to the private key.
The private key is closely related to the public key - for example, the private
decryption key ‘reverses’ encryption using the public key; yet, the public key
should not expose (information about) the private key. It is challenging to
come up with a scheme that allows this relationship between the encryption and
decryption keys, and yet where the public key does not expose the private key.
In fact, as discussed in subsection 1.2.1, the concept of PKC was ‘discovered’
twice - and both times, it took years until the first realization of PKC was
found; see [121].
Considering the challenge of designing asymmetric cryptosystems, it should
not be surprising that all known public-key schemes have considerable draw-
backs compared to the corresponding shared-key (symmetric) schemes. There
are two types of drawbacks: overhead and required assumptions.

PKC assumptions and quantum cryptanalysis Applied PKC algorithms,


such as RSA, DH, El-Gamal and elliptic-curve PKCs, all rely on specific com-
putational assumptions, mostly on the hardness of specific number-theoretic
problems such as factoring; there are other proposed algorithms, but their
overheads are (even) much higher.
There is reason to hope that these specific hardness assumptions are well-
founded. The basic reason is that many mathematicians have tried to find
efficient algorithms for these problems for many years, long before their use for
PKC was proposed; and efforts increased greatly as PKC became known and
important.
However, it is certainly conceivable that an efficient algorithm exists - and
would someday be found. In fact, such discovery may even occur suddenly
and soon - such unpredictability is the nature of algorithmic and mathematical
breakthroughs. Furthermore, since all of the widely-used PKC algorithms are
so closely related, it is even possible that such cryptanalysis would apply to all
of them - leaving us without any practical PKC algorithm. PKC algorithms
are the basis for the security of many systems and protocols; if suddenly there
will not be a viable, practical and ‘unbroken’ PKC, that would be a major
problem.
Furthermore, efficient algorithms to solve these problems are known - if an
appropriate quantum computer can be realized. There have been many efforts
to develop quantum computers, with significant progress - but results are still
very far from the ability to cryptanalyze these PKC schemes. But here, too,
future developments are hard to predict.
All this motivates extensive work by cryptographers, to identify additional
candidate PKC systems, which will rely on other, ‘independent’ or - ideally -
more general assumptions, as well as schemes which are secure even if large-
scale quantum computing becomes feasible (post-quantum cryptology). In par-
ticular, this includes important results of PKC schemes based on lattice prob-
lems. Lattice problems are very different from number-theoretic problems, and
seem resilient to quantum-computing; furthermore, some of the results in this
area have proofs of security based on general and well-founded worst-case
hardness assumptions on lattice problems. Details are beyond our scope; see, e.g., [3].

PKC overhead: key-length and computation. One drawback of asymmetric
cryptology is that all proposed schemes - certainly, all proposed schemes
which were not broken - have much higher overhead, compared to the corresponding
shared-key schemes. There are two main types of overheads: computation
time and key-length.
The system designers choose the key-length of the cryptosystems they use,
based on the sufficient effective key length principle (principle 4). These deci-
sions are based on the perceived resources and motivation of the attackers, on
their estimation or bounds of the expected damages due to exposure, and on the
constraints and overheads of the relevant system resources. Finally, a critical
consideration is the estimates of the required key length for the cryptosystems
in use, based on known and estimated future attacks. Such estimates and rec-
ommendations are usually provided by experts proposing new cryptosystems,
and then revised and improved by experts and different standardization and
security organizations, publishing key-length recommendations.
We present three well-known recommendations in Table 6.1. These recom-
mendations are marked in the table as LV’01, NIST2014 and BSI’17, and were
published, respectively, in a seminal 2001 paper by Lenstra and Verheul [119],
recommendations by NIST from 2014 [8] and recommendations by the German
BSI from 2017 [38]. See these and much more online at [78].
Recommendations are usually presented with respect to a particular year
in which the ciphertexts are to remain confidential. Experts estimate the expected
improvements in the cryptanalysis capabilities of attackers over the years,
due to improved hardware speeds, reduced hardware costs, reduced energy
costs (due to improved hardware), and - often most significant, yet hardest to
estimate - improvements in methods of cryptanalysis. Such predictions cannot
be done precisely, and hence, recommendations differ, sometimes considerably.
Accordingly, Table 6.1 compares the recommended key lengths for three years:
2020, 2030 and 2040.
Table 6.1 presents the recommendations for four typical, important cryptosystems,
in columns two to four. Column two presents the recommendations
for symmetric cryptosystems such as AES. The same recommendations
hold for any other symmetric (shared-key) cryptosystem for which there is no
‘shortcut’ cryptanalytical attack (i.e., no attack significantly more efficient
than exhaustive search).
Column three presents the recommendations for RSA and El-Gamal, the
two oldest and most well-known public-key cryptosystems; we discuss both
cryptosystems, in sections 6.6 and 6.5.2. This column also applies to the Diffie-
Hellman (DH) key-exchange protocol; in fact, the El-Gamal cryptosystem is
essentially a variant of the DH protocol, as we explain in subsection 6.5.2. RSA
and El-Gamal/DH are based on two different number-theoretic problems: the
factoring problem (for RSA) and the discrete-logarithm problem (for DH/El-Gamal);
but the best-known attacks against both are related, with sub-exponential
running time. We briefly discuss these problems
in subsection 6.1.7.
The fourth column of table 6.1 presents the recommendations for elliptic-
curve based public-key cryptosystems such as ECIES. As the table shows, the
recommended key-lengths for elliptic-curve based public-key cryptosystems are,
quite consistently, much lower than the recommendations for the ‘older’ RSA
and El-Gamal/DH systems; this makes them attractive in applications where
longer keys are problematic, due to storage and/or communication overhead.
We do not cover elliptic-curve cryptosystems in this course; these are covered
in other courses and books, e.g., [3, 89, 159].
Table 6.1 shows that the required key-length is considerably higher for
public-key schemes - about twice that of symmetric schemes for elliptic-curve
cryptosystems, and over twenty times (!) for RSA and DH schemes. The
lower key-length recommendations for elliptic-curve cryptography make these
schemes attractive in the (many) applications where key-length is critical, such
as when communication bandwidth and/or storage are limited.
The bottom row of Table 6.1 compares the running time of implementations
of AES with a 128-bit key in counter (CTR) mode, RSA with 1024- and 2048-bit
keys, and the 256-bit ECIES elliptic-curve cryptosystem. We see that the symmetric
cryptosystem (AES) is many orders of magnitude faster: it supports more than 4·10^9
bytes/second, compared with less than 10^6 bytes/second for the comparably-secure
2048-bit RSA, and less than 32·10^3 bytes/second for ECIES. We used the
values reported for one of the popular cryptographic libraries, Crypto++ [50].
Energy costs would usually be even worse than this factor of 4,000 to 4,000,000
Table 6.1: Comparison of key-length and computational overheads: symmetric
cryptosystems, vs. main types of asymmetric cryptosystems

times the costs of comparable-security shared key algorithms.


From the bottom row, we derive the bottom line:

Principle 11 (Minimize use of PKC). Designers should avoid, or, where ab-
solutely necessary, minimize the use of public-key cryptography.
In particular, consider that typical messages are much longer than the size
of inputs to the public-key algorithms. In theory, we could use designs as
presented for encryption and MAC, for applying the public-key algorithms to
multiple blocks. However, the resulting computation costs would be
absurd. Even more absurd would be an attempt to modify the public-key
operation to support longer inputs. Luckily, there are simple and efficient solu-
tions, to both encryption and signatures, which are used essentially universally,
to apply these schemes to long (or VIL) messages:
Signatures: use the hash-then-sign (HtS) paradigm, see subsection 4.2.6.
Encryption: use the hybrid encryption paradigm, see the following subsection
(subsection 6.1.6).

6.1.6 Hybrid Encryption


The huge performance overhead of asymmetric cryptosystems implies that they
can rarely be used directly to encrypt messages; they are usually used only for
very short messages. Typical, longer messages are usually encrypted with a
symmetric key. If the parties do not have a shared symmetric key, then they
set up such a shared key - using either a key-exchange protocol between them, or
by having one party use the public key of its peer to encrypt the shared key
and send it, encrypted, to the peer. The latter, widely-used technique is called
hybrid encryption; details follow.
Hybrid encryption combines a public key cryptosystem (P KG, P KE, P KD)
with a shared key cryptosystem (SKG, SKE, SKD), to allow efficient public
key encryption of long messages. Our description assumes that the sender
knows the recipient’s public key e.

Figure 6.2: Hybrid encryption: combining public key cryptosystem


(P KG, P KE, P KD) with shared key cryptosystem (SKG, SKE, SKD), to
allow efficient public key encryption of long message m using public key e.

To perform hybrid encryption, the sender selects a random key k ← SKG(1n );


the sender will share k with the recipient. The sender encrypts k using the pub-
lic key e, resulting in cipher-key cK , namely: cK = P KEe (k). The sender then
encrypts the plaintext message m, using this shared key k, resulting in the
cipher-message cM = SKEk(m). Finally, the sender sends the ciphertext (cK, cM) to
the recipient. Decryption is left as an exercise. See Figure 6.2.
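For concreteness, here is a toy Python sketch of the sender's side of hybrid encryption. It is insecure and illustrates only the structure: the public-key piece is unpadded ‘textbook’ RSA over two publicly-known Mersenne primes, and the shared-key piece is a SHA-256 counter-mode keystream, standing in for a real symmetric cipher such as AES-CTR.

```python
import os, hashlib

# Toy PKC: 'textbook' RSA (no padding) over two known Mersenne primes.
# INSECURE - illustrates only the structure of hybrid encryption.
p, q = 2**61 - 1, 2**89 - 1
n, e = p * q, 65537                      # public key (e, n); n has ~150 bits
d = pow(e, -1, (p - 1) * (q - 1))        # private key d

def pke(k_int):                          # cK = PKE_e(k)
    return pow(k_int, e, n)

def pkd(c):                              # k = PKD_d(cK)
    return pow(c, d, n)

# Toy SKE: SHA-256 counter-mode keystream XORed with the message
# (a stand-in for a real symmetric cipher such as AES-CTR).
def ske(k, m):
    stream = b"".join(hashlib.sha256(k + i.to_bytes(8, "big")).digest()
                      for i in range(len(m) // 32 + 1))
    return bytes(x ^ s for x, s in zip(m, stream))

# Sender side of hybrid encryption:
m = b"a long plaintext message ... " * 100
k = os.urandom(16)                       # k <- SKG(1^n): fresh symmetric key
cK = pke(int.from_bytes(k, "big"))       # cK = PKE_e(k): slow PKC, short input
cM = ske(k, m)                           # cM = SKE_k(m): fast SKE, long input
ciphertext = (cK, cM)                    # sent to the recipient
```

Note the division of labor: the expensive public-key operation is applied only to the short key k, while the long message is handled by the fast symmetric cipher.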

6.1.7 The Factoring and Discrete Logarithm Hard Problems


Public-key cryptology is based on the theory of complexity, and specifically
on (computationally) hard problems. Intuitively, a hard problem is a family of
computational problems with two properties:
Easy to verify: there is an efficient algorithm to verify solutions. The formal
definition of ‘efficient’ usually refers to the standard complexity-theory
notion of Probabilistic Polynomial Time (PPT) algorithms.
Hard to solve: there is no known efficient algorithm that solves the problem
(with significant probability). We refer here to known algorithms; it is
unreasonable to expect a proof that there is no efficient algorithm for a
problem for which there is an efficient (PPT) verification algorithm: such
a proof would also resolve the most important, fundamental open problem
in the theory of complexity, i.e., it would show that NP ≠ P; see [79].
Intuitively, public key schemes use hard problems, by having the secret key
provide the solution to the problem, and the public key provide the parameters
to verify the solution. To make this more concrete, we briefly discuss factoring
and discrete logarithm, the two hard problems which are the basis for many
public key schemes, including the oldest and most well known: RSA, DH, El-
Gamal. For more in-depth discussion of these and other schemes, see courses
and books on cryptography, e.g., [159]. Note that, while known attacks so far
are about equally effective against both problems (see Table 6.1), there is no
proof that an efficient algorithm for one problem implies an efficient algorithm
for the second.

Factoring
The factoring problem is one of the oldest problems in algorithmic number
theory, and is the basis for RSA and other cryptographic schemes. Basically,
the factoring problem involves finding the prime divisors (factors) of a large
integer. However, most numbers have small divisors - half of all numbers
are divisible by two, a third by three, and so on. This allows efficient number
sieve algorithms to factor most numbers. Therefore, the factoring hard problem
refers specifically to factoring of numbers which have only large prime factors.
For the RSA cryptosystem, in particular, we consider factoring of a number
n computed as the product of two large random primes: n = pq. The factoring
hard problem assumption is that given such n, there is no efficient algorithm
to factor it back into p and q.
Verification consists simply of multiplying p and q, or, if given only one of
the two, say p, of dividing n by p and confirming that the result is an integer
q with no residue.
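A short Python sketch contrasts the two directions: verifying a claimed factor takes a single division, while trial division (the simplest ‘sieving’ approach) succeeds quickly only when a small factor exists. The Mersenne primes below are merely convenient examples of large primes.

```python
p = 2**61 - 1        # 2305843009213693951, a (Mersenne) prime
q = 2**89 - 1        # also a Mersenne prime
n = p * q            # n has only two, large, prime factors

# Easy to verify: a claimed factor is checked with one division.
assert n % p == 0 and n // p == q

# Trial division quickly factors numbers that have a small factor...
def smallest_factor(m, bound=10**5):
    for d in range(2, bound):
        if m % d == 0:
            return d
    return None      # no factor below the bound

assert smallest_factor(6 * p) == 2
# ...but finds nothing for n = p*q, whose prime factors are astronomically
# larger than any feasible trial-division bound.
assert smallest_factor(n) is None
```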

Discrete logarithm
The discrete logarithm problem is another important, well-known problem from
algorithmic number theory - and the basis for the DH (Diffie-Hellman) key-
exchange protocol, the El-Gamal cryptosystem, elliptic-curve cryptology, and
additional cryptographic schemes.

Note: readers are expected to be familiar with the basic notions of number
theory. Our discussion of the public key cryptosystems (RSA, DH, El-Gamal)
and the relevant mathematical background is terse; if desired/necessary, please
consult textbooks on cryptology and/or number theory, e.g., [159].
Discrete logarithms are defined for a given cyclic group G and a generator g
of G. A cyclic group G is a group whose elements can be generated by a single
element g ∈ G, called a generator of the group, by repeated application of the
group operation. The usual notation for the group operation is the same as
the notation of multiplication in standard calculus, i.e., the operation applied
to elements x, y ∈ G is denoted x · y or simply xy, and the group operation
applied a times to element x ∈ G, where a is an integer (i.e., a ∈ N), is denoted
x^a; in particular, g^a = Π_{i=1}^a g. Applying these notations, a cyclic group is a
group where (∀x ∈ G)(∃a ∈ N)(x = g^a).
Given generator g of cyclic group G and an element x ∈ G, the discrete
logarithm of x over G, w.r.t. g, is the integer a ∈ N s.t. x = g a . This is
similar to the ‘regular’ logarithm function logb (a) over the real numbers R,
which returns the number logb (a) ≡ x ∈ R s.t. bx = a, except, of course, that
discrete logarithms are restricted to integers, and defined for given cyclic group
G.
There are efficient algorithms to compute logarithms over the reals; these
algorithms work, of course, also when the result is an integer. However, for
some finite cyclic groups, computing discrete logarithms is considered hard.
Note that the verification of discrete logarithms only requires exponentiation,
which can be computed efficiently.
Definition 6.2 (Discrete logarithm problem). Let G be a cyclic group, g be
a generator for G and x ∈ G be an element of G. A discrete logarithm of x,
for group G and generator g, is an integer a ∈ N s.t. x = g a . The discrete
logarithm problem for G is to compute integer a ∈ N s.t. x = g a , given x ∈ G
and generator g for G.
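The asymmetry between verifying and finding a discrete logarithm is easy to see in code; the toy group below (safe prime p = 23, generator g = 5) is small enough to search exhaustively, which is exactly what becomes infeasible once p has thousands of bits.

```python
p, g = 23, 5        # toy group: 23 = 2*11 + 1 is a safe prime; 5 generates Z*_23
a = 13              # the (secret) discrete log
x = pow(g, a, p)    # exponentiation is fast even for huge p (square-and-multiply)

# Verifying a claimed discrete log a: one fast exponentiation.
assert pow(g, a, p) == x

# Finding it generically: search the whole group - time exponential
# in the bit-length of p.
found = next(i for i in range(1, p) if pow(g, i, p) == x)
assert found == a
```

Since g generates the group, the exponents 1, …, p−1 map to distinct elements, so the exhaustive search recovers exactly a; for a cryptographic-size p, this search is hopeless.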

In practical cryptography, the discrete logarithm problem is used mostly for


groups defined by multiplications ‘mod p’ over the group Z∗p ≡ {1, 2, . . . , p − 1}
for a prime p, or for elliptic curve groups. We focus on the ‘mod p’ groups. For
such groups, the discrete logarithm problem is to compute integer a ∈ N s.t.
x = g a mod p, given prime integer p ∈ N and x, g ∈ Z∗p ≡ {1, 2, . . . , p − 1}.
When p−1 has only ‘small’ prime factors, then there are known algorithms,
such as the Pohlig-Hellman algorithm, that efficiently compute discrete loga-
rithms. To avoid this and other known efficient discrete-logarithm algorithms,
one common solution is to use a modulus p which is not just a prime, but a
safe prime. A prime number p ∈ N is called a safe prime, if p = 2q + 1 for
some prime q ∈ N.
Many efforts have failed to find an efficient algorithm to compute discrete-
logarithms for ‘mod p’ groups where p is a safe prime, which we simply refer
to as ‘safe-prime groups’. This motivates the discrete logarithm assumption
(DLA) for safe prime groups.

Definition 6.3 (The discrete logarithm assumption for safe prime groups).
The discrete logarithm assumption (DLA) for safe prime groups holds, if for
every PPT algorithm A:

Pr_{a←Z*p} [ a = A(g^a mod p) ] ∈ NEGL(n)        (6.5)

for every safe prime p of at least n bits, and every generator g of Z∗p =
{1, 2, . . . , p − 1}.
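Checking the safe-prime condition is itself straightforward; the following toy Python sketch uses trial-division primality testing, adequate only for small numbers (real implementations use probabilistic tests such as Miller-Rabin).

```python
def is_prime(m):                  # toy trial division - small m only
    return m > 1 and all(m % d for d in range(2, int(m**0.5) + 1))

def is_safe_prime(p):             # p is a safe prime iff p = 2q + 1, q prime
    return p > 4 and p % 2 == 1 and is_prime(p) and is_prime((p - 1) // 2)

assert is_safe_prime(23)          # 23 = 2*11 + 1, and 11 is prime
assert not is_safe_prime(29)      # 29 is prime, but (29-1)/2 = 14 is not
assert not is_safe_prime(257)     # 257 = 2^8 + 1: prime, but 256 is smooth
```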
On the secrecy implied by the discrete logarithm assumption. Sup-
pose that the discrete logarithm assumption for safe-prime groups holds, i.e., it
is computationally-hard to find the discrete-log a, given g a mod p, where g is
a generator of the safe-prime group. Does this mean that the attacker cannot
learn any information on a?
The answer is no; the attacker can learn some information about a - e.g.,
its least-significant bit (LSb). As we will see later, this has important impli-
cations on the design of some discrete logarithm-based cryptographic schemes,
most notably, on the ‘secure’ way to use the Diffie-Hellman protocol; e.g.,
see Claim 6.3.
Let us explain how the attacker can learn the least-significant bit (LSb) of
the exponent a. Namely, we explain an efficient method using which an attacker
can learn LSb(a), given g a mod p. The method is based on two claims from
basic number theory, which we present below. We present these claims without
proof, since their proofs, although not complex, would still require us to get
into more number theory than we believe desirable in this introductory course;
interested readers can find these proofs, e.g., in [98], a great textbook on the
mathematics of cryptography. To present the two claims, we first introduce the
notion of quadratic residue modulo p, which has many uses in the mathematics
of cryptography.
Definition 6.4. Let p be a prime number, and let y be a positive integer. We
say that y is a quadratic residue modulo p if there is some integer z s.t. y = z^2
mod p.
The first claim simply states that quadratic residuosity is efficiently
computable.
Claim 6.1. There is an efficient algorithm that can determine if a given pos-
itive integer y is a quadratic residue modulo a given prime p.
The second claim says that y is a quadratic residue mod p if and only if the
least significant bit of the exponent x is zero. Combined with Claim 6.1, this
shows that we can efficiently find the least-significant bit of the exponent x.
Claim 6.2. Let y = g^x mod p, where p is a prime, g is a generator for F∗p,
and x is a positive integer. Then y is a quadratic residue if and only if the
least significant bit of x is zero.
Let us prove just the easy direction: if the LSb of x is zero, then y is a
quadratic residue. The LSb of an integer x is zero when x is even, i.e.,
x = 2x′ for some integer x′. Namely, if the LSb of x is zero, then

    y = g^x = g^{2x′} = (g^{x′})^2 mod p                            (6.6)

Let z = g^{x′} mod p; we see that y = z^2 mod p, hence y is a quadratic
residue modulo p. For the (harder) proof of the other direction, i.e., that if
y is a quadratic residue, then x must be even (i.e., its LSb must be zero), as
well as for the proof of Claim 6.1, see, e.g., [98].
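Both claims can be checked concretely. The efficient test behind Claim 6.1 is Euler's criterion: for an odd prime p and y not divisible by p, y is a quadratic residue modulo p if and only if y^{(p−1)/2} mod p = 1. A short Python sketch, using the illustrative safe prime p = 23 (= 2·11 + 1) and generator g = 5, verifies that this leaks the LSb of every exponent:

```python
# Sketch: extracting the LSb of the discrete log via Euler's criterion.
# Illustrative parameters: p = 23 is a safe prime (23 = 2*11 + 1), and
# g = 5 is a generator of Z*_23.

def is_qr(y: int, p: int) -> bool:
    """Euler's criterion (Claim 6.1): y is a quadratic residue mod the
    prime p iff y^((p-1)/2) = 1 (mod p)."""
    return pow(y, (p - 1) // 2, p) == 1

p, g = 23, 5

for x in range(1, p - 1):
    y = pow(g, x, p)
    # Claim 6.2: y is a QR mod p exactly when the LSb of x is 0.
    assert is_qr(y, p) == (x % 2 == 0)
```

Since modular exponentiation is efficient even for very large p, the same computation works unchanged for cryptographic-size primes.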

Note, however, that there are also several ‘positive’ results. For example,
[90] shows that except for some least-significant bits, the other bits of the
discrete log are secure.

Discrete logarithms over other groups. Using Z∗p = {1, 2, . . . , p − 1}
with a safe prime p is not the only option. Sometimes there are computational
and even security advantages in using other finite cyclic groups for which the
discrete logarithm problem is considered hard, rather than using safe prime
groups. However, the use of such non-safe groups should be done carefully;
multiple ‘attacks’, i.e., efficient discrete-logarithm algorithms, are known for
some groups. In particular, the discrete logarithm can be computed efficiently
when p − 1 is smooth, which means that p − 1 has only small prime factors,
e.g., when p is of the form p = 2^x + 1.
This problem is not just a theoretical concern; [163] shows weaknesses in
the groups used by multiple implementations, including in popular products
such as OpenSSL, and discusses the reasons that may have led to the insecure
choices. Therefore, we mostly focus on the use of safe-prime groups.

6.2 The DH Key Exchange Protocol


A major motivation for public key cryptology is to secure communication
between parties, without requiring the parties to previously agree on a shared
secret key. In their seminal paper [56], Diffie and Hellman introduced the
concept of public key cryptology, including the public-key cryptosystem (PKC),
which indeed allows secure communication without a pre-shared secret key.
However, this paper did not contain a proposal for implementing a PKC.
Instead, [56] introduced the key exchange problem, and presented the Diffie-
Hellman (DH) key exchange protocol, often referred to simply as the DH
protocol. Although a key-exchange protocol is not a public key cryptosystem, it
also allows secure communication - without requiring a previously shared
secret key. In fact, the goal of a key exchange protocol is to establish a
shared secret key.
In this section, we explain the DH protocol, by developing it in three steps
- each in a subsection. In subsection 6.2.1 we discuss a ‘physical’ variant of the
DH protocol, which involves physical padlocks and exchanging a box (locked
by one or two locks).

6.2.1 Physical key exchange


To help understand the Diffie-Hellman key exchange protocol, we first describe
a physical key exchange protocol, illustrated by the sequence diagram in
Fig. 6.3. In this protocol, Alice and Bob exchange a secret key by using a
box and two padlocks - one of Alice's and one of Bob's. Note that initially,
Alice and Bob do not have a shared key - and, in particular, Bob cannot open
Alice's padlock and vice versa; the protocol, nevertheless, allows them to
securely share a key.

Figure 6.3: Physical Key Exchange Protocol
Alice initiates the protocol by placing the key to be shared in the box, and
locking the box with her padlock. When Bob receives the locked box, he cannot
remove Alice’s padlock and open the box. Instead, Bob locks the box with his
own padlock, in addition to Alice’s padlock. Bob now sends the box, locked by
both padlocks, to Alice.
Upon receiving the box, locked by both padlocks, Alice removes her own
padlock and sends the box, now locked only by Bob's padlock, back to
Bob. Finally, Bob removes his own padlock, and is now able to open the box
and find the key sent by Alice. We assume that the adversary - Monster
in the Middle - cannot remove Alice's or Bob's padlocks, and hence, cannot
learn the secret in this way. The Diffie-Hellman protocol replaces this
physical assumption by appropriate cryptographic assumptions.
However, notice that there is a further limitation on the adversary, which
is crucial for the security of this physical key exchange protocol: the
adversary should be unable to send a fake padlock. Note that in Figure 6.3,
both padlocks are stamped with the initial of their owner - Alice or Bob. The
protocol is not secure if the adversary is able to put her own padlock on the
box while stamping it with A or B, thereby making it appear as Alice's or
Bob's padlock, respectively. This corresponds to the fact that the Diffie-
Hellman protocol is only secure against an eavesdropping adversary, but
insecure against a MitM adversary.
The critical property that facilitates the physical key exchange protocol is
that Alice can remove her padlock, even after Bob has added his own padlock.
Figure 6.4: XOR Key Exchange Protocol

Namely, the ‘padlock’ operation is ‘commutative’ - it does not matter that
Alice placed her padlock first and Bob second; she can still remove her
padlock as if it were applied last. In a sense, the key to cryptographic key
exchange protocols such as Diffie-Hellman is to perform a mathematical
operation which is also commutative; of course, there are many commutative
operations. We next discuss ‘insecure prototype’ key-exchange protocols based
on three commutative operations: addition, multiplication and XOR.

6.2.2 Some candidate key exchange protocols

In this subsection, we present a few ‘prototype’ key-exchange protocols, which
help us to properly explain the Diffie-Hellman protocol. Unlike the physical
key exchange protocol of subsection 6.2.1, these are ‘real protocols’, i.e.,
they involve only exchange of messages - no physical objects or assumptions.
We begin with three insecure ‘prototypes’, each using a different commutative
operation: XOR, addition and multiplication.

The XOR, Addition and Multiplication key exchange protocols. The sequence
diagram in Figure 6.4 presents the first prototype: the XOR key exchange
protocol. This prototype tries to use the XOR operator to ‘implement the
padlocks’ of Figure 6.3. XOR is a natural candidate, as it is commutative -
and known to provide confidentiality when used ‘correctly’ (as in the one-time
pad).
However, as the next exercise shows, the XOR key exchange protocol allows
an attacker to find out the exchanged key.

Exercise 6.1 (XOR key exchange protocol is insecure). Show how an eaves-
dropping adversary may find the secret key exchanged by the XOR Key Ex-
change protocol, by (only) using the values sent between the two parties.

Solution (sketch): the attacker XORs all three messages, to obtain:
k = (k ⊕ kA) ⊕ (k ⊕ kA ⊕ kB) ⊕ (k ⊕ kB).
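The solution can be verified with a short simulation; the eavesdropper only sees the three flows m1, m2, m3 (the variable names below are ours, for illustration):

```python
import secrets

n = 16  # key length in bytes (illustrative)

def xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))

k  = secrets.token_bytes(n)   # the key Alice wants to share
kA = secrets.token_bytes(n)   # Alice's 'padlock'
kB = secrets.token_bytes(n)   # Bob's 'padlock'

# The three flows of the XOR key exchange protocol (Fig. 6.4):
m1 = xor(k, kA)            # Alice -> Bob:  k XOR kA
m2 = xor(m1, kB)           # Bob -> Alice:  k XOR kA XOR kB
m3 = xor(m2, kA)           # Alice -> Bob:  k XOR kB  (Alice removed kA)

assert xor(m3, kB) == k    # Bob recovers k by removing his padlock

# The eavesdropper XORs all three flows: each padlock appears twice
# and cancels, and k appears three times - leaving exactly k.
assert xor(xor(m1, m2), m3) == k
```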
The following exercise shows that a similar vulnerability occurs if we use
multiplication or addition instead of XOR.

Exercise 6.2 (Addition/multiplication key exchange is insecure). Present two
sequence diagrams similar to Figure 6.4, but using addition, and then using
multiplication, instead of XOR. Show that both versions are vulnerable to
eavesdropping attackers, similarly to Exercise 6.1.

Goal is limited to security against eavesdropper. Notice that the attacker
against the XOR key exchange protocol only needs to eavesdrop on the
communication, as marked in Figure 6.4. In fact, all the key-exchange
protocols in this section, including the Diffie-Hellman protocol, are
evaluated only against an eavesdropping adversary. To ensure security against
a Monster-in-the-Middle (MitM) attacker, see the authenticated key-exchange
protocols of subsection 6.4.1.

Exponentiation key exchange. So far, we tried to ‘implement’ the commutative
property of physical padlocks using three commutative mathematical operators:
XOR, addition and multiplication - and all turned out insecure. Essentially,
the vulnerabilities stem from the fact that the attacker was able to ‘remove’
elements by performing the dual operation: subtraction for addition and
division for multiplication. We remove XOR using XOR - but that is still
consistent, since XOR is the ‘dual operation’ to itself.
In Fig. 6.5, we try to use another basic commutative mathematical operator:
exponentiation.
Unfortunately, the exponentiation operation may also be removed to find
the exponent - by computing the logarithm. The logarithm function is not
as efficient as addition, multiplication or even exponentiation, but, over the
integers or real numbers, it is still a rather efficient operation.
If the base g is known, then an eavesdropper can simply remove all the
exponentiations, which reduces the protocol to the (insecure) multiplication
key exchange (Ex. 6.2). Namely, the attacker applies the logarithm operator,
with base g, to the three messages of Fig. 6.5 - resulting in the values
k · ra, k · ra · rb and k · rb. The attacker can now combine these three
values to find k, as in Exercise 6.2.
But what if g is randomly chosen by Alice and not made public? Then, the
attacks of Exercises 6.1 and 6.2 above, against the XOR, addition and
multiplication key exchange protocols, do not apply against the Exponentiation
Key Exchange Protocol, as there is no obvious way to use g^x in order to
compute g^y out of (g^x)^y.

Figure 6.5: Exponentiation Key Exchange Protocol

Unfortunately, g is not really needed - we can ‘fix’ the attack to work
without it. The attacker can compute the logarithm of the second flow,
g^{k·ra·rb}, to the base of the first flow, g^{k·ra}, which gives the exponent
rb - and hence allows the attacker to find the key g^k just like Bob finds it.
Hence, the Exponentiation Key Exchange Protocol (Fig. 6.5) is, indeed,
insecure. However, we will not give up - and we next show how to ‘fix’ this
protocol, and finally present a possibly-secure key exchange protocol!

Modular-Exponentiation Key Exchange. We now try to ‘fix’ the Exponentiation
Key Exchange Protocol (Fig. 6.5). The attack against it used the fact that the
computations in Fig. 6.5 are done over the field of the real (or natural)
numbers - R (or N), where there are efficient algorithms to compute
logarithms. This motivates changing this protocol to use, instead, operations
over a group in which the (discrete) logarithm problem is considered hard.
Such groups exist, e.g., the ‘mod p’ group, for a safe prime p. We present
this variant in Fig. 6.6.
One way to try to break the Modular-Exponentiation Key Exchange Protocol of
Fig. 6.6 is to compute the (discrete) logarithm of the three values exchanged
by the protocol - like the attack above against the ‘regular’ Exponentiation
Key Exchange Protocol. This works when the discrete logarithm can be computed
efficiently, e.g., when p − 1 is smooth, i.e., has only small prime factors,
e.g., p = 2^x + 1.
However, this attack requires computing discrete logarithms - and this is
believed to be computationally-hard for certain moduli. In particular, the
discrete logarithm is assumed to be computationally hard when the modulus p is
a large safe prime, i.e., p = 2q + 1 for some prime q; see Def. 6.3.
In the following subsection, we present the Diffie-Hellman protocol - which is

Figure 6.6: Modular-Exponentiation Key Exchange Protocol, for safe prime p
groups. The values k, a and b are integers smaller than p − 1.

essentially an improved variant of the Modular-Exponentiation Key Exchange
Protocol of Fig. 6.6.

6.2.3 The Diffie-Hellman Key Exchange Protocol and
Hardness Assumptions
In Fig. 6.7, we - finally - present the Diffie-Hellman (DH) key exchange
protocol, for safe prime groups. The protocol assumes that the parties agree
on a safe prime p and on a generator g for the multiplicative group Z∗p. The
protocol consists of only two flows: in the first flow, Alice sends g^a mod p,
where a is a private key chosen by Alice; and in the second flow, Bob responds
with g^b mod p, where b is a private key chosen by Bob. The result of the
protocol is a shared secret value g^{ab} mod p, computed by Alice as
g^{ab} mod p = (g^b mod p)^a mod p, and by Bob as
g^{ab} mod p = (g^a mod p)^b mod p.
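In code, the protocol amounts to one modular exponentiation per party per flow, plus one more to compute the shared secret. The sketch below uses a tiny illustrative group (p = 23, g = 5); real deployments use primes of 2048 bits or more:

```python
import secrets

# Illustrative safe-prime group: p = 2q + 1 with q = 11 prime; g generates Z*_p.
p, g = 23, 5

# Each party picks a private exponent and sends only g^x mod p.
a = secrets.randbelow(p - 2) + 1          # Alice's private key
b = secrets.randbelow(p - 2) + 1          # Bob's private key
A = pow(g, a, p)                          # first flow:  Alice -> Bob
B = pow(g, b, p)                          # second flow: Bob -> Alice

# Both sides arrive at the same shared secret g^(ab) mod p.
shared_alice = pow(B, a, p)               # (g^b)^a mod p
shared_bob   = pow(A, b, p)               # (g^a)^b mod p
assert shared_alice == shared_bob == pow(g, a * b, p)
```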
The Diffie-Hellman protocol is, essentially, a simplified, and slightly
optimized, variant of the Modular-Exponentiation Key Exchange Protocol of
Fig. 6.6. The basic difference is that instead of exchanging g^k mod p, for
some random k, we now exchange g^{ab}; this is a bit simpler, and more
efficient: only two flows instead of three, no need to compute inverses
(a^{−1}, b^{−1}), and one less exponentiation.
Notice that both the Diffie-Hellman and the Modular-Exponentiation key
exchange protocols are designed only against an eavesdropping adversary, and
are definitely insecure against a MitM attacker. Show this by solving the
following exercise.

Figure 6.7: Diffie-Hellman Key Exchange Protocol, for safe prime p groups;
a, b are integers smaller than p − 1.

Exercise 6.3. Present a sequence diagram showing that the DH protocol is
insecure against a MitM attacker. Specifically, show that a MitM attacker can
cause Alice and Bob to believe they have a secure channel between them, using
a shared key exchanged by DH, while in fact the attacker has the keys they use
to communicate.

Hint: the attacker impersonates Alice to Bob and Bob to Alice, thereby
establishing a shared key with both of them, who now believe they use a secret
shared key.
In fact, all a MitM attacker needs to do is to fake the message from a party,
allowing it to impersonate that party (with a shared key with the other
party). Indeed, in practice, we always use authenticated variants of the DH
protocol, as we discuss in subsection 6.4.1.
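Following the hint, here is a minimal simulation of the attack, where Mal substitutes her own public value for each party's flow (the group parameters and variable names are illustrative):

```python
import secrets

p, g = 23, 5   # illustrative safe-prime group

a = secrets.randbelow(p - 2) + 1    # Alice's private key
b = secrets.randbelow(p - 2) + 1    # Bob's private key
m = secrets.randbelow(p - 2) + 1    # Mal's private key

# Mal intercepts g^a and g^b, and forwards g^m instead, in both directions.
A, B, M = pow(g, a, p), pow(g, b, p), pow(g, m, p)

k_alice = pow(M, a, p)   # Alice computes a 'shared' key -- shared with Mal
k_bob   = pow(M, b, p)   # Bob computes a 'shared' key -- also with Mal

# Mal knows both keys, so she can read and modify all traffic between them.
assert k_alice == pow(A, m, p)
assert k_bob   == pow(B, m, p)
```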
Also, note that both the Diffie-Hellman and the Modular-Exponentiation key
exchange protocols generalize to any multiplicative group. However, their
security requires the discrete logarithm to be a hard problem over that group,
which does not hold for all groups - e.g., it does not hold over the real
numbers, and also not for many modular groups, even with a prime modulus p
(e.g., if p − 1 is ‘smooth’, i.e., a product of small primes).
So, can we assume that the DH protocol is secure against an eavesdropping
adversary, when computed over a group in which discrete logarithm is assumed
‘hard’, e.g., the ‘mod p’ group where p is a safe prime? The answer is not
trivial: can an adversary, knowing g^b mod p and g^a mod p, somehow compute
g^{ab} mod p, without requiring knowledge of a, b or ab? This is one of the
important open questions in applied cryptography.
The standard approach is to make a stronger assumption, such as the
Computational DH (CDH) assumption. The CDH assumption essentially means that
it is infeasible to compute the DH shared secret, when using the ‘mod p’
group, for a safe prime p. The assumption generalizes to some other groups -
but not to all others; e.g., it does not hold over the real numbers.
Definition 6.5 (Computational DH (CDH) for safe prime groups). The
Computational DH (CDH) assumption for safe prime groups holds if, for every
PPT algorithm A, every constant c ∈ R, and every sufficiently-large integer
n ∈ N:

    Pr_{a,b ←$ Z∗p} [ A(g^a mod p, g^b mod p) = g^{ab} mod p ] < n^{−c}    (6.7)

for every safe prime p of at least n bits and every generator g of Z∗p =
{1, 2, . . . , p − 1}.
However, note that even if the CDH assumption for safe prime groups holds, an
eavesdropper may still be able to learn some partial information about
g^{ab} mod p. The following claim shows that an eavesdropper can immediately
deduce whether g^{ab} mod p is a quadratic residue modulo p, i.e., it exposes
‘one bit of information’ about g^{ab} mod p.
Claim 6.3. Let PA = g^a mod p and PB = g^b mod p, where p is a prime, g is a
generator for F∗p, and a, b are positive integers. Given PA, PB, we can
efficiently deduce if g^{ab} mod p is a quadratic residue modulo p.
Proof: from Claim 6.1, the attacker can efficiently find if g^a mod p and
g^b mod p are quadratic residues modulo p. From Claim 6.2, this gives us the
least significant bit (LSb) of a and b, since, e.g., LSb(a) = 0 if and only if
g^a mod p is a quadratic residue modulo p. Clearly, ab is even, i.e.,
LSb(ab) = 0, if either a or b is even. Hence, given PA = g^a mod p and
PB = g^b mod p, we can efficiently know LSb(ab). From Claim 6.2, this also
allows us to efficiently deduce if g^{ab} mod p is a quadratic residue
modulo p.
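The proof translates directly into an eavesdropper computation: g^{ab} mod p is a quadratic residue unless both a and b are odd. The sketch below verifies this exhaustively over the illustrative group p = 23, g = 5:

```python
p, g = 23, 5   # illustrative safe prime and generator

def is_qr(y: int, p: int) -> bool:
    # Euler's criterion (the efficient test of Claim 6.1).
    return pow(y, (p - 1) // 2, p) == 1

for a in range(1, p - 1):
    for b in range(1, p - 1):
        PA, PB = pow(g, a, p), pow(g, b, p)
        # Eavesdropper: ab is even iff a or b is even, i.e., iff PA or PB
        # is a quadratic residue (Claim 6.2) -- and then so is g^(ab).
        guess_qr = is_qr(PA, p) or is_qr(PB, p)
        assert guess_qr == is_qr(pow(g, a * b, p), p)
```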
Therefore, using g^{ab} mod p directly as a key may allow an attacker to learn
some partial information about the key - even assuming that the CDH assumption
is true. Notice that while we show only exposure of this one bit of
information - the quadratic residuosity of g^{ab} mod p - there could be ways
to expose more information, without violating the CDH assumption. So, how can
we use the DH protocol to securely exchange a key? Some implementations simply
ignore this concern; however, let us discuss two ‘more secure’ options.

First option: use DDH groups. The first option is not to use a safe prime
group; instead, use a group which is believed to satisfy a stronger assumption
- the Decisional DH (DDH) Assumption. Some of the groups where DDH is assumed
to hold, and which are used in cryptographic schemes, include Schnorr groups
and some elliptic-curve groups; we will not get into such details (see, e.g.,
[159]).
Definition 6.6 (The Decisional DH (DDH) Assumption). The Decisional DH (DDH)
Assumption is that there is no PPT algorithm A that can distinguish, with
significant advantage compared to guessing, between (g^a, g^b, g^{ab}) and
(g^a, g^b, g^c), for random a, b and c. A group for which the DDH assumption
is believed to hold is called a DDH group.
Note, however, that this technique requires use of a group where DDH is
assumed to hold, and in particular, a safe-prime group cannot be used; this is
a significant hurdle.

Exercise 6.4 (Safe prime groups do not satisfy DDH). Show that safe prime
groups are not DDH groups, i.e., they do not satisfy the DDH assumption.
Another challenge with the use of this method is that the resulting shared
value g^{ab} is ‘only’ indistinguishable from a random group element, but not
from a random string. This implies that we cannot use g^{ab} directly as a
key. However, this challenge was essentially resolved by Fouque et al. in 2006
[73], who showed that the least-significant bits of g^{ab} are
indistinguishable from random - enough bits to use as a shared key, for
typical key-length and prime size.

Second option: use CDH group, with key derivation. The second option is to use
a safe prime group - or another group where ‘only’ CDH is assumed to hold -
but not use the bits of g^{ab} mod p directly as a key. Instead, we derive a
shared key k from g^{ab} mod p, using a Key Derivation Function (KDF). We
discuss key derivation in the following subsection.

6.3 Key Derivation Functions (KDF) and the
Extract-then-Expand paradigm
[Section not yet completed]
Key derivation is useful whenever we want to generate a key k from some
‘imperfect secret’ x; such an imperfect secret is a string which is longer
than the derived key, but is not fully random; e.g., x = g^{ab} mod p is a
typical example. Key derivation is done by applying a function to the shared
secret g^{ab}; this function may be randomized or deterministic.
Key derivation function (KDF). A (randomized) Key Derivation Function (KDF)
has two inputs, k = KDF_s(x). The first input, x, is the imperfect secret;
the second input, s (salt), is a public random value, used to ensure that
the output key is uniformly random; details follow.
Deterministic key derivation using a randomness-extracting hash function.
An alternative is to derive the key as k = h(g^{ab} mod p), where h is a
(deterministic) randomness-extracting hash function. A randomness-
extracting hash function does not require the parties to share a
uniformly-random string s (salt), in contrast to a KDF. In typical
applications of key exchange, such shared randomness is not available, so
that is an important advantage. We discuss randomness extraction and other
properties of cryptographic hash functions in chapter 4.
We focus in this section on (randomized) Key Derivation Functions.
Intuitively, a KDF ensures that its output k = KDF_s(x) is pseudorandom,
provided that the salt s is uniformly random, and that the imperfect secret x
is ‘sufficiently unpredictable’.
To define this ‘sufficiently unpredictable’ requirement, we assume that the
imperfect secret x is generated by some probabilistic algorithm G, i.e., x = G(r)

           PRF f_k(x)          KDF KDF_s(x)                       Rand.-extracting hash h(x)
Key        Secret, random k    Public, random salt s              No key
Data x     Arbitrary           ‘Sufficiently’ secret and random   ‘Sufficiently’ secret and random
Output     Pseudorandom        Pseudorandom                       Pseudorandom

Table 6.2: Comparison: PRF, KDF and Randomness-extracting hash function

where r is a uniformly random string. We refer to G as the secret sampler;
it models the process by which we generate the shared secret x. Note that
often, imperfect secrets are not really the product of an algorithm, but are
due to some physical process; the algorithm G is just a model of the process
generating the randomness. Of course, the output of the Diffie-Hellman key
exchange can be precisely modeled as a sampler G.
Definition 6.7 (Key Derivation Function (KDF) wrt G). A key derivation
function (KDF) KDF_s(x) is a PPT algorithm with two inputs, a salt s and an
imperfect secret x. We say that KDF_s(x) is secure with respect to a PPT
algorithm G (secret sampler), if for every PPT algorithm D (distinguisher)
holds:

    | Pr_{s,x} [ D(s, KDF_s(G(x))) = ‘Rand’ ] − Pr_{s,r} [ D(s, r) = ‘Rand’ ] | ∈ NEGL    (6.8)

where the probabilities are taken over the random coin tosses of D, and over
s, x, r ←$ {0, 1}^n.
For more details on key derivation, including provably-secure constructions,
see [111].
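A widely used construction in this spirit is HKDF (RFC 5869), which follows the extract-then-expand paradigm of the section title, using HMAC. The sketch below is simplified for illustration; real applications should use a vetted library implementation:

```python
import hmac, hashlib

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # 'Extract': concentrate the entropy of the imperfect secret ikm
    # into a short, uniform-looking pseudorandom key.
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    # 'Expand': stretch the pseudorandom key into `length` output bytes.
    okm, t, i = b"", b"", 1
    while len(okm) < length:
        t = hmac.new(prk, t + info + bytes([i]), hashlib.sha256).digest()
        okm += t
        i += 1
    return okm[:length]

# k = KDF_s(x): derive a 32-byte key from an imperfect secret x, e.g.,
# a byte encoding of g^(ab) mod p, using a public random salt s.
def kdf(s: bytes, x: bytes) -> bytes:
    return hkdf_expand(hkdf_extract(s, x), b"key derivation", 32)
```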
Note that a KDF seems quite similar, in its ‘signature’ (inputs and out-
puts), to a Pseudo-Random Function (PRF). Let’s compare PRF to KDF,
adding to the comparison another related primitive - randomness-extracting
hash function. See also Table 6.2.
We first note that the goal of all three mechanisms is to output uniformly
pseudorandom strings (unpredictable by attacker). Furthermore:
• Both KDF and PRF use, for this purpose, a uniformly-random key; however,
a PRF requires this key to be secret, while a KDF allows a public key.
Of course, a randomness-extracting hash does not require any key (so, in
a sense, it is more ‘extreme’ than a KDF).
• A PRF ensures that the output is pseudorandom, even for arbitrary input,
possibly known or even determined by the attacker. In contrast, both KDF
and randomness-extracting hash assume that the input is ‘sufficiently’
secret and random. For one definition of ‘sufficiently random input’, see
the definition of randomness-extracting hash functions in chapter 4. But
for the purposes of this course, an intuitive understanding should
suffice.

The following exercise shows that the KDF and PRF definitions are
‘incomparable’: a function can be a PRF but not a KDF, and vice versa. An
interesting example is the popular HMAC construction, which is used in many
applications. Some of these applications rely on HMAC to satisfy the PRF
properties, e.g., for message authentication; other applications rely on HMAC
to satisfy the KDF properties, e.g., to derive a key from g^{ab} mod p as in
the DH key exchange; and yet other applications rely on HMAC to satisfy both
the KDF properties and the PRF properties, e.g., for the strong resiliency to
key exposure of the Double-Ratchet key-exchange protocols; see
subsection 6.4.4.

Exercise 6.5 (PRF vs. KDF). Assume that f is a (secure) PRF and g is a
(secure) KDF. Derive from these a (secure) PRF f′ and a (secure) KDF g′,
such that f′ is not a secure KDF and g′ is not a secure PRF.

6.4 Using DH for Resiliency to Exposures: PFS and
PRS
As discussed above, and demonstrated in Ex. 6.3, the DH protocol is vulnerable
to a MitM attacker; it is secure only against a passive, eavesdropping-only
attacker. In most practical scenarios, attackers who are able to eavesdrop
have some or complete ability to also perform active attacks such as message
injection; it may seem that DH is only applicable to the relatively few
scenarios of eavesdropping-only attackers.
In this section, we discuss extensions of the DH protocol, used extensively in
practice, which improve resiliency to adversaries that have MitM abilities
combined with key-exposure abilities. Specifically, these extensions allow us
to ensure Perfect Forward Secrecy (PFS) and Perfect Recover Secrecy (PRS), the
two strongest notions of resiliency to key exposures of secure key setup
protocols, as presented in Table 5.2 (§ 5.6).

6.4.1 Authenticated DH: Perfect Forward Secrecy (PFS)

Assuming that the parties share a secret master key MK, it is quite easy to
extend the DH protocol in order to protect against MitM attackers. All that is
required is to use a Message Authentication Code (MAC) scheme and authenticate
the DH flows. In particular, we can use a pseudo-random function (PRF), say f,
for both the MAC and to derive the key; see Fig. 6.8, showing an authenticated
variant of the DH protocol.
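As an illustration of the idea (not the exact flows of Fig. 6.8), the sketch below uses HMAC-SHA256 as the PRF f, both to authenticate the DH flows under MK and to derive the session key from the DH result; the message encodings and derivation labels are our own, illustrative choices:

```python
import hmac, hashlib, secrets

p, g = 23, 5                      # illustrative safe-prime group
MK = secrets.token_bytes(32)      # pre-shared master key

def f(key: bytes, data: bytes) -> bytes:
    # HMAC-SHA256 stands in for the pseudo-random function f.
    return hmac.new(key, data, hashlib.sha256).digest()

# Alice: send g^a mod p together with a MAC under MK.
a = secrets.randbelow(p - 2) + 1
A = pow(g, a, p)
tag_A = f(MK, b"Alice" + A.to_bytes(4, "big"))

# Bob: verify the MAC before responding in kind.
assert hmac.compare_digest(tag_A, f(MK, b"Alice" + A.to_bytes(4, "big")))
b = secrets.randbelow(p - 2) + 1
B = pow(g, b, p)
tag_B = f(MK, b"Bob" + B.to_bytes(4, "big"))
assert hmac.compare_digest(tag_B, f(MK, b"Bob" + B.to_bytes(4, "big")))

# Both derive the session key by applying f keyed with the fresh DH result;
# exposing MK later does not reveal past session keys (PFS).
k_alice = f(pow(B, a, p).to_bytes(4, "big"), b"session key")
k_bob   = f(pow(A, b, p).to_bytes(4, "big"), b"session key")
assert k_alice == k_bob
```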

Lemma 6.1 (Informal: authenticated DH ensures PFS). The Authenticated DH
protocol (Fig. 6.8) ensures secure key-setup with perfect forward secrecy
(PFS).

Sketch of proof: The PFS property follows immediately from the fact that
k_i, the session key exchanged during session i, depends only on the result of
the DH protocol, i.e., is secure against an eavesdropping-only adversary. The

Figure 6.8: The Authenticated DH Protocol, ensuring PFS (but not recover
security)

protocol also ensures secure key setup, since a MitM adversary cannot learn
MK and hence cannot forge the DH messages (or find k_i).
Exercise 6.6. Alice and Bob share a master key MK and perform the
authenticated DH protocol daily, at the beginning of every day i, to set up a
‘daily key’ k_i for day i. Assume that Mal can eavesdrop on communication
between Alice and Bob every day, but can perform MitM attacks only every even
day (i s.t. i ≡ 0 (mod 2)). Assume further that Mal is given the master key
MK on the fifth day. Could Mal decipher messages sent during day i, for
i = 1, . . . , 10? Write your responses in a table.
Note that the results of Ex. 6.6 imply that the authenticated DH proto-
col does not ensure recover security. We next show extensions that improve
resiliency to key exposures, and specifically recover security after exposure,
provided that the attacker does not deploy MitM ability for one handshake.

6.4.2 The Synchronous-DH-Ratchet protocol: Perfect
Forward Secrecy (PFS) and Perfect Recover Secrecy
(PRS)
The authenticated DH protocol ensures perfect forward secrecy (PFS), but does
not ensure recover secrecy; namely, a single key exposure, at some time t,
suffices to make all future handshakes vulnerable to a MitM attacker - even if
there have been some intermediate handshakes without (any) attacks, i.e.,
where the attacker has neither MitM nor eavesdropper capabilities. To see that
the authenticated DH protocol does not ensure recovery, see the results of
Ex. 6.6.
Note that the (shared key) RS-Ratchet protocol presented in subsection 5.6.2
(Fig. 5.17) achieved recovery of secrecy - albeit, not Perfect Recover Secrecy
(PRS). Namely, the authenticated DH protocol does not even strictly improve
resiliency compared to the RS-Ratchet protocol: authenticated DH ensures PFS
(which RS-Ratchet does not ensure), but RS-Ratchet ensures recovery of secrecy
(which authenticated DH does not ensure).
In the remainder of this section, we present three protocols which ensure
both PFS and PRS, by combining Diffie-Hellman (DH) with a ‘ratchet’ mech-
anism; we refer to these as Diffie-Hellman Ratchet protocols. We begin, in

Figure 6.9: The Synchronous-DH-Ratchet protocol: PFS and PRS

this subsection, with the Synchronous-DH-Ratchet protocol, as illustrated in
Fig. 6.9.
Like the Authenticated DH protocol presented above, the Synchronous-DH-
Ratchet protocol also authenticates the DH exchange; hence, as long as the
authentication key is unknown to the attacker at the time when the protocol is
run, the key exchanged by the protocol is secret. The improvement compared to
the authenticated DH protocol is in the key used to authenticate the DH
exchange; instead of using a fixed master key (MK) as done by the
authenticated DH protocol (Fig. 6.8), the Synchronous-DH-Ratchet protocol
authenticates the ith DH exchange using the session key exchanged in the
previous exchange, i.e., k_{i−1}. An initial shared secret key k_0 is used to
authenticate the very first DH exchange, i = 1.
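A sketch of one ratchet step, again using HMAC-SHA256 as the PRF f over an illustrative small group; the message encodings are our own and not those of Fig. 6.9:

```python
import hmac, hashlib, secrets

p, g = 23, 5                              # illustrative safe-prime group

def f(key: bytes, data: bytes) -> bytes:
    # HMAC-SHA256 stands in for the pseudo-random function f.
    return hmac.new(key, data, hashlib.sha256).digest()

def ratchet_step(k_prev: bytes) -> bytes:
    """Session i: a fresh DH exchange whose flows are MACed under the
    previous session key k_{i-1}; the DH result yields k_i."""
    a = secrets.randbelow(p - 2) + 1      # Alice's fresh exponent
    b = secrets.randbelow(p - 2) + 1      # Bob's fresh exponent
    A, B = pow(g, a, p), pow(g, b, p)
    tag_A = f(k_prev, b"A" + bytes([A]))  # sent along with A, checked by Bob
    tag_B = f(k_prev, b"B" + bytes([B]))  # sent along with B, checked by Alice
    assert hmac.compare_digest(tag_A, f(k_prev, b"A" + bytes([A])))
    assert hmac.compare_digest(tag_B, f(k_prev, b"B" + bytes([B])))
    # Derive k_i from the fresh DH secret g^(ab) mod p.
    return f(bytes([pow(A, b, p)]), b"session key")

k = secrets.token_bytes(32)               # k_0: initial shared key
for i in range(1, 4):
    k = ratchet_step(k)                   # each k_i depends on a fresh DH run
```

Since each k_i is derived from a fresh DH result, a later exposure of some k_j does not reveal keys established in earlier, unattacked sessions.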
Lemma 6.2 (Informal: Synchronous-DH-Ratchet ensures PFS and PRS). The
Synchronous-DH-Ratchet protocol (Fig. 6.9) ensures secure key-setup with
perfect forward secrecy (PFS) and perfect recover secrecy (PRS).
Sketch of proof: The PFS property follows, as in Lemma 6.1, from the fact
that k_i, the session key exchanged during session i, depends only on the
result of the DH protocol, i.e., is secure against an eavesdropping-only
adversary. The protocol also ensures secure key setup, since a MitM adversary
cannot learn k_{i−1} and hence cannot forge the DH messages.
The PRS property follows from the fact that if at some session i′ there is
only an eavesdropping adversary, then the resulting key k_{i′} is secure,
i.e., unknown to the attacker, since this is assured when running DH against
an eavesdropping-only adversary. It follows that in the following session
(i′ + 1), the key used for authentication is unknown to the attacker, hence
the execution is again secure - and results in a new key k_{i′+1} which is
again secure (unknown to the attacker). This continues, by induction, as long
as the attacker is not (somehow) given the key k_i of some session i before
the parties receive the messages of the following session i + 1.

6.4.3 The Asynchronous-DH-Ratchet protocol

The Synchronous-DH-Ratchet protocol ensures both PFS and PRS. However,
it has the following disadvantage: for a party, say Alice, to move to the next

set of keys, she must first receive a message from her peer (say Bob), since
the computation of the new key k_i requires receiving the public value from
the peer (g^{b_i} mod p). What if such a message never arrives, maybe because
a Monster-in-the-Middle (MitM) attacker prevents it from arriving?
The Asynchronous-DH-Ratchet protocol, presented in Fig. 6.10, addresses this
challenge, i.e., it allows Alice to change her key even before she receives a
message from Bob - and similarly for Bob, of course. However, this results in
a slightly more complex protocol, as we explain. For simplicity, we present
the protocol focusing only on protecting the authenticity of messages, leaving
for the reader the exercise of extending it to support message
confidentiality, too (Exercise 6.7).
The key difference is that this protocol maintains separate key counters for
Alice and Bob; namely, each shared key is now denoted k_{iA,iB}, i.e., each
key has two indices: iA for Alice and iB for Bob. We derive k_{iA,iB} as
follows, using a pseudo-random function f, much like in the Synchronous-DH-
Ratchet protocol, namely:

    k_{iA,iB} = f_{k_{iA−1,iB}}(g^{a_iA b_iB}) ⊕ f_{k_{iA,iB−1}}(g^{a_iA b_iB})    (6.9)
Actually, Alice and Bob cannot apply Eq. (6.9) directly, since Alice does not
know b_{iB} - or even the current value of iB - and, similarly, Bob does not
know a_{iA} - or even the current iA. Instead, Alice computes the current key
value k_{iA,îB} using a_{iA}, her private, randomly-chosen exponent, and the
latest public value P_{B,îB} which Alice received from Bob, i.e.:

    k_{iA,îB} = f_{k_{iA−1,îB}}((P_{B,îB})^{a_iA}) ⊕ f_{k_{iA,îB−1}}((P_{B,îB})^{a_iA})    (6.10)

Similarly, Bob computes the current key value using:

    k_{îA,iB} = f_{k_{îA−1,iB}}((P_{A,îA})^{b_iB}) ⊕ f_{k_{îA,iB−1}}((P_{A,îA})^{b_iB})    (6.11)

Notice that Alice ‘maintains’ i_A, and Bob ‘maintains’ i_B; i.e., Alice can
increment i_A at any time, and Bob cannot immediately be aware of this, and
vice versa. However, this does not cause a problem: Alice maintains its own
counter for Bob, which we denote î_B, and similarly Bob maintains î_A. Alice
authenticates messages it sends using key k_{i_A,î_B}, and includes î_B so that Bob
will know which key he should use to authenticate them; Alice increments î_B when
it receives a new, validly-authenticated key-share from Bob. See Fig. 6.10.
This design allows the Asynchronous-DH-Ratchet protocol to ensure perfect
forward secrecy (PFS) and perfect recover secrecy (PRS), and furthermore, to
be ‘robust’ to a MitM adversary, i.e., a MitM adversary cannot prevent the
parties from changing their keys. Indeed, a party can move to a new key
even when the peer is non-responsive. This can be a significant advantage,
especially in asynchronous scenarios, where one party may be disconnected for
long periods, e.g., in securing communication between mobile users.
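The per-party key derivation of Eqs. (6.10)-(6.11) can be sketched in a few lines of Python. This is an illustration only: HMAC-SHA256 stands in for the PRF f, the DH group is a toy 11-bit group rather than a realistic one, and all helper names are ours, not part of the protocol:

```python
import hmac, hashlib

p, g = 2039, 7  # toy DH group parameters (real deployments use ~2048-bit p)

def prf(key: bytes, data: bytes) -> bytes:
    """The PRF f, instantiated here (for illustration) as HMAC-SHA256."""
    return hmac.new(key, data, hashlib.sha256).digest()

def xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))

def ratchet_key(k_prev_a: bytes, k_prev_b: bytes, my_exp: int, peer_pub: int) -> bytes:
    """Eqs. (6.10)/(6.11): combine the two previous keys with the fresh
    DH shared value s = peer_pub^my_exp mod p."""
    s = str(pow(peer_pub, my_exp, p)).encode()
    return xor(prf(k_prev_a, s), prf(k_prev_b, s))

# Alice (exponent a) and Bob (exponent b) hold the same two previous keys
# and the peer's public values g^a, g^b; both derive the same new key.
k_prev_a, k_prev_b = b'\x01' * 32, b'\x02' * 32
a, b = 123, 456
A, B = pow(g, a, p), pow(g, b, p)
assert ratchet_key(k_prev_a, k_prev_b, a, B) == ratchet_key(k_prev_a, k_prev_b, b, A)
```

Since both sides compute the same DH value g^{ab} mod p, applying the PRF with each of the two previous keys and XORing yields the same new key on both sides.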

Alice                                                  Bob
Init: i_A, î_B ← 0; P_{B,0} = g                        Init: î_A, i_B ← 0; P_{A,0} = g

Alice sends m_{A,1}: she increments i_A, then sends
    ψ = (m_{A,1}, g^{a_{i_A}} mod p, î_B, f_{k_{i_A,î_B}}(m_{A,1}, g^{a_{i_A}} mod p))
Bob receives m_{A,1}: he increments î_A and sets P_{A,î_A} ← g^{a_{i_A}} mod p (from ψ).

Bob sends m_{B,1}: he increments i_B, then sends
    ψ = (m_{B,1}, g^{b_{i_B}} mod p, î_A, f_{k_{î_A,i_B}}(m_{B,1}, g^{b_{i_B}} mod p))
Alice receives m_{B,1}: she increments î_B and sets P_{B,î_B} ← g^{b_{i_B}} mod p.

Bob sends m_{B,2}: he increments i_B, then sends
    ψ = (m_{B,2}, g^{b_{i_B}} mod p, î_A, f_{k_{î_A,i_B}}(m_{B,2}, g^{b_{i_B}} mod p))
Alice receives m_{B,2}: she increments î_B and sets P_{B,î_B} ← g^{b_{i_B}} mod p.

Figure 6.10: The Asynchronous-DH-Ratchet protocol. Alice derives her keys
as in Eq. (6.10), and Bob derives his keys as in Eq. (6.11).

Exercise 6.7. The Asynchronous-DH-Ratchet protocol, as presented in Fig. 6.10,
ensures only authenticity, not confidentiality. Draw a similar sequence diagram,
showing an extended version of the protocol that also ensures confidentiality.

Hint: Notice that you should not use the same key for both authentication
and encryption. Also, notice that in order to ensure perfect forward and recover
secrecy, the protocol must make sure to erase old decryption keys.

6.4.4 The Double Ratchet Key-Exchange protocol [to be completed]
The ratchet key-exchange protocols, presented above, provide very strong
resiliency to key exposure: perfect-forward secrecy (PFS) and perfect-recover
secrecy (PRS). However, ensuring such resiliency with respect to periods of
length T seconds requires performing the ratchet exchange - and the two modular
exponentiation operations it involves (per party) - once every T seconds.
Modular exponentiation is ‘efficient’ in the sense of requiring only polynomial
running time, but this time is still quite high for frequent application.

As a result, practical deployments of ratchet key-exchange protocols would
usually use rather ‘long’ periods, i.e., use large values of T . To provide addi-
tional security against key-exposure, even between runs of the ratchet exchange
(involving modular exponentiations), the Double Ratchet Key-Exchange proto-
col combines the strong PFS+PRS security of the ratchet key-exchange pro-
tocols, with the weaker security guarantees of the Recover-Security Handshake
protocol of subsection 5.6.2. Let us explain how this is done, with some
simplifications. The double-ratchet protocol runs, essentially, the asynchronous-ratchet
protocol presented in the previous subsection, to derive the asynchronous-ratchet
keys k_{i_A,i_B}[0], essentially as in Eqs. (6.9-6.11). Note that in the double-ratchet
protocol, these keys have an index whose value is zero, i.e., k_{i_A,i_B}[0], rather
than simply k_{i_A,i_B} as in the asynchronous-ratchet protocol - but otherwise the
derivation is the same.
After deriving the asynchronous-ratchet keys k_{i_A,i_B}[0], the double-ratchet
protocol then applies the PRF-based ratchet of the Recover-Security Handshake
protocol, described in subsection 5.6.2, to extract additional keys, k_{i_A,i_B}[i] for
i > 0. This derivation only requires (one) computation of a PRF, and is
therefore much more efficient than the multiple exponentiations required for
each round of the ratchet protocol. Hence, it is reasonable to perform this PRF-based
ratchet more frequently than the asynchronous-ratchet protocol, with its
much higher overhead.
The PRF-ratchet provides additional security for the duration between two
invocations of the asynchronous-ratchet protocol. Namely, the PRF-ratchet
suffices to make it impossible to ‘go back’, i.e., exposure of kiA ,iB [i] does not
expose kiA ,iB [j] for j < i.
From each of these keys k_{i_A,i_B}[i], we need to further derive specific keys for
different message-security purposes: encryption from Alice to Bob (ek^{A→B}_{i_A,i_B}[i]),
encryption from Bob to Alice (ek^{B→A}_{i_A,i_B}[i]), authentication of messages from
Alice to Bob (ak^{A→B}_{i_A,i_B}[i]) and authentication of messages from Bob to Alice
(ak^{B→A}_{i_A,i_B}[i]). We leave it to the reader to define the derivation of these
different keys.
The resulting Double-Ratchet Key-Exchange protocol is illustrated in Figure 6.11;
however, this figure should be improved - we leave it here in the hope
that it may provide some help in understanding.
The double-ratchet protocol is the essence of the TextSecure protocol, deployed
in popular instant-messaging applications, e.g., WhatsApp and Signal.
[TBC]

6.5 Discrete-Log based public key cryptosystems: DH and El-Gamal

6.5.1 The DH PKC
In their seminal paper [56], Diffie and Hellman presented the concept of public-
key cryptosystems - but did not present an implementation. On the other hand,

Figure 6.11: The Double-Ratchet Key-Exchange protocol. Figure should be
updated, it does not conform exactly with the notation currently in text (and
other improvements are needed too).

Figure 6.12: The DH-h Public Key Cryptosystem. The private decryption key
dA and random b are random integers less than p − 1. The text in blue is the
same as the two flows of the DH protocol.

they did present the DH key exchange protocol (Figure 6.7). We next show
that a minor tweak allows us to turn the DH key exchange protocol into a
PKC; we accordingly refer to this PKC as the DH PKC, and to a variant of it
that also uses a hash function h as the DH-h PKC.
Fig. 6.12 presents the DH-h public key cryptosystem, by slightly changing
the presentation of the DH key exchange protocol (Figure 6.7). Essentially,
instead of Alice selecting a random secret a and sending g^a mod p to Bob in
the first flow of the DH protocol, we now view g^{d_A} mod p as Alice’s public
key e_A, and d_A is Alice’s private key. To encrypt a message m for Alice, using
her public key e_A = g^{d_A} mod p, Bob essentially performs his role in the DH
protocol, i.e., selects a random value b ∈ [2, p − 2], and computes the
ciphertext.

Figure 6.13: The El-Gamal Public-Key Encryption

In the ‘basic’ DH PKC, the ciphertext is the pair (g^b mod p, m ⊕ (g^{d_A·b} mod p)).
This has the disadvantage that secrecy relies on the (stronger) Decisional
DH assumption (DDH), rather than on the (weaker) Computational DH assumption
(CDH). In fact, if an attacker is able to distinguish between g^{d_A·b} and
a random string, then indistinguishability may not hold for the DH PKC. To
address this, in DH-h we apply a cryptographic, randomness-extracting hash
function h before XORing, i.e., the ciphertext is the pair (g^b mod p, m ⊕ h(g^{d_A·b} mod p)).
Exercise 6.8. Write formulas for the three functions comprising the DH PKC:
(KG^{DH}(n), E^{DH}, D^{DH}), and similarly for the DH-h PKC.

Another variant of the DH PKC uses a keyed key-derivation function
f_s(g^{d_A·b} mod p), with a uniformly-random key/salt s; see discussion of randomness
extraction and KDFs in § 6.3.
We next discuss the El-Gamal PKC, which is essentially a DDH-based variant
of the DH PKC; the El-Gamal PKC relies on the DDH assumption, and does not
involve a hash or KDF function.

6.5.2 The El-Gamal PKC


The El-Gamal PKC was proposed separately from the DH protocol; however,
it is essentially a minor variant of the ‘basic’ DH PKC, not using either a hash
function h or a KDF. Note that this implies that the security of El-Gamal requires
that the group used satisfies the DDH assumption (Definition 6.6).
As shown in Fig. 6.13, the El-Gamal encryption of plaintext m, denoted
E^{EG}_{e_A}(m), is computed as follows, using Alice’s public key e_A = g^{d_A} mod p:

    b ←$ [2, p − 1] ;  x = g^b mod p ;  y = m · e_A^b mod p        (6.12)

El-Gamal decryption is therefore:

    D_{d_A}(x, y) = y / x^{d_A} mod p        (6.13)

The correctness property holds since:

    D_{d_A}(x, y) = y / x^{d_A} mod p
                  = (m · (g^{d_A})^b mod p) / ((g^b)^{d_A} mod p)
                  = (m · g^{b·d_A} mod p) / (g^{b·d_A} mod p)
                  = m
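The scheme and its correctness are easy to check numerically; below is a minimal Python sketch of Eqs. (6.12)-(6.13), with an unrealistically small prime and helper names of our choosing:

```python
import random

p, g = 2039, 7  # toy parameters; real El-Gamal uses a ~2048-bit prime

def keygen():
    d = random.randint(2, p - 2)           # private key d_A
    return pow(g, d, p), d                  # public key e_A = g^{d_A} mod p

def encrypt(e_pub: int, m: int):
    b = random.randint(2, p - 1)            # fresh randomness per encryption
    return pow(g, b, p), (m * pow(e_pub, b, p)) % p   # (x, y), Eq. (6.12)

def decrypt(d: int, x: int, y: int) -> int:
    # Eq. (6.13): y / x^d mod p, via the modular inverse (Python 3.8+)
    return (y * pow(pow(x, d, p), -1, p)) % p

e_pub, d = keygen()
x, y = encrypt(e_pub, 42)
assert decrypt(d, x, y) == 42
```

Note that each encryption draws fresh randomness b, so two encryptions of the same plaintext produce (with overwhelming probability) different ciphertexts.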

6.5.3 Homomorphic encryption and re-encryption.


The El-Gamal encryption is homomorphic with respect to multiplication. Namely,
the multiplication of two ciphertexts equals the encryption of the multiplication
of the two plaintexts. Let us see how this works.
Given two messages m_1, m_2, let (x_1, y_1) = E^{EG}_{e_A}(m_1) and
(x_2, y_2) = E^{EG}_{e_A}(m_2) be their El-Gamal encryptions. Then we can compute
the El-Gamal encryption of the multiplication of the two plaintexts,
m_1 · m_2 mod p, as follows:

    E^{EG}_{e_A}(m_1 · m_2 mod p) = (x_1 · x_2, y_1 · y_2) mod p        (6.14)

Exercise 6.9. Show that Eq. 6.14 is a correct encryption of m_1 · m_2 mod p.

Hint: compute the decryption and show it returns m_1 · m_2 mod p.
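Eq. (6.14) can also be checked numerically; the following Python sketch (toy parameters and a fixed private key, for brevity; helper names ours) multiplies two ciphertexts component-wise and decrypts the result:

```python
import random

p, g = 2039, 7      # toy El-Gamal parameters
d = 77              # private key (fixed here for illustration)
e_pub = pow(g, d, p)

def encrypt(m: int):
    b = random.randint(2, p - 1)
    return pow(g, b, p), (m * pow(e_pub, b, p)) % p

def decrypt(x: int, y: int) -> int:
    return (y * pow(pow(x, d, p), -1, p)) % p

m1, m2 = 5, 9
x1, y1 = encrypt(m1)
x2, y2 = encrypt(m2)
# Component-wise multiplication of the two ciphertexts (Eq. 6.14) ...
x3, y3 = (x1 * x2) % p, (y1 * y2) % p
# ... decrypts to the product of the two plaintexts:
assert decrypt(x3, y3) == (m1 * m2) % p
```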
Homomorphic encryption allows computation on encrypted values; in particular,
we just saw how, given two ciphertexts, we can compute the encryption
of their multiplication - without knowing the plaintexts or the decryption
key. There are also encryption schemes which are homomorphic with
respect to other operations, e.g., addition. Furthermore, there are even
encryption schemes which are homomorphic with respect to both multiplication
and addition; such schemes are referred to as Fully Homomorphic Encryption
(FHE), and allow arbitrary computation over encrypted values. To avoid
confusion, schemes which are homomorphic with respect to only one operation,
e.g., multiplication, are sometimes referred to as partially-homomorphic
encryption (PHE). Known FHE schemes are complex and have significant overhead,
in terms of computation time and/or key/ciphertext length, compared
to PHE schemes such as El-Gamal.
Let us focus on El-Gamal’s multiplicative homomorphic encryption, and
demonstrate how it can be used for applications requiring anonymity, and
specifically voter anonymity. This is a tiny taste of the large body of research
on the use of cryptography to ensure privacy, anonymity and e-voting.
Consider, first, a trivial design for an e-voting system: voters encrypt their
votes with the public key of a trusted server, to which they then send their

Figure 6.14: Example: Anonymous voting using two servers and
multiplicative-homomorphic encryption, e.g., the El-Gamal PKC. Each candidate
i is assigned a unique small prime number p_i > 1. Each voter, e.g. Alice,
selects one candidate, say p_{i_A}, and sends E_{e_DS}(p_{i_A}) to the tally server. The
tally server combines the encrypted votes by computing E_{e_DS}(p_{i_A} · p_{i_B} · p_{i_C}) =
E_{e_DS}(p_{i_A}) · E_{e_DS}(p_{i_B}) · E_{e_DS}(p_{i_C}) and sending it to the decrypt server. The
decrypt server outputs the combined vote p_{i_A} · p_{i_B} · p_{i_C}. By factoring this, we
find how many votes were given to each candidate i.

votes; the server decrypts and then tallies the votes. This system requires
complete trust in the server; in particular, the server can trivially learn the
vote of each voter.
To ensure voter anonymity, we separate the two functions of the server and
use two servers: a tally server and a decryption server. We also switch the
order of operations: the encrypted votes are sent first to the tally server, which
aggregates all of them into a single (encrypted) value that is then sent to the
decryption server, which decrypts it to produce the final outcome.
The voting process is shown in Fig. 6.14. As shown, each candidate i
is assigned a unique small prime number p_i > 1. Each voter, e.g. Alice,
selects one candidate, say p_{i_A}, and sends E_{e_DS}(p_{i_A}) to the tally server. The
tally server combines the encrypted votes by computing E_{e_DS}(p_{i_A} · p_{i_B} · p_{i_C}) =
E_{e_DS}(p_{i_A}) · E_{e_DS}(p_{i_B}) · E_{e_DS}(p_{i_C}) and sending it to the decrypt server. The
decrypt server outputs the combined vote p_{i_A} · p_{i_B} · p_{i_C}. By factoring this, we
find how many votes were given to each candidate i.
Let us point out another simple mechanism using multiplicative homomorphic
encryption, e.g., El-Gamal: re-encryption. Namely, given a ciphertext
c, we can compute another ciphertext c′ such that c and c′ decrypt to the
same value, without knowing the plaintext or the decryption key. To do this,
we first encrypt the message 1, i.e., obtain (x_1, y_1) = E_{e_A}(1) (where
e_A is the public key). We then use the homomorphic property to combine
(x_1, y_1) = E_{e_A}(1) with c = (x, y) = E_{e_A}(m), thereby computing the new
ciphertext c′ = (x′, y′), which decrypts to the same value as c. Namely,
x′ = x · x_1, y′ = y · y_1. From the homomorphic property, m = D_{d_A}(x′, y′), since
m = m · 1. Re-encryption is used in different protocols to preserve anonymity.
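A short Python sketch of re-encryption, reusing the same toy El-Gamal setup (small prime, fixed private key, helper names ours); note that re-encryption uses only the public key:

```python
import random

p, g = 2039, 7
d = 77                   # private key, used here only to check correctness
e_pub = pow(g, d, p)     # public key

def encrypt(m: int):
    b = random.randint(2, p - 1)
    return pow(g, b, p), (m * pow(e_pub, b, p)) % p

def decrypt(x: int, y: int) -> int:
    return (y * pow(pow(x, d, p), -1, p)) % p

def re_encrypt(x: int, y: int):
    """Re-randomize (x, y) by combining it with a fresh encryption of 1.
    Needs only the public key - neither d nor the plaintext."""
    x1, y1 = encrypt(1)
    return (x * x1) % p, (y * y1) % p

c = encrypt(123)
c2 = re_encrypt(*c)
assert decrypt(*c2) == decrypt(*c) == 123   # same plaintext, fresh ciphertext
```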
Notice that the above method of re-encryption of El-Gamal ciphertexts requires
use of the public key e_A with which the message was encrypted. In
some applications, it is desirable to allow re-encryption without specifying the
public key, e.g., for recipient anonymity. In such a case, one can use an elegant
extension called universal re-encryption, which allows re-encryption without
knowledge of the encryption key e_A. This is done by appending the encryption
E_{e_A}(1) to each ciphertext; see details in [84].
Note also that we only discussed re-encryptions which preserve the same
decryption key. El-Gamal also allows proxy re-encryption, where a proxy is
given a special key, denoted e_{A→B} and computed by Alice, that allows the
proxy to transform a ciphertext encrypted to Alice, c = E_{e_A}(m), into an encryption
of the same message with Bob’s key, c′ = E_{e_B}(m). For more details on this
cool mechanism and its applications, see [33].

6.6 The RSA Public-Key Cryptosystem


In 1978, Rivest, Shamir and Adleman presented the first proposal for a public-key
cryptosystem - as well as a digital signature scheme [144]. This beautiful
scheme is usually referred to by the initials of its inventors, i.e., RSA; the
inventors were awarded the Turing Award in 2002, and RSA is still widely used.
We only cover here some basic details of RSA; a more in-depth study is recommended,
by taking a course in cryptography and/or reading one of the many books on
cryptography covering RSA in depth.

6.6.1 RSA key generation.


Key generation in RSA is more complex than for DH and El-Gamal; we first
list the steps, and then explain them:
• Select a pair of large prime numbers p, q; let n = p · q and let φ_n =
(p − 1) · (q − 1). To encrypt messages of up to N bits, pick p, q to be
random ⌈N/2⌉-bit primes each.
• Select a value e which is co-prime to φ_n, i.e., gcd(e, φ_n) = 1.
• Compute d s.t. e·d = 1 mod φ_n.
• The public key is (e, n) and the private key is (d, n).

Selecting e to be co-prime to φ_n is necessary - and sufficient - to ensure that
e has a multiplicative inverse d modulo φ_n. To find the inverse d,
we can use the extended Euclidean algorithm. This algorithm efficiently finds
numbers d, x s.t. e·d + φ_n·x = gcd(e, φ_n) = 1; namely, e·d = 1 mod φ_n.
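The key-generation steps above can be sketched in Python; in this illustration, the built-in `pow(e, -1, phi)` (Python 3.8+) plays the role of the extended Euclidean algorithm, and the primes are toy-sized:

```python
import math

p, q = 61, 53                  # toy primes; real RSA uses ~1024-bit primes
n = p * q                      # the modulus n = p*q
phi = (p - 1) * (q - 1)        # phi_n = (p-1)(q-1)

e = 17                         # public exponent, must be co-prime to phi_n
assert math.gcd(e, phi) == 1

d = pow(e, -1, phi)            # modular inverse: e*d = 1 mod phi_n
assert (e * d) % phi == 1

public_key, private_key = (e, n), (d, n)
```

With these toy values, n = 3233, φ_n = 3120 and d = 2753; the inverse exists precisely because gcd(e, φ_n) = 1.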

n        | 1    | 2    | 3    | 4   | 5    | 6   | 7    | 8   | 9   | 10
φ(n)     | 1    | 1    | 2    | 2   | 4    | 2   | 6    | 4   | 6   | 4
factors? | none | none | none | 2·2 | none | 2·3 | none | 2^3 | 3·3 | 2·5

Table 6.3: The Euler function φ(n), computed for small integers; see Eq. (6.17).

The public key of RSA is < e, n > and the private key is < d, n >, since
the modulus n is required for both encryption and decryption. However, we -
and others - often abuse notation and refer to the keys simply as e and d.
Notice that exactly the same process is used for key generation for RSA
signature schemes (discussed below), except that we denote the public verification
key by v (instead of the public encryption key e) and the private signing key
by s (instead of the private decryption key d).

6.6.2 Textbook RSA: encryption, decryption and correctness.
We first present textbook RSA, which is a bit simpler than ‘real’ RSA encryption,
presented later. The textbook RSA encryption c = E_e(m) and decryption
m = D_d(c) processes are simple and similar:

    E_e^{RSA}(m) = m^e mod n        (6.15)
    D_d^{RSA}(c) = c^d mod n        (6.16)
Where the message m is encoded as a positive integer, and limited to m < n,
ensuring that m = m mod n; in fact, 1 < m < n − 1 (why?). To handle the
common case where the plaintext is longer, we use hybrid encryption, i.e., use
the public key encryption to encrypt a shared key and then use the shared key
to efficiently encrypt the long message; see §6.1.6.
The reason this method is referred to as textbook RSA is that, to ensure
security, we usually pre-process messages before applying encryption, in a (key-
less) process usually referred to as padding. We explain the need for padding
in §6.6.3, and OAEP, the commonly-used padding scheme, in §6.6.4.
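With a toy key (p = 61, q = 53, so n = 3233, e = 17, d = 2753), Eqs. (6.15)-(6.16) become one-liners in Python; this is textbook RSA only, without the padding discussed below:

```python
n, e, d = 3233, 17, 2753   # toy RSA key (p=61, q=53); illustration only

def rsa_encrypt(m: int) -> int:
    assert 1 < m < n - 1
    return pow(m, e, n)     # Eq. (6.15): c = m^e mod n

def rsa_decrypt(c: int) -> int:
    return pow(c, d, n)     # Eq. (6.16): m = c^d mod n

c = rsa_encrypt(65)         # c == 2790 with this key
assert rsa_decrypt(c) == 65
```

Python's three-argument `pow` performs modular exponentiation efficiently (by repeated squaring), which is essential even at toy sizes.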

Correctness of textbook RSA


Let us explain why these processes ensure correctness of textbook RSA, i.e.,
m = DdRSA (EeRSA (m)) = (me mod n)d mod n. This is based on a fundamen-
tal result from number theory: Euler’s Theorem.
Before we present Euler’s theorem, we introduce Euler’s function φ(n).
The value of φ(n) is defined as the number of positive integers which are less
than n and co-prime to n, i.e.:

    φ(n) ≡ |{i ∈ N | i < n ∧ gcd(i, n) = 1}|        (6.17)
The Euler function φ(n), computed for small integers, is shown in Table 6.3.
Note that for any primes p, q, φ(p) = p − 1, φ(q) = q − 1, and φ(p · q) =
(p − 1)(q − 1); you can see it in the table, and it is not hard to confirm that this
holds in general. This is the reason for us using φ_n = φ(n) = φ(p · q) = (p − 1)(q − 1)
in the RSA key generation process.
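Both the table values and the identity φ(p·q) = (p−1)(q−1) are easy to check by computing Eq. (6.17) directly; a brute-force Python check (helper name ours):

```python
from math import gcd

def phi(n: int) -> int:
    """Euler's function, computed directly from Eq. (6.17)."""
    return sum(1 for i in range(1, n) if gcd(i, n) == 1)

# Matches Table 6.3 for n = 2..10:
assert [phi(n) for n in range(2, 11)] == [1, 2, 2, 4, 2, 6, 4, 6, 4]

# phi(p*q) = (p-1)(q-1) for primes p, q:
p, q = 11, 13
assert phi(p * q) == (p - 1) * (q - 1)
```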
We now present Euler’s theorem:
Theorem 6.1 (Euler’s theorem). For any co-prime integers m, n, it holds that
m^{φ(n)} = 1 mod n.
We will not prove the theorem; interested readers can find proofs in many
texts. However, let us use the theorem, to explain RSA’s correctness, i.e., why
Dd (Ee (m)) = m.
Our explanation, below, uses Euler’s theorem, with m being the message
and n being the RSA modulus. However, the theorem requires m and n to be
co-primes. Does this hold?
Recall that the message m may be any positive integer s.t. 1 < m < n − 1.
How many of these possible messages - i.e., integers smaller than n - are co-prime
to n? Why, this is exactly the definition of φ(n), which we know to be:
φ(n) = φ(p · q) = (p − 1) · (q − 1) = n − q − p + 1.
So most possible messages m are indeed co-prime to n - but p + q − 2
messages are not co-prime to n. Our explanation would not hold for these values.
We assure the reader, however, that correctness holds also for these values;
it simply requires a slightly more elaborate argument. The argument usually
uses the Chinese Remainder Theorem, and can be found in many textbooks
on cryptography (and number theory). We present only the argument for the
case that m and n are co-prime (gcd(m, n) = 1).
Recall that e·d = 1 mod φ(n), i.e., for some integer i it holds that e·d = 1 + i · φ(n).
Hence:

    m^{ed} mod n = m^{1+i·φ(n)} mod n = (m · (m^{φ(n)})^i) mod n

Recall Eq. (2.7): a^b mod n = (a mod n)^b mod n. Hence, by substituting
m^{φ(n)} = 1 mod n (Euler’s theorem), we have:

    m^{ed} mod n = (m · (m^{φ(n)} mod n)^i) mod n = m · 1^i = m        (6.18)

To sum up, equation (6.18) shows the correctness of textbook RSA, as
defined by Eqs. (6.15) and (6.16).

Choosing e to improve efficiency


The public exponent e is not secret, and, so far, it was only required to be co-prime
to φ(n). This motivates choosing an e that will improve efficiency - usually,
to make encryption faster.
In particular, choosing e = 3 implies that encryption - i.e., computing m^e
mod n - requires only two multiplications, i.e., is very efficient (compared to
exponentiation by a larger number). Note, however, that there are several concerns
with such an extremely-small e; in particular, if m is also small, so that
c = m^e < n (without reducing mod n), then we can efficiently decrypt by
taking the e-th root: c^{1/e} = (m^e)^{1/e} = m. This particular concern is relatively
easily addressed by padding, as discussed below; but there are several
additional attacks on RSA with a very-low exponent e, in particular in the case
of sending the same message, or ‘highly related’ messages, to multiple recipients.
These attacks motivate (1) the use of padding to break any possible
relationships between messages (see §6.6.4), as well as (2) the choice of a slightly
larger e, such as 17 or even 2^16 + 1 = 65537. The reason to choose these specific
primes is that exponentiation requires only 5 or 17 multiplications, respectively;
see the next exercise.
see next exercise.
Exercise 6.10. Given integer m, show how to compute m^17 in only five
multiplications, and m^{2^16+1} in only 17 multiplications.
Hint: use the following idea: compute m^8 with three multiplications, as
m^8 = ((m^2)^2)^2.
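The multiplication counts above come from repeated squaring: m^{2^k} costs k multiplications, plus one more to multiply by m. The following Python sketch (our own helper, counting multiplications explicitly) confirms the counts of 5 and 17:

```python
def pow_by_squaring(m: int, squarings: int, times_m: bool):
    """Compute m^(2^squarings), optionally times m, counting multiplications."""
    result, count = m, 0
    for _ in range(squarings):
        result *= result          # each squaring is one multiplication
        count += 1
    if times_m:
        result *= m               # one extra multiplication
        count += 1
    return result, count

m = 3
v17, c17 = pow_by_squaring(m, 4, True)         # m^16 * m = m^17
assert v17 == m ** 17 and c17 == 5
v65537, c65537 = pow_by_squaring(m, 16, True)  # m^(2^16) * m = m^65537
assert v65537 == m ** 65537 and c65537 == 17
```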

6.6.3 The RSA assumption and security


Now that we have seen that RSA is a PKC and ensures correctness, it is time
to discuss the security of RSA.
Intuitively, the security of RSA is based on the following assumption, usually
referred to as the RSA assumption.
Definition 6.8 (RSA assumption). Choose n, e as explained above, i.e., n = p·q
for p, q chosen as random l-bit prime numbers, and e co-prime to φ_n. The
RSA assumption is that for any efficient (PPT) algorithm A and constant c,
and for sufficiently large l, holds:

    Pr[A((e, n), m^e mod n) = m] < l^{-c}        (6.19)

where m is chosen uniformly at random, m ←$ [1, n − 1].
The RSA assumption is also referred to sometimes as the RSA trapdoor
one-way permutation assumption. The ‘trapdoor’ refers to the fact that d is a
‘trapdoor’ that allows inversion of RSA; the ‘one-way’ refers to the fact that
computing RSA (given public key (e, n)) is easy, but inverting is ‘hard’; and the
‘permutation’ is due to RSA being a permutation (and in particular, invertible).
See also the related concept of one-way functions in §4.4.
One obvious question is the relation between the RSA assumption and
the assumption that factoring is hard. Assume that some adversary A_F can
efficiently factor large numbers, specifically, the modulus n (which is part of
the RSA public key). Then A_F can factor n, find p and q, compute φ_n and
proceed to compute the decryption key d, given the public key (e, n), just
as done in RSA key generation. We therefore conclude that if factoring is
easy, i.e., there exists such an adversary A_F, then the RSA assumption cannot
hold (and RSA is insecure).

RSA Security.
Another question is the relationship between the RSA assumption and the
security of RSA as used as a public key cryptosystem (PKC). Trivially, if the
RSA assumption is false, then RSA is not a secure cryptosystem - definitely
not the ‘textbook RSA’ defined above.
However, what we really care about is the reverse direction, i.e., having a
cryptosystem which is secure assuming that the RSA assumption holds. This
does not hold for textbook RSA. There are a few problems we should address,
to build a secure PKC from RSA:

1. If we apply textbook RSA, i.e., as in Eq. (6.15), then it cannot be IND-CPA
secure, since it is completely deterministic. Some randomness must
be added to the process. Namely, the IND-CPA experiment allows the
attacker to know (even to choose) two specific messages m_0, m_1. The
attacker is then given the encryption c* = m_b^e mod n for a random bit b, and
‘wins’ if it finds b. However, the attacker can compute c_0 = m_0^e mod n
and c_1 = m_1^e mod n, and compare them to c* to immediately find b and ‘win’.
Obviously, randomization is necessary.
2. The RSA assumption, as stated, does not rule out a potential exposure
of partial information about the plaintext, e.g., a particular bit. Note
that the log n least-significant bits were shown to be secure [2].
3. RSA is vulnerable to chosen ciphertext attacks (CCA). Specifically, suppose
the attacker wants to decrypt a ciphertext c = m^e mod n - without making
a decryption query with c (as that would not be considered ‘winning’).
Instead, the attacker asks for the decryption of a different ciphertext: ĉ = c · m̂^e
mod n. This returns m · m̂ mod n, from which the attacker obtains the original
plaintext m (by dividing by m̂).
4. Furthermore, textbook RSA - as well as version 1.5 and earlier of the
PKCS 1 standard - were shown vulnerable to a practical feedback-only
CCA attack by Bleichenbacher [36].
5. As additional motivation, recall the attacks mentioned above against RSA
when using a small exponent (e.g., e = 3) and sending identical or related
messages to multiple recipients.
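The ‘blinding’ chosen-ciphertext attack of item 3 is easy to demonstrate on textbook RSA; in this Python sketch (toy key; the oracle is a stand-in for any decryption service), the attacker never queries the target ciphertext itself:

```python
n, e, d = 3233, 17, 2753    # toy RSA key; the attacker knows only (e, n)

def decryption_oracle(c: int) -> int:
    """Stand-in for a service that decrypts anything except the target c."""
    return pow(c, d, n)

m = 1234                    # the victim's plaintext
c = pow(m, e, n)            # the target ciphertext

m_hat = 2                   # attacker's blinding value
c_hat = (c * pow(m_hat, e, n)) % n        # a *different* ciphertext
blinded = decryption_oracle(c_hat)        # decrypts to m * m_hat mod n
recovered = (blinded * pow(m_hat, -1, n)) % n   # unblind: divide by m_hat
assert c_hat != c and recovered == m
```

This is exactly why CCA-resistant padding (such as OAEP, below) checks ciphertext integrity before releasing any decryption result.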

The standard way to deal with these concerns is to apply a randomized


‘padding’ function to the plaintext, and only then apply the ‘textbook’ RSA
function (eq. 6.15), as we discuss next.

6.6.4 RSA with OAEP (Optimal Asymmetric Encryption


Padding)
So far, we discussed textbook RSA (§6.6.2); however, this version is rarely used
in practice, due to known attacks and vulnerabilities, listed above.

Figure 6.15: RSA with OAEP (Optimal Asymmetric Encryption Padding).

Bellare and Rogaway presented in [22] the OAEP (Optimal Asymmetric Encryption
Padding) scheme, which is widely used for padding of messages before applying
the ‘textbook RSA’ exponentiation (Eq. (6.15)).
The goal of OAEP is to prevent different attacks, including CCA attacks;
to prevent CCA, OAEP includes a verification mechanism, which allows the
recipient to confirm that the ciphertext is the result of a legitimate encryption,
rather than a manipulation of an intercepted (different) ciphertext.
[TBC]
The security of OAEP is analyzed using the Random Oracle Methodology
(ROM), see §4.6.

6.7 Public key signature schemes


We now discuss the third type of public-key cryptographic schemes: signature
schemes, introduced in subsection 3.3.2. Signature schemes consist of three
efficient algorithms (KG, S, V ), illustrated in Figure 3.2:

Key-generation KG, a randomized algorithm, whose input is the key length
l, and which outputs the private signing key s and the public validation
key v, each of length l bits.
Signing S, a (deterministic or randomized) algorithm, whose inputs are a
message m and the signing key s, and whose output is a signature σ.
Validation V, a deterministic algorithm, whose inputs are a message m, a
signature σ and a validation key v, and which outputs an indication whether
this is a valid signature for this message or not.

Figure ?? illustrates the process of signing a message (by Alice) and valida-
tion of the signature (by Bob). We denote Alice’s keys by A.s (for the private
signing key) and A.v (for the public validation key); note that this figure as-
sumes that Bob knows A.v - we later explain how signatures also facilitate
distribution of public keys such as A.v.

Signature schemes have two critical properties, which make them a key
enabler of modern cryptographic systems. First, they facilitate secure remote
exchange in the MitM adversary model; second, they facilitate non-repudiation.
We begin by briefly discussing these two properties.

Signatures facilitate secure remote exchange of information in the
MitM adversary model. Public key cryptosystems and key-exchange protocols
facilitate establishing private communication and a shared key between
two remote parties, using only public information (keys). However, this still
leaves the question of the authenticity of the public information (keys).
If the adversary is limited in its abilities to interfere with the communication
between the parties, then it may be trivial to ensure the authenticity of the
information received from the peer. In particular, many works assume that the
adversary is passive, i.e., can only eavesdrop on messages; this is also the basic
model for the DH key exchange protocol. In this case, it suffices to simply send
the public key (or other public value).
Some designs assume that the adversary is inactive or passive during the
initial exchange, and use this exchange to share information, such as keys, between
the two parties. This is called the trust on first use (TOFU) adversary model.
In other cases, the attacker may inject fake messages, but cannot eavesdrop
on messages sent between the parties; in this case, parties may easily authen-
ticate a message from a peer, by previously sending a challenge to the peer,
which the peer includes in the message.
However, all these methods fail against the stronger monster-in-the-middle
(MitM) adversary, who can modify and inject messages as well as eavesdrop
on messages. To ensure security against such an attacker, we must use strong,
cryptographic authentication mechanisms. One option is to use message
authentication codes; however, this requires the parties to share a secret key in
advance - and if that’s the case, the parties could use this shared key to establish
secure communication directly.
Signature schemes provide a solution to this dilemma. Namely, a party
receiving signed information from a remote peer, can validate that informa-
tion, using only the public signature-validation key of the signer. Furthermore,
signatures also allow the party performing the signature-validation, to first
validate the public signature-validation key, even when it is delivered by an in-
secure channel which is subject to a MitM attack, such as email. This solution
is called public key certificates.
As illustrated in Fig. 3.3, a public key certificate is a signature by an entity
called the issuer or certificate authority (CA), over the public key of the subject,
e.g., Alice. In addition to the public key of the subject, subject.v, the signed
information in the certificate contains attributes such as the validity period,
and, usually, an identifier and/or name for the subject (Alice).
Once Alice receives her signed certificate Cert, she can deliver it to the
relying party (e.g., Bob), possibly via insecure channels such as email. This
allows the relying party (Bob) to use Alice’s public key, i.e., rely on it, e.g., to
validate Alice’s signature over a message m, as shown in Fig. 3.3. Note that
this requires Bob to trust this CA and to have its validation key, CA.v.
This discussion of certificates is very basic; more details will be provided in
chapter 8, which discusses public-key infrastructure and the TLS/SSL protocol.

Signatures facilitate non-repudiation. The other unique property of digital
signature schemes is that they facilitate non-repudiation. Namely, upon
receiving a properly signed document, together with a signature by some well-known
authority establishing the public signature-validation key, the recipient
is assured that she can convince other parties that she received the properly-signed
document. This is a very useful property. This property does not hold for
message-authentication codes (MAC schemes), where a recipient can validate
that an incoming message has the correct MAC code, but cannot prove this to
another party - in particular, since the recipient is able to compute the
MAC code herself for arbitrary messages.

Designs of signature schemes. We next briefly present two designs of


signature schemes: RSA-based signature, and Discrete-Log based signatures
(focusing on a variant known as El-Gamal signatures). Both designs also make
use of cryptographic hash functions h(·), which we discuss in chapter 4.

6.7.1 RSA-based signatures


RSA signatures were proposed in the seminal RSA paper [144], and are based on the RSA assumption, with the same key-generation process as for the RSA PKC. The only difference in key generation is notational: for signature schemes, the public key is denoted v (as it is used for validation), and the private key is denoted s (as it is used for signing).
There are two main variants of RSA signatures: signature with message re-
covery, and signature with appendix. We begin with signatures with appendix,
as in practice, almost all applications of RSA signatures are with appendix; in
fact, we present (later) signatures with message recovery mainly since they are
often mentioned, and almost as often, a cause for confusion.

RSA signature with appendix.

In the (theoretically-possible) case that input messages are very short, and can be encoded as a positive integer which is less than n, we can sign using RSA by applying the RSA exponentiation directly to the message, resulting in the signature σ. In this case, the signature and validation operations are defined as:

S_s^RSA(m) = (m^s mod n, m)

V_v^RSA(σ, m) = { m if m = σ^v mod n, ‘error’ otherwise }

Above, s is the private signature key, and v is the public validation key. The keys are generated using the RSA key generation process; see subsection 6.6.1.
In practice, as discussed in §4.2.6, input messages are of variable length - and rarely shorter than the modulus. Hence, real signatures apply the Hash-then-Sign (HtS) paradigm (??), using some cryptographic hash function h, whose range is contained in (1, n − 1), i.e., allowable inputs to the RSA function.
Applied to the RSA FIL signature as defined above, we have the signature
scheme (S RSA,h , V RSA,h ), defined as follows:

S_s^{RSA,h}(m) = ([h(m)]^s mod n, m)
V_v^{RSA,h}(σ, m) = { m if h(m) = σ^v mod n, error otherwise }

The resulting signature scheme is secure, if h is a CRHF; see §4.2.
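As a sanity check, the Hash-then-Sign scheme above can be exercised with toy parameters (p = 61, q = 53, so n = 3233, v = 17, s = 2753; for illustration only - real RSA uses moduli of at least 2048 bits and standardized padding):

```python
import hashlib

# Toy key pair: n = 61*53 = 3233, public validation key v = 17,
# private signing key s = 2753 (17 * 2753 = 1 mod phi(n) = 3120).
n, v, s = 3233, 17, 2753

def h_int(m: bytes) -> int:
    """Stand-in for a hash h whose range lies in (1, n-1)."""
    return int.from_bytes(hashlib.sha256(m).digest(), 'big') % (n - 2) + 1

def sign(m: bytes) -> tuple:
    """S(m) = (h(m)^s mod n, m) - the Hash-then-Sign signature."""
    return (pow(h_int(m), s, n), m)

def verify(sigma: int, m: bytes) -> bool:
    """Accept iff h(m) == sigma^v mod n."""
    return h_int(m) == pow(sigma, v, n)
```

The message travels alongside the signature (the 'appendix'); validation recomputes the hash and compares it with σ^v mod n.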


This signature scheme is called signature with appendix since it requires transmission of both the original message and its signature. This is in contrast to a rarely used variant of RSA signatures called signature with message recovery, which we explain next. ‘Signature with recovery’ is rarely, if ever, applied in practice; we describe it since it is frequently referenced in the literature, and is a common cause of confusion among practitioners. Hopefully the text below will help to avoid such confusion.

RSA signature with message recovery.


RSA signatures with message recovery have the cute property that they only require transmission of one mod-n integer - the signature; the message itself does not need to be sent, as it is recovered from the signature. This would result in a small saving of bandwidth, compared to signature with appendix, when both methods are applicable. However, as we explain below, this method is rarely applicable; furthermore, it is a frequent cause of confusion.
RSA signatures with message recovery require the use of an invertible padding function R(·), which is applied to the messages to be signed. The main goal of R is to ensure sufficient, known redundancy in R(m) (this is why we denote it by R). This redundancy, applied to the message before the public key signature operation, should make it unlikely that a random value would appear to be a valid signature.
The output of R(m) is used as input to the RSA exponentiation; to ensure recovery of m, the value of R(m) must be less than the RSA modulus n. Note that this implies that m must be even shorter than R(m), since R(m) must contain all of m, plus the redundancy.
Once R is defined, the signature and validation operations for RSA with
Message Recovery (RSAwMR) would be:
S_s^{RSAwMR}(m) = [R(m)]^s mod n    (6.20)
V_v^{RSAwMR}(x) = R^{-1}(x^v mod n) if defined, else error    (6.21)

For validation to be meaningful, there should be only a tiny subset of the integers x s.t. x^v mod n is in the range of R, i.e., is the result of the mapping of some message m. Since there are at most n values of x^v mod n to begin with, this means that the range of R, i.e., the set of legitimate messages, must be tiny in comparison with n - which means that the message space must be really tiny.
In reality, messages being signed are almost always much longer than the
tiny message space available for signatures with message recovery. Hence, the
use of this method is almost non-existent. In fact, our description of signature
schemes (Figure 3.2) assumed that the message is sent along with its signature,
i.e., our definition did not even take into consideration schemes like this, which
avoid sending the original message entirely.
Note that RSA signatures with message recovery are often a cause of confusion, due to their syntactic similarity to RSA encryption. Namely, you may come across people referring to the use of ‘RSA encryption with the private key’ as a method to authenticate or sign messages. What they really mean is the use of RSA signatures with message recovery. We caution against such confusing terminology; RSA signatures are usually used with appendix, but even in the rare cases of using RSA signatures with message recovery, RSA signing is not the same as encryption with the private key!
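A toy sketch may help see how restrictive message recovery is. With a hypothetical redundancy function R(m) = 2^8·m (append eight zero bits) and the tiny modulus n = 3233 from our earlier toy key, only messages m ∈ [0, 12] can be signed at all - the 'tiny message space' in action:

```python
# Toy RSAwMR sketch (illustrative parameters of our own choosing).
n, v, s = 3233, 17, 2753
K = 8  # redundancy bits

def R(m: int) -> int:
    """Redundancy: append K zero bits; R(m) must stay below n."""
    assert 0 <= (m << K) < n, 'message too long for recovery'
    return m << K

def R_inv(x: int):
    """Recover m if x is in the range of R, else None (invalid signature)."""
    return x >> K if x % (1 << K) == 0 else None

def sign_mr(m: int) -> int:
    return pow(R(m), s, n)           # sigma = R(m)^s mod n   (as in Eq. 6.20)

def verify_mr(sigma: int):
    return R_inv(pow(sigma, v, n))   # recover m, or None     (as in Eq. 6.21)
```

Only the single integer `sigma` is transmitted; a random forgery attempt passes validation only with probability about 2^-K, which is why R must add substantial redundancy.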

6.7.2 Discrete-Log based signatures


[TBD]

6.8 Additional Exercises
Exercise 6.11. The Diffie-Hellman protocol is a special case of a key ex-
change protocol, defined by the pair of functions (KG, F ), as introduced in
subsection 6.1.3.
1. Present the Diffie-Hellman protocol as a key exchange protocol, i.e., define
the corresponding (KG, F ) functions.
2. We presented two assumptions regarding the security of the DH protocol: the computational-DH (CDH) assumption and the decisional-DH (DDH) assumption. Show that one of these assumptions does not suffice to ensure key-indistinguishability. What about the other one?
Exercise 6.12. It is proposed that to protect the DH protocol against an imposter, we add an additional ‘confirmation’ exchange after the protocol terminates with a shared key k = h(g^{ab} mod p). In this confirmation, Alice will send to Bob MAC_k(g^b) and Bob will respond with MAC_k(g^a). Show the message-flow of an attack, showing how an attacker (Monster) can impersonate Alice (or Bob). The attacker has ‘MitM capabilities’, i.e., it can intercept messages (sent by either Alice or Bob) and inject fake messages (incorrectly identifying itself as Alice or Bob).
Exercise 6.13. Suppose that an efficient algorithm to find discrete log is found,
so that the DH protocol becomes insecure; however, some public-key cryptosys-
tem (G, E, D) is still considered secure, consisting of algorithms for, respectively,
key-generation, encryption and decryption.
1. Design a key-agreement protocol which is secure against an eavesdropping
adversary, assuming that (G, E, D) is secure (as a replacement to DH).
2. Explain which benefits the use of your protocol may provide, compared
with simple use of the cryptosystem (G, E, D), to protect the confiden-
tiality of messages sent between Alice and Bob against a powerful MitM
adversary. Assume Alice and Bob do have known public keys.
Exercise 6.14. Assume that there is an efficient (PPT) attacker A that can find a specific bit of g^{ab} mod p, given only g^a mod p and g^b mod p. Show that the DDH assumption does not hold for this group, i.e., that there is an efficient (PPT) attacker A′ that can distinguish, with significant advantage over a random guess, between g^{ab} mod p and g^x for x taken randomly from [1, . . . , p − 1].
Exercise 6.15. It is frequently proposed to use a PRF as a Key Derivation Function (KDF), e.g., to extract a pseudo-random key k′ = PRF_k(g^{ab} mod p) from the DH-exchanged value g^{ab} mod p, where k is a uniformly random key (known to the attacker). Show a counterexample: a function f which is a secure PRF, yet insecure if used as a KDF. For your construction, you may use a secure PRF f′.

Figure 6.16: How not to ensure resiliency - illustration for Ex. 6.16

Figure 6.17: Insecure variant of the authenticated DH protocol, studied in Exercise 6.17

Exercise 6.16 (How not to ensure resilient key exchange). Fig. 6.16 illustrates
a slightly different protocol for using authenticated DH to ensure resilient key
exchange. Present a sequence diagram showing that this protocol is not secure.
Hint: show how an attacker is able to impersonate Alice, without knowing any of Alice’s previous keys; at the end of the handshake, Bob will believe
it has exchanged key k with Alice, but the key was actually exchanged with
the attacker.
Exercise 6.17 (Insecure variant of authenticated DH). The protocol in Fig. 6.17
is a simplified version of the authenticated DH protocol of Fig. 6.8.
1. Show a sequence diagram for an attack showing that this variant is inse-
cure. Hint: your attack may take advantage of the fact that there is no
validation of the incoming flows, except for validation of the MAC values.
2. Explain why this attack does not apply to the ratchet protocol (Fig. ??).

Exercise 6.18. The protocol in Fig. 6.18 is an (incorrect) attempt at a robust-combiner authenticated DH protocol.
1. Show a sequence diagram for an attack showing that this variant is inse-
cure.

Figure 6.18: Insecure ‘robust-combiner’ authenticated DH protocol, studied in
Exercise 6.18

2. Show a simple fix that achieves the goal (robust combiner authenticated
DH protocol).

Exercise 6.19. Alice and Bob use low-energy devices to communicate. To ensure secrecy, they run, daily, the Sync-DH-Ratchet protocol (Fig. 6.9), but want to further improve security by changing keys every hour, while avoiding additional exponentiations. Let k_i^j denote the key they share after the j-th hour of the i-th day, where k_i^0 = k_i (the key exchanged in the ‘daily exchange’ of Fig. 6.9).
1. Show how Alice and Bob should set their hourly shared secret key k_i^j.
2. Define the exact security benefit achieved by your solution.

Exercise 6.20. Assume it takes 10 seconds for any message to pass between
Alice and Bob.
1. Assume that both Alice and Bob initiate the ratchet protocol (Fig. ??) every 30 seconds. Draw a sequence diagram showing the exchange of messages between time 0 and time 60 seconds; mark the keys used by each of the two parties to authenticate messages sent and to verify messages received.
2. Repeat, if Bob’s clock is 5 seconds late.
3. Repeat, when using the ‘double-ratchet’ variant, where both parties per-
form the PRF derivation whenever 10 seconds passed since last changing
the key.

Exercise 6.21. In the ratchet protocol, as described (Fig. ??), the parties
derive symmetric keys ki,j and use them to authenticate data (application)
messages they exchange between them, as well as the first message of the next
handshake.
1. Assume a chosen-message attacker model, i.e., the attacker may define
arbitrary data (application) messages to be sent from Alice to Bob and

Figure 6.19: Authenticated DH with changing Master Keys (Ex. 6.24)

vice versa at any given time, and ‘wins’ if a party accepts a message never
vice verse at any given time, and ‘wins’ if a party accepts a message never
sent by its peer (i.e., that message passes validation successfully). Show
that, as described, the protocol is insecure in this model.
2. Propose a simple, efficient and secure way to avoid this vulnerability, by
only changing how the protocol is used - without changing the protocol
itself.
Exercise 6.22. The DH protocol, as well as the ratchet protocol (as described
in Fig. ??), are designed for communication between only two parties.
1. Extend DH to support key agreement among three parties.
2. Similarly extend the ratchet protocol.
Exercise 6.23. In the double-ratchet protocol, as described in class, at the beginning of ‘turn’ t in a party, the party uses the ‘current key’ k_{i,j}^t to derive two keys: k_{i,j}^{t+1}, to be used at the next ‘turn’, and k̂_{i,j}^{t+1}, used to authenticate and encrypt messages sent and received between the peers.
1. Explain why k̂_{i,j}^{t+1} is used to authenticate and encrypt, rather than using k_{i,j}^{t+1}.
2. Explain how to use k̂_{i,j}^{t+1} (to authenticate and encrypt messages sent between the peers).

Exercise 6.24 (Authenticated DH with changing Master Keys). Figure 6.19 shows a variant of the Authenticated DH protocol, where the master key is changing (as indicated). Assume that this protocol is run daily, from day i = 1, and where Mk_0 is a randomly-chosen secret initial master key, shared between Alice and Bob; messages on day i are encrypted using session key k_i. An attacker can eavesdrop on the communication between the parties on all days, and on days 3, 6, 9, . . . it can also spoof messages (send messages impersonating either Alice or Bob), and act as Monster-in-the-Middle (MitM). On the fifth day (i = 5), the attacker is also given the initial master key Mk_0.

Figure 6.20: Insecure variant of Ratchet DH Protocol (Ex. 6.25)

• By day ten, for which days can the attacker decrypt (learn) the messages?
• Show a sequence diagram of the attack, and list every calculation done by the attacker. For every value used by the attacker, explain why/how that value is known to the attacker.

Exercise 6.25 (Insecure variant of Ratchet DH). Figure 6.20 shows a vulnerable variant of the Ratchet DH protocol, using a (secure) pseudorandom function f to derive the session key. Assume that this protocol is run daily, from day i = 1, and where k_0 is a randomly-chosen secret initial key, shared between Alice and Bob; messages on day i are encrypted using key k_i. An attacker can eavesdrop on the communication between the parties on all days, and on days 3, 6, 9, . . . it can also spoof messages (send messages impersonating either Alice or Bob), and act as Monster-in-the-Middle (MitM). On the fifth day (i = 5), the attacker is also given the initial key k_0.

• On which day can the attacker first decrypt messages? Answer:

• On the day you specified, for which days can the attacker decrypt messages?
• Explain the attack, including a sequence diagram if relevant. Include every calculation done by the attacker.

Exercise 6.26 (Improving security of Sync-Ratchet). Alice and Bob use low-energy devices to communicate. To ensure secrecy, they run, daily, the Sync-Ratchet protocol (Fig. 5.12), but want to further improve security by changing keys every hour, while avoiding additional exponentiations. Let k_i^j denote the key they share after the j-th hour of the i-th day, where k_i^0 = k_i (the key exchanged in the ‘daily exchange’ of Fig. 5.12).
1. Show how Alice and Bob should set their hourly shared secret key k_i^j.
2. Define the exact security benefit achieved by your solution.
Hint: Compute k_i^j = PRF_{k_i^{j-1}}(‘next’).

Exercise 6.27. We saw that El-Gamal encryption (Eq. 6.12) may be re-randomized, using the recipient’s public key, and mentioned that this may be extended into an encryption scheme which is universally re-randomizable, i.e., where re-randomization does not require the recipient’s public key. Design such an encryption scheme. Hint: begin with El-Gamal encryption, and use, as part of the ciphertext, the result of encrypting the number 1.
Exercise 6.28. Design a simple and efficient protocol allowing a set of users
{1, . . . , n} to exchange messages securely among them, i.e., ensuring confiden-
tiality, authenticity and integrity of the messages. Your design may assume
that users know the public keys of each other; furthermore, you may assume
that each user i has two pairs of public-private keys: an encryption-decryption
key-pair (e(i), d(i)) and a signing-verifying key-pair (s(i), v(i)). Assume that
the public keys e(i), v(i) of every user i, are known to all users. Your descrip-
tion should be very clear, well-defined and concise, but can be high-level, e.g.,
use (a, b, c) ← α to denote ‘splitting’ a tuple α = (a, b, c) into its components.
Please read all three parts below before answering; the third part is an additional requirement, which you may already meet in your answers to parts 1 and 2, saving you time when answering part 3.
1. A send function (process) that receives message m, receiver identifier i, a
sender identifier j and the sender’s private keys d(j), s(j), and produces
a ‘ciphertext’ c to be sent to i, i.e., c = sendj,d(j),s(j) (m, i).
2. A receive function (process) that receives ‘ciphertext’ c, receiver identifier
i, and the receiver’s private keys d(i), s(i), and produces message m and
sender identifier j, if c was output of send executed by j on message m
with receiver i, and an error indicator ⊥ otherwise.
3. Another goal is to prevent recipients from proving to somebody else that
the sender sent the message (as when showing signed message). Is this
provided by your solutions? Why? If not, present processes for sending
and receiving to support this goal.
Exercise 6.29. The RSA algorithm calls for selecting e and then computing d to be its inverse (mod φ(n)). Explain how the key owner can efficiently compute d, and why an attacker cannot do the same.
Exercise 6.30. The RSA key generation algorithm requires the selection of
two large primes p, q. Would it be secure to save time by using p = q? Or first
choose p, then let q be the next-largest prime?
Exercise 6.31 (Tiny-message attack on textbook RSA). We discussed that
RSA should always be used with appropriate padding, and that ‘textbook RSA’
(no padding) is insecure, in particular, is not probabilistic so definitely does not
ensure indistinguishability.
1. Show that textbook RSA may be completely decipherable, if the message
length is less than |n|/e. (This is mostly relevant for e = 3.)

2. Show that textbook RSA may be completely decipherable, if there is only
a limited set of possible messages.
3. Show that textbook RSA may be completely decipherable, if the message
length is less than |n|/e, except for a limited set of additional (longer)
possible messages.
Exercise 6.32. Consider the use of textbook RSA for encryption (no padding).
Show that it is insecure against a chosen-ciphertext attack.

Exercise 6.33. Consider a variation of RSA which uses the same modulus N for multiple users, where each user, say Alice, is given her key-pair (A.e, A.d) by a trusted authority (which knows the factoring of N and hence φ(N)). Show that one user, say Mal, given his keys (M.e, M.d) and the public key of another user, say A.e, can compute A.d. Note: recall that each user's private key is the inverse of the public key (mod φ(N)), e.g., M.e = M.d^{-1} mod φ(N).

Exercise 6.34. Public-key algorithms often use the term ‘public key’ to refer to only one component of the public key. For example, with RSA, people often refer to e as the public key, although the actual RSA public key consists of the pair (e, n), i.e., also includes the modulus n.
Consider an application which receives an RSA signature under public key (e_A, n), where e_A is the same as in the public key (e_A, n_A) of user Alice, but n ≠ n_A; however, the application still concludes that this is a valid signature by Alice. Show how this allows an attacker to trick the recipient into believing - incorrectly - that an incoming message (sent by the attacker) was signed by Alice.
Note: a similar situation exists with other public-key algorithms, e.g., elliptic curves, where the public key consists of a specification of a curve and of a particular ‘public point’ on the curve, but often people refer only to the point as if it were the (entire) public key. In particular, this led to the ‘Curveball’ vulnerability in the Windows certificate-validation mechanism [154], which was due to validation of only the ‘public point’, with use of the curve selected by the attacker.

Exercise 6.35. You are given textbook-RSA ciphertext c = 281, with public
key e = 3 and modulus n = 3111. Compute the private key d and the message
m = cd mod n.

Hint: it is probably best to begin by computing the factorization of n.

Exercise 6.36. Consider the use of textbook RSA for encryption as well as for
signing (using hash-then-sign), with the same public key e used for encryption
and for signature-verification, and the same private key d used for decryption
and for signing. Show this is insecure against chosen-ciphertext attacks, i.e.,
allows either forged signatures or decryption.

Exercise 6.37. The following design is proposed to send email while preserv-
ing sender-authentication and confidentiality, using known public encryption

and verification keys for all users. Namely, assume all users know the pub-
lic encryption and verification keys of all other users. Assume also that all
users agree on public key encryption and signature algorithms, denoted E and
S respectively.
When one user, say Alice, wants to send message m to another user, say Bob, she computes and sends: c = E_{B.e}(m ++ ‘A’ ++ S_{A.s}(m)), where B.e is Bob’s public encryption key, A.s is Alice’s private signature key, and ‘A’ is Alice’s (unique, well-known) nickname, allowing Bob to identify her public keys. When Bob receives this ciphertext c, he first decrypts it, which implies it was sent to him. He then identifies, by the ‘A’, that it was purportedly sent by Alice. To validate this, he looks up Alice’s public verification key A.v, and verifies the signature.
1. Explain how a malicious user Mal can cause Bob to display a message
m as received from Alice, although Alice never sent to Bob that mes-
sage. (Alice may have sent a different message, or sent that message to
somebody else.)
2. Propose a simple, efficient and secure fix.

Exercise 6.38 (Signcryption). Many applications require both confidentiality,


using recipient’s public encryption key, say B.e, and non-repudiation (signa-
ture), using sender’s verification key, say A.v. Namely, to send a message to
Bob, Alice uses both her private signature key A.s and Bob’s public encryption
key B.e; and to receive a message from Alice, Bob uses his private decryption
key B.d and Alice’s public verification key A.v.

1. It is proposed that Alice will select a random key k and send to Bob the triplet: (c_K, c_M, σ) = (E_{B.e}(k), k⊕m, Sign_{A.s}(‘Bob’ ++ k⊕m)). Show this design is insecure, i.e., a MitM attacker may either learn the message m or cause Bob to receive a message ‘from Alice’ that Alice never sent.
2. Propose a simple, efficient and secure fix. Define the sending and receiv-
ing process precisely.
3. Extend your solution to allow prevention of replay (receiving multiple
times a message sent only once).

Chapter 7

The TLS/SSL protocols for web-security and beyond

In this chapter, we discuss the Transport-Layer Security (TLS) protocol, which is the main protocol used to secure connections over the Internet - and, in
particular, web-communication. TLS is currently at version 1.3, and we will
also discuss earlier versions: TLS 1.2, 1.1 and 1.0, and versions 2.0 and 3.0 of
SSL - the predecessor of TLS.
The SSL/TLS protocol is arguably the most ‘successful’ security protocol -
it is definitely very widely used; but even more significantly, it is used in more
diverse scenarios and environments than any other security protocol.
From a security point of view, the wide popularity of SSL/TLS is a double-
edged sword. On the one hand, this wide popularity implies that ‘black-hat
crackers’ have a strong motivation to find vulnerabilities in the SSL/TLS pro-
tocols and in their popular implementations. In fact, the desire to be able to
‘break’ secure connections, even motivates powerful organizations such as the
NSA to invest extensive efforts in ‘injecting’ intentional, hidden vulnerabilities
(backdoors) into the protocols and implementations. One important example
are the vulnerabilities built-into the Dual-EC Deterministic Random Bit Gen-
erator (DRBG) [43], which was later ‘pushed’ into widely-used libraries used
by SSL/TLS implementations - all as part of the efforts of the NSA to improve
its abilities to eavesdrop on communications.
On the other hand, this wide popularity also motivates extensive efforts by
the ‘white-hat’ security community, including researchers from academia and
industry, to identify vulnerabilities and improve the security of the protocols
and their implementations. As a result, many extensions and changes have been
proposed over the years to improve security, as well as to adapt the protocol
to new requirements or scenarios. Some of these extensions and changes were
adopted as an inherent part of a new revision of the protocol, and many others
can be deployed using the built-in extensions mechanism, which itself was added
as an extension (mostly from TLS version 1.1).
Figure 7.1: Overview of the use of TLS/SSL, with server public key certificate, to secure the connection between browser and website.

This wide use of SSL/TLS also has dual implications for learning and teaching SSL/TLS. On the one hand, being such an important, widely-used and
widely-studied protocol, implies that this is probably the most important and
most interesting protocol to study. On the other hand, this also means that
there is an excessive wealth of important and interesting information - indeed,
entire books were dedicated to cover SSL/TLS, e.g., [137, 143], and even they
do not cover all aspects. We have tried to maintain a reasonable balance; how-
ever, there were many hard choices and surely there is much to improve; as in
other aspects, your feedback would be appreciated.

7.1 Introducing TLS/SSL


7.1.1 TLS/SSL: High-level Overview
The TLS and SSL protocols were originally designed to secure the commu-
nication between a web-browser and a web-server, and, while they are now
widely deployed for additional applications, web-security remains their main
application. We present a highly-simplified overview of this typical use-case in
Figure 7.1.
In Figure 7.1, we show Alice, a typical web user, browsing to the protected
website [Link]. Notice that the URL begins with the protocol name https, rather than the protocol name http, used for unprotected web sites. The
process consists of the following steps:

1. The user (Alice) enters the desired Universal Resource Locator (URL);
in the given example, the URL is [Link] The URL consists
of the protocol (https) and the domain name of the desired web-server
([Link]); in addition, the path may contain identification of a specific
object in the server. In this example, Alice does not specify any specific
object; and the browser considers this a request for the default object
[Link]. The choice of the https protocol, instructs the browser to

open a secure connection, i.e., send the HTTP requests over an SSL/TLS
session, rather than directly over an unprotected TCP connection. Note
that the request may be specified in one of three ways: (1) by the user
‘typing’ the URL into the address-bar of the browser, i.e., ‘manually’, (2)
‘semi-automatically’, by the user clicking on a hyperlink which specifies
this URL, or (3) ‘fully automatically’, by a script running in the browser;
scripts can instruct the browser to request different objects.
2. This step is used, if necessary, to resolve the Internet-Protocol (IP) ad-
dress of the domain name of the server ([Link]). The step is not
necessary, and hence skipped, if this IP address is already known (e.g.,
cached from previous connection). To find the IP address, the browser
sends a resolve request to a Domain Name System (DNS) server, spec-
ifying the domain ([Link]). The server responds with the IP address
of the [Link] domain, e.g., [Link].
3. The browser sends the TCP SYN message to the web-server’s IP address
([Link]), and the server responds with the TCP SYN+ACK message,
thereby opening the TCP connection between the browser and the web-
server.
4. The browser sends the TLS Hello message, and the [Link] responds
with its public key (pk) and its certificate, linking the public key pk to
the domain name [Link], and signed, much in advance, by a trusted
certificate authority (CA). The client validates pk based on the certificate,
and then uses the public key pk to encrypt a random, secret shared key.
These steps are referred to as the TLS handshake.
5. Next, the client and server communicate in a TLS session, sending messages back and forth. This includes sending GET, POST and other requests from the browser to the web-server, and receiving the responses.
The figure shows some typical flows.
6. Finally, the browser displays the page to the user (Alice), together with
a few security indicators such as the URL and the padlock.
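Steps 2-5 above map closely to a few lines of code. The following Python sketch, using the standard ssl module (with example.com standing in for the site in the figure), resolves the name and opens the TCP connection, runs the TLS handshake with certificate validation against the system's trusted CAs, and sends a request over the protected session:

```python
import socket, ssl

# create_default_context() loads the system's trusted CA certificates and
# enables certificate + hostname validation - i.e., the checks of step 4.
ctx = ssl.create_default_context()

def https_get(hostname: str = 'example.com') -> bytes:
    """Fetch '/' from hostname over HTTPS (sketch; no error handling)."""
    with socket.create_connection((hostname, 443)) as tcp:           # steps 2-3
        with ctx.wrap_socket(tcp, server_hostname=hostname) as tls:  # step 4: handshake
            req = f'GET / HTTP/1.1\r\nHost: {hostname}\r\nConnection: close\r\n\r\n'
            tls.sendall(req.encode())                                # step 5: request
            return tls.recv(4096)                                    # step 5: response
```

If certificate validation fails (wrong name, untrusted issuer, expired certificate), `wrap_socket` raises an exception and no application data is sent.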

7.1.2 TLS/SSL: security goals


TLS and SSL are designed to ensure security between two computers, usually
referred to as a client and a server, in spite of attacks by a MitM (Monster-in-
the-Middle) attacker. Table 7.1 lists these security goals, and indicates whether
they are achieved by different versions of the TLS/SSL handshake protocol.
Most goals are met by all versions - except when vulnerabilities allowed attackers to circumvent some of the goals. A few goals, mainly Perfect Forward Secrecy (PFS), were not achieved by SSLv2, but all were achieved by later versions.
The goals of the different versions of TLS/SSL, as summarized in Table 7.1,
include:

Goal                    SSLv2  SSLv3  TLS1.0-1.2  TLS1.3
Key exchange              ●      ●        ●          ●
Server authentication     ●      ●        ●          ●
Client authentication     ●      ●        ●          ●
Connection integrity      ●      ●        ●          ●
Confidentiality           ●      ●        ●          ●
Cipher agility            ◐      ●        ●          ●
PFS                       ○      ●        ●          ●

Table 7.1: TLS/SSL: security goals vs. versions. ●: achieved, ◐: achieved partially, ○: not achieved. Note that known attacks and vulnerabilities may cause failure to achieve a goal (even where marked here as ‘achieved’) - we present several of these.

Key exchange: securely set up a secret shared key, preventing exposure of this key to a MitM attacker.
Server authentication: authenticate the identity of the server, i.e., assure
the client that it is communicating with the right server.
Client authentication: authenticate the identity of the client. Client au-
thentication is optional; in fact, TLS/SSL is usually used without client
authentication, allowing an anonymous, unidentified client to connect to
the server. When client authentication is desired, it is usually performed
by sending a secret credential within the TLS/SSL secure connection,
such as a password or cookie.
Connection Integrity: Ensure that the communication received by one party is exactly identical to the communication sent by the peer (in spite of a MitM attacker). This includes preventing message re-ordering and truncation attacks. Of course, an attack - or even a benign failure - could disrupt communication, leaving some information sent but never received by the peer; TLS/SSL would detect such events.
Connection confidentiality: Ensure that a MitM attacker cannot learn anything about the information sent between the two parties, except for the ‘traffic pattern’ - the amount of information sent and received.
Cipher agility: Cipher agility allows us to change the cryptographic algo-
rithms used (i.e., through cipher-suite negotiation). Cipher agility is an
important property of cryptographic protocols; in particular, it is essen-
tial, when a vulnerability is found or suspected in a particular algorithm
(mainly, cryptosystem, MAC, or hash).
Perfect forward secrecy (PFS): From version 3 of SSL, the SSL/TLS hand-
shakes support the (optional) use of authenticated DH key agreement,
which ensures perfect forward secrecy (PFS), as discussed in subsec-
tion 6.4.1.
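The idea behind DH-based PFS can be sketched in a few lines: both sides use fresh, ephemeral exponents, which are erased after the handshake, so a recording of the traffic cannot later be decrypted even if long-term keys leak. (Toy parameters of our own choosing, for illustration only; real deployments use groups of at least 2048 bits or elliptic curves.)

```python
import secrets, hashlib

p, g = 2_147_483_647, 5   # tiny prime (2^31 - 1) and base - NOT secure sizes

a = secrets.randbelow(p - 2) + 1   # Alice's ephemeral secret exponent
b = secrets.randbelow(p - 2) + 1   # Bob's ephemeral secret exponent
A, B = pow(g, a, p), pow(g, b, p)  # the only values sent on the wire

# Both sides derive the same session key from g^{ab} mod p; once a and b
# are erased, the recorded A, B alone do not reveal the key.
k_alice = hashlib.sha256(str(pow(B, a, p)).encode()).digest()
k_bob   = hashlib.sha256(str(pow(A, b, p)).encode()).digest()
```

In TLS, the authenticated-DH variants sign these ephemeral values, so a MitM cannot substitute its own A or B.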

Robust cryptography. Cipher agility helps us to recover after a possible cryptographic vulnerability is identified; until then, the system could be vulnerable. In contrast, cryptographic robustness requires the protocol to ensure its security properties even while some of its cryptographic modules or assumptions are weak. In particular, the protocol should maintain security even if its sources of randomness are imperfect, and when one of its cryptographic algorithms (‘building blocks’) is vulnerable, i.e., deploy a robust-combiner design. Cryptographic robustness was not defined as an explicit goal of SSL/TLS; however, there have been several elements of robust cryptographic design since SSLv3.

7.1.3 SSL/TLS: Engineering goals


In addition to the security goals, the success of TLS/SSL is largely due to its
focus - from the very first versions - on the generic ‘engineering goals’, appli-
cable to any system, of efficiency, ease of deployment and use, and flexibility.
By addressing these goals, TLS/SSL became widely used and applicable in a very
wide range of applications and scenarios. Let us briefly discuss these three
engineering goals.

Efficiency - and session resumption. Efficiency is always a desirable goal.


In the case of TLS/SSL, there are two main efficiency considerations: compu-
tational overhead and latency. In terms of computational overhead, the main
consideration is minimizing the computationally-intensive public-key opera-
tions. To minimize public-key operations, once the handshake establishes a
shared key (using public key operations), the parties may reuse this key to es-
tablish future connections without requiring additional public-key operations.
We refer to the set of connections based on the same public-key exchange as
a session, and a handshake that reuses the pre-exchanged shared key as a
session-resumption handshake.
In terms of minimizing latency, the main consideration is to minimize the
number of round-trip exchanges. End-to-end delays are typically on the order
of tens to hundreds of milliseconds, which is usually much higher than the trans-
mission delays, esp. for the limited amount of information sent in a TLS/SSL
exchange. Reducing the number of round trips became even more important as
transmission speeds increased; this is reflected by the fact that until TLS 1.3,
all designs had a fixed number of two round-trips to complete the handshake,
only then allowing the client to send a protected message (already in the third
exchange). In contrast, a TLS 1.3 handshake requires only a single round-
trip (before sending a protected message), and even allows the clients to send
a request already in the first exchange (with some limitations and somewhat
reduced security properties, see later).
A more minor efficiency consideration is minimization of bandwidth; this is
mainly significant in scenarios where bandwidth is limited, such as very noisy
wireless connections.

[Figure 7.2 diagram: the TCP/IP protocol stack, top to bottom - applications
(TLS handshake; HTTPS, ...; HTTP, ...); TLS record; TCP sockets API; TCP; IP]
Figure 7.2: Placement of TLS/SSL in the TCP/IP protocol stack. The TLS
handshake protocol is marked in green; it establishes keys for, and also uses,
the TLS/SSL record protocol (also in green). We mark in yellow the HTTPS
protocol and other application protocols that use the TLS/SSL record protocol.
Application protocols that do not
use TLS/SSL for security, including the HyperText Transfer Protocol (HTTP),
are marked in pink. These protocols, as well as TLS/SSL itself, all use the TCP
protocol, via the sockets library layer. TCP ensures reliable communication,
on top of the (unreliable) Internet Protocol (IP).

Extensibility and versatility. The SSL/TLS protocol is arguably the most


‘successful’ security protocol - it is definitely very widely used; but even more
significantly, it is used in more diverse scenarios and environments than any
other security protocol. The use of the protocol in such diverse scenarios is
based on its extensibility and versatility. The protocol supports many optional
mechanisms, e.g., client authentication, and flexibility such as cipher-agility;
furthermore, from TLS version 1.1 and even earlier, the TLS protocol supports
a built-in extension mechanism, providing even greater flexibility.

Ease of deployment and use. Finally, the success and wide-use of the
SSL/TLS protocols are largely due to their ease of deployment and usage. As
shown in Figure 7.2, the TLS/SSL protocol is typically implemented ‘on top’
of the popular TCP sockets API, and then used by applications, directly or via
the HTTPS or other protocols. This architecture makes it easy to install and
use SSL/TLS, without requiring changes to the operating-system and kernel.
This is in contrast to some of the other communication-security mechanisms, in
particular the IPsec protocol [60,74], which, like TLS, is also an IETF standard.

7.1.4 TLS/SSL and the TCP/IP Protocol Stack


See Figure 7.2 for the placement of the TLS protocols with respect to the
TCP/IP protocol stack, and Figure 7.3 for a typical connection.
[This subsection is yet to be written; this material is well covered in many
textbooks on networking, e.g., [112].]

Figure 7.3: Phases of TLS/SSL connection. The black flows (Syn+Ack and
later Fin+Ack) are the TCP connection setup and tear-down exchanges, re-
quired to ensure reliability. The fuchsia flows represent the TLS/SSL hand-
shake; notice there are often more than the three shown. The blue flows rep-
resent the data transfer, protected using TLS/SSL record layer; and the red
flows represent the TLS/SSL connection tear-down exchange.

7.1.5 The SSL/TLS record protocol


We now discuss the SSL/TLS record protocol; our discussion is very brief, since
our focus is on the handshake protocol.
The SSL/TLS record protocol assumes that it is run ‘on-top’ of an underly-
ing reliable communication protocol - typically, TCP. Hence, it assumes that,
if there is no attack, messages sent are received reliably, without losses,
duplications or re-ordering; any deviation must indicate an attack and justifies
closing the connection.
The record protocol provides four services, in the following order:
Fragment: break the TCP stream into fragments. Each fragment consists
of up to 16KB. A motivation for fragmenting is to allow pipeline opera-
tion, reducing the latency. For example, we may already be sending the
first fragment, while encrypting and computing the MAC for the second
fragment and receiving the third fragment.
Compress: apply lossless compression to each fragment. Compression may
reduce the processing overhead and the communication. Note that ci-
phertext cannot be compressed – therefore, if we want to compress, this
must be done before encryption. Note, however, that the length of com-
pressed data depends on the amount of redundancy, and that encryption
may not necessarily hide the length of the (compressed) plaintext; hence,
there is a potential risk of exposure of some measure of the redundancy
of data when applying compress-then-encrypt. Indeed, the fact that SS-
L/TLS applies compression before encryption was exploited in several
compression attacks, including TIME and CRIME [10,130,145,153]. Note
that even if TLS compression is disabled, compression attacks may still
be possible by exploiting application-level compression, as done in the
BREACH attack.
Authenticate: SSL/TLS authenticates the plaintext by applying a MAC func-
tion to the concatenation of the sequence number, type, version, length

Figure 7.4: The SSL/TLS record layer protocol.

and the compressed plaintext fragment. The outcome is concatenated


to the plaintext before applying encryption. The sequence number is the
number of bytes in the stream of plaintext prior to this fragment; the type
identifies the application protocol (e.g., http); the version identifies the
TLS/SSL version, and the length is the number of bytes in the fragment.
Encrypt: SSL/TLS encrypts the concatenation of the compressed plaintext
fragment, MAC and, if necessary, padding. Padding is required when
using a mode-of-operation of a block cipher; it is not required for stream
ciphers.
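To make the order of these four services concrete, the following Python sketch mimics the sender-side record processing (fragment, compress, MAC, then encrypt). The MAC input fields follow the description above; however, the byte encodings, the HMAC-SHA256 MAC, and the XOR ‘cipher’ are stand-ins for illustration only, not the actual SSL/TLS algorithms.

```python
import hmac, hashlib, zlib

FRAGMENT_SIZE = 16 * 1024  # each fragment is at most 16KB

def xor_encrypt(key: bytes, data: bytes) -> bytes:
    # Placeholder "cipher": XOR with a keystream derived from the key.
    # Real SSL/TLS uses the negotiated cipher (e.g., RC4, 3DES, AES).
    stream = b''
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, 'big')).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def protect_stream(data, mac_key, enc_key, rec_type=23, version=b'\x03\x03'):
    """Fragment, compress, MAC, then encrypt an outgoing plaintext stream."""
    records = []
    for i in range(0, len(data), FRAGMENT_SIZE):
        fragment = zlib.compress(data[i:i + FRAGMENT_SIZE])  # optional compression
        # MAC covers the sequence number (plaintext bytes sent so far),
        # type, version, length, and the (compressed) fragment, as above.
        header = (i.to_bytes(8, 'big') + bytes([rec_type]) + version
                  + len(fragment).to_bytes(2, 'big'))
        tag = hmac.new(mac_key, header + fragment, hashlib.sha256).digest()
        records.append(xor_encrypt(enc_key, fragment + tag))  # encrypt fragment||MAC
    return records
```

The receiver performs the steps in reverse: decrypt, verify the MAC (aborting the connection on failure), decompress, and reassemble the stream.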

7.2 The beginning: the handshake protocol of SSLv2


We begin our discussion of the SSL/TLS handshake protocols by presenting, in
this section, the SSLv2 (SSL version 2.0) handshake protocol, and discussing
its features - and some of its main vulnerabilities and deficiencies.
SSL version 2 is the earliest published version of the SSL protocol. The
SSLv2 handshake protocol is interesting - and not just for the historical impor-
tance of being the first published version of TLS/SSL. One motivation to study
it is that SSLv2 already introduces many of the basic concepts and designs
used in later versions - and, since it is a bit simpler, it is a good way for us
to introduce these basic TLS/SSL concepts and designs. Second, SSLv2 has
some serious vulnerabilities, which are instructive. Third, amaz-
ingly enough, it was recently shown that a significant fraction of web servers
still support SSLv2, although they also support (and prefer) later versions;
furthermore, this allows attacks on clients, even when the client supports only
newer versions of TLS/SSL.
The SSLv2 handshake is already a non-trivial cryptographic protocol, with
support for multiple options and mechanisms - all supported also by later ver-
sions (of SSL and TLS), often with extensions and improvements. We describe
the protocol in the following three subsections. In §7.2.1 we present the ‘basic’

Client C Server S ([Link])
Client hello: client version (vC ), client random (rC )

Server hello: server random (rS ) and certificate: SCA.s (S.e, [Link], . . .)

Client key exchange: ES.e (kM );


Client finish: EkC (rS )

Server finish: EkS (rC )

Figure 7.5: ‘Basic’ SSLv2 handshake. The client selects randomly a shared
master key kM , encrypts it using RSA encryption with the server’s public
key S.e, and sends to the server. The client key kC and server key kS are
derived from the master key kM and the randoms rC , rS using the MD5 crypto-
hash, see Eqs. (7.1,7.2). This ‘basic’ handshake does not include ciphersuite
negotiation, session resumption and client authentication, which we illustrate
in the following figures.

handshake, namely, the handshake when there is no existing session (already


established shared key), and the protocol uses public-key operations to share a
key. In contrast, in §7.2.2 we present the session resumption handshake, allow-
ing to re-use the public key exchanged in a previous handshake between the
same client and server, to open a new connection without additional public key
operations. In §7.2.3 we discuss how SSLv2 handles ciphersuite negotiation,
and explain how an attacker may exploit the (insecure) SSLv2 ciphersuite ne-
gotiation mechanism, to launch a simple downgrade attack. Finally, in §7.2.4
we discuss how SSLv2 supports the (optional) client-authentication feature.

Terms of SSLv2. Note that SSLv2, as described in the original publica-


tions, e.g., in [96], uses several terms which were modified in later versions; for
simplicity and consistency, we use the later terms also when describing SSLv2.

7.2.1 SSLv2: the ‘basic’ handshake


In this subsection we discuss the ‘basic’ SSLv2 handshake, illustrated in Fig. 7.5,
which is a somewhat simplified version of the SSLv2 handshake protocol. In
particular, this simplified version does not include ciphersuite negotiation, ses-
sion resumption and client authentication. We discuss these additional aspects
of SSLv2 in the following subsections.

Key-derivation and randomization in SSLv2. The SSLv2 handshake


protocol establishes a shared master key, which we denote kM . The master key
is selected by the client, and simply sent encrypted to the server in the Client
key exchange message, as ES.e (kM ).

The public key encryption of kM is the most computationally-intensive
operation by the client; therefore, it is desirable for the protocol to be secure
even if the client reuses the same master key kM and its encryption ES.e (kM ) in
multiple connections, assuming that the master key was not exposed. To ensure
this, the client and server each select and exchange a random ‘nonce’, rC and
rS , respectively, and derive the cryptographic keys that protect communication in
the connection from the master key together with both these random numbers.
The use of both random numbers rC and rS is required, to ensure that a
different key is used in different connections. This is essential, to prevent replay
of messages - from either client or server; see exercise 7.1.

Exercise 7.1. For costs and energy savings, IoT devices often have limited
computation power, and often do not have a source of random bits, and no
non-volatile memory. Consider such an IoT-lock, used to lock/unlock a car or
a door. The IoT-lock is implemented as a simple web server, and uses SSLv2
to authenticate the client requests. Specifically, to unlock (or lock), the client
sends to the IoT-lock the corresponding command (‘unlock’ or ‘lock’), together
with a ‘secret password/code’, say consisting of 20 alphanumeric characters.
Assume that the IoT-lock uses the same server-random string rS in all con-
nections, selected randomly upon initialization of the IoT-lock. Show a
message sequence diagram demonstrating how a MitM may trick the IoT-lock
into unlocking, by replaying messages from a legitimate connection between the
client and the IoT-lock. Note: this attack works for any version of SSL/TLS,
for implementations which reuse rS .

In fact, in SSLv2, the parties derive and use two keys from kM and the
random nonces rC , rS : the client key kC , used by the client to protect messages
it sends, and the server key kS , used by the server to protect messages it sends.
These are derived as follows:

kC = MD5(kM ++ “1” ++ rC ++ rS )        (7.1)
kS = MD5(kM ++ “0” ++ rC ++ rS )        (7.2)
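In Python, using the standard hashlib library, the derivation of Eqs. (7.1, 7.2) can be sketched as follows (the byte encodings here are illustrative; the SSLv2 specification fixes the exact formats):

```python
import hashlib

def sslv2_connection_keys(k_m: bytes, r_c: bytes, r_s: bytes):
    """Derive client and server keys from the master key and both nonces,
    as in Eqs. (7.1, 7.2): kC = MD5(kM ++ "1" ++ rC ++ rS), and
    kS computed the same way with "0" in place of "1"."""
    k_c = hashlib.md5(k_m + b"1" + r_c + r_s).digest()
    k_s = hashlib.md5(k_m + b"0" + r_c + r_s).digest()
    return k_c, k_s
```

Because both rC and rS enter the derivation, reusing kM across connections still yields fresh connection keys, as discussed above.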

The SSLv2 designers chose, wisely, to separate between kC , used to protect


messages from client to server, vs. kS , used to protect messages from server
to client. In particular, many websites are public, and send exactly the same
information to all users; however, we may want to protect the confidentiality
of the contents, e.g., queries, sent by the users. By separating between kC and
kS , the attacker cannot use the large amount of known plaintext sent from
server to client, to cryptanalyze the ciphertext sent from client to server. This
separation follows the principle of key separation (§2.5.9).
However, the SSLv2 design does not fully follow the principle of key-separation.
In particular, it uses the same key for confidentiality (encryption) and message-
authenticity (MAC). This is improved in later versions of SSL/TLS, which we
present in the following sections.

Client Server
Client hello: client random (rC ), ID

Server hello: server random (rS )

Client finish: EkC (rS )

Server finish: EkS (rC )

Figure 7.6: SSLv2 handshake, with ID-based session resumption. The client-
hello includes session-ID, received from server in a previous connection. If
the server does not have the (ID, kM ) pair, then it simply ignores the ID
and sends server-hello, as in the ‘basic’ handshake (Fig.7.5). However, if the
server still has (ID, kM ), from a previous connection, then it reuses kM , i.e.,
‘resumes the session’, and derives new shared keys from it (using Eqs. (7.1, 7.2)).
This avoids the public key operations, encryption by client and decryption by
server of master key kM , as well as the overhead of transmitting the certificate
SCA.s (S.e, [Link]).

7.2.2 SSLv2: ID-based Session Resumption


The main overhead of the SSL/TLS protocol is due to the computationally-
intensive public key operations. Often, there are multiple connections between
the same (client, server) pair, over a short period of time; in such cases, the
server and client may re-use the master key exchanged previously, thereby
avoiding additional public key operations. To identify the re-use of the same
master key, the server includes an identifier ID at the end of a handshake
where the key is exchanged, and the client sends this ID with its client-hello,
to re-use the same master key in another connection (without additional public
key operations). This process is called session resumption, and we illustrate it
in Figure 7.6.
The impact of session-resumption can be quite dramatic. The savings are
mostly on the computation (CPU) time; instead of computing public-key en-
cryption (for client) and decryption (for server) for every TCP connection, we
now require these operations only for the first TCP connection in a ses-
sion. The ratio of the computation time without and with session resumption
is typically on the order of 100 for typical usage, such as for protecting web
communication using the https protocol, i.e., running http over SSL/TLS.
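Server-side, ID-based resumption amounts to a lookup in a session cache. The following sketch (with hypothetical names, and ignoring cache expiration and size limits) illustrates the logic; generating a fresh random master key stands in for the public-key exchange of a full handshake:

```python
import os

session_cache = {}  # maps session-ID -> master key (server-side state)

def server_handle_hello(session_id=None):
    """Return (session_id, master_key, resumed); on a cache miss, fall back
    to a full handshake (modeled here as generating a fresh master key)."""
    if session_id is not None and session_id in session_cache:
        return session_id, session_cache[session_id], True   # resume session
    new_id = os.urandom(16)
    master_key = os.urandom(24)  # stands in for the public-key exchange
    session_cache[new_id] = master_key
    return new_id, master_key, False
```

The dictionary is exactly the per-session server state whose cost motivates the stateless session-ticket mechanism discussed below.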
However, the ID-based session resumption mechanism, which is the only
one supported in SSLv2, has a significant drawback: it requires the server to be
stateful, specifically, to maintain state for each session (for the session-ID and
the master key). In the typical case where the same web-server is running over
multiple machines, this requires that this storage be shared between all of these

Client Server
Client hello: version (vC ), client random (rC ), cipher-suites

Server hello: server random (rS ),


certificate: SCA.s (S.e, [Link], . . .) and cipher-suites

Client key exchange: cipher-suite, ES.e (kM );


Client finish: EkC (rS )

Server finish: EkS (rC )

Figure 7.7: SSLv2 handshake, with ciphersuite negotiation. Both client and
server indicate the cipher-suites they support in their respective hello messages;
then, the client specifies its preferred cipher-suite in the key exchange message
(this aspect was modified in later versions).

servers, or to ensure that a client will contact the same machine each time - a
difficult requirement that sometimes is infeasible. These drawbacks motivate
the adoption of alternative methods for session resumption, most notably, the
session-token mechanism that we discuss later (and which requires the use of
TLS).
Note that the session resumption protocol is one reason for requiring the
use of client and server random numbers; see the following exercise.
Exercise 7.2. Consider implementations of the SSLv2 protocol, where the (1)
client random or (2) server random fields are omitted (or always sent as a fixed
string). Show a message sequence diagram for two corresponding attacks, one
allowing replay of messages to the client, and one allowing replay of messages
to the server.

Hint: perform replay of messages from one connection to a different connection
(both using the same master key, i.e., same session).

7.2.3 SSLv2: ciphersuite negotiation and downgrade attack


In §2.7 we presented the cryptographic building blocks principle (Principle 8),
stating that systems should base security on few, well-defined building blocks.
The goals of this principle include cipher-agility, i.e., allowing flexibility, re-
placement and upgrade of the cryptographic mechanisms. For example, in
§5.5.3 we discuss how GSM supports cipher-agility, to allow use of the weak-
security, exportable A5/2 encryption scheme, as well as of the stronger, non-
exportable encryption algorithms A5/1 and A5/3; we also discuss how the
GSM support for cipher-agility is vulnerable to downgrade attack.
SSLv2 also supports cipher-agility. Figure 7.7 illustrates the SSLv2 cipher-
suite negotiation mechanism, and Fig 7.8 is an example of the negotiation
process, when the client supports three ciphersuites, all using the MD5 hashing

296
Client Server
Client hello: client random (rC ),
cipher-suites=RC4 128 MD5, RC4 40 MD5, DES 64 MD5

Server hello: server random (rS ), certificate: SCA.s (S.e, [Link], . . .)


and cipher-suites=RC4 128 MD5, RC4 40 MD5

Client key exchange: RC4 128 MD5, ES.e (kM );


Client finish: EkC (rS )

Server finish: EkS (rC )

Figure 7.8: SSLv2: example of cipher-suite negotiation.

Client MitM Server


Client hello: client random (rC ), Client hello: client random (rC ),
cipher-suites=RC4 128 MD5, RC4 40 MD5 cipher-suites=RC4 40 MD5

Server hello: server random (rS ), certificate: SCA.s (S.e, [Link], . . .)


and cipher-suites=RC4 40 MD5

Client key exchange (RC4 40 MD5): ES.e (kM );


Client finish: EkC (rS )

Server finish: EkS (rC ), EkS (ID)

Figure 7.9: The SSLv2 downgrade attack. Server and client end up using master
key kM with only 40 secret bits, which the attacker can find by exhaustive
search. The attacker does not need to find the key during the handshake; the
parties use the 40-bit key for the entire connection, so the attacker may even
just record ciphertexts and decrypt them later. Note that while SSLv2 is not
used anymore, we later discuss
version-downgrade-attacks that trick the server and/or client into using SSLv2,
exposing them to this (and other) attacks on SSLv2.

algorithm, but with three different ciphers (128-bit RC4, 40-bit RC4 and 64-bit
DES), and the server supports two ciphers (128-bit RC4 and 40-bit RC4).
In SSLv2, the finish messages only confirm that the parties share the same
server and client keys (kS and kC , respectively), but not the integrity of the
rest of the hello messages - in particular, there is no authentication of the
ciphersuites sent by server and client. This allows simple downgrade attacks,
removing ‘strong’ ciphers from the list of ciphers supported by client and/or
server. Figure 7.9 illustrates how a Monster-in-the-Middle (MitM) attacker
may perform this downgrade attack on SSLv2; in the example illustrated, the
attacker removes the ‘regular-version’ 128-bit RC4 encryption from the list of
ciphers supported by the client, leaving only the weaker ‘export-version’ 40-bit
RC4 encryption. Note that the SSLv2 downgrade attack is even simpler than
the downgrade attack on GSM (§5.5.3).
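The core of the attack is that nothing authenticates the cipher-suite lists, so a MitM that filters the client's list changes the negotiation outcome. A toy model (the suite names follow Figure 7.9; the negotiation function is a simplification):

```python
def negotiate(client_suites, server_suites):
    """Pick the first common suite; here, list order encodes preference
    (strongest first), a simplification of the SSLv2 negotiation."""
    common = [s for s in client_suites if s in server_suites]
    return common[0] if common else None

CLIENT = ['RC4_128_MD5', 'RC4_40_MD5', 'DES_64_MD5']
SERVER = ['RC4_128_MD5', 'RC4_40_MD5']

honest = negotiate(CLIENT, SERVER)           # -> 'RC4_128_MD5'
# MitM strips the strong suites from the client-hello before forwarding:
tampered = [s for s in CLIENT if '40' in s]
downgraded = negotiate(tampered, SERVER)     # -> 'RC4_40_MD5'
```

Since the SSLv2 finish messages do not cover the hello messages, neither side detects the tampering; this is exactly what the handshake-integrity mechanism of SSLv3 (below) fixes.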

Client Server
Client hello: version (vC ), random (rC ), cipher-suites, [extensions,]

Server hello: version (vS ), random (rS ), cipher-suite, [extensions,]


certificate: SCA.s (S.e, [Link], . . .)

Client key exchange: ES.e (kP M );


Client finish: P RFkM (‘client finished’, h(previous flows))

Server finish: P RFkM (‘server finished’, h(previous flows))

Figure 7.10: The ‘basic’ RSA-based handshake, for SSLv3 and TLS 1.0, 1.1
and 1.2. The master key kM is computed, as in Eq. (7.3), from the pre-master
key kP M , which is sent in the client key exchange message (third flow). Notice
that the client key exchange message simply contains the encryption of kP M ,
i.e., ES.e (kP M ).

7.2.4 Client authentication in SSLv2


All versions of SSL and TLS, including SSLv2, support an (optional) client
authentication mechanism, where the client proves its identity by sending a
certificate for a public signature-validation key, and then signs content sent by
the server. We do not present details of the client authentication mechanism in
SSLv2, since it differs considerably from the design in later versions - but not in
a very interesting way, so it seems unnecessary to describe it in detail. Later
versions of SSL/TLS implement client authentication in a slightly simpler and
more efficient manner. We will discuss client authentication in the following
sections, where we present more advanced versions of SSL and TLS.

7.3 The Handshake Protocol: from SSLv3 to TLSv1.2


We will now discuss the evolution of the SSL/TLS handshake protocol after
version 2, from version 3 of SSL [75], to versions 1.0, 1.1 and 1.2 of TLS [54,
55, 165]. These four handshake protocols are quite similar - we will mention
the few major differences. Later, in §7.4, we present version 1.3 of TLS, which
is the latest - and involves more significant differences, compared to the more
incremental changes of these earlier versions.
Figure 7.10 illustrates the ‘basic’ variant of the handshake protocol, of the
SSLv3 protocol and the TLS protocol (versions 1.0 to 1.2). Like SSLv2, this
‘basic’ variant uses RSA encryption to send encrypted key from client to server.
In the following subsections, we discuss the main improvements introduced
in these later versions of SSL/TLS, including:

Improved key derivation and kP M (§7.3.1): the key derivation process was
significantly overhauled between SSLv2 and the later versions, beginning

with SSLv3. In particular, the client-key-exchange message of the basic
exchange includes the premaster key kP M , from which the protocol de-
rives the master key kM . As before, the master key kM is used to derive
the keys for the record-protocol, used to encrypt and authenticate data
on the connection.
Improved negotiation and handshake integrity (§7.3.2): from SSLv3, the
finish message authenticates all the data of all previous flows of the hand-
shake; this prevents the SSLv2 downgrade attack (Figure 7.9). TLS, and
to lesser degree SSLv3 too, also improve other aspects of the negotiation,
in particular, support for extensions, negotiation of the protocol version,
and negotiation of additional mechanisms, including key-distribution and
compression.
DH key exchange and PFS (§7.3.4): From SSLv3, the SSL/TLS proto-
cols support DH key exchange, as an alternative or complementary mech-
anism to the RSA-based key exchange (the only method in SSLv2).
The main advantage is support for Perfect forward secrecy (PFS).
Session-Ticket Resumption (§7.3.5): an important TLS extension allows
Session-Ticket Resumption, a new mechanism for session resumption.
Session-ticket resumption allows the server to avoid keeping state for
each session, which is often an important improvement over the ID-based
session resumption mechanism supported already in SSLv2 (but which
requires servers to maintain state for each session).
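To recall the mechanics behind the PFS property mentioned in the list above, here is a toy Diffie-Hellman exchange over a small group (the 23-element group is for illustration only; real deployments use large, standardized safe-prime or elliptic-curve groups):

```python
import random

p, g = 23, 5  # toy group parameters; real TLS uses large safe primes

def dh_keypair():
    x = random.randrange(2, p - 1)   # ephemeral secret exponent
    return x, pow(g, x, p)           # (private, public = g^x mod p)

# Each side generates an *ephemeral* key pair per handshake; discarding
# the exponents afterwards is what yields perfect forward secrecy.
x_c, y_c = dh_keypair()
x_s, y_s = dh_keypair()
shared_c = pow(y_s, x_c, p)          # client computes (g^xs)^xc
shared_s = pow(y_c, x_s, p)          # server computes (g^xc)^xs
```

Once the exponents are erased, even an attacker who later compromises the server's long-term keys cannot recompute the shared secret of past connections.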

Two of these changes - improved key derivation and improved handshake


integrity - have impact already on the ‘basic’ handshake. To see this impact,
compare Figure 7.10 (for SSLv3 to TLS 1.2) to Figure 7.5 (the corresponding
‘basic’ handshake of SSLv2). We therefore begin our discussion with these two
changes.

7.3.1 SSLv3 to TLSv1.2: improved derivation of keys


Deriving master key from premaster key
From SSLv3, the handshake protocol exchanges a pre-master key kP M , instead
of the master key kM exchanged in SSLv2. The parties derive the master key
kM from the pre-master key kP M , using a PRF, as in Eq. (7.3):

kM = PRFkPM (“master secret” ++ rC ++ rS )        (7.3)

The main motivation for this additional step is that the value exchanged
between the parties may not be a perfectly-uniform, secret binary string, as
required for a cryptographic key. When exchanging the shared key using the
‘basic’, RSA-based handshake, this may happen when the client does not have
a sufficiently good source of randomization, or if the client simply resends the
same encrypted premaster key as computed and used in a previous connection

to the same server - not a recommended way to use the protocol, of course, but
possibly attractive for some very weak clients.
When exchanging the shared key using the DH protocol, there is a different
motivation for using this additional derivation step, from premaster key to
master key. Namely, the standard DH groups are all based on the use of a safe
prime; as we explain in §6.2.3, this implies that we rely on the Computational
DH assumption (CDH), and that the attacker may be able to learn at least
one bit of information about the exchanged key. By deriving the master key
from the premaster key, we hope to ensure that the entire master key would
be pseudorandom.
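A sketch of the master-key derivation of Eq. (7.3), using HMAC-SHA256 as a stand-in for the version-specific PRF (the actual PRFs of SSLv3 and TLS differ in their internal construction):

```python
import hmac, hashlib

def derive_master_key(k_pm: bytes, r_c: bytes, r_s: bytes) -> bytes:
    """kM = PRF_{kPM}("master secret" ++ rC ++ rS), as in Eq. (7.3).
    HMAC-SHA256 stands in for the generic PRF used in the text."""
    return hmac.new(k_pm, b"master secret" + r_c + r_s,
                    hashlib.sha256).digest()
```

Even if kPM is imperfectly random (or reused), the derived kM depends on the fresh nonces of each connection.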

Deriving connection keys


Another important improvement of the handshake protocols of SSLv3 to TLS1.2,
compared to the SSLv2 handshake, is in the derivation of the connection keys,
used for encryption and authentication by the record protocol. This aspect is
not apparent from looking at the flows (Fig. 7.10).
Specifically, recall that in SSLv2, we derived from the master-key kM two
keys, kS for protecting traffic sent by the server S, and kC for protecting traffic
sent by the client C, as in Eqs. (7.1, 7.2). In SSLv3 and TLS, we use kM
to derive, for traffic sent by the client C and server S, three keys/values each,
for a total of six keys/values: two authentication (MAC) keys, (kC^A, kS^A), two
encryption keys, (kC^E, kS^E), and two initialization vectors, (IVC , IVS ). In each
pair we have one key/value for traffic sent by client C, and the other for traffic
sent by server S.
To derive these six keys/values, we generate from kM a long string which
is referred to as key block, which we then partition into the six keys/values.
The exact details of the derivation differ between these different versions of the
handshake protocol, and arguably, none of the derivations is fully justified by
standard cryptographic definitions and reductions. We present the following
simplification, leaving the exact details for exercises; the interested reader can
find the full details in the corresponding RFC specifications.
Our simplification is defined using a generic Pseudo-Random Function P RF ,
whose input is an arbitrary-length string, and whose output is a ‘sufficiently
long’ pseudo-random binary string called key-block, as follows:

key-block = PRFkM (‘key expansion’ ++ rC ++ rS )        (7.4)

The key-block is then partitioned into the six keys/values, as illustrated
in Table 7.2.
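Continuing the same simplification, the key-block expansion of Eq. (7.4) and its partition can be sketched as follows. HMAC-SHA256 in counter mode stands in for the generic PRF, and the key and IV lengths are illustrative; in the actual protocol they depend on the negotiated cipher-suite:

```python
import hmac, hashlib

def prf(key: bytes, data: bytes, length: int) -> bytes:
    """Expand HMAC-SHA256 (counter mode) into `length` pseudorandom bytes."""
    out = b''
    counter = 0
    while len(out) < length:
        out += hmac.new(key, data + counter.to_bytes(4, 'big'),
                        hashlib.sha256).digest()
        counter += 1
    return out[:length]

def connection_keys(k_m, r_c, r_s, mac_len=20, enc_len=16, iv_len=16):
    block = prf(k_m, b'key expansion' + r_c + r_s,
                2 * (mac_len + enc_len + iv_len))
    # Partition the key-block into the six keys/values of Table 7.2.
    kCA, block = block[:mac_len], block[mac_len:]
    kSA, block = block[:mac_len], block[mac_len:]
    kCE, block = block[:enc_len], block[enc_len:]
    kSE, block = block[:enc_len], block[enc_len:]
    IVC, IVS = block[:iv_len], block[iv_len:]
    return kCA, kSA, kCE, kSE, IVC, IVS
```

Both sides compute the same key-block from (kM, rC, rS), so they agree on all six values without any additional message.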

7.3.2 Crypto-agility, backwards compatibility and


downgrade attacks
SSLv2 already supports cipher-agility, with the negotiation of ciphersuites.
However, the negotiation mechanism in SSLv2 is both very limited and inse-
cure, resulting in significant overhaul in later versions of the SSL/TLS hand-

Table 7.2: Derivation of connection keys and IVs, in SSLv3 to TLS1.2

key-block = PRFkM (‘key expansion’ ++ rC ++ rS )
key-block = kC^A ++ kS^A ++ kC^E ++ kS^E ++ IVC ++ IVS

shake. Many of these changes were required to deal with vulnerabilities and
attacks exploiting weaknesses of the negotiation mechanisms, beginning with
the simple downgrade attack on SSLv2 (Figure 7.9); others were required to
add additional flexibility and options, often required for critical security or
functionality features; some examples follow.
We begin by discussing a change which is rather insignificant, yet, it is
visible already by comparing the ‘basic’ handshake flows of the two versions:
Figure 7.10 (for SSLv3 to TLS 1.2) vs. Figure 7.5 (the corresponding ‘basic’
handshake of SSLv2). Namely, in both versions, the client-hello message in-
cludes the list of cipher-suites supported by the client. However, in SSLv2,
the server responds with the subset of this list which the server also supports
(server’s cipher-suites), and the client chooses its preferred cipher-suite from
this list. In contrast, from SSLv3, the server responds, in server-hello, with its
choice of the cipher-suite, rather than sending its own cipher-suite list to let the
client decide, as in SSLv2. The goal of this change appears merely to simplify
the handshake a bit, since, in practice, SSLv2 servers almost always sent only
one cipher-suite back, the one they most preferred among those offered by the
client, making it redundant for the client to send back the same value.
With this out of the way, let us consider the other changes, which, in
contrast, have significant security impact.

Handshake integrity - preventing SSLv2 downgrade attack


Beginning with SSLv3, the handshake protocol includes a simple mechanism
for validating the integrity of all handshake messages. Namely, each side, client
and server, authenticates the entire handshake, using the master key derived
for that connection. This improvement prevents the SSLv2 downgrade attack.
Specifically, from SSLv3, the client and server send, in their respective
finish message, a validation value denoted vC , vS respectively, whose goal is to
validate the integrity of all previously exchanged messages in that handshake;
upon receiving the finish-message from the peer (server or client, respectively),
the value is checked and if incorrect, the handshake is aborted.
Similarly to the keys-derivation process (§7.3.1), the details slightly differ
among the different versions, and we present a slight simplification, consistent
with the one we used in §7.3.1. Namely, similarly to the derivation of the key-
block (Eq. (7.4)), the client and server compute the validation values vC , vS
as follows, using a pseudorandom function PRF with arbitrary-length inputs:

v_C = PRF_{k_M}(‘client finished’ ++ h(handshake-messages))    (7.5)

v_S = PRF_{k_M}(‘server finished’ ++ h(handshake-messages))    (7.6)
The equations use a cryptographic hash function h, whose definition differs
between the different versions. Specifically, in TLS 1.2, the hash function is
implemented simply as SHA-256, i.e., h(m) = SHA256(m). The TLS 1.0 and
1.1 design is more elaborate, and follows the ‘robust combiner for MAC’ design
of §3.6.2; specifically, the hash is computed by concatenating the results of
two cryptographic hash functions, MD5 and SHA1, as: h(m) = MD5(m) ++
SHA1(m). SSLv3 also similarly combines MD5 and SHA1; however, there the
combination is in the computation of PRF itself; details omitted.
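To make these computations concrete, the following Python sketch computes the finished values of Eqs. (7.5) and (7.6). Note the simplifications: HMAC-SHA256 stands in for the TLS PRF (the actual TLS 1.2 PRF is the P_SHA256 expansion of RFC 5246), and the handshake transcript is a placeholder byte string.

```python
import hashlib
import hmac

def finished_value(master_key: bytes, label: bytes, handshake_messages: bytes) -> bytes:
    # Hash the concatenation of all handshake messages (TLS 1.2 uses SHA-256).
    transcript_hash = hashlib.sha256(handshake_messages).digest()
    # Stand-in PRF: HMAC keyed with the master key, over label ++ hash.
    # (The real TLS 1.2 PRF is the P_SHA256 expansion of RFC 5246; plain
    # HMAC is used here only as a simple pseudorandom function.)
    return hmac.new(master_key, label + transcript_hash, hashlib.sha256).digest()

k_m = b'\x01' * 48                         # hypothetical 48-byte master key
msgs = b'client-hello||server-hello||...'  # placeholder transcript
v_c = finished_value(k_m, b'client finished', msgs)
v_s = finished_value(k_m, b'server finished', msgs)
assert v_c != v_s  # the labels separate the two validation values
```

The distinct labels ensure that an attacker cannot replay the client's finished value as the server's, even though both are computed over the same transcript.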

Backwards compatibility and protocol-version negotiation


SSL was an immediate success; it was widely deployed soon after it was released
(as SSLv2). Hence, when introducing SSLv3, designers had to seriously
consider backward compatibility, namely, allowing a client/server running a new
version of SSL/TLS to interact with a server/client, respectively, running an
older version. Note that SSLv2 already includes a version number in the
client-hello and server-hello messages.
The idea is that new servers can use an older version of the protocol, when
the client-hello message indicates that the client only supports this old version.
Similarly, to allow new clients to interact with servers running older versions,
all versions use client-hello messages which are compatible with SSLv3¹. This
allows the server to process the hello message, responding with its (older, lower)
version; the handshake then proceeds using the older, lower version.
Note that this backwards-compatibility mechanism requires support by the
server. In reality, some servers simply fail to respond to new protocol versions.
Some clients try to work with such servers anyway, by a downgrade dance: try
first to connect using the latest version, but if receiving no response (or error),
try with older versions. However, this is vulnerable; an attacker can simply
block connection attempts (or send back a fake error message), causing the
client to use an older, vulnerable version of the protocol; see exercise below.

Exercise 7.3. Consider a client that supports the ‘downgrade dance’ as described
above. Namely, the client first tries to connect using TLS 1.2; if that fails, it
tries to connect using TLS 1.1; and if that also fails, it tries to connect using
TLS 1.0. Present a message sequence diagram for a MitM attack, which tricks
this client into using TLS 1.0, even when the server it tries to connect with
supports TLS 1.2 or even TLS 1.3.

Securing the downgrade dance: the SCSV cipher-suite and beyond


Exercise 7.3 shows a potential vulnerability for a common case, where clients
use ‘downgrade dance’ to ensure backwards compatibility with servers support-
ing older (lower) versions of the TLS/SSL protocol. How can we mitigate this
¹ The client-hello is not compatible with SSLv2, probably since the SSLv2 format is too
rigid.
risk, while still allowing clients running new versions of TLS to interact with
servers running older versions?
The standard solution is the Signaling Cipher Suite Value (SCSV) cipher-
suite, specified in RFC 7507 [131]. Clients that support SCSV first try to
connect to the server using their current TLS version - no change from clients
not supporting SCSV. The difference arises only when this initial connection fails,
and the client decides to try the ‘downgrade dance’, to support connections
with servers supporting (only) older versions of TLS/SSL.
In these ‘downgrade dance’ handshakes, the client adds a special ‘cipher
suite’ to its list of supported cipher suites, sent as part of the ClientHello
message. The special ‘cipher suite’ is called TLS FALLBACK SCSV, and is
encoded by a specific string. Unlike the original (and main) goal of the cipher
suites field, the SCSV is not an indication of cryptographic algorithms sup-
ported by the client. Instead, the existence of SCSV indicates to the server,
that this handshake message is sent as part of a downgrade dance by the client,
i.e., that the client supports a higher version than the one specified in the cur-
rent handshake. If the server receives such a handshake, and itself supports a
higher version of the protocol, this indicates an error or attack, since this
client and server should be using the higher version. Therefore, in this case,
the server responds with an appropriate indication to the client.
This use of the cipher-suites fields for signaling the downgrade dance is a
‘hack’ - it is not the intended, typical use of this field. A ‘cleaner’ alternative
would be to achieve similar signaling using a dedicated extension mechanism;
later in this section, we describe the TLS extension mechanism, which could
have been used for this purpose. We believe that the reason that SCSV was
defined using this ‘hack’ (encoding of a non-existent cipher suite) rather than
using an appropriate TLS extension, was the desire to support downgrade
dance to older versions of the TLS/SSL protocol, that do not support TLS
extensions.
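The SCSV signaling can be sketched as follows; the helper functions and list representation are hypothetical, but the TLS_FALLBACK_SCSV code point (0x5600) and the inappropriate_fallback reaction are those specified in RFC 7507.

```python
TLS_FALLBACK_SCSV = 0x5600  # code point assigned by RFC 7507

def client_cipher_suites(supported: list, is_fallback: bool) -> list:
    """Build the cipher-suite list for a ClientHello (hypothetical helper)."""
    suites = list(supported)
    if is_fallback:
        # Downgrade dance: signal that the client actually supports a
        # higher protocol version than the one offered in this hello.
        suites.append(TLS_FALLBACK_SCSV)
    return suites

def server_check(offered_suites: list, hello_version: int, server_max_version: int) -> None:
    """Server side: abort if the fallback signal reveals a likely downgrade."""
    if TLS_FALLBACK_SCSV in offered_suites and server_max_version > hello_version:
        # RFC 7507: respond with an inappropriate_fallback fatal alert.
        raise ValueError('inappropriate_fallback: possible downgrade attack')

# A fallback ClientHello offering TLS 1.1 (0x0302) to a TLS 1.2 (0x0303) server:
hello = client_cipher_suites([0x002F, 0x0035], is_fallback=True)
try:
    server_check(hello, hello_version=0x0302, server_max_version=0x0303)
except ValueError as alert:
    print(alert)  # the downgrade (attack or misconfiguration) is detected
```

Note that a legitimate fallback to a server that really does support only the lower version passes the check, since the server's maximum version then equals the offered one.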

SSL-Stripping and HSTS: Downgrade to Insecure Connection and Defenses

An even more extreme downgrade attack is to trick the client into using an
insecure connection, even when the server supports secure (TLS/SSL) connec-
tions. In particular, this attack may be attempted against web connections,
which are done over either the (unprotected) HTTP protocol, or over the (pro-
tected) HTTPS protocol, which is essentially HTTP running over TLS/SSL.
Browsers connect to websites using the protocol - HTTP or HTTPS - speci-
fied in the URL, which is often received as a hyperlink in the previously-received
webpage. If that previously-received webpage is unprotected, then the hyper-
link may be modified by a MitM attacker - specifically, changing from the
protected HTTPS to the unprotected HTTP.
Browsers provide an indication to the user of the protocol used (HTTP or
HTTPS), but many or most users are unlikely to notice a downgrade (from
HTTPS to HTTP). This attack is referred to as SSL-Stripping, and was first
presented by Marlinspike [125].
The best defense against this kind of attack is for browsers to detect or
prevent HTTP hyperlinks to a website which always uses (or offers) HTTPS
connections. The standard mechanism to ensure that is the HSTS (HTTP
Strict Transport Security) policy, defined in RFC 6797 [97]. The HSTS policy
indicates that a particular domain name (of a web server), should be used only
via HTTPS (secure) connections, and not via unprotected (HTTP) connec-
tions. HSTS is sent as an HTTP header field (Strict-Transport-Security), in
an HTTP response sent by the web server to the client.
The HSTS policy specifies that the specific domain name, and optionally
also subdomains, should always be connected using HTTPS, i.e., a secure con-
nection. Specifically,
1. The browser should only use secure connections to the server; in partic-
ular, if the browser receives a request to connect to the server using the
(unprotected) HTTP protocol, the browser would, instead, connect to
the server using the HTTPS (protected) protocol, i.e., using HTTP over
SSL/TLS.
2. The browser should terminate any secure transport connection attempts
upon any secure transport errors or warnings, e.g., for use of invalid
certificate.

The HSTS policy is designed to prevent attacks by a MitM attacker; hence,
the HSTS policy itself must be protected - and, in particular, the attacker
should not be able to ‘drop’ it. For this reason, HSTS policy must be known
to the browser before it connects to the server. This may be achieved in two
ways:

Caching - max-age: The HSTS header field has a parameter called max-
age, which defines a period of time during which the browser should
‘remember’ the HSTS directive, i.e., keep it in cache. Any connection
within this time would be protected by HSTS. This works only if the browser
previously had a secure connection with the site and received the HSTS
header no longer ago than the time specified in max-age. This motivates
the use of a large value for max-age; however, notice that if a domain
must move back to HTTP for some reason, or the secure connection
attempts fail for some reason, e.g., an expired certificate, then
the site may be unreachable for up to max-age time units.
Pre-loaded HSTS policy: The browser maintains a list of HSTS domains
which are preloaded, i.e., do not require a previous visit to the site by
this browser. This avoids the risk of a browser accessing an HSTS-using
website but without a cached HSTS policy. However, this requires the
browser to be preloaded with the HSTS policy - a burden on the site and
on the browser, and some overhead for this communication. An optional

304
parameter of the HSTS header, instructs search engines to add the site
to the HSTS preload list of related browsers. This is used by Google to
maintain the pre-loaded HSTS list of the Chrome browser.
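A browser's handling of the Strict-Transport-Security header can be sketched roughly as follows; the helper names and the domain are hypothetical, and real browsers also handle preload lists, superdomain matching and other details.

```python
def parse_hsts(header: str) -> dict:
    """Parse a Strict-Transport-Security header value (minimal sketch)."""
    policy = {'max_age': 0, 'include_subdomains': False}
    for directive in header.split(';'):
        directive = directive.strip().lower()
        if directive.startswith('max-age='):
            policy['max_age'] = int(directive.split('=', 1)[1])
        elif directive == 'includesubdomains':
            policy['include_subdomains'] = True
    return policy

hsts_cache = {}  # domain -> (expiry_time, include_subdomains)

def note_response(domain: str, header: str, now: float) -> None:
    # Cache the policy for max-age seconds from the time of the response.
    p = parse_hsts(header)
    hsts_cache[domain] = (now + p['max_age'], p['include_subdomains'])

def must_use_https(domain: str, now: float) -> bool:
    # Upgrade any HTTP request to HTTPS while the cached policy is fresh.
    entry = hsts_cache.get(domain)
    return entry is not None and now < entry[0]

note_response('example.com', 'max-age=31536000; includeSubDomains', now=0.0)
print(must_use_https('example.com', now=1000.0))  # still within max-age
```

The cache models the max-age mechanism described above: once the entry expires, the browser again has no protection until the next successful HTTPS visit (unless the domain is preloaded).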

Backward compatibility with SSLv2


The SSLv3/TLS backward-compatibility mechanism does not extend to SSLv2,
since SSLv2 uses a different client-hello format. To allow interoperability with
SSLv2 servers, the SSLv3 specifications [75] allow clients to interact with SSLv2
servers by sending the client-hello message using the SSLv2 format, specifying
the use of version 3. However, the standard warns that this method for
backwards compatibility will be ‘phased out with all due haste’. The reason is
that a client that uses the SSLv2 hello message may be subject to a simple
MitM downgrade attack causing it to use the SSLv2 handshake, even when
interacting with an SSLv3 server; see the following exercise.
Exercise 7.4 (Protocol downgrade attack on SSLv3). Show message sequence
diagram for a MitM version downgrade attack, tricking an SSLv3 server and an
SSLv3 client who sends SSLv2-format client-hello (for backward compatibility),
into completing the handshake using SSLv2 and using a weak (40-bit) cipher.

Hint: see this attack (referred to as ‘version rollback attack’) in [168].


Note that interactions between many SSLv3 servers and clients are actually
protected from the protocol downgrade attack of Exercise 7.4, by an ingenious
‘trick’, designed by Paul Kocher. These clients signal their support of SSLv3,
by encoding a ‘signal’ of that in the padding used in the RSA encryption. For
details on this and other issues related to MitM version rollback, see [168] and
appendix E.2 of RFC2246 (TLS1.0).

Extended cipher-agility (ciphersuites)


To conclude this subsection, we note that beginning with SSLv3, the cipher-
suites supported define additional aspects, mainly, the key-exchange mecha-
nism. Specifically, in addition to the ‘basic’ key-exchange mechanism based on
RSA encryption, as in SSLv2, the later versions support also Diffie-Hellman
(DH) key-exchange, using either ‘static’ or ‘ephemeral’ DH public keys (§7.3.4),
and other key-exchange methods. In subsection 7.3.4, we discuss the support
in SSLv3 and TLS for DH-based key exchange.

7.3.3 Secure extensibility principle and TLS extensions


We see that the lack of secure version and ciphersuite negotiation in SSLv2
resulted in a significant challenge: providing secure yet backwards-compatible
new versions of SSL/TLS. In general, when a security protocol is upgraded,
there is a potential conflict between the strong desire to preserve backward
compatibility, and the desire to protect against version downgrade attacks.
This is one of the many motivations for the important principle of extensibility
by design, which requires pre-designed, built-in secure mechanisms for extensions
and backward compatibility. This is an important design principle for
cryptographic systems and protocols.
Principle 12 (Secure extensibility by design). When designing security sys-
tems or protocols, one goal must be to build-in secure mechanisms for exten-
sions, downward compatible versions, and negotiation of options and crypto-
graphic algorithms (cipher-agility).
Note that the extensibility-by-design principle implies that extension mech-
anisms must be designed, not only that they be secure. By designing protocols
to be extensible and flexible, we allow the use of the same protocol in many
diverse scenarios, environments and applications. This allows more efforts to
be dedicated to analyzing the security of the protocol and its implementations.
We next discuss the TLS extensions mechanism, which is a great example -
in fact, some of the supported extensions have become very popular and had
great impact on the adoption of TLS (and migration of web servers from SSL).

Secure extensibility: version-dependent key separation


While modern protocols like TLS are adopting the extensibility-by-design prin-
ciple and support secure extensions and downward-compatible versions, there is
yet an element of extensibility that is often neglected: key separation. Namely,
suppose the same key - in particular, a public-private key pair - is used by both
a vulnerable protocol and a secure protocol. Then, it may be possible to expose
the key by running the vulnerable protocol, and to exploit this to attack the
system also when using the secure protocol. In particular, a significant number
of web servers were found to support SSLv2 with the same key-pair they use
for the improved-security TLS handshake, opening them to the DROWN
vulnerability [5]. We conclude that we need to extend the key-separation principle
(principle 6) to also require separate keys for different versions of a protocol.
Principle 13 (Key-separation (improved)). Use separate, independent pseudorandom
keys for: (1) each different cryptographic scheme, (2) different types
and sources of plaintext, (3) different periods, and (4) different versions of the
protocol or scheme.
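A minimal sketch of this improved key-separation principle follows, using HMAC-SHA256 as a generic key-derivation PRF; the label format is hypothetical (real protocols use standardized derivations such as HKDF), but the idea is simply to fold the scheme name and the protocol version into the derivation label.

```python
import hashlib
import hmac

def derive_key(master_secret: bytes, scheme: str, version: str) -> bytes:
    # Fold the scheme name and the protocol version into the derivation
    # label, so a key exposed via an old, vulnerable version never
    # coincides with a key used by the current version.
    label = f'{scheme}|{version}'.encode()
    return hmac.new(master_secret, label, hashlib.sha256).digest()

secret = b'\x07' * 32  # hypothetical long-term secret
k_old = derive_key(secret, 'record-mac', 'SSLv2')
k_new = derive_key(secret, 'record-mac', 'TLS1.2')
assert k_old != k_new  # per-version keys are independent
```

With such separation, an attack like DROWN, which exploits the weak protocol version, would only yield keys useless against the newer version's traffic.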

The TLS Extensions mechanism


One of the most important improvements of TLS over SSL is that TLS supports
a flexible and secure extensions mechanism. This mechanism allows clients
to specify additional fields, not defined in the protocol, but supported (and
‘understood’) by some of the servers. Once a server receives an extension that
it supports, its behavior may change from the ‘standard protocol’ in an arbitrary
way (as defined by the extension); however, servers should ignore any unknown
extension.
Extensions were supported as early as TLS1.0, where servers are required
to ignore any unknown fields appended beyond the known fields, as defined

in [31, 32]. Support for extensions became mandatory from TLS1.1. Some
standard extensions facilitate important functionality, and some are needed for
security; and users may define additional extensions.
Server Name Indication (SNI) is an example of an important, popular ex-
tension; in fact, SNI became mandatory from TLS 1.1, and was one of the
main factors motivating websites and clients to adopt TLS. SNI is designed to
support the common scenario, where the same web server is used to provide
web-pages belonging to multiple different domain names, e.g., [Link] and [Link].
Each domain name may require a different certificate; the SNI extension allows
the client to indicate the desired server domain name early on in the protocol,
before the server has to send a certificate to the client - allowing the server
to send the desired certificate based on the web-page that is being requested.
Before TLS, using SSL, the common way for a web server to support multiple
web-sites, with different domain names, was to have each site use a dedicated
port - an inconvenient and inefficient solution.
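In Python's standard ssl module, for example, the client supplies SNI via the server_hostname argument of wrap_socket; the sketch below (hypothetical helper, requires network access when actually called) shows where SNI enters the handshake.

```python
import socket
import ssl

def fetch_cert_subject(hostname: str, port: int = 443):
    """Connect over TLS and return the subject of the certificate the server
    selected - which, thanks to SNI, matches `hostname` even on a server
    hosting many domains. (Hypothetical helper; requires network access.)"""
    ctx = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=5) as sock:
        # server_hostname places `hostname` in the ClientHello's SNI
        # extension, before the server has to choose a certificate.
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert().get('subject')
```

Without the SNI extension, a server hosting multiple domains on one address and port would have to guess which certificate to send, since the certificate is sent before the HTTP request naming the site.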

7.3.4 SSLv3 to TLSv1.2: DH-based key exchange


From SSLv3, the TLS/SSL handshake supports two methods of DH key ex-
change: ephemeral and static (certified). There is a huge difference in the
popularity of the two methods: the ephemeral method is very widely used,
while the static (certified) method is rarely used. However, the two methods
are actually similar.
In both methods, the parties derive a shared key k_PM, referred to as the
pre-master key, following the DH protocol. Specifically, TLS/SSL use a modular
group, with an agreed-upon safe prime p and generator g. The parties exchange
their ‘public keys’, g^{S.x} mod p (for the server) and g^{C.y} mod p (for the
client), where each party uses a randomly-generated private key: S.x for the
server and C.y for the client. The parties then derive the pre-master key k_PM,
again as in ‘plain’ DH key exchange, namely:

k_PM = g^{C.y·S.x} mod p    (7.7)


Recall that when using a modular group, the value exchanged by the DH
protocol is not pseudorandom; namely, security may rely only on the computational
DH assumption (CDH), as we know that the stronger DDH (decisional
DH) assumption does not hold for such groups. This is one motivation for not
using k_PM directly as a key to cryptographic functions. Instead, we derive
from the pre-master key k_PM another key, the master key k_M, which should
be pseudorandom. See §7.3.1, where we discuss the derivation of the master
key and of the keys for specific cryptographic functions, such as PRF, MAC or
shared-key encryption.
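A toy sketch of the derivation of Eq. (7.7) follows. The group (p = 23, g = 5) is far too small for real use, and the master-key derivation is simplified to a single HMAC over hypothetical hello-nonces, rather than the actual derivation of Eq. (7.3).

```python
import hashlib
import hmac
import secrets

p, g = 23, 5  # toy safe prime and generator; real groups use 2048+ bits

server_x = secrets.randbelow(p - 2) + 1   # server private exponent S.x
client_y = secrets.randbelow(p - 2) + 1   # client private exponent C.y

server_pub = pow(g, server_x, p)          # g^{S.x} mod p
client_pub = pow(g, client_y, p)          # g^{C.y} mod p

# Both sides compute the same pre-master key g^{C.y * S.x} mod p (Eq. 7.7).
k_pm_server = pow(client_pub, server_x, p)
k_pm_client = pow(server_pub, client_y, p)
assert k_pm_server == k_pm_client

# Derive a pseudorandom master key from the (non-pseudorandom) pre-master
# key; this HMAC is a hypothetical simplification of Eq. (7.3).
r_c, r_s = b'client-nonce', b'server-nonce'
k_m = hmac.new(k_pm_server.to_bytes(4, 'big'),
               b'master secret' + r_c + r_s, hashlib.sha256).digest()
```

The final HMAC step reflects the point made above: the DH output itself is not pseudorandom in a modular group, so it is first run through a key-derivation function before any key is used.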

Static (certified) DH handshake. In static (certified) DH key exchange,
the server's DH public key is signed as part of a public-key certificate;
namely, the signing entity is a certificate authority which is trusted
Client                                                       Server
Client hello: version (v_C), random (r_C),
cipher-suites (. . . DH. . . )

Server hello: version (v_S), random (r_S), cipher-suite: . . . DH. . . ,
certificate: S_CA.s((g, p, g^{S.x} mod p), [Link], . . .) [, extensions]

Client key exchange: g^{C.y}

Client finish: PRF_{k_M}(‘client finished’, h(previous flows))

Server finish: PRF_{k_M}(‘server finished’, h(previous flows))

Figure 7.11: SSLv3 to TLSv1.2: handshake with static (certified) DH public
key for the server, g^{S.x} mod p. Pre-master key k_PM is computed as in Eq.
(7.7), and master key k_M is computed - from k_PM - as in Eq. (7.3).

by the browser, and the certificate contains the domain name (e.g., [Link]) and
other parameters such as expiration date: S_CA.s((g, p, g^{S.x} mod p), [Link], . . .).
See Figure 7.11.
In practice, the use of a certificate implies that the server's DH public key,
g^{S.x}, is fixed for long periods, similarly to the typical use of RSA or other
public-key methods. Hence, the static (certified) DH key exchange is similar
in its properties to the RSA key exchange; the difference is simply that instead
of using RSA encryption to exchange the key, and relying on the RSA (and
factoring) assumptions, the static (certified) DH key exchange relies on the DH
(and discrete-logarithm) assumptions.

Ephemeral DH handshake: ensuring Perfect Forward Secrecy (PFS).


The ephemeral DH key exchange uses a different, randomly-chosen private key
for each exchange; for DH, this means that each party selects a new private
exponent (S.x for the server, C.y for the client) in each handshake. This is
illustrated in Figure 7.12.
Once the TLS session terminates, the private exponents are erased - as
well as any keys derived from them, including the pre-master key k_PM, the
master key k_M, the derived key block (Eq. (7.4)) and the keys derived from
it (k_S^A, k_S^E, k_C^A, k_C^E). This ensures perfect forward secrecy (PFS):
the i-th session between client and server is secure against a powerful MitM
attacker, even if the attacker is given all the keys and other contents of the
memory of both client and server before and after the i-th session, as long as
the keys are given only after the i-th handshake is completed.

Security assumptions of DH key exchange. An obvious, important difference
between the RSA key exchange and the DH key exchange methods is that
instead of using RSA encryption to exchange the key, and relying on
the RSA (and factoring) assumptions, the DH key exchange relies
on the computational-DH (and discrete-logarithm) assumptions. Notice,
Client                                                       Server
Client hello: version (v_C), random (r_C), cipher-suites (incl. DHE-RSA) [, ID] [, extensions]

Server hello: version (v_S), random (r_S), cipher-suite (DHE-RSA),
certificate: S_CA.s(S.v, [Link], . . .) [, extensions],
server key exchange: S_S.s((p, g, g^{S.x} mod p))

Client key exchange: g^{C.y}

Client finish: PRF_{k_M}(‘client finished’, h(previous flows))

Server finish: PRF_{k_M}(‘server finished’, h(previous flows))

Figure 7.12: SSLv3 to TLSv1.2: handshake with ephemeral DH public keys
g^{S.x} mod p, g^{C.y} mod p, signed using RSA (a DHE-RSA cipher-suite).
Pre-master key k_PM is computed as in Eq. (7.7), and master key k_M is
computed as in Eq. (7.3).

however, that in the typical case where the certificate uses RSA signatures, the
security of the handshake still relies also on the RSA (and factoring) assumptions.
Namely, the DH key exchanges require both the computational-DH (and
discrete logarithm) assumption, and the RSA (and factoring) assumption. In
this sense, DH key exchanges are ‘less secure’ (or, rely on more assumptions)
compared to the RSA key exchange; in the case of ephemeral DH, this is offset
by the security advantage of the ephemeral DH handshake, namely, ensuring
perfect forward secrecy.

7.3.5 SSLv3 to TLSv1.2: session resumption


Both SSLv3 and TLS, like SSLv2, support the (stateful) ID-based session
resumption mechanism; however, many TLS servers also support extensions,
including the session-ticket extension, which is an alternative, ‘stateless’ method
for session resumption. In this subsection we discuss these two methods.

ID-based session resumption in SSLv3 and TLS 1.0-1.2


We begin with the (stateful) ID-based session resumption mechanism, which
did not change much from its implementation in SSLv2.
Figure 7.13 illustrates the handling of ID-based session resumption, in the
SSLv3 handshake protocol, and in versions 1.0-1.2 of the TLS protocol. In the
figure, the client-hello message contains the session-ID, denoted simply ID,
which was received from server in a previous connection.
Session resumption is possible when the server still has the corresponding
entry (ID, k_M, γ) saved from a previous connection; ID is the session identifier,
Client                                                       Server
Client hello: client random (r_C), cipher-suites, ID

Server hello: server random (r_S)

Client finish: PRF_{k_M}(‘client finished’, h(previous flows))

Server finish: PRF_{k_M}(‘server finished’, h(previous flows))

Figure 7.13: SSLv3 to TLS1.2 handshake, with ID-based session resumption.

k_M is the session's master key, and γ contains ‘related information’ such as the
ciphersuite used in the session.
When the server has the (ID, k_M, γ) entry, it reuses k_M and γ, i.e.,
‘resumes the session’, and derives new shared keys from it (using Eqs. (7.1, 7.2)).
This avoids the public-key encryption (by the client) and decryption (by the
server) of the master key k_M, as well as the transmission of the relevant
information, most significantly, the public-key certificate.
Note that when either the client or the server, or both, do not have a
valid (ID, k_M) pair, then the handshake is essentially the same as a ‘basic’
handshake (without resumption), as in Fig. 7.5. The only changes are the
inclusion of the ID from the client (if it has one), and the inclusion of an ID in
the ‘server-finish’ message, to be (optionally) used for future resumption of
additional connections (in the same session).
The session resumption mechanism can have a significant impact on performance;
in particular, loading a website often involves opening a very large number of
TCP connections to the same server, to download different objects. The reduction
in CPU time can easily be a factor of dozens or even hundreds. Therefore,
this is a very important mechanism; however, it also has some significant
challenges and concerns, as we next discuss.

Session-ID resumption: challenges and concerns


The basic challenge of ID-based session resumption is the need to maintain
state, and to look up the state - and key - using the ID. To minimize the
storage and lookup-time overhead, the cache of saved (ID, k_M) pairs cannot be
too large; on the other hand, if the cache is too small, then the resumption
mechanism is less effective.
This challenge is made much harder, since web servers are usually replicated
- to handle high load and to reduce latency by placing the server closer to the
clients, e.g., in a Content Distribution Network (CDN).

Ensuring PFS with ID-based session resumption. Another challenge is
that the exposure of the master key k_M exposes the entire communication of
every connection to an eavesdropper; namely, the storage of the key may foil
the perfect-forward secrecy (PFS) mechanism. To ensure PFS, we must ensure
that all copies of the key kM are discarded, without any copies remaining - a
non-trivial challenge.
This challenge is often made even harder by the way that web servers implement
the (ID, k_M) cache. Specifically, in some popular servers, e.g., Apache,
the operator can only define the size of the (ID, k_M) cache. Suppose the goal
is to ensure PFS on a daily basis, i.e., to change keys daily. Then the cache size
must be small enough to ensure that entries will be evicted after at most a
day; yet, if it is too small, there will be many cache misses, i.e., the efficiency
gain of the resumption mechanism will be reduced. Furthermore, even with a
small cache, a client which continues a session for a very long time may
never get evicted from the cache if the cache uses the (usual) paradigm of
evicting the least-recently-used element, and hence we may not achieve the goal
of ensuring PFS on a daily basis; to ensure entries are evicted after one day
at most, the cache should operate as a queue (first-in-first-out).
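The FIFO (queue) eviction discussed above can be sketched as follows; the class is hypothetical and omits locking, explicit entry expiry, and other production concerns.

```python
from collections import OrderedDict

class SessionCache:
    """Minimal sketch of a FIFO session-ID cache: entries are evicted in
    insertion order, so an entry survives at most until `capacity` newer
    sessions have been created, regardless of how often it is resumed."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()  # session_id -> (master_key, params)

    def store(self, session_id, master_key, params):
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry (FIFO)
        self.entries[session_id] = (master_key, params)

    def resume(self, session_id):
        # Note: no move-to-end here. An LRU cache would refresh the entry
        # on every resumption, letting a long-lived session's master key
        # survive indefinitely and defeating the bounded lifetime needed
        # for (e.g., daily) PFS.
        return self.entries.get(session_id)
```

The key design point is in resume(): by not refreshing entries on use, even a constantly-resumed session is guaranteed to age out of the cache.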
Exercise 7.5. Consider a web server which has, on average, one million daily
visitors, but the number on some days may be as low as one thousand. What
is the required size of the ID-session cache, in terms of the number of (ID, k_M)
entries, to ensure PFS on a daily basis, when entries are removed from the cache
only when necessary to make room for new entries? Can you estimate or
bound how many of the connections will be served from the cache on a typical
day? Assume the ID-session cache operates using a FIFO eviction paradigm.

The Session-Ticket extension and its use for session resumption


The TLS extensions mechanism provides an alternative, stateless session-resumption
mechanism. The idea is simple: together with the finish message of a suc-
cessful handshake, the server attaches a session-ticket extension. Later, when
the client re-connects to the same server, it attaches the previously-received
session-ticket extension. See Figure 7.14.
The ticket should allow any of the ‘authorized servers’ (e.g., running the
website) to recover the value of the master key k_M of the session with the client
- but prevent attackers, eavesdropping on the ticket as sent by the client, from
finding k_M. This is achieved by having k_M, and other values sent in the ticket,
encrypted using a secret, symmetric Session Ticket Encryption Key (STEK),
which we denote k_STEK, known (only) to all authorized servers. Clients cannot
decrypt the tickets; hence, they must store both the ticket and an (unencrypted)
copy of the session's master key k_M, to allow the client to perform its part of the
handshake.
The contents of the session ticket are only used by the servers, and are
opaque to the clients, i.e., not ‘understood’ or used by the clients; hence, dif-
ferent implementations may use different tickets. RFC5077 [150] recommends a

Client                                                       Server
Client hello: client random (r_C), cipher-suites, ticket-extension(τ)

Server hello: server random (r_S)

Client finish: PRF_{k_M}(‘client finished’, h(previous flows))

Server finish: PRF_{k_M}(‘server finished’, h(previous flows)),
ticket-extension(τ′)

Figure 7.14: Ticket-based session resumption.

structure which uses Encrypt-then-Authenticate, where the encrypted contents
include the protocol version, ciphersuite, compression method, master secret,
client identity and a timestamp.
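The Encrypt-then-Authenticate ticket structure can be sketched as follows. This is a toy: the ‘encryption’ is an HMAC-based keystream standing in for a real cipher such as AES-CBC, and the ticket encoding is hypothetical; do not use it as-is.

```python
import hashlib
import hmac
import os

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy stream cipher: HMAC in counter mode (stand-in for a real cipher).
    out, ctr = b'', 0
    while len(out) < n:
        out += hmac.new(key, nonce + ctr.to_bytes(4, 'big'), hashlib.sha256).digest()
        ctr += 1
    return out[:n]

def make_ticket(stek_enc: bytes, stek_mac: bytes, state: bytes) -> bytes:
    """Encrypt-then-Authenticate: encrypt the session state under the STEK,
    then MAC the nonce and ciphertext."""
    nonce = os.urandom(16)
    ct = bytes(a ^ b for a, b in zip(state, _keystream(stek_enc, nonce, len(state))))
    tag = hmac.new(stek_mac, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def open_ticket(stek_enc: bytes, stek_mac: bytes, ticket: bytes) -> bytes:
    """Verify the MAC first; only then decrypt and return the session state."""
    nonce, ct, tag = ticket[:16], ticket[16:-32], ticket[-32:]
    expected = hmac.new(stek_mac, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError('ticket authentication failed')
    return bytes(a ^ b for a, b in zip(ct, _keystream(stek_enc, nonce, len(ct))))

stek_enc, stek_mac = os.urandom(32), os.urandom(32)
state = b'TLS1.2|ciphersuite|' + b'\x05' * 48  # hypothetical ticket contents
assert open_ticket(stek_enc, stek_mac, make_ticket(stek_enc, stek_mac, state)) == state
```

The order of operations matters: verifying the MAC before decrypting ensures a tampered ticket is rejected without ever processing attacker-controlled plaintext.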

Session tickets and PFS. To preserve PFS, e.g., on a daily basis, we need to
make sure that each ticket key k_STEK is kept for only the allowed duration -
e.g., up to 24 hours (‘daily’). In principle, this is easy; we can maintain this key
only in memory, and never write it to disk or other non-volatile storage, making
it easier to ensure it is not kept beyond the desired period (e.g., daily). This
rule may require us to maintain several ticket keys concurrently, e.g., generate
a new key once an hour, allowing each key to ‘live’ for up to 24 hours.
In the typical case of replicated servers, the ticket keys k_STEK should be
distributed securely to all replicas. Changing the key becomes even more
important, with it being used on so many machines.
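The hourly-rotation policy above can be sketched as follows; the class is hypothetical, and a real deployment must also distribute the keys securely to all replicas.

```python
import os

class TicketKeyRing:
    """Minimal sketch of in-memory STEK rotation: mint a fresh key every
    hour, and keep each key for at most 24 hours, so no ticket outlives
    the desired PFS window."""

    def __init__(self):
        self.keys = []  # list of (created_at, key), newest last; RAM only

    def rotate(self, now: float) -> None:
        self.keys.append((now, os.urandom(32)))
        # Drop keys older than 24 hours; tickets encrypted under them can
        # no longer be opened, forcing a full handshake instead.
        self.keys = [(t, k) for (t, k) in self.keys if now - t < 24 * 3600]

    def encryption_key(self) -> bytes:
        # Always encrypt new tickets under the newest key.
        return self.keys[-1][1]

    def decryption_keys(self):
        # Any still-live key may be needed to open older tickets.
        return [k for (_, k) in self.keys]
```

Keeping several live keys at once lets tickets issued under an older (but not yet expired) key still be resumed, while guaranteeing that every key - and hence every ticket - ages out within the PFS window.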
Unfortunately, like for ID-based resumption, many popular web-servers
implement ticket-based resumption in ways which are problematic for perfect
forward secrecy (PFS). These web-server implementations do not provide a
mechanism to limit the lifetime of the ticket key, except by restarting the server
(to force the server to choose a new ticket key). For some administrators and
scenarios, this lack of support for PFS may be a consideration for choosing a
server, or for using session-IDs and disabling session-tickets.

7.3.6 Client authentication


The SSL and TLS protocols support, already from SSLv2, a mechanism for
authenticating the client, as an optional service of the handshake. In this
subsection we describe how this optional client-authentication mechanism works,
in SSLv3 and in TLS 1.0 to 1.2.
The SSL/TLS client authentication mechanism is illustrated in Figure 7.15.
The mechanism consists of three additions to the ‘basic’ handshake. First, the
server signals the need for client authentication, by including the certificate

Figure 7.15: Client authentication in SSLv3 to TLS 1.2.

request field together with the server-hello message. The certificate-request field
identifies the certificate-authorities (issuers) which are accepted by this server;
namely, client authentication is possible only if the client has a certificate from
one of these entities.
Next, the client attaches, to its client key exchange message, two fields.
The first is the certificate itself; the second, called certificate verify, is a digital
signature over the handshake messages. The ability to produce this signature,
serves as proof of the identity of the client.
This client authentication mechanism is quite simple and efficient; however,
it is not widely deployed. In reality, TLS/SSL is typically deployed using only
the public key (and certificate) of the server, i.e., only allowing the client to
authenticate the server, but without client authentication. The reason for that
is that SSL/TLS client authentication requires clients to use a private key,
and to obtain a certificate on the corresponding public key; furthermore, that
certificate must be signed by an authority trusted by the server.
This raises two serious challenges. First, clients often use multiple devices,
and this requires them to have access to their private keys on these multiple
devices, which raises both usability and security concerns. Second, clients must
obtain a certificate - and from an authority trusted by the server. As a result,
most websites prefer to avoid the use of SSL/TLS client authentication; when
user authentication is required, they rely on sending secret credentials such as
passwords or cookies, over the SSL/TLS secure connection.
Note also that the client authentication mechanism requires the client to

Client                                                       Server
Client hello: client random (r_C), cipher-suites,
key-exchange: g_1^{a_1} mod p_1, g_2^{a_2} mod p_2, . . . , extensions

Server hello: server random (r_S), key-exchange: i, g_i^{b_i}, S_CA.s(S.v, . . .), S_S.s(handshake)

Client finish: PRF_{k_M}(‘client finished’, h(handshake)), Application-data

Figure 7.16: TLS 1.3 1-RTT (‘full’) handshake.

send her certificate ‘in the clear’. This may be a privacy concern, since the
certificate may allow identification of the client.

7.4 State-of-Art: TLS 1.3


Finally, we briefly discuss TLS 1.3 - the latest version. This version is the
result of a major re-design; we only discuss the main aspects related to the
handshake protocol.
One of the main goals of TLS 1.3 has been to improve (reduce) latency.
Modern network connections often have significant latency; the propagation
delays are due to physical limitations, such as the speed of light, and cannot
be much reduced, and queuing delays are also often significant. Hence, to
minimize delay, it is desirable to minimize the number of 'round trips', in
which a party has to wait for a response before continuing with the protocol.
In Figure 7.16, we present the TLS 1.3 ‘full, 1-RTT handshake’. In con-
trast to the ‘basic handshake’ of the previous sections (and versions), this
handshake uses the DH protocol for key exchange. Besides the key exchange,
the handshake includes a signature by the server (over its DH exponent).
The TLS 1.3 full (1-RTT) handshake allows the client to send the request
after a single round-trip (hence, it is called 1-RTT handshake). Namely, when
the server sends its (single) flow, this contains both the server’s hello message
(with the server’s DH exponent, extensions, certificate and signature), and the
server’s finished message, proving the integrity of the exchange.
Notice that in order to allow the server to send the finished message in
its (single) flow, the client has to send its key-exchange message, containing
its exponentiation (contribution to the DH key exchange), before agreeing
on the specific parameters (a pair (pi, gi) of prime and generator). The
client can do this by sending exponentiations using multiple (pi, gi) pairs,
{g1^a1 mod p1, g2^a2 mod p2, ...}. There is a cost here, in the overhead of
computing and sending these values - but the savings in RTT are usually much
more important.
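The multiple-key-share idea can be sketched as follows, using toy, insecurely small DH groups chosen only for illustration (TLS 1.3 actually uses standardized named groups):

```python
import secrets

# Toy, insecurely small DH groups, for illustration only: i -> (prime p_i, generator g_i).
groups = {1: (23, 5), 2: (47, 5)}

# Client: pick a secret exponent per group and compute a key-share for each.
client_secret = {i: secrets.randbelow(p - 2) + 1 for i, (p, g) in groups.items()}
client_share = {i: pow(g, client_secret[i], p) for i, (p, g) in groups.items()}

# Server: select one group it supports (say i = 2) and respond with its own share.
i = 2
p, g = groups[i]
b = secrets.randbelow(p - 2) + 1
server_share = pow(g, b, p)

# Both sides now derive the same DH secret for the chosen group.
server_key = pow(client_share[i], b, p)              # (g^a)^b mod p
client_key = pow(server_share, client_secret[i], p)  # (g^b)^a mod p
assert client_key == server_key
```

The client performs one exponentiation per offered group, but saves a full round trip, since the server can immediately pick a group and complete the exchange.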
In Figure 7.18 we present the Zero Round-Trip-Time (Zero-RTT) variant
of the TLS 1.3 handshake.

Figure 7.17: TLS 1.3: ‘full handshake’.

Figure 7.18: TLS 1.3: Zero-RTT handshake.

To be completed... session resumption... maybe an exercise on delta from 1.2?

7.5 TLS/SSL: Additional Exercises


Exercise 7.6 (Record protocol). 1. The SSL and TLS record protocols use
fragments of size up to 16KB. Explain a potential disadvantage of using
much longer fragments (or no fragmentation).
2. Explain a potential disadvantage of using much shorter fragments.
3. Explain why fragmentation is applied before compression, authentication
and encryption.
4. The SSL/TLS record protocols apply authentication and then encryption
(AtE). Is it possible to reverse the order, i.e., apply encryption and then
authentication (EtA)? Can you identify advantages to AtE and/or to
EtA?
5. The SSL/TLS record protocols apply compression, then authentication.
Is it possible to reverse the order, i.e., apply authentication and then
compression? Can you identify advantages to either order?
Exercise 7.7 (SSL 2 key derivation). SSL uses MD5 for key derivation. In this
question, we explore the required properties from MD5 for the key derivation to
be secure.
1. Show that it is not sufficient to assume that MD5 is collision-resistant,
for the key derivation to be secure.
2. Repeat, for the one-way function property.
3. Repeat, for the randomness-extraction property.
4. Define a simple assumption regarding MD5, which ensures that key deriva-
tion is secure. The definition should be related to cryptographic functions
and properties we defined and discussed.

Exercise 7.8 (TLS handshake: resiliency to key exposure). Fig. 7.10 presents
the RSA-based SSL/TLS Handshake. This variant of the handshake protocol
was popular in early versions, but later ‘phased out’ and completely removed in
TLS 1.3. The main reason was the fact that this variant allows an attacker who
obtains the server's private key to decrypt all communication with the server
using this key - before and after the exposure.
1. Show, in a sequence diagram, how a MitM attacker who is given the
private key of the server at time T1 , can decrypt communication of the
server at past time T0 < T1 .
2. Show, in a sequence diagram, how TLS 1.3 avoids this impact of exposure
of the private key.
3. Show, in a sequence diagram, how a MitM attacker who is given the
private key of the server at time T1 , can decrypt communication of the
server at future time T2 > T1 .
4. Explain which feature of TLS 1.3 can reduce the exposure of future com-
munication, and how.
Exercise 7.9 (Protocol version downgrade attack). Implementations of SSL
specify the version of the protocol in the ClientHello and ServerHello messages.
If the server does not support the client’s version, then it replies with an error
message. When the client receives this error message (‘version not supported’),
it re-tries the handshake using the best-next version of TLS/SSL supported by
the client.

Present a sequence diagram showing how a MitM attacker can exploit this
mechanism to cause the server and client to use an outdated version of the
protocol, allowing the attacker to exploit vulnerabilities of that version.
Exercise 7.10 (Client-chosen cipher-suite downgrade attack). In many vari-
ants of the SSL/TLS handshake, e.g., the RSA-based handshake in Fig. 7.10,
the authentication of the (previous) handshake messages in the Finish flows,
is relied upon to prevent a MitM attacker from performing a downgrade attack
and causing the client and server to use a less-preferred (and possibly less se-
cure) cipher-suite. However, in this process, the server can choose which of
the client’s cipher-suites would be used. To ensure the use of the cipher-suite
most preferred by the client, even if less preferred by the server, some client
implementations send only the most-preferred cipher-suites. If none of these
is acceptable to the server, then the server responds with an error message. In
this case, the client will try to perform the handshake again, specifying now
only the next-preferred cipher-suite(s), and so on - referred to as ‘downgrade
dance’.

1. Show how a MitM attacker can exploit this mechanism to cause the server
and client to use a cipher-suite that both consider inferior.
2. Suggest a fix to the implementation of the client which achieves the same
goal, yet is not vulnerable to this attack. Your fix should not require
changes in the handshake protocol itself, or in the server.
3. Suggest an alternative fix, which only involves change in the handshake,
and does not require change in the way it is used by the implementation.

Exercise 7.11 (TLS server without randomness). An IoT device provides http
interface to clients, i.e., acts as a tiny web server. For authentication, clients
send their commands together with a secret password, e.g., "on, ⟨password⟩" and
"off, ⟨password⟩". Communication is over TLS for security, with the RSA-based
SSL/TLS handshake, as in Figure 7.10.
The IoT device does not have a source of randomness, hence, it computes
the server-random rS from the client-random, using a fixed symmetric key kS
(kept only by the device), as: rS = AESkS (rC ).
1. Present a message sequence diagram showing how an attacker, which
can eavesdrop on a connection in which the client turned the device ‘on’,
can later turn the device ‘on’ again, without the client being involved.
2. Would your answer change (and how), if the device supports ID-based
session resumption? Ticket-based session resumption?
3. Show a secure method for the server to compute the server-random, which
does not require a source of randomness. The IoT device may use
and update a state variable s; your solution consists of the computation
of the server-random, rS = ______, and of the update to the
state variable, performed at the end of every handshake: s = ______.

Exercise 7.12. Consider a client and server that use TLSv1.2 with ephemeral
DH public keys, as in Fig. 7.12. Assume that the client and server run this
protocol daily, at the beginning of every day i. (Within each day, they may
use session resumption to avoid additional public key operations; but this is
not relevant to the question). Assume that Mal can (1) eavesdrop on com-
munication every day, (2) perform MitM attacks (only) every even day (i s.t.
i ≡ 0 (mod 2)), and (3) is given all the keys known to the server on the fourth
day. Note: the server erases any key once it is no longer in use (i.e., on the
fourth day, the attacker is not given the 'session keys' established on previous days).
Fill in the 'Exposed on' column of day i in the table below, indicating the first day
j ≥ i in which the adversary should be able to decrypt (expose) the traffic sent
on day i between client and server. Write 'never' if the adversary should never
be able to decrypt the traffic of day i. Briefly justify.

Day Eavesdrop? MitM? Given keys? Exposed on... Justify


1 Yes No No
2 Yes Yes No
3 Yes No No
4 Yes Yes Yes
5 Yes No No
6 Yes Yes No
7 Yes No No
8 Yes Yes No

Exercise 7.13 (TLS with PRS). Consider a client that has three consecu-
tive TLS connections to a server, using TLS 1.3. An attacker has different
capabilities in each of these connections, as follows:
• In the first connection, attacker obtains all the information kept by the
server (including all keys).
• In the second connection, attacker is disabled.
• In the third connection, attacker has MitM capabilities.

Is the communication between client and server exposed, during the third con-
nection?

1. Show a sequence diagram showing that with TLS 1.3, communication
during the third connection is exposed to the attacker.
2. Present an improvement to TLS 1.3 that will protect communication
during the third connection.
3. Further improve your solution to provide the same protection, even if the
attacker can eavesdrop on communication during the second connection.

4. How can your improvement be implemented using TLS 1.3, allowing ‘nor-
mal’ TLS 1.3 interaction if client/server do not support your improve-
ment?

Exercise 7.14. A Pierpont prime is a prime number of the form 2^u · 3^v + 1,
where u, v are non-negative integers; Pierpont primes are a generalization of
Fermat primes. Assume that the client's key exchange sent in the Client Hello
message of TLS 1.3 includes {gi^ai mod pi}, where for some i, say i = 3, the
prime p3 is a Pierpont prime.

1. Assume that the server selects to use this p3, i.e., sends back g3^b3 mod p3
(as part of the server-hello message, and signed). Explain how a MitM
attacker would be able to eavesdrop and modify messages sent between
Alice and Bob.
2. Assume that the server prefers p2 , which is a safe prime. Show that the
attacker is able to (adapt the attack and still) eavesdrop/modify messages
between Alice and Bob.
3. Would this attack be possible, if Alice authenticates to the website (1)
using userid-pw, or (2) using TLS client authentication ? Explain.

Chapter 8

Public Key Infrastructure (PKI)

8.1 Introduction: PKI Concepts and Goals


Hopefully, readers who have reached this chapter are already well aware of the
many applications and advantages of public key cryptography, e.g., easier key
distribution, as public keys do not require encryption while being distributed. The
basic feature of public key cryptography, is that the party using a public key,
normally does not know the corresponding private key. Instead, it relies on
this being the ‘correct’ public key for its needs; we therefore refer to this party
as the relying party.
For the relying party to rely on a given public key for some purpose, it
should be able to validate, with acceptable certainty, that this is the correct
public key for the particular purpose and application. But how? This is exactly
the goal of the Public Key Infrastructure (PKI). PKI provides the infrastructure
that allows a relying party to obtain a public key and validate that the public
key fits its purpose. PKI is, therefore, an essential component for practical
deployments of public key cryptography (PKC).
There are multiple proposals of PKIs; we focus on the X.509 standard [99],
which is, by far, the dominant PKI, and mostly on X.509v3, i.e., version 3 of
X.509, which is almost-always the one deployed. X.509 is designed for general-
ity; there are several published X.509 profiles, which define different restrictions
on the contents and use of X.509 certificates. We mostly focus on the most
well-known profile, the PKIX profile, used by Internet protocols, and defined in
RFC 5280 [47]. We also discuss some extensions of X.509, mainly OCSP (RFC
6960) [151] and Certificate Transparency (CT) [114–116]). All of these (X.509
with the PKIX profile as well as CRLs, OCSP and CT) are used in common
implementations of the TLS/SSL handshake protocols, which we discussed in
chapter 7, and in particular its application to secure the communication be-
tween browsers and web-servers; this particular application is often referred to
as the Web PKI. The Web PKI is probably the most well-known, and possibly
also the most important, application of PKI; see subsection 8.1.2.

[Figure 8.1 (diagram): the Certificate Authority (aka CA or issuer) issues the
certificate CB = SignCA.s([Link], Bob.e, ...), binding the subject (e.g., website
[Link]) to Bob's public key Bob.e; the subject provides CB to the relying party,
Alice.]

Figure 8.1: PKI Entities and typical application for server-authentication in


Web-PKI process. Here, we show the simple case of a typical identity certificate
issued by a trusted CA (‘trust anchor’ or ‘root CA’) to website [Link]. Some
PKIs also allow relying parties to use certificates issued by an intermediate-CA
(Figure 8.6) or a path of multiple intermediate-CAs (Figure 8.7). Dashed arrow
represent certificate issuing process, occurring once, before client connections.

Basic PKI concepts: certificate, issuer (CA) and subject. All main-
stream PKIs, including X.509, distribute a public key pk together with a set
ATTR of attributes and a digital signature σ, which is the result of a signa-
ture algorithm applied to input containing both pk and ATTR. The tuple
(pk, ATTR, σ) is called a public key certificate or simply a certificate. The cer-
tificate is issued by an entity referred to as a Certificate Authority (CA) or
simply as the issuer.
Most attributes refer to the subject of the certificate, i.e., the entity who
knows (‘owns’) the private key corresponding to the certified public key pk.
In addition, there are often few additional attributes related to the certificate
itself rather than to the subject, such as the certificate validity period and
serial number.
The CA provides the certificate to the subject, who (often) provides it to
the relying party - for example, during the TLS/SSL handshake.
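As a concrete (toy) illustration of the certificate tuple (pk, ATTR, σ), the following sketch issues and validates such a tuple using textbook RSA with insecurely small, well-known primes; the subject name and attribute names are hypothetical:

```python
import hashlib, json

# Toy textbook-RSA key pair for the CA: the 10,000th and 100,000th primes
# (insecurely small; for illustration only).
p, q, e = 104729, 1299709, 65537
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))  # CA's private signing exponent (Python 3.8+)

def digest(data: bytes) -> int:
    # Hash-then-sign: reduce a SHA-256 digest into Z_n (toy shortcut).
    return int.from_bytes(hashlib.sha256(data).digest(), 'big') % n

def issue(pk, attrs):
    # The CA signs the encoding of (pk, ATTR) with its private key.
    m = json.dumps([pk, attrs], sort_keys=True).encode()
    sigma = pow(digest(m), d, n)
    return (pk, attrs, sigma)  # the certificate (pk, ATTR, sigma)

def validate(cert):
    # A relying party checks sigma against the CA's public key (n, e).
    pk, attrs, sigma = cert
    m = json.dumps([pk, attrs], sort_keys=True).encode()
    return pow(sigma, e, n) == digest(m)

cert = issue(pk=12345, attrs={'CN': 'www.example.com'})
assert validate(cert)
# Tampering with an attribute invalidates the certificate.
assert not validate((cert[0], {'CN': 'www.evil.example'}, cert[2]))
```

Note that the relying party needs only the CA's public key (n, e); the certificate itself can travel over untrusted channels, since any modification invalidates the signature.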

Identity certificates. Many certificates include an identifying attribute, i.e.,


an identifier of the subject; such certificates are referred to as identity certifi-
cates. In the typical server-authentication use of TLS/SSL in the Web PKI, the
relying-party is the browser, the subject is the web-site, and the relevant identifier
is the domain name of the website, e.g., [Link]. This typical use-case is
illustrated in Figure 8.1. Identity certificates may include multiple identifiers,
as well as non-identity attributes.
Figure 8.1 illustrates the basic PKI entities and interactions. The simple

scenario in this figure includes three parties: the Certificate Authority (also
called CA or issuer), the subject of the certificate, i.e., the entity whose public
key is certified, and the relying party, i.e., the party who receives the subject’s
certificate in order to validate the subject’s public key. In this typical example,
the subject is the website [Link], and the relying party is Alice’s browser.
The figure shows the simple case of a typical identity certificate, issued to
website [Link] directly by a CA trusted by the relying party, which, in this case,
is Alice's browser. Such directly-trusted CAs are called 'trust anchors' or 'root
CAs’. In reality, most Web-PKI certificates are indirectly-issued, i.e., issued by
an intermediate-CA (Figure 8.6), or even by a path of multiple intermediate-
CAs (Figure 8.7).
The certificate CB is a signature using the private signing key of the CA,
CA.s (not shown), over the subject’s identity, in this case the domain name
[Link], and the subject’s public key, in this case the public encryption key
Bob.e, typically provided from subject to CA. This is an identity certificate
since it contains the identifier (domain name) [Link]. In Web-PKI, using
TLS/SSL server-authentication, the web-site (subject) [Link] sends the cer-
tificate CB to the relying party (Alice’s browser).

8.1.1 Requirements from PKI schemes.


The basic, high-level goal of a public key infrastructure (PKI) is to allow
a relying party to ensure that it uses the correct public key for its specific needs
and application. This high-level goal motivates different specific requirements.
We now present these informally; let us first focus on requirements addressed
by X.509:
Accountability: it should be possible to identify the issuer (CA), given a
(valid or revoked) certificate.
Validity and revocation: the issuing CA should be able to limit the validity
period of certificates, and to revoke a certificate. Relying parties should
not accept a certificate outside the validity period, or after it is revoked,
plus some constant ∆ time units to account for time to provide revocation
information. Revocation in X.509 is discussed in § 8.4.
Trust management: A PKI facilitates trust management if it includes (‘suf-
ficient’) mechanisms to limit the risk due to a rogue CA, in particu-
lar, a rogue intermediate CA; see Figure 8.6 and Figure 8.7. In subsec-
tion 8.3.5 we discuss the main trust-management mechanisms standard-
ized in X.509.
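The validity-and-revocation requirement can be sketched as a simple check performed by the relying party; the field names and the value of ∆ below are hypothetical:

```python
import time

# Hypothetical grace period DELTA: time allowed for revocation information to propagate.
DELTA = 24 * 3600  # seconds

def accept(cert, revoked_at, now=None):
    """Relying-party check: accept only within the validity period and,
    if revoked, only up to DELTA time units after the revocation time.

    `cert` uses hypothetical 'not_before'/'not_after' fields (epoch seconds);
    `revoked_at` is the revocation time, or None if the certificate is not revoked."""
    now = time.time() if now is None else now
    if not (cert['not_before'] <= now <= cert['not_after']):
        return False
    if revoked_at is not None and now > revoked_at + DELTA:
        return False
    return True

cert = {'not_before': 0, 'not_after': 10_000_000}
assert accept(cert, revoked_at=None, now=5_000_000)
assert not accept(cert, revoked_at=1_000_000, now=1_000_000 + 2 * DELTA)
```

The ∆ grace period models the delay until revocation information (e.g., a CRL or OCSP response, § 8.4) reaches relying parties.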
There are several ‘post-X.509’ PKI designs, which aim to address additional
requirements, including:
Transparency: we say that a PKI ensures transparency if it provides mecha-
nisms allowing parties to be aware of all issued certificates; transparency
mechanism may further allow awareness to only specific subsets of the

certificate set, e.g., certificates with a particular identifier (e.g., domain
name). In § 8.6, we discuss Certificate Transparency (CT), a proposed
standard for extending X.509 with mechanisms to ensure transparency.
Equivocation-prevention/detection: a PKI supports equivocation preven-
tion if there cannot exist two mutually-valid yet different identity cer-
tificates for the same identifier (e.g., domain name). A PKI supports
equivocation detection if such ‘equivocating’ certificates may exist, but
would be detected soon after the second one is issued.
Privacy: Some ‘post-X.509’ PKI designs attempt to ensure different privacy
properties, e.g., to allow certificate subjects to present only a subset of
the attributes from a certificate issued to them, e.g., present a certificate
but only 'exposing' a specific 'property-attribute' such as (age>21), and
not exposing other attributes (such as name or location).

From the three 'post-X.509' requirements, we only discuss transparency (in
§ 8.6).

8.1.2 The Web PKI


In chapter 7 we discussed the use of certificates by the SSL/TLS protocol, used
to secure web traffic and other applications. The SSL/TLS client (often, the
browser) receives a certificate (pk, ATTR, σ) authenticating the server’s public
key pk, and binding it to the server's domain, e.g., [Link], which is specified as
one of the attributes in ATTR. The certificate contains a signature σ; for the
certificate to be valid, σ must be a signature over (pk, ATTR), which validates
correctly using the public validation key of some trusted certificate authority
(CA).
In web security applications, each browser maintains a list of trusted root
certificate authorities (root CAs). These root CAs can also certify additional
CAs, referred to as intermediate CAs; we explain the process later on. To
support SSL/TLS, a web-server for a domain, e.g., [Link], needs a certificate
for that domain, signed by a root or intermediate CA, and, of course, to know
and use the corresponding private key.
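The chain-validation logic described above can be sketched as follows, again using toy textbook-RSA keys (insecurely small, well-known primes) and a hypothetical certificate encoding:

```python
import hashlib, json

def keygen(p, q, e=65537):
    # Toy textbook-RSA keys (small, well-known primes; insecure, illustration only).
    return {'n': p * q, 'e': e, 'd': pow(e, -1, (p - 1) * (q - 1))}

def digest(data: bytes, n: int) -> int:
    # Hash-then-sign shortcut: reduce a SHA-256 digest into Z_n.
    return int.from_bytes(hashlib.sha256(data).digest(), 'big') % n

def sign(key, data):
    return pow(digest(data, key['n']), key['d'], key['n'])

def verify(pub_n, data, sig, e=65537):
    return pow(sig, e, pub_n) == digest(data, pub_n)

def encode(subject, subject_n):
    # Hypothetical 'to-be-signed' certificate body: identifier + public key.
    return json.dumps([subject, subject_n]).encode()

root = keygen(104729, 1299709)   # trusted root CA ('trust anchor')
inter = keygen(100003, 100019)   # intermediate CA
server = keygen(10007, 10009)    # web server, e.g., a hypothetical www.example.com

# The root certifies the intermediate CA; the intermediate certifies the server.
inter_cert = ('CA:inter', inter['n'], sign(root, encode('CA:inter', inter['n'])))
server_cert = ('www.example.com', server['n'],
               sign(inter, encode('www.example.com', server['n'])))

def validate_chain(trusted_n, chain):
    # The browser trusts only trusted_n, and validates the chain link by link.
    n = trusted_n
    for subject, subject_n, sig in chain:
        if not verify(n, encode(subject, subject_n), sig):
            return False
        n = subject_n  # the just-certified key verifies the next certificate
    return True

assert validate_chain(root['n'], [inter_cert, server_cert])
```

Each certificate in the chain is verified with the public key certified by the previous link, so the browser needs to trust only the root.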
Note that SSL/TLS also supports (optional) client authentication; however,
SSL/TLS client authentication requires a client certificate - and only few clients
have these client certificates. Similarly, client certificates are required for end-
to-end secure email services, e.g., using S/MIME [142]; but, again, only a tiny
fraction of the users went through the process of obtaining a client certificate,
and as a result, these secure email services are not widely used. The diffi-
culties of obtaining client certificates are probably one reason for the fact that
most secure messaging applications rely on authentication by the provider, and
sometimes also by the peer user, but not on client certificates.
For further discussion, focusing on weaknesses of the current Web PKI,
see subsection 8.5.1.

2001 VeriSign: attacker gets code-signing certs
2008 Thawte: email-validation (attackers’ mailbox)
2008,11 Comodo not performing domain validation
2011 DigiNotar compromised, over 500 rogue certs discovered
2011 TurkTrust issued intermediate-CA certs to users
2013 ANSSI, the French Network and Information Security Agency, issued
intermediate-CA certificate to MitM traffic management device
2014 India CCA / NIC compromised (and issued rogue certs)
2015 CNNIC (China) issued CA-cert to MCS (Egypt), who issued rogue certs
2015,17 Symantec issued unauthorized certs for over 176 domains

Table 8.1: Some PKI failures.

8.1.3 PKI failures


The basic goal of PKI is to validate that a given public key pk is the correct
public key for a given purpose. Usually, when validation is successful, the
key is indeed associated with the set of attributes ATTR. However, there are
several ways this may fail, in which case certificates must be revoked. These
failures include:

Subject key exposure: private keys should be well protected from exposure;
however, exposures do happen. Normally, exposures are quite rare; how-
ever, a discovery of a software vulnerability may cause exposure of many
private keys, as happened due to the Heartbleed Bug [40, 173].
CA failures: usually, certificate authorities have operated in a secure, trust-
worthy manner, and issued correct certificate to the rightful subjects - as
required and expected. However, there have also been several incidents
where CAs have failed in different ways, including vulnerable subject iden-
tification, e.g., insecure email validation, issuing intermediate-CA certifi-
cates to untrusted entities, e.g. to all customers, and even CA compromise
and issuing of rogue certificates or what appears to be intentional issuing
of rogue certificates. See Table 8.1.
Cryptanalytical certificate forgery: certificate-based PKIs all use and de-
pend on the hash-then-sign mechanism, and therefore become vulnerable
if the signature scheme is vulnerable - or if the hash-function used is
vulnerable. Specifically, certificate forgery was demonstrated using hash
functions vulnerable to chosen-prefix collision attacks: RIPEMD, MD4
and MD5 [158], and later also SHA-1 [120, 158]. See chapter 4.
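The danger of a weak hash in hash-then-sign designs can be demonstrated with a deliberately weak toy hash: any message that collides with a signed one 'inherits' its signature. The 'hash' and 'signature' below are toy stand-ins, not real algorithms:

```python
# A deliberately weak toy hash: sum of the bytes, modulo 2**16.
def weak_hash(m: bytes) -> int:
    return sum(m) % 2**16

SECRET = 0x5EC2E7  # toy signing secret

def toy_sign(m: bytes) -> int:
    # Toy stand-in for hash-then-sign: the 'signature' depends only on the hash.
    return weak_hash(m) ^ SECRET

def toy_verify(m: bytes, sig: int) -> bool:
    return sig == (weak_hash(m) ^ SECRET)

benign = b"CN=www.example.com"   # hypothetical benign certificate body
rogue = b"CN=www.evil.example"   # attacker-chosen rogue body

# The attacker pads the rogue body until it collides with the benign one.
while weak_hash(rogue) != weak_hash(benign):
    rogue += b"\x01"

sig = toy_sign(benign)           # the CA signs the benign certificate...
assert toy_verify(rogue, sig)    # ...and the signature also verifies the rogue one
```

A real chosen-prefix collision attack is far more involved, but the consequence is the same: the CA's signature on a benign certificate becomes valid for a rogue one.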

8.2 Basic X.509 PKI Concepts
In this section, we discuss the basic notions of the X.509 PKI standard, which
was developed as part of the X.500 global directory standard. X.509 is the
most widely deployed PKI specification, and also includes some of the more
advanced PKI concepts which we cover in the following sections.

8.2.1 The X.500 Global Directory Standard


X.500 [42] is an ambitious, extensive set of standards of the International
Telecommunication Union (ITU), a United Nations agency whose role is to
facilitate international connectivity in communications networks. The goal of
X.500 is to facilitate the interconnection of directory services provided by dif-
ferent organizations and systems. The first version of X.500 was published as
early as 1988, and numerous extensions and updates were published over the
years.
The basic idea of X.500 is to provide a trusted, unified and ideally global
directory, by combining the data and services of its multiple component di-
rectories. Such a unified directory would be operated by interoperability of
trustworthy providers, such as telecommunication companies. While implementations
of X.500 exist, its deployment is quite limited, and definitely far
from the vision of a global directory. Among the possible reasons for the lim-
ited deployment are the high complexity of the X.500 design, concerns that
X.500 interoperability may cause exposure of sensitive information, and lack
of sufficient trust among different directory providers.
However, some concepts from X.500 live on; in particular, X.500 contributed
extensively to the development of PKI schemes. The X.500 designers observed
that an interoperable directory should bind standard identifiers to standard
attributes.
One important set of attributes define the public key(s) of each entity. The
entity’s public encryption key allows relying parties to encrypt messages so that
only the intended recipient may decrypt them. Similarly, the entity’s public
validation key allows relying parties to validate statements signed by the entity.
We next discuss the main form of standard identifier defined in X.500: the
distinguished name.

8.2.2 The X.500 Distinguished Name


The design of X.500 was extensively informed by the experience of telecommu-
nication companies at the time, which included provision of directory services
to phone users. Phone directory services are mostly based on looking up the
person’s common name; the common name has the obvious advantage of being
a meaningful identifier - we usually know the common name of a person when
we ask the directory for that person’s information. Phone directories would
normally also allow specification of the relevant area, e.g., in form of locality;

C Country
L Locality
O Organization name
OU Organization unit
CN Common name

Table 8.2: Standard keywords/attributes in X.500 Distinguished Names

by limiting search to specific areas or localities, the directory services can be
decentralized.
However, obviously, a common name is not a unique identifier - in fact,
some common names are quite common, if you excuse the pun. In classical
phone directories, this is addressed by returning a set of results containing all
relevant entries, along with the relevant common-name and other attributes
(e.g., location).
The X.500 designers decided that in order to allow efficient use of large,
global directories, returning multiple results is not a viable option. Instead,
they decided to use a more refined identifier, with multiple keywords - where
the common name will simply be one of these keywords. This identifier is
the X.500 Distinguished Name (DN). The distinguished name was designed to
satisfy the following three main goals for identifiers:

Meaningful: identifiers should be meaningful and recognizable by humans.
This makes it easier to memorize the identifier, as well as to link it with
off-net identifiers, with potential legal and reputation implications.
Unique: identifiers should be unique, i.e., different subjects should have differ-
ent identifiers, allowing each identifier to be mapped to a specific subject.
Decentralized management: multiple, ‘distributed’ issuers, can issue iden-
tifiers, without restrictions, i.e., any issuer is allowed to issue any identi-
fier.
The uniqueness requirement is an obvious challenge, as common names
are obviously not unique. To facilitate unique DNs for people sharing the
same common name, X.500 distinguished names consist of a sequence of several
keyword-value pairs. The inclusion of multiple keywords - also referred to as
attributes - helps to ensure unique identification, when combined with the
common name. Typical, standard keywords are shown in Table 8.2; however,
a directory is free to use any keyword it desires.
To satisfy the ‘meaningful’ goal, identifiers should have readable representa-
tions. RFC 1779 [105] specifies a popular string representation for distinguished
names, where keyword-value pairs are separated by the equal sign, and different
pairs are separated by comma or semicolon. Other representations are possible
too, e.g., Figure 8.2 includes encoding of a DN using slash for separation.
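As a sketch, the RFC 1779-style string representation can be parsed into ordered keyword-value pairs as follows (simplified: the quoting and escaping rules of the full grammar are omitted):

```python
import re

def parse_dn(dn: str):
    """Parse a simplified RFC 1779-style DN into ordered (keyword, value) pairs.

    Pairs are separated by ',' or ';', each pair is keyword=value; the quoting
    and escaping rules of the full RFC 1779 grammar are deliberately omitted."""
    pairs = []
    for part in re.split('[,;]', dn):
        keyword, _, value = part.partition('=')
        pairs.append((keyword.strip(), value.strip()))
    return pairs

dn = parse_dn('CN=Julian Jones, O=IBM, C=GB')
# dn == [('CN', 'Julian Jones'), ('O', 'IBM'), ('C', 'GB')]
```

Keeping the pairs in order matters: as discussed below, a DN is an ordered sequence, so the same pairs in a different order form a different DN.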

Figure 8.2: Example of the X.509 Distinguished Name (DN) hierarchy.

Let us give two simple examples of different legitimate interpretations (and
implementations) of the RFC 1779 representation:

1. The distinguished name (DN) for a police officer named John Doe in the
Soho precinct of the NYPD may be defined as: C=US/L=NY/O=NYPD/OU=soho/CN=John
Doe; see Figure 8.2.
2. The distinguished name (DN) for an IBM UK employee with the name
Julian Jones may be written as: CN=Julian Jones, O=IBM, C=GB.
Read below on the author’s experience with this (realistic) DN.

Note that keyword-value pairs comprising an X.500 distinguished name are


specified in a sequence, i.e., as an ordered list. This allows the distinguished
names to be organized as a hierarchy, using the sequence of keywords as the
nodes, as illustrated in Figure 8.2. By assigning to a specific, single entity the
authority to assign identifiers in a sub-tree of the X.500 DN hierarchy, this entity
can ensure uniqueness by never allocating the same identifier (DN) to two different
subjects; e.g., the Soho precinct of the NYPD may maintain its own sub-directory.
This also allows queries over the entire set of distinguished names which begin
with a particular prefix of keyword-value pairs.
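The prefix-query idea can be sketched over a toy directory of DNs represented as ordered tuples of keyword-value pairs (the entries are hypothetical):

```python
# Toy directory: each DN is an ordered tuple of (keyword, value) pairs.
directory = [
    (('C', 'US'), ('L', 'NY'), ('O', 'NYPD'), ('OU', 'soho'), ('CN', 'John Doe')),
    (('C', 'US'), ('L', 'NY'), ('O', 'NYPD'), ('OU', 'midtown'), ('CN', 'Jane Roe')),
    (('C', 'GB'), ('O', 'IBM'), ('CN', 'Julian Jones')),
]

def prefix_query(prefix):
    # Return all DNs that begin with the given sequence of keyword-value pairs.
    return [dn for dn in directory if dn[:len(prefix)] == prefix]

nypd = prefix_query((('C', 'US'), ('L', 'NY'), ('O', 'NYPD')))
# nypd contains exactly the two NYPD entries
```

Such prefix queries correspond to descending the hierarchy of Figure 8.2 to a particular sub-tree.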
However, this implies that X.500 distinguished names cannot be issued in an
entirely decentralized manner - some control and coordination on the allocation
of identifiers is required. Furthermore, there are also some caveats with respect
to the other goals - unique and meaningful identifiers.
Let us first consider the goal of meaningful identifiers. While the use of sub-divisions
such as 'organization unit (OU)' may help to reduce the likelihood
of two persons with the same common name being in the same 'bin', this possibility
still exists. As a result, administrators may have to enforce uniqueness by

‘modifying’ the common name. For example, if there are multiple IBM UK
employees with the name Julian Jones, one of them may be assigned the DN:

CN=Julian Jones2, O=IBM, C=GB


This results in less meaningful distinguished names; e.g., it is easy to confuse
the DNs of the two employees. For example, the author has sent to
CN=Julian Jones, O=IBM, C=GB messages intended for CN=Julian Jones2,
O=IBM, C=GB, when using an email system that used distinguished names as
email addresses. Luckily, both Julians were understanding of the mistake.
Another cause of mistakes and ambiguity is the fact that there are no rules
governing the order of the keywords, i.e., the structure of the hierarchy, as
is evident from the two examples we presented. In particular, some multi-
national organizations may use the country as the top level category, as in
CN=Julian Jones, O=IBM, C=GB, while others may view the organization
itself as the top-level category, as in CN=Julian Jones, C=GB, O=IBM. These
two distinguished names are different; this distinction may not be obvious to
a non-expert, further detracting from the goal of 'meaningful' names.
There are also cases where uniqueness is not guaranteed. Some namespaces
are shared by design, and cannot be segregated with a single authority assigning
identifiers in each segment. For example, consider Internet domain names;
multiple registrars are authorized to assign names in several top-level domains
such as com and org. There is a coordination process between registrars, but
if not followed correctly, conflicts may occur.
This problem is more severe with respect to public key certificates for In-
ternet domain names, which can be issued by multiple Certificate Authorities;
any faulty authority may issue a certificate to an entity who does not rightfully
own the certified domain name. Such incidents have occurred, often due to
intentional attack; e.g., see Table 8.1. This is a major concern for Web PKI, as
well as for PKI in general, and we discuss it further later in this chapter.
We conclude that X.500 distinguished names are not perfectly meaningful
and definitely not decentralized; furthermore, distinguished names sometimes
may not even perfectly ensure uniqueness. Indeed, there seems to be an inherent
challenge in satisfying all three goals, although achieving any two of these three
properties is definitely feasible - a classical trilemma scenario.

The identifiers trilemma. We argued that X.500 distinguished names may
fail to ensure each of the three goals defined above - uniqueness, meaningfulness
and decentralized management. In contrast, several other identifiers ensure
pairs of these three properties:
Common names are meaningful - and decentralized, as any person can de-
cide on the name. However, they are definitely not unique.
Public keys and random identifiers are decentralized and (almost always)
unique. However, obviously, public keys are not directly meaningful to
humans.

Figure 8.3: The Identifiers Trilemma: the challenge of co-ensuring unique,
decentralized and meaningful identifiers.

Email addresses are unique and meaningful. However, they are not decen-
tralized, since each issuer can only assign identifiers (email addresses) in
its own domain.
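The ‘almost always’ qualifier for random identifiers can be quantified with the union (birthday) bound: for n uniformly-random b-bit identifiers, the probability of any collision is at most n(n-1)/2^(b+1). A minimal sketch:

```python
# Union (birthday) bound on the probability that n uniformly-random b-bit
# identifiers contain at least one collision: Pr <= n*(n-1) / 2^(b+1).
def collision_bound(n: int, bits: int) -> float:
    return n * (n - 1) / 2 ** (bits + 1)

# Even a billion 128-bit random identifiers collide with probability
# below 2^-60: uniqueness 'almost always' holds.
p = collision_bound(10**9, 128)
```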

This begs the question: is there a scheme which will ensure identifiers which
fully satisfy all three properties, i.e., would be unique, meaningful and managed
and issued in a decentralized way? It seems that this may be hard or impossible,
i.e., it may be possible to only fully ensure two of these three goals, but not all
three. We refer to this challenge as the Identifiers Trilemma1 , and illustrate it
in Figure 8.3.

Additional concerns regarding X.500 Distinguished Names. We conclude
our discussion of X.500 distinguished names by discussing a few additional
concerns.
Privacy. The inclusion of multiple categorizing fields in X.500 DNs may
expose information in an unnecessary, and sometimes undesired, manner. For
example, employees may not always want to expose their location or organiza-
tional unit.
Flexibility. People may change locations, organizational units and more; with
X.500 DNs, this may result in an ‘incorrect’ DN, or require changing the DN -
both undesirable.
Usability. X.500 DNs are designed to be meaningful, i.e., users can easily
understand the different keywords and values. However, sometimes this may
not suffice to ensure usability. In particular, consider two of the most important
applications for public key cryptography and certificates: secure web-browsing
and secure email/messaging.
1 This challenge is also referred to as Zooko’s triangle; however, Zooko has apparently

referred to a different trilemma, albeit also related to identifiers. Specifically, Zooko consid-
ered the challenge of identifiers which will be distributed, meaningful for humans, and also
self-certifying, allowing recipients to locally confirm the mapping from name to value.

Secure web-browsing: users, as well as hyperlinks, specify the desired web-
site using an Internet domain name, and not a distinguished name. Hence,
the relevant identifier for the web site is that domain name - provided
by the user or in the hyperlink. This requires mapping from the domain
name to the distinguished name. A better solution is for the certificate
to directly include the domain name; this is supported by the PKIX
standard, explained below.
Secure email/messaging: users also do not use distinguished names to iden-
tify peers with whom they communicate using email and instant mes-
saging applications. Instead, they use email addresses - or application-
specific identification. This problem may not be as significant, since
most end users do not have a public key certificate at all; and, again,
PKIX allows certificates to directly specify email addresses.

8.2.3 X.509 Public Key Certificates


The X.500 standard included a dedicated sub-standard, X.509, which defined
authentication mechanisms, allowing entities to authenticate themselves to the
directory. X.509 defined multiple authentication mechanisms, e.g., the use of
password-based authentication. However, one of these authentication methods
became a very important, widely used standard: the X.509 public key certifi-
cate.
Originally, the main goal of the X.509 authentication was to allow each
entity to maintain its own record with the directory, e.g., to change address.
However, it was soon realized that public key certificates allow many more ap-
plications, since they allow recipients to authenticate the public key of a party
without requiring any prior communication. As a result, X.509 certificates be-
came a widely deployed standard, which is used for SSL/TLS, code-signing,
secure email (PGP, S/MIME), IP-sec and more - and all that, in spite of com-
plaints about the complexity of the X.509 specifications and encoding formats.
The definition of X.509 certificates has not changed much since the first
version of X.509; the contents (fields) of that first version are shown in
Figure 8.4. These fields are, by their order in the certificate:

Version: the version of the X.509 certificate and protocol.


Certificate serial number: a serial number of the certificate, incremented
by the issuing CA whenever it signs a certificate.
Signature-process Object Identifier (OID): this is an identifier of the pro-
cess used for signing the certificate, typically using the ‘hash then sign’ de-
sign. This identifier specifies both the underlying public key signature algo-
rithm, e.g., RSA, as well as the hash algorithm, e.g., SHA-256. The algo-
rithm may be written as a string for readability, and standard string terms
are used for widely used methods, e.g., sha256WithRSAEncryption; no-
tice the use of the term ‘RSA Encryption’ when referring to RSA signatures -
a common misnomer. In the certificate itself, the algorithm is typically
specified using the Object-Identifier (OID) standard; see Note 8.1.

Figure 8.4: X.509 version 1 certificate.
Issuer Distinguished Name: the distinguished name of the certificate au-
thority which issued, and signed, the certificate.
Validity period: the period of time during which the certificate is to be con-
sidered valid.
Subject Distinguished Name: the distinguished name of the subject of the
certificate, i.e., the entity to whom the certificate was issued. This entity
is expected to know the private key corresponding to the certified public
key.
Subject public key information: this field contains two sub-fields. The
first simply specifies the public key of the subject. The second field
identifies the allowed usage of the certified public key - e.g., to encrypt
messages sent to the subject, or to validate signatures by the subject. The
field also specifies the specific public-key algorithm, including key-length,
e.g., RSA-2048. The algorithm is identified using an Object-Identifier
(OID), see note 8.1.
Signature: finally, this field contains the result of the application of the sig-
nature algorithm (identified by the signature-process OID field above),
to all of the other fields in the certificate, using the private signing key
of the issuer (certificate authority). The sequence of all these fields in
the certificate, excluding the signature field itself, is referred to as the
to-be-signed fields; see Figures 8.4, 8.5. This allows the relying party to
validate the authenticity of the fields in the certificate, e.g., the validity
period, the subject distinguished name and the subject public key.

Note 8.1: Object identifiers (OIDs)

An object identifier (OID) is used as a unique identifier for arbitrary objects. OIDs
are standardized by the ITU and ISO/IEC, as part of the ASN.1 standard [41, 62].
Object identifiers are specified as a sequence of numbers, e.g., [Link].45.34. OID
numbers are assigned hierarchically to organizations and to ‘individual objects’;
when an organization is assigned a number, e.g., 1.16, it may assign OIDs whose
prefix is 1.16 to other organizations or directly to objects, e.g. [Link].45.34. The
top level numbers are either zero (0), allocated to ITU, 1, allocated to ISO, or 2,
allocated jointly to ISO and ITU. RFC 3279 [9] defines OIDs for many cryptographic
algorithms and processes used in Internet protocols, e.g., RSA, DSA and elliptic-
curve signature algorithms; when specifying a signature process, the OID normally
also specifies both the underlying public key signature algorithm and key length,
e.g., RSA-2048, and the hashing function, e.g., SHA-256, used to apply the ‘hash-
then-sign’ process. X.509 uses OIDs to identify signature algorithms and other types
of objects, e.g., extensions and issuer-policies. The use of OIDs allows identification
of the specific type of each object, which helps interoperability between different
implementations.
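The encoding of OID values inside certificates follows simple, standard rules: the first two arcs are packed into a single octet (40·arc1 + arc2), and each remaining arc is written base-128, with the high bit set on all but its last octet. A minimal sketch of this encoding (value octets only, omitting the ASN.1 tag and length):

```python
def encode_oid_value(oid: str) -> bytes:
    """DER-encode the value octets of a dotted-decimal OID (no tag/length).

    Per the standard rules, the first two arcs are combined into one octet
    as 40*arc1 + arc2; each subsequent arc is written base-128, most
    significant group first, with the high bit set on all but its last octet."""
    arcs = [int(a) for a in oid.split(".")]
    out = [40 * arcs[0] + arcs[1]]
    for arc in arcs[2:]:
        group = [arc & 0x7F]
        arc >>= 7
        while arc:
            group.append((arc & 0x7F) | 0x80)
            arc >>= 7
        out.extend(reversed(group))
    return bytes(out)

# 1.2.840.113549.1.1.11 is the standard OID for sha256WithRSAEncryption;
# its DER value octets are 2a 86 48 86 f7 0d 01 01 0b.
encoded = encode_oid_value("1.2.840.113549.1.1.11")
```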

Exercise 8.1. Provide a security motivation for the fact that the signature
process is specified as one of the (signed) fields within the certificate. Do this
by constructing two ‘artificial’ CRHFs, h_A and h_B; to construct h_A and h_B,
you may use a given CRHF h. Your constructions should allow you to show
that it could be insecure to use certificates where the signature process (incl.
hashing) is not clearly identified as part of the signed fields. Specifically, design
h_A, h_B to show how an attacker may ask a CA to sign a certificate for one
name, say Attacker, and then use the resulting signature over the certificate to
forge a certificate for a different name, say Victim.

X.509 Certificates: Versions 2 and 3. Following X.509 version 1, the
X.509 certificates were extended by a few additional fields; see Figure 8.5.
Version 2 of X.509 added two fields, both of them for unique identifiers -
one for the subject and one for the issuer (CA). These fields were defined to en-
sure uniqueness, in situations where the distinguished name may fail to ensure
uniqueness, as discussed in subsection 8.2.2. However, these unique identifier
fields are not in wide use, as they are entirely unrelated to the meaningful
identifiers used in typical applications.
Version 3 of X.509 (X.509v3) is the one in practical use; the main rea-
son for its wide success is that it dramatically increased the expressiveness of
X.509 certificates. This is despite the fact that X.509v3 added just one field to
the list of fields in version 2, as can be seen in Figure 8.5. However, this single field
is the general-purpose extensions mechanism, providing extensive flexibility
and expressiveness to certificates, and facilitating many applications and use
cases; this field is usually also much longer than all other fields combined. The
X.509v3 extensions mechanism is the subject of the next subsection.

Figure 8.5: X.509 version 3 certificate. Version 2 is identical, except for not
having the extensions field.

8.2.4 The X.509v3 Extensions Mechanism


As shown in Figure 8.5, X.509 certificates, from version 3, include a field that
can contain one, or more, extensions. Extensions are specified by three simple
fields, which we describe in this subsection; in the following subsections, we
discuss some specific, important extensions.
Each extension is specified, using the following three fields:
Extension identifier: specifies the type of the extension. The extension iden-
tifier is specified using an object identifier (OID), to facilitate interoper-
ability. The following subsections discuss some important extensions,
e.g., key usage and name constraints.
Extension value: this is an arbitrary string which provides the value of the
extension. For example, a possible value for the key-usage extension
would indicate that the certified key is to be used as a public encryption
key, while a possible value for the name constraint extension may be
Permit C=GB, allowing the subject of the certificate to issue its own
certificates, but only with the value ‘GB’ (Great Britain) in their ‘C’
(country) keyword.
Criticality indicator: this is a binary flag, i.e., an extension can be marked
as ‘critical’ or as ‘non-critical’. The value of the criticality indicator flag
in an extension instructs relying parties how to handle the certificate if
the relying party is not familiar with this type of extension, as indicated
by the extension identifier. A relying party must never use a certificate
which includes an extension marked as critical, if this type of extension
is unknown to that relying party. In contrast, a relying party should
ignore any unknown extension marked as ‘non-critical’, and may still use
the certificate. When the relying party is familiar with the type of an
extension, the value of the criticality indicator is not applicable.
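The handling rule above can be sketched as a small relying-party check; this is a toy model, with extensions represented as (oid, critical, value) tuples rather than real ASN.1:

```python
def check_extensions(extensions, known_oids):
    """Relying-party handling of certificate extensions (toy model).

    extensions: list of (oid, critical, value) tuples.
    Returns the recognized extensions to process; raises ValueError if the
    certificate must be rejected (unrecognized critical extension)."""
    usable = []
    for oid, critical, value in extensions:
        if oid in known_oids:
            usable.append((oid, value))    # recognized: process normally
        elif critical:                     # unrecognized + critical: reject
            raise ValueError("unrecognized critical extension: " + oid)
        # unrecognized + non-critical: ignore silently
    return usable
```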

The criticality indicator flag is a simple mechanism - but very valuable, by
allowing both critical extensions and non-critical extensions. Both types of
extensions may be required when extending a standard protocol - in this case,
a certificate. Indeed, some extensions are always marked ‘critical’, others are
always marked ‘non-critical’, and others are marked differently depending on
needs. We next present examples of each of these three types of extensions,
focusing on standard extensions.

Example 8.1 (Always non-critical example). An X.509 extension called TLS
feature is defined in RFC 7633 [88]. This extension, in the server certificate,
indicates that the server supports a specific TLS ‘feature’. The term ‘feature’
here refers to a specific TLS extension (see subsection 7.3.3); we use the term
‘feature’ instead of ‘extension’ to avoid confusion with X.509 extensions.
The ‘TLS feature’ X.509 extension allows the server to indicate to the client
that the server supports certain important TLS features; however, some clients
may not support this X.509 extension, so if it were marked ‘critical’, they would
reject the certificate and the connection would fail. Hence, this X.509
extension should be marked ‘non-critical’. Note that this extension isn’t one of
the standard extensions defined in either X.509 or PKIX.
Example 8.2 (Per-application critical/non-critical example). The ‘extended
key usage’ extension allows the issuer to define allowed usage for the public
key certified, which is in addition to or in place of the usage specified in the
‘key usage’ extension. In some scenarios, the ‘extended key usage’ should be
‘critical’, e.g., to prevent incorrect usage based on the ‘key usage’ extension, by
clients not supporting ‘extended key usage’. In other scenarios, the ‘extended
key usage’ should be ‘non-critical’, e.g., when allowing some additional usage
over that specified already in the ‘key usage’ extension.

Example 8.3 (Always critical example). PKIX specifies that the ‘key usage’
extension must be marked ‘critical’, while X.509 allows the ‘key usage’ exten-
sion to be marked as either ‘critical’ or ‘non-critical’. Let us give a somewhat-
contrived example of a possible attack exploiting a certificate where key-usage
was not marked as ‘critical’, causing a relying party who does not understand
this extension, to make a critical security mistake. Assume that the parties
(key-owner, i.e., certificate subject, and relying party) use ‘textbook RSA’ en-
cryption, i.e., encrypt plaintext m_E by computing c = m_E^e mod n, and ‘text-
book RSA’ signing, i.e., sign message m_S by outputting σ = h(m_S)^d mod n,
i.e., ‘decrypting’ the hash of the message. Furthermore, assume the key-owner
uses its decryption key to authenticate that it is active at a given time, by
decrypting an arbitrary challenge ciphertext sent to it; this requires only a rel-
atively weak form of ciphertext-attack resistance, where the attacker must ask
for the decryption before seeing the challenge ciphertext it must decrypt, often
referred to as IND-CCA1 security and assumed for textbook RSA. A key-owner
using this mechanism must use its key only for decrypting these challenges;
assume it receives a certificate C_E for its encryption key e, with the key-usage
extension correctly marking this as an encryption key, but not marked as crit-
ical.
An attacker may abuse this, together with the fact that ‘key usage’ is not
understood by some relying parties, to mislead these relying parties into thinking
that the key-owner signed some attacker-chosen message m_A, as follows. The
attacker computes c_A = h(m_A) and sends it to the key-owner, as if it were a
standard challenge ciphertext to be decrypted. The key-owner therefore decrypts
c_A and outputs the decryption, c_A^d mod n = h(m_A)^d mod n, which we denote
by σ_A, i.e., σ_A ≡ h(m_A)^d mod n. Now the attacker sends the pair (m_A, σ_A),
along with the certificate C_E, to the relying party, claiming m_A was signed by
the key-owner with signature σ_A. Since the relying party is not familiar with
the key-usage extension, and it was not marked ‘critical’ in the key-owner’s
certificate C_E, the relying party would validate (m_A, σ_A); the pair would
validate correctly, and the relying party would thereby incorrectly consider m_A
as validly signed by the key-owner.
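This attack can be demonstrated end-to-end with tiny, insecure textbook-RSA parameters (illustration only; real RSA uses large primes and padding):

```python
import hashlib

# Tiny, INSECURE textbook-RSA parameters, for illustration only.
p, q = 61, 53
n, e = p * q, 17                        # public modulus and exponent
d = pow(e, -1, (p - 1) * (q - 1))       # private exponent (Python 3.8+)

def h(msg: bytes) -> int:               # hash, mapped into Z_n
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

def decrypt(c: int) -> int:             # the key-owner's decryption oracle
    return pow(c, d, n)

def verify_sig(msg: bytes, sig: int) -> bool:   # textbook-RSA verification
    return pow(sig, e, n) == h(msg)

# The attacker submits h(m_A) as if it were a challenge ciphertext; the
# 'decryption' h(m_A)^d mod n is exactly a textbook-RSA signature on m_A.
m_A = b"pay the attacker"
sigma_A = decrypt(h(m_A))
forged_ok = verify_sig(m_A, sigma_A)    # True: the forgery verifies
```

The demonstration relies only on RSA correctness (m^(ed) ≡ m mod n); it does not depend on the choice of message.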

The flexibility offered by the ‘criticality indicator’ makes the X.509 (and
PKIX) extensions mechanism much more versatile and useful; it is a pity that
this idea has not been adopted by other extension mechanisms. For example,
TLS clients and servers simply ignore unknown extensions, i.e., treat them as
‘non-critical’, as discussed in chapter 7. It would be nice to also allow the definition
of critical extensions, i.e., instructing the TLS peer to refuse connection if it
does not know the extension. This can be achieved quite easily, and in different
ways, see next exercise.

Exercise 8.2. Design how TLS may be extended to support critical extensions.
Could you achieve this using the existing TLS extensions mechanism?

8.3 Certificate Validation and Standard Extensions


Upon receiving a certificate, the relying party must decide whether it can rely
on and use the certified public key, for a particular application. If the cer-
tificate is signed by a CA trusted by the relying party, then the relying party
would immediately apply the certificate validation process, which uses the pub-
lic signature-validation key of the CA, CA.v, to determine if the given cer-
tificate is valid. We refer to such a directly-trusted CA as a trust anchor. If
the certificate is not signed by a trust anchor, then the relying party should
perform a more complex certification path validation process, which would de-
termine if the relying party may trust this certificate, based on an additional
set of certificates.

Most of our discussion applies to both the X.509 specifications as defined
by the ITU, and to their adaptation for use in Internet protocols, as defined by
the IETF in RFC 5280 [47], Public Key Infrastructure (PKIX) certificate and
CRL profile. We point out a few points where PKIX and X.509 differ.
In the following subsections, we discuss the certificate validation processes
and the most important X.509 and PKIX standard extensions. In subsec-
tion 8.3.1, we discuss the validation process for certificates signed by a trust-
anchor. In subsection 8.3.2, we discuss the standard alternative name ex-
tensions, providing alternative identification for both issuer and subject. In
subsection 8.3.3 we discuss standard extensions dealing with the usage of the
certified key, and the certificate policies related to the issuing and usage of the
certificate. In subsection 8.3.5 we discuss standard extensions defining con-
straints on certificates issued by the subject, for the special case where the
subject of a certificate is also a CA. In subsection 8.3.4 we discuss the pro-
cess of certificate path validation, which allows validation of certificates that
are not signed by a trust anchor but, instead, are trusted by establishing trust
in intermediate CAs, together with the related standard extensions.

8.3.1 Trust-Anchor-signed Certificate Validation


We begin our discussion of the X.509 certificate validation process, by consider-
ing the (simpler) case, where the issuer is directly trusted by the relying party,
i.e., a trust anchor. We later discuss the case where the issuer is not trusted
directly, which requires further validation of the certificate path defined by a
given set of additional certificates.
Assume, therefore, that a relying party receives a certificate signed (issued)
by a trust anchor, i.e., the relying party trusts I, the issuer CA, and knows its
public validation key I.v. To validate the certificate, the relying party uses I.v
and the contents of the certificate, as follows:

Issuer. The relying party verifies that the issuer I of the certificate, as iden-
tified by the issuer distinguished name field, is a trusted CA, i.e., a trust
anchor.
Validity period. The relying party checks the validity period specified in the
certificate. If the public key is used for encryption or to validate sig-
natures on responses to challenges sent by the relying party, then the
certificate should be valid at the relevant times, including at the current
time. If the public key is used to validate signatures generated in the
past, then the certificate should have been valid at a time when these
signatures already existed, possibly attested by validation from trusted
time-stamping services.
Subject. The relying party verifies that the subject, as identified using a
distinguished name in the ‘subject’ field, is an entity that the relying party
expected. For example, when the relying party is a browser and it re-
ceives a website certificate, then the relying party should confirm that

the website identity (e.g., domain name) is the same as indicated in the
‘subject distinguished name’ field of the certificate.
Signature algorithms. The relying party confirms that it can apply and
trust the validation algorithm of the signature scheme identified in the
signature algorithm OID field of the certificate. If the certificate is signed
using an unsupported algorithm, or an algorithm known or suspected to
be insecure, validation fails.
Issuer and subject unique identifiers. From version 2, X.509 certificates
also include fields for unique identifiers for the issuer and the subject,
which the relying party should use to further confirm their identities. In
PKIX, these identifiers are usually not used, and PKIX does not require
their validation. This is probably because, in PKIX, issuer and subject
identifiers are typically provided in the corresponding alternative-name
extensions.
Extensions. First, the relying party should validate that it is familiar with
any extension marked as ‘critical’; the existence of any unrecognized ex-
tension, marked as critical, would invalidate the entire certificate. Then,
the relying party should validate the provided set of key and policy ex-
tensions, as discussed in the next subsection. Finally, the relying party
should validate the existence and validity of any non-standard extensions
that are required or supported by the relying party.
Validate signature. The relying party next uses the trusted public validation
key of the CA, CA.v, and the signature-validation process as specified in
the certificate, to validate the signature over all the ‘to be signed’ fields
in the certificate, i.e., all fields except the signature itself.
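The steps above can be collected into a sketch of a relying-party check. This is a toy model: certificates are Python dicts rather than DER structures, all field names are illustrative, and the signature check is a caller-supplied callback:

```python
from datetime import datetime, timezone

def validate_cert(cert, trust_anchors, expected_subject, known_extensions,
                  verify_signature, now=None):
    """Toy trust-anchor validation: returns True iff every check passes.

    cert: dict with 'issuer', 'subject', 'not_before', 'not_after',
          'extensions' (list of (oid, critical, value)), 'tbs', 'signature'.
    trust_anchors: dict mapping issuer name -> public validation key.
    verify_signature(key, tbs, sig): caller-supplied signature check."""
    now = now or datetime.now(timezone.utc)
    if cert["issuer"] not in trust_anchors:          # issuer a trust anchor?
        return False
    if not (cert["not_before"] <= now <= cert["not_after"]):  # validity period
        return False
    if cert["subject"] != expected_subject:          # expected subject?
        return False
    for oid, critical, _ in cert["extensions"]:      # unknown critical ext?
        if critical and oid not in known_extensions:
            return False
    key = trust_anchors[cert["issuer"]]              # signature over TBS fields
    return verify_signature(key, cert["tbs"], cert["signature"])
```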

In the following subsections, we discuss the main standard extensions defined
in X.509 and PKIX, and then, in subsection 8.3.4, the validation of certificates
which are not signed by a trust anchor, but, instead, supported by a set of
certificates.

8.3.2 Standard Alternative-name Extensions


Both X.509 and PKIX define the standard SubjectAltName (SAN) and Issuer-
AltName extensions, which provide alternative identification mechanisms to
complement or replace the Distinguished Name mechanism, identifying, re-
spectively, the subject and the issuer. These alternative fields allow the use
of other forms of names, identifiers and addresses for the subject and/or issuer.
Note that a certificate may contain multiple SANs.
The most important form of an alternative name is a Domain Name System
(DNS) name, referred to as dNSName, e.g., [Link]. These dNSNames are
used by most Internet protocols, and are familiar to most users. Other allowed,
but rarely used, alternative name forms include email addresses, IP addresses,
and URIs.

In fact, the use of alternative names is so common that, in many PKIX certifi-
cates, the subject and issuer distinguished-name fields are left empty. Indeed,
PKIX (RFC 5280) specifies that this must be done when the Certificate Au-
thority can only validate one (or more) of the alternative name forms, which
is often the case in practice. PKIX specifies that in such cases, where the Sub-
jectAltName extension is the only identification and the subject distinguished
name is empty, the extension should be marked as critical; otherwise, when
there is a subject distinguished name, it should be marked as non-critical.
Note that PKIX (RFC 5280) specifies that the Issuer Alternative Name
extension should always be marked as non-critical. In contrast, the X.509
standard specifies that both alternative-name extensions may be flagged as
either critical or non-critical.
Also, note that implementations of the Secure Socket Layer (SSL) and
Transport Layer Security (TLS) protocols often allow wildcard certificates,
which, instead of specifying a specific domain name, use the wildcard notation
to specify a set of domain names. For TLS, this support is clearly defined in
RFC 6125 [149]. Wildcard domain names are domain names where some of
the alphanumeric strings are replaced with the wildcard character ‘*’; there
are often restrictions on the location of the wildcard character, e.g., it may be
allowed only as the complete left-most label of a DNS domain name, as in
*.[Link]. Wildcard domain names are not addressed in PKIX (RFC 5280)
or X.509, and RFC 6125 mentions several security concerns regarding their use.
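The left-most-label restriction can be sketched as a simplified matcher in the spirit of the RFC 6125 rules (not a complete implementation; e.g., it ignores internationalized names and partial-label wildcards):

```python
def matches_hostname(pattern: str, hostname: str) -> bool:
    """Match a hostname against a certificate name, allowing '*' only as the
    complete left-most label: '*.example.com' matches 'www.example.com'
    but not 'example.com' or 'a.b.example.com'.  DNS names are
    case-insensitive, so both sides are lowercased."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):     # wildcard covers exactly one label
        return False
    if p_labels[0] == "*":
        return p_labels[1:] == h_labels[1:]
    return p_labels == h_labels            # no wildcard: exact match
```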

8.3.3 Standard key-usage and policy extensions


We next discuss another set of standard extensions, defined in both X.509 and
PKIX, which deal with the usage of the certified key and with the certificate
policies related to the issuing and usage of the certificate. This includes the
following extensions:

The authority key identifier extension. Provides an identifier for the is-
suer’s public key, allowing the relying party to identify which public val-
idation key to use to validate the certificate, if the issuer has multiple
public keys. It is always non-critical.
The subject key identifier extension. Provides an identifier for the certi-
fied subject’s public key, allowing the relying party to identify that key
when necessary, e.g., when validating a signature signed by one of few
signature keys of the subject - including signatures on (other) certificates.
It is always non-critical.
The key usage extension. Defines the allowed usages of the certified public
key of the subject, including for signing, encryption and key exchange.
The specification allows the use of the same key for multiple purposes, e.g.,
encryption and validating signatures; however, this should not be done,
as the use of the same key for such different purposes may be vulner-
able - security would not follow from the separate security definitions for
encryption and for signatures. An exception, of course, is when using
schemes designed specifically to allow both applications, such as sign-
cryption schemes. This extension may be marked as critical or not.
The extended key usage extension. Allows definition of specific key-usage
purposes as supported by relying parties. The specification also allows
the CA to indicate that other uses, as defined by the key-usage extension,
are also allowed; otherwise, only the specified purposes are allowed. This
extension may be marked as critical or not.
The private key usage period extension. This extension is relevant only
for certification of signature-validation public keys; it indicates the al-
lowed period of use of the private key (to generate signatures). Always
marked non-critical.
The certificate policies extension. This extension identifies one or more
certificate policies which apply to the certificate; for brief discussion of
certificate policies, see Note 8.2. The extension identifies certificate poli-
cies using object identifiers (OID). In particular, the policy OID in the
certificate policies extension, is the main mechanism to identify the type
of validation of the legitimacy of the certificate, performed by the CA
before it issued the certificate - Domain Validation (DV), Organization Valida-
tion (OV) or Extended Validation (EV). For more discussion on certificate
policies and on the three types of validation, see Note 8.2. The certificate
policies extension may be marked as critical or as non-critical.
The policy mappings extension. This extension is used only in certificates
issued to another CA, called CA certificates. It specifies that one of the
issuer’s certificate policies can be considered equivalent to a given (differ-
ent) certificate policy used by the subject (certified) CA. This extension
may be marked as critical or as non-critical.

Exercise 8.3. Some of the extensions presented in this subsection should al-
ways be non-critical, while others may be marked either critical or non-critical.
Justify each of these designations, e.g., for each of these extensions, give an
example of a case where it should be non-critical.

8.3.4 Certificate path validation


Certificate path validation allows validation of certificates which are not signed
by a trust anchor, but are instead trusted by establishing trust in intermediate
CAs, using the related standard extensions. PKI schemes require the relying
parties to trust the contents of the certificate, mainly, the binding between the
public key and the identifier. In the simple case, the certificate is signed by a
CA trusted directly by the relying parties, as in Figure 8.1. Such a CA, which

Note 8.2: Certificate policy (CP) and Domain/Origin/Extended Validation

A certificate policy (CP) is a set of rules that indicate the applicability of the
certificate to a particular use, such as indicating a particular community of relying
parties that may rely on the certificate, and/or a class of relying-party applications
or security requirements for which the certificate may be used. Certificate policies
inform relying parties of the level of confidence they may have in the correctness of
the bindings between the certified public key and the information in the certificates
regarding the subject, including the subject identifiers. Namely, the Certificate
Policy provides information which may assist the relying party to decide whether
or not to trust a certificate for a particular purpose. The certificate policy may also
be viewed as a legally-meaningful document, which may define, and often limit,
the liability and obligations of the issuer (CA) for potential inaccuracies in the
certificate, and define statutes to which the CA, subject and relying parties should
conform; however, these legal aspects are beyond our scope.
One application of the certificate policies extension, and specifically of the policy
OID field, is as a method to identify the type of validation performed by the CA
before issuing the certificate. The type of validation is an indicator of its trustwor-
thiness, and may be used by relying parties to determine their use of the certificate.
Three types of validation are defined, in order of increasing trust: Domain Validation
(DV), Organization Validation (OV) and Extended Validation (EV).
Domain Validation (DV) is a fully-automated - but not very secure - validation
process. It involves sending a request to an address associated with the domain,
and validating the response. The address may be an IP address or email address
(sometimes referred to as email validation). Domain Validation is vulnerable to
MitM attacks; it is also vulnerable to off-path attacks exploiting weakness of the
domain name system (DNS) or of the routing infrastructure; see [93].
Both Organization Validation (OV) and Extended Validation (EV) involve additional
validation (beyond DV), such as review of documents. Extended Validation, as the
name implies, requires more thorough validation - although the precise requirements
are not well defined.
The policy OID field is defined as the standard method to identify Extended Vali-
dation (EV) certificates. The OID field is also often used to identify DV and OV
certificates.
Browsers make minimal use of the type of validation, if at all. Many browsers display
the type of validation of the certificate to the user - usually, only when the user
enters a menu showing the details of the site's certificate (which most users never
do). Some browsers make some user-interface distinction, usually only between EV
certificates vs. other (OV and DV) certificates, but these indications are mostly
considered ineffective and their use seems to be in decline.
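To illustrate, the CA/Browser Forum reserves policy OIDs under 2.23.140.1 to indicate the validation type; the following Python sketch classifies a certificate by the policy OIDs found in its certificate policies extension. Note that many CAs also use their own, CA-specific EV policy OIDs, which this simplified sketch does not cover.

```python
# Sketch: classify a certificate's validation type from its policy OIDs,
# using the CA/Browser Forum reserved OIDs (2.23.140.1.*).

CABF_POLICY_OIDS = {
    "2.23.140.1.1":   "EV",  # extended validation
    "2.23.140.1.2.1": "DV",  # domain validation
    "2.23.140.1.2.2": "OV",  # organization validation
}

def validation_type(policy_oids):
    """Return 'EV', 'OV' or 'DV' based on the certificate-policies OIDs,
    preferring the most trusted type; None if no known OID is present."""
    types = {CABF_POLICY_OIDS[oid] for oid in policy_oids
             if oid in CABF_POLICY_OIDS}
    for t in ("EV", "OV", "DV"):  # order of decreasing trust
        if t in types:
            return t
    return None
```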

is directly trusted by a relying party, is called a trust anchor of that relying
party.
Direct trust in one or more trust-anchor (directly trusted) CAs, might suf-
fice for small, simple PKI systems. However, many PKI systems are more
complex. For example, browsers deploy PKI to validate certificates provided
by a website, during the SSL/TLS handshake. Browsers typically directly
trust a large number - about 100 - of trust anchor CAs, referred to in browsers
as root CAs. Furthermore, in addition to the root CAs, browsers also indi-
rectly trust certificates signed by other CAs, referred to as intermediate CAs;
an intermediate CA must be certified by a root CA, or by a properly-certified,
indirectly-trusted intermediate CA.
Different relying parties may have different trust anchors, and different
requirements for trusting intermediate CAs. The same CA, say CAA , may be
a trust anchor for Alice, and an intermediate CA for Bob, who has a different
trust anchor, say CAB .
Relying parties and PKIs may apply different conditions for determining
which certificates (and CAs) to trust. For example, in the PGP web-of-trust
PKI, every party can certify other parties. One party, say Bob, may decide
to indirectly trust another party, say Alice, if Alice is properly certified by
a ‘sufficient’ number of Bob’s trust anchors, or by a ‘sufficient’ number of
parties which Bob trusts indirectly. The trust decision may also be based on
ratings specified in certificates, indicating the amount of trust in a peer. Some
designs may also allow ‘negative ratings’, i.e., one party recommending not to
trust another party. The determination of whether to trust an entity based
on a set of certificates - and/or other credentials and inputs - is referred to as
the trust establishment or trust management problem, and is studied extensively;
see [34, 35, 94, 95, 122] and citations of and within these publications.
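As a toy illustration of such threshold-based trust establishment, the following sketch (a hypothetical policy, not any specific deployed scheme) computes the set of parties a relying party trusts, given its trust anchors and a mapping of each party to its certifiers:

```python
# Sketch of a simple web-of-trust rule: a party becomes (indirectly) trusted
# if it is certified by at least `threshold` already-trusted parties,
# starting from the relying party's trust anchors.

def trusted_parties(trust_anchors, certifications, threshold):
    """certifications: dict mapping a party to the set of parties certifying it.
    Returns the set of trusted parties (anchors plus indirectly trusted)."""
    trusted = set(trust_anchors)
    changed = True
    while changed:  # iterate until no new party becomes trusted
        changed = False
        for party, certifiers in certifications.items():
            if party not in trusted and len(certifiers & trusted) >= threshold:
                trusted.add(party)
                changed = True
    return trusted
```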
We focus on the simpler case, where a single valid certification path suffices
to establish trust in a certificate. This is the mechanism deployed in most PKI
systems and by most relying parties, and specified in X.509 and PKIX. In both
of these, the validation of the certificate path is based on several certificate path
constraints extensions, which we discuss in the following subsections.

8.3.5 The certificate path constraints extensions


In this subsection, we present the three certificate path constraints extensions:
basic constraints, name constraints and policy constraints. These constraints
are relevant only for certificates issued to a subject, e.g. [Link], by some
intermediate CA (ICA), i.e., a CA which is not directly trusted by the relying
party (say Alice) and hence is not one of Alice's trust anchors.
Since an intermediate CA (ICA) is not a trust anchor for Alice (the relying
party), Alice would only trust certificates issued by the ICA if the ICA
is 'properly certified' by some trust anchor CA; we use TACA to refer to a
specific Trust Anchor CA which Alice trusts, and based on this trust, may or
may not trust a given ICA.

In the simple case, illustrated in Figure 8.6, the relying party (Alice) receives
two certificates: a certificate for the subject, e.g., the website [Link],
signed by some Intermediate CA, which we denote ICA; and a certificate for
ICA, signed by the trust anchor CA, TACA. In this case, we will say that the
subject, [Link], has a single-hop certification path from TACA, since ICA
is certified by the trust anchor TACA. In this case, therefore, the certification
path consists of two certificates: CICA , the certificate issued by the trust anchor
TACA to the intermediate CA ICA, and CB , the certificate issued by the
intermediate CA ICA to the subject ([Link]).
In more complex scenarios there are additional Intermediate CAs in the cer-
tification path from the trust anchor to the subject, i.e., the certification path
is indirect, or in other words, contains multiple hops. For example, Figure 8.7
illustrates a scenario where the subject, [Link], is certified via an indi-
rect certification path with three hops, i.e., including three intermediate CAs:
ICA1, ICA2 and ICA3. The subject [Link] is certified by ICA3, which
is certified by ICA2, which is certified by ICA1, and only ICA1 is certified by
a trust anchor CA, TACA. Hence, in this example, the certification path con-
sists of four certificates: (1) CICA1 , the certificate issued by the trust anchor
TACA to the intermediate CA ICA1, (2) and (3), the two certificates CICA2
and CICA3 , issued by the intermediate CAs ICA1 and ICA2, respectively, to
the intermediate CAs ICA2 and ICA3, respectively, and finally (4) CB , the
certificate issued by the intermediate CA ICA3 to the subject ([Link]).
We use the term subsequent certificates to refer to the certificates in a
certification path which were issued by intermediate CAs, and the term root
certificate or trust-anchor certificate to refer to the 'first' certificate on the path,
i.e., the one issued by the trust-anchor CA. The second certificate along the
path is issued by the intermediate CA certified by the trust anchor (in the
trust-anchor certificate); and any following certificate along the path, say the
ith certificate along the path (for i > 1), is issued by the intermediate CA
which was certified in the (i − 1)th certificate in the path. The length of a
certificate path is the number of intermediate CAs along it, which is one less
than the number of certificates along the path.
Note that, somewhat contrary to their name, the certification path con-
straints cannot prevent or prohibit Intermediate CAs from signing certificates
which do not comply with these constraints; the constraints only provide infor-
mation for the relying party, say Alice, instructing Alice to trust a certificate
signed by an ICA only if it conforms with the constraints specified in the certifi-
cates issued to the intermediate CAs.

8.3.6 The basic constraints extension


The basic constraints extension defines whether the subject of the certificate,
say [Link], is allowed to be a CA itself, i.e., if [Link] may also
sign certificates (e.g., for other domains or for employees). More specifically,
the extension defines two values: a Boolean flag denoted simply cA (with this
non-standard capitalization), and an integer called pathLenConstraint (again,
with this capitalization).
The cA flag indicates if the subject ([Link]) is 'allowed' to issue
certificates, i.e., act as a CA; if cA = TRUE, then [Link] may issue
certificates, and if cA = FALSE, then it is not 'allowed' to issue certificates.
Recall that this is really just a signal to the relying parties receiving certificates
signed by [Link]. Also, this only restricts the use of the certificate
issued to [Link] for validation of certificates issued by [Link];
it does not prevent or prohibit [Link] from issuing certificates, which a
relying party may still trust, either because it directly trusts [Link] (i.e., it is
a trust anchor), or because it also receives an additional certificate for [Link],
signed by a different trusted CA, where that certificate allows [Link] to act
as a CA, e.g., by having the value TRUE for the cA flag in the basic constraints
extension.
The value of the pathLenConstraint is relevant only when there is a 'path'
of more than one intermediate CA between the Trust Anchor CA and the
subject. For example, it is relevant only in Figure 8.7, and not in Figure 8.6.
For example, in both Figure 8.6 and Figure 8.7, the Trust Anchor CA
(TACA) signs certificate CICA1 , where it should specify that ICA1 is a trusted
(intermediate) CA. Namely, it must set the cA flag in the basic-constraints
extension of CICA1 to TRUE. However, in Figure 8.7, ICA1 further certifies
ICA2, which certifies ICA3 - and only ICA3 certifies the subject ([Link]).
Therefore, for the relying party to 'trust' certificate CB for the subject, signed
by ICA3, it is required that CICA1 also contain the path-length (pathLen)
parameter in the basic constraints extension, and this parameter must be at
least 2 - allowing two more CAs until certification of the subject. Similarly, the
certificate issued by ICA1 to ICA2 must contain the basic constraints extension,
indicating cA as TRUE, as well as a value of at least 1 for the pathLen parameter.
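The check described above can be sketched as follows (simplified; a real X.509 path validator does much more, e.g., signature and validity-period checks). Each certificate is represented as a simple dictionary with the cA flag and an optional pathLen value; the path is listed from the trust-anchor certificate down to the subject's certificate:

```python
# Sketch (not a full X.509 path validator): check basic constraints
# along a certification path.

def check_basic_constraints(path):
    # every certificate except the last must belong to a CA (cA = True)
    for i, cert in enumerate(path[:-1]):
        if not cert.get("cA", False):
            return False
        # number of intermediate CAs remaining below this certificate's subject
        remaining = len(path) - i - 2
        path_len = cert.get("pathLen")  # None means: no constraint
        if path_len is not None and remaining > path_len:
            return False
    return True

# The path of Figure 8.7 (TACA -> ICA1 -> ICA2 -> ICA3 -> subject):
# C_ICA1 needs pathLen >= 2, since two more CAs (ICA2, ICA3) follow
# before the subject; C_ICA2 needs pathLen >= 1.
```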
Unfortunately, currently, essentially all browsers fail to enforce path-length
constraints on the root CAs. Root CAs sometimes do enforce path-length
constraints on intermediate CAs; however, these are usually rather long, e.g.,
3, leaving wide room for an end-entity to receive, by mistake, a certificate
allowing it to issue certificates. Of course, in most cases, end-entity certificates
will not allow issuing certificates, typically since their basic-constraints will
indicate that they are not a CA.
Browsers usually enforce the basic constraints extension, although failures
may happen, especially since this kind of flaw - lack of validation - is not likely
to be detected by a normal user.

Exercise 8.4 (IE failure to validate basic constraints). Old versions of the IE
browser failed to validate the basic constraints field. Show a sequence diagram
for an attack exploiting this vulnerability, allowing a MitM attacker to collect
the user's password for trusted sites which authenticate the user using user-id
and password, protected using SSL/TLS.

Exercise 8.5. Assume that TACA is concerned that subject-CAs may issue
certificates to end-entities (e.g., websites) and neglect to include a basic con-
straints extension that prevents the end entity from issuing certificates. Explain
how TACA may prevent such failures, for the scenarios in Figure 8.6 and in
Figure 8.7. Identify any remaining potential for such a failure by one of the
intermediate CAs in these figures.

8.3.7 The name constraint extension


The name constraint extension is used in certificates issued to a subject CA,
such as the intermediate CAs in Figure 8.6 and Figure 8.7. The name constraint
extension restricts the set of subject-names to be certified by the subject CA, as
well as by any subsequent CA. For example, in Figure 8.7, a name constraint in-
cluded in certificate CICA1 , issued by TACA to ICA1, would restrict certificates
issued by ICA1, ICA2 and ICA3.²
The name constraint extension has two possible parameters, which we de-
note by the names³ permit (to define permitted name spaces) and exclude (to
forbid name spaces, typically within the permitted name space). Focusing on
the PKIX profile, both parameters are identifiers for names, where usually
the name is a domain name; we focus on this case. When a domain name
is specified, it is taken to include sub-domains; e.g., if a name constraint
contains the permit parameter (only) for domain name com, then this allows sub-
domains such as [Link], but not names in other top-level domains such as
[Link]. The exclude parameter takes precedence; i.e., if a certificate contains
both permit for a domain name, say edu, and exclude for subdomain [Link],
then this allows subsequent certificates only for domains in the edu top-level
domain, excluding domains in the subdomain [Link]. See examples in
the tables in Figures 8.6 and 8.7.
Note that these examples focus on the typical case of DNS domain names;
however, the restrictions may apply to other types of names, e.g., email ad-
dresses or X.509 distinguished names.
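The permit/exclude semantics for DNS names described above can be sketched as follows (simplified; the domain names in the test cases are hypothetical placeholders):

```python
# Sketch of permit/exclude name-constraint matching for DNS names
# (simplified; real PKIX name constraints also cover other name forms).

def in_domain(name, domain):
    """True if `name` equals `domain` or is a sub-domain of it."""
    return name == domain or name.endswith("." + domain)

def name_allowed(name, permit=None, exclude=None):
    """permit/exclude are lists of domain names (None = no constraint);
    exclude takes precedence over permit."""
    if exclude and any(in_domain(name, d) for d in exclude):
        return False
    if permit is not None:
        return any(in_domain(name, d) for d in permit)
    return True
```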
Figure 8.8 presents an example of a typical application of the name con-
straint extension, using X.509 distinguished names. In this example, the NTT
Japan CA issues a certificate to IBM Japan, allowing the IBM Japan CA to
certify only certificates with the value 'IBM' for the organization (O) keyword,
implying that IBM Japan cannot certify other organizations. Also, note IBM
Japan certifying the 'main', corporate IBM CA, but excluding sites where the
value of the country (C) keyword is Japan, i.e., not allowing the corporate IBM
CA to certify sites in Japan, even IBM sites. Notice that such a certificate issued
by corporate IBM would also be trusted by relying parties using only NTT Japan
² The name constraints in CICA1 would also restrict certificates issued by the subject
([Link]); we didn't list this above, since the subject's certificate, CB , should prevent
the subject from issuing certificates, using the basic constraints extension.
³ The actual parameter names are permittedSubtrees and excludedSubtrees, which are a
bit cumbersome.


CICA constraints extensions:

      Basic            Naming                        Policy          CB
  #   cA   pathLen     Permit    Exclude             Req. Policy    valid?
  1   No   (any)       (any)     (any)               (any)          No
  2   Yes  (any)       [Link]    none or [Link]      none or > 1    Yes
  3   Yes  (any)       [Link]    (any)               (any)          No
  4   Yes  (any)       [Link]    [Link]              (any)          No
  5   Yes  (any)       (any)     (any)               0              No
  6   Yes  (any)       [Link]    (any)               (any)          No
  7   Yes  (any)       (none)    [Link]              (any)          No

Figure 8.6: A single-hop (length one) certificate-path, consisting of trust-
anchor CA TACA, an intermediate CA ICA, and a subject (e.g., website
[Link]). The table shows the impact of several examples of certificate
path constraints extensions in certificate CICA , on the validity of CB ; see dis-
cussion in subsection 8.3.5. Each row is one example of the constraints in CICA ;
for all of them, assume that the certificate is for domain name [Link]
and has no certificate policies extension. For example, in row 1, CICA does not
have the cA flag set (true); namely, CICA does not indicate that ICA is a CA,
and hence CB is invalid. In contrast, in row 2, certificate CB is valid, since the
cA flag is true, the naming constraints permit [Link] and do not exclude
[Link], and either there is no policy constraint or its value is more than 1.


CICA1 constraints extensions:

      Basic                 Naming                        Policy          CB
  #   cA   pathLen          Permit    Exclude             Req. Policy    valid?
  1   Yes  < 2              (any)     (any)               (any)          No
  2   Yes  none or ≥ 2      [Link]    none or [Link]      none or > 3    Yes
  3   Yes  (any)            (any)     (any)               ≤ 3            No
  4   Yes  (any)            [Link]    (any)               (any)          No
  5   Yes  (any)            (none)    [Link]              (any)          No

Figure 8.7: A length-3 certificate-path, consisting of trust-anchor CA TACA,
three intermediate CAs (ICA1, ICA2, ICA3), and a subject (e.g., website
[Link]). The table shows the impact of the different certificate path
constraints extensions (see subsection 8.3.5), in particular, of the pathLen (path
length) parameter of the basic constraints extension. For the examples in the
table, assume that none of the certificates has the certificate policies extension,
that the intermediate certificates CICA1 , CICA2 , CICA3 all have the cA flag
set in 'Basic constraints', and that CICA2 , CICA3 do not have any other
constraints. For example, in row 1, CB is invalid, since the pathLen field in
the Basic-constraints extension of CICA1 is set to less than 2 (and the path
from ICA1 to ICA3 is of length two). In contrast, in row 2, the pathLen
constraint does not exist (or is satisfied), and the other constraints in CICA1
are also set to allow the certificate path to be valid (compare to the examples
in Figure 8.6).

Figure 8.8: Example of the use of Name Constraint, where the constraints are
over distinguished-name keywords. NTT Japan issues a certificate to IBM
Japan, with the name constraint Permit O=IBM, i.e., allowing it to certify only
distinguished names with the value 'IBM' for the 'O' (organization) keyword,
since NTT Japan does not trust IBM Japan to certify other organizations. IBM
Japan certifies the global IBM CA, only for names in the IBM organization (Permit
O=IBM), and excluding names in Japan (Exclude C=Japan). Similarly, global
IBM certifies Symantec for all names, except names in the IBM organization.

as a trust anchor, provided that other relevant constraints, such as certificate
path length, are satisfied (or not specified).
Figure 8.9 presents a similar example, but using DNS domain names instead
of X.509 distinguished names.
Unfortunately, currently, essentially all browsers fail to enforce naming
constraints on the root CAs, and root CAs rarely enforce naming constraints
on intermediate CAs. Therefore, although we believe most browsers do support
name constraints, name constraints are rarely actually used in practice.

8.3.8 The policy constraints extension


In addition to the basic constraints and name constraint extensions, X.509 and
PKIX also define a third standard extension that defines additional constraints
on subsequent certificates. This is the policy constraints extension, which is
related to the certificate policies and certificate policy mappings extensions;
see Note 8.2.
The policy constraints extension allows the CA to define two requirements
which must hold, for subsequent certificates in a certificate path to be consid-
ered valid:
requireExplicitPolicy: if specified as a number n, and the path length is
longer than n, then all certificates in the path must include a certificate
policy acceptable to the user.

Figure 8.9: Example of the use of Name Constraint, with similar constraints
to the ones in Figure 8.8, but here using DNS names (dNSName).
inhibitPolicyMapping: if specified as a number n, and the certificate path is
longer than n, say C1 , . . . , Cn , Cn+1 , . . ., then Cn+1 and any subsequent
certificate should not have a policy mapping extension.
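A sketch of these two checks, following the simplified description above (the full RFC 5280 processing rules are more involved):

```python
# Sketch: check the two policy-constraints requirements on a certificate
# path (simplified). Each certificate is a dict of its extensions; `path`
# lists the certificates from the trust-anchor certificate onward.

def check_policy_constraints(path, require_explicit=None, inhibit_mapping=None):
    ok = True
    if require_explicit is not None and len(path) > require_explicit:
        # every certificate in the path must carry a certificate policy
        ok = all(cert.get("policies") for cert in path)
    if ok and inhibit_mapping is not None:
        # certificates after position n must not contain policy mappings
        ok = all("policy_mappings" not in cert for cert in path[inhibit_mapping:])
    return ok
```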

8.4 Certificate Revocation


In several scenarios, it becomes necessary to revoke an issued certificate, prior
to its planned expiration date. This may be due to security lapses or due to
administrative reasons:

Revocation due to security lapses: to mitigate key compromise or an in-
correctly issued certificate, e.g., due to CA compromise, failure of the CA to
properly validate, or failure of the CA to indicate constraints; see further
discussion in subsection 8.1.3.
Administrative revocation: due to a non-security-related need to terminate
usage. RFC 5280 [47] lists a surprisingly long list of reasons for such
'administrative revocation'; let us give a few examples here too: (1) a
change required in the distinguished name or another attribute, (2) a key
replacement, e.g., due to a proactive decision to change to a different
algorithm, and (3) the organization or company that the certificate
belonged to has ceased operations.

However, this raises the question: how to revoke the certificates, i.e., how
to inform the relying parties that a certificate was revoked? This turns out to
be a significant challenge - definitely much larger than originally anticipated
in the early X.509 design, which initially offered only one solution, the Certificate
Revocation List (CRL); CRLs are widely implemented, but often not used due
to their excessive overhead, as we discuss below.
Indeed, there is still no consensus on the 'best' revocation mechanism. We
discuss several revocation mechanisms in the following subsections, focusing on
those that are already deployed. Part of the difficulty in converging on the 'best'
solution is that the actual patterns of certificate revocation in practice are not
sufficiently studied and understood, although there are definitely some good
studies, e.g., [173]. Another challenge is that there are multiple considerations
and trade-offs, including communication (bandwidth) overhead, delay (until
revocation information arrives), handling failure to receive revocation data,
and privacy (exposure of interest in a specific certificate).

Prefetch (‘push’) vs. As-needed (‘pull’) collection of revocation data.


One common categorization of revocation mechanisms is between prefetch (or
push) and as-needed (pull) mechanisms; let us explain these two categories, as
well as mechanisms not falling into either (sometimes referred to as network-
assisted). Relying parties need to determine revocation status when they need
to determine if a specific certificate is valid - e.g., when a TLS client receives
a certificate from a TLS server. To determine revocation status, the relying
party needs relevant information; should this information be prefetched in
advance, or fetched only as needed? Fetching the information only as-needed
(also referred to as pull) avoids unnecessary communication and storage, but
raises delay (waiting for information to arrive), reliability (what to do if
revocation information is unavailable), and privacy concerns (e.g., exposing the
website being visited). Prefetching the information (push) may cause
unnecessary communication and storage overhead. We will see that some of
the revocation mechanisms prefetch revocation data (push), some collect it only
as-needed (pull), and other approaches do not fall exactly into either category,
since they also involve other entities, such as the subject (e.g., website), and
are sometimes referred to as network-assisted.

8.4.1 Certificate Revocation List (CRL)


The X.509 designers probably expected revocation to be a rare incident, with
a small number of certificates which were revoked (but not yet expired) at
any given time. In this case, a simple solution is for the CA to periodically
distribute a list of all revoked certificates; a daily period (24-hours validity) is
common. This approach is defined as part of the X.509 standard, and called a
Certificate Revocation List (CRL). A relying party can use the CRL to detect
if a certificate issued by the CA was revoked.
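The relying party's CRL check can be sketched as follows (simplified: we represent the CRL as a plain dictionary and omit the CA's signature over it):

```python
# Sketch: a relying party's CRL check. A CRL here is a dict with the set of
# revoked serial numbers and its validity period (thisUpdate/nextUpdate are
# the actual X.509 CRL field names); the CA's signature is not modeled.

import datetime

def crl_check(crl, serial, now):
    """Return 'revoked', 'good', or 'stale' (CRL too old to rely on)."""
    if not (crl["thisUpdate"] <= now <= crl["nextUpdate"]):
        return "stale"
    return "revoked" if serial in crl["revoked"] else "good"
```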
CRLs are defined as part of the X.509 and PKIX standards, with several
variants and extensions, including an extension mechanism much like that of
X.509 certificates. The contents of the X.509 CRL are shown in Figure 8.10.
Figure 8.10: X.509 Certificate Revocation List (CRL) format.

Bandwidth overhead is a major concern with CRLs, since CRLs can often
be quite large. Measurements of CRL overhead were reported in [173], who
found a median CRL of 51KB and a maximal CRL of 76MB (yes, MegaBytes!),
and in [157], who found an average CRL of 173KB. The reason for that
is that the number of revocations may be surprisingly high; specifically, [173]
found that about 8% of the non-expired certificates were revoked. In truth,
most of these were caused by a large spike in revocation due to the Heartbleed
bug [40], however, even looking at their measurements before this, about 1% of
the non-expired certificates were revoked - which can still result in excessively
long CRLs.
Three standard X.509 extensions are designed to reduce the bandwidth
overhead of CRLs:
The CRL distribution point extension splits the certificates issued by a CA
into several sets ('distribution points'), each handled by a separate CRL.
To validate a specific certificate, you (only) need the CRL for that distri-
bution point. This reduces the length of each CRL, at the cost of requiring
the CA to sign and distribute multiple CRLs (periodically). Some CAs
adopt CRL distribution points to significantly reduce the size of their
CRLs; however, notice that the result is that the relying party (client)
almost always has to download the relevant CRL upon receiving the cer-
tificate, which can cause significant delay - and the communication cost
is still significant, especially since many CAs do not adopt this mechanism
or still have many certificates in each CRL.
The Authorities Revocation List (ARL) extension lists only revocations
of CA certificates. This is essentially equivalent to placing CA certificates
in a dedicated distribution point.
The Delta CRL extension lists only new revocations, which occurred since
the last base-CRL. To validate that a given certificate is not revoked, check
that it is contained neither in the Delta-CRL nor in a base-CRL issued not
earlier than the time specified in the Delta-CRL. However, for this method
to be effective, relying parties should cache CRLs beyond their validity
period (in order to download only the deltas); this would probably negate
much of the savings obtained using this method, and make implementations
quite complex. Possibly due to such concerns, Delta-CRLs are not widely
deployed.
Even with such optimizations, CRLs may still introduce significant band-
width overhead.

CRLs: prefetch (‘push’) or on-demand (‘pull’)? CRLs may be prefetched,
i.e., the relying party collects all CRLs before it needs to check a certifi-
cate. However, this must be done regularly, to make sure that the revocation
information is reasonably updated (fresh); and the amount of data is, as men-
tioned, considerable. Therefore, implementations usually fetch the CRLs only
as-needed (‘pull’), with the associated concerns of delay, reliability and privacy
exposure, on top of the significant communication overhead.
As a result of these concerns, the use of CRLs is not very common - although
they are implemented in many products. In particular, many browsers do not
check CRLs; often, instead, they rely on OCSP (see subsection 8.4.3), or on
optimized prefetch mechanisms, which we discuss next.

8.4.2 Optimized Prefetch (‘push’) Revocation Mechanisms


The large overhead of CRLs, especially when prefetched ('push' approach), moti-
vated efforts to optimize their operation. One simple approach is to prefetch
only a subset of the revocation information, focusing on revocation information of
'important' certificates; this essentially follows the ARL extension approach,
mentioned above. Currently, such optimizations are deployed by several major
browsers, albeit in a proprietary manner, which may include some (undisclosed)
optimizations. Specifically, this includes Google's CRLsets and Mozilla's OneCRL [85].
Both CRLsets and OneCRL prefetch revocation information only for a small
subset of certificates; in 2015, Liu et al. reported in [173] that only 0.35% of
the revoked certificates were reported in CRLsets. OneCRL is restricted to
revocations of intermediate CAs.
Several recent papers [113, 157] investigated optimized prefetch mechanisms,
which may allow prefetching to extend to much larger collections of certificates
- with acceptable overhead. Specifically, Smith et al. [157] propose the use of
Certificate Revocation Vectors (CRVs), compressed bit vectors which contain
'1' for revoked certificates and '0' for non-revoked certificates. Even without
compression, a CRV may be significantly more efficient than a CRL when a sig-
nificant fraction of the certificates is revoked (e.g., 8%). Simple compression
can make CRVs more efficient still, especially for lower revocation percentages;
see [157] for details.
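A back-of-the-envelope comparison of the two representations, under the assumptions used in this chapter (96-bit serial numbers in the CRL, one bit per certificate in the uncompressed CRV; signatures and other fixed fields ignored):

```python
# Sketch: compare the (uncompressed) sizes of a CRL and a CRV.
# The CRL lists a 96-bit serial number per revoked certificate; the CRV
# holds one bit per certificate, revoked or not.

def crl_size_bytes(num_revoked, serial_bits=96):
    return num_revoked * serial_bits // 8

def crv_size_bytes(num_certs):
    return (num_certs + 7) // 8  # one bit per certificate, rounded up

# With 100,000 certificates, the CRV is a fixed 12,500 bytes, while the CRL
# grows with the revocation rate: 12,000 bytes at 1%, 96,000 bytes at 8% -
# illustrating why the CRV wins when a significant fraction is revoked.
```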

OCSP Client (e.g., relying party) → OCSP Responder (CA or trusted OCSP server):
    OCSP request: version, {CertID1 , . . .} [, signature] [, extensions]
OCSP Responder → OCSP Client:
    OCSP response: ResponseStatus, producedAt, responses, signature

Figure 8.11: The Online Certificate Status Protocol (OCSP). The request in-
cludes one or more certificate identifiers {CertID1 , . . .}; requests are optionally
signed. The OCSP response is signed by the responder, and includes response
for each CertID in the request. Each of these ‘individual responses’ includes
the CertID, cert-status, time of this update, time of the next update, and
optional extensions. Cert-status is either revoked, good or unknown.

Exercise 8.6. Assume a set of a million certificates, identified by their serial
number (from 1 to one million), encoded, where necessary,
by the minimal number of bits sufficient for these one-million distinct serial
numbers. What will be the size (in bytes) of the CRL, assuming 1% and 8%
of the certificates were revoked? Ignore the size of elements of the CRL except
the list of serial numbers of revoked certificates, and assume (typical although
sub-optimal) 96-bit representation for each serial number. Repeat, for (uncom-
pressed) CRV.

8.4.3 Online Certificate Status Protocol (OCSP)


OCSP (Online Certificate Status Protocol) [151], shown in Figure 8.11, is a
request-response protocol, providing a secure, signed indication to the relying
party of the 'current' status of certificates (details below). The protocol
involves two entities: the OCSP client, who sends an OCSP request to request
the status of one or more certificates, and the OCSP responder (server), who
responds with a (signed) OCSP response, indicating the status of the certifi-
cate(s).
The OCSP client, i.e., the entity that sends the OCSP request, is either the
relying party or another party. In this subsection, we focus on the ‘classical’
OCSP deployment, where the relying party, e.g. browser, sends the OCSP
request to the CA (or other OCSP responder), as in Figure 8.12; in this case,
the relying party (often browser) acts as the OCSP client. Later, in subsec-
tion 8.4.4, we discuss the stapled-OCSP deployment, where it is the subject,
e.g. website, who sends the OCSP request, i.e., the subject (often website) acts
as the OCSP client.
TLS client (browser) → TLS (web) server: TLS Client Hello
TLS server → TLS client: TLS Server Hello
TLS client → OCSP Responder (often the CA): OCSP request
OCSP Responder → TLS client: OCSP response
TLS client → TLS server: TLS key exchange, finish
TLS server → TLS client: TLS finish

Figure 8.12: OCSP used by relying party (as OCSP client). There are several
concerns with this form of using OCSP, including privacy exposure, overhead
on the CA, and handling of a delayed/missing OCSP response by the
client/browser. This last concern, illustrated in Figure 8.13, motivated updated
browsers to support and prefer OCSP-stapling (see Figure 8.14), where the
TLS/web server makes the OCSP request, instead of the client/browser, and
‘staples’ the OCSP response to the TLS server hello message.

The OCSP responder, i.e., the entity that processes OCSP requests and
sends responses, is an entity trusted by the relying party; we will assume this
is the CA itself, although it could also be another entity, delegated by the
CA. Each OCSP response message is signed by the OCSP responder or the
CA, allowing the relying party to validate it, even if received via an untrusted
intermediary, e.g., the subject (website).
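The core idea, a signed, per-certificate status response, can be sketched as follows; for a self-contained example we use an HMAC (stdlib) as a stand-in for the responder's public-key signature, and JSON instead of the ASN.1 encoding used by real OCSP:

```python
# Sketch of the OCSP response idea: the responder returns one signed message
# covering the status of all requested CertIDs. HMAC stands in for the
# responder's signature; real OCSP uses public-key signatures and ASN.1.

import hmac, hashlib, json

def ocsp_respond(key, revoked, cert_ids, produced_at):
    responses = [{"certID": c,
                  "status": "revoked" if c in revoked else "good"}
                 for c in cert_ids]
    body = json.dumps({"producedAt": produced_at, "responses": responses},
                      sort_keys=True)
    sig = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "signature": sig}

def ocsp_verify(key, response):
    # validate the signature, even if the response arrived via an untrusted
    # intermediary (e.g., the subject website, as in OCSP stapling)
    expected = hmac.new(key, response["body"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, response["signature"])
```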

Improving efficiency with multi-cert OCSP requests. To improve effi-
ciency, a single OCSP request may specify (request status for) multiple cer-
tificates (CertIDs)⁴. Correspondingly, a single OCSP response, using a single
signature, may include (signed) responses for multiple certificates. The support
for OCSP requests and responses for multiple certificates is especially impor-
tant to support certificates signed by intermediate CAs, using a Certificate-
Path; see Note 8.3.

OCSP vs. CRLs. The length of an OCSP response is linear in the number
of CertIDs in the corresponding OCSP request, rather than a function of
the total number of revoked certificates of this CA, as is the case for CRLs.
Furthermore, the computation required for sending an OCSP response is just
4 Certificate identifiers (CertIDs) may be specified using the hash of the issuer name and

key, and a certificate serial number.

353
Note 8.3: Multi-cert OCSP requests for Certificate-Path (CP)

An indirectly trusted certificate, certified via a certificate-path of (one or more)


intermediate CAs, may be invalidated via revocation of any of these intermediate
CAs. A relying party wishing to validate the status of the certificate, needs an
updated status of the certificate of each intermediate CA, in addition to the status of
the certificate of the subject. The fact that an OCSP request may include multiple
certificates, may allow this process to be more efficient; a single OCSP request-
response interaction may suffice to obtain updated status for all of these certificates,
provided that the same OCSP responder is able to provide (signed) OCSP responses
for all of these certificates (issued by different CAs).
Note that the original OCSP stapling support in TLS, as defined in the certificate-
status extension, does not support stapling of multiple certificates. To support
this important case, browsers and servers should use the later-defined ‘multiple
certificate status’ extension, RFC 6961 [139].

one signature operation, plus some hash function applications, regardless of


the number of revoked certificates or the number of certificates whose status is
requested in this OCSP request. In the common case where the total number
of revoked certificates may be large, this significantly reduces the overhead
of generating and distributing often large CRL responses. Namely, OCSP
provides an alternative which is often more efficient than CRLs; with CRLs,
the CA must ‘push’ the list of all revocations to all relying parties, while with
OCSP, a relying party receives information only about relevant certificates.
In addition, OCSP responses are sent on a timely fashion, when the relying
party is validating the relevant certificate - which may provide a more ‘fresh’
indication compared to the periodical CRL. As a result of these advantages,
OCSP appears to be deployed more than CRLs.
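To make the CertID identifiers discussed above concrete, here is a minimal sketch: a certificate identifier built from hashes of the issuer's name and key plus the serial number, several of which could be listed in one multi-cert request. This is only an illustration; it is not the actual ASN.1 CertID structure of the OCSP standard, and all field values are made up.

```python
# Illustrative sketch of CertID-like identifiers; NOT the real DER/ASN.1
# encoding used by OCSP. The issuer name, key and serial numbers are made up.
import hashlib

def cert_id(issuer_name: bytes, issuer_key: bytes, serial: int):
    """Identify a certificate by hashes of its issuer's name and key,
    plus the certificate serial number."""
    return (hashlib.sha256(issuer_name).hexdigest(),
            hashlib.sha256(issuer_key).hexdigest(),
            serial)

# A single (multi-cert) request may carry several CertIDs; the responder can
# then answer all of them under one signature.
request = [cert_id(b"CN=Example Root CA", b"<root-public-key>", s)
           for s in (1001, 1002, 1003)]
assert len(request) == 3
assert request[0][:2] == request[1][:2]   # same issuer, different serials
```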

OCSP Challenges: ambiguity and failures. OCSP status responses for each certificate may specify one of three values: revoked, good or unknown. The 'unknown' response is typically sent when the OCSP responder does not serve OCSP requests for the issuer of the certificate in question, or cannot resolve its status at the time (e.g., due to lack of response from the CA). These 'unknown' responses are ambiguous; relying parties are left to decide how to interpret and respond to them. These ambiguous responses are quite problematic, as we explain below. But first, let us discuss another OCSP scenario that also leads to similar ambiguity: failed requests.
An OCSP request may fail in multiple ways. One way is when the OCSP client fails to establish communication with, or to receive a response from, the OCSP responder. Another is when the OCSP responder sends back an OCSP failure return code, indicating a reason for the failure. These reasons include:
• Lack of a signature on the OCSP request (when required by the OCSP responder).
• Request not properly authorized/authenticated, e.g., not from a known IP address, or missing/incorrect authentication information, when required by the OCSP responder; authentication information should be provided by the client in an appropriate OCSP extension.
• Technical reasons, such as overload or an internal error.

Recall now that in the 'classical' OCSP deployment, the OCSP client is the relying party, typically the browser, as in Figure 8.12. However, this creates a dilemma for the browser (or other relying party): how should the relying party respond to OCSP failures and ambiguous responses, e.g., when a response does not arrive (within reasonable time) or indicates an OCSP failure? The following are the main options, and why each of them seems unsatisfactory:

Wait: if the problem is a timeout, then the relying party may simply continue waiting for the OCSP response, possibly resending the request periodically, and never 'giving up'. However, OCSP servers could fail or become inaccessible forever, or for an extremely long time, leaving the relying party stuck in this state. We do not believe any relying party has taken, or will take, this approach; also, it does not address the other types of OCSP ambiguities.
Hard-fail: abort the connection (and inform the user). This is clearly a 'safe' alternative, i.e., it prevents the use of a revoked certificate. However, the OCSP interaction may often fail, or return an ambiguous response, due to benign reasons, such as network connectivity issues or overload of the OCSP responder. In particular, usually, the OCSP responder is the CA, and CAs often do not have sufficient resources to handle a high load of OCSP requests. Therefore, this approach is not widely adopted.
Ask user: the relying party may, after some timeout, invoke a user-interface dialog and ask the user to decide whether to continue with the connection or abort it. For example, a browser may invoke a dialog informing the user that the certificate-validation process is taking longer than usual, and ask the user what action to take. While this option may seem to empower the user, in reality, users are rarely able to understand the situation and make an informed decision, and are very likely to continue with the connection; see the discussion of usability in chapter 9. Hence, except for 'shifting the responsibility' to the user, this option is inferior to direct soft-fail, discussed next.
Soft-fail: finally, the relying party may simply continue as if it received a valid OCSP response. By far, this is the most widely-adopted option. In the typical case of a benign failure to receive the OCSP response, there is no harm in picking this option. However, this choice leaves the user vulnerable to an impersonation attack using a revoked certificate, whenever the attacker can block the OCSP response; see Figure 8.13. Since our need for cryptography is mainly due to concerns about a Monster-in-the-Middle attacker, who can surely block communication, this option results in a vulnerability.
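The dilemma and the options above can be condensed into a small decision function. This is only a sketch in Python, not actual browser code; the function name `decide` and its inputs are hypothetical, and the 'wait' option (which simply never returns) is omitted.

```python
# Sketch of a relying party's handling of an OCSP outcome under each policy.
# ocsp_status is 'good', 'revoked', 'unknown', or None (no response received).
def decide(ocsp_status, policy, user_says_continue=True):
    if ocsp_status == 'revoked':
        return 'abort'                    # unambiguous: never use a revoked cert
    if ocsp_status == 'good':
        return 'continue'
    # ambiguous cases: 'unknown', or a failed/missing response (None)
    if policy == 'hard-fail':
        return 'abort'                    # safe, but breaks on benign failures
    if policy == 'ask-user':
        return 'continue' if user_says_continue else 'abort'
    return 'continue'                     # soft-fail: the common, insecure default

assert decide(None, 'soft-fail') == 'continue'   # the branch a MitM can force
assert decide(None, 'hard-fail') == 'abort'
assert decide('revoked', 'soft-fail') == 'abort'
```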

[Figure 8.13: sequence diagram with participants TLS client (browser), MitM (fake server, with revoked cert) and OCSP Responder (CA); messages: TLS Client Hello; TLS Server Hello with revoked certificate; OCSP request; OCSP response (dropped by the attacker); time-out → soft-fail; TLS key exchange and finish; TLS finish; (data).]
Figure 8.13: The MitM soft-fail attack on a TLS connection, using the 'classical' OCSP deployment, where the TLS-client (browser) sends the OCSP request (acts as OCSP client), and assuming the (vulnerable) 'soft-fail' handling of timeouts and ambiguous OCSP responses. The attacker impersonates a web site for which the attacker holds the private key; the corresponding certificate is already revoked, but the attack allows the attacker to trick the browser into accepting it anyway, allowing the impersonation attack to succeed. The browser queries the CA (or other OCSP server) to receive a fresh certificate-status. However, the attacker 'kills' the OCSP request, or the OCSP response (the figure illustrates dropping of the response). After waiting for some time, the browser times-out, and accepts the revoked certificate sent by the impersonating web site, although no OCSP response was received. This 'soft-fail' behavior is used by most browsers, since the alternatives (very long timeout, asking the user, or 'hard-fail') are not well received by users.

As Figure 8.13 shows, the soft-fail approach essentially nullifies the value of OCSP validation - against an attacker that exposes the private key of the (web) server, or is able to obtain a fake (and later revoked) certificate for the server's domain, and is also able to block the OCSP response. Exposing the private key and obtaining a fake certificate are both challenging attacks; however, they do occur - otherwise, there would be no need for revocation. The other condition, being able to block the OCSP response, is often surprisingly easy for an attacker to meet, e.g., by sending an excessive number of OCSP requests to the OCSP responder (e.g., the CA) at the same time as the OCSP request from the relying party. In particular, an attacker is likely to be able to launch such an attack by intentionally invoking appropriate links from a web-site controlled by the attacker, in a so-called web-puppet attack; see the web-security chapter of [93].
In spite of this, soft-fail is the common choice of browsers and most other relying parties, basically since developers give more weight to user-experience (UX) considerations than to security considerations; see Note 8.4. Unfortunately, as we explained, this allows attackers to circumvent OCSP and use revoked certificates, by intentionally causing a failure of the OCSP challenge-response communication.

Note 8.4: The UX>Security Precedence Rule

As described in § 8.4.3 (the OCSP soft-fail vulnerability), most browsers support OCSP, but only using soft-fail; namely, if the OCSP-response is not received within some time, then the browser simply continues with the connection, i.e., 'gives up' on the OCSP validation and continues using the received certificate (assuming it was not revoked). It is well understood that this allows a MitM attacker to foil the OCSP validation, i.e., the use of the soft-fail approach results in a known vulnerability. Still, browser developers usually prefer to have this vulnerability, rather than the secure alternative of hard-fail, namely, aborting a connection after 'giving up' on the OCSP response. The reason is that there are also benign reasons for the OCSP response not to arrive, such as unusually high delay due to network congestion, or high load on the OCSP responder (typically, the CA). Aborting a connection in such cases would result in loss of availability; and if the response is only delayed and eventually arrives, waiting for a long time would result in poor performance.
Loss of availability, performance, reliability and functionality are all immediately visible to the end users, i.e., they harm the user experience (UX). User experience has a direct, immediate impact on the success of a product. In contrast, security and privacy considerations are rarely visible to the users. As a result, even when vendors and developers care about security and privacy, they usually prefer to compromise on these goals, to avoid harming the user experience (UX) aspects: availability, functionality, performance, usability and reliability. We refer to this as the UX>Security Precedence Rule.

Principle 14 (The UX>Security Precedence Rule). Vendors and developers give precedence to the user experience (UX) considerations (availability, functionality, performance, usability and reliability) over the security and privacy considerations.

Of course, the UX>Security Precedence Rule is just a simplification; real decisions are more complex, and some vulnerabilities will be considered so critical that developers will prefer to fix them, even at the cost of some reduction in UX. However, usually, the challenge for designers and researchers is to find solutions which ensure sufficient security, while avoiding or minimizing harm to the user experience (UX).
There are several additional problems with the use of the 'classical' OCSP deployment, where the OCSP request is sent by the relying party (often, the browser):

Delay: since OCSP is an online, request-response protocol, its deployment at the beginning of a connection often results in considerable delay.
Privacy exposure: the stream of OCSP requests (and responses) may expose
the identities of web sites visited by the user to the OCSP responder, or
to other agents able to inspect the network traffic. By default, OCSP
requests and responses are not encrypted, exposing this information even
to an eavesdropper; but even if encryption is used, privacy is at risk.
First, the CA is still exposed to the identities of web-sites visited by
a particular user. Second, even with encryption of OCSP requests and
responses, the timing patterns create a side-channel that may allow an
eavesdropper to identify visited websites.
Computational and communication overhead: while OCSP often reduces overhead significantly compared to CRLs, it still requires each response to be signed, which is a computational burden on the OCSP responder. In addition to this computational overhead, there is the overhead due to the need of the OCSP responder to interact with every client; this overhead remains even if applying optimizations that reduce the OCSP computational overhead, e.g., as in Exercise 8.9.

The computational and communication overhead is a concern for both the OCSP client and the OCSP responder. Consider a CA providing an OCSP responder service; the signatures in OCSP responses imply significant processing overhead, which can be a serious concern to the CA. Normally, CAs cannot charge for the overhead of handling these OCSP requests; and to provide reliable service, they should be ready to respond to a Flash Crowd of requests from visitors of a (suddenly popular) website (the term Flash Crowd is the name of a sci-fi novella by Larry Niven, describing a 'physical' flash crowd due to the use of a transfer booth), or to respond to requests sent as part of an intentional Denial-of-Service attack (on the CA or on a subject of a certificate).
Due to the overhead concerns, an OCSP responder may limit its services to authorized OCSP clients. To support this, OCSP requests may be signed; some servers may use other ways to authenticate their clients, e.g., using the optional extensions mechanism supported by OCSP requests.
We next describe OCSP stapling, where the OCSP client is the subject of the certificate rather than the relying party. The goal of OCSP stapling is to mitigate these security, privacy and efficiency concerns. In subsection 8.4.5 we discuss additional methods to reduce the computational overhead of OCSP.

8.4.4 OCSP Stapling and the Must-Staple Extension

In the previous subsection, we have seen several disadvantages of the 'classical' OCSP deployment, where the relying party sends the OCSP requests (i.e., acts as the OCSP client). In this subsection we discuss an alternative approach, the OCSP stapling deployment, where the OCSP request is sent by the subject, typically the website, acting as the OCSP client. Namely, this design moves the responsibility for obtaining 'fresh' signed OCSP responses to the subject (e.g., web-server), rather than placing this responsibility (and burden) on every client (e.g., browser). This addresses the privacy exposure and reduces the overhead on the OCSP responder (typically, the CA), since it now needs only to send a single signed OCSP response to each subject (website) - much less overhead than sending one to every relying party (browser). Furthermore, since now only the subject is supposed to make OCSP requests, the CA may limit the service to its customers, the subjects.
Therefore, of all the concerns discussed for the relying-party-based OCSP, only one remains: the handling of ambiguous OCSP responses, and in particular, the MitM soft-fail attack (Figure 8.13). We discuss two variants of OCSP stapling, which handle such ambiguities and failures in two different ways.

(Optional) OCSP Stapling. OCSP stapling is a different way to deploy OCSP, where the subject runs the OCSP client and periodically sends OCSP requests to the OCSP responder, requesting an OCSP response for the server's certificate, e.g., CB.
Let us focus on the typical scenario, where the relying party is a browser running TLS, which receives a certificate CB from the web (and TLS) server, e.g., [Link], the subject of the certificate CB. In OCSP stapling, the subject (web server) periodically sends an OCSP request to the OCSP responder (CA). The web-server does this periodically, without waiting for the TLS Client Hello message from the client. See this scenario in Figure 8.14.
The CA (or other OCSP responder) sends back the OCSP response; the important and typical scenario is when the response indicates that CB is still Ok (not revoked) at the current time time(·). We denote the response by σ; importantly, σ = SignCA.s(CB Ok:time(·)), i.e., it contains a signature, by the private signing key CA.s of the CA, over the web-server's certificate CB and the current time. This response should satisfy browsers (as relying parties), at least until [Link] 'refreshes' it by again sending an OCSP request for CB. The web-server, e.g., [Link], keeps the response σ, providing it to all connections by OCSP-stapling-supporting browsers, until it requests and receives a newer OCSP response, in the next period.
When an OCSP-stapling-supporting browser connects to [Link], it indicates its support for OCSP-stapling by including the CSR TLS extension, where CSR stands for the Certificate Status Request TLS-extension. The web-server may respond by stapling (including) the OCSP response σ, which it places in the CSR TLS-extension. Note that we now discuss the variant of OCSP deployment where stapling is optional; i.e., the web-server may not staple an OCSP response, e.g., if the web-server did not receive the OCSP response from the OCSP responder. This is the reason that we added the word Optional to the term OCSP stapling; we later discuss OCSP Must-Staple, a variant of OCSP deployment where the subject commits to sending a valid OCSP response.
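The subject's side of this process, periodically refreshing one signed response and stapling the cached copy into every handshake, can be sketched as follows. This is only an illustration: the names `ca_sign_status` and `StaplingServer` are hypothetical, and an HMAC with a shared key stands in for the CA's public-key signature.

```python
# Sketch of the web-server (subject) side of OCSP stapling. The CA-side
# signing is simulated with an HMAC; a real OCSP response is a signed,
# DER-encoded structure.
import hashlib
import hmac
import time

CA_KEY = b"CA-private-key"   # stand-in for the CA's signing key

def ca_sign_status(cert_id: bytes, now: int) -> bytes:
    """Simulated signed OCSP response: sigma over (cert_id, Ok, time)."""
    msg = cert_id + b"|Ok|" + str(now).encode()
    return msg + b"#" + hmac.new(CA_KEY, msg, hashlib.sha256).hexdigest().encode()

class StaplingServer:
    def __init__(self, cert_id: bytes, refresh_secs: int = 3600):
        self.cert_id, self.refresh_secs = cert_id, refresh_secs
        self.cached, self.fetched_at = None, 0.0

    def staple(self) -> bytes:
        """Return the cached response, refreshing it once per period."""
        now = time.time()
        if self.cached is None or now - self.fetched_at > self.refresh_secs:
            self.cached = ca_sign_status(self.cert_id, int(now))  # 'OCSP request'
            self.fetched_at = now
        return self.cached   # goes into the CSR TLS-extension of each handshake

srv = StaplingServer(b"CB")
s1, s2 = srv.staple(), srv.staple()
assert s1 == s2   # one signed response serves many handshakes within the period
```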

[Figure 8.14: sequence diagram with participants Browser (TLS client and relying party), Web+TLS server [Link] (subject of CB, OCSP client), and CA (OCSP Responder); messages: OCSP request (for CB); OCSP response σ = SignCA.s(CB Ok:time(·)); TLS Client Hello with CSR TLS-extension; TLS Server Hello with CSR extension carrying σ (the OCSP response); TLS key exchange and finish; TLS finish.]
Figure 8.14: (Optional) OCSP stapling in the TLS protocol, using the Certificate Status Request (CSR) TLS extension, for a typical TLS connection between a browser and the web-server [Link], the subject of certificate CB. [Link] received CB from the CA (not shown); the CA is also the OCSP responder. The web (and TLS) server [Link] periodically sends OCSP requests to the CA (also the OCSP responder), requesting the status of its own certificate CB. The CA sends back the OCSP response, σ = SignCA.s(CB Ok:time(·)), signalling that CB was not revoked up to time time(·). The browser sends the TLS CSR extension to [Link] with the TLS Client Hello, to request OCSP-stapling. The server sends back σ, the OCSP response, also in the CSR extension. The TLS handshake then completes as usual.

Once the browser receives the OCSP response (in the CSR TLS-extension), it validates it, i.e., it validates the signature of the CA (using the CA's public validation key CA.v), and then validates that the response indicates non-revocation (which we marked by Ok) and that the time indicated is 'recent enough'. When all is Ok, the browser completes the TLS handshake with [Link] and then continues with the TLS connection.
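These browser-side checks, signature, status, and freshness, can be sketched as follows. As before, the names are hypothetical and an HMAC with a shared key merely stands in for verifying the CA's signature with the public key CA.v; the response format matches no real OCSP encoding.

```python
# Sketch of validating a stapled response: check the 'signature', the Ok
# status, and that the signed time is recent enough.
import hashlib
import hmac

def sign_status(ca_key: bytes, msg: bytes) -> bytes:
    return msg + b"#" + hmac.new(ca_key, msg, hashlib.sha256).hexdigest().encode()

def validate_stapled(resp: bytes, ca_key: bytes, now: int, max_age: int) -> bool:
    try:
        msg, tag = resp.rsplit(b"#", 1)
        cert_id, status, t = msg.split(b"|")
    except ValueError:
        return False                      # malformed response
    expected = hmac.new(ca_key, msg, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        return False                      # 'signature' check (CA.v in the text)
    return status == b"Ok" and now - int(t) <= max_age   # status and freshness

good = sign_status(b"k", b"CB|Ok|1000")
assert validate_stapled(good, b"k", now=1200, max_age=600)       # fresh enough
assert not validate_stapled(good, b"k", now=5000, max_age=600)   # too old
assert not validate_stapled(good, b"wrong-key", now=1200, max_age=600)
```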
We described the OCSP-stapling process for a TLS connection between a browser and a web-server, for the case where the certificate was issued by a root CA (directly trusted by the browser). However, the process is exactly the same for other TLS clients and servers, and the modifications for the (typical) case of an intermediate CA are simple, following the multi-cert OCSP request-response discussed earlier, including in Note 8.3.

Handling Ambiguous OCSP responses and the MitM soft-fail attack. Let us now return to discuss the handling of ambiguous OCSP responses, and in particular, the handling of the case where no OCSP response is received. For stapled OCSP, such a failure may happen either between the subject of the certificate, typically the web-server, which acts as the OCSP client, and the CA (OCSP responder); or between the relying party, typically the browser, and the subject (web-server). In particular, this will happen if the web-server does not support OCSP stapling.
In any case, the bottom line is that the browser does not receive a stapled OCSP response from the web-server. In the 'optional' OCSP stapling design, this simply directs the browser to attempt to resolve the revocation status by itself. Typically, the browser would now perform an OCSP query directly with the OCSP responder (typically, the CA), or even request the CRL.
However, now we are basically back in the 'classical OCSP' deployment, where OCSP (and/or CRL) are deployed by the relying party. So, let us consider again the browser's response if it fails to receive a response to its OCSP (or CRL) request. This places the browser in a dilemma similar to the one discussed earlier - and most implementations would adopt the soft-fail approach, i.e., use the certificate assuming that it was not revoked.
Unfortunately, this implies we are again vulnerable to a MitM soft-fail attack, similar to the one presented earlier (Figure 8.13). The attack is only slightly modified due to the failed attempt at OCSP stapling, and should be quite clear from Figure 8.15.
One way to defend against the MitM soft-fail attack (Figure 8.15) is using the Must-Staple extension to the server's X.509 certificate, which we discuss next.

The Must-Staple X.509 extension: enforcing OCSP stapled response. The attacks of Figure 8.13 and Figure 8.15 show the risk of adopting the soft-fail approach. The soft-fail mechanism is the equivalent of deciding to allow bypassing of airport security screening whenever the line becomes too long. A likely outcome of such a policy would be that attackers will find ways to cause the line to be congested, and then use the bypass to avoid screening and perform an attack. We sum this up with the following principle.

Principle 15. Security defenses should not be bypassed due to failures: if defenses are bypassed upon failure, attackers will cause failures to bypass the defenses. Namely, soft-fail security is insecurity.

Awareness of the risk of the soft-fail approach motivates adoption of the harsher, hard-fail approach. However, this conflicts with the UX>Security rule (Principle 14). Definitely, it would be absurd for a browser to refuse a connection to a website only because it does not receive the OCSP response; this is very likely due to a benign reason, such as the website simply not supporting OCSP stapling!

[Figure 8.15: sequence diagram with participants TLS client (browser), MitM (fake server, with revoked cert) and OCSP Responder (CA); messages: TLS Client Hello with CSR extension; TLS Server Hello without OCSP response; OCSP request; OCSP response (dropped by the attacker); time-out → soft-fail; TLS key exchange and finish; TLS finish; (data).]
Figure 8.15: MitM soft-fail attack on an OCSP-stapling TLS client (browser), using a revoked TLS server (website) certificate; assume that the attacker has the certified (and revoked) private key. The browser sends the CSR TLS extension; however, the website's certificate does not have the X.509 Must-Staple extension, or the client does not respect this extension. The attacker impersonates the web-server, and sends the TLS server-hello and certificate messages; the attacker does not send the OCSP response (which would have indicated revocation). The client is misled into thinking that the server does not support OCSP stapling. The client may now send an OCSP request to the appropriate OCSP responder, e.g., the relevant CA, but the MitM attacker would 'kill' the OCSP request or response (the figure shows killing of the response). After the time-out, the client 'gives up' on the OCSP response, and 'soft-fails', i.e., accepts the certificate and establishes the connection with the impersonated website.
The Must-Staple X.509 extension is the standard solution to this dilemma. This extension to the website's X.509 certificate indicates that the website always staples OCSP responses. To a large extent, this moves the UX vs. security decision from the browser to the website: the browser applies the 'must-staple' policy only to a website that requests it, by using the 'must-staple' extension in its X.509 certificate. As shown in Figure 8.16, this foils the MitM soft-fail attack on the OCSP-stapling TLS client of Figure 8.15.
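The resulting client-side policy can be sketched as a small decision function; the names and boolean encoding below are illustrative, not the actual X.509 or TLS structures.

```python
# Sketch of the browser's decision with Must-Staple. stapled_resp_ok is:
#   True  - a valid, fresh 'good' response was stapled;
#   False - the stapled response indicates revocation (or is invalid);
#   None  - no response was stapled at all.
def accept_handshake(cert_has_must_staple: bool, stapled_resp_ok,
                     soft_fail_fallback: bool = True) -> bool:
    if stapled_resp_ok is True:
        return True
    if stapled_resp_ok is False:
        return False                 # explicit revocation: always abort
    # no stapled response:
    if cert_has_must_staple:
        return False                 # hard-fail only when the cert demands it
    return soft_fail_fallback        # otherwise, back to the browser's policy

assert accept_handshake(True, None) is False    # must-staple + missing staple
assert accept_handshake(False, None) is True    # plain certificate: soft-fail
assert accept_handshake(True, True) is True
```

Note how the hard-fail decision is applied only to certificates that carry the extension, which is what moves the UX-vs-security trade-off to the website.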

[Figure 8.16: sequence diagram with participants TLS client (browser) and MitM (fake server, with revoked cert); messages: TLS Client Hello with CSR extension; TLS Server Hello without CSR extension (certificate has the Must-Staple extension); abort (and alert/report?).]
Figure 8.16: The use of the Must-Staple extension defends against the MitM soft-fail attack on the OCSP-stapling TLS client of Figure 8.15. As in Figure 8.15, the attacker tries to impersonate a web site for which the attacker has the private key and the corresponding certificate, which was already revoked. As in Figure 8.15, the client sends the Client-Hello request with the CSR extension, i.e., asking the server to staple an OCSP response. As in Figure 8.15, the attacker responds without the CSR extension, i.e., trying to mislead the client into falling back to sending an OCSP request (and then soft-failing). However, the Must-Staple extension instructs the client to refuse to continue without the OCSP response from the server.

Notice that the UX>Security rule (Principle 14) applies also to websites: website developers would be reluctant to adopt the Must-Staple extension if they believe this may jeopardize the availability of their website. That may be due to different reasons, such as clients processing the extension incorrectly, web-servers not supporting the extension or the OCSP process correctly, or web-servers failing to receive the OCSP response from the OCSP responder (usually, the CA).
Unfortunately, as of 2018, the results are not very encouraging [44]. However, we hope that this will gradually change, as there does not appear to be any technical reason for either the incorrect processing, or for the failures of the web-servers to receive OCSP responses (and then provide them to the browsers).
Indeed, this is an example of the significant adoption challenge facing designers of new Internet and web security mechanisms; we discuss adoption challenges further in [93]. Adoption considerations should be an important part of the design process. In the following exercise, we discuss some issues which may help - or hinder - the adoption of the OCSP Must-Staple extension.
Exercise 8.7. For each of the following variants of the OCSP Must-Staple extension process, indicate whether it may help or hinder adoption, and explain why:
1. Mark the OCSP extension as a critical X.509 extension.
2. Mark the OCSP extension as a non-critical X.509 extension.

3. Browsers would support the OCSP Must-Staple extension, however, if
they do not receive the stapled OCSP response from the website as ex-
pected, they would send the request to the CA, and abort the connection
only if this request also fails.
4. Same as previous item, however, the server will have the ability to indicate
if the client should try sending OCSP request to the CA (if it does not
receive it stapled from the web-server). Consider three ways to indicate
this: (a) an option of the OCSP Must-Staple extension, (b) a separate
extension, or (c) an option indicated in a TLS extension returned by the
web server.
Among the variants that you believe may help adoption, indicate their relative
merits (by ordering and justifying).
Notice that the Must-Staple extension requires support by the CA, to include it in the web-server's certificate and to provide a sufficiently-reliable OCSP service. An alternative solution, which does not require such a special certificate-extension, is discussed in Exercise 8.17.

8.4.5 Optimized variants of OCSP


In this section we discuss some possible, non-standard, optimizations to OCSP.

The Certificate-Hash-Tree variant of OCSP. This OCSP variant uses the Merkle hash-tree technique, introduced in ??, to allow the CA or OCSP responder to periodically perform a single signature operation, which provides OCSP responses indicating status for any OCSP request. Assume that the CA issued a large set of certificates c1, c2, . . . , cn, but each OCSP request will contain only one or a few certificate-identifiers.
As shown in Figure 8.17, the signature is computed over the result of a hash-tree applied to the entire set of certificates issued by the CA (and their statuses), concatenated with the current time. The leaves of the hash-tree are the pairs of individual certificates c1, . . . , cn and their corresponding statuses s1, . . . , sn. The construction uses a collision-resistant hash function (CRHF), denoted h.
The OCSP response for a query for the status of certificate ci consists of this signature, the time of signing, and the values of a 'few' internal nodes - essentially, one node per layer of the hash tree. This allows the OCSP client to recompute the root of the hash tree, and then validate the signature. For example, to validate the status of c6, the response should include h5, h7−8 and h1−4. To validate, compute h6 = h(c6, s6), then h5−6 = h(h5 ∥ h6), then h5−8 = h(h5−6 ∥ h7−8), then h1−8 = h(h1−4 ∥ h5−8), and finally verify the signature over h1−8 and the time, by validating that verifyCA.v(σ1−8, h1−8 ∥ time) holds. We refer to this set of values (e.g., c6, s6, h5, h7−8, h1−4) as the proof of inclusion of c6.
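The construction can be sketched in a few lines of Python; the function names are illustrative, an HMAC stands in for the CA's signature, and the sketch assumes the number of leaves is a power of two.

```python
# Sketch of the certificate-hash-tree: build the tree, extract a proof of
# inclusion (one sibling hash per layer), and verify it against the root.
import hashlib
import hmac

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """leaves: list of (cert, status) byte pairs, length a power of two.
    Returns all layers; layers[0] are leaf hashes, layers[-1] == [root]."""
    layer = [H(cert + b"," + status) for cert, status in leaves]
    layers = [layer]
    while len(layer) > 1:
        layer = [H(layer[j] + layer[j + 1]) for j in range(0, len(layer), 2)]
        layers.append(layer)
    return layers

def prove(layers, i):
    """Proof of inclusion for leaf i: (sibling_is_right, sibling_hash) per layer."""
    proof = []
    for layer in layers[:-1]:
        sib = i ^ 1                         # index of the sibling in this layer
        proof.append((sib & 1 == 1, layer[sib]))
        i //= 2                             # move up to the parent's index
    return proof

def verify(root, cert, status, proof):
    node = H(cert + b"," + status)
    for sib_is_right, sib in proof:
        node = H(node + sib) if sib_is_right else H(sib + node)
    return node == root

certs = [(b"c%d" % j, b"good") for j in range(1, 9)]   # c1..c8, all 'good'
layers = build_tree(certs)
root = layers[-1][0]
# the responder signs root||time ONCE per period (HMAC simulates the signature)
sigma = hmac.new(b"CA-private-key", root + b"|time", hashlib.sha256).digest()

p = prove(layers, 5)                   # proof for c6 (index 5): h5, h7-8, h1-4
assert verify(root, b"c6", b"good", p)
assert not verify(root, b"c6", b"revoked", p)
```

Note that a proof contains one hash per layer, i.e., log2 n hashes, so both the response size and the client's work grow only logarithmically in the number of certificates.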
[Figure 8.17: the certificate hash-tree. Bottom row (leaves): (c1, s1), (c2, s2), . . . , (c8, s8); above them the leaf hashes h1, . . . , h8; then the internal nodes h1−2, h3−4, h5−6, h7−8; then h1−4, h5−8; then the root h1−8; and, at the top, the signature σ1−8 = SignCA.s(h1−8 ∥ time).]
Figure 8.17: Signed certificates hash-tree. The leaves are the pairs of certificate ci and its status si ∈ {good, revoked, unknown}. The tree is topped by the signature over the hash-tree root and the time. Every internal node is the hash of its children; in particular, for every i it holds that hi = h(ci, si), and hi−(i+1) = h(hi ∥ hi+1). To validate any certificate, say c3, provide the signature over the certificate hash-tree, i.e., σ1−8, the time-of-signing, and the values of the internal hash nodes required to validate the signed hash, namely h4, h1−2 and h5−8.

Exercise 8.8. Consider the certificate-hash-tree variant of OCSP, described above and illustrated in Figure 8.17.

1. Present pseudo-code for the OCSP client, including validation of the OCSP responses (including the proofs of inclusion).
2. Let n be the number of certificates issued by a CA, r < n the number of revoked certificates, and denote by i the (typically, much smaller) number of certificate-identifiers sent in a given OCSP request. What is the number of (1) signature operations, (2) signature-validation operations, and (3) hash operations, required to (a) produce and send a CRL, (b) produce and send an OCSP response, (c) produce a certificate-hash-tree OCSP response, (d) validate a CRL, (e) validate an OCSP response, (f) validate a certificate-hash-tree response?
3. This variant uses an (unkeyed) collision-resistant hash function (CRHF) h. Explain why it may be desirable to avoid this assumption.
4. Would it be Ok to use in the design a Second-Preimage Resistant (SPR) hash function, instead of the keyless CRHF? Present a convincing justification, preferably with a reduction to prove security, or with a counter-example showing insecurity (for the use of an SPR hash function in this construction).
5. Present an alternative way to replace the keyless CRHF with a different function which is about as efficient (as the original design using a CRHF), yet is secure under a more acceptable assumption.

Other efficiency-improving variants of OCSP. The OCSP certificate-hash-tree variant allows the OCSP responder (e.g., the CA) to authenticate the statuses of many certificates using a single signature operation. The following exercise discusses a different optimization, designed to provide the OCSP response for a single certificate, optimized to avoid any signature in the typical case that the certificate is not revoked.
Exercise 8.9 (OCSP Hash-chain Variant). To avoid computational burden
on the OCSP responder, it is proposed to use an alternative mechanism to
OCSP, which will usually avoid the use of signatures as long as the certificate
isn’t revoked, yet ensure secure revocation information. Specifically, add an
extension, say called ‘OCSP-chain’, to the X.509 public-key certificate. The
OCSP-chain extension will contain an n-bit binary string x0 ∈ {0, 1}n . A
simple, efficient algorithm uses x0 to validate a ‘daily validation token’ (i, xi )
to be sent by the CA; the CA sends the token (i, xi ) only if the certificate was not
revoked i days after it was issued. Your solution may use a cryptographic hash
function h, and ‘security under random oracle model’ suffices, i.e., modeling
as if h is a random function.
1. Validation of token (i, xi) is done by checking that x0 = ______.
2. The CA may efficiently compute xi from xj, for j > i, by: xi = ______.
3. Assume that the certificate’s maximal validity period is 1000 days. The CA computes/selects x0 (in the certificate) and x1000 (for the last day’s token), by x0 ← ______ and x1000 ← ______.
4. Should the OCSP-chain extension be marked ‘critical’ ?
5. Assume that the probability of a random certificate to be revoked on a particular day is less than 10−5. A further optimization reduces the number of hash-chain computations by the CA, by ‘grouping’ 100 certificates in a common ‘hash-chain group’. As long as none of these certificates is revoked, they all use the same ‘hash-chain group token’; only when one of them is revoked will a per-certificate token be sent. This extension requires an extended version of the OCSP-chain extension, i.e., the extension should not simply contain an n-bit binary string x0 ∈ {0, 1}n as before. Instead, the extension should contain: ______; and the ‘daily validation token’ will change from (i, xi) to ______ (before any of the certificates in the group is revoked) and to ______ (after one or more of the certificates in the group is revoked).
Further improvements may be possible. In this paragraph and Exercise 8.10, we consider the Revoked-certificates hash-tree OCSP variant. This OCSP variant uses a hash-tree of all revoked certificate-identifiers, sorted by certificate identifier. Since the tree is sorted, it can provide an efficient proof of non-revocation of a certificate. For example, assume we use the certificate serial numbers to identify certificates, both in OCSP requests and as the key for sorting the revoked-identifiers hash-tree. Assume that an OCSP query contains a single certificate serial number, say i. The OCSP response will include the signed revoked-certificates hash tree, together with:
If i was revoked: proof of inclusion of i in the tree, similar to the one illustrated in Figure 8.17.
If i was not revoked: proof of inclusion of i′ and i″ in the tree, where i′ = max{i′ < i ∧ i′ was revoked} and i″ = min{i″ > i ∧ i″ was revoked}.
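The responder-side lookup of the two neighboring revoked serial numbers can be sketched as follows (Python, with illustrative names; the proofs of inclusion for the returned serials are produced from the hash-tree as before, and handling the tree boundaries is a design detail the exercise below should address):

```python
import bisect

def nonrevocation_witnesses(revoked_sorted, serial):
    """Given the sorted list of revoked serial numbers, return either
    ('revoked', serial), or ('ok', (i_prev, i_next)) where i_prev is the
    largest revoked serial smaller than `serial` and i_next the smallest
    revoked serial larger than it; None marks a tree boundary (no
    revoked serial on that side)."""
    pos = bisect.bisect_left(revoked_sorted, serial)
    if pos < len(revoked_sorted) and revoked_sorted[pos] == serial:
        return ('revoked', serial)
    i_prev = revoked_sorted[pos - 1] if pos > 0 else None
    i_next = revoked_sorted[pos] if pos < len(revoked_sorted) else None
    return ('ok', (i_prev, i_next))
```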
Exercise 8.10 (Revoked-certificates hash-tree OCSP variant). Following the brief outline in the previous paragraph:
1. Design and analyze an improved OCSP-variant using hash-tree of revoked
certificates.
2. Extend your design to also incorporate the hash-chain technique (Exer-
cise 8.9).
8.5 X.509/PKIX Web-PKI CA Failures and Defenses
We now focus on the application of PKI techniques to protect web commu-
nication, i.e., Web-PKI. As we discussed in § 8.1, there have been multiple,
well-known incidents where certificate authorities issued invalid certificates, of-
ten as part of an attack on TLS/SSL applications for securing web connections;
we presented some failures in Table 8.1. In this section, we discuss the weak-
nesses of the web-PKI usage of X.509/PKIX, which allowed these failures, and
possible defenses. In the next section, we focus on the most ambitious defense
- the Certificate Transparency (CT) PKI, which is a significant extension of
X.509/PKIX.

8.5.1 Weaknesses of X.509/PKIX Web-PKI
In a utopia, each certificate is issued by one of a few, highly trusted CAs, with
a clear focus and ability to provide secure certification service, controlling a
specific name space, and preventing any misleading and unauthorized certifi-
cates. However, the reality is very different. In particular, browsers appear
to trust certificates from too many CAs. For example, a study from 2013 [65]
found 1832 browser-trusted signing certificates, each representing a trusted CA (mostly, intermediate CAs). Of these, 40% belonged to academic institutions, and only 20% to commercial CAs. Only 7 of these CAs were restricted to a particular name-space (name constraints), and most did not have any path-length constraints either.
The following seem to be the main weaknesses of the classical Web-PKI
system:
Risk of rogue and negligent CAs: browsers, the most common type of re-
lying party, are distributed with a list of many - typically, around a
hundred or more - ‘root CAs’, i.e., trust-anchor CAs. The reason for this
excessively-long list may be that browser manufacturers do not want to
take the responsibility for judging the trustworthiness of different CAs,
a role that may imply liability and potential anti-trust concerns. Indeed,
the role of a browser is to provide functionality, not necessarily to rep-
resent the trust of individual users; ideally, each user should determine
which CAs are ‘trust anchors’. Browsers typically provide a user interface allowing users to edit the list of trusted CAs; however, it is not realistic to expect users to actually modify the default list, and while we are not aware of measurements, essentially no user appears to do so. The bottom line
is that some root-CAs may not be sufficiently careful (negligent CAs),
or, worse, may be subject to coercion or otherwise under control of at-
tackers (rogue CAs). Note that such attacks may be done intentionally
by nation-states, which may mean that a CA may have to comply and
cooperate. Unfortunately, the legal and business challenges do not leave
significant hope for a change that will ensure trustworthy CAs, and/or
reduce the number of root CAs.
No restrictions on certified names/domains: not only are there (too) many ‘root’ (trust-anchor) CAs, but, furthermore, any root-CA may certify any domain, without any restrictions. Moreover, TLS/SSL certificates issued by root-CAs to intermediate CAs rarely include ‘name constraints’; hence, intermediate CAs may also issue certificates for any domain.
Insufficient validation, esp. of Domain-validated certificates: to ensure
security of certificates, it is crucial for the certificate authority to properly
authenticate requests for certificates. However, proper authentication re-
quires costly, time-consuming validation of the requesting entity and its
right to assert the requested identifier (typically, domain name). To re-
duce costs and provide quicker, easier service to customers (requesting
certificates), many certificate authorities use automated mechanisms for
authentication, often based on the weakly-secure domain validation; see
Note 8.2.
Equivocation: X.509 does not include any mechanisms to prevent or detect certificate equivocation, i.e., the existence of multiple different certificates for the same identifier. For the Web PKI, a MitM attacker can impersonate a TLS/SSL website [Link] if it is able to obtain an equivocating certificate for [Link]. For CAs that provide domain-validated certificates, the defense against equivocating certificates is usually merely the weak security of the domain-validation process (Note 8.2).
Scam and phishing: there are no standard requirements for validation of domain names by CAs, to prevent certificates for scam and phishing websites. In fact, obtaining such a (domain-validated) certificate is usually easy.
Ineffective rogue-cert detection and accountability: X.509 ensures accountability - but only in the sense that once a rogue certificate is detected, it is possible - even easy - to identify the ‘culprit’ CA. However, X.509 (and the Web PKI) does not provide any mechanism to detect rogue certificates.
Of course, detection is possible; for example, the Google Chrome browser
has built-in mechanisms that detect receipt of a rogue certificate for the
Google domain, which were invoked in the DigiNotar incident (Table 8.1).
The last example is an exception that proves the rule; if we could rely
on browsers knowing the public keys, we wouldn’t need certificates at
all! This specific weakness is addressed by the Certificate Transparency
mechanism, which we discuss below (§ 8.6).

8.5.2 X.509/PKIX Defenses against Corrupt/Negligent CAs
The naı̈ve view of PKI security is that certificate authorities are fully trusted,
honest entities, that never fail to operate correctly, and in particular, to care-
fully vet any request for a certificate. However, in view of the known failures
(e.g., Table 8.1) and the weaknesses outlined above, a different approach seems
advisable: PKI systems should ensure some security guarantees, even assuming
that CAs may be corrupt or negligent.
For example, in the naı̈ve view, the basic role of a public key certificate is to
assure a mapping between the public key and the identity - assuming that the
CA is trustworthy. However, an alternative view is that certificates make the CA accountable for fraudulent certificates. Namely, once we find a fraudulent
certificate, properly signed by a CA, i.e., that validates as ‘correct’ using the
public key of a CA, the CA becomes accountable for this fraudulent certificate.
Such accountability can be viewed as a necessary requirement6 for a secure PKI
system.
Indeed, considering the reality of possibly corrupt or negligent CAs, there
are additional proposed and deployed defenses against fraudulent certificates -
6 It is possible to define the accountability requirement, as well as other security requirements from PKI schemes, similarly to the definitions presented in earlier chapters for
different cryptographic schemes such as encryption. Such definitions allow provably-secure
PKI schemes; for more details on this approach, see [117].
beyond the basic CA accountability, directly assured by the signed certificate.
The most significant of these proposals and efforts is probably certificate transparency, which provides a public, auditable log of certificates; we discuss it in
the next section. Below, we discuss several other proposed defenses.

Use name and other constraints. One possible improvement to the security of the Web PKI system is simply to adopt and deploy the certificate path constraints already defined in X.509 and PKIX, most notably the name constraint, discussed in subsection 8.3.7. This would allow restriction of the name space for certificates issued by a given CA, e.g., based on the DNS top-level domain country codes (TLDcc), as defined in RFC 1591 [140]. For example, a Canadian CA may be restricted to the TLDcc for Canada (.ca).
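A simplified sketch of such a check follows (Python, illustrative names; real PKIX name-constraint processing, per RFC 5280, covers additional name forms and excluded subtrees). Note that the check must match whole DNS labels, not substrings - e.g., 'fake-ca.com' must not satisfy a '.ca' constraint.

```python
def satisfies_name_constraint(dns_name: str, permitted_suffixes) -> bool:
    """Check an X.509-style dNSName constraint: the certified name must
    fall under one of the permitted DNS subtrees (e.g., '.ca' for a CA
    restricted to the Canadian country-code TLD)."""
    name = dns_name.lower().rstrip('.')          # normalize case, root dot
    for suffix in permitted_suffixes:
        suffix = suffix.lower().lstrip('.')
        # exact match, or the name ends with '.' + suffix (label boundary)
        if name == suffix or name.endswith('.' + suffix):
            return True
    return False
```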
This idea may appear simple and easy to deploy, but we doubt it; deployment will probably face strong objections. Let us present two major concerns. First, the use of name constraints requires a major change in the existing Web PKI system, which would be very hard to enforce. Second, there does not appear to be a natural way to place name constraints on domain names which are in one of the generic top-level domains (gTLD) - and most domains belong to a gTLD, mainly com, org, gov, edu and biz, not to mention that new top-level domains can be defined essentially by anyone. Restricting the certification of gTLD certificates to only specific CAs may result in significant objections by other parties (and countries).

Public key and/or certificate pinning. Fraudulent certificates can be
foiled and/or detected by the relying party, typically the browser, when the
browser is aware of the correct public key or certificate. For example, the
Chrome browser had the public key of Google ‘burned-in’, allowing the browser
to detect a fraudulent public key for Google (see the DigiNotar incident in
Table 8.1). Such ‘burned-in’ public keys are usually referred to as static pinning
(or static key pinning). Both Chrome and Firefox support static key pinning for some domains. However, this is not a scalable solution.
Dynamic pinning of public keys or certificates extends this mechanism, by
allowing secure sites to direct the browser to ‘pin’ the given public key or cer-
tificate to the domain, for a specified period; see specifications for HTTP public
key pinning (HPKP) in RFC 7469 [70]. Similarly to static pinning, HPKP, and
dynamic pinning in general, would foil and detect fraudulent keys/certificates,
even if properly signed by a legitimate CA. With dynamic pinning, the server
essentially declares: ‘I will use this PK (or certificate) for at least the specified period’. The client remembers this indication and refuses the use of other keys/certificates for the specified period.
In addition to or instead of refusing ‘other’ keys, the site may request clients
to report cases where they receive a key or certificate which differs from the
pinned one. Dynamic pinning relies on the belief that attacks are the exception
rather than the rule; therefore, there is a good likelihood that the first connection would be with the legitimate website, and this connection can then be used to protect following connections. This approach is referred to as Trust-On-First-Use (TOFU).
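The client-side TOFU logic can be sketched as a toy pin store (Python; the names and the pin representation are illustrative, and not the HPKP wire format of RFC 7469):

```python
import time

class PinStore:
    """Toy trust-on-first-use store: remembers, per domain, a pin
    (e.g., a hash of the public key) with an expiry, as a site might
    request via an HPKP-style header."""

    def __init__(self):
        self._pins = {}   # domain -> (pin, expiry_timestamp)

    def observe(self, domain, pin, max_age_seconds, now=None):
        """Return True to accept the connection, False to refuse it."""
        now = time.time() if now is None else now
        stored = self._pins.get(domain)
        if stored is None or stored[1] <= now:
            # first use (or expired pin): trust and remember
            self._pins[domain] = (pin, now + max_age_seconds)
            return True
        stored_pin, _expiry = stored
        if stored_pin == pin:
            # matching connection: refresh the pin's lifetime
            self._pins[domain] = (pin, now + max_age_seconds)
            return True
        return False   # mismatch within validity: refuse (and report)
```

Note how the sketch exhibits the drawbacks discussed next: a mismatching key is refused until the pin expires, even if the mismatch is due to a legitimate key replacement.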
However, there are drawbacks to the use of key/certificate pinning, includ-
ing:

1. If the server loses its private key, e.g., due to a failure, then there is a risk of losing the ability to serve the website until the pinning times out. To address this (and the next) concern, a backup copy of the secret key must be kept by the owner of the website in secure, reliable storage.
2. Suppose an attacker penetrates the server’s site, exposes the private key
and corrupts the copy of the key used by the server. This may allow the
attacker to serve the site to unsuspecting users - while preventing the
legitimate server from serving the site, or even changing its key. Note
that this problem may exist even for a site that did not use HPKP for
key/certificate pinning, since the attacker may perform the pinning.
3. Revoking a pinned key is difficult, since a replacement key is typically not pinned. One option is to wait for the pinning to time out, but that implies a period of unavailability of the site. An alternative is for the implementation to cancel key pinning once the certificate is revoked; however, the current specification does not require this behavior.
4. Key-pinning may be applied only to long-lived public keys; it may interfere with the use of rotating keys, and seems to completely prevent the use of ephemeral master keys, e.g., for perfect forward secrecy.
Due to these disadvantages, the use of key and certificate pinning may not
be widely adopted, and support for it may be dropped.
Note, however, that there may still be value in pinning of security policies,
e.g., pinning of the use of OCSP-stapling; see Exercise 8.17. Similarly, there
may be a value in CA pinning, i.e., pinning of the identity of specific certifi-
cate authorities which the website would use (say, in the coming year), foiling
attacks using certificates signed by rogue or negligent CAs.

8.6 Certificate Transparency (CT)
Certificate Transparency (CT) [114–116, 118] is a proposal, originating from
Google, for improving the transparency and security of the Web PKI - and
of PKI in general. There is an extensive standardization, implementation and
deployment effort for CT, and a variant of CT is already enforced by the
popular Chrome browser. As a result, many websites and CAs deploy CT,
making CT the most important development in PKI since X.509.
In this section we discuss CT; our discussion will be high-level, without all
details. One reason for this is that CT is still an evolving technology, including
two (significantly different) draft standards [115, 116], several implementations
(which often differ from both draft standards), and several proposed other
variants. Where we had to choose, we focus on a variant of CT presented in
[118], which is work-in-progress by Hemi Leibowitz, Ewa Syta and the author.

8.6.1 CT: concepts, entities and goals
In this subsection, we present the main concepts, relevant entities (mainly,
loggers and monitors), and, most significantly, goals of CT.

CT Goal 1: Transparency. One of the main weaknesses of the current
Web PKI, discussed in subsection 8.5.1, is that it is ineffective in detecting rogue certificates. This also implies that its accountability guarantee is not effective: to identify the failure of the CA, we must first find and detect the rogue certificate - but rogue certificates would probably rarely be detected. We conclude that there can be no (effective) accountability without transparency.
The first goal of CT is to provide this missing transparency. Namely, CT
is designed to make it easy to:
Detect rogue certificates by monitoring a specific domain name: CT allows detection of certificates issued for a given domain name (or, in general, identifier). This allows the ‘owner’ of the domain to detect unauthorized, equivocating certificates issued to its domain name. It also allows identification of certificates issued to ‘suspect’ domains, e.g., using an email log which provides the name of the suspect domain. Once the rogue certificates are detected, the accountable CA is also identified, of course.
Detect suspect certificates by monitoring all domains: CT also allows monitoring of all certificates issued, which facilitates detection of attacks using an ‘unpredictable’ domain name, such as domains generated automatically by a Domain Generation Algorithm (DGA) - often abused by malware such as botnets and ransomware. This may be achieved by detecting patterns of certificate issuing, or by detecting certificates with domain names suspected to be part of a scam, phishing or homograph attack (??).
The basic approach allowing CT to achieve this goal is simple: every certificate issued should be included in a publicly-available log. One way to implement this idea would have been for each CA to maintain such a publicly-available log of the certificates it issued. However, recall the risk of rogue and negligent CAs, also listed in subsection 8.5.1 as a major Web PKI weakness. Due to this concern about non-trustworthy CAs, the CT designers took a more radical approach: they separated the log functionality into a new type of entity, called a logger. The CA must send each certificate it issues to the logger, who provides back a signed response, which should be attached to the certificate, turning it into a valid CT certificate. The response can be just a commitment by the logger to include the certificate in the log by a specific time, which is referred to as an SCT (Signed Certificate Timestamp), or a ‘proof’ that the certificate was
[Figure 8.18 diagram: (0.1) the subject Bob sends his public key Bob.e to the CA; (0.2) the CA sends the certificate CB = SignCA.s([Link], Bob.e, . . .) to the logger; (0.3) the logger returns PoI = (SignL.s(LogDigest, time), hint); (0.4) the CA sends the certificate CB and the PoI to Bob; (1.1) a relying party (e.g., Alice’s browser) sends a TLS Client Hello to the subject (e.g., [Link]); (1.2) the subject responds with certificate CB and the PoI.]
Figure 8.18: Certificate Transparency (CT) issue process, using a (single) log-
ger who provides a Proof-of-Inclusion (PoI) and typical usage in TLS server
authentication. The Proof-of-Inclusion (PoI) includes a signature by the logger
over a recent digest of the log (LogDigest) and the time it was signed (time), as
well as a hint, which allows verification that the specific certificate CB appears
in the log. Note: some CT variants use SCT instead of PoI, see text.

added to the log at a given time, which we refer to as PoI (Proof of Inclusion).
Most of the CT literature focuses on the SCT case, but we will focus on the
case where the logger returns a PoI, since we believe that makes the design
clearer and stronger.
But who would be these loggers? Would they become new trusted entities, essentially replacing the CAs in this role? That seems undesirable, as it seems likely to result in a similar set of incentives and legal constraints to those that caused the current situation: an excessively large set of root CAs whose trustworthiness is insufficiently assessed. So CT adopted a different high-level goal, which
could be summed up as reducing the trust requirements from loggers. In fact,
we present two possible more specific goals for CT, both addressing this high-
level goal of reducing trust in loggers; some variants and publications focus
on one goal, while others focus on the other. Note that the two goals are not
conflicting, and it is definitely possible to combine them.

CT Goal 2: avoid a single point of failure. One problem with the pre-CT Web PKI design is that a single rogue CA suffices to issue faulty certificates. In principle, one solution could have been to require each website (or, more generally, each subject) to present certificates from two different CAs; that would avoid the risk due to an attack by a single CA. In fact, this even protects against scenarios with multiple rogue CAs, as long as they do not collude, i.e., as long as they don’t both certify the same rogue certificate.
The ‘multiple CA’ solution has three drawbacks. First, it is simply infeasible; we cannot realistically suddenly require websites to obtain two certificates instead of one - there would be no chance of successful adoption. Second, it still does not ensure transparency if both CAs end up using the same logger. Finally, it completely relies on prevention rather than on accountability; in particular, it completely fails if the attacker controls the multiple CAs. However, there is an alternative solution that avoids these drawbacks - or at least the first two: continue using a single CA for each certificate, but require each certificate to be logged by multiple loggers, ensuring transparency even when one of the loggers is corrupt.
This is a very simple method to avoid a single point of failure - and, in fact, this is how CT is currently deployed by CAs and by the popular Chrome browser, which requires each certificate to be logged by two loggers. However, while it does bring accountability to a CA who issues rogue certificates (since they would now be easily detectable), it relies completely on prevention, rather than accountability, with respect to the multiple loggers. In particular, the transparency guarantee completely fails if multiple rogue loggers all ‘drop’ the same certificate from their logs. The CT specifications, and many of the publications describing CT, include additional elements which are clearly designed to achieve a different, and arguably more systematic and ambitious, goal of resiliency to rogue entities, which we describe next.

CT Goal 3: automated detection and accountability for any misbehaving entities (including logs). As we just explained, the use of multiple
loggers ensures transparency even with a rogue logger, which is great - but it
fails, with no accountability, if all loggers used by an issuer are faulty, and
exclude a certificate from their logs. However, from its very beginning, CT
includes additional mechanisms, designed to ensure automated detection and
accountability for rogue loggers - without requiring the use of multiple loggers.
To achieve automated detection of misbehaving logs, CT includes an addi-
tional type of party, called a monitor, which interacts with both loggers and
relying parties. The monitors periodically read the logs from the loggers; and
the relying parties, randomly or otherwise, send certificates to the monitors, in
a process called audit. When a monitor receives a certificate for audit from a
customer, it checks if the certificate appears, as promised, in the corresponding
log. If a certificate does not appear in the log, the monitor detects the faulty
log; furthermore, the monitor obtains - and distributes - a Proof of Misbehavior (PoM). Validating the PoM requires only the public key of the logger, allowing rapid discrediting of the rogue logger.

8.6.2 High-level CT design with inefficient cryptographic functions
We now proceed to describe CT. In this subsection, and in Figure 8.19, we
present the high-level design of CT, together with a naive, inefficient design for
[Figure 8.19 diagram: the parties are the website [Link] with browser Alice, the issuer (CA), the logger L and the monitor M, with a vertical timeline. Initially (day d), the logger and the monitor both hold Log[d − 1]. The website sends ([Link], B.e) to the CA, which sends certificate CB to the logger. When day d ends, the logger sets Log[d] ← Log[d − 1] ∪ {CB, . . .}. On day d + 1, the logger returns a Proof-of-Inclusion (PoI), signed by the logger, which reaches the website (CB, PoI); upon the monitor’s request, the logger sends the new certificates (incl. CB) and a timestamped, signed Proof-of-Log (PoL). In the TLS handshake (Hello), the website sends CB and the PoI to Alice; Alice may send an audit of the PoI to the monitor, which returns Audit-Ok (signed by the monitor).]
Figure 8.19: High-level sequence diagram of CT entities and processes, no-faults scenario: issuing and logging of certificate CB (dotted), monitoring and receiving the set of new certificates including CB (dashed), TLS web-server authentication (full), and successful audit of CB (blue). Assume daily logger and monitor processes, and that the scenario begins during day d, with the logger L and monitor M having the same log (set of certificates) for the previous day, Log[d − 1]. The vertical violet ‘axis’ to the left represents the timeline, and the violet horizontal line denotes ‘midnight’, specifically, the move from day d to day d + 1.

the cryptographic functions used by the design. We describe the ‘real’ - and
efficient - design of the cryptographic functions in ??.
Figure 8.19 is a high-level sequence diagram showing the different entities
and processes involved in the CT PKI, for a no-faults scenario. The figure illustrates four CT processes:
CT process 1: Certificate issuing and logging (dotted lines). This process involves the subject (in Figure 8.19, the website [Link]), the issuer
(CA) and the logger (L). The process begins with the subject sending its
public key B.e to the CA, who verifies the request comes from [Link].
If Ok, the CA signs the certificate CB and sends it to the logger. The
logger returns a Proof-of-Inclusion (P oI). The logger uses its signing key
L.s to produce the PoI, and it can be validated using its public key L.v.
The PoI is produced only on a periodical basis - e.g., daily (and therefore
includes the date, although this is not shown in the figure). The CA
sends the certificate CB and the P oI to the subject [Link].
CT process 2: Periodical monitoring (dashed lines). The monitor periodically, e.g. (here) daily, contacts the logger to inquire about the current status of the log, i.e., whether any new certificates were logged. The
logger should report all new certificates added to the log in the last day
(i.e., since the previous monitor query), including CB of course. Namely,
if the logger sent a PoI for a certificate, then this certificate should be
included in this periodical (e.g., daily) report returned to monitor re-
quests. The logger should also return a signed and time-stamped Proof-
of-Log (PoL), irrevocably defining the set of certificates in the current log
Log[d + 1] (including all certificates logged till day d + 1, inclusive); the
signature would also specify the date d itself (omitted from the figure).
The monitor validates that the PoL is correct, in particular, that it is
signed by the logger; this validation uses the logger’s validation key L.v.
CT process 3: Certificate usage. Figure 8.19 shows the typical certificate
use for server-authentication during TLS handshake, for secure connec-
tion between browser (relying party) and web-server (subject of certifi-
cate). Notice that the website sends its certificate as usual, but also
attaches the PoI (Proof-of-Inclusion), which the website received from
the logger via the CA. The relying party validates that the PoI is prop-
erly signed (using the logger’s public validation key L.v), and that it
‘proves’ that CB is in the log.
CT process 4: Audit (blue). Finally, the relying party, e.g., the browser,
may (randomly or otherwise) send an audit request, including the PoI.
The monitor validates that PoI is consistent with the signed PoL that
it received from the logger. In the figure, we show the case where it is fine, and the monitor returns an ‘Audit-Ok’ message to the relying party; to mitigate the risk of receiving a manipulated, incorrect ‘Audit-Ok’ response, this response must be signed - this also allows dealing with a faulty monitor, as we explain below. In the other case, where the PoI and PoL are inconsistent, this indicates a rogue logger. In this case, the pair of conflicting logger signatures - over the PoL and over the PoI - serves as a Proof of Misbehavior (PoM) of the logger, and is sent to all parties (broadcast), so nobody will trust the rogue logger any more.

There is a fifth CT process, inter-monitor gossip, which we describe in subsection 8.6.3, as it is ‘only’ required to (significantly) reduce the overhead of CT; this process involves monitor-to-monitor communication, referred to as gossip.
as gossip.

Handling rogue loggers. Much of the CT design is targeted at achieving
the third goal, automated detection and accountability for misbehaving entities.
We now explain how this goal is achieved, first focusing on faulty loggers.
We separate logger faults into two categories: peer-detected faults and audit-detected faults. Let us first explain this categorization.
Let us present a metaphor. The two fault categories resemble the ways in which a tax-evading vendor may be detected: peer-detection, such as when the vendor refuses to provide a (properly-signed) receipt, detected almost immediately by a customer who may complain and warn others; and audit-detection, such as when the vendor provides a receipt - but an audit finds that the vendor did not report the transaction correctly.
Each type of detection has its pros and cons. Peer-detection is immediate
and almost certain7 : the customer may wait a bit for the receipt, but quite soon,
most customers will realize this is a fraud and raise alarm. Audit-detection,
on the other hand, is delayed and probabilistic: not all vendors are audited,
and auditing may happen years after the fraud. On the other hand, an audit-
detection produces a clear proof of misbehavior - the conflict between the signed
receipt and the signed tax report; while a peer-detection is likely to result in
conflicting accounts by the two parties - a he-said, she-said situation. Let us
consider the two categories - in our ‘real’ context, i.e., faulty loggers.

Peer-detected logger faults. The logger is immediately detected when it
fails to provide the expected response to either monitor or CA - not sending any
response or sending an incorrectly-formatted or incorrectly-signed response.
This clearly falls under the peer-detected fault category; in particular, such a fault is detected (almost) immediately and with certainty - but only by the peer (monitor or CA). The CA and monitor may stop working with this logger and send accusations to alert others. Other parties that receive such accusations
will need to perform a trust-management decision, based on the number and
identity of the accusing parties. However, clearly we cannot hope for a better
outcome.

Audit-detected logger faults. A rogue logger may avoid peer-detection,
by not reporting some certificate in the log reported to a monitor (or multiple
monitors). Let us first consider the case where there is a single monitor, or
where the logger reports the same log to all monitors. In this case, the fault
would be detected when a relying-party (browser) that received the certificate,
say CB , together with the corresponding Proof-of-Inclusion (PoI), decides to
audit this PoI with one of the monitors, who would detect the conflict with
the signed Proof-of-Log (PoL). The two proofs (PoI and PoL) can be verified
by any party, it only requires the public validation key of the logger (L.v);
therefore, together, the pair (PoI, PoL) becomes a Proof-of-Misbehavior of this
logger, which we refer to as a logger-PoM.
7 The reader may rightfully argue that many customers may not notice a missing, incorrect
or unsigned receipt. However, that is just a weakness of the metaphor; please ignore it,
and assume a prudent customer if you like. The weakness is not relevant to our real subject:
detection of rogue loggers.

It remains to explain the cryptographic design of these two proofs (PoI and
PoL). Only for this subsection, we present a naive, inefficient design. This
design is very simple:
A PoI , in this naive design, is simply a signature by the logger on the cer-
tificate with a time-stamp, identifying the log in which it should appear
(e.g., day d in our example).
A PoL , in this naive design, is the period from the previous monitor query
till the current query, together with the list of certificates added to the
log during this period, all signed by the logger.
The combination of a signed PoI, showing that CB is supposed to be added to the
log at a given time (say day d), with a signed PoL whose specified period includes
the time of the PoI (day d) but whose list of certificates does not include CB ,
can only exist if the logger is misbehaving; i.e., the pair is a logger-PoM.
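To make this naive design concrete, the sketch below implements the two proofs and the PoM check. It is only an illustration: the logger's signature is simulated by HMAC under a key known only to the logger (so the verifier here also needs that key), whereas a real logger signs with a private key and anyone holding L.v can verify; the field names and JSON encoding are likewise our own assumptions.

```python
import hashlib
import hmac
import json

def logger_sign(logger_key: bytes, claim: dict) -> bytes:
    # HMAC stand-in for the logger's signature; a real logger would use
    # a public-key signature scheme, verifiable under L.v.
    body = json.dumps(claim, sort_keys=True).encode()
    return hmac.new(logger_key, body, hashlib.sha256).digest()

def logger_verify(logger_key: bytes, claim: dict, sig: bytes) -> bool:
    return hmac.compare_digest(logger_sign(logger_key, claim), sig)

def make_poi(logger_key: bytes, cert: str, day: int) -> dict:
    # Naive PoI: signed (certificate, time-stamp), naming the log (day)
    # in which the certificate should appear.
    claim = {"type": "PoI", "cert": cert, "day": day}
    return dict(claim, sig=logger_sign(logger_key, claim))

def make_pol(logger_key: bytes, first_day: int, last_day: int,
             certs: list) -> dict:
    # Naive PoL: signed period, plus the list of certificates added
    # to the log during that period.
    claim = {"type": "PoL", "first_day": first_day,
             "last_day": last_day, "certs": sorted(certs)}
    return dict(claim, sig=logger_sign(logger_key, claim))

def is_logger_pom(logger_key: bytes, poi: dict, pol: dict) -> bool:
    # The pair (PoI, PoL) is a logger-PoM iff both are validly signed,
    # the PoL period covers the PoI's day, yet the certificate is
    # missing from the PoL's list.
    poi_claim = {k: v for k, v in poi.items() if k != "sig"}
    pol_claim = {k: v for k, v in pol.items() if k != "sig"}
    if not (logger_verify(logger_key, poi_claim, poi["sig"]) and
            logger_verify(logger_key, pol_claim, pol["sig"])):
        return False
    in_period = pol["first_day"] <= poi["day"] <= pol["last_day"]
    return in_period and poi["cert"] not in pol["certs"]
```

A PoL that omits a certificate for which a PoI exists then makes `is_logger_pom` return true, for any verifier holding both proofs.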

Handling monitor faults. Finally, let us point out that CT also allows
audit-detection of monitor faults. The role of monitors is only to detect a
faulty logger; therefore, detection of faulty monitors may appear unnecessary.
However, by slightly extending the relying-party auditing process, with negligi-
ble overhead, we can also protect against collusion of a logger with a monitor,
where a rogue monitor is ‘covering up’ for a rogue logger.
The way to deal with such collusion of a logger with a monitor is by using
multiple monitors. One naive design is for the relying parties to always perform
the audit with a set of monitors that includes one or more non-faulty monitors. In
this case, such a cover-up would fail.
Notice, however, that this solution results in significant overhead - perform-
ing the audit with a sufficiently-large set of monitors. In the next subsection, we
describe a more efficient implementation of the CT mechanisms, which provides
very efficient audit-detection of such rogue monitors.

8.6.3 Efficient CT Functions Design and the Gossip Process


Finally, we sketch the ‘real’ design of the CT functions, which ensures the same
security properties - but much more efficiently. This design also involves an
additional - fifth - CT process: gossip.

Reducing monitor’s storage with efficient Proof-of-Log (PoL) using a
Merkle Digest Scheme. In the naive design, the PoL consisted of a signa-
ture over the entire set of certificates issued by the logger in a given time period.
In order to perform an audit and produce, if necessary, a Proof-of-Misbehavior
(PoM), the monitor must maintain all these signed PoLs received from the log-
ger; in particular, this requires the monitor to store all certificates. That can
be significant overhead!
CT avoids this overhead by implementing the PoL and PoI using an Extended
Merkle Digest scheme (Definition 4.14, § 4.8). Specifically, the PoL requires

only a signed digest of the set of all logged certificates, and an extension from
the previous set (and digest); and PoI requires only a Proof-of-Inclusion of the
scheme. For details, see [118].
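The core idea can be sketched as follows: a simplified Merkle tree over the logged certificates, where the signed PoL is just the (short) root digest, and a PoI is a logarithmic-length path of sibling hashes. This toy assumes a power-of-two number of leaves, and omits the domain-separation prefixes, odd-size handling and exact encoding of the real CT construction (see [118]).

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list) -> bytes:
    # The short digest the monitor keeps (and the logger signs as a PoL).
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = [_h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

def inclusion_proof(leaves: list, index: int) -> list:
    # PoI for leaves[index]: sibling hashes from the leaf up to the root,
    # each tagged with whether our node is the right child.
    level = [_h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        proof.append((index & 1, level[index ^ 1]))
        level = [_h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify_inclusion(leaf: bytes, proof: list, root: bytes) -> bool:
    # Recompute the root from the leaf and the sibling path.
    node = _h(leaf)
    for is_right, sibling in proof:
        node = _h(sibling + node) if is_right else _h(node + sibling)
    return node == root
```

The proof length is log2 of the number of leaves, so the monitor's per-period state shrinks from the full certificate list to one short digest.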

CT Process 5: Inter-Monitor Gossip - improving logger-detection
and reducing the overhead of monitoring. The use of the extended Merkle
scheme allows monitors to keep in storage only certificates that the monitor
is ‘interested in’, e.g., certificates of domain names that the monitor wants to
supervise. Monitors that do not need to monitor specific (or all) certificates
may now rely only on the digest and extensions - an immense reduction in storage.
Furthermore, monitors can exchange information directly between them -
which is referred to as gossip. Monitors can use gossip to efficiently disseminate
to all monitors:

Proofs-of-Logs : Monitors can exchange Proofs-of-Logs they receive from loggers.
This allows detection of misbehaving loggers that send different
PoLs to different monitors, e.g., to mislead a relying party that uses only
a particular monitor; notice that the set of conflicting PoLs will provide
another type of Proof-of-Misbehavior (PoM). Distribution of PoLs via
gossip may also help to reduce the load on loggers, which can limit their
distribution of PoLs (and actual logs and certificates) to only a limited
number of monitors.
Proofs-of-Misbehavior and accusation alerts : Once a monitor becomes
aware of a rogue, misbehaving logger, it can use gossip to alert other
monitors. A monitor may become aware of a misbehaving logger either
directly with a PoM, by discovering a discrepancy during an audit of a PoI
received from a relying party; directly without a PoM, e.g., when the logger
sends a corrupt Proof-of-Log (PoL) or fails to send a PoL; or indirectly,
by receiving such an alert (with or without a PoM) from another monitor (via
gossip) or from a relying party.
Logs and certificates : To improve efficiency, and in particular reduce the
load on loggers, monitors may obtain PoLs, logs and certificates from
other monitors, via the gossip mechanism, rather than having all monitors
download this material from the loggers.

Efficient auditing against rogue monitors. A final efficiency improvement
is for the audit-detection of monitor faults. Instead of having relying
parties always use multiple monitors (for redundancy), it suffices that relying
parties only sometimes (randomly) use multiple monitors, allowing one moni-
tor to detect both a logger fault and a failure of another monitor to report it.
The fact that the Audit-Ok responses are signed turns such detection into a
Proof-of-Misbehavior of the monitor; some details are omitted.

8.7 PKI: Additional Exercises
Exercise 8.11. Some X.509 extensions are quite elaborate, and may contain
many elements of different types; for example, the SubjectAltName (SAN)
extension. A relying party may receive a certificate with a known extension, but
containing an element of unknown type, or incorrectly formatted. X.509 default
behavior is to consider the entire extension as unrecognized; if the extension is
marked ‘critical’, this results in rejecting the entire certificate, i.e., considering
it invalid. However, X.509 allows a specific extension to define other behavior
for handling unrecognized elements.
Specify a format for an extension, which will allow defining critical and
non-critical element types. Critical elements must be recognized by the relying
party or the entire extension is deemed unrecognized, and non-critical elements
may be safely ignored by the relying party if not recognized, while still handling
the other elements in this extension.
Is this mechanism of potential use within an X.509v3 critical extension?
Within a non-critical extension?
Exercise 8.12. Locate the certificate information in your browser, and:
1. Count, or estimate, the number of root CAs.
2. Count, or estimate, the number of intermediate CAs.
3. Can you give two examples of root CAs that you believe belong to orga-
nizations or companies that you know and trust? Explain.
4. Can you give two examples of root CAs that you believe belong to orga-
nizations or companies that you do not know or do not trust? Explain.
5. Can you give two examples of intermediate CAs that you believe belong
to organizations or companies that you know and trust? Explain.
6. Can you give two examples of intermediate CAs that you believe belong
to organizations or companies that you do not know or do not trust?
Explain.
7. For one of the root CAs, view its details, and identify the extensions.
For extensions covered in this chapter, explain their fields. For other
extensions, identify what they are used for.
8. For one of the intermediate CAs, view its details, and identify the exten-
sions. For extensions covered in this chapter, explain their fields. For
other extensions, identify what they are used for.
Use screen-shots to document your results.
Exercise 8.13. A website [Link] receives a TLS certificate from a CA [Link].
What should prevent [Link] from using this certificate to impersonate as the
website for domain [Link]? Present at least one mistake that [Link] could
make in the certificate, which would allow this attack.

Exercise 8.14. It is sometimes desirable for certificates to provide evidence
of sensitive information, like gender, age or address. Such information may
be crucial to some services, e.g., to limit access to a certain chat room only to
individuals of specific gender. It is proposed to include such information in a
special extension, and to protect the privacy of the information by hashing it;
for example, the extension may contain h(DoB) where DoB stands for date
of birth. When the subject wishes to prove her age, she provides the certificate
together with the value DoB, allowing the relying party to validate the age.

1. Should such extensions be marked ‘critical’? Why?


2. Show that the design above fails to preserve privacy, by describing the al-
gorithm an attacker would use to find out the sensitive information (e.g.,
DoB), without it being sent by the subject (e.g., Client). Use pseudo-code
or flow-chart. Explain why your algorithm is reasonably efficient.
3. Present an improved design which would protect privacy properly. Your
design should consist of the contents of the extension (instead of the in-
secure h(DoB)), and any additional information which should be sent
inside or outside the certificate to provide evidence of age.
4. Prove, or present clear argument, for security of your construction, under
the random oracle model.
5. Present a counterexample showing that it is not sufficient to assume that
h is a collision-resistant hash function, for the construction to be secure.
Exercise 8.15. As shown in [65], a large number of browser-trusted CAs are
operated by organizations such as universities, non-CA companies or other or-
ganizations, e.g., religious organizations. For example, assume that one of
these is [Link], a university in Niger, which obtained this signing certificate as
an intermediate CA of a trusted root CA, say [Link]. The university
obtained this CA certificate so that it may certify its different departments and
project websites. Assume that the private key of [Link] is compromised by a
MitM attacker.
1. Would this allow the attacker to intercept the traffic between the user and
(1) the university, (2) other Nigerian sites, (3) other universities, (4)
additional sites (which)?
2. What would be a typical process for detection of the exposure of the private
key of [Link]? Can you bound the time or number of fake-certificates
issued until exposure?
3. What should be the mitigation, once the exposure is detected?

Exercise 8.16 (Web-of-Trust). In this exercise we consider the Web-of-Trust
PKI, first proposed for, and often associated with, the PGP e-mail and file encryp-
tion/signing software. In a web of trust, each user can certify the (public key,
name) mappings for people she knows, acting as a CA. To establish secure
communication, two users exchange certificates they obtained for their keys,
and possibly also certificates of the signers. Each user u maintains a directed
graph (Vu , Eu ) whose nodes Vu are (public key, name) pairs from the set of all
certificates known to u, and where there is an edge from one node, say (pkA ,
Alice), to another, say (pkB , Bob), if u has a certificate over (pkB , Bob) signed using
key pkA ; the graph has a special ‘trust anchor’ entry for u herself, denoted
(pku , u). Each user u decides on a maximal certificate path length L(u); we
say that u trusts (pkB , Bob) if there is a path of length at most L(u) from
(pku , u) to (pkB , Bob) in (Vu , Eu ).
Specify an efficient algorithm for u to determine whether she
should trust (pkB , Bob). What is the complexity of this algorithm? Note: you
may use well-known algorithm(s); in which case, just specify their names and a
reference - there is no need to copy them in your answer.
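As a hint, the trust decision above is just bounded-depth reachability in (Vu , Eu ), solvable by a well-known graph traversal; the sketch below uses breadth-first search cut off at depth L(u), over an illustrative adjacency-map representation, running in time O(|Vu | + |Eu |).

```python
from collections import deque

def trusts(anchor, target, edges, max_len):
    # Decide whether there is a directed path of length <= max_len from
    # the trust-anchor node to the target node; `edges` maps each
    # (public key, name) node to the list of nodes it has certified.
    if anchor == target:
        return True
    seen = {anchor}
    frontier = deque([(anchor, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_len:          # path-length budget exhausted
            continue
        for nxt in edges.get(node, ()):
            if nxt == target:
                return True
            if nxt not in seen:      # BFS finds shortest paths first,
                seen.add(nxt)        # so the depth cutoff is sound
                frontier.append((nxt, dist + 1))
    return False
```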
Exercise 8.17 (OCSP-stapling against powerful attacker). Consider the fol-
lowing powerful attacker model: the attacker has Man-in-the-Middle (MitM)
capabilities, and also is able to obtain a fraudulent certificate to a website, e.g.,
[Link]. Suppose, further, that the fraudulent certificate is promptly dis-
covered and revoked.
1. Show a sequence diagram showing that this attacker can continue to im-
personate as [Link], in spite of the revocation of the certificate
and of the use of OCSP-stapling by the website.
2. Explain how use of the must-staple certificate extension [88] would pre-
vent this attack.
3. Propose a simple extension to OCSP-stapling, OCSP-stapling-pinning,
that may also help against this threat, if the must-staple extension is
not supported. Explain how your extension would foil the attack you
presented. Hint: the required extension is related to the ‘key pinning’
mechanism discussed in subsection 8.5.2.
4. Discuss the applicability of the concerns of key pinning, discussed in sub-
section 8.5.2, to the extension.
5. Consider now an even more powerful attacker, who has MitM capabilities
but also controls a root certificate authority (trusted by the browser).
Show a sequence diagram showing that this attacker can continue to im-
personate as [Link], in spite of the use of the OCSP-stapling-
pinning extension as you presented above.
6. Propose an improvement to the OCSP-stapling-pinning extension, that
may help also against this threat.
Exercise 8.18 (Keyed hash and certificates). Many cryptographic protocols
and standards, e.g., X.509, PKIX and CT, rely on the use of the ‘hash-then-
sign’ paradigm, based on the assumption that the hash function in use, h(·), is a
(keyless) collision-resistant hash function (CRHF).

1. Explain why this assumption is problematic, specifically, prove that there
cannot exist any collision-resistant hash function. (If you fail to prove,
at least present clear, convincing argument.)
2. One way to ‘fix’ such protocols, is by using, instead, a keyed CRHF.
Explain why your argument for non-existence of a keyless CRHF, does
not also imply non-existence of keyed-CRHF.
3. Present an extension to provide the keyed-hash key, for an X.509 cer-
tificate of a signature-verification public key. Explain, with example sce-
narios/applications, if the extension should be marked as ‘critical’, as
non-critical, or if for some applications it should be marked ‘critical’ and
for others marked ‘non-critical’.

Exercise 8.19 (Malicious CA). Let MAC be a malicious CA, trusted (e.g., as
a root CA) by browsers.

1. Assume [Link] generates a signing public-private key pair (B.s, B.v),
and has B.v certified by MAC. Present a sequence diagram showing how
MAC can eavesdrop on communication between the site https:
[Link] and a client, Alice, in spite of their use of TLS 1.3. Your
solution should be a very simple attack. Assume the use of only the
defenses explicitly mentioned.
2. Assume that Alice uses CT and does not communicate with websites that
do not provide a properly signed SCT. Assume a single, trustworthy log-
ger, L, and a single, trustworthy monitor, M . Present a sequence dia-
gram showing how this setup may prevent or deter MAC from performing
the attack.
3. Assume that the private key of [Link] is exposed by MAC; however, the
exposure is detected, and Bob revokes its certificate. Furthermore, both
Alice and [Link] deploy OCSP-stapling, and Bob’s public-key certificate
includes the must-staple extension. Show, with a sequence diagram, how
MAC may still eavesdrop on communication between Alice and Bob.

Chapter 9

Usable Security and User Authentication

This chapter is not yet in reasonable form.

9.1 Password-based Login


9.1.1 Hashed password file
9.1.2 One-time Passwords with Hash-Chain
9.2 Phishing

9.3 Usable Defenses against Phishing

9.4 Usable End-to-End Security

9.5 Usable Security and Authentication: Additional Exercises
Exercise 9.1. Many websites invoke third-party payment services, such as
PayPal or ‘verified by Visa’. These services reduce the risk of exposure of
client’s credentials such as credit-card number, by having the seller’s site open
a new ‘pop-up’ window, at the payment provider’s site, say PayPal; and then
having the users enter their credentials at PayPal’s site.

1. Assume that a user is purchasing at the attacker’s site. Explain how that
site may be able to trick the user into providing their credentials to the
attacker. Assume a typical user, and exploit typical human vulnerabilities.
Present the most effective attack you can.
2. Identify the human vulnerabilities exploited by your construction.

3. Propose up to three things that may help to reduce the chance of such
attack.

Exercise 9.2 (Homographic attacks). To support non-Latin languages, domain
names may include non-Latin characters, using unicode encoding. Some browsers
display these non-Latin characters as part of the URL, while others display
them only in the webpage itself and, if they appear in the URL, display their
punycode encoding (an encoding of unicode as a few ascii characters).

1. Discuss security and/or usability of these two approaches; where there is
a vulnerability, give an example scenario.
2. Some browsers display domain names using only a single font, i.e., they
never display domain names which mix Latin characters with unicode
characters, or use punycode encoding in case of mixed fonts. What is the
motivation? If this is open to abuse, give an example.
3. Another proposal is that whenever displaying non-Latin characters, to
add a special warning symbol at the end of the domain name, and if the
user clicks on it, provide detailed warning. Identify any secure-usability
principles and human vulnerabilities which are related to this proposal.

Solution to first part: There is a usability advantage in allowing non-Latin
domain names to be displayed using the ‘correct’ font in the URL line: this
allows such domain names to appear ‘correct’ to users familiar with the relevant
(non-Latin) language. In fact, when such users use a browser that does NOT
display non-Latin characters in the URL, they may be practically unable
to understand the URL; this may even open them to phishing attacks, as they
may get used to ignoring the URL and may not notice when the domain name is
incorrect.
However, browsers displaying URLs with non-Latin characters may facil-
itate homographic attacks on websites in domain names consisting of Latin
characters - the vast majority of websites, and definitely the popular ones. Specifically,
an attacker may be able to buy a domain name which includes or consists of
non-Latin characters and is available for sale, although it is visually very sim-
ilar to a domain name used by some ‘victim’ website, due to using non-Latin
characters which are visually similar (some are visually almost identical!) to
some Latin characters in the victim domain name. For example, the Latin char-
acter P has a visually-similar Cyrillic character (which also looks like P); hence the
domain name [Link] may be written using the Cyrillic P but appear
visually identical to the ‘real’ [Link] domain name.
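The visual trick is easy to demonstrate with Python's built-in IDNA codec (an illustrative check only; modern browsers apply newer IDNA/UTS-46 rules).

```python
# Cyrillic small er (U+0440) looks like Latin 'p'; the two names below
# render near-identically, yet are different strings, and the homograph
# maps to a clearly different punycode ('xn--...') form on the wire.

latin = "paypal"
homograph = "\u0440aypal"          # first 'p' replaced by Cyrillic U+0440

assert latin != homograph          # different code points...
wire_form = homograph.encode("idna")   # ...and a different ACE encoding
assert wire_form.startswith(b"xn--")
assert latin.encode("idna") == b"paypal"
```

A browser that shows only the punycode form for such mixed-script names makes the homograph visible to the user.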

Exercise 9.3 (Anti-phishing browser). A bank wants to design a special browser
for its employees, which will reduce the risk of them falling for phishing attacks.
It considers the following changes from regular browsers. For each, specify if it
would have significant, small, negligible or no positive impact on security, and
justify, based on secure usability principles.

1. Only allow surfing to SSL/TLS protected (https) sites.
2. Do not open websites as results of clicking on URLs received in emails.
3. If user clicks on URL in email, browser displays warning and asks user
to confirm that the URL is correct before requesting that page.
4. The first time each day that the user surfs to a protected site, pop up a
window with the certificate details and ask the user to confirm the details,
before requesting and displaying the site.

Exercise 9.4. In the Windows operating system, whenever the user installs
new software, a pop-up screen displays details and asks the user to approve in-
stallation - or to abort. Many programs are signed by a vendor, with a certificate
for that vendor from a trusted CA; in this case the pop-up screen displays the
(certified) name of the vendor and the (signed) name of the program. Other
programs are not signed, or the vendor is not certified; in these cases the pop-up
screen displays the names given by the program for itself and for the vendor,
but with clear statement that this was not validated.

1. Identify secure usability principles violated by this design, and explain
how an attacker can exploit human vulnerabilities to get malicious programs
installed.
2. An organization wishes to prevent installation of malware, so it publishes
to its employees a list of permitted software vendors, so that employees
can verify vendor names before installing. Present a criticism of this approach.
3. Propose an alternative method that an operating system could offer,
which would give the organization a more secure way to ensure that only
programs from permitted vendors would be run.

Chapter 10

Review exercises and solutions

10.1 Review exercises


In this section we present additional exercises, which are intended to be used
to review the entire area; namely, solving the exercises requires identification
of the relevant ‘tools’, e.g., should we use MAC, signatures, hashing, PRF,
encryption - or another tool?

Exercise 10.1 (Authorized-operations application). The US Customs allows
imported containers to be stored in special bonded warehouses, from which
they are released to importers after payment of customs fees and, possibly, in-
spections. Specifically, containers are to be accepted and released only upon
receiving an appropriate approval slip from US Customs, containing an approval
code and details (container number, date, accept/release instruction, and other
text - up to 1000 characters). Upon receiving an approval slip, the warehouse
sends a receipt back to US Customs.
Mistakes and fraud may happen, either at the warehouse or at US Cus-
toms, involving accepting or releasing containers without a valid approval slip.
The warehouse keeps a log of all approval slips received.

1. Design efficient and secure processes for generating the approval codes
(by the customs) and validating them, by the warehouse as well as by an
arbitrary auditor. Your design should consist of two functions: generate
code and validate approval slip.
2. Add two more functions: generate receipt (run by the warehouse) and
validate receipt (run by US customs or arbitrary auditor).
3. Extend your design to allow the auditor to also validate the integrity of
the log, i.e., to validate that the log contains all approval slips for which
the customs received (valid) receipts. In this part, do not (yet) optimize
your solution for efficiently auditing (validating) a very large number of
approvals.

4. Optimize your solution for efficiently auditing a very large number of
approvals.

10.2 Solutions to selected exercises


In this section we present solutions to some exercises (beyond the solutions and
hints presented earlier).

Solution to Exercise 2.33


1. The question was to prove that G is (or isn’t) a secure PRG, provided
that one or both of G1 , G2 is a secure PRG. The answer to this part
is no, it isn’t. The proof is quite simple: we show a counterexample.
To do this, we assume we are given a secure PRG, let’s denote it G0 .
We now set both G1 and G2 to be equal to G0 , i.e., G1 (s) = G0 (s) and
G2 (s) = G0 (s). So trivially they are both secure PRGs... but surely, for
any seed/key s:

G(s) = G1 (s) ⊕ G2 (s) = G0 (s) ⊕ G0 (s) = 0^{2n}

Namely, the output is a string of zero bits, whose length is |G0 (s)|, i.e.,
2n.
2. This is again NOT a secure PRG. The argument is very similar to the
previous one. The only difference is that we set G2 (s) = G0 (s ⊕ 1^{|s|} ).
3. The output of G2 will be fixed, since its input is fixed. But the question
is whether G is a secure PRG, so the fact that the output of G2 is fixed doesn’t
imply that G isn’t a secure PRG - on the contrary! The adversary can
obviously compute the output of G2 (0^{|s|} ) by herself, without knowing s
(just knowing its length); so if the adversary can distinguish between G(s)
and a random string, then the adversary can also distinguish between
G1 (s) and a random string, i.e., G1 isn’t a PRG. So it follows that, in this
case, G is a PRG if and only if G1 is a PRG.
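The counterexample of part 1 is easy to check by running it; below, G0 is an arbitrary toy length-doubling construction built from SHA-256 (its choice is irrelevant, since x ⊕ x = 0^{|x|} for every string x).

```python
import hashlib

def G0(seed: bytes) -> bytes:
    # Toy length-doubling function standing in for a secure PRG:
    # expands an n-byte seed to 2n bytes.
    out = (hashlib.sha256(seed + b"0").digest() +
           hashlib.sha256(seed + b"1").digest())
    return out[: 2 * len(seed)]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def G(seed: bytes) -> bytes:
    # The counterexample: G1 = G2 = G0, so the XOR is identically zero.
    return xor(G0(seed), G0(seed))
```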

Solution to Exercise 2.57


Consider a set P of n sensitive (plaintext) records P = {p1 , . . . , pn } belonging
to Alice, where n < 10^6 . Each record pi is l > 64 bits long ((∀i)(pi ∈ {0, 1}^l )).
Alice has very limited memory; therefore, she wants to store an encrypted
version of her records in an insecure/untrusted cloud storage server S; denote
these ciphertext records by C = {c1 , . . . , cn }. Alice can later retrieve the ith
record by sending i to S, who sends back ci , and then decrypting it back to
pi .

1. Alice uses some secure shared key encryption scheme (E, D), with l bit
keys, to encrypt the plaintext records into the ciphertext records. The
goal of this part is to allow Alice to encrypt and decrypt each record i
using a unique key ki , but maintain only a single ‘master’ key k, from
which it can easily compute ki for any desired record i. One motivation
for this is to allow Alice to give keys to specific record(s) ki to some other
users (Bob, Charlie,...), allowing decryption of only the corresponding
ciphertext ci , i.e., pi = Dki (ci ). Design how Alice can compute the key ki
for each record (i), using only the key k and a secure block cipher (PRP)
(F, F^{−1}), with key and block sizes both l bits. Your design should be
as efficient and simple as possible. Note: do not design how Alice gives
ki to relevant users - e.g., she may do this manually; and do not design
(E, D).
Solution: ki = Fk (i)
2. Design now the encryption scheme to be used by Alice (and possibly by
other users to whom Alice gave keys ki ). You may use the block cipher
(F, F −1 ), but not other cryptographic functions. You may use different
encryption scheme (E i , Di ) for each record i. Ensure confidentiality of
the plaintext records from the cloud, from users (not given the key for
that record), and from eavesdroppers on the communication. Your design
should be as efficient as possible, in terms of the length of the ciphertext
(in bits), and in terms of the number of applications of the secure block cipher
(PRP) (F, F^{−1}) for each encryption and decryption operation. In this
part, assume that Alice stores P only once, i.e., never modifies records
pi . Your solution may include a new choice of ki , or simply use the same
as in the previous part.
Solution: k_i = F_k(i),
E^i_{k_i}(p_i) = k_i ⊕ p_i,
D^i_{k_i}(c_i) = k_i ⊕ c_i.
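A runnable sketch of parts 1-2, with HMAC-SHA256 standing in for the block cipher F (so here l = 256); this stand-in is our own assumption, and a real instantiation would use a PRP such as AES. The one-time-pad-style encryption is sound here only because, per the exercise, each record is encrypted once under its key.

```python
import hashlib
import hmac

def F(key: bytes, block: bytes) -> bytes:
    # PRF stand-in for the block cipher F (output: l = 256 bits).
    return hmac.new(key, block, hashlib.sha256).digest()

def record_key(master: bytes, i: int) -> bytes:
    return F(master, i.to_bytes(8, "big"))      # part 1: k_i = F_k(i)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(master: bytes, i: int, p: bytes) -> bytes:
    return xor(record_key(master, i), p)        # part 2: c_i = k_i XOR p_i

def decrypt(ki: bytes, c: bytes) -> bytes:
    # Runs with only the per-record key, as a user given k_i would.
    return xor(ki, c)
```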
3. Repeat, when Alice may modify each record pi few times (say, up to 15
times); let ni denote number of modifications of pi . The solution should
allow Alice to give (only) her key k, and then Bob can decrypt all records,
using only the key k and the corresponding ciphertexts from the server.
Note: if your solution is the same as before, this may imply that your
solution to the previous part is not optimal.
Solution: k_i = F_k(i ++ n_i),
E^i_{k_i}(p_i) = (n_i, k_i ⊕ p_i),
D^i_{k_i}(c_i) = D^i_{k_i}((n_i, c'_i)) = k_i ⊕ c'_i. Note: n_i is encoded as four bits (for
efficiency).
4. Design an efficient way for Alice to validate the integrity of records re-
trieved from the cloud server S. This may include storing additional
information Ai to help validate record i, and/or changes to the encryp-
tion scheme or keys as designed in previous parts. As in previous parts,
your design should only use the block cipher (F, F^{−1}).

Solution: compute k_i as before, and append to each ciphertext c_i a MAC
value m_i = F_{k_i}(c_i). Use the stored MAC value m_i to validate c_i upon
retrieval.
5. Extend the keying scheme from the first part to allow Alice to also com-
pute keys k_{i,j}, for integers i, j ≥ 0 s.t. 1 ≤ i · 2^j + 1 and (i + 1) · 2^j ≤
n, where k_{i,j} would allow (efficient) decryption of the ciphertext records
c_{i·2^j+1} , . . . , c_{(i+1)·2^j} . For example, k_{0,3} allows decryption of records c_1 , . . . , c_8 ,
and k_{3,2} allows decryption of records c_{13} , . . . , c_{16} . If necessary, you may
also change the encryption scheme (E^i , D^i ) for each record i.
Solution: Assume for simplicity that n = 2^m, or, to avoid this assump-
tion, let m = ⌈log_2 (n)⌉, i.e., n ≤ 2^m. Let k_i = k_{i,m}, where we compute
k_{i,m} by the following iterative process. First, let k_{1,0} = k. Then, for
j = 1, . . . , m, we compute k_{i,j}, for i ∈ {1, 2, . . . , 2^j}, as follows:

k_{i,j} = F_{k_{⌊(i+1)/2⌋, j−1}}(i mod 2)
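This key hierarchy (a GGM-style key tree) can be sketched as follows, again with HMAC-SHA256 as an illustrative stand-in for the PRF F. Deriving a key walks the path from the root, and any intermediate key suffices to derive every key in its subtree - which is exactly what lets Alice hand out one key for a whole range of records.

```python
import hashlib
import hmac

def F(key: bytes, bit: int) -> bytes:
    # PRF stand-in for F; the input is the single bit (i mod 2).
    return hmac.new(key, bytes([bit]), hashlib.sha256).digest()

def key_at(master: bytes, i: int, j: int) -> bytes:
    # Derive k_{i,j} (1 <= i <= 2^j) recursively from the root k_{1,0}:
    #   k_{i,j} = F applied under the parent key k_{floor((i+1)/2), j-1}
    #             to the bit (i mod 2).
    if j == 0:
        return master                  # k_{1,0} = k
    parent = key_at(master, (i + 1) // 2, j - 1)
    return F(parent, i % 2)
```

Holding k_{2,1}, a user can derive its two children k_{3,2} and k_{4,2} (and, recursing further, the leaf keys of that subtree) without ever learning the master key.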

Solution to Exercise 3.15


1. Let f_k : D → {0, 1}^n. Pick some (known) values x, y ∈ D, where x ≠ y.
Then use k' = f_k(x) and k'' = f_k(y). (This can easily be extended to
the case that the range of f is not {0, 1}^n.) Since f is a PRF, k'
and k'' are pseudorandom.
2. Let Ê_k be a reversible PRP (block cipher) over {0, 1}^n. Pick some
(known) values x, y ∈ {0, 1}^n, where x ≠ y. Then use k' = Ê_k(x) and
k'' = Ê_k(y). Since Ê is a PRP, k' and k'' are pseudorandom.
3. Use k' = PRG(k)[1, . . . , n] and k'' = PRG(k)[n + 1, . . . , 2n]. Then
k' ++ k'' is pseudorandom, which implies that k' and k'' are pseudorandom.

Solution to Exercise 3.3


Suppose A knows F_k(m_0 ++ m_0) = E_k(0 ++ m_0) ++ E_k(1 ++ m_0) and F_k(m_1 ++ m_1) =
E_k(0 ++ m_1) ++ E_k(1 ++ m_1), where m_0 ≠ m_1. Then it is easy for A to compute
F_k(m_0 ++ m_1) = E_k(0 ++ m_0) ++ E_k(1 ++ m_1). (Solution by SW)
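The mix-and-match forgery can be demonstrated directly. Below, E is simulated by HMAC-SHA256 (an illustrative PRF stand-in) and ‘++’ is byte-string concatenation; the tag of m_L ++ m_R is E_k(0 ++ m_L) ++ E_k(1 ++ m_R), so halves of two honest tags splice into a forgery.

```python
import hashlib
import hmac

def E(key: bytes, block: bytes) -> bytes:
    # PRF stand-in for E; b"0"/b"1" play the role of the prepended bit.
    return hmac.new(key, block, hashlib.sha256).digest()

def broken_mac(key: bytes, m_left: bytes, m_right: bytes) -> bytes:
    # F_k(mL ++ mR) = E_k(0 ++ mL) ++ E_k(1 ++ mR)
    return E(key, b"0" + m_left) + E(key, b"1" + m_right)

def forge(tag_00: bytes, tag_11: bytes) -> bytes:
    # Given tags of (m0 ++ m0) and (m1 ++ m1), splice the halves into
    # a valid tag for (m0 ++ m1), which the attacker never queried.
    half = len(tag_00) // 2
    return tag_00[:half] + tag_11[half:]
```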

Solution to Exercise 3.8


Let F', F'' be two MAC schemes, and define F_{k',k''}(m) = F'_{k'}(m) ⊕ F''_{k''}(m).
WLOG, assume F' is secure and that F'' may or may not be secure. We now
show, by contradiction, that F must be secure. Assume, to the contrary, that
F is not a secure MAC. Namely, assume an attacker ADV that, given access
to an oracle that computes F_{k',k''} on any value µ ≠ m, can output a pair
(m, F_{k',k''}(m)). We use ADV to construct an adversary ADV' against
F'.
Adversary ADV' operates by running ADV, as well as selecting a key k''
and running F''_{k''}(·). Whenever ADV makes a query q, ADV' makes the
same query to the F'_{k'}(·) oracle, to receive F'_{k'}(q). Then, ADV' computes by
itself F''_{k''}(q), combines it with F'_{k'}(q) by computing the XOR, and returns the
response F'_{k'}(q) ⊕ F''_{k''}(q) to ADV.
When ADV finally returns a pair (m, F_{k',k''}(m)), ADV' computes F''_{k''}(m)
and returns (m, F''_{k''}(m) ⊕ F_{k',k''}(m)), which is the same as (m, F'_{k'}(m)) - as
required.

Solution to Exercise 4.1


The attacker can reorder the letters (e.g., ‘obb’ and ‘bbo’), and their ‘sum’ will
be the same. The attacker can also ‘increase’ one letter and ‘decrease’ another
(e.g., ‘cnb’). The hash values of these strings are equal to h(‘bob’).
(solution by SW)
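These collisions are trivial to check mechanically; a minimal sketch of the ‘sum of letter codes’ hash from the exercise (the function name is illustrative):

```python
def letter_sum_hash(s: str) -> int:
    # 'Sum of letter codes' hash: trivially collision-prone, since both
    # reordering letters, and increasing one letter while decreasing
    # another, preserve the sum.
    return sum(ord(c) for c in s)
```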

Solution to Exercise 4.3


a) If h is a CRHF, then no collisions can be efficiently found for it, so it must
also be SPR. b) For h', just pick some two inputs and specify that they have the
same output, while for all other inputs, the output is the same as for h. Then
there is a known collision for h', so h' is not a CRHF. Yet, for all other inputs,
the outputs of h' are the same as for h. This means that for a randomly chosen,
sufficiently long x, no adversary should be able to efficiently find a collision x'
with non-negligible probability, so h' is also SPR.
(solution by SW)

Solution to Exercise 4.5


In the following two HTML files, the prefix p is 23 blocks (184 bytes), the suffix
s is 32 blocks (256 bytes), and x, x' are each two blocks (16 bytes). Specifically,
x is ‘1111111122222222’ and x' is ‘2222222211111111’ (where each character is
encoded in one byte). As a result, the only difference between the two files is
that the blocks x24 and x25 are switched. Consequently, the hashes of the two
files are the same, i.e., h(D1 ) = h(DM ).
File D1 :

<!DOCTYPE html>
<head>
<meta charset="UTF-8">
<title>Message</title>
</>
<body>
<p id="dem"></p>

<script>
var x=0;
var msg;
if (1111111122222222==1111111122222222) {

msg = "Pay 1$ to Amazon.";
} else {
msg = "Pay one million dollars to Mel.";
}
document.getElementById("dem").innerHTML = msg;
</script>

</body>
<!-- comment -->
</html>

File DM :

<!DOCTYPE html>
<head>
<meta charset="UTF-8">
<title>Message</title>
</>
<body>
<p id="dem"></p>

<script>
var x=0;
var msg;
if (2222222211111111==1111111122222222) {
msg = "Pay 1$ to Amazon.";
} else {
msg = "Pay one million dollars to Mel.";
}
document.getElementById("dem").innerHTML = msg;
</script>

</body>
<!-- comment -->
</html>

(solution by SW)

Solution to Exercise 4.6


Notice that if x mod 2^n = 0, then it is easy to find a value x' s.t. g(x') = g(x)
(e.g., x' = 0^n). However, the domain of g is {0, 1}*, so if x is chosen randomly,
then the probability that x mod 2^n = 0 is only 2^(-n) (negligible). So usually,
since h is a OWF, it is hard to find x' s.t. g(x') = g(x), because (usually) this
would require finding x' s.t. h(x') = h(x). This means that g is also a OWF.
For f(x) = g(g(x)), though, the output is always 0^(2n) (because g(x) mod 2^n
always equals 0). Therefore, it is always trivial to find x' s.t. f(x') = f(x)
(again, x' = 0^n works, for example), so f is not a OWF, and by the same
reasoning, OTP-chain using g is insecure.
(solution by SW)

Solution to Exercise 4.8


Consider each pair of the sampled bits (in the original sequence). The two bits
are sampled independently, where 1 is generated with probability p, and 0 is
generated with probability 1 − p. Consequently, in each pair, the probability
that the first bit is 1 and the second bit is 0 is Pr [ (1, 0) ] = (p)(1 − p), while
the probability that the first bit is 0 and the second bit is 1 is Pr [ (0, 1) ] =
(1 − p)(p). Namely, for each pair, these two probabilities are equal. After
removing the pairs where both bits are the same, only the (1, 0) and (0, 1)
pairs remain in the modified sequence, so Pr[(1, 0)] = 1/2 and Pr[(0, 1)] = 1/2.
Formally, this can be shown using conditional probability.
First define events and probabilities in the original sequence:

A = pair is (1, 0),  Pr[A] = (p)(1 − p)
B = pair is (0, 1),  Pr[B] = (1 − p)(p)
C = pair is (1, 1),  Pr[C] = p^2
D = pair is (0, 0),  Pr[D] = (1 − p)^2

Removing pairs where both bits are the same corresponds to conditioning on
the complement of C ∪ D, denoted ¬(C ∪ D). So to compute the probability
that a pair is (1, 0) in the modified sequence, we compute the probability of A
given ¬(C ∪ D), as follows:

Pr[A | ¬(C ∪ D)] = Pr[A ∩ ¬(C ∪ D)] / Pr[¬(C ∪ D)] = Pr[A] / (1 − Pr[C ∪ D])
                 = (p)(1 − p) / (1 − (p^2 + (1 − p)^2))
                 = (p)(1 − p) / (2p − 2p^2) = (p)(1 − p) / (2p(1 − p)) = 1/2

Similarly, it can be shown that Pr[B | ¬(C ∪ D)] = 1/2. Thus, in the modified
sequence, Pr[(1, 0)] = Pr[(0, 1)] = 1/2.

In the final sequence, each (1, 0) pair is replaced by a 1 and each (0, 1) pair
is replaced by a 0. Therefore, the output is uniformly random, since each bit
is 1 with probability 1/2 and 0 with probability 1/2.
(solution by SW)
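This de-biasing procedure (the von Neumann extractor) is short enough to sketch directly:

```python
def von_neumann_extract(bits):
    """Von Neumann extractor: for each disjoint pair, output 1 for
    (1, 0), output 0 for (0, 1), and discard (0, 0) and (1, 1).

    If the input bits are i.i.d. with Pr[1] = p, each output bit is
    uniform, by the conditional-probability argument above."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(1 if (a, b) == (1, 0) else 0)
    return out

# (1,0) -> 1, (0,1) -> 0; (1,1) and (0,0) are dropped:
assert von_neumann_extract([1, 0, 0, 1, 1, 1, 0, 0]) == [1, 0]
```

Note that the extractor is wasteful: a pair is kept only with probability 2p(1 − p), so the output is shorter than the input.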

Solution to Exercise 5.1


Sketch of Solution: An adversary can choose two same-length messages m0, m1
such that the first one, m0, is chosen randomly, and the second one, m1,
is highly redundant (e.g., all zeros). If Compress is any (reasonable) com-
pression function, then |Compress(m0)| will be greater than |Compress(m1)|.
This means that |Enc'_k(m0)| = |Enc_k(Compress(m0))| will be greater than
|Enc'_k(m1)| = |Enc_k(Compress(m1))|. (We can assume that the adversary

Alice Eve Bob

A, NA = 1234

A, NA = 1111

Ek (1234), NB = 5678

Ek (1234), NB = 1111

Ek (1111)

Ek (1111), NB = 2222

Ek (2222)

Figure 10.1: Example sequence diagram for Exercise 5.2.2

chooses the messages so that the difference in the compressed lengths is such
that the encrypted lengths are also different - this must be true for some lengths
if Enc takes inputs of any length.) Thus, the adversary will be able to distin-
guish between Enc'_k(m0) and Enc'_k(m1).
(solution by SW)
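The length leak is easy to observe with any real compressor; the sketch below uses zlib as a stand-in for Compress (an illustrative assumption) and compares compressed lengths directly, since a roughly length-preserving Enc would preserve the gap.

```python
import os
import zlib

# m0 is random (essentially incompressible), m1 is highly redundant.
m0 = os.urandom(1000)
m1 = bytes(1000)  # all zeros

c0 = zlib.compress(m0)
c1 = zlib.compress(m1)

# |Enc'_k(m1)| = |Enc_k(Compress(m1))| < |Enc'_k(m0)|: the adversary
# distinguishes the two ciphertexts by length alone.
assert len(c1) < len(c0)
```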

Solution to Exercise 5.19


1. The attack is simply a replay of a handshake: once the base reuses
the same triplet (and specifically sends the same r), the attacker simply
replays the s value sent to the user (and later messages).
2. The improvement to the protocol is simply to compute s_B = PRF_s(r_B),
where s is derived as in the original protocol.
3.
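The fix in item 2 can be sketched as follows, using HMAC-SHA256 as a stand-in for the PRF (an assumption; any secure PRF works):

```python
import hmac
import hashlib

def prf(key: bytes, data: bytes) -> bytes:
    # HMAC-SHA256 as a stand-in PRF.
    return hmac.new(key, data, hashlib.sha256).digest()

# s is derived as in the original protocol; s_B now also depends on the
# fresh nonce r_B, so replaying an old response fails for a new nonce.
s = b'derived-session-secret'
s_B = prf(s, b'nonce-r_B-0001')
assert s_B != prf(s, b'nonce-r_B-0002')
```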

Solution to Exercise 5.2.2


The attack is illustrated by the sequence diagram in Figure 10.1.
(solution by SW)

Solution to Exercise 5.4


Consider, for example, the CFB mode encryption scheme. In this case, the
attacker can send N_A and receive N_B and E^CFB_k(N_A) = (IV, E_k(IV) ⊕ N_A),
where E^CFB is the scheme and E is a block cipher. Then the attacker can
compute (E_k(IV) ⊕ N_A) ⊕ N_A ⊕ N_B = E_k(IV) ⊕ N_B, and can send
(IV, E_k(IV) ⊕ N_B) = E^CFB_k(N_B), all without knowing the key k.
(solution by SW)
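The attack is mechanical enough to sketch. Since CFB encryption uses the block cipher only in the forward direction, the sketch below substitutes a hash-based PRF for E_k (an assumption, for illustration only):

```python
import hashlib

def E(k: bytes, block: bytes) -> bytes:
    # Stand-in for the block cipher's forward direction.
    return hashlib.sha256(k + block).digest()[:16]

def cfb_encrypt_block(k: bytes, iv: bytes, m: bytes) -> bytes:
    # First block of CFB: c = E_k(IV) XOR m.
    return bytes(a ^ b for a, b in zip(E(k, iv), m))

k, iv = b'K' * 16, b'I' * 16
NA, NB = b'A' * 16, b'B' * 16

c = cfb_encrypt_block(k, iv, NA)  # attacker observes (iv, c)

# Attacker computes c XOR NA XOR NB = E_k(IV) XOR NB, a valid
# encryption of NB under the same IV, without knowing k:
forged = bytes(x ^ a ^ b for x, a, b in zip(c, NA, NB))
assert forged == cfb_encrypt_block(k, iv, NB)
```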

Solution to Exercise 6.25


Figure 6.20 shows a vulnerable variant of the Ratchet DH protocol, using a
(secure) pseudorandom function f to derive the session key. Assume that this
protocol is run daily, from day i = 1, and where k_0 is a randomly-chosen secret
initial key, shared between Alice and Bob; messages on day i are encrypted
using key k_i. An attacker can eavesdrop on the communication between the
parties on all days, and on days 3, 6, 9, . . . it can also spoof messages (send
messages impersonating either Alice or Bob), and act as Monster-in-the-
Middle (MitM). On the fifth day (i = 5), the attacker is also given the initial
key k_0.

• On which day can the attacker first decrypt messages?
Answer: the fifth day.
• On the day you specified, messages of which days can the attacker
decrypt?
Answer: all days up to then (i.e., the first to the fifth); the attacker would
also be able to decrypt messages on the following days, once sent.
• Explain the attack, including a sequence diagram if relevant. Include
every calculation done by the attacker.
Answer: The protocol (of Figure 6.20) contains a mistake: instead of
the DH value g^(a_i·b_i) mod p, it uses g^(a_i+b_i) mod p; but the latter
can be easily computed by an eavesdropper from the values sent by the
parties, i.e., after each round i the attacker computes:

x_i ≡ g^(a_i+b_i) ≡ (g^(a_i) mod p) · (g^(b_i) mod p) (mod p)

Hence, this attack requires only passive, eavesdropping capabilities, and
there is no need for a sequence diagram - the protocol is simply run as
designed, i.e., on the i-th day, we have the exchange as in Figure 6.20.
On the fifth day, once k_0 is known, the attacker can also compute
k_i = f_{k_(i−1)}(x_i) for all days up to then, i.e., i ∈ {1, . . . , 5}; and on
each following day j > 5, the attacker can compute the day's key
immediately after the parties perform the daily exchange.
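The eavesdropper's computation can be checked numerically with toy parameters (the values below are illustrative assumptions; real deployments use large safe primes):

```python
p, g = 467, 2                # toy group parameters
a_i, b_i = 123, 321          # the parties' secret exponents for round i

A = pow(g, a_i, p)           # sent by Alice, observed by the attacker
B = pow(g, b_i, p)           # sent by Bob, observed by the attacker

# The flawed "shared value" g^(a_i + b_i) mod p is just A*B mod p,
# computed without knowing a_i or b_i:
x_i = (A * B) % p
assert x_i == pow(g, a_i + b_i, p)
```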

Solution to Exercise 6.38


Many applications require both confidentiality, using recipient’s public encryp-
tion key, say B.e, and non-repudiation (signature), using sender’s verification
key, say A.v. Namely, to send a message to Bob, Alice uses both her private
signature key A.s and Bob’s public encryption key B.e; and to receive a mes-
sage from Alice, Bob uses his private decryption key B.d and Alice’s public
verification key A.v.

1. It is proposed that Alice will select a random key k and send to Bob the
triplet: (c_K, c_M, σ) = (E_{B.e}(k), k ⊕ m, Sign_{A.s}(‘Bob’ ++ (k ⊕ m))),
where ++ denotes concatenation. Show this design is insecure, i.e., a
MitM attacker may either learn the message m or cause Bob to receive a
message ‘from Alice’ that Alice never sent.
Answer: The MitM attacker Monster captures the specified triplet (x, c, t)
sent by Alice to Bob, where x = E_{B.e}(k), c = k ⊕ m and
t = Sign_{A.s}(‘Bob’ ++ (k ⊕ m)). Monster sends to Bob a modified
triplet (x', c, t), i.e., it modifies just the first element in the triplet; this
allows Monster to ensure that Bob will receive a message m' chosen by
Monster, instead of m (and Bob will believe that m' was sent by Alice).
To do so, Monster sets x' = E_{B.e}(c ⊕ m'); Bob then decrypts x' to
k' = c ⊕ m' and recovers c ⊕ k' = m'.
2. Propose a simple, efficient and secure fix. Define the sending and receiv-
ing process precisely.
Answer: Sign also the ciphertext, i.e., change the last component in the
triplet to t = Sign_{A.s}(‘Bob’ ++ c_K ++ (k ⊕ m)).
3. Extend your solution to allow prevention of replay (receiving multiple
times a message sent only once).
Answer: Sign also a counter (or time-stamp - any monotonically-increasing
value), i.e., change the last component in the triplet to
t = Sign_{A.s}(‘Bob’ ++ i ++ c_K ++ (k ⊕ m)), where i is the number of
messages sent by Alice so far; each recipient has to keep the last message
number received from each sender.
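The attack of part 1 can be sketched end to end with toy stand-ins (assumptions, for illustration only): "public-key" encryption is XOR with a pad derived from the recipient's key, and the signature is an HMAC under Alice's key; only the structure of the attack matters here.

```python
import hashlib
import hmac

def pk_encrypt(pk: bytes, k: bytes) -> bytes:
    # Toy stand-in for E_{B.e}: XOR with a pad derived from pk.
    pad = hashlib.sha256(pk).digest()[:len(k)]
    return bytes(a ^ b for a, b in zip(pad, k))

pk_decrypt = pk_encrypt  # XOR is its own inverse in this toy scheme

def sign(sk: bytes, data: bytes) -> bytes:
    # Toy stand-in for Sign_{A.s}.
    return hmac.new(sk, data, hashlib.sha256).digest()

B_e, A_s = b'bob-enc-key', b'alice-sig-key'
m = b'Pay 1$ to Amazon'
k = b'\x13' * len(m)

# Alice sends (x, c, t); note the signature covers only 'Bob' ++ c.
x = pk_encrypt(B_e, k)
c = bytes(a ^ b for a, b in zip(k, m))
t = sign(A_s, b'Bob' + c)

# Monster swaps only x so that Bob recovers a chosen message m2:
m2 = b'Pay $1M to Mel!!'
x2 = pk_encrypt(B_e, bytes(a ^ b for a, b in zip(c, m2)))

k2 = pk_decrypt(B_e, x2)
received = bytes(a ^ b for a, b in zip(c, k2))
assert received == m2                  # Bob gets Monster's message...
assert t == sign(A_s, b'Bob' + c)      # ...and the signature still verifies
```

Signing also the encrypted key, as in part 2, binds x into t and blocks this substitution.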

Index

(Optional) OCSP stapling, 360 Blockchain, 174


cryptographic hash function, 65 blockchain, 137, 154, 180
2lMT , 175–177 botnets, 372
MT , 178, 179 bounded key length stream cipher,
2PP, 201, 203 32
BREACH, 197
ACR, 143, 148, 177
adoption challenge, 363 CA, 112, 273, 321–323
AEAD, 197 CA certificates, 339
AES, 241 cascade, 91
anonymity, 180, 185, 265 CBC-MAC, 72, 112–114, 161
ASN.1, 332 CBC-MAC mode, 115
any collision resistance, 143, 148 CCA, 24, 57, 62
any-collision resistant, 148 CDH, 253, 255
as-needed, 349 certificate, 111
asymmetric cryptography, 226, 227 Certificate Authority, 321, 322
asymmetric cryptology, 235 certificate authority, 112, 273, 323
asymptotic security, 36, 37 Certificate policy, 340
Asynchronous-DH-Ratchet, 260 certificate policy, 340
attack model, 11, 29 Certificate Revocation List, 349
Authenticated DH, 257, 258, 280 Certificate Revocation Vectors, 351
authenticated key-exchange, 250 Certificate Revocations List, 349
authenticated-encryption with asso- Certificate Status Request, 360
ciated data, 197 Certificate Status Response, 359
Certificate Transparency, 320, 323,
backward compatibility, 302 371
basic constraints, 341, 342, 344, 347 certificates, 109
birthday attack, 148 certification path, 335, 342
birthday paradox, 145 Checksum, 195
Bitcoin, 182, 183 chosen ciphertext attack, 57
bitmask, 158 Chosen Plaintext Attack, 27
bitwise-randomness extracting, 158 chosen plaintext attack, 57
block cipher, 32, 51, 55, 65, 113 chosen-ciphertext attack, 24
block ciphers, 13, 120 chosen-plaintext attack, 24, 60
block-cipher, 39 chosen-prefix collision, 324

chosen-prefix collisions, 139 CT, 320, 323, 371
Cipher-agility, 218 CTO, 20–22, 24, 57, 58, 62, 63, 78
cipher-agility, 296 cyclic group, 244
cipher-text only, 57
ciphersuite, 218 Data Encryption Standard, 8, 26
ciphersuite negotiation, 198, 218 DDH, 254, 255
ciphertext, 24 Decisional DH, 254
Ciphertext-Only, 78 Denial-of-Service, 135, 358
ciphertext-only, 24, 58 denial-of-service, 200
ciphertext-only attack, 21 DES, 8, 26
client authentication, 323 DGA, 372
client certificate, 323 DH, 111, 225, 241, 244
client-server, 213 DH key exchange, 235
collision, 139 DH protocol, 257
collision-resistance, 164 DH-h public key cryptosystem, 263
collision-resistant, 140, 143 differential cryptanalysis, 66
collision-resistant hash, 107 Diffie-Hellman, 111, 241, 244, 252
Collision-Resistant Hash Function, Diffie-Hellman Key Exchange, 8
147, 164 Diffie-Hellman Ratchet, 258
collision-resistant hash function, 168 Diffie-Hellman, 225
collisions, 69 digest, 164, 167
Compress-then-Encrypt Vulnerabil- digest function, 171
ity, 197 Digest scheme, 137, 164
compression function, 167 Digest-Chain, 164
Computational DH, 253 digest-chain, 154, 164
concrete security, 36, 37, 56 digital signature, 236
conservative design, 61, 106, 111 digital signatures, 235
counter-mode, 44 digitized handwritten signatures, 111
CP, 340, 354 discrete logarithm, 244, 245
CPA, 24, 27, 57, 60 discrete logarithm assumption, 245
CRHF, 107, 147, 148, 164, 167, 168 Distinguished Name, 326, 327, 337
CRIME, 197 distinguished name, 327
critical, 363 Distinguished Names, 326
critical extensions, 334 DLA, 245
criticality indicator, 334 DN, 326, 327
CRL, 349 DNS, 337, 340
CRLsets, 351 dNSName, 337
CRV, 351 Domain Generation Algorithm, 372
cryptocurrency, 182, 185 domain name, 321
cryptographic building blocks, 120, Domain Name System, 337
142 domain name system, 340
cryptographic building blocks prin- Domain Validation, 339, 340
ciple, 296 domain validation, 368
cryptographic hash, 27 DoS, 135
cryptographic hash functions, 120 dot notation, 105
CSR, 359, 360

Double Ratchet Key-Exchange, 261, FHE, 265
262 FIL, 120
Double-Ratchet Key-Exchange, 262, FIL-MAC, 104
263 Fixed Input Length, 104, 120
Double-ratchet key-exchange, 257 Flash Crowd, 358
Downgrade Attack, 218 Forward Secrecy, 234
downgrade attack, 218, 296 forward secrecy, 222–224, 228, 238
DV, 339, 340 freshness, 209
FSR, 41
e-voting, 265 Fully Homomorphic Encryption, 265
eavesdropping adversary, 237, 248
ECC, 63, 195 generic attack, 25
ECIES, 241 gossip, 376, 378
EDC, 195 GSM, 22, 23, 41, 63, 78, 86, 212
effective key length, 25, 28, 146 GSM handshake protocol, 231
efficient, 12, 36 GSM security, 197, 214, 216
El-Gamal, 241, 244
El-Gamal PKC, 264 Hardware Security Module, 210, 222
email address, 340 hash-chain, 138, 154
email validation, 340 Hash-then-Sign, 106, 147, 148
Encrypt-then-Authenticate, 196 hash-then-sign, 137, 147, 242, 324,
entity authentication, 197 332
equivocation detection, 323 Heartbleed Bug, 324
equivocation prevention, 323 Heartbleed bug, 350
Error Correction Code, 195 HMAC, 142, 163, 257
Error Detection Code, 195 homograph attack, 372
error localization, 77 homomorphic, 265
Error-Correcting Code, 63 HPKP, 370
Euler’s function, 268 HSM, 210, 222
Euler’s Theorem, 268 HSTS, 304
EV, 339, 340 HtS, 106, 147, 148, 242
evidence, 238 HtS-TCR, 149
exhaustive search, 24, 25, 28, 241 HTTP Strict Transport Security, 304
existential unforgeability, 106 hybrid encryption, 242, 243, 268
existentially unforgeable, 110
existentially unforgeable signature, identity certificates, 321
106, 107, 128, 147 IDS, 134
extend, 170 IETF, 336
extend function, 169 IMSI, 215
Extended Merkle Digest, 378 IND-CCA, 75, 76, 82
Extended Validation, 339, 340 IND-CCA1 secure, 335
Extract-then-Expand, 255 IND-CPA, 60–63, 68, 75, 81
extract-then-expand, 159 independently pseudorandom, 53
indistinguishability test, 32
factoring, 244 indistinguishability test, 32
Feedback Shift Register, 41 initialization vector, 92

intermediate CA, 341 letter-frequency attack, 21, 24
International Mobile Subscriber Iden- linear cryptanalysis, 66
tity, 215 logger, 372
International Telecommunication Union,
325 MAC, 104, 106, 161, 221, 236, 239,
Intrusion-Detection Systems, 134 257
Intrusion-Prevention Systems, 135 malware, 372
IP address, 340 malware/virus scanners, 134
IPS, 135 master key, 257
IPsec, 194 MD-strengthening, 168
issuer, 321, 322 Merkle digest, 172
Issuer Alternative Name, 338 Merkle digest scheme, 173, 174
IssuerAltName, 337 Merkle Tree, 178, 179
ITU, 325, 332, 336 Merkle tree, 172
Merkle-Damgård, 165
KDC, 212, 238 Message Authentication Code, 106,
KDF, 156, 164, 255–257 161, 236, 257
Kerberos, 214 message recovery, 275
Kerckhoffs’ principle, 7 mining, 183, 186
Kerckhoffs’ Principle, 15 MitM, 111, 193, 204, 209, 218, 219,
Key derivation, 255 238, 248, 250, 252, 273,
key derivation, 256 297, 340, 369
Key Derivation Function, 156, 164, MitM soft-fail, 359, 361–363
255, 256 mode of operation, 46
Key Distribution, 197 Monster in the Middle, 193
Key distribution Center, 212 monster-in-middle, 273
Key Distribution Protocol, 212 Monster-in-the-Middle, 111, 204, 218,
Key exchange, 236 250, 297, 355
key exchange, 235, 236, 247 Multi-cert OCSP requests for Certificate-
key generation, 236 Path, 354
key usage, 333 Must-Staple, 361–364
key-exchange, 111, 225 must-staple, 382
key-separation principle, 52, 211, 227
key-setup, 197 name constraint, 333, 344, 347
key-setup 2PP extension, 211, 223 name constraints, 341
keyed collision resistant, 143 negligible, 37
keyed CRHF, 142, 143, 146 non-critical, 363
keyed hash, 148, 149 non-critical extensions, 334
Keyed HtS, 148 non-repudiation, 109, 110, 273, 274
keyed-CRHF, 148 nonce, 201, 204
Keyed-HtS, 148 nonces, 209
keyless CRHF, 146, 147
keyless Hash-then-Sign, 147 Object Identifier, 330
known-plaintext, 24 object identifier, 332, 333
known-plaintext attack, 57 OCSP, 320, 353
KPA, 24, 57, 62 OCSP client, 352, 354

OCSP Must-Staple, 359 permissionless, 182
OCSP request, 352 permutation, 54
OCSP responder, 352 PFS, 222, 225, 226, 228, 238, 257–
OCSP response, 352 259, 261, 299
OCSP server, 354 PGP, 381
OCSP Stapling, 359 PHE, 265
OCSP stapling, 358, 359 phishing, 369, 372
OFB, 46 PKC, 8, 63, 111, 226, 236, 239,
OFB-mode, 44 240, 247, 320
off-path, 111, 340 PKI, 110–112, 320, 322
OID, 330, 332, 333, 339, 340 PKIX, 320, 336, 337
One Time Pad, 30, 32 plaintext, 24
one-time pad, 31 PoC, 182
one-time password, 154 Pohlig-Hellman algorithm, 245
one-time signatures, 147 PoI, 177, 182, 373, 376, 377
one-time-pad, 225 PoL, 376, 377, 379
One-Way Function, 153 policy constraints, 341, 347
one-way function, 39, 142, 147, 153 policy mapping, 348
OneCRL, 351 PoM, 374, 376–379
oracle, 47, 49, 60 PoW, 138, 182, 183
oracle access, 12, 60 PPT, 12, 36, 50, 243, 245, 253,
Organization Validation, 340 256
Origin Validation, 339, 340 prefetch, 349
OTP, 30, 32, 154, 225 Preimage resistance, 153
OTP-chain, 138, 154 preimage resistant, 153
Output Feedback, 46 preloaded, 304
OV, 339, 340 PRF, 32, 39, 42, 44, 45, 47–49,
OWF, 138, 142, 147, 153, 154 53, 65, 88, 136, 215, 227,
256, 257
partially-homomorphic encryption, PRG, 25, 32, 33, 35, 36, 38, 88,
265 225
PBR, 71 PRG indistinguishability test, 34
Per-Block Random, 71 privacy, 265
Per-goal keys, 53 private key, 236
per-goal keys, 211 Proactive security, 238
Perfect Forward Secrecy, 225, 226, proactive security, 227
228, 238, 257, 258 Probabilistic Polynomial Time, 12,
Perfect forward secrecy, 299 36, 243
perfect forward secrecy, 222, 228 Proof of Inclusion, 373
Perfect Recover Secrecy, 225, 226, Proof of Maliciousness, 376
257, 258 Proof of Misbehavior, 374
perfect recover secrecy, 222, 226, 228, Proof-of-Consistency, 171, 174, 182
229 Proof-of-Inclusion, 171, 173, 177,
perfect-forward secrecy, 261 182, 375–377
perfect-recover secrecy, 261 Proof-of-Log, 376, 377, 379
permissioned, 182 Proof-of-Misbehavior, 377–379

Proof-of-Stake, 183 quadratic residue, 246
Proof-of-Work, 138, 182, 183
protocols, 191 rainbow tables, 27
proxy re-encryption, 267 random functions, 43, 53
PRP, 32, 53, 55, 67 random functions., 42
PRP/PRF switching lemma, 66, 69, Random Oracle Model, 160
113 Random oracle model, 136
PRS, 222, 225–229, 257–259, 261 random oracle model, 142
Pseudo Random Function, 215 Random Oracle Model (ROM), 160
Pseudo-Random Function, 39, 42, random permutation, 53
44, 45, 47–49, 65, 227, 256 randomness extracting hash function,
pseudo-random function, 32, 136, 255
228 randomness extraction, 156, 164
Pseudo-Random Functions, 43 ransomware, 372
pseudo-random functions, 32 RC4, 41
Pseudo-Random Generator, 25, 32, record protocol, 194
33, 35, 36, 38 recover secrecy, 222, 224, 225, 228,
pseudo-random generator, 32, 225 229
pseudo-random generators, 32 recover security, 224
Pseudo-Random Permutation, 55, Recover-Security Handshake, 224, 262
67 relying parties, 325
pseudo-random permutation, 32, 53 relying party, 320, 322, 333, 342
pseudo-randomness, 32 resiliency to exposure, 238
pseudorandomness, 32 Resiliency to Exposures, 257
public key, 236 resiliency to exposures, 198
public key certificate, 110, 112, 238, Resiliency to key exposure, 222
273, 321, 328 resiliency to key exposure, 225
public key certificates, 151 robust combiner, 51, 289
public key cryptography, 320 ROM, 138, 142, 160
Public Key Cryptology, 8 root CA, 323, 341
Public key cryptology, 239 root certificate authorities, 323
public key cryptology, 225, 228, 235, routing, 340
247 RS-Ratchet, 225, 258
Public key cryptosystem, 236 RSA, 29, 111, 241, 244, 270
public key cryptosystem, 111, 147, RSA assumption, 270
236 RSA signatures, 275
public key cryptosystems, 63
Public Key Infrastructure, 109, 320, S/MIME, 323
336 safe prime, 245, 247, 251, 254, 255
public key infrastructure, 110, 111 salt, 255, 256
public key(s), 325 SAN, 337, 380
public-key cryptography, 194, 226 scam, 369, 372
public-key cryptosystem, 241, 247 Schnorr’s group, 254
public-key encryption, 65, 235 SCSV, 303
public-key infrastructure, 112 SCT, 372
second preimage resistance, 136

second-preimage resistance, 142, 149 table look-up attack, 27
Second-Preimage Resistant, 142 Target Collision Resistant, 144, 146
second-preimage resistant, 150 target collision-resistant, 144
secure in the standard model, 161 target-collision resistant, 148
Secure Session Protocols, 194 TCR, 144, 146, 148, 177
secure session transmission proto- textbook RSA, 268
col, 194 TextSecure, 262
Secure Socket Layer, 194 The UX > Security Precedence Rule,
security parameter, 49, 60, 204 357
security requirements, 11 Threshold security, 238
session key, 197 threshold security, 227
session protocol, 194 TIME, 197
Session Ticket Encryption Key (STEK), time-memory tradeoff, 25, 27
311 time/memory/data tradeoff, 27
session-authentication, 197 Timestamp, 208
shared-key authentication-handshake timestamp, 209
protocols, 197 Timestamps, 209
Shared-key Entity-Authenticating Hand- timing side channel, 82
shake Protocols, 198 TLDcc, 370
side channel, 82 TLS, 194, 195, 197, 285, 320–323,
side-channel, 358 360, 369
Signaling Cipher Suite Value, 303 TLS extension, 359
signature scheme, 65, 104, 236, 272 TLS feature, 334
signature schemes, 101, 274 TLS FALLBACK SCSV, 303
signature with appendix, 274 to-be-signed, 331
signature with message recovery, 274, TOFU, 111, 273, 371
275 top-level domain country codes, 370
smooth, 247 transparency, 322
SNA, 200–203 Transport-Layer Security, 194, 285
SNA handshake protocol, 200, 203 truncation, 195
soft-fail, 361 trust anchor, 335–337, 341, 343
SPR, 136, 142, 149, 150 trust on first use, 111, 273
SSL, 194, 195, 320–323, 369 Trust-On-First-Use, 371
SSL-Stripping, 304 trusted third party, 212
stapled-OCSP, 352 TTP, 212
stream cipher, 31 two-layered hash tree, 172
stream ciphers, 13 Two-layered Merkle Tree, 176
SubjectAltName, 337, 338, 380 Two-layered Merkle Tree, 175, 176
subsequent certificates, 342, 347
sufficient effective key length, 240
Two-layered Merkle Tree PoC is correct and secure., 177
symmetric cryptography, 212
symmetric cryptosystem, 241 unary, 50
Synchronous DH Ratchet, 259 universal one-way hash functions,
Systems Network Architecture, 201 144
universal re-encryption, 267
table look-up, 25 user experience, 357

user-experience, 357
UX, 357
UX>Security, 361

Variable Input Length, 104, 120


variable input length, 71
Verify-Proof-of-Consistency, 174
VIL, 71, 116, 120
VIL-MAC, 104
voter anonymity, 265

weak collision resistance, 150


Web PKI, 320, 321, 323, 328, 369–
371
Web-of-Trust, 381
web-of-trust, 341
WEP, 82
Wi-Fi Protected Access, 83
wildcard certificates, 338
wildcard character, 338
Wired Equivalency Privacy, 82, 83
WPA, 83

X.500, 325, 327


X.509, 320, 327, 336
X.509v3, 320, 332

Bibliography

[1] ISO/IEC 9797. Data cryptographic techniques: Data integrity mechanism using
a cryptographic check function employing a block cipher algorithm, 1989.
[2] W. Alexi, B. Chor, O. Goldreich, and C. P. Schnorr. RSA and Rabin
functions: Certain parts are as hard as the whole. SIAM Journal on
Computing, 17(2):194–209, 1988.
[3] J.-P. Aumasson. Serious Cryptography: A Practical Introduction to Mod-
ern Encryption. No Starch Press, 2017.
[4] J.-P. Aumasson, S. Neves, Z. Wilcox-O’Hearn, and C. Winnerlein.
BLAKE2: simpler, smaller, fast as MD5. In M. J. Jacobson Jr., M. E. Locasto,
P. Mohassel, and R. Safavi-Naini, editors, Applied Cryptography and Net-
work Security - 11th International Conference, ACNS 2013, Banff, AB,
Canada, June 25-28, 2013. Proceedings, volume 7954 of Lecture Notes in
Computer Science, pages 119–135. Springer, 2013.
[5] N. Aviram, S. Schinzel, J. Somorovsky, N. Heninger, M. Dankel,
J. Steube, L. Valenta, D. Adrian, J. A. Halderman, V. Dukhovni,
E. Käsper, S. Cohney, S. Engels, C. Paar, and Y. Shavitt. DROWN:
Breaking TLS with SSLv2. In 25th USENIX Security Symposium, Aug.
2016.
[6] G. V. Bard. A challenging but feasible blockwise-adaptive chosen-
plaintext attack on SSL. In M. Malek, E. Fernández-Medina, and J. Her-
nando, editors, Proceedings of SECRYPT, pages 99–109. INSTICC Press,
2006.
[7] E. Barkan, E. Biham, and N. Keller. Instant ciphertext-only cryptanal-
ysis of GSM encrypted communication. J. Cryptology, 21(3):392–429,
2008.
[8] E. Barker. Nist special publication 800-57 part 1 revision 4—recommen-
dation for key management (part 1: General), 2016.
[9] L. Bassham, W. Polk, and R. Housley. Algorithms and Identifiers for
the Internet X.509 Public Key Infrastructure Certificate and Certificate
Revocation List (CRL) Profile. RFC 3279 (Proposed Standard), Apr.
2002. Updated by RFCs 4055, 4491, 5480, 5758.

[10] T. Be’ery and A. Shulman. A Perfect CRIME? Only TIME Will Only
Tell. In Blackhat Europe, March 2013.
[11] M. Bellare, R. Canetti, and H. Krawczyk. Keying hash functions for
message authentication. In N. Koblitz, editor, Advances in Cryptology—
CRYPTO ’96, volume 1109 of Lecture Notes in Computer Science, pages
1–15. Springer-Verlag, 18–22 Aug. 1996.
[12] M. Bellare, R. Canetti, and H. Krawczyk. Pseudorandom functions revis-
ited: the cascade construction and its concrete security. In Proceedings
of 37th Conference on Foundations of Computer Science, pages 514–523,
1996.
[13] M. Bellare, R. Canetti, and H. Krawczyk. HMAC: Keyed-hashing for
message authentication. Internet Request for Comment RFC 2104, In-
ternet Engineering Task Force, Feb. 1997.
[14] M. Bellare, R. Canetti, and H. Krawczyk. A modular approach to the
design and analysis of authentication and key exchange protocols. In Pro-
ceedings of the thirtieth annual ACM symposium on Theory of computing,
pages 419–428. ACM, 1998.
[15] M. Bellare, A. Desai, E. Jokipii, and P. Rogaway. A concrete security
treatment of symmetric encryption. In Proceedings 38th Annual Sympo-
sium on Foundations of Computer Science, pages 394–403. IEEE, 1997.
[16] M. Bellare, R. Guérin, and P. Rogaway. Xor macs: New methods for
message authentication using finite pseudorandom functions. In Annual
International Cryptology Conference, pages 15–28. Springer, 1995.
[17] M. Bellare, J. Kilian, and P. Rogaway. The security of the cipher block
chaining message authentication code. J. Comput. Syst. Sci., 61(3):362–
399, 2000.
[18] M. Bellare, T. Krovetz, and P. Rogaway. Luby-rackoff backwards: In-
creasing security by making block ciphers non-invertible. In Advances in
Cryptology–Eurocrypt, 1998.
[19] M. Bellare and C. Namprempre. Authenticated encryption: Relations
among notions and analysis of the generic composition paradigm. Jour-
nal of Cryptology: the journal of the International Association for Cryp-
tologic Research, 21(4):469–491, Oct. 2008.
[20] M. Bellare and P. Rogaway. Entity authentication and key distribution.
In Crypto, volume 93, pages 232–249. Springer, 1993.
[21] M. Bellare and P. Rogaway. Random oracles are practical: A paradigm
for designing efficient protocols. In Proceedings of the 1st ACM conference
on Computer and communications security, pages 62–73, 1993.

[22] M. Bellare and P. Rogaway. Optimal asymmetric encryption. In Work-
shop on the Theory and Application of of Cryptographic Techniques,
pages 92–111. Springer, 1994.
[23] M. Bellare and P. Rogaway. Provably secure session key distribution:
the three party case. In Proceedings of the twenty-seventh annual ACM
symposium on Theory of computing, pages 57–66. ACM, 1995.
[24] M. Bellare and P. Rogaway. Collision-resistant hashing: Towards making
uowhfs practical. In Annual International Cryptology Conference, pages
470–484. Springer, 1997.
[25] S. M. Bellovin. Frank Miller: Inventor of the One-Time Pad. Cryptologia,
35(3):203–222, 2011.
[26] S. M. Bellovin. Vernam, Mauborgne, and Friedman: The One-Time Pad
and the index of coincidence. In P. Y. A. Ryan, D. Naccache, and J.-
J. Quisquater, editors, The New Codebreakers, volume 9100 of Lecture
Notes in Computer Science, pages 40–66. Springer, 2016.
[27] R. Bird, I. Gopal, A. Herzberg, P. Janson, S. Kutten, R. Molva, and
M. Yung. The kryptoknight family of light-weight protocols for authen-
tication and key distribution. IEEE/ACM Transactions on Networking
(TON), 3(1):31–41, 1995.
[28] R. Bird, I. Gopal, A. Herzberg, P. A. Janson, S. Kutten, R. Molva, and
M. Yung. Systematic design of a family of attack-resistant authenti-
cation protocols. IEEE Journal on Selected Areas in Communications,
11(5):679–693, 1993.
[29] A. Biryukov and A. Shamir. Cryptanalytic time/memory/data tradeoffs
for stream ciphers. In International Conference on the Theory and Ap-
plication of Cryptology and Information Security, pages 1–13. Springer,
2000.
[30] J. Black, P. Rogaway, and T. Shrimpton. Encryption-scheme security in
the presence of key-dependent messages. In K. Nyberg and H. M. Heys,
editors, Selected Areas in Cryptography, volume 2595 of Lecture Notes in
Computer Science, pages 62–75. Springer, 2002.
[31] S. Blake-Wilson, M. Nystrom, D. Hopwood, J. Mikkelsen, and T. Wright.
Transport Layer Security (TLS) Extensions. RFC 3546 (Proposed Stan-
dard), June 2003. Obsoleted by RFC 4366.
[32] S. Blake-Wilson, M. Nystrom, D. Hopwood, J. Mikkelsen, and T. Wright.
Transport Layer Security (TLS) Extensions. RFC 4366 (Proposed Stan-
dard), Apr. 2006. Obsoleted by RFCs 5246, 6066, updated by RFC 5746.

[33] M. Blaze, G. Bleumer, and M. Strauss. Divertible protocols and atomic
proxy cryptography. In K. Nyberg, editor, Advances in Cryptology -
EUROCRYPT ’98, International Conference on the Theory and Appli-
cation of Cryptographic Techniques, Espoo, Finland, May 31 - June 4,
1998, Proceeding, volume 1403 of Lecture Notes in Computer Science,
pages 127–144. Springer, 1998.
[34] M. Blaze, J. Feigenbaum, and A. D. Keromytis. Keynote: Trust man-
agement for public-key infrastructures. In International Workshop on
Security Protocols, pages 59–63. Springer, 1998.
[35] M. Blaze, J. Feigenbaum, and J. Lacy. Decentralized trust management.
In Proceedings 1996 IEEE Symposium on Security and Privacy, pages
164–173. IEEE, 1996.
[36] D. Bleichenbacher. Chosen ciphertext attacks against protocols based on
the rsa encryption standard pkcs# 1. In Annual International Cryptology
Conference, pages 1–12. Springer, 1998.
[37] N. Borisov, I. Goldberg, and D. Wagner. Intercepting mobile com-
munications: The insecurity of 802.11. In Proceedings of the Seventh
Annual International Conference on Mobile Computing and Networking
(MOBICOM-01), pages 180–188, New York, July 16–21 2001. ACM
Press.
[38] BSI. Kryptographische Verfahren: Empfehlungen und Schlüssellängen,
February 2017.
[39] R. Canetti, R. Gennaro, A. Herzberg, and D. Naor. Proactive security:
Long-term protection against break-ins. RSA Laboratories’ CryptoBytes,
3(1):1–8, 1997.
[40] M. M. Carvalho, J. DeMott, R. Ford, and D. A. Wheeler. Heartbleed
101. IEEE Secur. Priv, 12(4):63–67, 2014.
[41] CCITT. Specification of Abstract Syntax Notation One (ASN.1). Open
Systems Interconnection-Basic Reference Model, 1988.
[42] D. Chadwick. Understanding X. 500: the directory. Chapman & Hall,
Ltd., 1994.
[43] S. Checkoway, M. Fredrikson, R. Niederhagen, A. Everspaugh, M. Green,
T. Lange, T. Ristenpart, D. J. Bernstein, J. Maskiewicz, and H. Shacham.
On the practical exploitability of dual ec in tls implementations. In Pro-
ceedings of the 23rd USENIX conference on Security Symposium, pages
319–335. USENIX Association, 2014.
[44] T. Chung, J. Lok, B. Chandrasekaran, D. R. Choffnes, D. Levin, B. M. Maggs,
A. Mislove, J. P. Rula, N. Sullivan, and C. Wilson. Is the web ready for
ocsp must-staple? In Internet Measurement Conference, pages 105–118.
ACM, 2018.

[45] S. Cohney, M. D. Green, and N. Heninger. Practical state recovery attacks against legacy RNG implementations, October 2017. Online at [Link].
[46] IEEE Computer Society LAN/MAN Standards Committee. IEEE 802.11: Wireless LAN Medium Access Control and Physical Layer Specifications, Aug. 1999.
[47] D. Cooper, S. Santesson, S. Farrell, S. Boeyen, R. Housley, and W. Polk.
Internet X.509 Public Key Infrastructure Certificate and Certificate Re-
vocation List (CRL) Profile. RFC 5280 (Proposed Standard), May 2008.
Updated by RFCs 6818, 8398, 8399.
[48] S. A. Crosby and D. S. Wallach. Efficient data structures for tamper-
evident logging. In USENIX Security Symposium, pages 317–334, 2009.
[49] J. Daemen and V. Rijmen. The Design of Rijndael: AES - the Advanced Encryption Standard. Springer Science & Business Media, 2013.
[50] W. Dai. Crypto++ 6.0.0 benchmarks, 2018. version of 01-23-2018.
[51] I. B. Damgård. Collision free hash functions and public key signature schemes. In Workshop on the Theory and Application of Cryptographic Techniques, pages 203–216. Springer, 1987.
[52] I. B. Damgård. A design principle for hash functions. In G. Brassard, editor, Advances in Cryptology—CRYPTO ’89, volume 435 of Lecture Notes in Computer Science, pages 416–427. Springer-Verlag, 1990.
[53] Y. Desmedt. Threshold cryptosystems. In International Workshop on
the Theory and Application of Cryptographic Techniques, pages 1–14.
Springer, 1992.
[54] T. Dierks and C. Allen. The TLS Protocol Version 1.0. RFC 2246
(Proposed Standard), Jan. 1999. Obsoleted by RFC 4346, updated by
RFCs 3546, 5746, 6176, 7465, 7507, 7919.
[55] T. Dierks and E. Rescorla. The Transport Layer Security (TLS) Protocol
Version 1.2. RFC 5246 (Proposed Standard), Aug. 2008. Obsoleted by
RFC 8446, updated by RFCs 5746, 5878, 6176, 7465, 7507, 7568, 7627,
7685, 7905, 7919, 8447.
[56] W. Diffie and M. Hellman. New directions in cryptography. IEEE Trans-
actions on Information Theory, 22(6):644–654, Nov. 1976.
[57] R. Dingledine, N. Mathewson, and P. F. Syverson. Tor: The second-
generation onion router. In M. Blaze, editor, Proceedings of the 13th
USENIX Security Symposium, August 9-13, 2004, San Diego, CA, USA,
pages 303–320. USENIX, 2004.
[58] Y. Dodis, R. Gennaro, J. Håstad, H. Krawczyk, and T. Rabin. Randomness extraction and key derivation using the CBC, Cascade and HMAC modes. In Annual International Cryptology Conference, pages 494–510. Springer, 2004.
[59] D. Dolev, C. Dwork, and M. Naor. Nonmalleable cryptography. SIAM Review, 45(4):727–784, 2003.
[60] N. Doraswamy and D. Harkins. IPSec: the new security standard for the
Internet, intranets, and virtual private networks. Prentice Hall Profes-
sional, 2003.
[61] B. Dowling, F. Günther, U. Herath, and D. Stebila. Secure logging
schemes and certificate transparency. In European Symposium on Re-
search in Computer Security, pages 140–158. Springer, 2016.
[62] O. Dubuisson. ASN.1: Communication Between Heterogeneous Systems. Morgan Kaufmann, 2000.
[63] O. Dunkelman. Techniques for cryptanalysis of block ciphers. PhD thesis, Faculty of Computer Science, Technion — Israel Institute of Technology, Haifa, Israel, 2006.
[64] T. Duong and J. Rizzo. Here come the XOR ninjas. Presented at ekoparty and available at [Link]talks/bullrun/[Link], 2011.
[65] Z. Durumeric, J. Kasten, M. Bailey, and J. A. Halderman. Analysis of the HTTPS certificate ecosystem. In K. Papagiannaki, P. K. Gummadi,
and C. Partridge, editors, Proceedings of the 2013 Internet Measurement
Conference, IMC 2013, Barcelona, Spain, October 23-25, 2013, pages
291–304. ACM, 2013.
[66] M. Dworkin. Recommendation for block cipher modes of operation: Methods and techniques. Technical report, National Institute of Standards and Technology, Gaithersburg, MD, Computer Security Division, 2001.
[67] M. J. Dworkin. SHA-3 standard: Permutation-based hash and extendable-output functions. Technical report, NIST, 2015.
[68] M. J. Dworkin. Recommendation for block cipher modes of operation: The CMAC mode for authentication. Special Publication (NIST SP) 800-38B, 2016.
[69] S. Dziembowski and K. Pietrzak. Leakage-resilient cryptography. In
Foundations of Computer Science, 2008. FOCS’08. IEEE 49th Annual
IEEE Symposium on, pages 293–302. IEEE, 2008.
[70] C. Evans, C. Palmer, and R. Sleevi. Public Key Pinning Extension for
HTTP. RFC 7469 (Proposed Standard), Apr. 2015.
[71] S. Even. Graph algorithms. Cambridge University Press, 2011.
[72] L. Ewing. Linux 2.0 penguins. Example of using The GIMP graphics software. [Online; accessed 1-Sept-2017].
[73] P.-A. Fouque, D. Pointcheval, J. Stern, and S. Zimmer. Hardness of distinguishing the MSB or LSB of secret keys in Diffie-Hellman schemes. In
International Colloquium on Automata, Languages, and Programming,
pages 240–251. Springer, 2006.
[74] S. Frankel and S. Krishnan. IP Security (IPsec) and Internet Key Ex-
change (IKE) Document Roadmap. RFC 6071 (Informational), Feb.
2011.
[75] A. Freier, P. Karlton, and P. Kocher. The Secure Sockets Layer (SSL)
Protocol Version 3.0. RFC 6101 (Historic), Aug. 2011.
[76] S. Garfinkel and N. Makarevitch. PGP: Pretty Good Privacy. O’Reilly
International Thomson, Paris, France, 1995.
[77] Y. Gilad, A. Herzberg, M. Sudkovitch, and M. Goberman. CDN-on-Demand: An affordable DDoS defense via untrusted clouds. In Network
and Distributed System Security Symposium (NDSS). The Internet Soci-
ety, 2016.
[78] D. Giry. Cryptographic key length recommendation, 2018. version of
01-23-2018.
[79] O. Goldreich. Foundations of Cryptography, volume 1: Basic Tools. Cambridge University Press, 2001.
[80] O. Goldreich. Foundations of Cryptography, volume 2: Basic Applications. Cambridge University Press, 2009.
[81] O. Goldreich. P, NP, and NP-Completeness: The Basics of Complexity
Theory. Cambridge University Press, 2010.
[82] O. Goldreich, S. Goldwasser, and S. Micali. How to construct random
functions. J. ACM, 33(4):792–807, 1986.
[83] S. Goldwasser and S. Micali. Probabilistic encryption. Journal of Com-
puter and System Sciences, 28(2):270–299, Apr. 1984.
[84] P. Golle, M. Jakobsson, A. Juels, and P. Syverson. Universal re-
encryption for mixnets. In Cryptographers’ Track at the RSA Conference,
pages 163–178. Springer, 2004.
[85] M. Goodwin. Revoking intermediate certificates: Introducing OneCRL. Mozilla Security Blog, [Link]revoking-intermediate-certificates-introducing-onecrl/, Mar. 2015.
[86] C. G. Günther. An identity-based key-exchange protocol. In Workshop on the Theory and Application of Cryptographic Techniques, pages 29–37. Springer, 1989.
[87] C. Hall, D. Wagner, J. Kelsey, and B. Schneier. Building PRFs from PRPs.
In Annual International Cryptology Conference, pages 370–389. Springer,
1998.
[88] P. Hallam-Baker. X.509v3 Transport Layer Security (TLS) Feature Ex-
tension. RFC 7633 (Proposed Standard), Oct. 2015.
[89] D. Hankerson, A. J. Menezes, and S. Vanstone. Guide to elliptic curve
cryptography. Springer Science & Business Media, 2006.
[90] J. Håstad and M. Nåslund. The security of all RSA and discrete log bits.
Journal of the ACM (JACM), 51(2):187–230, 2004.
[91] M. E. Hellman. A cryptanalytic time-memory trade-off. IEEE Trans.
Information Theory, 26(4):401–406, 1980.
[92] A. Herzberg. Folklore, practice and theory of robust combiners. Journal
of Computer Security, 17(2):159–189, 2009.
[93] A. Herzberg. Foundations of Cybersecurity, volume 2: An Internet-focused Introduction to Network Security. Draft version available online at [Link]Foundations_of_Cybersecurity_early_draft_of_part_II_Network_Security, 2020.
[94] A. Herzberg and Y. Mass. Relying party credentials framework. Elec-
tronic Commerce Research, 4(1-2):23–39, 2004.
[95] A. Herzberg, Y. Mass, J. Mihaeli, D. Naor, and Y. Ravid. Access control
meets public key infrastructure, or: Assigning roles to strangers. In
Proceeding 2000 IEEE Symposium on Security and Privacy. S&P 2000,
pages 2–14. IEEE, 2000.
[96] K. E. Hickman and T. Elgamal. The SSL protocol (version 2). Archived copy: [Link]vwfGws0GMow98DX3i7B/view?usp=sharing, June 1995. Published as Internet Draft [Link].
[97] J. Hodges, C. Jackson, and A. Barth. HTTP Strict Transport Security
(HSTS). RFC 6797 (Proposed Standard), Nov. 2012.
[98] J. Hoffstein, J. Pipher, J. H. Silverman, and J. H. Silverman. An intro-
duction to mathematical cryptography. Springer, 2008.
[99] International Telecommunication Union. X.509 : Information technol-
ogy - Open Systems Interconnection - The Directory: Public-key and
attribute certificate frameworks, October 2019.
[100] J. Jean. TikZ for Cryptographers. [Link]tikz/, 2016. CBC figures by Diana Maimut.
[101] A. Joux. Algorithmic cryptanalysis. CRC Press, 2009.
[102] D. Kahn. The Codebreakers: The comprehensive history of secret com-
munication from ancient times to the internet. Simon and Schuster, 1996.
[103] J. Kelsey, B. Schneier, D. Wagner, and C. Hall. Cryptanalytic attacks on
pseudorandom number generators. In S. Vaudenay, editor, Fast Software
Encryption: 5th International Workshop, volume 1372 of Lecture Notes
in Computer Science, pages 168–188, Paris, France, 23–25 Mar. 1998.
Springer-Verlag.
[104] A. Kerckhoffs. La cryptographie militaire. Journal des Sciences Mili-
taires, IX, 1883.
[105] S. Kille. A String Representation of Distinguished Names. RFC 1779
(Historic), Mar. 1995. Obsoleted by RFCs 2253, 3494.
[106] C. A. Kirtchev. A cyberpunk manifesto. Cyberpunk Review. Available at [Link], 1997.
[107] A. Klein. Attacks on the RC4 stream cipher. Designs, Codes and Cryp-
tography, 48(3):269–286, 2008.
[108] L. R. Knudsen and M. Robshaw. The Block Cipher Companion. Infor-
mation Security and Cryptography. Springer, 2011.
[109] P. Koopman. 32-bit cyclic redundancy codes for internet applications. In
Proceedings 2002 International Conference on Dependable Systems and
Networks (DSN 2002), pages 459–472, (Bethesda, MD) Washington, DC,
USA, June 2002. IEEE Computer Society.
[110] H. Krawczyk. The order of encryption and authentication for protecting communications (or: how secure is SSL?). In J. Kilian, editor, Advances in Cryptology - CRYPTO 2001, volume 2139 of Lecture Notes in Computer Science, pages 310–331. International Association for Cryptologic Research, Springer-Verlag, Berlin, Germany, 2001.
[111] H. Krawczyk. Cryptographic extraction and key derivation: The HKDF scheme. In T. Rabin, editor, Advances in Cryptology - CRYPTO 2010,
30th Annual Cryptology Conference, Santa Barbara, CA, USA, August
15-19, 2010. Proceedings, volume 6223 of Lecture Notes in Computer
Science, pages 631–648. Springer, 2010.
[112] J. F. Kurose and K. W. Ross. Computer networking: a top-down ap-
proach, volume 4. Addison Wesley Boston, 2009.
[113] J. Larisch, D. R. Choffnes, D. Levin, B. M. Maggs, A. Mislove, and C. Wilson. CRLite: A scalable system for pushing all TLS revocations to all browsers. In IEEE Symposium on Security and Privacy, pages 539–556.
IEEE Computer Society, 2017.
[114] B. Laurie. Certificate transparency. Communications of the ACM,
57(10):40–46, 2014.
[115] B. Laurie, A. Langley, and E. Kasper. Certificate Transparency. RFC
6962 (Experimental), June 2013.
[116] B. Laurie, A. Langley, E. Kasper, E. Messeri, and R. Stradling. Cer-
tificate transparency version 2.0. IETF TRANS (Public Notary Trans-
parency) WG, Internet-Draft, November 2019.
[117] H. Leibowitz, A. Herzberg, and E. Syta. Provably model-secure PKI schemes. Cryptology ePrint Archive, Report 2019/807, 2019. https://[Link]/2019/807.
[118] H. Leibowitz, A. Herzberg, and E. Syta. PKIng: Practical, Provably-
Secure and Transparent PKI, or: In God we trust; Loggers we Monitor .
Work in progress, 2020.
[119] A. K. Lenstra and E. R. Verheul. Selecting cryptographic key sizes.
Journal of Cryptology, 14(4):255–293, 2001.
[120] G. Leurent and T. Peyrin. SHA-1 is a shambles: First chosen-prefix collision on SHA-1 and application to the PGP web of trust. In 29th USENIX Security Symposium (USENIX Security 20), pages 1839–1856, 2020.
[121] S. Levy. Crypto: how the code rebels beat the government, saving privacy
in the digital age. Viking, 2001.
[122] N. Li, J. C. Mitchell, and W. H. Winsborough. Design of a role-based
trust-management framework. In Proceedings 2002 IEEE Symposium on
Security and Privacy, pages 114–130. IEEE, 2002.
[123] M. Luby and C. Rackoff. How to construct pseudorandom permutations
from pseudorandom functions. SIAM Journal on Computing, 17(2):373–
386, Apr. 1988.
[124] I. Mantin and A. Shamir. A practical attack on broadcast RC4. In
International Workshop on Fast Software Encryption, pages 152–164.
Springer, 2001.
[125] M. Marlinspike. More tricks for defeating ssl in practice. Black Hat USA,
2009.
[126] J. Mason, K. Watkins, J. Eisner, and A. Stubblefield. A natural language
approach to automated cryptanalysis of two-time pads. In Proceedings
of the 13th ACM conference on Computer and communications security,
pages 235–244. ACM, 2006.
[127] A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone. Handbook of Applied Cryptography. CRC Press, 1996.
[128] R. C. Merkle. A certified digital signature. In G. Brassard, editor, Ad-
vances in Cryptology—CRYPTO ’89, volume 435 of Lecture Notes in
Computer Science, pages 218–238. Springer-Verlag, 1990, Aug. 1989.
[129] R. C. Merkle. One way hash functions and DES. In G. Brassard, editor, Advances in Cryptology—CRYPTO ’89, volume 435 of Lecture Notes in Computer Science, pages 428–446. Springer-Verlag, 1990.
[130] C. Meyer and J. Schwenk. Lessons learned from previous SSL/TLS at-
tacks - a brief chronology of attacks and weaknesses. IACR Cryptology
ePrint Archive, 2013:49, 2013.
[131] B. Moeller and A. Langley. TLS Fallback Signaling Cipher Suite Value
(SCSV) for Preventing Protocol Downgrade Attacks. RFC 7507 (Pro-
posed Standard), Apr. 2015.
[132] B. Möller, T. Duong, and K. Kotowicz. This POODLE bites: Exploiting
the SSL 3.0 fallback, September 2014. Online, accessed 01-Sept-2017.
[133] M. Naor and M. Yung. Universal one-way hash functions and their cryp-
tographic applications. In Proceedings of the twenty-first annual ACM
symposium on Theory of computing, pages 33–43, 1989.
[134] National Bureau of Standards. Data Encryption Standard. U. S. Depart-
ment of Commerce, Washington, DC, USA, Jan. 1977.
[135] B. C. Neuman and T. Ts’o. Kerberos: An authentication service for computer networks. IEEE Communications Magazine, 32(9):33–38, 1994.
[136] P. Oechslin. Making a faster cryptanalytic time-memory trade-off. In
Annual International Cryptology Conference, pages 617–630. Springer,
2003.
[137] R. Oppliger. SSL and TLS: Theory and Practice. Artech House, 2016.
[138] C. Paar and J. Pelzl. Understanding Cryptography - A Textbook for Stu-
dents and Practitioners. Springer, 2010.
[139] Y. Pettersen. The Transport Layer Security (TLS) Multiple Certificate
Status Request Extension. RFC 6961 (Proposed Standard), June 2013.
Obsoleted by RFC 8446.
[140] J. Postel. Domain Name System Structure and Delegation. RFC 1591
(Informational), Mar. 1994.
[141] S. Radack. Secure hash standard: Updated specifications approved and issued as Federal Information Processing Standard (FIPS) 180-4. Technical report, National Institute of Standards and Technology, 2012.
[142] B. Ramsdell and S. Turner. Secure/Multipurpose Internet Mail Exten-
sions (S/MIME) Version 3.2 Message Specification. RFC 5751 (Proposed
Standard), Jan. 2010.
[143] E. Rescorla. SSL and TLS: designing and building secure systems, vol-
ume 1. Addison-Wesley Reading, 2001.
[144] R. L. Rivest, A. Shamir, and L. Adleman. A method for obtaining digital
signatures and public-key cryptosystems. Communications of the ACM,
21(2):120–126, 1978.
[145] J. Rizzo and T. Duong. CRIME: Compression ratio info-leak made easy. In ekoparty Security Conference, 2012.
[146] P. Rogaway. Authenticated-encryption with associated-data. In
V. Atluri, editor, ACM Conference on Computer and Communications
Security, pages 98–107. ACM, November 2002.
[147] P. Rogaway. Evaluation of some blockcipher modes of operation. Cryptog-
raphy Research and Evaluation Committees (CRYPTREC) for the Gov-
ernment of Japan, 2011.
[148] P. Rogaway and T. Shrimpton. Cryptographic hash-function basics: Def-
initions, implications, and separations for preimage resistance, second-
preimage resistance, and collision resistance. In International workshop
on fast software encryption, pages 371–388. Springer, 2004.
[149] P. Saint-Andre and J. Hodges. Representation and Verification of
Domain-Based Application Service Identity within Internet Public Key
Infrastructure Using X.509 (PKIX) Certificates in the Context of Trans-
port Layer Security (TLS). RFC 6125 (Proposed Standard), Mar. 2011.
[150] J. Salowey, H. Zhou, P. Eronen, and H. Tschofenig. Transport Layer
Security (TLS) Session Resumption without Server-Side State. RFC 5077
(Proposed Standard), Jan. 2008. Obsoleted by RFC 8446, updated by
RFC 8447.
[151] S. Santesson, M. Myers, R. Ankney, A. Malpani, S. Galperin, and
C. Adams. X.509 Internet Public Key Infrastructure Online Certificate
Status Protocol - OCSP. RFC 6960 (Proposed Standard), June 2013.
[152] C. E. Shannon. Communication theory of secrecy systems. Bell System Technical Journal, 28:656–715, Oct. 1949.
[153] Y. Sheffer, R. Holz, and P. Saint-Andre. Summarizing Known Attacks
on Transport Layer Security (TLS) and Datagram TLS (DTLS). RFC
7457 (Informational), Feb. 2015.
[154] J. Simpson. An in-depth technical analysis of CurveBall (CVE-2020-0601). Published in the Trend Micro security intelligence blog, online at [Link]/trendlabs-security-intelligence/an-in-depth-technical-analysis-of-curveball-cve-2020-0601/, February 2020.
[155] S. Singh. The Science of Secrecy: The Secret History of Codes and Code-
breaking. Fourth Estate, London, UK, 2001.
[156] N. P. Smart. Cryptography Made Simple. Information Security and Cryp-
tography. Springer, 2016.
[157] T. Smith, L. Dickenson, and K. E. Seamons. Let’s revoke: Scalable global
certificate revocation. In NDSS. The Internet Society, 2020.
[158] M. Stevens, A. K. Lenstra, and B. de Weger. Chosen-prefix collisions for MD5 and colliding X.509 certificates for different identities. In M. Naor,
editor, Advances in Cryptology - EUROCRYPT 2007, 26th Annual In-
ternational Conference on the Theory and Applications of Cryptographic
Techniques, Barcelona, Spain, May 20-24, 2007, Proceedings, volume
4515 of Lecture Notes in Computer Science, pages 1–22. Springer, 2007.
[159] D. R. Stinson. Cryptography: theory and practice. CRC press, 2005.
[160] G. Tsudik. Message authentication with one-way hash functions. ACM
SIGCOMM Computer Communication Review, 22(5):29–38, 1992.
[161] A. M. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42(2):230–265, 1936.
[162] A. M. Turing. Computing machinery and intelligence. Mind,
59(236):433–460, Oct. 1950.
[163] L. Valenta, D. Adrian, A. Sanso, S. Cohney, J. Fried, M. Hastings, J. A.
Halderman, and N. Heninger. Measuring small subgroup attacks against Diffie-Hellman. In Network and Distributed System Security Symposium
(NDSS), 2017.
[164] M. Vanhoef and F. Piessens. Key reinstallation attacks: Forcing nonce reuse in WPA2. In B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu,
editors, Proceedings of the 2017 ACM SIGSAC Conference on Computer
and Communications Security, CCS 2017, Dallas, TX, USA, pages 1313–
1328. ACM, 2017.
[165] S. Venkata, S. Harwani, C. Pignataro, and D. McPherson. Dynamic Host-
name Exchange Mechanism for OSPF. RFC 5642 (Proposed Standard),
Aug. 2009.
[166] G. S. Vernam. Secret signaling system, July 22 1919. US Patent
1,310,719.
[167] J. Von Neumann. Various techniques used in connection with random
digits. Applied Math Series, 12(36-38):1, 1951.
[168] D. Wagner, B. Schneier, et al. Analysis of the SSL 3.0 protocol. In The Second USENIX Workshop on Electronic Commerce Proceedings, pages 29–40, 1996.
[169] N. Wiener. Cybernetics, or control and communication in the animal and
the machine. John Wiley, New York, 1948.
[170] Wikipedia. Block cipher mode of operation, 2017. [Online; accessed
1-Sept-2017].
[171] Wikipedia contributors. Forward secrecy — Wikipedia, the free encyclopedia. [Link]secrecy&oldid=973196673, 2020. [Online; accessed 16-August-2020].
[172] R. J. Wilson. Introduction to Graph Theory. Pearson, New York, NY, 5
edition, 2010.
[173] L. Zhang, D. R. Choffnes, T. Dumitras, D. Levin, A. Mislove, A. Schulman, and C. Wilson. Analysis of SSL certificate reissues and revocations in the wake of Heartbleed. Commun. ACM, 61(3):109–116, 2018.
[174] K. Zuse. Method for automatic execution of calculations with the aid of computers (1936). In B. Randell, editor, The Origins of Digital Computers: Selected Papers, Texts and Monographs in Computer Science, pages 163–170. Springer-Verlag, third edition, 1982.