[tor-commits] [torspec/master] Assign the RebelAlliance hybrid handshake proposal a number.
isis at torproject.org
isis at torproject.org
Fri Jul 22 12:05:35 UTC 2016
commit 875fdaecd544bd477edfa3ee98ec7458f53583ed
Author: Isis Lovecruft <isis at torproject.org>
Date: Fri Jul 22 12:03:10 2016 +0000
Assign the RebelAlliance hybrid handshake proposal a number.
---
proposals/000-index.txt | 2 +
proposals/270-newhope-hybrid-handshake.txt | 764 +++++++++++++++++++++++++++++
proposals/XXX-newhope-hybrid-handshake.txt | 764 -----------------------------
proposals/proposal-status.txt | 6 +
4 files changed, 772 insertions(+), 764 deletions(-)
diff --git a/proposals/000-index.txt b/proposals/000-index.txt
index c794db7..4af78b9 100644
--- a/proposals/000-index.txt
+++ b/proposals/000-index.txt
@@ -190,6 +190,7 @@ Proposals by number:
267 Tor Consensus Transparency [DRAFT]
268 New Guard Selection Behaviour [DRAFT]
269 Transitionally secure hybrid handshakes [DRAFT]
+270 RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope [DRAFT]
Proposals by status:
@@ -217,6 +218,7 @@ Proposals by status:
267 Tor Consensus Transparency
268 New Guard Selection Behaviour
269 Transitionally secure hybrid handshakes
+ 270 RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope
NEEDS-REVISION:
190 Bridge Client Authorization Based on a Shared Secret
NEEDS-RESEARCH:
diff --git a/proposals/270-newhope-hybrid-handshake.txt b/proposals/270-newhope-hybrid-handshake.txt
new file mode 100644
index 0000000..ccf3390
--- /dev/null
+++ b/proposals/270-newhope-hybrid-handshake.txt
@@ -0,0 +1,764 @@
+Filename: 270-newhope-hybrid-handshake.txt
+Title: RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope
+Author: Isis Lovecruft, Peter Schwabe
+Created: 16 Apr 2016
+Updated: 22 Jul 2016
+Status: Draft
+Depends: prop#220 prop#249 prop#264 prop#270
+
+§0. Introduction
+
+ RebelAlliance is a post-quantum secure hybrid handshake, comprised of an
+ alliance between the X25519 and NewHope key exchanges.
+
+ NewHope is a post-quantum-secure lattice-based key-exchange protocol based
+ on the ring-learning-with-errors (Ring-LWE) problem. We propose a hybrid
+ handshake for Tor, based on a combination of Tor's current NTor handshake
+ and a shared key derived through a NewHope ephemeral key exchange.
+
+ For further details on the NewHope key exchange, the reader is referred to
+ "Post-quantum key exchange - a new hope" by Alkim, Ducas, Pöppelmann, and
+ Schwabe [0][1].
+
+ For the purposes of brevity, we consider that NTor is currently the only
+ handshake protocol in Tor; the older TAP protocol is ignored completely, due
+ to the fact that it is currently deprecated and nearly entirely unused.
+
+
+§1. Motivation
+
+ An attacker currently monitoring and storing circuit-layer NTor handshakes
+ who later has the ability to run Shor's algorithm on a quantum computer will
+ be able to break Tor's current handshake protocol and decrypt previous
+ communications.
+
+ It is unclear if and when such attackers equipped with large quantum
+ computers will exist, but various estimates by researchers in quantum
+ physics and quantum engineering give estimates of only 1 to 2 decades.
+ Clearly, the security requirements of many Tor users include secrecy of
+ their messages beyond this time span, which means that Tor needs to update
+ the key exchange to protect against such attackers as soon as possible.
+
+
+§2. Design
+
+ An initiator and responder, in parallel, conduct two handshakes:
+
+ - An X25519 key exchange, as described in the description of the NTor
+ handshake in Tor proposal #216.
+ - A NewHope key exchange.
+
+ The shared keys derived from these two handshakes are then concatenated and
+ used as input to the SHAKE-256 extendable output function (XOF), as described
+ in FIPS-PUB-202 [2], in order to produce a shared key of the desired length.
+ The testvectors in §C assume that this key has a length of 32 bytes, but the
+ use of a XOF allows arbitrary lengths to easily support future updates of
+ the symmetric primitives using the key. See also §3.3.1.
+
+
+§3. Specification
+
+§3.1. Notation
+
+ Let `a || b` be the concatenation of a with b.
+
+ Let `a^b` denote the exponentiation of a to the bth power.
+
+ Let `a == b` denote the equality of a with b, and vice versa.
+
+ Let `a := b` be the assignment of the value of b to the variable a.
+
+ Let `H(x)` be 32-bytes of output of the SHAKE-256 XOF (as described in
+ FIPS-PUB-202) applied to message x.
+
+ Let X25519 refer to the curve25519-based key agreement protocol described
+ in RFC7748 §6.1. [3]
+
+ Let `EXP(a, b) == X25519(., b, a)` with `g == 9`. Let X25519_KEYGEN() do
+ the appropriate manipulations when generating the secret key (clearing the
+ low bits, twidding the high bits). Additionally, EXP() MUST include the
+ check for all-zero output due to the input point being of small
+ order (cf. RFC7748 §6).
+
+ Let `X25519_KEYID(B) == B` where B is a valid X25519 public key.
+
+ When representing an element of the Curve25519 subgroup as a byte string,
+ use the standard (32-byte, little-endian, x-coordinate-only) representation
+ for Curve25519 points.
+
+ Let `ID` be a router's identity key taken from the router microdescriptor.
+ In the case for relays possessing Ed25519 identity keys (cf. Tor proposal
+ #220), this is a 32-byte string representing the public Ed25519 identity key.
+ For backwards and forwards compatibility with routers which do not possess
+ Ed25519 identity keys, this is a 32-byte string created via the output of
+ H(ID).
+
+ We refer to the router as the handshake "responder", and the client (which
+ may be an OR or an OP) as the "initiator".
+
+
+ ID_LENGTH [32 bytes]
+ H_LENGTH [32 bytes]
+ G_LENGTH [32 bytes]
+
+ PROTOID := "pqtor-x25519-newhope-shake256-1"
+ T_MAC := PROTOID || ":mac"
+ T_KEY := PROTOID || ":key_extract"
+ T_VERIFY := PROTOID || ":verify"
+
+ (X25519_SK, X25519_PK) := X25519_KEYGEN()
+
+
+§3.2. Protocol
+
+ ========================================================================================
+ | |
+ | Fig. 1: The NewHope-X25519 Hybrid Handshake. |
+ | |
+ | Before the handshake the Initiator is assumed to know Z, a public X25519 key for |
+ | the Responder, as well as the Responder's ID. |
+ ----------------------------------------------------------------------------------------
+ | |
+ | Initiator Responder |
+ | |
+ | SEED := H(randombytes(32)) |
+ | x, X := X25519_KEYGEN() |
+ | a, A := NEWHOPE_KEYGEN(SEED) |
+ | CLIENT_HDATA := ID || Z || X || A |
+ | |
+ | --- CLIENT_HDATA ---> |
+ | |
+ | y, Y := X25519_KEYGEN() |
+ | NTOR_KEY, AUTH := NTOR_SHAREDB(X,y,Y,z,Z,ID,B) |
+ | M, NEWHOPE_KEY := NEWHOPE_SHAREDB(A) |
+ | SERVER_HDATA := Y || AUTH || M |
+ | sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) |
+ | |
+ | <-- SERVER_HDATA ---- |
+ | |
+ | NTOR_KEY := NTOR_SHAREDA(x, X, Y, Z, ID, AUTH) |
+ | NEWHOPE_KEY := NEWHOPE_SHAREDA(M, a) |
+ | sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) |
+ | |
+ ========================================================================================
+
+
+§3.2.1. The NTor Handshake
+
+§3.2.1.1. Prologue
+
+ Take a router with identity ID. As setup, the router generates a secret key z,
+ and a public onion key Z with:
+
+ z, Z := X25519_KEYGEN()
+
+ The router publishes Z in its server descriptor in the "ntor-onion-key" entry.
+ Henceforward, we refer to this router as the "responder".
+
+
+§3.2.1.2. Initiator
+
+ To send a create cell, the initiator generates a keypair:
+
+ x, X := X25519_KEYGEN()
+
+ and creates the NTor portion of a CREATE2V cell's HDATA section:
+
+ CLIENT_NTOR := ID || Z || X [96 bytes]
+
+ The initiator includes the responder's ID and Z in the CLIENT_NTOR so that, in
+ the event the responder OR has recently rotated keys, the responder can
+ determine which keypair to use.
+
+ The initiator then concatenates CLIENT_NTOR with CLIENT_NEWHOPE (see §3.2.2),
+ to create CLIENT_HDATA, and creates and sends a CREATE2V cell (see §A.1)
+ to the responder.
+
+ CLIENT_NEWHOPE [1824 bytes] (see §3.2.2)
+ CLIENT_HDATA := CLIENT_NTOR || CLIENT_NEWHOPE [1920 bytes]
+
+ If the responder does not respond with a CREATED2V cell, the initiator SHOULD
+ NOT attempt to extend the circuit through the responder by sending fragmented
+ EXTEND2 cells, since the responder's lack of support for CREATE2V cells is
+ assumed to imply the responder also lacks support for fragmented EXTEND2
+ cells. Alternatively, for initiators with a sufficiently late consensus
+ method, the initiator MUST check that "proto" line in the responder's
+ descriptor (cf. Tor proposal #264) advertises support for the "Relay"
+ subprotocol version 3 (see §5).
+
+
+§3.2.1.3. Responder
+
+ The responder generates a keypair of y, Y = X25519_KEYGEN(), and does
+ NTOR_SHAREDB() as follows:
+
+ (NTOR_KEY, AUTH) ← NTOR_SHAREDB(X, y, Y, z, Z, ID, B):
+ secret_input := EXP(X, y) || EXP(X, z) || ID || B || Z || Y || PROTOID
+ NTOR_KEY := H(secret_input, T_KEY)
+ verify := H(secret_input, T_VERIFY)
+ auth_input := verify || ID || Z || Y || X || PROTOID || "Server"
+ AUTH := H(auth_input, T_MAC)
+
+ The responder sends a CREATED2V cell containing:
+
+ SERVER_NTOR := Y || AUTH [64 bytes]
+ SERVER_NEWHOPE [2048 bytes] (see §3.2.2)
+ SERVER_HDATA := SERVER_NTOR || SERVER_NEWHOPE [2112 bytes]
+
+ and sends this to the initiator.
+
+
+§3.2.1.4. Finalisation
+
+ The initiator then checks Y is in G^* [see NOTE below], and does
+ NTOR_SHAREDA() as follows:
+
+ (NTOR_KEY) ← NTOR_SHAREDA(x, X, Y, Z, ID, AUTH)
+ secret_input := EXP(Y, x) || EXP(Z, x) || ID || Z || X || Y || PROTOID
+ NTOR_KEY := H(secret_input, T_KEY)
+ verify := H(secret_input, T_VERIFY)
+ auth_input := verify || ID || Z || Y || X || PROTOID || "Server"
+ if AUTH == H(auth_input, T_MAC)
+ return NTOR_KEY
+
+ Both parties now have a shared value for NTOR_KEY. They expand this into
+ the keys needed for the Tor relay protocol.
+
+ [XXX We think we want to omit the final hashing in the production of NTOR_KEY
+ here, and instead put all the inputs through SHAKE-256. --isis, peter]
+
+ [XXX We probably want to remove ID and B from the input to the shared key
+ material, since they serve for authentication but, as pre-established
+ "prologue" material to the handshake, they should not be used in attempts to
+ strengthen the cryptographic suitability of the shared key. Also, their
+ inclusion is implicit in the DH exponentiations. I should probably ask Ian
+ about the reasoning for the original design choice. --isis]
+
+
+§3.2.2. The NewHope Handshake
+
+§3.2.2.1. Parameters & Mathematical Structures
+
+ Let ℤ be the ring of rational integers. Let ℤq, for q ≥ 1, denote the quotient
+ ring ℤ/qℤ. We define R = ℤ[X]/((X^n)+1) as the ring of integer polynomials
+ modulo ((X^n)+1), and Rq = ℤq[X]/((X^n)+1) as the ring of integer polynomials
+ modulo ((X^n)+1) where each coefficient is reduced modulo q. When we refer to
+ a polynomial, we mean an element of Rq.
+
+ n := 1024
+ q := 12289
+
+ SEED [32 Bytes]
+ NEWHOPE_POLY [1792 Bytes]
+ NEWHOPE_REC [256 Bytes]
+ NEWHOPE_KEY [32 Bytes]
+
+ NEWHOPE_MSGA := (NEWHOPE_POLY || SEED)
+ NEWHOPE_MSGB := (NEWHOPE_POLY || NEWHOPE_REC)
+
+
+§3.2.2.2. High-level Description of Newhope API Functions
+
+ For a description of internal functions, see §B.
+
+ (NEWHOPE_POLY, NEWHOPE_MSGA) ← NEWHOPE_KEYGEN(SEED):
+ â := gen_a(seed)
+ s := poly_getnoise()
+ e := poly_getnoise()
+ ŝ := poly_ntt(s)
+ ê := poly_ntt(e)
+ b̂ := pointwise(â, ŝ) + ê
+ sp := poly_tobytes(ŝ)
+ bp := poly_tobytes(b̂)
+ return (sp, (bp || seed))
+
+ (NEWHOPE_MSGB, NEWHOPE_KEY) ← NEWHOPE_SHAREDB(NEWHOPE_MSGA):
+ s' := poly_getnoise()
+ e' := poly_getnoise()
+ e" := poly_getnoise()
+ b̂ := poly_frombytes(bp)
+ â := gen_a(seed)
+ ŝ' := poly_ntt(s')
+ ê' := poly_ntt(e')
+ û := poly_pointwise(â, ŝ') + ê'
+ v := poly_invntt(poly_pointwise(b̂,ŝ')) + e"
+ r := helprec(v)
+ up := poly_tobytes(û)
+ k := rec(v, r)
+ return ((up || r), k)
+
+ NEWHOPE_KEY ← NEWHOPE_SHAREDA(NEWHOPE_MSGB, NEWHOPE_POLY):
+ û := poly_frombytes(up)
+ ŝ := poly_frombytes(sp)
+ v' := poly_invntt(poly_pointwise(û, ŝ))
+ k := rec(v', r)
+ return k
+
+ When a client uses a SEED within a CREATE2V cell, the client SHOULD NOT use
+ that SEED in any other CREATE2V or EXTEND2 cells. See §4 for further
+ discussion.
+
+
+§3.3. Key Expansion
+
+ The client and server derive a shared key, SHARED, by:
+
+ HKDFID := "THESE ARENT THE DROIDS YOURE LOOKING FOR"
+ SHARED := SHAKE_256(HKDFID || NTorKey || NewHopeKey)
+
+
+§3.3.1. Note on the Design Choice
+
+ The reader may wonder why one would use SHAKE-256 to produce a 256-bit
+ output, since the security strength in bits for SHAKE-256 is min(d/2,256)
+ for collision resistance and min(d,256) for first- and second-order
+ preimages, where d is the output length.
+
+ The reasoning is that we should be aiming for 256-bit security for all of
+ our symmetric cryptography. One could then argue that we should just use
+ SHA3-256 for the KDF. We choose SHAKE-256 instead in order to provide an
+ easy way to derive longer shared secrets in the future without requiring a
+ new handshake. The construction is odd, but the future is bright.
+ As we are already using SHAKE-256 for the 32-byte output hash, we are also
+ using it for all other 32-byte hashes involved in the protocol. Note that
+ the only difference between SHA3-256 and SHAKE-256 with 32-byte output is
+ one domain-separation byte.
+
+ [XXX why would you want 256-bit security for the symmetric side? Are you
+ talking pre- or post-quantum security? --peter]
+
+
+§4. Security & Anonymity Implications
+
+ This handshake protocol is one-way authenticated. That is, the server is
+ authenticated, while the client remains anonymous.
+
+ The client MUST NOT cache and reuse SEED. Doing so gives non-trivial
+ adversarial advantages w.r.t. all-for-the-price-of-one attacks during the
+ caching period. More importantly, if the SEED used to generate NEWHOPE_MSGA
+ is reused for handshakes along the same circuit or multiple different
+ circuits, an adversary conducting a sybil attack somewhere along the path(s)
+ will be able to correlate the identity of the client across circuits or
+ hops.
+
+
+§5. Compatibility
+
+ Because our proposal requires both the client and server to send more than
+ the 505 bytes possible within a CREATE2 cell's HDATA section, it depends
+ upon the implementation of a mechanism for allowing larger CREATE cells
+ (cf. Tor proposal #249).
+
+ We reserve the following handshake type for use in CREATE2V/CREATED2V and
+ EXTEND2V/EXTENDED2V cells:
+
+ 0x0003 [NEWHOPE + X25519 HYBRID HANDSHAKE]
+
+ We introduce a new sub-protocol number, "Relay=3", (cf. Tor proposal #264
+ §5.3) to signify support this handshake, and hence for the CREATE2V and
+ fragmented EXTEND2 cells which it requires.
+
+ There are no additional entries or changes required within either router
+ descriptors or microdescriptors to support this handshake method, due to the
+ NewHope keys being ephemeral and derived on-the-fly, and due to the NTor X25519
+ public keys already being included within the "ntor-onion-key" entry.
+
+ Add a "UseNewHopeKEX" configuration option and a corresponding consensus
+ parameter to control whether clients prefer using this NewHope hybrid
+ handshake or some previous handshake protocol. If the configuration option
+ is "auto", clients SHOULD obey the consensus parameter. The default
+ configuration SHOULD be "auto" and the consensus value SHOULD initially be "0".
+
+
+§6. Implementation
+
+ The paper by Alkim, Ducas, Pöppelmann and Schwabe describes two software
+ implementations of NewHope, one C reference implementation and an optimized
+ implementation using AVX2 vector instructions. Those implementations are
+ available at [1].
+
+ Additionally, there are implementations in Go by Yawning Angel, available
+ from [4] and in Rust by Isis Lovecruft, available from [5].
+
+ The software used to generate the test vectors in §C is based on the C
+ reference implementation and available from:
+
+ https://code.ciph.re/isis/newhope-tor-testvectors
+ https://github.com/isislovecruft/newhope-tor-testvectors
+
+
+§7. Performance & Scalability
+
+ The computationally expensive part in the current NTor handshake is the
+ X25519 key-pair generation and the X25519 shared-key computation. The
+ current implementation in Tor is a wrapper to support various highly optimized
+ implementations on different architectures. On Intel Haswell processors, the
+ fastest implementation of X25519, as reported by the eBACS benchmarking
+ project [6], takes 169920 cycles for key-pair generation and 161648 cycles
+ for shared-key computation; these add up to a total of 331568 cycles on each
+ side (initiator and responder).
+
+ The C reference implementation of NewHope, also benchmarked on Intel
+ Haswell, takes 358234 cycles for the initiator and 402058 cycles for the
+ Responder. The core computation of the proposed combination of NewHope and
+ X25519 will thus mean a slowdown of about a factor of 2.1 for the Initiator
+ and a slowdown by a factor of 2.2 for the Responder compared to the current
+ NTor handshake. These numbers assume a fully optimized implementation of the
+ NTor handshake and a C reference implementation of NewHope. With optimized
+ implementations of NewHope, such as the one for Intel Haswell described in
+ [0], the computational slowdown will be considerably smaller than a factor
+ of 2.
+
+
+§8. References
+
+[0]: https://cryptojedi.org/papers/newhope-20160328.pdf
+[1]: https://cryptojedi.org/crypto/#newhope
+[2]: http://www.nist.gov/customcf/get_pdf.cfm?pub_id=919061
+[3]: https://tools.ietf.org/html/rfc7748#section-6.1
+[4]: https://github.com/Yawning/newhope
+[5]: https://code.ciph.re/isis/newhopers
+[6]: http://bench.cr.yp.to
+
+
+§A. Cell Formats
+
+§A.1. CREATE2V Cells
+
+ The client portion of the handshake should send CLIENT_HDATA, formatted
+ into a CREATE2V cell as follows:
+
+ CREATE2V { [2114 bytes]
+ HTYPE := 0x0003 [2 bytes]
+ HLEN := 0x0780 [2 bytes]
+ HDATA := CLIENT_HDATA [1920 bytes]
+ IGNORED := 0x00 [194 bytes]
+ }
+
+ [XXX do we really want to pad with IGNORED to make CLIENT_HDATA the
+ same number of bytes as SERVER_HDATA? --isis]
+
+§A.2. CREATED2V Cells
+
+ The server responds to the client's CREATE2V cell with SERVER_HDATA,
+ formatted into a CREATED2V cell as follows:
+
+ CREATED2V { [2114 bytes]
+ HLEN := 0x0800 [2 bytes]
+ HDATA := SERVER_HDATA [2112 bytes]
+ IGNORED := 0x00 [0 bytes]
+ }
+
+§A.3. Fragmented EXTEND2 Cells
+
+ When the client wishes to extend a circuit, the client should fragment
+ CLIENT_HDATA into four EXTEND2 cells:
+
+ EXTEND2 {
+ NSPEC := 0x02 { [1 byte]
+ LINK_ID_SERVER [22 bytes] XXX
+ LINK_ADDRESS_SERVER [8 bytes] XXX
+ }
+ HTYPE := 0x0003 [2 bytes]
+ HLEN := 0x0780 [2 bytes]
+ HDATA := CLIENT_HDATA[0,461] [462 bytes]
+ }
+ EXTEND2 {
+ NSPEC := 0x00 [1 byte]
+ HTYPE := 0xFFFF [2 bytes]
+ HLEN := 0x0000 [2 bytes]
+ HDATA := CLIENT_HDATA[462,954] [492 bytes]
+ }
+ EXTEND2 {
+ NSPEC := 0x00 [1 byte]
+ HTYPE := 0xFFFF [2 bytes]
+ HLEN := 0x0000 [2 bytes]
+ HDATA := CLIENT_HDATA[955,1447] [492 bytes]
+ }
+ EXTEND2 {
+ NSPEC := 0x00 [1 byte]
+ HTYPE := 0xFFFF [2 bytes]
+ HLEN := 0x0000 [2 bytes]
+ HDATA := CLIENT_HDATA[1448,1919] || 0x00[20] [492 bytes]
+ }
+ EXTEND2 {
+ NSPEC := 0x00 [1 byte]
+ HTYPE := 0xFFFF [2 bytes]
+ HLEN := 0x0000 [2 bytes]
+ HDATA := 0x00[172] [172 bytes]
+ }
+
+ The client sends this to the server to extend the circuit from, and that
+ server should format the fragmented EXTEND2 cells into a CREATE2V cell, as
+ described in §A.1.
+
+§A.4. Fragmented EXTENDED2 Cells
+
+ EXTENDED2 {
+ NSPEC := 0x02 { [1 byte]
+ LINK_ID_SERVER [22 bytes] XXX
+ LINK_ADDRESS_SERVER [8 bytes] XXX
+ }
+ HTYPE := 0x0003 [2 bytes]
+ HLEN := 0x0800 [2 bytes]
+ HDATA := SERVER_HDATA[0,461] [462 bytes]
+ }
+ EXTENDED2 {
+ NSPEC := 0x00 [1 byte]
+ HTYPE := 0xFFFF [2 bytes]
+ HLEN := 0x0000 [2 bytes]
+ HDATA := SERVER_HDATA[462,954] [492 bytes]
+ }
+ EXTEND2 {
+ NSPEC := 0x00 [1 byte]
+ HTYPE := 0xFFFF [2 bytes]
+ HLEN := 0x0000 [2 bytes]
+ HDATA := SERVER_HDATA[955,1447] [492 bytes]
+ }
+ EXTEND2 {
+ NSPEC := 0x00 [1 byte]
+ HTYPE := 0xFFFF [2 bytes]
+ HLEN := 0x0000 [2 bytes]
+ HDATA := SERVER_HDATA[1448,1939] [492 bytes]
+ }
+ EXTEND2 {
+ NSPEC := 0x00 [1 byte]
+ HTYPE := 0xFFFF [2 bytes]
+ HLEN := 0x0000 [2 bytes]
+ HDATA := SERVER_HDATA[1940,2112] [172 bytes]
+ }
+
+
+§B. NewHope Internal Functions
+
+ gen_a(SEED): returns a uniformly random poly
+ poly_getnoise(): returns a poly sampled from a centered binomial
+ poly_ntt(poly): number-theoretic transform; returns a poly
+ poly_invntt(poly): inverse number-theoretic transform; returns a poly
+ poly_pointwise(poly, poly): pointwise multiplication; returns a poly
+ poly_tobytes(poly): packs a poly to a NEWHOPE_POLY byte array
+ poly_frombytes(NEWHOPE_POLY): unpacks a NEWHOPE_POLY byte array to a poly
+
+ helprec(poly): returns a NEWHOPE_REC byte array
+ rec(poly, NEWHOPE_REC): returns a NEWHOPE_KEY
+
+
+ --- Description of the Newhope internal functions ---
+
+ gen_a(SEED seed) receives as input a 32-byte (public) seed. It expands
+ this seed through SHAKE-128 from the FIPS202 standard. The output of SHAKE-128
+ is considered a sequence of 16-bit little-endian integers. This sequence is
+ used to initialize the coefficients of the returned polynomial from the least
+ significant (coefficient of X^0) to the most significant (coefficient of
+ X^1023) coefficient. For each of the 16-bit integers first eliminate the
+ highest two bits (to make it a 14-bit integer) and then use it as the next
+ coefficient if it is smaller than q=12289.
+ Note that the amount of output required from SHAKE to initialize all 1024
+ coefficients of the polynomial varies depending on the input seed.
+ Note further that this function does not process any secret data and thus does
+ not need any timing-attack protection.
+
+
+ poly_getnoise() first generates 4096 bytes of uniformly random data. This can
+ be done by reading these bytes from the system's RNG; efficient
+ implementations will typically only read a 32-byte seed from the system's RNG
+ and expand it through some fast PRG (for example, ChaCha20 or AES-256 in CTR
+ mode). The output of the PRG is considered an array of 2048 16-bit integers
+ r[0],...,r[2047]. The coefficients of the output polynomial are computed as
+ HW(r[0])-HW(r[1]), HW(r[2])-HW(r[3]),...,HW(r[2046])-HW(r[2047]), where HW
+ stands for Hamming weight.
+ Note that the choice of RNG is a local decision; different implementations are
+ free to use different RNGs.
+ Note further that the output of this function is secret; the PRG (and the
+ computation of HW) need to be protected against timing attacks.
+
+
+ poly_ntt(poly f): For a mathematical description of poly_ntt see the [0]; a
+ pseudocode description of a very naive in-place transformation of an input
+ polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
+ following code (all arithmetic on coefficients performed modulo q):
+
+ psi = 7
+ omega = 49
+
+ for i in range(0,n):
+ t[i] = f[i] * psi^i
+
+ for i in range(0,n):
+ f[i] = 0
+ for j in range(0,n):
+ f[i] += t[j] * omega^((i*j)%n)
+
+ Note that this is not how poly_ntt should be implemented if performance is
+ an issue; in particular, efficient algorithms for the number-theoretic
+ transform take time O(n*log(n)) and not O(n^2)
+ Note further that all arithmetic in poly_ntt has to be protected against
+ timing attacks.
+
+
+ poly_invntt(poly f): For a mathematical description of poly_invntt see the
+ [0]; a pseudocode description of a very naive in-place transformation of an
+ input polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
+ following code (all arithmetic on coefficients performed modulo q):
+
+ invpsi = 8778;
+ invomega = 1254;
+ invn = 12277;
+
+ for i in range(0,n):
+ t[i] = f[i];
+
+ for i in range(0,n):
+ f[i]=0;
+ for j in range(0,n):
+ f[i] += t[j] * invomega^((i*j)%n)
+ f[i] *= invpsi^i
+ f[i] *= invn
+
+ Note that this is not how poly_invntt should be implemented if performance
+ is an issue; in particular, efficient algorithms for the inverse
+ number-theoretic transform take time O(n*log(n)) and not O(n^2)
+ Note further that all arithmetic in poly_invntt has to be protected against
+ timing attacks.
+
+
+ poly_pointwise(poly f, poly g) performs pointwise multiplication of the two
+ polynomials. This means that for f = (f0 + f1*X + f2*X^2 + ... +
+ f1023*X^1023) and g = (g0 + g1*X + g2*X^2 + ... + g1023*X^1023) it computes
+ and returns h = (h0 + h1*X + h2*X^2 + ... + h1023*X^1023) with h0 = f0*g0,
+ h1 = f1*g1,..., h1023 = f1023*g1023.
+
+
+ poly_tobytes(poly f) first reduces all coefficents of f modulo q, i.e.,
+ brings them to the interval [0,q-1]. Denote these reduced coefficients as
+ f0,..., f1023; note that they all fit into 14 bits. The function then packs
+ those coefficients into an array of 1792 bytes r[0],..., r[1792] in "packed
+ little-endian representation", i.e.,
+ r[0] = f[0] & 0xff;
+ r[1] = (f[0] >> 8) & ((f[1] & 0x03) << 6)
+ r[2] = (f[1] >> 2) & 0xff;
+ r[3] = (f[1] >> 10) & ((f[2] & 0x0f) << 4)
+ .
+ .
+ .
+ r[1790] = (f[1022]) >> 12) & ((f[1023] & 0x3f) << 2)
+ r[1791] = f[1023] >> 6
+ Note that this function needs to be protected against timing attacks. In
+ particular, avoid non-constant-time conditional subtractions (or other
+ non-constant-time expressions) in the reduction modulo q of the coefficients.
+
+
+ poly_frombytes(NEWHOPE_POLY b) is the inverse of poly_tobytes; it receives
+ as input an array of 1792 bytes and coverts it into the internal
+ representation of a poly. Note that poly_frombytes does not need to check
+ whether the coefficients are reduced modulo q or reduce coefficients modulo
+ q. Note further that the function must not leak any information about its
+ inputs through timing information, as it is also applied to the secret key
+ of the initiator.
+
+
+ helprec(poly f) computes 256 bytes of reconciliation information from the
+ input poly f. Internally, one byte of reconciliation information is computed
+ from four coefficients of f by a function helprec4. Let the input polynomial f
+ = (f0 + f1*X + f2*X^2 + ... + f1023*X^1023); let the output byte array be
+ r[0],...r[256]. This output byte array is computed as
+ r[0] = helprec4(f0,f256,f512,f768)
+ r[1] = helprec4(f1,f257,f513,f769)
+ r[2] = helprec4(f2,f258,f514,f770)
+ .
+ .
+ .
+ r[255] = helprec4(f255,f511,f767,f1023), where helprec4 does the following:
+
+ helprec4(x0,x1,x2,x3):
+ b = randombit()
+ r0,r1,r2,r3 = CVPD4(8*x0+4*b,8*x1+4*b,8*x2+4*b,8*x3+4*b)
+ r = (r0 & 0x03) | ((r1 & 0x03) << 2) | ((r2 & 0x03) << 4) | ((r3 & 0x03) << 6)
+ return r
+
+ The function CVPD4 does the following:
+
+ CVPD4(y0,y1,y2,y3):
+ v00 = round(y0/2q)
+ v01 = round(y1/2q)
+ v02 = round(y2/2q)
+ v03 = round(y3/2q)
+ v10 = round((y0-1)/2q)
+ v11 = round((y1-1)/2q)
+ v12 = round((y2-1)/2q)
+ v13 = round((y3-1)/2q)
+ t = abs(y0 - 2q*v00)
+ t += abs(y1 - 2q*v01)
+ t += abs(y2 - 2q*v02)
+ t += abs(y3 - 2q*v03)
+ if(t < 2q):
+ v0 = v00
+ v1 = v01
+ v2 = v02
+ v3 = v03
+ k = 0
+ else
+ v0 = v10
+ v1 = v11
+ v2 = v12
+ v3 = v13
+ r = 1
+ return (v0-v3,v1-v3,v2-v3,k+2*v3)
+
+ In this description, round(x) is defined as ⌊x + 0.5⌋, where ⌊x⌋ rounds to
+ the largest integer that does not exceed x; abs() returns the absolute
+ value.
+ Note that all computations involved in helprec operate on secret data and must
+ be protected against timing attacks.
+
+
+ rec(poly f, NEWHOPE_REC r) computes the pre-hash (see paper) Newhope key from
+ f and r. Specifically, it computes one bit of key from 4 coefficients of f and
+ one byte of r. Let f = f0 + f1*X + f2*X^2 + ... + f1023*X^1023 and let r =
+ r[0],r[1],...,r[255]. Let the bytes of the output by k[0],...,k[31] and let
+ the bits of the output by k0,...,k255, where
+ k0 = k[0] & 0x01
+ k1 = (k[0] >> 1) & 0x01
+ k2 = (k[0] >> 2) & 0x01
+ .
+ .
+ .
+ k8 = k[1] & 0x01
+ k9 = (k[1] >> 1) & 0x01
+ .
+ .
+ .
+ k255 = (k[32] >> 7)
+ The function rec computes k0,...,k255 as
+ k0 = rec4(f0,f256,f512,f768,r[0])
+ k1 = rec4(f1,f257,f513,f769,r[1])
+ .
+ .
+ .
+ k255 = rec4(f255,f511,f767,f1023,r[255])
+
+ The function rec4 does the following:
+
+ rec4(y0,y1,y2,y3,r):
+ r0 = r & 0x03
+ r1 = (r >> 2) & 0x03
+ r2 = (r >> 4) & 0x03
+ r3 = (r >> 6) & 0x03
+ Decode(8*y0-2q*r0, 8*y1-2q*r1, 8*y2-2q*r2, 8*y3-q*r3)
+
+ The function Decode does the following:
+
+ Decode(v0,v1,v2,v3):
+ t0 = round(v0/8q)
+ t1 = round(v1/8q)
+ t2 = round(v2/8q)
+ t3 = round(v3/8q)
+ t = abs(v0 - 8q*t0)
+ t += abs(v0 - 8q*t0)
+ t += abs(v0 - 8q*t0)
+ t += abs(v0 - 8q*t0)
+ if(t > 1) return 1
+ else return 0
+
+
+§C. Test Vectors
diff --git a/proposals/XXX-newhope-hybrid-handshake.txt b/proposals/XXX-newhope-hybrid-handshake.txt
deleted file mode 100644
index 9db9a96..0000000
--- a/proposals/XXX-newhope-hybrid-handshake.txt
+++ /dev/null
@@ -1,764 +0,0 @@
-Filename: XXX-newhope-hybrid-handshake.txt
-Title: RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope
-Author: Isis Lovecruft, Peter Schwabe
-Created: 16 Apr 2016
-Updated: 4 May 2016
-Status: Draft
-Depends: prop#220 prop#249 prop#264
-
-§0. Introduction
-
- RebelAlliance is a post-quantum secure hybrid handshake, comprised of an
- alliance between the X25519 and NewHope key exchanges.
-
- NewHope is a post-quantum-secure lattice-based key-exchange protocol based
- on the ring-learning-with-errors (Ring-LWE) problem. We propose a hybrid
- handshake for Tor, based on a combination of Tor's current NTor handshake
- and a shared key derived through a NewHope ephemeral key exchange.
-
- For further details on the NewHope key exchange, the reader is referred to
- "Post-quantum key exchange - a new hope" by Alkim, Ducas, Pöppelmann, and
- Schwabe [0][1].
-
- For the purposes of brevity, we consider that NTor is currently the only
- handshake protocol in Tor; the older TAP protocol is ignored completely, due
- to the fact that it is currently deprecated and nearly entirely unused.
-
-
-§1. Motivation
-
- An attacker currently monitoring and storing circuit-layer NTor handshakes
- who later has the ability to run Shor's algorithm on a quantum computer will
- be able to break Tor's current handshake protocol and decrypt previous
- communications.
-
- It is unclear if and when such attackers equipped with large quantum
- computers will exist, but various estimates by researchers in quantum
- physics and quantum engineering give estimates of only 1 to 2 decades.
- Clearly, the security requirements of many Tor users include secrecy of
- their messages beyond this time span, which means that Tor needs to update
- the key exchange to protect against such attackers as soon as possible.
-
-
-§2. Design
-
- An initiator and responder, in parallel, conduct two handshakes:
-
- - An X25519 key exchange, as described in the description of the NTor
- handshake in Tor proposal #216.
- - A NewHope key exchange.
-
- The shared keys derived from these two handshakes are then concatenated and
- used as input to the SHAKE-256 extendable output function (XOF), as described
- in FIPS-PUB-202 [2], in order to produce a shared key of the desired length.
- The testvectors in §C assume that this key has a length of 32 bytes, but the
- use of a XOF allows arbitrary lengths to easily support future updates of
- the symmetric primitives using the key. See also §3.3.1.
-
-
-§3. Specification
-
-§3.1. Notation
-
- Let `a || b` be the concatenation of a with b.
-
- Let `a^b` denote the exponentiation of a to the bth power.
-
- Let `a == b` denote the equality of a with b, and vice versa.
-
- Let `a := b` be the assignment of the value of b to the variable a.
-
- Let `H(x)` be 32-bytes of output of the SHAKE-256 XOF (as described in
- FIPS-PUB-202) applied to message x.
-
- Let X25519 refer to the curve25519-based key agreement protocol described
- in RFC7748 §6.1. [3]
-
- Let `EXP(a, b) == X25519(., b, a)` with `g == 9`. Let X25519_KEYGEN() do
- the appropriate manipulations when generating the secret key (clearing the
- low bits, twidding the high bits). Additionally, EXP() MUST include the
- check for all-zero output due to the input point being of small
- order (cf. RFC7748 §6).
-
- Let `X25519_KEYID(B) == B` where B is a valid X25519 public key.
-
- When representing an element of the Curve25519 subgroup as a byte string,
- use the standard (32-byte, little-endian, x-coordinate-only) representation
- for Curve25519 points.
-
- Let `ID` be a router's identity key taken from the router microdescriptor.
- In the case for relays possessing Ed25519 identity keys (cf. Tor proposal
- #220), this is a 32-byte string representing the public Ed25519 identity key.
- For backwards and forwards compatibility with routers which do not possess
- Ed25519 identity keys, this is a 32-byte string created via the output of
- H(ID).
-
- We refer to the router as the handshake "responder", and the client (which
- may be an OR or an OP) as the "initiator".
-
-
- ID_LENGTH [32 bytes]
- H_LENGTH [32 bytes]
- G_LENGTH [32 bytes]
-
- PROTOID := "pqtor-x25519-newhope-shake256-1"
- T_MAC := PROTOID || ":mac"
- T_KEY := PROTOID || ":key_extract"
- T_VERIFY := PROTOID || ":verify"
-
- (X25519_SK, X25519_PK) := X25519_KEYGEN()
-
-
-§3.2. Protocol
-
- ========================================================================================
- | |
- | Fig. 1: The NewHope-X25519 Hybrid Handshake. |
- | |
- | Before the handshake the Initiator is assumed to know Z, a public X25519 key for |
- | the Responder, as well as the Responder's ID. |
- ----------------------------------------------------------------------------------------
- | |
- | Initiator Responder |
- | |
- | SEED := H(randombytes(32)) |
- | x, X := X25519_KEYGEN() |
- | a, A := NEWHOPE_KEYGEN(SEED) |
- | CLIENT_HDATA := ID || Z || X || A |
- | |
- | --- CLIENT_HDATA ---> |
- | |
- | y, Y := X25519_KEYGEN() |
- | NTOR_KEY, AUTH := NTOR_SHAREDB(X,y,Y,z,Z,ID,B) |
- | M, NEWHOPE_KEY := NEWHOPE_SHAREDB(A) |
- | SERVER_HDATA := Y || AUTH || M |
- | sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) |
- | |
- | <-- SERVER_HDATA ---- |
- | |
- | NTOR_KEY := NTOR_SHAREDA(x, X, Y, Z, ID, AUTH) |
- | NEWHOPE_KEY := NEWHOPE_SHAREDA(M, a) |
- | sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) |
- | |
- ========================================================================================
-
-
-§3.2.1. The NTor Handshake
-
-§3.2.1.1. Prologue
-
- Take a router with identity ID. As setup, the router generates a secret key z,
- and a public onion key Z with:
-
- z, Z := X25519_KEYGEN()
-
- The router publishes Z in its server descriptor in the "ntor-onion-key" entry.
- Henceforward, we refer to this router as the "responder".
-
-
-§3.2.1.2. Initiator
-
- To send a create cell, the initiator generates a keypair:
-
- x, X := X25519_KEYGEN()
-
- and creates the NTor portion of a CREATE2V cell's HDATA section:
-
- CLIENT_NTOR := ID || Z || X [96 bytes]
-
- The initiator includes the responder's ID and Z in the CLIENT_NTOR so that, in
- the event the responder OR has recently rotated keys, the responder can
- determine which keypair to use.
-
- The initiator then concatenates CLIENT_NTOR with CLIENT_NEWHOPE (see §3.2.2),
- to create CLIENT_HDATA, and creates and sends a CREATE2V cell (see §A.1)
- to the responder.
-
- CLIENT_NEWHOPE [1824 bytes] (see §3.2.2)
- CLIENT_HDATA := CLIENT_NTOR || CLIENT_NEWHOPE [1920 bytes]
-
- If the responder does not respond with a CREATED2V cell, the initiator SHOULD
- NOT attempt to extend the circuit through the responder by sending fragmented
- EXTEND2 cells, since the responder's lack of support for CREATE2V cells is
- assumed to imply the responder also lacks support for fragmented EXTEND2
- cells. Alternatively, for initiators with a sufficiently late consensus
- method, the initiator MUST check that "proto" line in the responder's
- descriptor (cf. Tor proposal #264) advertises support for the "Relay"
- subprotocol version 3 (see §5).
-
-
-§3.2.1.3. Responder
-
- The responder generates a keypair of y, Y = X25519_KEYGEN(), and does
- NTOR_SHAREDB() as follows:
-
- (NTOR_KEY, AUTH) ← NTOR_SHAREDB(X, y, Y, z, Z, ID, B):
- secret_input := EXP(X, y) || EXP(X, z) || ID || B || Z || Y || PROTOID
- NTOR_KEY := H(secret_input, T_KEY)
- verify := H(secret_input, T_VERIFY)
- auth_input := verify || ID || Z || Y || X || PROTOID || "Server"
- AUTH := H(auth_input, T_MAC)
-
- The responder sends a CREATED2V cell containing:
-
- SERVER_NTOR := Y || AUTH [64 bytes]
- SERVER_NEWHOPE [2048 bytes] (see §3.2.2)
- SERVER_HDATA := SERVER_NTOR || SERVER_NEWHOPE [2112 bytes]
-
- and sends this to the initiator.
-
-
-§3.2.1.4. Finalisation
-
- The initiator then checks Y is in G^* [see NOTE below], and does
- NTOR_SHAREDA() as follows:
-
- (NTOR_KEY) ← NTOR_SHAREDA(x, X, Y, Z, ID, AUTH)
- secret_input := EXP(Y, x) || EXP(Z, x) || ID || Z || X || Y || PROTOID
- NTOR_KEY := H(secret_input, T_KEY)
- verify := H(secret_input, T_VERIFY)
- auth_input := verify || ID || Z || Y || X || PROTOID || "Server"
- if AUTH == H(auth_input, T_MAC)
- return NTOR_KEY
-
- Both parties now have a shared value for NTOR_KEY. They expand this into
- the keys needed for the Tor relay protocol.
-
- [XXX We think we want to omit the final hashing in the production of NTOR_KEY
- here, and instead put all the inputs through SHAKE-256. --isis, peter]
-
- [XXX We probably want to remove ID and B from the input to the shared key
- material, since they serve for authentication but, as pre-established
- "prologue" material to the handshake, they should not be used in attempts to
- strengthen the cryptographic suitability of the shared key. Also, their
- inclusion is implicit in the DH exponentiations. I should probably ask Ian
- about the reasoning for the original design choice. --isis]
-
-
-§3.2.2. The NewHope Handshake
-
-§3.2.2.1. Parameters & Mathematical Structures
-
- Let ℤ be the ring of rational integers. Let ℤq, for q ≥ 1, denote the quotient
- ring ℤ/qℤ. We define R = ℤ[X]/((X^n)+1) as the ring of integer polynomials
- modulo ((X^n)+1), and Rq = ℤq[X]/((X^n)+1) as the ring of integer polynomials
- modulo ((X^n)+1) where each coefficient is reduced modulo q. When we refer to
- a polynomial, we mean an element of Rq.
-
- n := 1024
- q := 12289
-
- SEED [32 Bytes]
- NEWHOPE_POLY [1792 Bytes]
- NEWHOPE_REC [256 Bytes]
- NEWHOPE_KEY [32 Bytes]
-
- NEWHOPE_MSGA := (NEWHOPE_POLY || SEED)
- NEWHOPE_MSGB := (NEWHOPE_POLY || NEWHOPE_REC)
-
-
-§3.2.2.2. High-level Description of Newhope API Functions
-
- For a description of internal functions, see §B.
-
- (NEWHOPE_POLY, NEWHOPE_MSGA) ← NEWHOPE_KEYGEN(SEED):
- â := gen_a(seed)
- s := poly_getnoise()
- e := poly_getnoise()
- ŝ := poly_ntt(s)
- ê := poly_ntt(e)
- b̂ := pointwise(â, ŝ) + ê
- sp := poly_tobytes(ŝ)
- bp := poly_tobytes(b̂)
- return (sp, (bp || seed))
-
- (NEWHOPE_MSGB, NEWHOPE_KEY) ← NEWHOPE_SHAREDB(NEWHOPE_MSGA):
- s' := poly_getnoise()
- e' := poly_getnoise()
- e" := poly_getnoise()
- b̂ := poly_frombytes(bp)
- â := gen_a(seed)
- ŝ' := poly_ntt(s')
- ê' := poly_ntt(e')
- û := poly_pointwise(â, ŝ') + ê'
- v := poly_invntt(poly_pointwise(b̂,ŝ')) + e"
- r := helprec(v)
- up := poly_tobytes(û)
- k := rec(v, r)
- return ((up || r), k)
-
- NEWHOPE_KEY ← NEWHOPE_SHAREDA(NEWHOPE_MSGB, NEWHOPE_POLY):
- û := poly_frombytes(up)
- ŝ := poly_frombytes(sp)
- v' := poly_invntt(poly_pointwise(û, ŝ))
- k := rec(v', r)
- return k
-
- When a client uses a SEED within a CREATE2V cell, the client SHOULD NOT use
- that SEED in any other CREATE2V or EXTEND2 cells. See §4 for further
- discussion.
-
-
-§3.3. Key Expansion
-
- The client and server derive a shared key, SHARED, by:
-
- HKDFID := "THESE ARENT THE DROIDS YOURE LOOKING FOR"
- SHARED := SHAKE_256(HKDFID || NTorKey || NewHopeKey)
-
-
-§3.3.1. Note on the Design Choice
-
- The reader may wonder why one would use SHAKE-256 to produce a 256-bit
- output, since the security strength in bits for SHAKE-256 is min(d/2,256)
- for collision resistance and min(d,256) for first- and second-order
- preimages, where d is the output length.
-
- The reasoning is that we should be aiming for 256-bit security for all of
- our symmetric cryptography. One could then argue that we should just use
- SHA3-256 for the KDF. We choose SHAKE-256 instead in order to provide an
- easy way to derive longer shared secrets in the future without requiring a
- new handshake. The construction is odd, but the future is bright.
- As we are already using SHAKE-256 for the 32-byte output hash, we are also
- using it for all other 32-byte hashes involved in the protocol. Note that
- the only difference between SHA3-256 and SHAKE-256 with 32-byte output is
- one domain-separation byte.
-
- [XXX why would you want 256-bit security for the symmetric side? Are you
- talking pre- or post-quantum security? --peter]
-
-
-§4. Security & Anonymity Implications
-
- This handshake protocol is one-way authenticated. That is, the server is
- authenticated, while the client remains anonymous.
-
- The client MUST NOT cache and reuse SEED. Doing so gives non-trivial
- adversarial advantages w.r.t. all-for-the-price-of-one attacks during the
- caching period. More importantly, if the SEED used to generate NEWHOPE_MSGA
- is reused for handshakes along the same circuit or multiple different
- circuits, an adversary conducting a sybil attack somewhere along the path(s)
- will be able to correlate the identity of the client across circuits or
- hops.
-
-
-§5. Compatibility
-
- Because our proposal requires both the client and server to send more than
- the 505 bytes possible within a CREATE2 cell's HDATA section, it depends
- upon the implementation of a mechanism for allowing larger CREATE cells
- (cf. Tor proposal #249).
-
- We reserve the following handshake type for use in CREATE2V/CREATED2V and
- EXTEND2V/EXTENDED2V cells:
-
- 0x0003 [NEWHOPE + X25519 HYBRID HANDSHAKE]
-
- We introduce a new sub-protocol number, "Relay=3", (cf. Tor proposal #264
- §5.3) to signify support this handshake, and hence for the CREATE2V and
- fragmented EXTEND2 cells which it requires.
-
- There are no additional entries or changes required within either router
- descriptors or microdescriptors to support this handshake method, due to the
- NewHope keys being ephemeral and derived on-the-fly, and due to the NTor X25519
- public keys already being included within the "ntor-onion-key" entry.
-
- Add a "UseNewHopeKEX" configuration option and a corresponding consensus
- parameter to control whether clients prefer using this NewHope hybrid
- handshake or some previous handshake protocol. If the configuration option
- is "auto", clients SHOULD obey the consensus parameter. The default
- configuration SHOULD be "auto" and the consensus value SHOULD initially be "0".
-
-
-§6. Implementation
-
- The paper by Alkim, Ducas, Pöppelmann and Schwabe describes two software
- implementations of NewHope, one C reference implementation and an optimized
- implementation using AVX2 vector instructions. Those implementations are
- available at [1].
-
- Additionally, there are implementations in Go by Yawning Angel, available
- from [4] and in Rust by Isis Lovecruft, available from [5].
-
- The software used to generate the test vectors in §C is based on the C
- reference implementation and available from:
-
- https://code.ciph.re/isis/newhope-tor-testvectors
- https://github.com/isislovecruft/newhope-tor-testvectors
-
-
-§7. Performance & Scalability
-
- The computationally expensive part in the current NTor handshake is the
- X25519 key-pair generation and the X25519 shared-key computation. The
- current implementation in Tor is a wrapper to support various highly optimized
- implementations on different architectures. On Intel Haswell processors, the
- fastest implementation of X25519, as reported by the eBACS benchmarking
- project [6], takes 169920 cycles for key-pair generation and 161648 cycles
- for shared-key computation; these add up to a total of 331568 cycles on each
- side (initiator and responder).
-
- The C reference implementation of NewHope, also benchmarked on Intel
- Haswell, takes 358234 cycles for the initiator and 402058 cycles for the
- Responder. The core computation of the proposed combination of NewHope and
- X25519 will thus mean a slowdown of about a factor of 2.1 for the Initiator
- and a slowdown by a factor of 2.2 for the Responder compared to the current
- NTor handshake. These numbers assume a fully optimized implementation of the
- NTor handshake and a C reference implementation of NewHope. With optimized
- implementations of NewHope, such as the one for Intel Haswell described in
- [0], the computational slowdown will be considerably smaller than a factor
- of 2.
-
-
-§8. References
-
-[0]: https://cryptojedi.org/papers/newhope-20160328.pdf
-[1]: https://cryptojedi.org/crypto/#newhope
-[2]: http://www.nist.gov/customcf/get_pdf.cfm?pub_id=919061
-[3]: https://tools.ietf.org/html/rfc7748#section-6.1
-[4]: https://github.com/Yawning/newhope
-[5]: https://code.ciph.re/isis/newhopers
-[6]: http://bench.cr.yp.to
-
-
-§A. Cell Formats
-
-§A.1. CREATE2V Cells
-
- The client portion of the handshake should send CLIENT_HDATA, formatted
- into a CREATE2V cell as follows:
-
- CREATE2V { [2114 bytes]
- HTYPE := 0x0003 [2 bytes]
- HLEN := 0x0780 [2 bytes]
- HDATA := CLIENT_HDATA [1920 bytes]
- IGNORED := 0x00 [194 bytes]
- }
-
- [XXX do we really want to pad with IGNORED to make CLIENT_HDATA the
- same number of bytes as SERVER_HDATA? --isis]
-
-§A.2. CREATED2V Cells
-
- The server responds to the client's CREATE2V cell with SERVER_HDATA,
- formatted into a CREATED2V cell as follows:
-
- CREATED2V { [2114 bytes]
- HLEN := 0x0800 [2 bytes]
- HDATA := SERVER_HDATA [2112 bytes]
- IGNORED := 0x00 [0 bytes]
- }
-
-§A.3. Fragmented EXTEND2 Cells
-
- When the client wishes to extend a circuit, the client should fragment
- CLIENT_HDATA into four EXTEND2 cells:
-
- EXTEND2 {
- NSPEC := 0x02 { [1 byte]
- LINK_ID_SERVER [22 bytes] XXX
- LINK_ADDRESS_SERVER [8 bytes] XXX
- }
- HTYPE := 0x0003 [2 bytes]
- HLEN := 0x0780 [2 bytes]
- HDATA := CLIENT_HDATA[0,461] [462 bytes]
- }
- EXTEND2 {
- NSPEC := 0x00 [1 byte]
- HTYPE := 0xFFFF [2 bytes]
- HLEN := 0x0000 [2 bytes]
- HDATA := CLIENT_HDATA[462,954] [492 bytes]
- }
- EXTEND2 {
- NSPEC := 0x00 [1 byte]
- HTYPE := 0xFFFF [2 bytes]
- HLEN := 0x0000 [2 bytes]
- HDATA := CLIENT_HDATA[955,1447] [492 bytes]
- }
- EXTEND2 {
- NSPEC := 0x00 [1 byte]
- HTYPE := 0xFFFF [2 bytes]
- HLEN := 0x0000 [2 bytes]
- HDATA := CLIENT_HDATA[1448,1919] || 0x00[20] [492 bytes]
- }
- EXTEND2 {
- NSPEC := 0x00 [1 byte]
- HTYPE := 0xFFFF [2 bytes]
- HLEN := 0x0000 [2 bytes]
- HDATA := 0x00[172] [172 bytes]
- }
-
- The client sends this to the server to extend the circuit from, and that
- server should format the fragmented EXTEND2 cells into a CREATE2V cell, as
- described in §A.1.
-
-§A.4. Fragmented EXTENDED2 Cells
-
- EXTENDED2 {
- NSPEC := 0x02 { [1 byte]
- LINK_ID_SERVER [22 bytes] XXX
- LINK_ADDRESS_SERVER [8 bytes] XXX
- }
- HTYPE := 0x0003 [2 bytes]
- HLEN := 0x0800 [2 bytes]
- HDATA := SERVER_HDATA[0,461] [462 bytes]
- }
- EXTENDED2 {
- NSPEC := 0x00 [1 byte]
- HTYPE := 0xFFFF [2 bytes]
- HLEN := 0x0000 [2 bytes]
- HDATA := SERVER_HDATA[462,954] [492 bytes]
- }
- EXTEND2 {
- NSPEC := 0x00 [1 byte]
- HTYPE := 0xFFFF [2 bytes]
- HLEN := 0x0000 [2 bytes]
- HDATA := SERVER_HDATA[955,1447] [492 bytes]
- }
- EXTEND2 {
- NSPEC := 0x00 [1 byte]
- HTYPE := 0xFFFF [2 bytes]
- HLEN := 0x0000 [2 bytes]
- HDATA := SERVER_HDATA[1448,1939] [492 bytes]
- }
- EXTEND2 {
- NSPEC := 0x00 [1 byte]
- HTYPE := 0xFFFF [2 bytes]
- HLEN := 0x0000 [2 bytes]
- HDATA := SERVER_HDATA[1940,2112] [172 bytes]
- }
-
-
-§B. NewHope Internal Functions
-
- gen_a(SEED): returns a uniformly random poly
- poly_getnoise(): returns a poly sampled from a centered binomial
- poly_ntt(poly): number-theoretic transform; returns a poly
- poly_invntt(poly): inverse number-theoretic transform; returns a poly
- poly_pointwise(poly, poly): pointwise multiplication; returns a poly
- poly_tobytes(poly): packs a poly to a NEWHOPE_POLY byte array
- poly_frombytes(NEWHOPE_POLY): unpacks a NEWHOPE_POLY byte array to a poly
-
- helprec(poly): returns a NEWHOPE_REC byte array
- rec(poly, NEWHOPE_REC): returns a NEWHOPE_KEY
-
-
- --- Description of the Newhope internal functions ---
-
- gen_a(SEED seed) receives as input a 32-byte (public) seed. It expands
- this seed through SHAKE-128 from the FIPS202 standard. The output of SHAKE-128
- is considered a sequence of 16-bit little-endian integers. This sequence is
- used to initialize the coefficients of the returned polynomial from the least
- significant (coefficient of X^0) to the most significant (coefficient of
- X^1023) coefficient. For each of the 16-bit integers first eliminate the
- highest two bits (to make it a 14-bit integer) and then use it as the next
- coefficient if it is smaller than q=12289.
- Note that the amount of output required from SHAKE to initialize all 1024
- coefficients of the polynomial varies depending on the input seed.
- Note further that this function does not process any secret data and thus does
- not need any timing-attack protection.
-
-
- poly_getnoise() first generates 4096 bytes of uniformly random data. This can
- be done by reading these bytes from the system's RNG; efficient
- implementations will typically only read a 32-byte seed from the system's RNG
- and expand it through some fast PRG (for example, ChaCha20 or AES-256 in CTR
- mode). The output of the PRG is considered an array of 2048 16-bit integers
- r[0],...,r[2047]. The coefficients of the output polynomial are computed as
- HW(r[0])-HW(r[1]), HW(r[2])-HW(r[3]),...,HW(r[2046])-HW(r[2047]), where HW
- stands for Hamming weight.
- Note that the choice of RNG is a local decision; different implementations are
- free to use different RNGs.
- Note further that the output of this function is secret; the PRG (and the
- computation of HW) need to be protected against timing attacks.
-
-
- poly_ntt(poly f): For a mathematical description of poly_ntt see the [0]; a
- pseudocode description of a very naive in-place transformation of an input
- polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
- following code (all arithmetic on coefficients performed modulo q):
-
- psi = 7
- omega = 49
-
- for i in range(0,n):
- t[i] = f[i] * psi^i
-
- for i in range(0,n):
- f[i] = 0
- for j in range(0,n):
- f[i] += t[j] * omega^((i*j)%n)
-
- Note that this is not how poly_ntt should be implemented if performance is
- an issue; in particular, efficient algorithms for the number-theoretic
- transform take time O(n*log(n)) and not O(n^2)
- Note further that all arithmetic in poly_ntt has to be protected against
- timing attacks.
-
-
- poly_invntt(poly f): For a mathematical description of poly_invntt see the
- [0]; a pseudocode description of a very naive in-place transformation of an
- input polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
- following code (all arithmetic on coefficients performed modulo q):
-
- invpsi = 8778;
- invomega = 1254;
- invn = 12277;
-
- for i in range(0,n):
- t[i] = f[i];
-
- for i in range(0,n):
- f[i]=0;
- for j in range(0,n):
- f[i] += t[j] * invomega^((i*j)%n)
- f[i] *= invpsi^i
- f[i] *= invn
-
- Note that this is not how poly_invntt should be implemented if performance
- is an issue; in particular, efficient algorithms for the inverse
- number-theoretic transform take time O(n*log(n)) and not O(n^2)
- Note further that all arithmetic in poly_invntt has to be protected against
- timing attacks.
-
-
- poly_pointwise(poly f, poly g) performs pointwise multiplication of the two
- polynomials. This means that for f = (f0 + f1*X + f2*X^2 + ... +
- f1023*X^1023) and g = (g0 + g1*X + g2*X^2 + ... + g1023*X^1023) it computes
- and returns h = (h0 + h1*X + h2*X^2 + ... + h1023*X^1023) with h0 = f0*g0,
- h1 = f1*g1,..., h1023 = f1023*g1023.
-
-
- poly_tobytes(poly f) first reduces all coefficents of f modulo q, i.e.,
- brings them to the interval [0,q-1]. Denote these reduced coefficients as
- f0,..., f1023; note that they all fit into 14 bits. The function then packs
- those coefficients into an array of 1792 bytes r[0],..., r[1792] in "packed
- little-endian representation", i.e.,
- r[0] = f[0] & 0xff;
- r[1] = (f[0] >> 8) & ((f[1] & 0x03) << 6)
- r[2] = (f[1] >> 2) & 0xff;
- r[3] = (f[1] >> 10) & ((f[2] & 0x0f) << 4)
- .
- .
- .
- r[1790] = (f[1022]) >> 12) & ((f[1023] & 0x3f) << 2)
- r[1791] = f[1023] >> 6
- Note that this function needs to be protected against timing attacks. In
- particular, avoid non-constant-time conditional subtractions (or other
- non-constant-time expressions) in the reduction modulo q of the coefficients.
-
-
- poly_frombytes(NEWHOPE_POLY b) is the inverse of poly_tobytes; it receives
- as input an array of 1792 bytes and coverts it into the internal
- representation of a poly. Note that poly_frombytes does not need to check
- whether the coefficients are reduced modulo q or reduce coefficients modulo
- q. Note further that the function must not leak any information about its
- inputs through timing information, as it is also applied to the secret key
- of the initiator.
-
-
- helprec(poly f) computes 256 bytes of reconciliation information from the
- input poly f. Internally, one byte of reconciliation information is computed
- from four coefficients of f by a function helprec4. Let the input polynomial f
- = (f0 + f1*X + f2*X^2 + ... + f1023*X^1023); let the output byte array be
- r[0],...r[256]. This output byte array is computed as
- r[0] = helprec4(f0,f256,f512,f768)
- r[1] = helprec4(f1,f257,f513,f769)
- r[2] = helprec4(f2,f258,f514,f770)
- .
- .
- .
- r[255] = helprec4(f255,f511,f767,f1023), where helprec4 does the following:
-
- helprec4(x0,x1,x2,x3):
- b = randombit()
- r0,r1,r2,r3 = CVPD4(8*x0+4*b,8*x1+4*b,8*x2+4*b,8*x3+4*b)
- r = (r0 & 0x03) | ((r1 & 0x03) << 2) | ((r2 & 0x03) << 4) | ((r3 & 0x03) << 6)
- return r
-
- The function CVPD4 does the following:
-
- CVPD4(y0,y1,y2,y3):
- v00 = round(y0/2q)
- v01 = round(y1/2q)
- v02 = round(y2/2q)
- v03 = round(y3/2q)
- v10 = round((y0-1)/2q)
- v11 = round((y1-1)/2q)
- v12 = round((y2-1)/2q)
- v13 = round((y3-1)/2q)
- t = abs(y0 - 2q*v00)
- t += abs(y1 - 2q*v01)
- t += abs(y2 - 2q*v02)
- t += abs(y3 - 2q*v03)
- if(t < 2q):
- v0 = v00
- v1 = v01
- v2 = v02
- v3 = v03
- k = 0
- else
- v0 = v10
- v1 = v11
- v2 = v12
- v3 = v13
- r = 1
- return (v0-v3,v1-v3,v2-v3,k+2*v3)
-
- In this description, round(x) is defined as ⌊x + 0.5⌋, where ⌊x⌋ rounds to
- the largest integer that does not exceed x; abs() returns the absolute
- value.
- Note that all computations involved in helprec operate on secret data and must
- be protected against timing attacks.
-
-
- rec(poly f, NEWHOPE_REC r) computes the pre-hash (see paper) Newhope key from
- f and r. Specifically, it computes one bit of key from 4 coefficients of f and
- one byte of r. Let f = f0 + f1*X + f2*X^2 + ... + f1023*X^1023 and let r =
- r[0],r[1],...,r[255]. Let the bytes of the output by k[0],...,k[31] and let
- the bits of the output by k0,...,k255, where
- k0 = k[0] & 0x01
- k1 = (k[0] >> 1) & 0x01
- k2 = (k[0] >> 2) & 0x01
- .
- .
- .
- k8 = k[1] & 0x01
- k9 = (k[1] >> 1) & 0x01
- .
- .
- .
- k255 = (k[32] >> 7)
- The function rec computes k0,...,k255 as
- k0 = rec4(f0,f256,f512,f768,r[0])
- k1 = rec4(f1,f257,f513,f769,r[1])
- .
- .
- .
- k255 = rec4(f255,f511,f767,f1023,r[255])
-
- The function rec4 does the following:
-
- rec4(y0,y1,y2,y3,r):
- r0 = r & 0x03
- r1 = (r >> 2) & 0x03
- r2 = (r >> 4) & 0x03
- r3 = (r >> 6) & 0x03
- Decode(8*y0-2q*r0, 8*y1-2q*r1, 8*y2-2q*r2, 8*y3-q*r3)
-
- The function Decode does the following:
-
- Decode(v0,v1,v2,v3):
- t0 = round(v0/8q)
- t1 = round(v1/8q)
- t2 = round(v2/8q)
- t3 = round(v3/8q)
- t = abs(v0 - 8q*t0)
- t += abs(v0 - 8q*t0)
- t += abs(v0 - 8q*t0)
- t += abs(v0 - 8q*t0)
- if(t > 1) return 1
- else return 0
-
-
-§C. Test Vectors
diff --git a/proposals/proposal-status.txt b/proposals/proposal-status.txt
index 088ce54..dc0b332 100644
--- a/proposals/proposal-status.txt
+++ b/proposals/proposal-status.txt
@@ -442,3 +442,9 @@ again to remind me!
Describes a generalised protocol for composing X25519 key exchanges with
post-quantum ones.
+
+270 RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope [DRAFT]
+
+ Describes a hybrid handshake based on the ntor handshake and the
+ NewHope post-quantum key exchange. Currently needs revision to
+ specify how this proposal depends upon prop#269.
More information about the tor-commits
mailing list