[tor-dev] GSoC: Tor Messenger CONIKS integration
Elias Rohrer
ml at tnull.de
Sat Mar 12 09:54:27 UTC 2016
Hello Marcela!
On 11 Mar 2016, at 2:55, Marcela Melara wrote:
> Hi Elias,
>
>> On Mar 9, 2016, at 5:31 AM, Elias Rohrer <ml at tnull.de> wrote:
>>
>> Hello Marcela!
>> Nice to meet you, too, and thank you for that interesting paper!
>> It seems I made a lot of wrong assumptions in my last mail, ugh,
>> that's kind of embarrassing...
>
> I really appreciate your interest in our paper and CONIKS, thanks :)
> No worries, there are a lot of subtle aspects about CONIKS, so it can
> take some time to fully see and understand them.
>
>>> Yes, we propose that the identity providers can act as auditors and
>>> cross-verify each other in this way, and clients may query any
>>> auditor when they are checking for equivocation. However, I should
>>> note that we intended this to be the *ideal deployment* (and I
>>> apologize if this doesn’t come through clearly enough in the
>>> paper). CONIKS also allows third-party auditors whose sole job it is
>>> to check that the chain of signed tree roots is linear. In practice,
>>> we expect that auditors and identity providers will be separate in
>>> most cases (at least until there is a standard for CONIKS protocols
>>> and data formats, but we’re not there yet).
>>>
>>> It’s not accurate to conflate auditing with being a key server, or
>>> say that auditing is a subset of the key server’s functionality.
>>> Auditors only need to keep records of identity providers’ signed
>>> tree roots, and they never download or record any key directory
>>> contents, so it doesn’t really make sense to equip auditors with
>>> key server functionality and “turn it off” when you’re not
>>> acting as a key server. Key servers have more complex and richer
>>> functionality, needing to maintain significantly larger data
>>> structures and perform more cryptographic computations. This means
>>> that key servers have very different space and availability
>>> requirements than auditors do.
>>>
>> Ah, yes, that is true. Just turning off features of the server to get
>> an auditor might indeed be a bad idea. But wouldn't there be a lot of
>> overlapping functionality in the server, auditor and even the
>> clients? E.g. the crypto primitives for building and checking the
>> tree? Maybe it would be viable to create a libCONIKS which could be
>> used by all three components?
>
> So it might seem at first glance that there’s a lot of overlapping
> functionality between the server, auditor and clients, but once you
> start implementing each one, it’s not really the case. Think about
> the different tasks of each component: the server is generating a
> bunch of digital signatures (e.g. signed tree roots, lookup indices)
> and hashes (Merkle tree, lookup indices, signed tree roots) to build
> the Merkle trees and hash chain of signed tree roots. Clients, on the
> other hand, do a bunch of signature and hash verifications (which, yes,
> to be fair, requires hashing data as well, but for different reasons)
> to verify individual authentication paths and signed tree roots. So
> there’s really no overlap in functionality with the server.
> Similarly, the auditors need to verify the signed tree roots so
> there’s some signature and hash verification as well, but in a
> different way than clients need to do this.
>
> While generation and verification of these various crypto primitives
> go hand in hand, I would caution against putting them all in a single
> library if different components of the system will need different
> portions. If you look at the reference implementation, the only common
> code is the definition of the data formats. When I first started
> writing the reference implementation, I actually thought that it would
> be useful (and possible) to build a single libCONIKS, but sharing
> underlying crypto primitives isn’t reason enough to put all of the
> functionality in one place since the application of those primitives
> is what really matters. So I really think it makes more sense to make
> separate modules for server, client and auditor.
>
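That makes sense, thanks. So the shared piece would be little more than
the data formats. Just to make my mental model concrete, I imagine the
common definition looking roughly like this (a rough sketch in Java; the
field names are my own guess, not taken from your implementation):

    // Hypothetical shared data format; field names are my own invention.
    public final class SignedTreeRoot {
        public final long epoch;         // epoch this STR covers
        public final byte[] rootHash;    // Merkle tree root for the epoch
        public final byte[] prevStrHash; // hash of the previous STR (the chain link)
        public final byte[] signature;   // provider's signature over the fields above

        public SignedTreeRoot(long epoch, byte[] rootHash,
                              byte[] prevStrHash, byte[] signature) {
            this.epoch = epoch;
            this.rootHash = rootHash;
            this.prevStrHash = prevStrHash;
            this.signature = signature;
        }
    }

with the server producing and signing these, and clients and auditors
only consuming and verifying them.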
>>>> On Page 4 the authors state:
>>>> "We can implement a VUF using any deterministic, existentially
>>>> unforgeable signature scheme [42]. The signature scheme must be
>>>> deterministic or else (in our application) the identity provider
>>>> could insert multiple bindings for a user at different locations
>>>> each with a valid authentication path. In practice, this could be a
>>>> classic RSA signature [51] (using a deterministic padding scheme
>>>> such as PKCS v. 1.5 [26]) or BLS “short signatures” [9], which
>>>> provide the best space efficiency. Many common discrete-log based
>>>> signature schemes such as DSA [30] are not immediately applicable
>>>> as they require a random nonce.
>>>> In CONIKS, we generate the index for a user u as: i = h(sign_K(u)),
>>>> where h() is a one-way hash function and sign() is a deterministic,
>>>> existentially unforgeable signature scheme. The extra hash function
>>>> is added because some bits of sign_K (u) might otherwise leak
>>>> through binding validity proofs, and signature schemes are
>>>> generally not unforgeable given some bits of the valid signature.
>>>> The hash function prevents this by ensuring the final lookup index
>>>> leaks no information about the value of sign_K (u).”
>>>>
>>>> I understand that using the signature of the user identity ensures
>>>> that no information indicating its existence on the server is
>>>> leaked. But why are they hashing the signature again?
>>>> RSA signatures normally already consist of a signed hash value. Why
>>>> would it be bad to leak bits of the signature?
>>>
>>> The answer to this question is in the portion of the paper you
>>> quote: "signature schemes are generally not unforgeable given some
>>> bits of the valid signature.” There’s some crypto-jargon here,
>>> which you may not be familiar with if you haven't taken a crypto
>>> course. What this sentence is trying to say is that, given some bits
>>> of a valid signature, an adversary might be able to find some
>>> message or data and a corresponding digital signature which is valid
>>> and which has not been generated by the legitimate signer (in this
>>> case the identity provider). This is called an existential forgery,
>>> and it means that an adversary can create forged lookup indices if
>>> the signature isn’t hashed. The hash on top of the signature also
>>> ensures that the lookup index can’t be easily computed from the
>>> VUF.
>>>
>> I actually had a crypto course some time ago, so crypto jargon should
>> be fine... However, it seems that part of my brain needs some more
>> training to get reactivated ;)
>> I want to get this into my brain so here are some more (probably
>> pretty naive) questions and thoughts on the topic:
>> - When bits of a valid signature leak, could an external attacker
>> forge a colliding index for a different user? Would this allow an
>> attacker to register the same public key under a different username
>> they control? Is that why a hash is used?
>
> So the problem with leaked bits isn’t about an attacker forging an
> index. The problem is actually about privacy. Let’s go back and
> think about why the server transforms usernames into these lookup
> indices. The main purpose is so that the proofs of inclusion (i.e. the
> Merkle authentication paths) for one user don’t leak information
> about *any other* users in the tree. So if I receive the proof of
> inclusion for Alice, I should not be able to figure out who else might
> be in the tree. Remember that the proof of inclusion gives us the
> prefix of Alice’s lookup index, so in other words, we want to make
> sure that an attacker can’t find a name that results in an index
> with a shared prefix with Alice since this would be a privacy
> violation. If you use just a hash function, an attacker can feasibly
> hash a large set of usernames until they find one whose index shares a
> prefix with Alice's, and then determine whether that user exists by
> querying the server for the corresponding key.
>
> Using only the digital signature doesn’t solve this problem since
> there are attacks that allow you to recover the rest of the signature
> bits given some of the first bits (I don’t know which specific
> attacks do this, this is the explanation I received from my
> collaborator). So given Alice’s index prefix, an attacker could try
> to guess if the index for a specific user (e.g. Bob) shares the same
> prefix, recover the remaining bits of the signature, and verify,
> using the provider's public verification key for the index, whether this is
> a valid signature for Bob’s name. This is also a privacy violation.
> So by hashing the signature, anyone who receives the authentication
> path for Alice cannot determine, based on this path, which other
> usernames exist in the tree, since this would require finding the
> signatures whose hash shares a prefix with Alice’s index, a much
> more difficult problem.
>
Ok, thanks for your elaborations! I think I understand the problem and
how you solved it now. I'm still a bit curious, though, which specific
attacks can recover the remaining bits of a signature. ;)
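To check that I've got the index computation right, here's a minimal
sketch of the RSA route from the older paper (naming and parameters are
my own; as you note below, the new paper replaces the signing step with
an EC-based VUF):

    import java.nio.charset.StandardCharsets;
    import java.security.KeyPair;
    import java.security.KeyPairGenerator;
    import java.security.MessageDigest;
    import java.security.Signature;

    public class LookupIndexSketch {
        // i = h(sign_K(u)): a deterministic signature over the username,
        // hashed once more so the index leaks no bits of the signature.
        static byte[] indexFor(KeyPair providerKey, String username)
                throws Exception {
            // SHA256withRSA uses PKCS#1 v1.5 padding, which is
            // deterministic (no per-signature nonce as in DSA).
            Signature vuf = Signature.getInstance("SHA256withRSA");
            vuf.initSign(providerKey.getPrivate());
            vuf.update(username.getBytes(StandardCharsets.UTF_8));
            byte[] signed = vuf.sign();
            // The outer hash hides the signature bits that would otherwise
            // show up in the prefixes revealed by authentication paths.
            return MessageDigest.getInstance("SHA-256").digest(signed);
        }

        public static void main(String[] args) throws Exception {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            byte[] index = indexFor(gen.generateKeyPair(), "alice");
            System.out.println("index is " + (index.length * 8) + " bits");
        }
    }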
>
>> - What still confuses me: If I recall correctly, 'existential
>> forgery' is the lowest level of forgery since the attacker may not
>> have control over the contents of the signed message, but may forge a
>> valid signature for that message.
>> You propose to use an existentially unforgeable signature scheme for
>> your VUF. But:
>> - As you explicitly use an existentially unforgeable signature
>> scheme, why can the index then be forged if bits of this
>> (existentially unforgeable) signature are leaked to an attacker? Is
>> the hash needed when an external attacker has no knowledge of the
>> private key used to sign the index?
>> - You later propose RSA for the VUF, but isn't RSA *the* example for
>> which existential forgery can be done?
>
> We realized the potential dangers of using RSA, so we designed a
> different VUF scheme. I think it’s best if you read the new paper to
> see the description.
>
>
>> - So am I right to assume that the attacker model shifts from an
>> external one to a malicious identity provider? Since the provider knows
>> its private key, the signed data and the signature, what is the hash
>> function hiding here?
>
> See my explanation of the VUF above. CONIKS tries to protect against
> external *and* internal attackers. As I explained above, the hash
> function in the private lookup index is used to protect users’
> privacy from external attackers. Like you said, the key server knows
> who has name-to-key mappings in its own key directory.
> CONIKS deals with the internal attackers (i.e. malicious or
> compromised key server) using the tamper-evident data structure and
> signed tree roots to leave evidence of fake keys and equivocation.
>
>
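Right, that clears things up. And if I follow, the auditor's job then
reduces to a linearity check over the chain of signed tree roots,
something like this (reusing the SignedTreeRoot sketch from above; again
just my guess at the shapes, and a real auditor would of course also
verify the provider's signature on every STR):

    import java.nio.ByteBuffer;
    import java.security.MessageDigest;
    import java.util.List;

    class AuditorSketch {
        // Hash an STR's contents so the next STR can commit to it.
        static byte[] digestOf(SignedTreeRoot str) throws Exception {
            ByteBuffer buf = ByteBuffer.allocate(
                    8 + str.rootHash.length + str.prevStrHash.length);
            buf.putLong(str.epoch).put(str.rootHash).put(str.prevStrHash);
            return MessageDigest.getInstance("SHA-256").digest(buf.array());
        }

        // The linearity check: every STR must commit to its predecessor.
        static boolean chainIsLinear(List<SignedTreeRoot> strs)
                throws Exception {
            for (int i = 1; i < strs.size(); i++) {
                byte[] expected = digestOf(strs.get(i - 1));
                if (!MessageDigest.isEqual(expected,
                                           strs.get(i).prevStrHash))
                    return false; // a fork: evidence of equivocation
            }
            return true;
        }
    }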
>>>> Also, if I see this correctly, they don't do it in their reference
>>>> implementation
>>>> (https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java),
>>>> instead they just use SHA256withRSA...
>>>> So, am I misreading the paper? Or do they actually mean
>>>> sign_K(h(u)), which would be the normal 'hash and sign' operation
>>>> of RSA? Or does it make sense to rehash the signature and their own
>>>> implementation is not consistent with it?
>>>> This is probably no important point, however I want to make sure
>>>> I'm not missing a piece concerning the security of the user
>>>> identities.
>>>
>>> Unfortunately, the reference implementation is still a work in
>>> progress (although I’ve had to take some time off from working on
>>> it due to other research projects), and is missing several features
>>> described in the paper. The SignatureOps class in the reference
>>> implementation simply implements wrappers around Java signature
>>> functions, so that’s not where the user lookup index calculation
>>> happens. The lookup index calculation happens in the unameToIndex()
>>> function in coniks_server.ServerUtils. Now, I will note that the
>>> reference implementation currently *isn’t* doing the proper index
>>> calculation as it only currently hashes the username, so thank you
>>> for making me go back and double-check my code! But it would be
>>> relatively straightforward to change this and to add the digital
>>> signature computation that needs to be done before the hashing.
>> Ok, good to know that, I'll keep it in mind when reading the
>> reference code! However, this is embarrassing: I only had a glance at
>> the reference implementation, sorry for linking to an unrelated spot
>> in the code and making assumptions...
>>
>> I think my next steps will be to read the peer-reviewed version of
>> your paper and to refresh my crypto knowledge a bit. I'd appreciate it
>> if you had some pointers to resources relevant to this project for me
>> (other than classic textbooks). I may even try to have a look at the
>> Micali/Rabin paper on VRFs.
>
> As for good pointers for this project, I’m not sure that the
> Micali/Rabin paper on VRFs specifically is really going to be all that
> helpful for implementing CONIKS because the paper describes VRFs at a
> level of detail that isn’t needed to understand why it may be better
> to use an EC-based VUF construction than RSA. I apologize for the
> confusion arising from the older version of our paper.
>
> If you’re interested in the very formal crypto details of VRFs, you
> should read the paper, but for this project it’s more important to
> understand the security and privacy problems that CONIKS tries to
> solve and how it achieves those goals, and to understand how CONIKS
> fits into the workflow of the existing Tor Messenger service.
> Hopefully, our Usenix paper will explain most of the security and
> privacy design decisions, but definitely feel free to ask more
> questions! Another good resource might be to read about equivocation
> and fork consistency (David Mazieres’ paper on SUNDR first
> introduced this concept), and certificate transparency (there’s a
> good ACM Queue article) for more background on the problems CONIKS
> tries to solve and the design decisions we made. Goldberg et al’s
> paper on preventing DNS zone enumeration is a fairly recent paper that
> also shows how to use VUFs to prevent enumeration of sensitive
> information, so that might be another good resource. For figuring out
> how CONIKS can be integrated into Tor Messenger, I would look at the
> portions of the Tor Messenger code that currently handle encryption
> keys (e.g. key generation, registration, key lookups), and come up
> with a plan of how CONIKS will affect the existing workflow.
>
> I hope this helps!
Yes, thank you, your elaborations helped me a lot! I'll send in my
proposal for this project on Monday, and over the next few weeks I'll
have a look at the promising resources you gave me.
Best Wishes!
Elias
> Best,
> Marcela