[tor-dev] Shared random value calculation edge cases (proposal 250)

Tue Nov 17 17:40:30 UTC 2015

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hello,

I saw that the content of this section in master was corrected, yet the
subtitle is still a little confusing:

4.1.6. Including the ed25519 shared randomness key in votes [SRKEY]

- From the content of this section I understand that we are going to
include the ed25519 medium term signing key, certificate and master
identity key. The content is clear, but maybe we should change the
subtitle too, since there's no SR key:

4.1.6. Including the ed25519 medium term signing key and master
identity key in votes [ED25519ID]

Edge cases are the main reason I suggested in my previous emails to
require at least 2 or 3 reveal rounds in order to allow a dirauth to
participate in the shared randomness calculation for that day. However
this won't help in case a dirauth needs to vote at 01:00 UTC and
doesn't know anything.

The idea of adding flags in the votes so each dirauth can advertise
whether it is participating (has an opinion for the <current> SR or not)
is great and helps us build more defenses. It will probably also make
things easier in the future if we decide to change anything.

What if the consensus for SR calculation defined majority based on the
dirauths actually participating (and advertising so with a flag in the
vote)? Also, the participating / not participating flag should be used
per vote/consensus and split into:

a) we participate in the vote for <current SR>, because either:
   - we know the current SR value for today, so we vote it, or
   - we know the previous SR value and we know for sure whether we
     should follow the disaster protocol or not (in case we are about
     to vote at 01:00 UTC).

b) we are able to participate in this protocol run which will
calculate the SR value for next day (after 00:00 UTC) so we send our
commits/reveals.

This is useful in case we are a dirauth that joined at 00:30 UTC and
couldn't get the _latest_ consensus (to find out whether the 00:00 UTC
consensus was created and, if not, the previous SR value so we can
follow the disaster procedure). We will not have an opinion for the
<current> SR value at 01:00 UTC, but we can start participating in the
protocol run for the next day by sending our commit values. Once we have
decided on a <current> SR value for that day, we save it and vote
normally next time.
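
To make the two flags concrete, here is a rough Python sketch of how a
dirauth might derive them from its local state. Every name in it
(AuthState, sr_flags, the field names) is made up for illustration;
nothing in the proposal or in the current code defines them.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AuthState:
    # All fields are hypothetical names, not taken from prop250.
    current_srv: Optional[bytes]    # SR value already decided for today, if any
    previous_srv: Optional[bytes]   # previous SR value, if we know it
    knows_midnight_outcome: bool    # do we know if the 00:00 UTC consensus was created?
    can_send_commits: bool          # are we able to send commits/reveals this run?


def sr_flags(state: AuthState) -> Tuple[bool, bool]:
    """Return (flag_a, flag_b) as described in points a) and b) above."""
    # a) we have an opinion on <current SR>: either we already know it, or we
    #    know the previous value AND know whether to follow the disaster protocol.
    flag_a = state.current_srv is not None or (
        state.previous_srv is not None and state.knows_midnight_outcome
    )
    # b) we can take part in the run that computes the next day's value.
    flag_b = state.can_send_commits
    return flag_a, flag_b
```

A dirauth that joined at 00:30 UTC without the latest consensus would
get (False, True) here: no opinion on the <current> SR value, but still
able to send commits for the next day.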

So, if we have 5 dirauths running/signing consensus in total, out of
which only 4 participate in the shared randomness protocol, the 4
participating ones should be able to create a valid consensus
themselves with the assurance that the 5th one won't break consensus.

One way to do this is: the dirauth which is not participating will
take the SR value voted by the majority of the participating dirauths
and include that in its consensus and sign. We need at least 3
dirauths agreeing on a SR value in order to accept it.
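
A minimal sketch of that rule in Python, assuming each vote either
carries an SR value or marks the dirauth as not participating (None).
The function name, the dict layout and the constant name are my own
assumptions, not something the proposal specifies.

```python
from collections import Counter
from typing import Dict, Optional

MIN_AGREEING = 3  # "at least 3 dirauths agreeing on a SR value"


def agreed_srv(votes: Dict[str, Optional[bytes]]) -> Optional[bytes]:
    """Pick the SR value that every dirauth (participating or not) puts in
    its consensus, or None if no value is acceptable.

    votes maps a dirauth name to the SR value it voted for, or to None if
    it advertised that it is not participating (hypothetical encoding).
    """
    participating = [v for v in votes.values() if v is not None]
    if not participating:
        return None
    value, count = Counter(participating).most_common(1)[0]
    # Majority is counted over the participating dirauths only.
    if count * 2 > len(participating) and count >= MIN_AGREEING:
        return value
    return None
```

With 5 dirauths of which 4 participate and agree, the 5th simply adopts
the returned value and signs, so all 5 signatures match.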

Is this crazy? It shouldn't open the door to new attacks, since this
doesn't allow a single actor to game it; only the majority could game it.

Some more comments inline.

On 11/12/2015 4:25 PM, George Kadianakis wrote:
> Hello there believers of prop250,
> 
> you can find the latest version of the proposal in the upstream
> torspec repo:
> https://gitweb.torproject.org/torspec.git/tree/proposals/250-commit-reveal-consensus.txt
>
> Implementation is also constantly moving forward as you can see in the
> ticket:
> https://trac.torproject.org/projects/tor/ticket/16943
> 
> Now that we have ironed out the whole voting procedure, it's time
> to finish this up by figuring out the last details of the shared
> random value calculation.
> 
> The logic in the current proposal seems reasonable (see section
> 3.3), but I have some doubts that I wanted to communicate with
> you.
> 
> - I'm afraid of edge cases where different authorities will
> calculate different shared random values. This is bad because if it
> happens at the wrong moment, it might break the consensus.
> 
> For example, imagine that there are 5 authorities that are doing
> the prop250 protocol. Since there are 5 auths, they can create a
> valid consensus on their own with SR info in it.
> 
> Now imagine that one of them, Alice, has a different view of the
> previous_SRV than the others (maybe because she _just_ booted up
> before voting and she doesn't have a previous consensus). In this
> case, Alice will calculate a different SRV than the other 4, and
> hence the consensus will break because 5 signatures could not be
> collected.
> 
> Is this something likely to happen?
> 
> If yes, a way to protect against it is to add a flag on the votes
> denoting support for the SR protocol, and have dirauths toggle it
> off if they are voting for 00:00UTC and they don't know the
> previous_SRV or if they don't have a consensus. This can also be
> used as a torrc-enabled killswitch, if the SR protocol bugs out
> completely and we need to disable it for the sake of the network.
> What do you think?
> 
> - Another bothersome thing is the disaster SRV calculation.
> Specifically, the proposal says:
> 
> =============================== prop 250 ===============================
> If the consensus at 00:00UTC fails to be created, then there will be
> no fresh shared random value for the day.
>
> In this case, and assuming there is a previous shared random value,
> directory authorities should use the following construction as the
> shared random value of the day:
>
> SRV = HMAC(previous_SRV, "shared-random-disaster")
>
> where "previous_SRV" is the previous shared random value.
> ========================================================================
>
> This logic is not implemented in the current code, and it's not
> straightforward to implement. Again because the previous_SRV is
> blended in the formula, but also because it's not easy to know that
> "the consensus at 00:00UTC fails to be created".
> 
> For example, if you are a dirauth that just started up at 00:30UTC,
> and you asked for the previous consensus and you were given the
> 23:00UTC consensus, then you won't know that the 00:00 consensus
> was not created and that you need to do the disaster procedure.
> This will again break the consensus.
> 
> Not sure if this is a likely scenario as well, and if we should
> protect against it. What do you think?
> 
> It depends on the logic that authorities have for fetching
> consensuses. Do they ensure that they always have the latest
> consensus?  Do we need to add such logic as part of prop250? :/
> 

I don't think I understand this the way I should. If we join at 00:30
UTC, instead of asking for previous consensus, why don't we ask for
_latest_ consensus from every other dirauth? And if we are given the
23:00 UTC consensus at 00:30 UTC, we know the consensus at 00:00 UTC
was not created and we need to follow the disaster procedure.

If we have the 23:00 UTC consensus, we know the previous SR value so
we can participate. If we couldn't get it, we advertise that we are
not participating and sign whatever the participating majority agrees
on, so we don't break consensus.
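
Roughly, in Python, the check and the disaster formula quoted above
could look like this. It is only a sketch: the proposal text quoted here
does not say which hash the HMAC uses or which argument is the key, so
SHA-256 and "previous_SRV is the key" are my assumptions, and both
function names are invented.

```python
import hashlib
import hmac
from datetime import datetime, timezone


def midnight_consensus_missing(latest_valid_after: datetime, now: datetime) -> bool:
    """True if the latest consensus we could fetch is older than today's
    00:00 UTC, i.e. the 00:00 UTC consensus was not created.

    Both arguments are expected to be timezone-aware UTC datetimes.
    """
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return latest_valid_after < midnight


def disaster_srv(previous_srv: bytes) -> bytes:
    """SRV = HMAC(previous_SRV, "shared-random-disaster").

    Assumptions: previous_SRV is the HMAC key, SHA-256 is the hash.
    """
    return hmac.new(previous_srv, b"shared-random-disaster", hashlib.sha256).digest()


# Example: we join at 00:30 UTC and the latest consensus we get is 23:00 UTC.
now = datetime(2015, 11, 17, 0, 30, tzinfo=timezone.utc)
latest = datetime(2015, 11, 16, 23, 0, tzinfo=timezone.utc)
if midnight_consensus_missing(latest, now):
    srv = disaster_srv(b"\x00" * 32)  # placeholder previous SR value
```

This only works if "latest" really is the latest consensus any dirauth
has, which is why asking every other dirauth matters.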

> A way to protect against it is to use the "SR support" vote flag I
> talked about before, and toggle it off if you are a dirauth and you
> don't have the latest consensus. Terrible? Would that even allow
> dirauths to bootstrap SR?
> 
I don't see why this would be terrible. What's a plausible thing that
can happen so that a dirauth can't get the latest consensus?

> - Another interesting part of prop250 is:
> 
> =============================== prop 250 ===============================
> If the shared random value contains reveal contributions by less than
> 3 directory authorities, it MUST NOT be created. Instead, the old
> shared random value should be used as specified in section
> [SRDISASTER].
> ========================================================================
>
> Do you think this is useful?
> 
> The fact that we use consensus methods, ensures us that at least 5
> dirauths understand the SR protocol, otherwise we don't do it.
> Should we care about the number of reveal values? And why should
> the constant be 3, and not 5 or 2?
> 

Yes, I think this is useful and 3 is a fair constant, especially
combined with the participating or not participating flags. I guess
the argument here is that it should be quite hard to have 3 dirauths
colluding for an attack.

> Those were my doubts.
> 
> Sorry for the extra confusion, but I'm currently reading the
> proposal and trying to minimize the amount of edge cases that can
> happen in the implementation (especially if they result in breaking
> the consensus for everyone). Maybe I'm trying too hard, and there
> are already multiple such edge cases in the consensus protocol that
> just never happen and are not worth fixing.
> 

Your concerns make sense to me; however, this could also be true. I
don't know enough to confirm or refute it, but I'm looking forward to
more comments.

> In any case, the shared random value calculation seems to be the
> last piece of the puzzle here, so let's figure it out and finish
> up!
> 
> Thanks!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (MingW32)

iQEcBAEBCAAGBQJWS2aOAAoJEIN/pSyBJlsRPCQIALUKqo1nvVTYV0WQqrlvnRpm
ilSulg+WZNuiyB/uxxTfk6DtmBz6oqwsO2hPwr5BPzJO8SYBHm7jSGxalTOUh0nR
MEgVbjRYMOJZGqECsioxjhdOqoB7p8oK+rhnSRmBy/HxTVqb6FkkGr5Psil+RrQL
JPOlkm6r0ptF10Fg+lVbYXyiM2GGB4Ggup76MOX4MZ0Lr12aWJmrLk17JUhXk2r5
k7akAREBhwmsHnkJ1XA27lVMcBYX9gz1IR85wDUgBFdf8WI3FDVck2MPUTsp2eai
xeLs6XAfvBfKcaQMolxsJ01rxUps0V8no8sjqOH4McdYJhXDfpdLnObFqoSj3no=
=wRgo
-----END PGP SIGNATURE-----