[tor-dev] SkypeMorph

Wed Mar 28 06:28:24 UTC 2012

On Mon, Mar 26, 2012 at 03:04:47PM -0400, Hooman wrote:
> >Can you give us some guesses about next steps for resolving these issues
> >(or explaining why they aren't actually as worrisome as they appear)?
> >
> >A) It looks like the transport has no notion of adapting to network
> >conditions, i.e. congestion control. So it will basically fall apart on
> >a low-bandwidth or congested network.
> True, but as mentioned in section 8.2 of the technical report, this
> can be fixed by considering Skype video calls on different networks,
> depending on the network status. (The way Skype's bandwidth usage
> varies with available bandwidth has been studied; see, for example,
> http://www.tlc-networks.polito.it/oldsite/mellia/papers/skype_info08.pdf)

Isn't that like saying TCP congestion control can be implemented by
sampling capacity and traffic load on a variety of networks, and then
hard-coding the TCP window and resend algorithms to suit the network
you think you're running on?

I'm not worried here so much about whether your flow adapts to network
conditions like a real Skype flow would (though I agree that's an
issue). I'm worried about whether your flow would fail to back off at
all in the face of congestion, leading to a) Skypemorph not getting its
packets through because so many of them get dropped, and b) Skypemorph
ruining the network it's running on.
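
To make the back-off concern concrete, here is a minimal sketch of the
sort of additive-increase/multiplicative-decrease rule that TCP-style
senders use. The function name, the constants, and the idea of capping
at the 43KB/s target are assumptions for illustration; nothing here
comes from the Skypemorph code or from Skype itself.

```python
def aimd_update(rate_kBps, loss_seen,
                increase_kBps=1.0, decrease_factor=0.5,
                floor_kBps=1.0, ceiling_kBps=43.0):
    """Return the next sending rate (KB/s) after one feedback interval.

    All parameters are hypothetical: the ceiling is the 43KB/s target
    mentioned in this thread, the other constants are arbitrary.
    """
    if loss_seen:
        # Multiplicative decrease: back off sharply when packets are dropped.
        return max(floor_kBps, rate_kBps * decrease_factor)
    # Additive increase: probe gently for more capacity, up to the target.
    return min(ceiling_kBps, rate_kBps + increase_kBps)
```

A sender with no rule of this kind keeps pushing 43KB/s into a link
that is already dropping its packets, which is exactly cases a) and b)
above.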

> >B) It sends at a constant rate of 43KB/s in each direction all the
> >time. Even if users are willing to tolerate that, it doesn't scale on
> >the bridge/relay side if there are lots of users. I wonder how feasible
> >a "traffic shaping" approach would be (where the flow rate drops off
> >if there's no underlying traffic), and how much that would screw with
> >your statistics. Which leads to:
> 43KB/s is per connection, so each client gets this bandwidth, while
> the bridge can have multiple connections.

Right. But if a bridge wants to handle 10 Skypemorph users, the bridge
needs to be sending out 430KB/s all the time. That means volunteer users
can't operate these bridges at home (unless they live in Japan, Korea,
or Sweden I guess). It also greatly increases the overall traffic cost
of running a bridge.

For example, during the February weekend when Iran blocked SSL, my
obfsproxy bridge was easily handling ~500 users at once. With Skypemorph
that's 172 Mbit/s of duplex traffic?
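
The arithmetic behind those numbers, as a small sketch. It assumes
1 KB = 1000 bytes and a flat 43KB/s per client in each direction; the
function name is made up for illustration.

```python
def bridge_load(users, per_client_kBps=43):
    """Return (KB/s, Mbit/s) a bridge must sustain in each direction."""
    total_kBps = users * per_client_kBps
    # 8 bits per byte, 1000 kbit per Mbit (decimal units assumed).
    total_mbit = total_kBps * 8 / 1000
    return total_kBps, total_mbit

ten_users = bridge_load(10)
five_hundred_users = bridge_load(500)
```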

> >C) The packet size and timing distributions only aim to match the
> >first-order properties of Skype. At the same time, DPI vendors have
> >already been in a battle with Skype traffic for a while now. How advanced
> >do you think DPI vendors are at detecting Skype-like traffic, and thus at
> >distinguishing your traffic from real Skype traffic? Similarly, how bad is
> >it that you don't follow through with the TCP side of the Skype handshake?
> The TCP connections are mostly control connections that send only a
> small number of messages during the call. We actually have some
> ideas on how to deal with this, such as handing the sockets for these
> connections to our software after we fake a call.

Ok.

What do you think about the "first-order properties" question about size
and timing (e.g. I bet real Skype traffic does not draw its packet size
and timing independently from the size and timing of the previous packet)?
Combined with the fact that DPI vendors have quite a bit of experience
targeting Skype traffic in particular, I worry that they've thought
about this specific question more than we have.
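
One simple check along these lines is the lag-1 autocorrelation of a
packet-size (or inter-packet-gap) sequence: a flow whose values are
drawn independently should sit near zero, while a flow with
packet-to-packet dependence would not. The sketch below only
illustrates that statistic. The function name and both synthetic
sequences are hypothetical, and I'm not claiming this is what any DPI
box actually computes.

```python
import random

def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation of a numeric sequence."""
    n = len(values)
    if n < 2:
        return 0.0
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        # Constant sequence: correlation is undefined, report 0.0.
        return 0.0
    numer = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    return numer / denom

# Synthetic stand-in for sizes drawn independently per packet.
rng = random.Random(0)
independent_sizes = [rng.randint(40, 1400) for _ in range(10000)]

# Synthetic stand-in for sizes that depend on the previous packet
# (a slow ramp that repeats every 100 packets).
dependent_sizes = [i % 100 for i in range(10000)]

independent_score = lag1_autocorrelation(independent_sizes)
dependent_score = lag1_autocorrelation(dependent_sizes)
```

If real Skype traces score well away from zero on a statistic like
this and the morphed flow does not, matching the marginal size and
timing distributions won't be enough.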

> >D) The morphing output is basically identical to the naive shaping. Are
> >you sure you did it right?
> 
> So as mentioned in the report, the original traffic morphing does
> not consider timing at all (which makes it less effective against
> DPIs) and it aims at minimizing the overhead, i.e. the number of
> padding bytes sent on the wire.

Right. Minimizing padding bytes on the wire is a big reason to like it.

> When we introduced the inter-packet
> timing feature, it was no longer possible to go with the same
> construction, since packets may not be sent right away. As a result
> we tried a different approach for traffic morphing: we buffered
> packets received from Tor, then when it is time to send the next
> packet, we simply estimate the original packet size by a sample from
> Tor's packet size distribution. I know there are other ways this
> can be done, but in our experiment we didn't observe any tangible
> difference in the outcome.

Hrm. So that means your traffic morphing algorithm doesn't try to reduce
padding bytes? That makes your graph 5 make more sense. But is it really
accurate to call it morphing still? It would be great to explore that
tradeoff more.

--Roger