[tor-bugs] #12505 [BridgeDB]: Refactor BridgeDB's hashrings (was: Refactor Bridges.py and Dist.py in BridgeDB)

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue Mar 31 23:35:59 UTC 2015


#12505: Refactor BridgeDB's hashrings
-------------------------+-------------------------------------------------
     Reporter:  isis     |      Owner:  isis
         Type:  defect   |     Status:  assigned
     Priority:  major    |  Milestone:
    Component:           |    Version:
  BridgeDB               |   Keywords:  bridgedb-dist, bridgedb-1.0.x,
   Resolution:           |  isis2014Q3Q4, isisExB
Actual Points:           |  Parent ID:
       Points:           |
-------------------------+-------------------------------------------------
Description changed by isis:

Old description:

> I have slowly been refactoring all of BridgeDB. Code which has been
> already refactored is named with "proper" (according to PEP8) lower-cased
> module names in `lib/bridgedb` in the BridgeDB repository.
>
> Some of the largest, least-unitested, (and most difficult to refactor)
> sections of BridgeDB's code are the `Bridges.py` module and the `Dist.py`
> module. The biggest problems are:
>
>  1. The code for the various types of hashrings in `bridgedb.Bridges` is
> a complete mess. In some places, hashrings are referred to as
> `BridgeHolders`, in other places as `Buckets` (though not to be confused
> with the `Bucket.py` module, and in other places as "hashrings".
> Subclassing is haphazard and confusing to say the least. In addition, the
> hashrings are not algorithmically as efficient as they could be.
> Throughout the hashring code were old-style classes, unused methods,
> half-implemented methods, and unnecessary parameters. All of this code
> needs some serious help.
>
>  2. The Distributors in `bridgedb.Dist` inherit, for some unknown reason,
> from the improperly implemented "base class"
> `bridgedb.Bridges.BridgeHolder`. One, this isn't how one implements a
> proper Python base class (by deriving from a class with `__metaclass__ =
> abc.ABCMeta`). Two, why a `Distributor` should be a subclass of a the
> "base" hashring class is unclear and unnecessary, and we should move away
> from that. A `Distributor` is something which distributes bridges to
> users, not some weird half-thought-out hashring subclass.
>
>  3. The various Distributors in `bridgedb.Dist` should be different
> modules.
>
>  4. Almost none of the code in `Bridges.py` and `Dist.py` is unittested.
> These modules have the highest number of untested lines of code currently
> in BridgeDB.
>
> After this is finished, I am mostly willing to tag `BridgeDB-1.0.0`.
> There may be a few other refactoring that should get finished before
> then, but this is the main piece remaining to be completed.

New description:

 I have slowly been refactoring all of BridgeDB. Code which has been
 already refactored is named with "proper" (according to PEP8) lower-cased
 module names in `lib/bridgedb` in the BridgeDB repository.

 Some of the largest, least-unitested, (and most difficult to refactor)
 sections of BridgeDB's code are the `Bridges.py` module and the `Dist.py`
 module. This code primarily controls the hashrings and the distributors
 (which for some reason are subclasses of the very-confused hashrings
 structures).

 The biggest problems are:

  1. The code for the various types of hashrings in `bridgedb.Bridges` is a
 complete mess. In some places, hashrings are referred to as
 `BridgeHolders`, in other places as `Buckets` (though not to be confused
 with the `Bucket.py` module, and in other places as "hashrings".
 Subclassing is haphazard and confusing to say the least. In addition, the
 hashrings are not algorithmically as efficient as they could be.
 Throughout the hashring code were old-style classes, unused methods, half-
 implemented methods, and unnecessary parameters. All of this code needs
 some serious help.

  2. The Distributors in `bridgedb.Dist` inherit, for some unknown reason,
 from the improperly implemented "base class"
 `bridgedb.Bridges.BridgeHolder`. One, this isn't how one implements a
 proper Python base class (by deriving from a class with `__metaclass__ =
 abc.ABCMeta`). Two, why a `Distributor` should be a subclass of a the
 "base" hashring class is unclear and unnecessary, and we should move away
 from that. A `Distributor` is something which distributes bridges to
 users, not some weird half-thought-out hashring subclass.

  3. The various Distributors in `bridgedb.Dist` should be different
 modules.

  4. Almost none of the code in `Bridges.py` and `Dist.py` is unittested.
 These modules have the highest number of untested lines of code currently
 in BridgeDB.

 After this is finished, I am mostly willing to tag `BridgeDB-1.0.0`. There
 may be a few other refactoring that should get finished before then, but
 this is the main piece remaining to be completed.

--

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/12505#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list