[tor-dev] [RFC] Proposal: A First Take at PoW Over Introduction Circuits
Jim Newsome
jnewsome at torproject.org
Thu Sep 24 16:54:29 UTC 2020
On 9/22/20 07:10, George Kadianakis wrote:
> George Kadianakis <desnacked at riseup.net> writes:
>
>> tevador <tevador at gmail.com> writes:
>>
>>> Hi all,
>>>
> Hello,
>
> I have pushed another update to the PoW proposal here:
> https://github.com/asn-d6/torspec/tree/pow-over-intro
> I also (finally) merged it upstream to torspec as proposal #327:
> https://github.com/torproject/torspec/blob/master/proposals/327-pow-over-intro.txt
>
> The most important improvements are:
> - Add tevador as an author.
> - Update PoW algorithms based on tevador's Equix feedback.
> - Update effort estimation algorithm based on tevador's simulation.
> - Include hybrid attack section.
> - Remove a bunch of blocker tags.
>
> Two things I'd like to work more on:
>
> - I'd like people to take tevador's Equix PoW function and run it on
> their boxes and post back benchmarks of how it performed.
I shared some results privately with George, and he suggested sharing
them with the list as well. Results below.
> Particularly
> so if you have a GPU-enabled box, so that we can get some benchmarks
> from GPUs as well. That will help us tune the proposal even more.
For anyone else following along or also contributing benchmarks, George
clarified for me that the equix benchmark isn't capable of utilizing the
GPU.
My results:
First results are on my W530 laptop: an i7 with 4 cores (hyperthreaded
to 8), with moderate activity in the background.
I stumbled across a weird artifact when using more threads than
processors: the solutions/sec figure reported by the benchmark keeps
increasing linearly with #threads. The wall-clock time for the benchmark
itself (measured with `time`) shows the expected trend, though: linear
scaling only up to 4 threads (the number of physical cores), a little
bump at 8 (using the hyperthreaded virtual cores), and no improvement
past that.
Further below are results on my pinephone.
$ time ./equix-bench --threads 1
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 1) ...
1.910000 solutions/nonce
227.714446 solutions/sec. (1 thread)
20301.439170 verifications/sec. (1 thread)
real 0m4.242s
user 0m4.230s
sys 0m0.012s
$ time ./equix-bench --threads 2
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 2) ...
1.910000 solutions/nonce
450.100153 solutions/sec. (2 threads)
17925.519934 verifications/sec. (1 thread)
real 0m2.184s
user 0m4.294s
sys 0m0.004s
$ time ./equix-bench --threads 4
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 4) ...
1.910000 solutions/nonce
876.343564 solutions/sec. (4 threads)
18863.079719 verifications/sec. (1 thread)
real 0m1.154s
user 0m4.400s
sys 0m0.012s
$ time ./equix-bench --threads 8
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 8) ...
1.910000 solutions/nonce
1089.198671 solutions/sec. (8 threads)
17808.857809 verifications/sec. (1 thread)
real 0m0.981s
user 0m7.019s
sys 0m0.052s
$ time ./equix-bench --threads 16
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 16) ...
1.910000 solutions/nonce
2183.232035 solutions/sec. (16 threads)
18936.014118 verifications/sec. (1 thread)
real 0m1.025s
user 0m7.021s
sys 0m0.032s
$ time ./equix-bench --threads 32
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 32) ...
1.910000 solutions/nonce
4397.259598 solutions/sec. (32 threads)
17754.229411 verifications/sec. (1 thread)
real 0m1.026s
user 0m6.961s
sys 0m0.049s
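As a rough cross-check of the laptop runs above, the throughput implied by the wall-clock time can be compared with the figure the benchmark reports. This is only a sketch: it assumes each run finds 500 * 1.91 solutions in total (taken from the "Solving nonces 0-499" and "solutions/nonce" lines), and it divides by the whole-process `real` time, which also covers startup and the verification pass, so it slightly understates the solve rate. The names `RUNS` and `wall_clock_rate` are made up for illustration. With these inputs the ratio stays close to 1 up to 8 threads and jumps to roughly 2.3 and 4.7 at 16 and 32 threads.

```python
# Total solutions per run: 500 nonces * 1.91 solutions/nonce (from the output above).
TOTAL_SOLUTIONS = 500 * 1.91

# (threads, reported solutions/sec, wall-clock "real" seconds), copied from the runs above.
RUNS = [
    (1, 227.714446, 4.242),
    (2, 450.100153, 2.184),
    (4, 876.343564, 1.154),
    (8, 1089.198671, 0.981),
    (16, 2183.232035, 1.025),
    (32, 4397.259598, 1.026),
]


def wall_clock_rate(real_seconds):
    """Solutions per second implied by the whole-process wall-clock time."""
    return TOTAL_SOLUTIONS / real_seconds


for threads, reported, real in RUNS:
    implied = wall_clock_rate(real)
    # A ratio well above 1 means the reported rate is not backed by elapsed time.
    print(f"{threads:>2} threads: reported {reported:8.1f}/s, "
          f"wall-clock {implied:7.1f}/s, ratio {reported / implied:.2f}")
```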
$ cat /proc/cpuinfo
<snip>
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i7-3740QM CPU @ 2.70GHz
stepping : 9
microcode : 0x21
cpu MHz : 1856.366
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic
popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault
epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid
fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
l1tf mds swapgs itlb_multihit srbds
bogomips : 5387.48
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
Similar behavior on the (4-core aarch64) pinephone:
$ time ./equix-bench --threads 1
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 1) ...
1.910000 solutions/nonce
23.920219 solutions/sec. (1 thread)
4477.199102 verifications/sec. (1 thread)
real 0m 40.35s
user 0m 40.12s
sys 0m 0.01s
$ time ./equix-bench --threads 2
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 2) ...
1.910000 solutions/nonce
47.683428 solutions/sec. (2 threads)
4384.937853 verifications/sec. (1 thread)
real 0m 20.45s
user 0m 40.20s
sys 0m 0.06s
$ time ./equix-bench --threads 4
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 4) ...
1.910000 solutions/nonce
94.149494 solutions/sec. (4 threads)
4359.695415 verifications/sec. (1 thread)
real 0m 10.47s
user 0m 40.71s
sys 0m 0.08s
$ time ./equix-bench --threads 8
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 8) ...
1.910000 solutions/nonce
188.808873 solutions/sec. (8 threads)
4348.479398 verifications/sec. (1 thread)
real 0m 10.50s
user 0m 40.61s
sys 0m 0.07s
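The pinephone scaling can be summarized the same way from the `real` times alone, as parallel speedup relative to the 1-thread run. This is a minimal sketch using only the wall-clock numbers above; `REAL_SECONDS`, `PHYSICAL_CORES`, and `speedup` are illustrative names, and efficiency is computed against at most 4 cores because the message describes the phone as 4-core.

```python
# Wall-clock "real" seconds per thread count, copied from the pinephone runs above.
REAL_SECONDS = {1: 40.35, 2: 20.45, 4: 10.47, 8: 10.50}
PHYSICAL_CORES = 4  # "4-core aarch64", as stated in the message


def speedup(threads):
    """Wall-clock speedup of a run relative to the single-thread run."""
    return REAL_SECONDS[1] / REAL_SECONDS[threads]


for threads in sorted(REAL_SECONDS):
    s = speedup(threads)
    # Efficiency is measured against the cores that can actually run in parallel.
    efficiency = s / min(threads, PHYSICAL_CORES)
    print(f"{threads} threads: speedup {s:.2f}x, efficiency {efficiency:.0%}")
```

On these numbers the speedup flattens at 4 threads, while the benchmark's own solutions/sec figure doubles again at 8.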
$ cat /proc/cpuinfo
<snip>
processor : 3
BogoMIPS : 48.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4