[tor-relays] [WARN] Your system clock just jumped 100 seconds forward; assuming established circuits no longer work.

Thu Feb 20 07:56:07 UTC 2014

On 2/20/14, grarpamp <grarpamp at gmail.com> wrote:
> Since you say it repeats you oppurtunity to check the
> system clock first:
> - configure tor to syslog

added

> - send an ntpdate -q pool to syslog every 5min,
>  remove when solved.

Do you mean disable ntpd daemon, and run this instead? Sounds easy
enough, I imagine:
service ntp stop; while true; do ntpd -gqn -l /var/log/syslog; sleep 5m; done
service ntp start

Something like -L seems to be needed but -L doesn't stop, the
following - I don't know what these sorts of lines are (reading docs
now, but it might take a while to figure it out) - ie I don't know why
ntpd listens on LAN:
20 Feb 18:31:26 ntpd[22655]: Listen normally on 3 eth1 192.168.5.2 UDP 123

BTW, looks like ntpd has the same options as ntpdate, intended for the
same functions (at least on Debian, says ntpdate is deprecated).

Now, here's the last short while of this ntpd script output:
ntpd: time slew -0.015558s
ntpd: time slew +0.062676s
ntpd: time set +0.191404s
ntpd: time set +0.187068s
ntpd: time slew -0.029898s
ntpd: time slew +0.027801s

So it seems that the slew is somehow not being set properly, or
rather, now that ntpd is being run every 5 minutes, it gets to add
about .2 of a second pretty regularly (I'll continue to watch of
course), so something definitely seems amiss. I'm not loading the
system default ntpd config file.

It looks like time could be running slow and being _not_ updated,
except a few times a day, resulting in the 2-3minute jump.

> - send *.* to /var/log/all

What's that mean? I need just a little more handholding sorry. Is this
intended to be a torrc config? It sounds like a good idea to send
everything to one log file for a while, till I debug this.

I'll get back to this, but just for the moment, I notice some
interesting repeating lines all over daemon.log re ntpd (but not
recently):

...
daemon.log:Feb 18 03:10:08 lt8 ntpd[2845]: peer 203.24.120.194 now valid
daemon.log:Feb 18 03:15:37 lt8 ntpd[2845]: peer 203.59.9.110 now invalid
daemon.log:Feb 18 03:28:34 lt8 ntpd[2843]: adjusting clock frequency
by 18.354661 to 3.931566ppm
daemon.log:Feb 18 04:08:41 lt8 ntpd[2845]: peer 203.59.9.110 now valid
daemon.log:Feb 18 04:15:29 lt8 ntpd[2845]: peer 118.88.20.194 now invalid
daemon.log:Feb 18 04:15:34 lt8 ntpd[2845]: peer 130.102.2.123 now invalid
daemon.log:Feb 18 04:15:47 lt8 ntpd[2845]: peer 203.171.85.237 now invalid
daemon.log:Feb 18 04:15:52 lt8 ntpd[2845]: peer 202.127.210.37 now invalid
daemon.log:Feb 18 04:16:01 lt8 ntpd[2845]: peer 202.147.104.52 now invalid
daemon.log:Feb 18 04:30:05 lt8 ntpd[2845]: peer 203.59.9.110 now invalid
daemon.log:Feb 18 04:59:11 lt8 ntpd[2845]: 10 out of 20 peers valid
daemon.log:Feb 18 04:59:11 lt8 ntpd[2845]: bad peer from pool
0.debian.pool.ntp.org (130.102.2.123)
daemon.log:Feb 18 04:59:11 lt8 ntpd[2845]: bad peer from pool
0.debian.pool.ntp.org (202.127.210.37)
daemon.log:Feb 18 04:59:11 lt8 ntpd[2845]: bad peer from pool
1.debian.pool.ntp.org (203.59.9.110)
...

nothing similar in today's daemon.log though.

> and see what you find around where the date lines
> start to slide or jump past each other. Graph it
> with rrd/gnuplot if you want. epoch format helps there.
> Your timekeeping is probably just broke somewhere,

sounds like it.

> ie: your system has bad clock drift and also can't sync
> because there's a firewall blocking ntp, so off goes
> your time. Check that first.

Here's what I'm getting every 5 minutes:

Feb 20 18:46:59 lt8 ntpd[22662]: ntpd 4.2.6p5 at 1.2349-o Sat May 12
09:07:18 UTC 2012 (1)
20 Feb 18:46:59 ntpd[22662]: proto: precision = 2.514 usec
20 Feb 18:46:59 ntpd[22662]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Feb 20 18:46:59 lt8 ntpd[22662]: logging to file /var/log/syslog
20 Feb 18:46:59 ntpd[22662]: Listen and drop on 1 v6wildcard :: UDP 123
20 Feb 18:46:59 ntpd[22662]: Listen normally on 2 lo 127.0.0.1 UDP 123
20 Feb 18:46:59 ntpd[22662]: Listen normally on 3 eth0 192.168.5.2 UDP 123
20 Feb 18:46:59 ntpd[22662]: Listen normally on 4 eth0
fe80::202:8aff:fe21:77f1 UDP 123
20 Feb 18:46:59 ntpd[22662]: Listen normally on 5 lo ::1 UDP 123
20 Feb 18:46:59 ntpd[22662]: peers refreshed
20 Feb 18:46:59 ntpd[22662]: Listening on routing socket on fd #22 for
interface updates
20 Feb 18:47:12 ntpd[22662]: ntpd: time slew +0.027801 s

again, I don't (yet) understand why ntpd is Listening as seen above.

there appears to be no problems with firewall, that I can tell yet
anyway (also eg the logs above - I've done some grepping etc over the
last few days of logs and everything outgoing appears to work, and
incoming ORPort DIRPort are successfully forwarded and the relay is
chugging away nicely within its limits. Also Internet generally works
(downloading debian boot cd for example) - and no ntp server
connection failures I could spot in the logs just before.

> It's not your isp since you say you're using
> ntpd against external debian pool and odds are
> not someone stuffing you bogus timedata. Though
> your ntp cli query may not be the same as the ntp
> daemon query re: udp/tcp port they use, stateful
> firewall timeout, etc... ie: somehow ntp blocked.
> Or stale/unwriteable startup drift files on disk. If ntpdate
> is set, then under ntpd running for 15min+,
> if ntpq -np does not show one asterisk(*) in front

I am running the 5-minutely script at the moment. But perhaps I should
keep running the system ntpd at the same time, so I can do this
suggested test?

> of one of the nonlocal peers, you're not synced.
> And you'll be no luck until you are, fix that first.

Thank you so much for the pointers  - I've a bit of monitoring and
testing and reading to do it seems.

> Not likely to be Tor or kernel, but you can then next
> - move the drive to another known good mobo/cpu box.
> - do the kernel event logging thing
> - bump tor's loglevel

Thanks again,
Zenaan