Monthly Archives: August 2018

Some NTP Issues


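NTP status and what it means

NTP stands for Network Time Protocol, a protocol for synchronizing clocks over a network.

When you run your own NTP server on Linux, the ntpq command shows the server's current status.

On a Linux NTP client, ntptime shows the current synchronization state, and ntpdate synchronizes the time manually.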

ntpdate

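Forcing a time sync

  1. If ntpd is running, stop it first, then sync:
service ntp stop
ntpdate time.nist.gov
service ntp start
  2. Or add the -u flag, which makes ntpdate send from an unprivileged port so it does not conflict with a running ntpd:
ntpdate -u time.nist.gov
  3. Or set the time manually:
date -u --set='2016-09-20 08:14:17.427319'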

ntpq

ntpq -pn
root@owning:~# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+193.228.143.14  192.36.143.151   2 u  142  256  317  247.201   20.400  15.036
-202.112.10.37   10.3.8.150       5 u  177  256  373   49.110    5.891   1.191
+61.216.153.106  118.163.81.63    3 u  151  256  377   53.737   -4.461   1.382
*120.25.108.11   10.137.53.7      2 u  215  256  377    7.486    7.879   1.613
-91.189.89.198   17.253.34.125    2 u  215  256  377  402.811  -35.993   9.119
root@owning:~#

Meaning of the output

  • remote and refid: the remote NTP server, and the time source that server is itself synchronized to
  • st: stratum of server
    # from https://serverfault.com/questions/277375/ntpdate-d-server-dropped-strata-too-high
    
    NTP increases the stratum for each level in the hierarchy - a NTP server pulling time from a "stratum 1" server would advertise itself as "stratum 2" to its clients.
    
    A stratum value of "16" is reserved for unsynchronized servers meaning that your internal NTP server at 192.168.92.82 thinks not to have a reliable timesource (i.e. not synchronizing to a higher-level stratum server).
    
  • t: type of server (local, unicast, multicast, or broadcast)

  • poll: how frequently to query server (in seconds)
  • when: how long since last poll (in seconds)
  • reach: octal bitmask of success or failure of last 8 queries (left-shifted); 377 = 11111111 = all recent queries were successful; 257 = 10101111 = 4 most recent were successful, 5 and 7 failed

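    The results of the last eight queries (one query per poll interval).

    This value is quite tricky: it is an 8-bit register that is converted to octal and then displayed as a string.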

    00 000 000 -> 0
    00 000 001 -> 1
    00 000 011 -> 3
    00 000 111 -> 7
    00 001 111 -> 17
    00 011 111 -> 37
    00 111 111 -> 77
    01 111 111 -> 177
    11 111 111 -> 377
    
    10 101 111 -> 257
    

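    In testing, after the NTP server restarts, the remote gets the * marker once reach becomes 17 (that is, the four most recent queries succeeded: the one at startup plus three more, each one poll interval apart). An example follows.

    # Third query: reach is 7.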
    Wed Aug 15 11:56:02 CST 2018
     remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    202.108.6.95    10.6.63.22       2 u   65   64    7   24.777  -15.577   0.082
    
    # Fourth query: reach is 17.
    Wed Aug 15 11:56:03 CST 2018
     remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    *202.108.6.95    10.6.63.22       2 u    -   64   17   24.327  -15.328   0.209
    
  • delay: network round trip time (in milliseconds)
  • offset: difference between local clock and remote clock (in milliseconds)
  • jitter: difference of successive time values from server (high jitter could be due to an unstable clock or, more likely, poor network performance)
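
The reach encoding described above is easy to misread, so here is a minimal Python sketch that turns the octal string back into eight success/failure flags. The function name decode_reach is my own and is not part of any NTP tool; the only assumption is that bit 0 holds the most recent query, as in the 257 example.

```python
from typing import List


def decode_reach(reach: str) -> List[bool]:
    """Decode the octal reach string shown by ntpq into 8 booleans.

    Index 0 is the most recent query, index 7 the oldest.
    """
    value = int(reach, 8)  # ntpq prints the 8-bit register in octal
    if not 0 <= value <= 0o377:
        raise ValueError(f"reach out of range: {reach!r}")
    # Bit 0 (least significant) holds the newest result.
    return [bool((value >> i) & 1) for i in range(8)]


if __name__ == "__main__":
    for sample in ("377", "257", "17", "7"):
        flags = decode_reach(sample)
        # Print oldest-to-newest so it lines up with the binary column above.
        print(sample.rjust(3), "".join("1" if ok else "0" for ok in reversed(flags)))
```

For 257 this gives four successes, then a failure, a success, a failure and a success, counting back from the newest query.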


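What do the symbols in front of remote mean?

Looking at the NTP source code, the symbol shown is chosen from the status value returned when the upstream server is queried.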

char flash3[] = " x.-+#*o"; /* flash decode for peer status version 3 */

So you have to consult RFC 1305 to learn what each symbol means:

    ' '  0, rejected
    'x'  1, passed sanity checks (tests 1 through 8 in Section 3.4.3)
    '.'  2, passed correctness checks (intersection algorithm in Section 4.2.1)
    '-'  3, passed candidate checks (if limit check implemented)
    '+'  4, passed outlyer checks (clustering algorithm in Section 4.2.2)
    '#'  5, current synchronization source; max distance exceeded (if limit check implemented)
    '*'  6, current synchronization source; max distance okay
    'o'  7, reserved

Only when the NTP server's remote (upstream) is fully synchronized, i.e. marked with *, can other NTP clients synchronize their time from this server.
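
To make the lookup concrete, here is a small Python sketch that maps the first character of an ntpq -pn peer line to its meaning. The string is the same as flash3[] in the source snippet, and the wording is taken from the RFC 1305 list; the names FLASH3, MEANINGS and tally_meaning are made up for this example.

```python
# Same order as flash3[] in the NTP source: the index is the selection status.
FLASH3 = " x.-+#*o"

# Short meanings, indexed by selection status (wording from RFC 1305).
MEANINGS = [
    "rejected",
    "passed sanity checks",
    "passed correctness checks",
    "passed candidate checks",
    "passed outlyer checks",
    "current synchronization source; max distance exceeded",
    "current synchronization source; max distance okay",
    "reserved",
]


def tally_meaning(peer_line: str) -> str:
    """Return the meaning of the symbol in column 0 of an ntpq peer line."""
    symbol = peer_line[:1]
    index = FLASH3.find(symbol) if symbol else -1
    if index < 0:
        raise ValueError(f"no known symbol at start of line: {peer_line!r}")
    return MEANINGS[index]


if __name__ == "__main__":
    print(tally_meaning("*120.25.108.11   10.137.53.7      2 u  215  256  377"))
    print(tally_meaning("+193.228.143.14  192.36.143.151   2 u  142  256  317"))
```

Note that the line must keep its leading space when there is no symbol, otherwise the first digit of the address lands in column 0 and the lookup fails.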

The t (type) column is chosen as follows (from the NTP source code):

    case MODE_CLIENT:
        if (ISREFCLOCKADR(&srcadr))
            type = 'l'; /* local refclock*/
        else if (SOCK_UNSPEC(&srcadr))
            type = 'p'; /* pool */
        else if (IS_MCAST(&srcadr))
            type = 'a'; /* manycastclient */
        else
            type = 'u'; /* unicast */
        break;

ntptime

[root@node01 ~]# ntptime
ntp_gettime() returns code 5 (ERROR)
  time de9e7743.14f3881c  Thu, May 10 2018 15:46:11.081, (.081841975),
  maximum error 16000000 us, estimated error 16000000 us, TAI offset 0
ntp_adjtime() returns code 5 (ERROR)
  modes 0x0 (),
  offset 0.000 us, frequency -8.823 ppm, interval 1 s,
  maximum error 16000000 us, estimated error 16000000 us,
  status 0x2041 (PLL,UNSYNC,NANO),
  time constant 9, precision 0.001 us, tolerance 500 ppm,
[root@node01 ~]#

If ntpd is running on the system and the host time is then set manually (date -u --set XXX), ntptime may report the error shown above.
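
The status word in that output (0x2041) explains the ERROR: it has the UNSYNC bit set. Below is a rough Python sketch that decodes it. The three bit values are the ones I believe Linux defines in linux/timex.h (STA_PLL, STA_UNSYNC, STA_NANO); only the flags that appear in the output above are listed, and the names STA_FLAGS and decode_status are invented for this example.

```python
# Assumed values from <linux/timex.h>; only the flags seen in the output above.
STA_FLAGS = {
    0x0001: "PLL",     # STA_PLL
    0x0040: "UNSYNC",  # STA_UNSYNC: clock is not synchronized
    0x2000: "NANO",    # STA_NANO: nanosecond resolution
}


def decode_status(status: int):
    """Split a status word into known flag names and leftover unknown bits."""
    names = [name for bit, name in sorted(STA_FLAGS.items()) if status & bit]
    known_mask = sum(STA_FLAGS)  # OR of all listed bits (they do not overlap)
    return names, status & ~known_mask


def is_unsynchronized(status: int) -> bool:
    """True when the UNSYNC bit is set in the status word."""
    return bool(status & 0x0040)


if __name__ == "__main__":
    names, unknown = decode_status(0x2041)
    print(",".join(names), hex(unknown))
    print("unsynchronized:", is_unsynchronized(0x2041))
```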

Miscellaneous

timedatectl

root@owning:~# timedatectl status
      Local time: Thu 2018-05-17 11:59:14 CST
  Universal time: Thu 2018-05-17 03:59:14 UTC
        RTC time: Thu 2018-05-17 03:59:12
        Timezone: Asia/Shanghai (CST, +0800)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: n/a
root@owning:~#

Problems and troubleshooting

Problem: ntpdate fails with no server suitable for synchronization found

When the NTP client tries to sync from the NTP server, it reports no server suitable for synchronization found.

[root@client ~]# ntpdate -dv -b 192.168.2.61
15 Aug 11:38:35 ntpdate[3123]: ntpdate 4.2.6p5@1.2349-o Fri Jan 26 02:18:05 UTC 2018 (1)
Looking for host 192.168.2.61 and service ntp
host found : bigtable-01
transmit(192.168.2.61)
receive(192.168.2.61)
transmit(192.168.2.61)
receive(192.168.2.61)
transmit(192.168.2.61)
receive(192.168.2.61)
transmit(192.168.2.61)
receive(192.168.2.61)
192.168.2.61: Server dropped: strata too high
server 192.168.2.61, port 123
stratum 16, precision -22, leap 11, trust 000
refid [192.168.2.61], delay 0.02580, dispersion 0.00003
transmitted 4, in filter 4
reference time:    00000000.00000000  Mon, Jan  1 1900  8:05:43.000
originate timestamp: df1e1ebb.39c5652e  Wed, Aug 15 2018 11:38:35.225
transmit timestamp:  df1e1ebb.424aafc4  Wed, Aug 15 2018 11:38:35.258
filter delay:  0.02583  0.02591  0.02580  0.02589
         0.00000  0.00000  0.00000  0.00000
filter offset: -0.03335 -0.03343 -0.03338 -0.03343
         0.000000 0.000000 0.000000 0.000000
delay 0.02580, dispersion 0.00003
offset -0.033386

15 Aug 11:38:35 ntpdate[3123]: no server suitable for synchronization found
[root@client ~]#

In this output the stratum is 16. 16 is a reserved value and indicates that the NTP server is not yet fully synchronized.

192.168.2.61: Server dropped: strata too high
server 192.168.2.61, port 123
stratum 16, precision -22, leap 11, trust 000
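
As a quick way to spot this condition in ntpdate -d output, here is a hedged Python sketch that parses the status line and flags an unsynchronized server. It is a simplified check of my own, not ntpdate's actual logic; it treats stratum 16 and leap bits 11 (the "clock not synchronized" alarm value) as unusable. The function names are hypothetical.

```python
import re

UNSYNCHRONIZED_STRATUM = 16  # reserved value for unsynchronized servers

STATUS_RE = re.compile(r"stratum (\d+), precision (-?\d+), leap ([01]{2})")


def parse_server_status(line: str) -> dict:
    """Extract stratum, precision and leap bits from an ntpdate -d status line."""
    match = STATUS_RE.search(line)
    if match is None:
        raise ValueError(f"not a status line: {line!r}")
    return {
        "stratum": int(match.group(1)),
        "precision": int(match.group(2)),
        "leap": match.group(3),
    }


def is_usable(status: dict) -> bool:
    """Simplified check: the server must report a real stratum and no leap alarm."""
    return status["stratum"] < UNSYNCHRONIZED_STRATUM and status["leap"] != "11"


if __name__ == "__main__":
    for line in (
        "stratum 16, precision -22, leap 11, trust 000",
        "stratum 3, precision -22, leap 00, trust 000",
    ):
        status = parse_server_status(line)
        print(status, "usable" if is_usable(status) else "not usable")
```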

Running ntpq -pn on the NTP server confirms this: the server's time is close to upstream, but it is not yet fully synchronized (there is no * marker in front of remote).

# Check on the NTP server
[root@server ~]# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 202.108.6.95    10.6.63.22       2 u   56   64    7   24.749  -23.655   0.091
[root@server ~]#

After it starts, the NTP server needs a few poll cycles (1 + 3 cycles, which you can track through the reach field) before it considers itself synchronized with upstream.

[root@server ~]# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*202.108.6.95    10.6.63.22       2 u   43   64   17   24.478  -23.552   0.092
[root@server ~]#
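
The 7 and then 17 seen in these two outputs follow from how the reach register is updated: on every poll it is shifted left by one bit and the newest result goes into the lowest bit. The Python sketch below simulates that under the assumption that every poll gets an answer; update_reach and reach_history are illustrative names, not ntpd functions.

```python
def update_reach(reach: int, success: bool) -> int:
    """Shift the 8-bit register left and record the newest result in bit 0."""
    return ((reach << 1) | int(success)) & 0xFF


def reach_history(results):
    """Return the octal strings that would be displayed after each poll."""
    reach = 0  # the register starts empty after a restart
    shown = []
    for ok in results:
        reach = update_reach(reach, ok)
        shown.append(format(reach, "o"))
    return shown


if __name__ == "__main__":
    # Four successful polls after a restart.
    print(reach_history([True] * 4))  # ['1', '3', '7', '17']
```

With poll at 64 seconds, reaching 17 takes roughly the startup query plus three poll intervals, which matches the wait observed above.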

Once the NTP server has synchronized, retry on the NTP client and the sync succeeds.

[root@client ~]# ntpdate -dv -b 192.168.2.61
15 Aug 11:51:56 ntpdate[6619]: ntpdate 4.2.6p5@1.2349-o Fri Jan 26 02:18:05 UTC 2018 (1)
Looking for host 192.168.2.61 and service ntp
host found : bigtable-01
transmit(192.168.2.61)
receive(192.168.2.61)
transmit(192.168.2.61)
receive(192.168.2.61)
transmit(192.168.2.61)
receive(192.168.2.61)
transmit(192.168.2.61)
receive(192.168.2.61)
server 192.168.2.61, port 123
stratum 3, precision -22, leap 00, trust 000
refid [192.168.2.61], delay 0.02580, dispersion 0.00003
transmitted 4, in filter 4
reference time:    df1e21ae.f96502d4  Wed, Aug 15 2018 11:51:10.974
originate timestamp: df1e21dc.2d33be99  Wed, Aug 15 2018 11:51:56.176
transmit timestamp:  df1e21dc.28b5ea58  Wed, Aug 15 2018 11:51:56.159
filter delay:  0.02588  0.02592  0.02580  0.02589
         0.00000  0.00000  0.00000  0.00000
filter offset: 0.017501 0.017394 0.017443 0.017392
         0.000000 0.000000 0.000000 0.000000
delay 0.02580, dispersion 0.00003
offset 0.017443

15 Aug 11:51:56 ntpdate[6619]: step time server 192.168.2.61 offset 0.017443 sec
[root@client ~]#
