Infoblox NTP
- There is a dedicated page for general NTP.
- There is a dedicated page for public NTP servers.
- From NIOS 9.0.4 onwards, NIOS will not use static UTC offsets such as (UTC+2:00). Instead, it will only have time zone names with DST changes. To achieve this, NIOS fetches the time zone list from the Ubuntu tzdata package and stores it in the database.
- When Grid members synchronize their times with the Grid Master, the Grid Master and its members send NTP messages through an encrypted VPN tunnel. When a Grid member synchronizes its time with another Grid member, the NTP messages are not sent through a VPN tunnel.
- Unofficial tests indicate that the small, virtual TE-825 NIOS appliance can easily handle over 10k NTP transactions per second (TPS).
- If you have configured DNS anycast on the appliance, it can answer NTP requests through the anycast IP address.
- A member of the NTP working group says stratum has almost no bearing on accuracy. The GM should sync to 3+ sources. Limit the list to no more than 5 or 7, because appliances can run into long reboot times if DNS issues prevent the NTP service from resolving the NTP FQDNs.
- If Members are serving NTP, then get members to sync out directly to Internet rather than to the GM. This ensures they are not dependent on GM for time.
- Considerations on whether to sync all members to GM and then sync GM to external or whether it is better to sync all members directly to external time sources.
- If you sync all members to external time sources, those time sources may not agree so the Grid members may never be fully in sync.
- If you sync all members to external time sources and one member gets firewalled off from NTP by accident, it will drift off on its own.
- If you want to serve time from NIOS to the network, consider having each member sync to external time.
- If you are not serving time from NIOS to the network, consider having the members sync to the GM and the GM sync to the external time source.
- NTP best practice is actually 5+ servers; 3 NTP servers is too few. If one goes down, you are left with 2 (the magic 'bad' number). Aim for 5 external sources, preferably of different types (GPS, radio) that are geographically dispersed. Having more clocks is always better and makes it easier to triangulate accurate time.
- The stratum of a time source has almost no bearing on accuracy. That myth was started by people who sell time sources for exorbitant money. (Source: member of NTP working group)
- NIOS checks whether NTP is in sync every 4 seconds. If 5 consecutive checks fail, it sends an SNMP trap. I don't know what the offset difference has to be to count as a sync failure.
- NTP listens on all interfaces. No specific config necessary for LAN2, other than possibly routing of course.
- With the containerization of NIOS, it is more important to have a reliable external time source, because ntpd now runs inside a container, which does worse at keeping time than when it ran at the VM level.
- For a disconnected NIOS hardware appliance, NTP time is not held well but no worse than anyone else. If the NIOS appliance is on KVM, or ESXi, then it can be much much much worse. This is why there are $10k-$20K NTP clocks that synchronize, and once they are sync'ed, if they get disconnected, they'll last a year or more before any appreciable drift.
- For your time source list, also consider using one or two from time.nist.gov and something appropriate (Top Public Time Servers) from this GitHub list.
Leap Seconds
NIOS has supported Leap Seconds since NIOS 6.5.0.
Properly configured NTP servers will send a leap second warning announcement to NTP clients. If the client OS supports leap seconds, it will receive the warning announcement and insert one second exactly at the time of the leap.
The kernel in NIOS supports receiving the leap second warning announcement and doing the one second insertion at precisely the right time. In turn, the NIOS NTP servers send notifications to NTP clients which are configured to use the Infoblox appliances as NTP servers. Infoblox was involved in providing the patch that fixed the issue in 2012/2013. (KB article on leap seconds and NIOS)
Show NTP Data CLI
show ntp
Example output. Normally, you want the numbers in the 'offset' column (in milliseconds) to be <1.0 either side of 0.
Infoblox > show ntp
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+212.23.8.6      195.66.241.2     2 u   42   64   377   5.691  -709.98 100.710
*212.23.10.129   85.199.214.99    2 u   37   64   377  12.644  -706.26 100.443
 127.127.1.1     .LOCL.          12 l  758   64     0   0.000   +0.000   0.000
or
Infoblox > show ntp
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-212.23.8.6      195.66.241.3     2 u  169  256   377   6.073   +0.054   0.303
+212.23.10.129   85.199.214.99    2 u  216  256   377  13.108   +0.094   0.327
 127.127.1.1     .LOCL.          12 l  47d   64     0   0.000   +0.000   0.000
+216.239.35.12   .GOOG.           1 u  168  256   377  10.339   -0.247   0.362
*17.253.28.251   .GPSs.           1 u   35   64   377   6.536   -0.372   0.387
or
set maintenancemode
show ntpstats
You can exit maintenance mode with
set maintenancemode off
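If you capture the show ntp output as text, the peer rows can be checked with a short script. The following is only a sketch in Python: the names `Peer` and `parse_show_ntp` are made up for illustration, it assumes the ten whitespace-separated columns shown in the examples above, and it only recognizes the tally characters `*`, `+`, `-` and `#` at the start of the remote column.

```python
from typing import List, NamedTuple


class Peer(NamedTuple):
    tally: str        # '*', '+', '-', '#', or '' when no tally code is shown
    remote: str
    refid: str
    stratum: int
    offset_ms: float  # the 'offset' column, in milliseconds


def parse_show_ntp(text: str) -> List[Peer]:
    """Parse the peer rows of captured 'show ntp' output (hypothetical helper)."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Peer rows have exactly 10 columns; skip the prompt, header and separator.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        name = fields[0]
        tally = name[0] if name[0] in "*+-#" else ""
        peers.append(
            Peer(tally, name[len(tally):], fields[1], int(fields[2]), float(fields[8]))
        )
    return peers


# Sample taken from the first example output above.
SAMPLE = """Infoblox > show ntp
remote refid st t when poll reach delay offset jitter
==============================================================================
+212.23.8.6 195.66.241.2 2 u 42 64 377 5.691 -709.98 100.710
*212.23.10.129 85.199.214.99 2 u 37 64 377 12.644 -706.26 100.443
127.127.1.1 .LOCL. 12 l 758 64 0 0.000 +0.000 0.000
"""

peers = parse_show_ntp(SAMPLE)
# Flag external peers whose offset is 1.0 ms or more either side of 0.
out_of_range = [
    p.remote for p in peers if p.refid != ".LOCL." and abs(p.offset_ms) >= 1.0
]
print(out_of_range)
```

The local clock entry (.LOCL.) is excluded from the check because its offset is always reported as 0.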
NIOS-XaaS
A NIOS-XaaS instance uses AWS as its NTP source but shows it to the end user as 127.127.1.1, which is a special NTP (Network Time Protocol) reference address indicating that a device is using its own local internal system clock as the time source (often labeled .LOCL.).
AWS provides a highly accurate time synchronization service (Amazon Time Sync Service, ATSS, at 169.254.169.123) inside every EC2 instance. It uses a fleet of redundant satellite-connected and atomic clocks in each region to deliver a highly accurate reference clock.
The NTP service offered by NIOS-X as a Service uses the Global NTP Settings (the “Global (Default)” profile). By default, the Global NTP configuration is set to limit clients to no more than 1 request per 2 second interval and to also consider more than 1 request per 8 seconds on average over time as too much. Clients sending too many requests will be sent KOD (kiss of death) packets.
Because throttling is applied per source IP, multiple devices behind the same NAT IP can collectively exceed the rate limit, even if each device individually polls at a reasonable rate. This could be a plausible contributing factor in the Rothschild (Continuation Computers Limited) case.
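The NAT effect can be shown with some rough arithmetic. This Python sketch is an illustration only: the function names are hypothetical, it assumes every device polls at the same fixed interval with requests spread evenly, and it models only the "1 request per 8 seconds on average" figure from the text, not the actual limiter on the service.

```python
AVERAGE_LIMIT_S = 8.0  # from the text: more than 1 request per 8 s on average is too much


def aggregate_interval(devices: int, poll_interval_s: float) -> float:
    """Average spacing between requests seen from one NAT IP (assumes even spread)."""
    return poll_interval_s / devices


def exceeds_average_limit(devices: int, poll_interval_s: float) -> bool:
    """True when the combined request rate is faster than 1 per 8 seconds."""
    return aggregate_interval(devices, poll_interval_s) < AVERAGE_LIMIT_S


# One client polling every 64 s is well inside the limit...
print(exceeds_average_limit(1, 64))
# ...but 20 such clients behind one NAT IP look like one client polling every 3.2 s.
print(aggregate_interval(20, 64), exceeds_average_limit(20, 64))
```

Under these assumptions, with a 64 second poll interval the average limit is crossed once more than 8 devices share the same source IP. Bursts (the "1 request per 2 second interval" minimum spacing) can trip the limit with fewer devices, which this sketch does not model.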
Setting Time
Time can be set under Grid Properties. Remember that you cannot set the time manually if the Grid is set to sync time from external servers (Grid NTP Settings).
When a Grid Member joins the Grid Master and the time difference between them is more than 60 seconds, the Grid Member restarts to adjust its time to the Grid Master and logs: “System restart: time reset…”.
Synctime
NIOS 8.6.1 introduces the synctime command to synchronize the system time with the time of an
external NTP server or a Grid Manager. When you run the synctime command, NIOS checks to verify whether
there are already configured NTP servers present. If they are present, it displays the list of NTP servers and you
have to choose from the list. If there are no NTP servers that are configured, you must specify the IP address of the
NTP server or Grid Manager with which you want to synchronize the system time.
Running this command will immediately sync the appliance to its configured NTP server. The command is available in maintenance mode.
Once the command is issued, there will be a confirmation, and then a product restart will be performed (causing a temporary service interruption). The logs call this a “system restart (time reset)” and the network card gets bounced.
To run the command, enter maintenance mode
Infoblox > set maintenancemode
Maintenance Mode > synctime
It is recommended to sync grid manager with external server first. Do you want to continue ? (y or n): y
VPN IP Address of NTP Server in Grid: 169.254.0.1
If grid manager is not excluded as an NTP Server, please use above VPN IP address!
Configured external NTP servers: 85.199.214.98
If not use above VPN IP Address, and one of the servers above is reachable and a part of the list of configured NTP servers, please use that NTP server.
Last NTP server used to synchronize the time through the synctime command is: 169.254.0.1
Enter the NTP server's IP address [Default: 169.254.0.1]: 85.199.214.98
The offset between server 85.199.214.98 and this system is 1.328228 seconds.
Because the offset is very small, we recommend that you let NTP server handle it.
The adjustment made to the system time will cause the product to restart. Do you want to continue? (y or n)
How Long Does it Take to Sync Time
Infoblox will not 'jump' time. If NIOS finds itself out-of-sync, it will slowly bring itself back into sync.
It takes about 1,988 seconds (33 minutes and 8 seconds) to fix an error of 1 second.
For example, an offset of 47 seconds would take the following to fix:
- 93,436 seconds
- OR about 1,557 minutes
- OR about 26 hours
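The same arithmetic can be reproduced for any offset. This is a minimal Python sketch that simply scales the 1,988 seconds-per-second figure quoted above; the function name `slew_time_seconds` is made up for illustration, and the real correction time on an appliance may differ.

```python
SLEW_SECONDS_PER_SECOND = 1988  # figure quoted above: time needed to correct 1 s of error


def slew_time_seconds(offset_seconds: float) -> float:
    """Estimated time to slew away an offset, in seconds (hypothetical helper)."""
    return abs(offset_seconds) * SLEW_SECONDS_PER_SECOND


total = slew_time_seconds(47)
print(f"{total:,.0f} seconds")
print(f"{total / 60:,.1f} minutes")
print(f"{total / 3600:.1f} hours")
```

For a 47 second offset this gives 93,436 seconds, which is roughly 26 hours.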
Normally, you want the numbers in the 'offset' column to be <1.000 either side of 0.
One (nasty) solution to get NIOS back in sync is the following
- Disable syncing to external NTP servers
- Update Grid Settings and you can now set the time. Set the time exactly to something about 15 seconds from now and click okay. You will be told that proceeding will restart the Grid Master. Wait until actual time matches the time you set and press okay.
- The Grid Master will restart.
- Log back into the Grid and enable syncing to external NTP servers.
- This should get the time back to close enough to perfect so that NIOS doesn't complain. If not, it should be close enough that it won't take very long to sync up.
NTP out of sync errors
If you get NTP out of sync and it only affects vNIOS and not physical Infoblox devices, check whether there is a pattern to when NTP goes out of sync. It is possible that vNIOS goes out of sync when a VM snapshot is taken. vNIOS may also go out of sync during vMotion (moving the vNIOS from one ESXi server to another), as vMotion may cause local clocks to differ in time.
NTP External Source Issue
Issue the CLI command show ntp four or more times in a 15 minute interval to show whether the offset is gradually growing. In a normal NTP time sync, the offset should decrease gradually as the NTP program tries to slew down the time difference. If we find that the external clock offset is gradually growing, consider configuring another external NTP source to confirm whether the issue is with the currently configured NTP source.
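Once you have noted the offset from each run, the "is it growing?" check is a simple comparison. The sketch below is one way to express it in Python; the function name `offset_is_growing` is hypothetical, the sample values are invented for illustration, and it treats "growing" strictly as every reading being further from 0 than the previous one.

```python
from typing import Sequence


def offset_is_growing(offsets_ms: Sequence[float], min_samples: int = 4) -> bool:
    """True when each successive offset is further from 0 than the one before."""
    if len(offsets_ms) < min_samples:
        raise ValueError(f"need at least {min_samples} samples")
    magnitudes = [abs(o) for o in offsets_ms]
    return all(later > earlier for earlier, later in zip(magnitudes, magnitudes[1:]))


# Invented readings from four 'show ntp' runs over 15 minutes.
print(offset_is_growing([-1.2, -2.9, -4.1, -6.0]))  # drifting away from the source
print(offset_is_growing([5.0, 3.1, 1.4, 0.6]))      # normal slew back towards 0
```

A strict comparison like this is sensitive to jitter between readings, so treat a single False or True as a hint rather than a verdict.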
Frequency exceeded errors in the logs
Log messages saying Frequency exceeded are displayed when the frequency difference between the time computed by ntpd and the time reported by the system's internal clock exceeds 500 PPM.
The frequency stability of an electronic oscillator component can be measured in PPM; one part per million is 0.0001% (1E-6). Even an error of only 0.001% (10 PPM) causes a clock to be off by almost one second per day. If the difference exceeds 500 parts per million (0.05%) over the synchronization interval, the frequency exceeded message appears in the logs.
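The conversion between PPM, percent and drift per day is plain arithmetic and can be checked with a few lines of Python. The function names here are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def ppm_to_percent(ppm: float) -> float:
    """1 PPM is one millionth, i.e. 0.0001 percent."""
    return ppm / 10_000


def drift_seconds_per_day(ppm: float) -> float:
    """Clock error accumulated over one day at a constant frequency error."""
    return ppm * SECONDS_PER_DAY / 1_000_000


for ppm in (1, 10, 500):
    print(ppm, "PPM =", ppm_to_percent(ppm), "% =", drift_seconds_per_day(ppm), "s/day")
```

A 10 PPM error works out to 0.864 seconds per day, and the 500 PPM tolerance corresponds to 43.2 seconds per day.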
Show NTP
To display NTP data, use the show ntp command.
Infoblox > show ntp
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.1     .LOCL.          12 l   5h   64     0   0.000   +0.000   0.000
+212.23.8.6      195.66.241.3     2 u   42  128   377  10.700   -0.329   0.265
+212.23.10.129   85.199.214.99    2 u   32  128   377  16.991   -0.517   0.488
*85.199.214.98   .GPS.            1 u   13   64   377  11.410   -0.170   0.172
When you execute the show ntp command, the NIOS appliance displays the following information:
- remote: The IP address of the remote peer.
- refid: Identifies the reference clock. This will be the clock type if the NTP server is Stratum 1. If it is Stratum 2 or higher, the refid will be the IP address of the NTP server's time source. (See Note on Refid)
- st: The stratum of the remote peer. Indicates the stratum of the configured clock. Infoblox recommends that an external stratum 1 clock be configured as the NTP server if you are running an enterprise network. Other stratum clocks, such as stratum 2 or 3, can be configured as the NTP server when the Grid Member or Grid Master is blocked by the firewall and cannot reach the external stratum 1 clock.
- t: The type of the peer, such as local (l), unicast (u) or broadcast (b).
- when: The time since the last packet was received, in seconds unless a unit suffix is shown (for example 5h or 47d).
- poll: The polling interval, in seconds.
- reach: The reachability register, shown in octal. It records the outcome of the last eight NTP transactions between the NTP daemon and a given remote time server. This value should read 377. If it is not 377, run a traffic capture on the Grid Master or on the Grid Members that show NTP out of sync and see if there are responses from external NTP servers on UDP port 123.
- delay: The current estimated delay, in milliseconds. Should be as low as possible: 1-10 is superb, 11-20 is great, 20-30 is more common and not bad, and >100 is not so great. Delay (latency) is the delay between the local clock (NIOS appliance) and the external NTP servers. It varies with the distance to, and network latency of, the external NTP server. Normally, the delay is between 5 and 40 milliseconds. Choose the server with the least possible delay before configuring NTP.
- offset: The offset of the peer clock relative to the local clock, in milliseconds. A healthy value looks something like 0.517, which means there is about half a millisecond of difference. If you see something like 39256.2, then your clock is about 39 seconds out of sync and it will take some time to correct (see above). An offset of more than 300 seconds (300,000 milliseconds) needs a step change, which requires the ntpdate command. The ntpdate command is executed during product restart. If you notice an offset of more than 300 seconds, consider doing a product restart, as there is no other way to run the ntpdate command.
- jitter: The estimated time error of the system clock. Should be a value less than 1.0, ideally as close to 0.000 as possible (e.g. 0.205).
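The reach value in the list above is easier to read once it is expanded into its eight bits. The following Python sketch decodes it; the function names are hypothetical, and it relies on the standard NTP convention that the register is shifted left on every poll, so the least significant bit is the most recent poll.

```python
from typing import List


def decode_reach(reach_octal: str) -> List[bool]:
    """Return the last 8 poll results, most recent first (True = reply received)."""
    value = int(reach_octal, 8) & 0xFF
    # Bit 0 is the most recent poll, bit 7 the oldest one still remembered.
    return [bool((value >> bit) & 1) for bit in range(8)]


def missed_polls(reach_octal: str) -> int:
    """Number of the last 8 polls that got no reply."""
    return decode_reach(reach_octal).count(False)


print(decode_reach("377"))  # all eight polls answered
print(decode_reach("376"))  # only the most recent poll was missed
print(missed_polls("0"))    # nothing received, as for the idle .LOCL. entry
```

A value such as 376 that recovers to 377 on the following polls points to an occasional lost packet, while a value stuck at 0 means no replies are arriving at all.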
NTP Settings
If the members are told to sync to Grid Master, they will do so over the VPN tunnel. If the tunnel is down (e.g. product reboot), then they will use whatever the Grid Master uses for NTP source. This can lead to long product reboots if there is a very long list of NTP servers to sync to and the member does not have access to any of them. Either permit access or reduce the list.
Troubleshooting
NTP Error
- Facility: daemon
- Level: Notice
- Server: ntpd[1356860]
- Message: frequency error -500 PPM exceeds tolerance 500 PPM
