Table of Contents
When Things Go Wrong
Email Outage
Web Outage
When Things Go Wrong
Email Outage
External facing Linux server running BIND.
Zone updated for the first time in a long while, but Linux file permissions left the zone file locked and non-functional.
Zone included MX records that were then unavailable.
Caused Email outage.
Linux admin team decided that they didn't want DNS service maintenance under their role, so moved it to Infoblox.
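A permissions problem like the one above can often be caught before the zone is reloaded. The following is a minimal sketch, not the check the team actually used: it assumes the name server runs as an unprivileged account, and the uid/gid values and zone file path in the usage note are hypothetical. It looks only at the classic owner/group/other mode bits and ignores ACLs, SELinux contexts, and parent directory permissions.

```python
import os
import stat


def mode_allows_read(st_mode, owner_uid, owner_gid, uid, gids):
    """Return True if the classic Unix mode bits grant read access.

    st_mode, owner_uid, owner_gid describe the file; uid and gids describe
    the account the name server runs as (assumed values, not from the source).
    """
    if uid == 0:
        # root bypasses ordinary mode-bit checks
        return True
    if uid == owner_uid:
        return bool(st_mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(st_mode & stat.S_IRGRP)
    return bool(st_mode & stat.S_IROTH)


def zone_file_readable(path, uid, gids):
    """Stat a zone file and apply mode_allows_read for the given account."""
    st = os.stat(path)
    return mode_allows_read(st.st_mode, st.st_uid, st.st_gid, uid, gids)
```

For example, `zone_file_readable("/var/named/example.com.zone", 25, {25})` would report whether a service account with uid and gid 25 could read that file. Running a check like this (alongside a syntax check of the zone) right after an edit gives an early warning before mail starts bouncing.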
Web Outage
NIOS Hidden Primary.
Third-party DNS provider zone transferred to publicly host zones.
Third Party acquired by another company.
Other company migrates systems.
Everything works.
Other company deletes migration systems.
Turns out, a few customers had their TSIG keys tied up in the migration system.
TSIG keys no longer existed.
Zone transfers failed.
Alerts for zone transfer failure… failed.
After a week, the secondary servers dropped the zones that they could no longer update.
Result: outage of critical website until customer updated public DNS to “unhide” their NIOS boxes.
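The week-long delay before the outage matches how secondaries behave: a secondary keeps serving a zone it cannot refresh until the SOA expire interval runs out, then drops it. A watchdog that runs independently of the transfer alerts can flag the problem long before that point. This is a minimal sketch under assumptions: the function name, the status labels, and the 25% warning threshold are illustrative, and the 604800-second expire value in the usage note is assumed from the "after a week" detail rather than stated in the source.

```python
from datetime import datetime, timedelta, timezone


def zone_expiry_status(last_success, soa_expire_seconds, now, warn_fraction=0.25):
    """Classify a secondary zone by time since its last successful refresh.

    Returns "expired" once the SOA expire interval has fully elapsed,
    "warning" once warn_fraction of it has elapsed, and "ok" otherwise.
    """
    elapsed = (now - last_success).total_seconds()
    if elapsed >= soa_expire_seconds:
        return "expired"
    if elapsed >= warn_fraction * soa_expire_seconds:
        return "warning"
    return "ok"


def time_until_expiry(last_success, soa_expire_seconds, now):
    """Return the time left before the secondary would drop the zone."""
    remaining = last_success + timedelta(seconds=soa_expire_seconds) - now
    # Never report a negative remainder once the zone has already expired
    return max(remaining, timedelta(0))
```

With an expire value of 604800 seconds and a last successful transfer at `datetime(2024, 1, 1, tzinfo=timezone.utc)`, the status would turn to "warning" just under two days in, which leaves roughly five days to fix the TSIG keys before the zone disappears. Feeding the check from the secondary's own view of the zone, rather than from the primary's transfer logs, keeps it working even when the transfer-failure alerts themselves are broken.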