CentOS DNS authoritative and resolvers setup

Public DNS setup

Bind authoritative servers

We use Bind as our main authoritative DNS solution. For the public zones, we simply use the traditional primary/secondary setup: the primary zone is updated, then the secondary servers are notified and issue an IXFR/AXFR transfer to fetch the latest zone content (and so end up with the same SOA).

The way we configure DNS zones is easy:

  • update the static zone (if needed, see below) in the filestore/bind directory, managed as a git repository (which one depends on the environment)
  • play the bind ansible role, either on trigger or by waiting for the role to be applied automatically (see the Ansible section for the setup)

We also have some delegated zones that are served either still by Bind, or by PowerDNS (see below).

Static zones

As described above, to add/delete/modify a DNS record in a static zone, one just has to:

  • update the SOA (serial) in the zone file
  • update/add/delete the record
  • commit/push to git (in the filestore git repository, depending on the inventory)
  • trigger ansible
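
The serial bump in the first step is easy to get wrong by hand. A minimal sketch of that step, assuming the common YYYYMMDDNN serial convention (the source does not state which convention the zone files use, and `bump_serial` is a hypothetical helper, not part of the bind role):

```python
from datetime import date


def bump_serial(current: int, today: date) -> int:
    """Return the next SOA serial in YYYYMMDDNN form (assumed convention)."""
    base = int(today.strftime("%Y%m%d")) * 100  # first revision of the day
    # Never go backwards: secondaries only transfer when the serial increases.
    return base if current < base else current + 1


print(bump_serial(2021062403, date(2021, 6, 25)))  # first change of the day
print(bump_serial(2021062500, date(2021, 6, 25)))  # second change, same day
```

The only property that really matters is that the new serial is strictly greater than the old one, otherwise the secondary servers will not request a transfer.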

List of zones served through static files (public zones):

  • centos.org
  • ocp.centos.org
  • ocp.ci.centos.org
  • ocp.stg.ci.centos.org
  • centosproject.org

Specific records

CAA records: used to publicly announce which CAs are allowed to issue certificates for our zones:

dig @ns1.centos.org -t CAA centos.org +short
0 issue "amazon.com"
0 issuewild "letsencrypt.org"
0 issue "letsencrypt.org"
0 issuewild "digicert.com"
0 issue "digicert.com"
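
To check such a policy from a script, the `+short` output can be parsed into a small lookup. This is only an illustrative sketch: `parse_caa` and `may_issue` are hypothetical helpers, and the sample text is the output shown above.

```python
SAMPLE = '''0 issue "amazon.com"
0 issuewild "letsencrypt.org"
0 issue "letsencrypt.org"
0 issuewild "digicert.com"
0 issue "digicert.com"
'''


def parse_caa(dig_short_output: str) -> dict:
    """Map each CAA tag (issue, issuewild, ...) to the set of CA domains."""
    policy = {}
    for line in dig_short_output.splitlines():
        parts = line.split(maxsplit=2)  # flags, tag, quoted value
        if len(parts) != 3:
            continue
        _flags, tag, value = parts
        policy.setdefault(tag, set()).add(value.strip('"'))
    return policy


def may_issue(policy: dict, ca: str, wildcard: bool = False) -> bool:
    """issuewild, when present, takes precedence over issue for wildcards."""
    tag = "issuewild" if wildcard and "issuewild" in policy else "issue"
    return ca in policy.get(tag, set())


policy = parse_caa(SAMPLE)
print(may_issue(policy, "letsencrypt.org", wildcard=True))
print(may_issue(policy, "amazon.com", wildcard=True))
```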

TXT / SPF records: used for Sender Policy Framework, to restrict which IP blocks/hosts can send mail originating from the @centos.org domain

TXT for kerberos: we have a pointer announcing that one can use the FEDORAPROJECT.ORG realm for kerberos tickets

dig @ns1.centos.org -t TXT _kerberos.centos.org +short
"FEDORAPROJECT.ORG"

CNAME: simple aliases for other A/AAAA records

CNAME for TLS/ACME dns challenge: we use some static CNAME records pointing to the equivalent record in the dynamic zone (see below)

NS records: for the zones that we delegate to other authoritative servers, for example (but not limited to) mirror.centos.org, served by PowerDNS/GeoIP (see also below)

Dynamic zones

We also have a specific acme.centos.org zone that is only used for one purpose: creating, on the fly, the TXT records used by LetsEncrypt/ACME for the DNS challenge. For this we use the acme.sh tool, which does it automatically for us: it uses nsupdate with a specifically allowed key to dynamically create the record that the ACME server will verify before signing the CSR.
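
To make the nsupdate step concrete, here is a rough sketch of the kind of command script that gets sent to the primary server. acme.sh builds this itself; the helper name, server name, record name, token and TTL below are all assumptions chosen for illustration.

```python
def nsupdate_script(server: str, zone: str, record: str, token: str, ttl: int = 60) -> str:
    """Build the text fed to nsupdate to publish one ACME challenge TXT record."""
    return "\n".join([
        f"server {server}",
        f"zone {zone}",
        f"update delete {record} TXT",             # drop any stale challenge
        f'update add {record} {ttl} TXT "{token}"',
        "send",
    ]) + "\n"


# Hypothetical values: the real record names depend on the static CNAMEs.
print(nsupdate_script(
    server="ns1.centos.org",
    zone="acme.centos.org",
    record="_acme-challenge.example.acme.centos.org",
    token="hypothetical-challenge-token",
))
```

Such a script would typically be piped to `nsupdate -k <keyfile>`, where the key is the one allowed to update the dynamic zone.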

See TLS section on how to use it.

Some pointers:

PowerDNS servers (GeoIP)

For some specific records, like mirror.centos.org or vault.centos.org (and others), we wanted something more than the simple Round-Robin logic of a Bind zone file. The idea was to optimize where to redirect clients based on GeoIP/country information, and so use the nearest server for that role.

PowerDNS is a really good authoritative solution that also lets you plug in your own Pipe backend, meaning that we were able to implement our own logic based on our requirements.

See our pdns-custom-geoip-backend git repository, which contains the simple code used for that, and the corresponding pdns-pipe ansible role used to deploy it automatically.
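
For readers unfamiliar with the Pipe backend, the sketch below shows the general shape of such a coprocess using the tab-separated ABI version 1 protocol (HELO, Q, DATA, END). It is not the code from the pdns-custom-geoip-backend repository: the function name, the TTL, the in-memory data layout and the `country_of` lookup are all assumptions.

```python
TTL = 300  # assumed value, not taken from the real backend


def handle_line(line: str, nodes_by_country: dict, fallback: list, country_of) -> list:
    """Answer one line sent by PowerDNS, returning the lines to write back."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] == "HELO":
        return ["OK\tgeo backend sketch"]
    if fields[0] != "Q" or len(fields) < 6:
        return ["FAIL"]
    _, qname, qclass, qtype, qid, remote_ip = fields[:6]
    answers = []
    if qtype in ("A", "ANY"):
        # Prefer nodes in the client's country, else use the fallback list.
        ips = nodes_by_country.get(country_of(remote_ip), fallback)
        answers = [f"DATA\t{qname}\t{qclass}\tA\t{TTL}\t{qid}\t{ip}" for ip in ips]
    return answers + ["END"]


# Hypothetical data: documentation-range addresses and a stub GeoIP lookup.
nodes = {"BE": ["192.0.2.10"], "US": ["198.51.100.20"]}
lookup = lambda ip: "BE" if ip.startswith("203.0.113.") else "??"
query = "Q\tmirror.centos.org\tIN\tANY\t-1\t203.0.113.7\n"
print("\n".join(handle_line(query, nodes, ["192.0.2.99"], lookup)))
```

A real backend would wrap this in a loop reading sys.stdin and flushing sys.stdout after each reply, and would also need to answer SOA/NS queries for the delegated zone.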

Workflow for the dynamic backend :

  • we have a central nodes.db sqlite3 DB, with a specific schema, in which we enter fqdn, ipv4/ipv6 address[es], region, continent, country, and whether the node is active or not (see /var/lib/centos-infra/nodes.db)
  • if we have to add/modify/remove a node, we just do it in that sqlite DB
  • we then regenerate a .json file by parsing the DB and sorting it into the format consumed by the powerdns pipe backend (call the /var/lib/centos-infra/gen_backend script), which will also encrypt the .json with gpg
  • delegated powerdns nodes will detect changes, decrypt the file, and reload it in memory automatically
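
The DB-to-JSON step of this workflow could look roughly like the following. The table layout and the JSON structure are assumptions made for the example (the real schema and the gen_backend output format are not described here), and the gpg encryption step is left out.

```python
import json
import sqlite3

# Hypothetical schema mirroring the fields listed above.
SCHEMA = """CREATE TABLE nodes (
    fqdn TEXT, ipv4 TEXT, ipv6 TEXT,
    region TEXT, continent TEXT, country TEXT, active INTEGER)"""


def export_active_nodes(conn: sqlite3.Connection) -> str:
    """Dump active nodes as JSON, grouped by country and sorted by fqdn."""
    rows = conn.execute(
        "SELECT fqdn, ipv4, ipv6, continent, country FROM nodes "
        "WHERE active = 1 ORDER BY fqdn"
    )
    by_country = {}
    for fqdn, ipv4, ipv6, continent, country in rows:
        by_country.setdefault(country, []).append(
            {"fqdn": fqdn, "ipv4": ipv4, "ipv6": ipv6, "continent": continent}
        )
    return json.dumps(by_country, sort_keys=True, indent=2)


conn = sqlite3.connect(":memory:")  # the real DB lives on disk
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("node1.example.org", "192.0.2.10", "2001:db8::10", "west", "EU", "BE", 1),
        ("node2.example.org", "198.51.100.20", None, "east", "NA", "US", 0),
    ],
)
print(export_active_nodes(conn))
```

Disabling a node is then just a matter of flipping its active flag and regenerating the file.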

Worth knowing: for the existing setup, when we want to take a machine out of the pool or add it back, we have a simple ansible ad-hoc task (using prompt vars) that can be used for this, adhoc-node-pdns-modify.yml:

ansible-playbook-prod playbooks/adhoc-node-pdns-modify.yml 
Host to modify in PowerDNS ? => : centosq7.centos.org
Action (enable|disable) ? => : enable

PLAY [centosq7.centos.org] ********************************************************************************************

TASK [Enable/Disable msync node in PowerDNS geoip backend] ************************************************************
Friday 25 June 2021  14:46:44 +0200 (0:00:07.333)       0:00:07.333 *********** 
changed: [centosq7.centos.org]

PLAY RECAP ************************************************************************************************************
centosq7.centos.org        : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

Friday 25 June 2021  14:46:51 +0200 (0:00:06.806)       0:00:14.140 *********** 
=============================================================================== 
Enable/Disable msync node in PowerDNS geoip backend ------------------------------------------------------------ 6.81s
Playbook run took 0 days, 0 hours, 0 minutes, 14 seconds

Internal DNS setup

Bind authoritative and resolvers

For DCs that we control (Red Hat ones) and for which we have an internal zone/subnet, and so a different set of internal IPs, we also use Bind, but with other features enabled in our Ansible role, such as allowing recursion (a specific ACL lets the internal subnets use Bind both as authoritative server and resolver).

The procedure to update a zone is identical to the one described for public zones, but it is driven from a different ansible inventory and so a different filestore git repo (tied to that inventory/env).

Worth knowing that we (ab)use some specific features like Response Policy Zones (RPZ): these let us, for internal clients, automatically answer queries for known external records (like mirrorlist.centos.org) with an internal IP, and so not query the public authoritative servers.

All of that is supported by our bind Ansible role, so consider reading defaults/main.yml to see how it works, or look in the ansible inventory/filestore for real examples.
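
As an illustration of what such an override looks like in a policy zone, the sketch below renders the records for a hypothetical RPZ zone. The zone name `rpz.internal`, the helper name, the TTL and the address are made up for the example; the actual values are generated by the bind role from inventory variables.

```python
import ipaddress


def rpz_records(rpz_zone: str, overrides: dict, ttl: int = 300) -> list:
    """Render 'local data' RPZ records: the owner is <qname>.<rpz zone>."""
    lines = []
    for name, ip in sorted(overrides.items()):
        rtype = "AAAA" if ipaddress.ip_address(ip).version == 6 else "A"
        owner = f"{name.rstrip('.')}.{rpz_zone.rstrip('.')}."
        lines.append(f"{owner} {ttl} IN {rtype} {ip}")
    return lines


# Hypothetical internal address for the record mentioned above.
for line in rpz_records("rpz.internal", {"mirrorlist.centos.org": "192.0.2.10"}):
    print(line)
```

Any client resolving through a Bind instance that lists this zone in its response-policy statement would then receive the internal address instead of the public answer.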

Unbound resolvers

In some specific subnets/environments we also use Unbound, which is also a really lightweight/fast resolver that supports plenty of features.

While not technically called RPZ, Unbound lets you define some records locally, and forward other queries to other resolvers (forwarders).

Our unbound ansible role supports such features, like :

  • control ACL (for recursion)
  • override some specific records through an ansible list
  • automatically parse an ansible group to dynamically generate a kind of internal zone.

That last feature is the one we use (through unbound_local_groups) for the internal rdu2.centos.org zone: when we add a new machine with an IP defined for the host/VM, it is automatically added to the computed file used by unbound.
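
Conceptually, the computed file is a list of unbound local-data entries derived from the group members. A minimal sketch, assuming each host in the group exposes a short name and an IP (the helper name, host names and addresses are hypothetical, and the real role does this with Ansible templating rather than Python):

```python
import ipaddress


def render_local_zone(zone: str, hosts: dict) -> str:
    """Render unbound.conf local-zone/local-data lines for a host -> IP map."""
    lines = [f'local-zone: "{zone}." static']
    for name, ip in sorted(hosts.items()):
        rtype = "AAAA" if ipaddress.ip_address(ip).version == 6 else "A"
        fqdn = f"{name}.{zone}."
        lines.append(f'local-data: "{fqdn} IN {rtype} {ip}"')
        lines.append(f'local-data-ptr: "{ip} {fqdn}"')  # matching reverse record
    return "\n".join(lines) + "\n"


# Hypothetical group members with documentation-range addresses.
print(render_local_zone("rdu2.centos.org", {"node1": "192.0.2.21", "node2": "192.0.2.22"}))
```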