# CentOS DNS authoritative and resolvers setup

## Public DNS setup

### Bind authoritative servers

We use [Bind](https://www.isc.org/bind/) as our main authoritative DNS solution.
For the `public` zones, we use the traditional primary/secondary setup: the primary zone is updated, the secondary servers are notified, and they then issue an IXFR/AXFR transfer to get the latest zone content (and so end up with the same SOA).
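
One way to confirm that the secondaries caught up after a transfer is to compare SOA serials. The sketch below is only illustrative: it parses the text that `dig +short -t SOA` prints for each server, and the server names, serial values and helper names are made up for the example.

```python
def soa_serial(dig_short_line):
    """Extract the serial (3rd field) from one `dig +short -t SOA` answer line."""
    fields = dig_short_line.split()
    if len(fields) != 7:
        raise ValueError(f"not a SOA answer: {dig_short_line!r}")
    return int(fields[2])


def out_of_sync(answers, primary):
    """Return the servers whose serial differs from the primary's."""
    expected = soa_serial(answers[primary])
    return sorted(
        server for server, line in answers.items()
        if soa_serial(line) != expected
    )


# Hypothetical answers, as they could be collected from each name server.
answers = {
    "ns1.example.org": "ns1.example.org. hostmaster.example.org. 2021062501 28800 7200 604800 3600",
    "ns2.example.org": "ns1.example.org. hostmaster.example.org. 2021062501 28800 7200 604800 3600",
    "ns3.example.org": "ns1.example.org. hostmaster.example.org. 2021062500 28800 7200 604800 3600",
}
print(out_of_sync(answers, primary="ns1.example.org"))
```

Here the third (made-up) server still carries the older serial, so it is the only one reported.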

Configuring DNS zones is straightforward:

 * update the static zone (if needed, see below) in the `filestore/bind` directory, which is managed as a git repository (depending on the environment)
 * play the [bind](https://github.com/centos/ansible-role-bind) ansible role, either on trigger or by waiting for the automatic role run (see the Ansible section for the setup)

We also have some delegated zones that are either still served by Bind, or served by PowerDNS (see below).

#### Static zones

As described above, to add/delete/modify a DNS record in a static zone, one just has to:

  * update the SOA (serial) in the `zone` file
  * update/add/delete record
  * commit/push to git (in the `filestore` git repository, depending on the inventory)
  * trigger ansible
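
As an illustration of the SOA update step, here is a minimal sketch of a serial bump. It assumes the common `YYYYMMDDNN` serial convention, which this document does not state; the `next_serial` helper is hypothetical and not part of the ansible role.

```python
from datetime import date


def next_serial(current, today):
    """Return the next SOA serial, assuming a YYYYMMDDNN numbering scheme."""
    base = int(today.strftime("%Y%m%d")) * 100
    # First change of the day restarts at NN=00; otherwise just increment,
    # so the serial never goes backwards.
    return base if current < base else current + 1


print(next_serial(2021062403, date(2021, 6, 25)))  # 2021062500
print(next_serial(2021062500, date(2021, 6, 25)))  # 2021062501
```

Always moving forward matters because secondaries only transfer a zone when the serial they see is higher than the one they already hold.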

List of zones served through static files (public zones): 

 * centos.org
   * ocp.centos.org
   * ocp.ci.centos.org
   * ocp.stg.ci.centos.org
 * centosproject.org

##### Specific records

`CAA` records: used to publicly announce which CAs are allowed to issue certificates for our zones:

```bash
dig @ns1.centos.org -t CAA centos.org +short
0 issue "amazon.com"
0 issuewild "letsencrypt.org"
0 issue "letsencrypt.org"
0 issuewild "digicert.com"
0 issue "digicert.com"

``` 
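
To check such an answer programmatically, the `dig +short` output can be parsed line by line. This is a small sketch (the `allowed_cas` helper is hypothetical) that uses the output shown above as sample input.

```python
def allowed_cas(dig_short_output, tag="issue"):
    """Collect the CA domains listed for one CAA tag in `dig +short` output."""
    cas = set()
    for line in dig_short_output.splitlines():
        parts = line.split(maxsplit=2)
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        _flags, line_tag, value = parts
        if line_tag == tag:
            cas.add(value.strip('"'))
    return cas


# Output copied from the dig query shown above.
sample = '''0 issue "amazon.com"
0 issuewild "letsencrypt.org"
0 issue "letsencrypt.org"
0 issuewild "digicert.com"
0 issue "digicert.com"
'''
print(sorted(allowed_cas(sample)))
print("amazon.com" in allowed_cas(sample, tag="issuewild"))
```

With this sample, three CAs are listed for `issue`, while only two of them are also listed for `issuewild` (wildcard certificates).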

`TXT / SPF` records: used for [Sender Policy Framework](https://en.wikipedia.org/wiki/Sender_Policy_Framework), to restrict which IP blocks/hosts can send mail originating from the @centos.org domain

`TXT` for kerberos: a pointer announcing that the FEDORAPROJECT.ORG realm can be used for kerberos tickets:

```
dig @ns1.centos.org -t TXT _kerberos.centos.org +short
"FEDORAPROJECT.ORG"
```

`CNAME` : simple aliases for other A/AAAA records

`CNAME` for TLS/ACME DNS challenge: we use some `static` CNAME records pointing to the equivalent record in the `dynamic` zone (see below)

`NS` records: for the zones that we delegate to other authoritative servers, for example (but not limited to) `mirror.centos.org`, served by PowerDNS/GeoIP (see also below)

#### Dynamic zones

We also have a specific `acme.centos.org` zone that is only used for one purpose: creating, on the fly, the TXT records that LetsEncrypt/ACME uses for the DNS challenge.
For this we use the [acme.sh](https://github.com/Neilpang/acme.sh) tool, which does that automatically for us: it uses nsupdate, with a specific allowed key, to dynamically create the record that the ACME server will verify before signing the CSR.

See [TLS section](/tls/#how-to-obtain-new-cert-dns-challenge-is-the-preferred-way) on how to use it.

Some pointers:

  * [https://github.com/acmesh-official/acme.sh/wiki/dnsapi#7-use-nsupdate-to-automatically-issue-cert](https://github.com/acmesh-official/acme.sh/wiki/dnsapi#7-use-nsupdate-to-automatically-issue-cert)
  * [https://github.com/Neilpang/acme.sh/wiki/DNS-alias-mode](https://github.com/Neilpang/acme.sh/wiki/DNS-alias-mode)
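
To make the nsupdate step more concrete, the sketch below builds the kind of command text that could be piped into `nsupdate -k <keyfile>` to publish a challenge record. acme.sh does this for us, so this is only an illustration: the server name, record name, token and the `nsupdate_script` helper are all hypothetical.

```python
def nsupdate_script(server, zone, record, token, ttl=60):
    """Build the text fed to `nsupdate` to publish one challenge TXT record."""
    name = record.rstrip(".")
    origin = zone.rstrip(".")
    if name != origin and not name.endswith("." + origin):
        raise ValueError(f"{record} is not inside zone {zone}")
    lines = [
        f"server {server}",
        f"zone {zone}",
        f'update add {record} {ttl} TXT "{token}"',
        "send",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only.
print(nsupdate_script(
    server="ns.example.org",
    zone="acme.centos.org",
    record="_acme-challenge.www.acme.centos.org.",
    token="made-up-challenge-token",
), end="")
```

The zone check reflects the intent of the setup: the key used for dynamic updates should only ever be used to touch records inside the dedicated `acme.centos.org` zone.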


### PowerDNS servers (GeoIP)

For some specific records, like `mirror.centos.org` or `vault.centos.org` (and others), we wanted something more than simple Round-Robin logic in a Bind zone file.
The idea was to choose where to redirect based on GeoIP/country information, and so use the nearest server for that role.

[PowerDNS](https://www.powerdns.com) is a really good authoritative solution that also lets you plug in your own [Pipe backend](https://doc.powerdns.com/md/authoritative/backend-pipe/), meaning that we were able to implement our own logic based on our requirements.

See our [pdns-custom-geoip-backend](https://github.com/CentOS/pdns-custom-geoip-backend) git repository that contains the simple code used for that and corresponding [pdns-pipe](https://github.com/CentOS/ansible-role-pdns-pipe) ansible role used to automatically deploy it.
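
For readers unfamiliar with the pipe backend, the sketch below shows the general shape of the exchange (ABI version 1): PowerDNS sends tab-separated lines and the backend replies with `DATA` lines followed by `END`. It is not the code from the repository above: the node data, the `country_of` lookup and the 300s TTL are placeholders.

```python
# Hypothetical data: country code -> IPv4 addresses of nearby nodes.
NODES_BY_COUNTRY = {
    "FR": ["192.0.2.10"],
    "US": ["198.51.100.20", "198.51.100.21"],
}
DEFAULT_NODES = ["203.0.113.30"]


def country_of(ip):
    """Placeholder GeoIP lookup; a real backend would query a GeoIP database."""
    return {"192.0.2.1": "FR", "198.51.100.1": "US"}.get(ip, "")


def handle(line):
    """Answer one tab-separated line sent by PowerDNS (pipe ABI version 1)."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] == "HELO":
        return ["OK\tsketch geo backend"]
    if fields[0] == "Q" and len(fields) == 6:
        _, qname, qclass, qtype, qid, remote_ip = fields
        answers = []
        if qtype in ("A", "ANY"):
            # Pick nodes for the client's country, else fall back to a default set.
            ips = NODES_BY_COUNTRY.get(country_of(remote_ip), DEFAULT_NODES)
            answers = [f"DATA\t{qname}\t{qclass}\tA\t300\t{qid}\t{ip}" for ip in ips]
        return answers + ["END"]
    return ["FAIL"]


for reply in handle("Q\tmirror.example.org\tIN\tA\t-1\t192.0.2.1\n"):
    print(reply)
```

A real backend would run `handle` in a loop over standard input and flush each reply, since PowerDNS keeps the process running and talks to it over a pipe.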

Workflow for the `dynamic` backend : 

 * we have a central `nodes.db` sqlite3 DB (see `/var/lib/centos-infra/nodes.db`), with a specific schema to record fqdn, ipv4/ipv6 address[es], region, continent, country, and whether the node is active or not
 * if we have to add/modify/remove a node, we just do it in that sqlite DB
 * we then regenerate a .json file by calling the `/var/lib/centos-infra/gen_backend` script: it parses the DB, sorts the data into the format consumed by the powerdns pipe backend, and also encrypts the .json with gpg
 * delegated powerdns nodes detect changes, decrypt the file, and reload it in memory automatically
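
The DB-to-JSON step of this workflow can be sketched as follows. This is not the `gen_backend` script: the table layout, column names, JSON structure and sample nodes are assumptions made for the example, and the gpg encryption step is left out.

```python
import json
import sqlite3


def export_active_nodes(conn):
    """Dump active nodes as JSON, grouped by country code."""
    rows = conn.execute(
        "SELECT fqdn, ipv4, ipv6, continent, country FROM nodes "
        "WHERE active = 1 ORDER BY country, fqdn"
    )
    by_country = {}
    for fqdn, ipv4, ipv6, continent, country in rows:
        by_country.setdefault(country, []).append(
            {"fqdn": fqdn, "ipv4": ipv4, "ipv6": ipv6, "continent": continent}
        )
    return json.dumps(by_country, indent=2, sort_keys=True)


# Throwaway in-memory database with a hypothetical schema and made-up nodes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (fqdn TEXT PRIMARY KEY, ipv4 TEXT, ipv6 TEXT, "
    "region TEXT, continent TEXT, country TEXT, active INTEGER)"
)
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("node1.example.org", "192.0.2.10", "2001:db8::10", "emea", "EU", "FR", 1),
        ("node2.example.org", "198.51.100.20", None, "na", "NA", "US", 1),
        ("node3.example.org", "203.0.113.30", None, "na", "NA", "US", 0),
    ],
)
print(export_active_nodes(conn))
```

Filtering on the `active` flag at export time is what makes disabling a node a simple DB update: the node stays recorded but disappears from the data served by the backend.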

Worth knowing: on the existing setup, when we want to take a machine out of the pool, or add it back, we have a simple ansible ad-hoc playbook (using prompt vars) for this, `adhoc-node-pdns-modify.yml`:

```
ansible-playbook-prod playbooks/adhoc-node-pdns-modify.yml 
Host to modify in PowerDNS ? => : centosq7.centos.org
Action (enable|disable) ? => : enable

PLAY [centosq7.centos.org] ********************************************************************************************

TASK [Enable/Disable msync node in PowerDNS geoip backend] ************************************************************
Friday 25 June 2021  14:46:44 +0200 (0:00:07.333)       0:00:07.333 *********** 
changed: [centosq7.centos.org]

PLAY RECAP ************************************************************************************************************
centosq7.centos.org        : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

Friday 25 June 2021  14:46:51 +0200 (0:00:06.806)       0:00:14.140 *********** 
=============================================================================== 
Enable/Disable msync node in PowerDNS geoip backend ------------------------------------------------------------ 6.81s
Playbook run took 0 days, 0 hours, 0 minutes, 14 seconds

```


## Internal DNS setup

### Bind authoritative and resolvers

For DCs that we control (Red Hat ones), where we have an internal zone/subnet and so a different set of `internal` IPs, we also use Bind, but with other features enabled in our Ansible role, such as recursion (a specific ACL lets the internal subnet use Bind both as authoritative server *and* resolver).

The procedure to update a zone is identical to the one described for public zones, but it uses a different ansible inventory and so a different `filestore` git repo (tied to that inventory/env).

Worth knowing that we (ab)use a specific feature, Response Policy Zones ([RPZ](https://www.isc.org/rpz/)): while staying internal, it lets us automatically answer for known `external` records (like mirrorlist.centos.org) with an internal IP, and so not query the `public` authoritative servers.
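
As a rough illustration of what such an override looks like, the sketch below renders RPZ "local data" records, where the owner name is the queried name written relative to the policy zone. The IP address and the `rpz_records` helper are hypothetical, and the variables actually expected by the ansible role may look different.

```python
import ipaddress


def rpz_records(overrides):
    """Render RPZ records answering the listed names with an internal IPv4."""
    lines = []
    for name, ip in sorted(overrides.items()):
        ipaddress.IPv4Address(ip)  # raises ValueError on an invalid address
        # Inside the policy zone, the owner name is the queried name
        # without its trailing dot (it is relative to the RPZ origin).
        lines.append(f"{name.rstrip('.')}\tIN\tA\t{ip}")
    return "\n".join(lines) + "\n"


# Hypothetical internal address, for illustration only.
print(rpz_records({"mirrorlist.centos.org.": "10.0.0.10"}), end="")
```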

All that is supported by our [bind](https://github.com/centos/ansible-role-bind) Ansible role, so consider reading `defaults/main.yml` to see how it works, or look in the ansible inventory/filestore (if you have access) for real examples.


### Unbound resolvers

In some specific subnets/environments we also use [Unbound](https://www.nlnetlabs.nl/projects/unbound/about/), which is also a really lightweight/fast resolver that supports plenty of features.

While it isn't technically called RPZ, Unbound lets you define some records locally and forward other queries to other resolvers (forwarders).

Our [unbound](https://github.com/CentOS/ansible-role-unbound) ansible role supports such features, like : 

 * control ACL (for recursion)
 * override some specific records through ansible list
 * also *automatically* parse an ansible group to dynamically generate a kind of internal zone.

That last feature is the one we use (through `unbound_local_groups`) for the `internal` rdu2.centos.org zone: when we add a new machine, with an IP defined for the host/VM, it is automatically added to the computed file used by unbound.
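
The idea of generating local records from a group of hosts can be sketched like this. It is not the role's template: the zone name, hosts and IPs are made up, and both the `static` zone type and the PTR entries are assumptions about what one might want to generate.

```python
def unbound_local_data(zone, hosts):
    """Render unbound.conf lines for a small internal zone from host -> IPv4."""
    lines = [f'local-zone: "{zone}." static']
    for name, ip in sorted(hosts.items()):
        # Forward record, plus the matching reverse (PTR) entry.
        lines.append(f'local-data: "{name}.{zone}. IN A {ip}"')
        lines.append(f'local-data-ptr: "{ip} {name}.{zone}."')
    return "\n".join(lines) + "\n"


# Hypothetical group members, as they could be read from an inventory.
group = {"node2": "10.0.0.12", "node1": "10.0.0.11"}
print(unbound_local_data("internal.example.org", group), end="")
```

Because the output is derived from the host list alone, adding a host to the group is enough for its records to appear on the next run.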