diff --git a/docs/infra/dns.md b/docs/infra/dns.md
index 72c78aa..a22f701 100644
--- a/docs/infra/dns.md
+++ b/docs/infra/dns.md
@@ -4,14 +4,141 @@
 ### Bind authoritative servers
+We use [Bind](https://www.isc.org/bind/) as our main authoritative DNS solution.
+For the `public` zones, we use the traditional primary/secondary setup: the primary zone is updated, the secondary servers are notified, and they then issue an IXFR/AXFR transfer to get the latest zone content (and so end up with the same SOA).
+
+Configuring DNS zones is straightforward:
+
+ * update the static zone (if needed, see below) in the `filestore/bind` directory, which is managed as a git repository (the repository used depends on the environment)
+ * play the [bind](https://github.com/centos/ansible-role-bind) ansible role, either on trigger or by waiting for the automatic role run (see the Ansible section for the setup)
+
+We also have some delegated zones that are either still served by bind or served by PowerDNS (see below).
+
 #### Static zones
+As described above, to add, delete or modify a DNS record in a static zone, one just has to:
+
+ * update the SOA (serial number) in the `zone` file
+ * update/add/delete the record
+ * commit/push to git (in the `filestore` git repository, depending on the inventory)
+ * trigger ansible
+
+List of zones served through static files (public zones):
+
+ * centos.org
+ * ocp.centos.org
+ * ocp.ci.centos.org
+ * ocp.stg.ci.centos.org
+ * centosproject.org
+
+##### Specific records
+
+`CAA` records: used to publicly announce which CAs are allowed to issue certificates for our zones:
+
+```bash
+dig @ns1.centos.org -t CAA centos.org +short
+0 issue "amazon.com"
+0 issuewild "letsencrypt.org"
+0 issue "letsencrypt.org"
+0 issuewild "digicert.com"
+0 issue "digicert.com"
+
+```
+
+`TXT / SPF` records: used for [Sender Policy Framework](https://en.wikipedia.org/wiki/Sender_Policy_Framework), to restrict which IP blocks/hosts can send mail originating from the @centos.org domain
+
+`TXT` for kerberos: we have a pointer announcing that the `FEDORAPROJECT.ORG` realm can be used for kerberos tickets
+
+```
+dig @ns1.centos.org -t TXT _kerberos.centos.org +short
+"FEDORAPROJECT.ORG"
+```
+
+`CNAME`: simple aliases for other A/AAAA records
+
+`CNAME` for TLS/ACME dns challenge: we use some `static` CNAMEs pointing to the equivalent record in the `dynamic` zone (see below)
+
+`NS` records: for the zones that we delegate to other authoritative servers, for example (but not limited to) `mirror.centos.org`, served by PowerDNS/GeoIP (see also below)
+
 #### Dynamic zones
+We also have a specific `acme.centos.org` zone that is only used for one purpose: creating TXT records on the fly that LetsEncrypt/ACME uses for the DNS challenge.
+For this we use the [acme.sh](https://github.com/Neilpang/acme.sh) tool, which does that automatically for us: it uses nsupdate with a specific allowed key to dynamically create the record that the ACME server verifies before signing the CSR.
+
+See [TLS section](/tls/#how-to-obtain-new-cert-dns-challenge-is-the-preferred-way) on how to use it.
+
+Some pointers:
+
+ * [https://github.com/acmesh-official/acme.sh/wiki/dnsapi#7-use-nsupdate-to-automatically-issue-cert](https://github.com/acmesh-official/acme.sh/wiki/dnsapi#7-use-nsupdate-to-automatically-issue-cert)
+ * [https://github.com/Neilpang/acme.sh/wiki/DNS-alias-mode](https://github.com/Neilpang/acme.sh/wiki/DNS-alias-mode)
+
+
 ### PowerDNS servers (GeoIP)
+For some specific records, like `mirror.centos.org` or `vault.centos.org` (and others), we wanted something other than the simple Round-Robin logic of a Bind zone file.
+The idea was to choose where to redirect based on GeoIP/country information, and so use the nearest server for that role.
+
+[PowerDNS](https://www.powerdns.com) is a really good authoritative solution that also lets you plug in your own [Pipe backend](https://doc.powerdns.com/md/authoritative/backend-pipe/), meaning that we were able to implement our own logic based on our requirements.
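To give an idea of what such a Pipe backend looks like, here is a minimal Python sketch of the tab-separated protocol (ABI version 1): PowerDNS sends `HELO` and `Q` lines on stdin, and the backend replies with `OK`, `DATA` and `END` lines. This is not the code from the pdns-custom-geoip-backend repository. The `NODES` table, the `PREFIX_TO_COUNTRY` lookup (a stand-in for a real GeoIP database), the TTL and all addresses are hypothetical.

```python
import sys

# Hypothetical data: country code -> mirror IPs (not the real node list).
NODES = {
    "BE": ["192.0.2.10"],
    "US": ["198.51.100.20"],
}
DEFAULT_COUNTRY = "US"
# Hypothetical stand-in for a GeoIP lookup: client IP prefix -> country.
PREFIX_TO_COUNTRY = {"203.0.113.": "BE"}
TTL = "300"  # assumed TTL for the generated answers


def country_for(remote_ip):
    """Map a client IP to a country code, falling back to the default."""
    for prefix, country in PREFIX_TO_COUNTRY.items():
        if remote_ip.startswith(prefix):
            return country
    return DEFAULT_COUNTRY


def handle_line(line):
    """Return the reply lines for one tab-separated pipe backend request."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] == "HELO":
        return ["OK\tgeo sketch ready"]
    if fields[0] == "Q" and len(fields) >= 6:
        # ABI v1 query: Q, qname, qclass, qtype, id, remote-ip-address
        _, qname, qclass, qtype, qid, remote_ip = fields[:6]
        replies = []
        if qtype in ("A", "ANY"):
            ips = NODES.get(country_for(remote_ip), NODES[DEFAULT_COUNTRY])
            for ip in ips:
                replies.append("\t".join(["DATA", qname, qclass, "A", TTL, qid, ip]))
        replies.append("END")
        return replies
    return ["FAIL"]


def serve(stream_in=sys.stdin, stream_out=sys.stdout):
    """Answer requests line by line, flushing after each reply."""
    for line in stream_in:
        for reply in handle_line(line):
            print(reply, file=stream_out, flush=True)
```

Calling `serve()` from a script configured as the pipe command would run the loop; keeping `handle_line` separate makes the selection logic easy to test without a running PowerDNS.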
+
+See our [pdns-custom-geoip-backend](https://github.com/CentOS/pdns-custom-geoip-backend) git repository, which contains the simple code used for that, and the corresponding [pdns-pipe](https://github.com/CentOS/ansible-role-pdns-pipe) ansible role used to deploy it automatically.
+
+Workflow for the `dynamic` backend:
+
+ * we have a central `nodes.db` sqlite3 DB (see `/var/lib/centos-infra/nodes.db`) with a specific schema holding the fqdn, ipv4/ipv6 address[es], region, continent, country, and whether the node is active or not
+ * if we have to add/modify/remove a node, we just do it in that sqlite DB
+ * we then regenerate a .json file by parsing the DB and sorting it into the format consumed by the powerdns pipe backend (call the `/var/lib/centos-infra/gen_backend` script, which also encrypts the .json with gpg)
+ * delegated powerdns nodes detect the changes, decrypt the files, and reload them in memory automatically
+
+Worth knowing: for the existing setup, when we want to take a machine out of the pool or add it back, we have a simple ansible ad-hoc playbook (using prompt vars) for this, `adhoc-node-pdns-modify.yml`:
+
+```
+ansible-playbook-prod playbooks/adhoc-node-pdns-modify.yml
+Host to modify in PowerDNS ? => : centosq7.centos.org
+Action (enable|disable) ? => : enable
+
+PLAY [centosq7.centos.org] ********************************************************************************************
+
+TASK [Enable/Disable msync node in PowerDNS geoip backend] ************************************************************
+Friday 25 June 2021 14:46:44 +0200 (0:00:07.333) 0:00:07.333 ***********
+changed: [centosq7.centos.org]
+
+PLAY RECAP ************************************************************************************************************
+centosq7.centos.org : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
+
+Friday 25 June 2021 14:46:51 +0200 (0:00:06.806) 0:00:14.140 ***********
+===============================================================================
+Enable/Disable msync node in PowerDNS geoip backend ------------------------------------------------------------ 6.81s
+Playbook run took 0 days, 0 hours, 0 minutes, 14 seconds
+
+```
+
+
 ## Internal DNS setup
 ### Bind authoritative and resolvers
+For DCs that we control (Red Hat ones), where we have an internal zone/subnet and so a different set of `internal` IPs, we also use Bind, but with other features enabled in our Ansible role, such as allowing recursion (a specific ACL lets internal subnets use bind both as authoritative server *and* resolver)
+
+The procedure to update a zone is identical to the one described for public zones, but it uses a different ansible inventory and so a different `filestore` git repo (tied to that inventory/env)
+
+Worth knowing: we (ab)use a specific feature, Response Policy Zones ([RPZ](https://www.isc.org/rpz/)), which lets us automatically answer queries from internal clients for known `external` records (like mirrorlist.centos.org) with an internal IP, without querying the `public` authoritative servers.
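As an illustration of how such an RPZ override is expressed, here is a small Python sketch that renders "local data" records for a policy zone: the owner name is the overridden query name prepended to the policy zone name, and the record data is the internal answer. The policy zone name `rpz.example`, the TTL and the address are hypothetical, and this is not how our bind role generates its files.

```python
import ipaddress


def rpz_local_data(overrides, rpz_zone="rpz.example", ttl=300):
    """Render RPZ local-data records for a {query name: internal IP} mapping.

    rpz_zone and ttl are illustrative defaults, not values from our setup.
    """
    lines = []
    for name, address in sorted(overrides.items()):
        # Pick the record type from the address family.
        rtype = "AAAA" if ipaddress.ip_address(address).version == 6 else "A"
        # QNAME trigger: the query name becomes a label prefix of the RPZ zone.
        owner = f"{name.rstrip('.')}.{rpz_zone.rstrip('.')}."
        lines.append(f"{owner} {ttl} IN {rtype} {address}")
    return lines


for record in rpz_local_data({"mirrorlist.centos.org": "192.0.2.53"}):
    print(record)
```

Clients using that resolver then receive the internal address for the overridden name, while every other query is resolved normally.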
+
+All of that is supported by our [bind](https://github.com/centos/ansible-role-bind) Ansible role, so consider reading `defaults/main.yml` to see how it works, or look at the ansible inventory/filestore (if you have access) for real examples.
+
+
 ### Unbound resolvers
+
+In some specific subnets/environments we also use [Unbound](https://www.nlnetlabs.nl/projects/unbound/about/), which is also a really lightweight/fast resolver that supports plenty of features.
+
+While the feature is not technically called RPZ, Unbound lets you define some records locally and forward other queries to other resolvers (forwarders).
+
+Our [unbound](https://github.com/CentOS/ansible-role-unbound) ansible role supports such features, like:
+
+ * control ACL (for recursion)
+ * override some specific records through an ansible list
+ * also *automatically* parse an ansible group itself to dynamically generate a kind of internal zone.
+
+That last feature is the one we use (through `unbound_local_groups`) for the `internal` rdu2.centos.org zone: when we add a new machine with an IP defined for the host/VM, it is automatically added to the computed file used by unbound.
+
+
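The kind of file computed from an ansible group can be sketched as follows. This Python example turns a host-to-IP mapping into Unbound `local-zone`, `local-data` and `local-data-ptr` lines. It is only an illustration under assumptions: the host name and address are made up, the `static` zone type is a guess, and the real role renders its own template from the inventory.

```python
import ipaddress


def unbound_local_data(hosts, zone="rdu2.centos.org"):
    """Render Unbound local zone lines for a {fqdn: ip} mapping.

    The mapping stands in for an ansible group; names and IPs are hypothetical.
    """
    # Assumed zone type: answer from local data only for names in this zone.
    lines = [f'local-zone: "{zone}." static']
    for fqdn, ip in sorted(hosts.items()):
        addr = ipaddress.ip_address(ip)
        rtype = "AAAA" if addr.version == 6 else "A"
        # Forward record, then the matching reverse (PTR) shorthand.
        lines.append(f'local-data: "{fqdn}. IN {rtype} {addr}"')
        lines.append(f'local-data-ptr: "{addr} {fqdn}."')
    return lines


for line in unbound_local_data({"node1.rdu2.centos.org": "192.0.2.11"}):
    print(line)
```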