# Bare-metal host deploy operation

This process can be used to add a new bare-metal node to the CentOS Infra/inventory.
The node can be hosted within the `Community Cage` (Red Hat) DC, or be a dedicated/hosted server provided by a CentOS sponsor.

## DataCenter we control (Red Hat DC)

Through an internal ticket with PNT/DevOps, we ensure that the machine/chassis is racked and documented.
We also add it to the [Internal Inventory](https://docs.google.com/spreadsheets/d/1K-aewLJ17z3pRC6K5qyBRJYtNXy1WcxRSVwPkGf4NXQ) and start "reserving" the IP addresses needed for the IPMI/iDRAC/mgmt vlan interface and for the operating system.

We will probably also have to create another ticket on the [internal](https://help.redhat.com) portal to ensure that the ToR switches (which we don't control) have their ports configured correctly (enabled, set to the correct VLAN PVID, etc).

### Hardware initialization

There is a *very* small IP range available in the mgmt vlan for newly connected nodes. On the internal dhcpd node (check the inventory to see which server currently has the `boot-server` ansible role), you can verify whether the new machine was leased an IP from the oob/management vlan.

Once we have `dial tone` on the hardware side (oob/mgmt vlan), we need to ensure that we :

* change the default credentials to randomly generated ones
* configure alerting for hardware issues
* correctly set up the raid array if we have a hardware raid controller

### Preparing PXE/UEFI boot env

If we want ansible to automatically deploy it, we just have to add the node to the inventory and ensure that `<inventory>/host_vars/<node>` has at least :

* the following variables set :
    * `ipmi_ip`, `ipmi_user`, `ipmi_pass` : used to remotely pxe boot the node
    * `ip`, `gateway`, `netmask` and `dns` (usually, apart from `ip`, which is unique, the rest comes through inheritance)
* based on group inheritance, ensure that the variables documented in [adhoc-provision-node.yml](https://github.com/CentOS/ansible-infra-playbooks/blob/master/adhoc-provision-node.yml) are also defined

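As an illustration, a host_vars file covering the variables listed above could look like the following minimal sketch. The node name, the addresses (taken from documentation ranges) and the `vault_ipmi_pass` variable are hypothetical placeholders, not values from the real inventory. The authoritative list of required variables remains the one documented in `adhoc-provision-node.yml`.

```yaml
# <inventory>/host_vars/node1.example.org   (hypothetical node name)
# All values below are placeholders for illustration only.

# Out-of-band (IPMI/iDRAC) access, used to remotely pxe boot the node
ipmi_ip: 192.0.2.10
ipmi_user: admin
# Assumption: the secret is kept encrypted (for example with ansible-vault)
# and only referenced here, instead of being written in clear text.
ipmi_pass: "{{ vault_ipmi_pass }}"

# Operating system network settings; `ip` is unique per node
ip: 198.51.100.20

# Usually inherited from the group the node belongs to, so these would
# only be set here to override the inherited values:
# gateway: 198.51.100.1
# netmask: 255.255.255.0
# dns: 198.51.100.53
```

Keeping `gateway`, `netmask` and `dns` at the group level means a new node normally only needs its IPMI details and its own `ip`. The exact format expected for each variable should be checked against the playbook itself.
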
### Deploying the machine

If the previous steps are done and the network switch port[s] are working, we can now proceed with ansible :

```
ansible-playbook-prod playbooks/adhoc-provision-node.yml
[WARNING] Nodes to be fully wiped/reinstalled with CentOS => : <my_new_node[s]>
```

In summary, that playbook will (through `delegate_to` ansible tasks) :

* prepare the kickstart needed for the host to be deployed (jinja2 template)
* prepare the pxe/tftp/grub settings to boot from network (on the tftpd node)
* use ipmi to reset the hardware node and force booting over pxe
* wait for sshd to be available on the freshly deployed node

!!! warning
    Attention : this will *wipe* the existing operating system, which is why the playbook uses ansible `vars_prompt` to wait for input that *you* need to verify. You can also specify a group of machines to be deployed, but a wrong input would destroy/reinstall existing nodes.

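To make the flow described above more concrete, here is a heavily simplified sketch of what such a playbook could look like. It is not the content of the real `adhoc-provision-node.yml`. The `target` prompt variable, the template names, the destination paths, the `tftp_server` variable and the timeouts are all assumptions made up for this example. The modules used (`ansible.builtin.template`, `community.general.ipmi_boot`, `community.general.ipmi_power`, `ansible.builtin.wait_for`) are only one plausible way to implement each step.

```yaml
# Illustrative sketch only, NOT the real adhoc-provision-node.yml
- hosts: "{{ target }}"
  gather_facts: false            # the node has no usable OS yet
  vars_prompt:
    # Forces the operator to type (and so verify) the nodes to be wiped
    - name: target               # hypothetical variable name
      prompt: "[WARNING] Nodes to be fully wiped/reinstalled with CentOS => "
      private: false
  tasks:
    - name: Render the kickstart for this node (jinja2 template)
      ansible.builtin.template:
        src: kickstart.j2                                        # hypothetical template
        dest: "/var/www/html/ks/{{ inventory_hostname }}.cfg"    # hypothetical path
      delegate_to: "{{ tftp_server }}"                           # hypothetical variable

    - name: Render the pxe/grub entry for this node
      ansible.builtin.template:
        src: grub-pxe.j2                                         # hypothetical template
        dest: "/var/lib/tftpboot/grub.cfg-{{ inventory_hostname }}"  # hypothetical path
      delegate_to: "{{ tftp_server }}"

    # The two ipmi modules below need the pyghmi python library
    - name: Force the next boot over pxe through ipmi
      community.general.ipmi_boot:
        name: "{{ ipmi_ip }}"
        user: "{{ ipmi_user }}"
        password: "{{ ipmi_pass }}"
        bootdev: network
      delegate_to: localhost

    - name: Reset the hardware node
      community.general.ipmi_power:
        name: "{{ ipmi_ip }}"
        user: "{{ ipmi_user }}"
        password: "{{ ipmi_pass }}"
        state: reset
      delegate_to: localhost

    - name: Wait for sshd on the freshly deployed node
      ansible.builtin.wait_for:
        host: "{{ ip }}"
        port: 22
        delay: 60                # assumed values, tune to the install time
        timeout: 3600
      delegate_to: localhost
```

Because the play targets whatever is typed at the prompt, nothing is selected by default, and every task runs from another machine through `delegate_to` since the node itself is not reachable until the reinstall completes.
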
## Sponsored machine

When we receive a new dedicated server, hosted in another DC that we don't control (no pxe/dhcp), the process usually goes like this :

* through email exchanged with the sponsor, we agree on a minimal setup
* we receive initial credentials
* we collect the needed information (like ipv4/ipv6 address[es], dns resolvers, etc)
* we remotely reinstall the machine on itself (without remote console access), following our standards; this is faster than auditing the state in which we receive a machine
* we add the node to dns/ansible (see [Common section](/operations/deploy/common))