Tree - centos/centos-infra-docs

phsmoura / centos / centos-infra-docs

Forked from centos/centos-infra-docs 2 years ago

Blame docs/operations/ci/adding_nodes.md

# Adding Compute/Worker nodes

This SOP should be used in the following scenario:

- A Red Hat OpenShift Container Platform 4.x cluster was installed some time ago (1+ days ago) and additional worker nodes are required to increase the cluster's capacity.

## Steps

1. Add the new nodes to the appropriate group in the appropriate inventory file.

eg:

```
# ocp, compute/worker:
[ocp-ci-compute]
newnode1.example.centos.org
newnode2.example.centos.org
newnode3.example.centos.org
newnode4.example.centos.org
newnode5.example.centos.org
```

eg:

```
# ocp.stg, compute/worker:
[ocp-stg-ci-compute]
newnode6.example.centos.org
newnode7.example.centos.org
newnode8.example.centos.org
newnode9.example.centos.org

# ocp.stg, master/control plane
[ocp-stg-ci-master]
newnode10.example.centos.org
```
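
Before moving on, it can help to confirm which hosts a group now contains. The following is a minimal sketch and not part of the original SOP: `hosts_in_group` is a hypothetical helper, and it assumes an INI-style inventory like the examples above, with one host per line.

```shell
# Hypothetical helper: print the hosts listed under one group of an
# INI-style Ansible inventory file.
# Usage: hosts_in_group <inventory-file> <group-name>
hosts_in_group() {
  local inventory_file="$1" group="$2"
  awk -v header="[${group}]" '
    $0 == header { in_group = 1; next }   # start of the requested group
    /^\[/        { in_group = 0 }         # any other group header ends it
    in_group && NF && $0 !~ /^#/          # print non-empty, non-comment lines
  ' "$inventory_file"
}
```

For example, `hosts_in_group inventory ocp-stg-ci-compute` would list the staging compute nodes, assuming the inventory file is named `inventory`.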

2. Examine the `inventory` file for `ocp` or `ocp.stg` and determine which management node belongs to the group `ocp-ci-management`.

eg:

```
[ocp-ci-management]
some-managementnode.example.centos.org
```

3. Find the OCP admin user, which is stored in the host vars for this management node under the key `ocp_service_account`.

eg:

```
host_vars/some-managementnode.example.centos.org:ocp_service_account: adminuser
```
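
The lookup in this step can be scripted. This is a rough sketch only: `ocp_admin_user` is a hypothetical helper, and it assumes the host vars file stores the key as a plain, unquoted `key: value` line, as in the example above.

```shell
# Hypothetical helper: print the value of ocp_service_account from a
# host vars file. Assumes a simple, unquoted "key: value" line.
# Usage: ocp_admin_user <host-vars-file>
ocp_admin_user() {
  local hostvars_file="$1"
  awk -F': *' '$1 == "ocp_service_account" { print $2; exit }' "$hostvars_file"
}
```

For the example above, `ocp_admin_user host_vars/some-managementnode.example.centos.org` would print the user to become in the next step.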

4. SSH to the node identified in step `2`, and become the user identified in step `3`.

eg:

```
ssh some-managementnode.example.centos.org

sudo su - adminuser
```

5. Verify that you are authenticated to the OpenShift cluster as `system:admin`.

```
oc whoami
system:admin
```

6. Retrieve the certificate from the internal API and encode it as a base64 string, like so.

eg:

```
echo "q" | openssl s_client -connect api-int.ocp.ci.centos.org:22623 -showcerts | awk '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/' | base64 --wrap=0
DONE
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXCERTSTOREDASABASE64ENCODEDSTRING=
```

7. Replace the cert in the compute/worker ignition file at the `XXXXXXXXREPLACEMEXXXXXXXX=` placeholder. Be sure to commit this change to SCM and push.

```
cat filestore/rhcos/compute.ign
{"ignition":{"config":{"append":[{"source":"https://api-int.ocp.ci.centos.org:22623/config/worker","verification":{}}]},"security":{"tls":{"certificateAuthorities":[{"source":"data:text/plain;charset=utf-8;base64,XXXXXXXXREPLACEMEXXXXXXXX=","verification":{}}]}},"timeouts":{},"version":"2.2.0"},"networkd":{},"passwd":{},"storage":{"disks":[{"device":"/dev/sdb","wipeTable":true}]},"systemd":{}}
```
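
The substitution can be done by hand in an editor, or scripted. The sketch below is one possible approach, not the procedure the SOP prescribes: `replace_ignition_cert` is a hypothetical helper, and it assumes GNU `sed` and an ignition file that still contains the literal placeholder shown above.

```shell
# Placeholder string as it appears in the ignition file shown above.
PLACEHOLDER='XXXXXXXXREPLACEMEXXXXXXXX='

# Hypothetical helper: substitute a base64-encoded cert for the placeholder.
# Usage: replace_ignition_cert <ignition-file> <base64-cert>
replace_ignition_cert() {
  local ign_file="$1" cert_b64="$2"
  # Stop early if the placeholder is missing, for example because a
  # previous cert was already written into the file.
  if ! grep -qF "$PLACEHOLDER" "$ign_file"; then
    echo "placeholder not found in ${ign_file}" >&2
    return 1
  fi
  # '|' is a safe sed delimiter here: it never appears in base64 output.
  sed -i "s|${PLACEHOLDER}|${cert_b64}|" "$ign_file"
}
```

A call such as `replace_ignition_cert filestore/rhcos/compute.ign "$cert_b64"` would apply it, where `cert_b64` is assumed to hold the base64 string produced in step 6. Review the resulting diff before committing.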

8. Once the ignition file has been updated, run the `adhoc-provision-ocp4-node` playbook to copy the updated ignition files to the HTTP server and install the new node(s). When prompted, specify the hostname of the new node. It is best to add one node at a time; this step takes a minute or two per node.

eg:

```
ansible-playbook playbooks/adhoc-provision-ocp4-node.yml
[WARNING] Nodes to be fully wiped/reinstalled with OCP => : newnode6.example.centos.org
```

9. As the new nodes are provisioned, they will attempt to join the cluster. Their certificate signing requests (CSRs) must first be approved.

```
# List the CSRs. Any with status Pending are worker/compute nodes attempting to join the cluster. They must be approved.
oc get csr

# Approve all pending node CSRs (one-liner)
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
```
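
As an alternative to the one-liner, the sketch below approves pending CSRs one at a time and logs each name. Both function names are hypothetical, and the filter assumes that `CONDITION` is the last column of `oc get csr --no-headers` output, which may differ between `oc` versions.

```shell
# Hypothetical helper: read `oc get csr --no-headers` output on stdin and
# print the names of CSRs whose last column (CONDITION) is "Pending".
pending_csr_names() {
  awk '$NF == "Pending" { print $1 }'
}

# Hypothetical wrapper: approve each pending CSR, logging its name first.
approve_pending_csrs() {
  oc get csr --no-headers | pending_csr_names | while read -r name; do
    echo "approving ${name}"
    oc adm certificate approve "${name}"
  done
}
```

Nodes typically submit a client CSR first and a serving CSR once the first is approved, so it may be necessary to check `oc get csr` again after the initial approval. `oc get nodes` then shows whether the new nodes have joined.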

10. Finally, run the playbook to update the haproxy config so that it monitors the new nodes.

```
ansible-playbook playbooks/role-haproxy.yml --tags="config"
```

For more information about adding new worker/compute nodes to a user-provisioned infrastructure (UPI) based OCP4 cluster, see the detailed steps at [1] and [2].

### Resources

- [1] [How to add Openshift 4 RHCOS worker nodes in UPI <24 hours](https://access.redhat.com/solutions/4246261)
- [2] [How to add Openshift 4 RHCOS worker nodes to UPI >24 hours](https://access.redhat.com/solutions/4799921)
