From 68716fc7a03d56eb0287eb359fa2fd852b2788d7 Mon Sep 17 00:00:00 2001
From: Fabian Arrotin
Date: Jun 28 2021 14:21:54 +0000
Subject: Some updated doc for ocp powerdown/up for CI, based on ocp 4.7.x


Signed-off-by: Fabian Arrotin

---

diff --git a/docs/operations/ci/cordoning_nodes_and_draining_pods.md b/docs/operations/ci/cordoning_nodes_and_draining_pods.md
index e0ab9a2..911402d 100644
--- a/docs/operations/ci/cordoning_nodes_and_draining_pods.md
+++ b/docs/operations/ci/cordoning_nodes_and_draining_pods.md
@@ -31,7 +31,7 @@ Note: It might not switch to `NotReady` immediately, there maybe many pods still
 This will drain node ``, delete any local data, and ignore daemonsets, and give a period of 60 seconds for pods to drain gracefully.
 
 ```
-oc adm drain --delete-local-data=true --ignore-daemonsets=true --grace-period=60
+oc adm drain --delete-emptydir-data=true --ignore-daemonsets=true --grace-period=15
 ```
 
 4. Perform the scheduled maintenance on the node
diff --git a/docs/operations/ci/graceful_shutdown_ocp_cluster.md b/docs/operations/ci/graceful_shutdown_ocp_cluster.md
index dbd4fde..c411bbe 100644
--- a/docs/operations/ci/graceful_shutdown_ocp_cluster.md
+++ b/docs/operations/ci/graceful_shutdown_ocp_cluster.md
@@ -20,7 +20,7 @@ nodes=$(oc get nodes -o name | sed -E "s/node\///")
 2. Shutdown the nodes from the administration box associated with the cluster eg prod/staging.
 
 ```
-for node in ${nodes[@]}; do ssh -i core@$node sudo shutdown -h now; done
+for node in ${nodes[@]}; do ssh -i ~/ocp_backups/.backup.key core@$node sudo shutdown -h now; done
 ```
 
 
diff --git a/docs/operations/ci/graceful_startup_ocp_cluster.md b/docs/operations/ci/graceful_startup_ocp_cluster.md
index 1e5711f..12ddb24 100644
--- a/docs/operations/ci/graceful_startup_ocp_cluster.md
+++ b/docs/operations/ci/graceful_startup_ocp_cluster.md
@@ -9,12 +9,20 @@ This SOP should be followed in the following scenarios:
 
 Prequisite steps:
 
-1. Start the physical nodes
+### Start the physical nodes :
 
-- Production uses `adhoc-openshift-nfs-stats.yaml` playbook to stop/start/restart nodes
-- Staging uses seamicro accessible from admin machine, user manual contained in centosci/ocp4-docs/sops/seamicro
+ - Production uses `adhoc-ipmi-poweron.yml` playbook to stop/start/restart nodes :
 
-2. Once the nodes have been started they must be uncordoned if appropriate
+```
+ansible-playbook playbooks/adhoc-ipmi-poweron.yml -l ocp-ci
+```
+ - Staging uses `adhoc-seamicro-poweron.yml` playbook to start the powered off seamicro compute nodes
+
+```
+ansible-playbook playbooks/adhoc-seamicro-poweron.yml # This will prompt for nodes, just use `ocp-ci` as group
+```
+
+### Once the nodes have been started they must be uncordoned if appropriate
 
 ```
 oc get nodes
@@ -65,7 +73,7 @@ kempty-n9.ci.centos.org Ready worker 106d v1.18.3+6c42de8
 ```
 
 
-### Resources
+## Resources
 
 - [1] [Graceful Cluster Startup](https://docs.openshift.com/container-platform/4.5/backup_and_restore/graceful-cluster-restart.html)
 - [2] [Cluster disaster recovery](https://docs.openshift.com/container-platform/4.5/backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.html#dr-restoring-cluster-state)