Tree - centos/centos-infra-docs

Blame docs/operations/ci/installation/install.md

		47c289	`# Steps for installing OCP 4.3 on bare metal:`
		47c289
		47c289	`Documentation: [docs](https://access.redhat.com/documentation/en-us/openshift_container_platform/4.3/html/installing_on_bare_metal/installing-on-bare-metal)`
		47c289
		47c289	`## Install:`
		47c289	`* mkdir ocp-ci-centos-org`
		47c289	`* cd ocp-ci-centos-org`
		47c289	`* For installations of OpenShift Container Platform that use user-provisioned infrastructure, you must manually generate your installation configuration file.`
		47c289	`* 1.1.7.1. For a sample config, see: [here](https://projects.engineering.redhat.com/secure/attachment/104626/install-config.yaml.bak)`
		47c289
		47c289	```
		47c289	`apiVersion: v1`
		47c289	`baseDomain: centos.org`
		47c289	`compute:`
		47c289	`- hyperthreading: Enabled`
		47c289	`  name: worker`
		47c289	`  replicas: 0`
		47c289	`controlPlane:`
		47c289	`  hyperthreading: Enabled`
		47c289	`  name: master`
		47c289	`  replicas: 3`
		47c289	`metadata:`
		47c289	`  name: ocp.ci`
		47c289	`networking:`
		47c289	`  clusterNetwork:`
		47c289	`  - cidr: 10.128.0.0/14`
		47c289	`    hostPrefix: 23`
		47c289	`  networkType: OpenShiftSDN`
		47c289	`  serviceNetwork:`
		47c289	`  - 172.30.0.0/16`
		47c289	`platform:`
		47c289	`  none: {}`
		47c289	`fips: false`
		47c289	`pullSecret: '<installation pull secret from cloud.redhat.com>'`
		47c289	`sshKey: '<ssh key for the RHCOS nodes>'`
		47c289	```
		47c289
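Because the installer consumes `install-config.yaml` when it generates the manifests, it can be worth saving a copy and checking the worker replica count before moving on. A minimal sketch, assuming a POSIX-like shell with `local` support; the function name `backup_install_config` and the `.bak` suffix are illustrative and not part of `openshift-install`:

```shell
# Sketch: back up install-config.yaml and confirm the compute replicas value is 0.
# backup_install_config is a hypothetical helper, not an installer feature.
backup_install_config() {
  local cfg="${1:-install-config.yaml}"
  [ -f "$cfg" ] || { echo "missing $cfg" >&2; return 1; }
  # Keep a copy, since 'create manifests' consumes the original file.
  cp -p "$cfg" "${cfg}.bak"
  # Only inspect the lines from 'compute:' up to 'controlPlane:', so the
  # control plane replica count is not mistaken for the worker one.
  if sed -n '/^compute:/,/^controlPlane:/p' "$cfg" \
      | grep -Eq '^[[:space:]]*replicas:[[:space:]]*0[[:space:]]*$'; then
    echo "ok: worker replicas is 0, backup at ${cfg}.bak"
  else
    echo "worker replicas is not 0 in $cfg" >&2
    return 1
  fi
}
```

Run from the install directory, `backup_install_config` with no arguments checks `./install-config.yaml`; a path can be passed as the first argument instead.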
		47c289
		47c289	`* Get the pull secret from [https://cloud.redhat.com/openshift/install/metal/user-provisioned](https://cloud.redhat.com/openshift/install/metal/user-provisioned); this requires your access.redhat.com login.`
		47c289	`* “You must set the value of the replicas parameter to 0. This parameter controls the number of workers that the cluster creates and manages for you, which are functions that the cluster does not perform when you use user-provisioned infrastructure. You must manually deploy worker machines for the cluster to use before you finish installing OpenShift Container Platform.”`
		47c289	`* 1.1.8. Once the install-config.yaml configuration has been added correctly, take a backup of this file for future installs or reference as the next step will consume it. Then run the following:`
		47c289	* `openshift-install create manifests --dir=/home/dkirwan/ocp-ci-centos-org`
		47c289
		47c289	`INFO Consuming Install Config from target directory`
		47c289	`WARNING Certificate 35183CE837878BAC77A802A8A00B6434857 from additionalTrustBundle is x509 v3 but not a certificate authority`
		47c289	`WARNING Making control-plane schedulable by setting MastersSchedulable to true for Scheduler cluster settings.`
		47c289	`* Running this command converts install-config.yaml into a number of files, e.g.:`
		47c289	```
		47c289	`~/ocp-ci-centos-org $ tree .`
		47c289	`.`
		47c289	`├── manifests`
		47c289	`│ ├── 04-openshift-machine-config-operator.yaml`
		47c289	`│ ├── cluster-config.yaml`
		47c289	`│ ├── cluster-dns-02-config.yml`
		47c289	`│ ├── cluster-infrastructure-02-config.yml`
		47c289	`│ ├── cluster-ingress-02-config.yml`
		47c289	`│ ├── cluster-network-01-crd.yml`
		47c289	`│ ├── cluster-network-02-config.yml`
		47c289	`│ ├── cluster-proxy-01-config.yaml`
		47c289	`│ ├── cluster-scheduler-02-config.yml`
		47c289	`│ ├── cvo-overrides.yaml`
		47c289	`│ ├── etcd-ca-bundle-configmap.yaml`
		47c289	`│ ├── etcd-client-secret.yaml`
		47c289	`│ ├── etcd-host-service-endpoints.yaml`
		47c289	`│ ├── etcd-host-service.yaml`
		47c289	`│ ├── etcd-metric-client-secret.yaml`
		47c289	`│ ├── etcd-metric-serving-ca-configmap.yaml`
		47c289	`│ ├── etcd-metric-signer-secret.yaml`
		47c289	`│ ├── etcd-namespace.yaml`
		47c289	`│ ├── etcd-service.yaml`
		47c289	`│ ├── etcd-serving-ca-configmap.yaml`
		47c289	`│ ├── etcd-signer-secret.yaml`
		47c289	`│ ├── kube-cloud-config.yaml`
		47c289	`│ ├── kube-system-configmap-root-ca.yaml`
		47c289	`│ ├── machine-config-server-tls-secret.yaml`
		47c289	`│ ├── openshift-config-secret-pull-secret.yaml`
		47c289	`│ └── user-ca-bundle-config.yaml`
		47c289	`└── openshift`
		47c289	`    ├── 99_kubeadmin-password-secret.yaml`
		47c289	`    ├── 99_openshift-cluster-api_master-user-data-secret.yaml`
		47c289	`    ├── 99_openshift-cluster-api_worker-user-data-secret.yaml`
		47c289	`    ├── 99_openshift-machineconfig_99-master-ssh.yaml`
		47c289	`    ├── 99_openshift-machineconfig_99-worker-ssh.yaml`
		47c289	`    └── openshift-install-manifests.yaml`
		47c289	`2 directories, 32 files`
		47c289	```
		47c289
		47c289	`* Edit manifests/cluster-scheduler-02-config.yml and set mastersSchedulable to false. This will prevent Pods from being scheduled on the master instances.`
		47c289	* `sed -i 's/mastersSchedulable: true/mastersSchedulable: false/g' manifests/cluster-scheduler-02-config.yml`
		47c289	`* Create the machineconfigs to disable DHCP on the master/worker nodes:`
		47c289
		47c289	```
		47c289	`for variant in master worker; do`
		47c289	`cat << EOF > ./99_openshift-machineconfig_99-${variant}-nm-nodhcp.yaml`
		47c289	`apiVersion: machineconfiguration.openshift.io/v1`
		47c289	`kind: MachineConfig`
		47c289	`metadata:`
		47c289	`  labels:`
		47c289	`    machineconfiguration.openshift.io/role: ${variant}`
		47c289	`  name: nm-${variant}-nodhcp`
		47c289	`spec:`
		47c289	`  config:`
		47c289	`    ignition:`
		47c289	`      config: {}`
		47c289	`      security:`
		47c289	`        tls: {}`
		47c289	`      timeouts: {}`
		47c289	`      version: 2.2.0`
		47c289	`    networkd: {}`
		47c289	`    passwd: {}`
		47c289	`    storage:`
		47c289	`      files:`
		47c289	`      - contents:`
		47c289	`          source: data:text/plain;charset=utf-8;base64,W21haW5dCm5vLWF1dG8tZGVmYXVsdD0qCg==`
		47c289	`          verification: {}`
		47c289	`        filesystem: root`
		47c289	`        mode: 0644`
		47c289	`        path: /etc/NetworkManager/conf.d/disabledhcp.conf`
		47c289	`  osImageURL: ""`
		47c289	`EOF`
		47c289	`done`
		47c289	```
		47c289
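To see what the MachineConfig above actually writes, the base64 payload can be decoded and the octal file mode converted to its decimal form. A small sketch, assuming a `printf` that accepts C-style octal constants (bash and POSIX `printf` do) and a `base64` that supports `-d` (GNU coreutils):

```shell
# Decode the data: URL payload from the MachineConfig; this is the content
# written to /etc/NetworkManager/conf.d/disabledhcp.conf.
payload='W21haW5dCm5vLWF1dG8tZGVmYXVsdD0qCg=='
decoded=$(printf '%s' "$payload" | base64 -d)
printf '%s\n' "$decoded"
# [main]
# no-auto-default=*

# Octal 0644 and decimal 420 are the same permission bits.
mode_decimal=$(printf '%d' 0644)
mode_octal=$(printf '%o' 420)
echo "octal 0644 = decimal ${mode_decimal}; decimal 420 = octal ${mode_octal}"
# octal 0644 = decimal 420; decimal 420 = octal 644
```

The decoded file sets `no-auto-default=*` in the `[main]` section, which tells NetworkManager not to create automatic default connections for any device.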
		47c289	`* NOTE: There is a gotcha here. The file mode is octal and should start with 0, e.g. 0644 (-rw-r--r--); however, it will appear as the decimal value 420 when queried later via the Kubernetes API.`
		47c289	`* Create the ignition configurations:`
		47c289	* After the command below completes, rename `worker.ign` to `compute.ign`, as later steps in the process are configured to point at `compute.ign`.
		47c289
		47c289	```
		47c289	`openshift-install create ignition-configs --dir=/home/dkirwan/ocp-ci-centos-org`
		47c289	`INFO Consuming OpenShift Install (Manifests) from target directory`
		47c289	`INFO Consuming Common Manifests from target directory`
		47c289	`INFO Consuming Master Machines from target directory`
		47c289	`INFO Consuming Worker Machines from target directory`
		47c289	`INFO Consuming Openshift Manifests from target directory`
		47c289
		47c289	`# Should have the following layout`
		47c289	`.`
		47c289	`├── auth`
		47c289	`│ ├── kubeadmin-password`
		47c289	`│ └── kubeconfig`
		47c289	`├── bootstrap.ign`
		47c289	`├── master.ign`
		47c289	`├── metadata.json`
		47c289	`└── compute.ign`
		47c289	```
		47c289
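The rename and the expected layout can be checked together. A minimal sketch, assuming it is given the directory that was passed to `--dir`; the function name `check_ignition_layout` is hypothetical:

```shell
# Sketch: rename worker.ign to compute.ign (if not done yet) and verify that
# the files from the layout above exist. Hypothetical helper, not an installer feature.
check_ignition_layout() {
  local dir="${1:-.}"
  local missing=0
  local f
  # The installer writes worker.ign; later steps expect compute.ign.
  if [ -f "$dir/worker.ign" ] && [ ! -e "$dir/compute.ign" ]; then
    mv "$dir/worker.ign" "$dir/compute.ign"
  fi
  for f in auth/kubeadmin-password auth/kubeconfig \
           bootstrap.ign master.ign metadata.json compute.ign; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  # Non-zero exit status when any expected file is absent.
  return "$missing"
}
```

For example, `check_ignition_layout /home/dkirwan/ocp-ci-centos-org` returns a non-zero status and names each missing file on stderr if the layout is incomplete.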
		47c289
		47c289	* NOTE: for production, i.e. `ocp.ci`, we must perform an extra step at this point, as the machines have 2 hard disks attached. We want to ensure that `/dev/sdb` gets its partition table wiped at bootstrapping time, so that we can later configure the Local Storage Operator to manage this disk.
		47c289	* Modify the `master.ign` and `compute.ign` ignition files with the following:
		47c289
		47c289	```
		47c289	`+ "storage":{"disks":[{"device":"/dev/sdb","wipeTable":true}]},`
		47c289	`- "storage":{},`
		47c289	```
		47c289
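The edit shown in the diff can be scripted. A minimal sketch, assuming GNU `sed` (for `-i`) and that each file contains the literal text `"storage":{}` exactly as in the diff; the function name `add_wipe_table` is hypothetical:

```shell
# Sketch: replace the empty storage section with one that wipes the partition
# table on /dev/sdb. Hypothetical helper; it relies on the literal
# '"storage":{}' text from the diff above being present in the file.
add_wipe_table() {
  local f
  for f in "$@"; do
    if ! grep -qF '"storage":{}' "$f"; then
      echo "no empty storage section in $f, leaving it unchanged" >&2
      continue
    fi
    sed -i 's|"storage":{}|"storage":{"disks":[{"device":"/dev/sdb","wipeTable":true}]}|' "$f"
  done
}
```

Usage would be `add_wipe_table master.ign compute.ign`. Files that do not contain the empty section are skipped with a warning rather than modified, so running it twice is harmless.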
		47c289
		47c289	`* 1.1.9. Creating Red Hat Enterprise Linux CoreOS (RHCOS) machines`
		47c289	`* Prerequisites:`
		47c289	`* Obtain the Ignition config files for your cluster.`
		47c289	`* Configure suitable PXE or iPXE infrastructure.`
		47c289	`* Have access to an HTTP server that you can access from your computer.`
		47c289	`* Have a load balancer, e.g. HAProxy, available.`
		47c289	`* You must download the kernel, initramfs, ISO file and the RAW disk files, e.g.:`
		47c289	`* [https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.3/latest/](https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.3/latest/)`
		47c289	`* [rhcos-4.3.8-x86_64-installer-kernel-x86_64](https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.3/latest/rhcos-4.3.8-x86_64-installer-kernel-x86_64)`
		47c289	`* [rhcos-4.3.8-x86_64-installer-initramfs.x86_64.img](https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.3/latest/rhcos-4.3.8-x86_64-installer-initramfs.x86_64.img)`
		47c289	`* [rhcos-4.3.8-x86_64-installer.x86_64.iso](https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.3/latest/rhcos-4.3.8-x86_64-installer.x86_64.iso)`
		47c289	`* [rhcos-4.3.8-x86_64-metal.x86_64.raw.gz](https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.3/latest/rhcos-4.3.8-x86_64-metal.x86_64.raw.gz)`
		47c289	`* These files should be copied over to a webserver which is accessible from the bootstrap/master/compute instances.`
		47c289	`* 1.1.9.2. “Configure the network boot infrastructure so that the machines boot from their local disks after RHCOS is installed on them.”`
		47c289	`* Existing CentOS PXE boot configuration Ansible [example](https://github.com/CentOS/ansible-infra-playbooks/blob/master/templates/pxeboot.j2)`
		47c289	`* Example RHCOS PXE boot configuration [here](https://projects.engineering.redhat.com/secure/attachment/104734/centos-ci-pxe_sampleconfig.txt)`
		47c289	* 1.1.10. Once the systems are booting and installing, you can monitor the installation with: `./openshift-install --dir=/home/dkirwan/ocp-ci-centos-org wait-for bootstrap-complete --log-level=info`
		47c289	`* Once the master nodes come up successfully, this command will exit. We can now remove the bootstrap instance, and repurpose it as a worker/compute node.`
		47c289	* Run the haproxy role, once the bootstrap node has been removed from the `ocp-ci-master-and-bootstrap-stg` ansible inventory group.
		47c289	`* Begin installing the compute/worker nodes.`
		47c289	* Once the workers are up, admit them to the cluster by approving their pending `csr` requests:
		47c289	```
		47c289	`# List the CSRs. A Pending status means a worker/compute node is attempting to join the cluster; each such request must be approved.`
		47c289	`oc get csr`
		47c289
		47c289	`# Accept all node CSRs one liner`
		47c289	`oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve`
		47c289	```
		47c289	`* 1.1.11. Logging in to the cluster. At this point the cluster is up, and we’re in configuration territory.`
		47c289
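For the CSR approval step above, note that worker nodes typically raise more than one CSR (a client request, then a serving request once the first is approved), so the approval may need repeating. A rough sketch, assuming `oc` is logged in with cluster-admin rights and that the last column of `oc get csr --no-headers` is the condition; the function names, round count, and sleep interval are illustrative:

```shell
# Print the names of CSRs whose condition (last column) is Pending.
# Reads 'oc get csr --no-headers' style text on stdin.
pending_csrs() {
  awk '$NF == "Pending" { print $1 }'
}

# Approve pending CSRs over several rounds, since new requests can appear
# after each approval. Round count and interval are arbitrary defaults.
approve_pending_csrs() {
  local rounds="${1:-5}"
  local interval="${2:-30}"
  local i names
  for i in $(seq 1 "$rounds"); do
    names=$(oc get csr --no-headers | pending_csrs)
    if [ -n "$names" ]; then
      # Word splitting on $names is intended: one CSR name per argument.
      oc adm certificate approve $names
    fi
    sleep "$interval"
  done
}
```

Calling `approve_pending_csrs 5 30` would check five times, 30 seconds apart. It approves every pending request without inspecting the requestor, so it is only suitable while the cluster is being bootstrapped.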
		47c289
		47c289	`## Manually test the RHCOS bootstrap process`
		47c289
		47c289	`Resources:`
		47c289
		47c289	`* [1] JIRA corresponding with this section: [CPE-661](https://projects.engineering.redhat.com/browse/CPE-661)`
		47c289	`* [2] [https://github.com/CentOS/ansible-infra-playbooks/pull/4](https://github.com/CentOS/ansible-infra-playbooks/pull/4)`
		47c289	`* [3] [https://scm.infra.centos.org/CentOS/ansible-inventory-ci/pulls/1](https://scm.infra.centos.org/CentOS/ansible-inventory-ci/pulls/1)`
		47c289	`* [4] [https://scm.infra.centos.org/CentOS/ansible-pkistore-ci/pulls/1](https://scm.infra.centos.org/CentOS/ansible-pkistore-ci/pulls/1)`
		47c289	`* [5] [CentOS/ansible-infra-playbooks/staging/templates/ocp_pxeboot.j2](https://raw.githubusercontent.com/CentOS/ansible-infra-playbooks/staging/templates/ocp_pxeboot.j2)`
		47c289	`* [https://www.openshift.com/blog/openshift-4-bare-metal-install-quickstart](https://www.openshift.com/blog/openshift-4-bare-metal-install-quickstart)`
		47c289	`* [6] [Create a raid enabled data volume via ignition file](https://coreos.com/ignition/docs/latest/examples.html#create-a-raid-enabled-data-volume)`
		47c289	`* [7] HAProxy config for OCP4 [https://github.com/openshift-tigerteam/guides/blob/master/ocp4/ocp4-haproxy.cfg](https://github.com/openshift-tigerteam/guides/blob/master/ocp4/ocp4-haproxy.cfg)`
		47c289
		47c289
		47c289	`Steps:`
		47c289
		47c289	* Created an SSH key pair using `ssh-keygen` and uploaded it to the ansible-pkistore-ci repository at [4].
		47c289	`* Through trial and error, we’ve produced a PXE boot configuration for one of the machines and managed to get it to boot and begin the bootstrap process via an ignition file; see [5].`
		47c289	`* The next step is to decide on the networking configuration, then configure DNS and create 2 HAProxy proxies before creating the bootstrap and master OCP nodes. Jiras created: [CPE-678](https://projects.engineering.redhat.com/browse/CPE-678), [CPE-677](https://projects.engineering.redhat.com/browse/CPE-677) and [CPE-676](https://projects.engineering.redhat.com/browse/CPE-676)`
		47c289	`* PR configuration for the HAProxy loadbalancers: [here](https://github.com/CentOS/ansible-role-haproxy/pull/2)`
		47c289	`* Configuration for DNS/bind (encrypted): [here](https://scm.infra.centos.org/CentOS/ansible-filestore-ci/src/branch/master/bind/ci.centos.org)`
