# Testing

The CentOS Project provides resources that each SIG can use to run CI jobs and tests for its projects.
Onboarding onto the CentOS CI infra platform happens on request; the process will be documented here soon.

![CI Infra overview](img/duffy-aws.drawio.png)

!!! warning
    This is the future plan as we're currently moving infra from local DC to AWS. 

We offer the following resources:

  * Openshift-hosted Jenkins (one per project/SIG), using the usual FAS/ACO authentication
  * ephemeral bare-metal and/or virtual machine nodes on which you can run tests (including destructive ones), and which are automatically reinstalled (for bare-metal) or discarded (for VMs), aka `Duffy`


## Openshift

We provide access to an [Openshift](https://console-openshift-console.apps.ocp.ci.centos.org/) cluster, currently hosted next to the Duffy ephemeral nodes infra.
Once your [Infra ticket/request](https://pagure.io/centos-infra/new_issue?template=ci-migration) is validated, you'll be granted access (through your ACO/FAS account) to a namespace on that cluster.

### Authentication

The Openshift cluster is tied to accounts.centos.org, so you log in through SSO (no additional credentials needed).

### Interacting with Openshift

You can use the [web console](https://console-openshift-console.apps.ocp.ci.centos.org/) to interact with deployed pods, read Openshift logs, and open a terminal on running pods if needed.

It's also possible to [download](https://console-openshift-console.apps.ocp.ci.centos.org/command-line-tools) the `oc` CLI tool and interact with Openshift from the command line. For that to work, first log in through the web console; you'll then find a `copy login command` entry under your user name (top right corner), which takes you to the [oauth token](https://oauth-openshift.apps.ocp.ci.centos.org/oauth/token/display) display.
Copy that command into your own terminal (make sure your `oc` binary is up to date!) and start interacting with Openshift.


### Jenkins
We provide a Jenkins template that can be provisioned for you automatically. It uses the Jenkins image from the Openshift catalog, modified with some extra parameters: the deployed Jenkins can launch ephemeral pods as Jenkins executors (nothing should run on the Jenkins pod itself).

For that, configure your job to use the `cico-workspace` label, which automatically triggers a pod deployment in Openshift from a template (cico-workspace).
That environment pod uses a CentOS 8-stream based container that automatically connects to Jenkins as an "agent" and contains the following tools:

  * git
  * ansible(-core)
  * python-cicoclient (for legacy Duffy, see below)
  * duffy[client] pip package (to interact with newer Duffy, see below) 

That pod mounts the ssh private key used for your project under /duffy-ssh-key/ssh-privatekey (see [ssh_config](https://github.com/centosci/images/blob/master/cico-workspace/ssh_config#L2)) and also has the `CICO_API_KEY` env variable set, so that it can request Duffy v2 nodes.

From that point, it's up to you to :

 * write a function/script that will request a Duffy node (see below)
 * ssh into the machine[s] to run some tests (usually pulled from git repository somewhere)
 * return ephemeral nodes to duffy
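
The ssh step in the list above could look like the following minimal sketch. It only builds and runs an `ssh` command line using the key path mentioned earlier and the `root` user; the hostname, the test script name and the decision to skip host key checking are assumptions made for illustration, not part of the CentOS CI setup.

```python
import shlex
import subprocess

# Path where the cico-workspace pod mounts the project key (taken from the text above).
SSH_KEY = "/duffy-ssh-key/ssh-privatekey"


def ssh_command(hostname, remote_command, key_path=SSH_KEY):
    """Build the argv list that runs one command as root on a Duffy node."""
    return [
        "ssh",
        "-i", key_path,
        # Assumption: ephemeral nodes have throwaway host keys, so host key
        # checking is disabled here. Do not reuse this for long-lived hosts.
        "-o", "StrictHostKeyChecking=no",
        "-o", "UserKnownHostsFile=/dev/null",
        f"root@{hostname}",
        remote_command,
    ]


def run_tests(hostname, remote_command):
    """Run the command on the node and return its exit status."""
    return subprocess.run(ssh_command(hostname, remote_command)).returncode


if __name__ == "__main__":
    # Hypothetical hostname and test script, printed rather than executed.
    print(shlex.join(ssh_command("node.example.org", "bash run-tests.sh")))
```

Returning the exit status lets the Jenkins job fail when the remote tests fail, while still leaving room to return the nodes afterwards.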

!!! note
    As a test job, to "play" with the concept, you can configure a simple "Freestyle project" that just runs "/bin/bash" (so that the Jenkins job keeps running in the background), and then open the pod terminal from the Openshift console to explore it and try things.


## Duffy (ephemeral bare-metal/Virtual Machines provider)

Duffy is the middle layer running ci.centos.org that manages the provisioning, maintenance and teardown / rebuild of the Nodes (physical hardware and VMs) that are used to run the tests in the CI Cluster.

We provide both bare-metal and VMs and support the following architectures :

  * x86_64 (both physical and VMs)
  * aarch64 (both physical and VMs)
  * ppc64le (only VMs on Power 8 or Power 9 but supporting nested virtualization)

The EC2 instances are also provisioned with a second, unconfigured EBS volume that you can initialize the way you want (initially requested by Ceph for their own testing).

To request ephemeral nodes to run your tests, you'll need both your `project` name and the `api key` that will be generated for you once your project has been allowed on that infra.

!!! note
    Quotas are in place to ensure that you can't request an unlimited number of nodes in parallel, and each `session` has a maximum lifetime of 6h. After that, your nodes are automatically reclaimed by Duffy, freshly reinstalled and put back in the Ready pool.


### Installing and configuring duffy client
Use `pip` (or `pip3.8` on el8) to install the duffy client. It is already installed in the `cico-workspace` pod template in Openshift, so this step is not needed there.

```shell
pip install --user "duffy[client]"
```

The duffy client needs the tenant's name and API key to request sessions from the duffy server. If the tenant doesn't exist yet, it must first be created on the duffy server. Once you have both, create the file `.config/duffy` with the following content.

```
client:
  url: https://duffy.ci.centos.org/api/v1
  auth:
    name: <tenant name>
    key: <API key>
```

One can also call `duffy client` without any existing config file (if you want to):

```
duffy client --url https://duffy.ci.centos.org/api/v1 --auth-name <tenant_name> --auth-key $CICO_API_KEY <command>
```

!!! danger
    Never leak your API key. If you use this command from within Jenkins, remember that Jenkins runs shell steps with `set -x` by default and so echoes every command. Don't forget to use `set +x` before the duffy call.
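
When the duffy call is made from a script rather than a shell step, the same precaution applies to anything the script logs. The sketch below is one possible approach, not an official helper: it reads the key from `CICO_API_KEY`, builds the command shown above, and masks the key before printing. The tenant name and the function names are hypothetical.

```python
import os

DUFFY_URL = "https://duffy.ci.centos.org/api/v1"


def duffy_command(tenant_name, *args, url=DUFFY_URL):
    """Build the duffy client argv, reading the API key from the environment."""
    api_key = os.environ["CICO_API_KEY"]  # raises KeyError if the key is not set
    return ["duffy", "client", "--url", url,
            "--auth-name", tenant_name, "--auth-key", api_key, *args]


def redacted(argv):
    """Return a copy of argv that is safe to log: the value after --auth-key is masked."""
    safe = list(argv)
    for index, value in enumerate(argv[:-1]):
        if value == "--auth-key":
            safe[index + 1] = "********"
    return safe


if __name__ == "__main__":
    # Placeholder key and hypothetical tenant name, for the demo only.
    os.environ.setdefault("CICO_API_KEY", "not-a-real-key")
    command = duffy_command("my-tenant", "list-pools")
    print(" ".join(redacted(command)))
```

Only the redacted list should ever reach a log; the real list goes to the process runner.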

### Requesting a session

Before creating a session, you need the name of a pool. List the available pools with the following command.

```shell
duffy client list-pools
```

!!! note
    The name of the pool is structured like this: 

    `<AAA>-<BBB>-<CCC>-<DDD>-<EEE>-<FFF>`

    - AAA: Identify if it is a bare metal or virtual machine
    - BBB: The kind of the instance, like seamicro bare-metal, AWS EC2, etc
    - CCC: The machine flavor type
    - DDD: Operating System (CentOS|Fedora)
    - EEE: OS version
    - FFF: architecture (x86_64|aarch64|ppc64le)
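
To make the naming scheme concrete, here is a small sketch that splits a pool name into the six fields described above. It assumes that no individual field contains a hyphen, which holds for the `virt-ec2-t2-centos-9s-x86_64` pool used in this guide but is not guaranteed for every pool. The class and field names are illustrative only.

```python
from typing import NamedTuple


class PoolName(NamedTuple):
    node_type: str   # AAA: bare metal or virtual machine
    provider: str    # BBB: kind of instance
    flavor: str      # CCC: machine flavor type
    os_name: str     # DDD: operating system
    os_version: str  # EEE: OS version
    arch: str        # FFF: architecture


def parse_pool_name(name):
    """Split a pool name on hyphens into its six documented fields."""
    parts = name.split("-")
    if len(parts) != 6:
        raise ValueError(f"expected 6 hyphen-separated fields, got {len(parts)}: {name!r}")
    return PoolName(*parts)


if __name__ == "__main__":
    pool = parse_pool_name("virt-ec2-t2-centos-9s-x86_64")
    print(pool.node_type, pool.os_name, pool.os_version, pool.arch)
```

This kind of parsing can help a job pick a pool by architecture or OS version from the `list-pools` output instead of hardcoding the full name.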


Once you have the name of the pool, you can request the sessions you need. Duffy limits the number of sessions per tenant; this information is available on the duffy server.

You can also see current pool usage, the number of machines in `ready` state, etc., by querying a specific pool. Example:

```
duffy client show-pool virt-ec2-t2-centos-9s-x86_64
{
  "action": "get",
  "pool": {
    "name": "virt-ec2-t2-centos-9s-x86_64",
    "fill_level": 5,
    "levels": {
      "provisioning": 0,
      "ready": 5,
      "contextualizing": 0,
      "deployed": 0,
      "deprovisioning": 0
    }
  }
}

```
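
A job can use that output to check whether enough nodes are ready before requesting them. The following sketch parses the JSON shown above (only the JSON, without the command line) and compares the `ready` level with the wanted quantity. The function names are hypothetical, and the check is only advisory since the pool can change between the two calls.

```python
import json

# The show-pool output from the example above, as the client would print it.
SAMPLE = """
{
  "action": "get",
  "pool": {
    "name": "virt-ec2-t2-centos-9s-x86_64",
    "fill_level": 5,
    "levels": {
      "provisioning": 0,
      "ready": 5,
      "contextualizing": 0,
      "deployed": 0,
      "deprovisioning": 0
    }
  }
}
"""


def ready_nodes(show_pool_output):
    """Return the number of nodes currently in the ready state."""
    payload = json.loads(show_pool_output)
    return payload["pool"]["levels"]["ready"]


def can_request(show_pool_output, quantity):
    """Tell whether the pool currently holds at least `quantity` ready nodes."""
    return ready_nodes(show_pool_output) >= quantity


if __name__ == "__main__":
    print(ready_nodes(SAMPLE), can_request(SAMPLE, 2))
```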

To then request some nodes from a pool one can use the following duffy call : 

```shell
duffy client request-session pool=<name of the pool>,quantity=<number of machines to get>
``` 

By default this command outputs _json_, but it's possible to change the format to _yaml_ or _flat_ using `--format`. The hostname of each provisioned node is found under the "nodes" key. Log in to it as the `root` user, using `ssh`.

```json
{
<...output omitted...>

"nodes": [
    {
        "hostname": "<hostname>",
        "ipaddr": "<ip address>",

<...output omitted...>
}
```
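
To pick the hostnames out of that output in a script, something like the sketch below can be used. The excerpt above elides where the `nodes` list sits in the document, so the code makes an assumption: it looks for `nodes` inside a top-level `session` object and falls back to the top level otherwise. The sample data and hostnames are made up for illustration.

```python
import json

# Made-up sample; the "session" wrapper is an assumption about the elided output.
SAMPLE = """
{
  "session": {
    "nodes": [
      {"hostname": "node1.example.org", "ipaddr": "192.0.2.10"},
      {"hostname": "node2.example.org", "ipaddr": "192.0.2.11"}
    ]
  }
}
"""


def node_hostnames(request_session_output):
    """Return the hostnames of all nodes listed in the request-session output."""
    payload = json.loads(request_session_output)
    # Assumption: nodes live under "session"; otherwise look at the top level.
    container = payload.get("session", payload)
    return [node["hostname"] for node in container.get("nodes", [])]


if __name__ == "__main__":
    for hostname in node_hostnames(SAMPLE):
        print(f"root@{hostname}")
```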

### Retiring a session

At the end of the test, you should "return" your ephemeral nodes to Duffy API service. This will trigger either a reinstall of the physical node (through kickstart) or just discarding/terminating it (if that's a cloud instance)

To retire a session, you need its session id. List your sessions to find it.

```shell
duffy client list-sessions
```

When you are ready to retire the session, run the following command.

```shell
duffy client retire-session <session id>
```
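
Because nodes should be returned even when the tests fail, it can help to wrap the test run so that `retire-session` is always called. The sketch below shows one way to do that with a context manager; it is an illustration under assumptions, not part of the duffy client. The `run` parameter and the session id value are hypothetical, and the demo uses a fake runner so that nothing is actually executed.

```python
import subprocess
from contextlib import contextmanager


def retire_session(session_id, run=subprocess.run):
    """Call `duffy client retire-session <session id>` and fail loudly on error."""
    run(["duffy", "client", "retire-session", str(session_id)], check=True)


@contextmanager
def duffy_session(session_id, run=subprocess.run):
    """Yield the session id and retire the session on exit, even after an exception."""
    try:
        yield session_id
    finally:
        retire_session(session_id, run=run)


if __name__ == "__main__":
    def fake_run(argv, check=False):
        # Stand-in for subprocess.run so the demo does not need a duffy server.
        print("would run:", " ".join(argv))

    with duffy_session("42", run=fake_run) as session_id:
        print("running tests in session", session_id)
```

In a real job, the default `subprocess.run` would be used and the session id would come from the `request-session` output.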

## Artifacts storage

There is an artifacts storage box that you can use to store ephemeral artifacts (logs, builds, etc). It's publicly available at [https://artifacts.ci.centos.org](https://artifacts.ci.centos.org).

How can you push to that storage box?
Each tenant has a dedicated directory (owned by them) under /srv/artifacts.
You can use your project name as user and your project ssh keypair to push through ssh (rsync, scp) to ssh://`tenant_name`@artifacts.ci.centos.org:/srv/artifacts/`tenant_name`

While you can push through ssh, no shell is allowed for you on that storage box, so use scp or rsync directly from the Jenkins pod that has your private key.
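
As an illustration, the push could be scripted as in the sketch below, which only builds an `rsync` command line for the destination described above. It assumes that the project key mounted in the Jenkins pod at /duffy-ssh-key/ssh-privatekey is the keypair registered for the storage box; the tenant name, the local directory and the function name are hypothetical.

```python
import shlex

ARTIFACTS_HOST = "artifacts.ci.centos.org"
# Assumption: the project key mounted in the cico-workspace pod is the one to use.
SSH_KEY = "/duffy-ssh-key/ssh-privatekey"


def rsync_command(tenant_name, local_dir, remote_subdir="", key_path=SSH_KEY):
    """Build an rsync argv that copies the contents of local_dir to the tenant directory."""
    destination = (
        f"{tenant_name}@{ARTIFACTS_HOST}:/srv/artifacts/{tenant_name}/{remote_subdir}"
    )
    return [
        "rsync", "-av",
        "-e", f"ssh -i {shlex.quote(key_path)}",
        # A trailing slash makes rsync copy the directory contents, not the directory itself.
        local_dir.rstrip("/") + "/",
        destination,
    ]


if __name__ == "__main__":
    # Hypothetical tenant name and directory, printed rather than executed.
    print(shlex.join(rsync_command("my-tenant", "logs", "build-123/")))
```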

!!! warning
    We'll implement some rotation to clean up used space on that machine on a regular basis, so don't expect pushed files to remain available forever!

## Migration to new CI instance

If you want to migrate your old Jenkins configuration to the new CI instance, follow this guide.

1. Log in to the [old openshift instance](https://oauth-openshift.apps.ocp.ci.centos.org/)
2. Click on your username in the upper right corner and select `Copy login command`
3. Use the [oc](https://console-openshift-console.apps.ocp.ci.centos.org/command-line-tools) tool to log in
4. Switch to the correct project with `oc project <project_name>` (you should already be on the correct project, but it's better to check)
5. Copy the old configuration to your machine with `oc rsync <pod_name>:/var/lib/jenkins/jobs <target_directory>`

   !!! note
       Usually you only want to migrate jobs, but if you need any other configuration file, just
       run `oc rsh <pod_name>` and look inside `/var/lib/jenkins` for all configuration files.
       Be aware that some of the files could contain IP addresses or hostnames that will no longer
       work in the new CI instance.

6. Log in to the [new openshift instance](https://console-openshift-console.apps.ocp.cloud.ci.centos.org/)
7. Click on your username in the upper right corner and select `Copy login command`
8. Use the [oc](https://console-openshift-console.apps.ocp.ci.centos.org/command-line-tools) tool to log in
9. Switch to the correct project with `oc project <project_name>` (you should already be on the correct project, but it's better to check)
10. Copy the configuration files from your machine to the openshift pod with `oc rsync jobs <pod_name>:/var/lib/jenkins`
11. Log in to the new Jenkins instance (the URL should be `https://jenkins-<project_name>.apps.ocp.cloud.ci.centos.org`)
12. On the `Manage Jenkins` page, click on `Reload Configuration from Disk`

Now you should have your old configuration available in the new Jenkins instance.

!!! warning
    This migration doesn't migrate any credentials from the old Jenkins instance. Those need to be
    migrated manually, because they are not stored in `/var/lib/jenkins`.