Git source control

We use several git hosting solutions for CentOS, depending on the need:

Pagure: self-hosted on https://git.centos.org
GitHub: for historical reasons, we have an organization there for some automation scripts, such as all our Ansible roles
GitLab: Red Hat recently decided to start using GitLab for Stream 9 sources and beyond

This page focuses only on the first one, which the infra team has to manage and maintain. It explains what it is used for, and which specific permissions/delegations exist for Special Interest Groups (SIGs).

Git.centos.org

The first thing to know is that it is all deployed and managed by the Ansible pagure role.

Because of existing experience within the team, we decided to use MySQL instead of PostgreSQL as the database, and also to reuse our existing roles for these other parts.

Initial purposes

It is mainly used for:

CentOS-specific projects (such as the website), all in the /centos/ namespace
RPM package sources from RHEL, pushed by Red Hat and then built by the CentOS team, all landing in the /rpms/ namespace

Authentication

Our pagure instance is tied to our existing authentication service, so one first needs an account there to interact with the pagure instance (except, of course, for read-only operations such as cloning a repository).

When a user is added to a SIG group and logs in again, their membership is reflected on the pagure/git.centos.org side.

Their SSH public key is also imported into their account (as is usual for a git forge).

Protected branches and ACLs

By default, nobody (except a specific privileged Red Hat account) can push to the master branch of any project under the /rpms/ namespace, nor to any other protected branch such as c7, c8, c8s and so on (matched by regex). All these protected branches represent what Red Hat pushes, and should reflect the upstream RHEL sources.

Apart from protected branches, members of SIGs can push to some 'sub' branches without any extra setup (the logic is checked automatically by pagure-dist-git).

Example: a member of sig-cloud can automatically push to the c8-sig-cloud-<whatever_if_I_want_to> branch of any rpm in the /rpms/ namespace, but never to the main c8 branch. The same logic applies when swapping the distro release and the SIG group name.
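The branch rule above can be sketched as a small shell check. This is only an illustration: the real decision is made server-side by pagure-dist-git, and the helper names (is_distro_branch, sig_can_push) as well as the distro-branch patterns are assumptions inferred from the examples on this page, not the deployed regex. Whether a suffix after the SIG name is mandatory is also an assumption (it is treated as optional here).

```shell
# Illustrative sketch only; pagure-dist-git performs the real check.

# is_distro_branch NAME: succeeds for names such as c7, c8, c8s
# (assumed pattern, not the deployed regex)
is_distro_branch() {
  case "$1" in
    c[0-9]|c[0-9]s|c[0-9][0-9]|c[0-9][0-9]s) return 0 ;;
    *) return 1 ;;
  esac
}

# sig_can_push SIG_GROUP BRANCH: succeeds when BRANCH has the form
# <distro branch>-<SIG group>[-<suffix>], e.g. c8-sig-cloud-foo
sig_can_push() {
  sig=$1
  branch=$2
  release=${branch%%-*}   # text before the first "-", e.g. c8
  rest=${branch#*-}       # text after the first "-", e.g. sig-cloud-foo
  # No "-" at all means a plain branch such as c8 or master: never allowed
  [ "$release" != "$branch" ] || return 1
  is_distro_branch "$release" || return 1
  case "$rest" in
    "$sig"|"$sig"-?*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With these helpers, sig_can_push sig-cloud c8-sig-cloud-foo succeeds, while sig_can_push sig-cloud c8 and sig_can_push sig-cloud c8-sig-hyperscale-foo both fail.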

Lookaside cache upload

People can also push to the lookaside cache the tarballs/archives needed to rebuild/compose a src.rpm package before it is submitted to the build system (to build and release rpm packages).

Same logic as above: a specific privileged Red Hat account can push all needed tarballs/archives to the lookaside cache, in all directories.

A SIG member can push to the specific branches that correspond to the logic described above for git: in our previous example, that means pushing to c8-sig-cloud-<whatever_if_I_want_to>.

Creating projects in RPMS namespace

Normally Red Hat automatically creates projects in the /rpms/ namespace (through pagure API calls), and SIGs can then push to these projects (see above).

But sometimes SIGs want new projects under the /rpms/ namespace that do not exist (yet) on git.centos.org.

To create a new project, you need to call the pagure API through either a simple interactive script or a batch script. You also need to create a config file containing the pagure API token for the centosrcm user (you can always retrieve that API key with pagure-admin admin-token list on the pagure host).
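As a rough idea of what such a script has to do, here is a minimal sketch built on pagure's project-creation endpoint (POST /api/0/new) with token authentication. It is not the infra team's script: create_project, PAGURE_URL, PAGURE_TOKEN and DRY_RUN are hypothetical names, and the real scripts read the token from the config file mentioned above, whose format is not shown here.

```shell
# Hypothetical helper, not one of the infra scripts.
# PAGURE_URL / PAGURE_TOKEN / DRY_RUN are assumed variable names.
PAGURE_URL="${PAGURE_URL:-https://git.centos.org}"

# create_project NAME DESCRIPTION NAMESPACE
# With DRY_RUN=1 the request is only described, not sent.
create_project() {
  name=$1
  description=$2
  namespace=$3
  url="${PAGURE_URL}/api/0/new"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf 'POST %s name=%s namespace=%s\n' "$url" "$name" "$namespace"
    return 0
  fi
  # The token must belong to a user allowed to create projects
  curl --silent --show-error --fail --request POST \
    --header "Authorization: token ${PAGURE_TOKEN:?set PAGURE_TOKEN first}" \
    --data-urlencode "name=${name}" \
    --data-urlencode "description=${description}" \
    --data-urlencode "namespace=${namespace}" \
    "$url"
}
```

Running it with DRY_RUN=1 first, for example DRY_RUN=1 create_project kmod-foo "the kmod-foo repo" rpms (kmod-foo being a made-up project name), shows the request that would be sent without touching the server.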

Assuming we are given a list of projects, we can save it to a simple file (one project name per line) and create the projects like this:

while read -r project ; do pagure-batch-create-repo "$project" "the $project repo" rpms prod ; done < kmod-list

Modifying ACLs on projects

While the automatic ACLs coming from pagure-dist-git allow or deny SIG groups pushing to some branches by default, pagure itself also needs to know which groups can review and merge Pull Requests. For that we have to grant the 'commit' right on specific projects (SIGs will still not be allowed to merge to protected branches). To do that, one can use either the simple interactive script or the batch script, with the same config file as described above.

If we have to do that for a list of projects, we can for example use the pagure-batch-acl-mod script:

fas_group="sig-hyperscale"
namespace="rpms"
# echo only prints each command for review; remove it to actually run the script
while read -r project ; do echo pagure-batch-acl-mod "$namespace" "$project" "$fas_group" prod ; done < pagure-acl.list
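For reference, granting 'commit' to a group maps onto pagure's modifyacls endpoint (POST /api/0/<namespace>/<repo>/git/modifyacls with user_type, name and acl fields). The sketch below shows that call under stated assumptions: grant_commit, PAGURE_URL, PAGURE_TOKEN and DRY_RUN are hypothetical names, and whether pagure-batch-acl-mod is implemented exactly this way is not confirmed by this page.

```shell
# Hypothetical helper, not one of the infra scripts.
# PAGURE_URL / PAGURE_TOKEN / DRY_RUN are assumed variable names.
PAGURE_URL="${PAGURE_URL:-https://git.centos.org}"

# grant_commit NAMESPACE PROJECT GROUP
# Grants the 'commit' ACL to GROUP on NAMESPACE/PROJECT.
# With DRY_RUN=1 the request is only described, not sent.
grant_commit() {
  namespace=$1
  project=$2
  group=$3
  url="${PAGURE_URL}/api/0/${namespace}/${project}/git/modifyacls"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf 'POST %s user_type=group name=%s acl=commit\n' "$url" "$group"
    return 0
  fi
  curl --silent --show-error --fail --request POST \
    --header "Authorization: token ${PAGURE_TOKEN:?set PAGURE_TOKEN first}" \
    --data-urlencode "user_type=group" \
    --data-urlencode "name=${group}" \
    --data-urlencode "acl=commit" \
    "$url"
}
```

As with the batch loop, a dry run such as DRY_RUN=1 grant_commit rpms kmod-foo sig-hyperscale (kmod-foo being a made-up project name) is a cheap way to review the request before sending it.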

Infra notes

We host one public instance (git.centos.org), but we also have a secondary one (temporarily removed from Ansible for the pagure role) that is kept in sync for the git repositories and lookaside content (but not for the MySQL database). Both instances are declared in the pagure-src-nodes Ansible group, and so inherit the variables declared at that level.

It is worth knowing that this is a "poor man's" replication: it uses neither shared storage nor block device replication, but simply lsyncd (packaged as an rpm in Fedora/EPEL/CentOS). Once there is a push/write/modification on git.centos.org, it automatically triggers a one-way rsync replication to the second instance. See host_vars in the Ansible inventory for how lsyncd is configured, and for how rsyncd is configured on the second instance to restrict which machine can push to which rsyncd module/target.

These two instances are virtual machines running RHEL 8, on top of different hypervisors (see the kvm_host variable in the inventory), so that an issue on one physical node would not take both instances down.

Apart from replication, there is the usual backup plan, which saves a mysqldump plus the whole lookaside cache and git repositories on the central backup pool node.

In case of failover, one has to:

restore the last mysqldump
replay the pagure role (pagure-src-nodes group membership)
if needed, fix the permissions on directories
update the A records for git.centos.org
