# Git source control
We use several Git hosting solutions for CentOS, depending on the need:
* [Pagure](https://pagure.io/pagure): self-hosted on [https://git.centos.org](https://git.centos.org)
* GitHub: we have an [organization](https://github.com/centos) there, kept for historical reasons, that hosts some automation scripts, such as all our [ansible roles](https://github.com/centos?q=ansible-&type=&language=&sort=)
* GitLab: Red Hat recently decided to use GitLab for [Stream 9 sources](https://gitlab.com/redhat/centos-stream) and beyond

Let's focus only on the first one, which the infra team has to manage and maintain, and explain what it's used for and which specific permissions/delegations exist for Special Interest Groups (SIGs).
# Git.centos.org
The first thing to know is that it's entirely deployed and managed by the Ansible [pagure role](https://github.com/CentOS/ansible-role-pagure).
Because the team has more experience with it, we decided to use MySQL as the database instead of PostgreSQL, and to reuse our existing roles for the other components.
## Initial purposes
It's mainly used for:
* CentOS-specific projects (website, etc.), all in the `/centos/` namespace
* RPM package sources from RHEL, pushed by Red Hat and then built by the CentOS team, all landing in the `/rpms/` namespace
## Authentication
Our pagure instance is tied to our existing [Authentication service](../infra/authentication.md), so one first needs an account there to interact with the pagure instance (except, of course, for read-only operations such as cloning a repository).
When a user is added to a SIG group and logs in again, their membership is reflected on the pagure/git.centos.org side.
Their SSH public key is also imported into their account (as usual for a git forge solution).
## Protected branches and ACLs
By default, *nobody* (except a specific privileged Red Hat account) can push to the `master` branch of *any* project under the `/rpms/` namespace, nor to any other protected branch, like `c7`, `c8`, `c8s` and so on (matched by regex).
All these protected branches represent what Red Hat is pushing, and should reflect the upstream RHEL sources.
Apart from protected branches, members of SIGs can push *automatically* (the logic is checked automatically by [pagure-dist-git](https://pagure.io/pagure-dist-git)) to some 'sub' branches.
Example: a member of `sig-cloud` can automatically push to the `c8-sig-cloud-<whatever_if_I_want_to>` branch of any RPM in the `/rpms/` namespace, but *never* to the main `c8` branch. The same logic applies to any other distro release and SIG group name.
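The rule above can be illustrated with a small shell sketch. It is not the pagure-dist-git implementation: the `can_push` helper and both regular expressions are hypothetical, written only from the examples given here (protected branches like `master`, `c7`, `c8`, `c8s`, and SIG branches starting with `c<release>-<sig-group>`). Whether a bare `c8-sig-cloud` branch without a suffix is accepted is also an assumption.

```shell
# Hypothetical illustration of the branch rule, NOT the real pagure-dist-git code.

# Assumed pattern for protected branches: master, c7, c8, c8s, ...
PROTECTED_RE='^(master|c[0-9]+s?)$'

# can_push SIG_GROUP BRANCH
# Exit status 0 if a member of SIG_GROUP would be allowed to push to BRANCH.
can_push() (
  group=$1
  branch=$2
  # Protected branches are never writable by SIG members.
  if printf '%s\n' "$branch" | grep -Eq "$PROTECTED_RE"; then
    return 1
  fi
  # Allowed form: c<release>[s]-<group>, optionally followed by -<suffix>.
  printf '%s\n' "$branch" | grep -Eq "^c[0-9]+s?-${group}(-.+)?\$"
)
```

For instance, `can_push sig-cloud c8-sig-cloud-openstack` succeeds, while `can_push sig-cloud c8` and `can_push sig-cloud c8-sig-hyperscale-foo` both fail.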
### Lookaside cache upload
People can also push to the [`lookaside cache`](https://git.centos.org/sources) the tarballs/archives needed to [rebuild/compose a src.rpm](https://wiki.centos.org/Sources) package before it is submitted to the build system (to build and release RPM packages).
The same logic as above applies: a specific privileged Red Hat account can push all needed tarballs/archives to the lookaside cache, in all directories.
A SIG member can only push to the specific branches that match the logic described above for git: in our previous example, that means pushing to `c8-sig-cloud-<whatever_if_I_want_to>`.
## Creating projects in RPMS namespace
Normally Red Hat automatically creates projects in the `/rpms/` namespace (through pagure API calls), and SIGs can then push to these projects (see above).
But sometimes SIGs want new projects under the `/rpms/` namespace that don't exist (yet) on git.centos.org.
To create a new project, you'll need to call the pagure API through a [simple and interactive](https://github.com/CentOS/infra-scripts/blob/master/pagure/pagure-create-repo) or [batch](https://github.com/CentOS/infra-scripts/blob/master/pagure/pagure-batch-create-repo) script. You'll also need to create a config file with the pagure API token for the `centosrcm` user (you can always retrieve that API key with `pagure-admin admin-token list` on the pagure host).
Assuming that we are given a [list](https://pagure.io/centos-infra/issue/393), we can save it to a simple file (one project name per line) and create the projects like this:
```
# Read one project name per line from kmod-list and create each project in the rpms namespace
while read -r project; do pagure-batch-create-repo "$project" "the $project repo" rpms prod; done < kmod-list
```
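Lists received through a ticket often contain blank lines, comments or duplicates, and some projects may already exist. Here is a minimal sketch of how the list could be cleaned and checked first. Both helpers (`clean_project_list`, `project_exists`) are hypothetical and not part of infra-scripts, and the check assumes that the pagure project information route `/api/0/<namespace>/<project>` answers with HTTP 200 for an existing project.

```shell
# Hypothetical helpers, not part of infra-scripts.

# clean_project_list: read project names on stdin, strip comments and
# surrounding whitespace, drop empty lines and duplicates (order preserved).
clean_project_list() {
  sed -e 's/#.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | awk 'NF && !seen[$0]++'
}

# project_exists NAMESPACE PROJECT
# Exit status 0 if the (assumed) pagure API route returns HTTP 200.
project_exists() (
  code=$(curl -s -o /dev/null -w '%{http_code}' \
    "https://git.centos.org/api/0/$1/$2")
  [ "$code" = "200" ]
)

# Possible usage (left commented out, as it would create projects):
# clean_project_list < kmod-list | while read -r project; do
#   project_exists rpms "$project" && continue
#   pagure-batch-create-repo "$project" "the $project repo" rpms prod
# done
```

Skipping existing projects up front keeps the batch run short and avoids depending on how the creation script reacts to a name that is already taken.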
## Modifying ACLs on projects
While the automatic ACLs coming from pagure-dist-git allow or deny SIG groups pushing to some branches by default, pagure itself also needs to know that some groups can review Pull Requests and merge them. For that, we have to grant the 'commit' right on specific projects (SIGs will still not be allowed to merge to protected branches).
To do that, one can use the [simple and interactive](https://github.com/CentOS/infra-scripts/blob/master/pagure/pagure-acl-mod) or the [batch](https://github.com/CentOS/infra-scripts/blob/master/pagure/pagure-batch-acl-mod) script, still using the same config file as described above.
If we have to do that for a list of projects, we can for example use the `pagure-batch-acl-mod` script:
```
fas_group="sig-hyperscale"
namespace="rpms"
# Dry run: this only prints the commands; drop `echo` to actually apply the ACLs
while read -r project; do echo pagure-batch-acl-mod "$namespace" "$project" "$fas_group" prod; done < pagure-acl.list
```
## Infra notes
We host one public instance (git.centos.org), but we also have a secondary one (temporarily removed from Ansible for the pagure role) that is kept in sync for the git repositories and lookaside content (not for the MySQL database though).
Both instances are declared in the `pagure-src-nodes` Ansible group, and so inherit the variables declared at that level.
It's worth knowing that this is a "poor man's" replication: it doesn't use any shared storage or block device replication but simply [lsyncd](https://github.com/lsyncd/lsyncd) (packaged as RPM in Fedora/EPEL/CentOS), so that once there is a push/write/modification on git.centos.org, it automatically triggers a *one-way* rsync replication to the second instance. See `host_vars` in the Ansible inventory for how lsyncd is configured, and for how rsyncd is configured on the second instance to restrict which machine can push to which rsyncd module/target.
These two instances are virtual machines running RHEL 8, on top of different hypervisors (see the `kvm_host` variable in the inventory), to ensure that an issue on one physical node wouldn't take both instances down.
Apart from replication, there is the obvious backup plan, which saves a mysqldump and the whole lookaside and git repositories to the central backup pool node.
In case of failover, one has to:
* restore the last mysqldump
* replay the pagure role (through `pagure-src-nodes` group membership)
* fix the permissions on directories, if needed
* update the A records for git.centos.org
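
The failover steps above can be sketched as a dry-run shell script. Every value in it is a placeholder: the dump path, database name, playbook name, standby host name, directories and owner are hypothetical and have to be taken from the real inventory and backup layout. By default it only prints the commands it would run on the standby instance.

```shell
# All values below are hypothetical placeholders; adapt them to the inventory.
DUMP=${DUMP:-/path/to/latest-pagure.sql}
DB_NAME=${DB_NAME:-pagure}
PLAYBOOK=${PLAYBOOK:-playbooks/role-pagure.yml}
STANDBY=${STANDBY:-standby.example.org}
GIT_DIRS=${GIT_DIRS:-/path/to/git /path/to/lookaside}
OWNER=${OWNER:-git:git}

# run: print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

failover() {
  # 1. Restore the last mysqldump into the database.
  run sh -c "mysql $DB_NAME < $DUMP"
  # 2. Replay the pagure role against the standby node.
  run ansible-playbook "$PLAYBOOK" --limit "$STANDBY"
  # 3. Fix ownership on the replicated directories, if needed.
  for d in $GIT_DIRS; do
    run chown -R "$OWNER" "$d"
  done
  # 4. DNS is deliberately left as a manual step.
  printf '%s\n' "Manual step: update the A records for git.centos.org"
}
```

Calling `failover` prints the plan; setting `DRY_RUN=0` first would execute it. Keeping the DNS change manual means traffic only moves once the restored instance has been checked.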