# Git source control 

We use several git hosting solutions for CentOS, depending on the need:

 * [Pagure](https://pagure.io/pagure): self-hosted at [https://git.centos.org](https://git.centos.org)
 * GitHub: we have an [organization](https://github.com/centos) there, for historical reasons, hosting some automation scripts such as all the [ansible roles](https://github.com/centos?q=ansible-&type=&language=&sort=)
 * GitLab: Red Hat recently decided to start using GitLab for [Stream 9 sources](https://gitlab.com/redhat/centos-stream) and beyond

This page focuses only on the first one, which the infra team has to manage and maintain. It explains what the instance is used for, and which specific permissions/delegations are in place for Special Interest Groups (SIGs).

# Git.centos.org

The first thing to know is that the whole instance is deployed and managed by Ansible, through the [pagure role](https://github.com/CentOS/ansible-role-pagure).

Because of existing experience within the team, we decided to use a MySQL database instead of PostgreSQL, and also to reuse existing roles for those other parts.

## Initial purposes

It is mainly used for:

 * CentOS-specific projects (such as the website), all in the `/centos/` namespace
 * RPM package sources from RHEL, pushed by Red Hat and then built by the CentOS team, all landing in the `/rpms/` namespace

## Authentication

Our pagure instance is tied to our existing [Authentication service](../infra/authentication.md), so one first needs an account there to interact with the pagure instance (except, of course, for read-only operations such as cloning a repository).

When a user is added to a SIG group and logs in again, their membership is reflected on the pagure/git.centos.org side.

Their SSH public key is also imported into their account (as is normal for a git forge).

## Protected branches and ACLs

By default, *nobody* (except a specific privileged Red Hat account) can push to the `master` branch of *any* project under the `/rpms/` namespace, nor to any other protected branch such as `c7`, `c8`, `c8s` and so on (matched by regex).
All these protected branches contain what Red Hat pushes, and are meant to represent the upstream RHEL sources.

Apart from protected branches, members of SIGs can push *automatically* to some 'sub' branches (the logic is checked automatically by [pagure-dist-git](https://pagure.io/pagure-dist-git)).

Example: a member of the `sig-cloud` group can automatically push to the `c8-sig-cloud-<whatever_if_I_want_to>` branch of any rpm in the `/rpms/` namespace, but *never* to the main `c8` branch. The same logic applies when swapping the distro release and the SIG group name.

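
The branch rule described above can be sketched as a small shell function. This is only an illustration: the real checks are done server-side by pagure-dist-git, and both regular expressions below (the protected-branch pattern and the `<release>-<group>[-<suffix>]` shape of SIG branches) are assumptions derived from the examples in this section, not the actual configuration. `branch_access` is a hypothetical helper name.

```shell
# Illustration only: the real enforcement happens in pagure-dist-git.

# Assumed pattern for protected branches (master, c7, c8, c8s, ...).
protected_re='^(master|c[0-9]+s?)$'

# branch_access GROUP BRANCH
# Prints "protected", "allowed" or "denied" for a member of GROUP.
branch_access() {
    group="$1"
    branch="$2"
    # Assumed SIG branch shape: <release>-<group>[-<suffix>]
    sig_re="^c[0-9]+s?-${group}(-.+)?\$"

    if printf '%s\n' "$branch" | grep -Eq "$protected_re"; then
        echo "protected"
    elif printf '%s\n' "$branch" | grep -Eq "$sig_re"; then
        echo "allowed"
    else
        echo "denied"
    fi
}

branch_access sig-cloud c8-sig-cloud-openstack   # prints: allowed
branch_access sig-cloud c8                       # prints: protected
branch_access sig-cloud c8-sig-storage-gluster   # prints: denied
```

The protected pattern is tested first, so a protected branch is never reported as pushable, whatever the group name is.
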
### Lookaside cache upload

People can also push to the [`lookaside cache`](https://git.centos.org/sources) the tarballs/archives needed to [rebuild/compose a src.rpm](https://wiki.centos.org/Sources) package before it is submitted to the build system (to build and release rpm packages).

The same logic as above applies: a specific privileged Red Hat account can push all needed tarballs/archives to every directory of the lookaside cache.

A SIG member can push to the specific branches that correspond to the logic described above for git: in our previous example, that means pushing to `c8-sig-cloud-<whatever_if_I_want_to>`.

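
To illustrate how the branch name shows up on the lookaside side, the sketch below builds the download URL of one archive. It assumes a `<package>/<branch>/<checksum>` directory layout under `https://git.centos.org/sources`, which should be verified against the instance before relying on it. `lookaside_url` is a hypothetical helper name, and the package, branch and checksum values are placeholders.

```shell
# Hypothetical helper: prints the assumed lookaside URL of one archive.
# Usage: lookaside_url PACKAGE BRANCH CHECKSUM
lookaside_url() {
    # Assumed layout: <base>/<package>/<branch>/<checksum>
    base="https://git.centos.org/sources"
    printf '%s/%s/%s/%s\n' "$base" "$1" "$2" "$3"
}

# Placeholder values: replace them with a real package, branch and checksum.
lookaside_url mypackage c8-sig-cloud-example 0123456789abcdef
```
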
## Creating projects in RPMS namespace

Normally Red Hat automatically creates (through pagure API calls) projects in the `/rpms/` namespace, and SIGs can then push to these projects (see above).

Sometimes, though, SIGs want new projects under the `/rpms/` namespace that don't exist (yet) on git.centos.org.

To create a new project, you need to call the pagure API through a [simple and interactive](https://github.com/CentOS/infra-scripts/blob/master/pagure/pagure-create-repo) or [batch](https://github.com/CentOS/infra-scripts/blob/master/pagure/pagure-batch-create-repo) script. You also need to create a config file containing the pagure API token for the `centosrcm` user (you can always retrieve that API key with `pagure-admin admin-token list` on the pagure host).

Assuming that we are given a [list](https://pagure.io/centos-infra/issue/393), we can put the project names in a simple file, one per line, and create the projects like this:

```
# kmod-list contains one project name per line
while read -r project; do
  pagure-batch-create-repo "$project" "the $project repo" rpms prod
done < kmod-list
```

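
When the list is copied from a ticket, it may contain blank lines, comments or duplicates. Below is a minimal sketch for cleaning it up before running the loop above. `clean_project_list` is a hypothetical helper (it is not part of infra-scripts), the `#` comment convention is an assumption, and the project names are made up. The loop only prints the commands, so nothing is created.

```shell
# Hypothetical helper: prints unique project names from a list file,
# skipping blank lines and lines starting with '#'.
clean_project_list() {
    # strip surrounding whitespace, drop empty/comment lines, de-duplicate
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$1" \
        | grep -Ev '^(#|$)' \
        | sort -u
}

# Self-contained demo with a temporary list (made-up project names)
list=$(mktemp)
printf '%s\n' 'kmod-foo' '' '# from the ticket' 'kmod-bar' 'kmod-foo' > "$list"

# Dry run: print the commands instead of running them
clean_project_list "$list" | while read -r project; do
    echo pagure-batch-create-repo "$project" "the $project repo" rpms prod
done

rm -f "$list"
```

Once the printed commands look right, dropping the `echo` would run them for real.
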
## Modifying ACLs on projects 

While the automatic ACLs coming from pagure-dist-git allow or deny SIG groups pushing to some branches by default, pagure itself also needs to know which groups can review Pull Requests and merge them. For that we have to grant the 'commit' right on specific projects (SIGs are still prevented from merging into protected branches).
To do that, one can use the [simple and interactive](https://github.com/CentOS/infra-scripts/blob/master/pagure/pagure-acl-mod) or the [batch](https://github.com/CentOS/infra-scripts/blob/master/pagure/pagure-batch-acl-mod) script, still using the same config file as described above.

If we have to do that for a list of projects, we can for example use the `pagure-batch-acl-mod` script:

```
fas_group="sig-hyperscale"
namespace="rpms"
# pagure-acl.list contains one project name per line.
# As written, the loop only prints the commands; remove "echo" to run them.
while read -r project; do
  echo pagure-batch-acl-mod "$namespace" "$project" "$fas_group" prod
done < pagure-acl.list
```

## Infra notes

We host one public instance (git.centos.org), but we also have a secondary one (temporarily removed from ansible for the pagure role) that is kept in sync for the git repositories and lookaside content (but not for the MySQL database).
Both instances are declared in the `pagure-src-nodes` ansible group, and so inherit the variables declared at that level.

It is worth knowing that this is a "poor man's" replication: it doesn't use any shared storage or block device replication but simply [lsyncd](https://github.com/lsyncd/lsyncd) (packaged as rpm in fedora/epel/centos), so that once there is a push/write/modification on git.centos.org, it automatically triggers an rsync replication, *one way*, to the second instance. See `host_vars` in the ansible inventory for how lsyncd is configured, and for how rsyncd is configured on the second instance to restrict which machine can push to which rsyncd module/target.

These two instances are virtual machines running RHEL 8, on top of different hypervisors (see the `kvm_host` variable in the inventory), so that an issue on one physical node can't take both instances down.

Apart from replication, there is the obvious backup plan that saves a mysqldump and the whole lookaside and git repositories on the central backup pool node.

In case of failover, one has to:

 * restore the last mysqldump
 * replay the pagure role (pagure-src-nodes group membership)
 * if needed, fix the permissions on directories
 * update the A records for git.centos.org