From 51b1c22d025bf40e9ef488bb0faf0c8dff303ccd Mon Sep 17 00:00:00 2001
From: Rob Crittenden <rcritten@redhat.com>
Date: Thu, 8 Dec 2022 16:18:07 -0500
Subject: [PATCH] doc: Design for certificate pruning
This describes how the certificate pruning capability of PKI
introduced in v11.3.0 will be integrated into IPA, primarily for
ACME.
Related: https://pagure.io/freeipa/issue/9294
Signed-off-by: Rob Crittenden <rcritten@redhat.com>
Reviewed-By: Florence Blanc-Renaud <frenaud@redhat.com>
---
 doc/designs/expired_certificate_pruning.md | 297 +++++++++++++++++++++
 doc/designs/index.rst                      |   1 +
 2 files changed, 298 insertions(+)
 create mode 100644 doc/designs/expired_certificate_pruning.md
diff --git a/doc/designs/expired_certificate_pruning.md b/doc/designs/expired_certificate_pruning.md
new file mode 100644
index 0000000000000000000000000000000000000000..2c10d914020d3c12b6abb028323cd6796ec33e00
--- /dev/null
+++ b/doc/designs/expired_certificate_pruning.md
@@ -0,0 +1,297 @@
+# Expired Certificate Pruning
+
+## Overview
+
+https://pagure.io/dogtagpki/issue/1750
+
+When using short-lived certs and regular issuance, the expired certs can build up in the PKI database and cause issues with replication, performance and overall database size.
+
+PKI has provided a new feature in 11.3.0, pruning, which is a job that can be executed on a schedule or manually to remove expired certificates and requests.
+
+Random Serial Numbers v3 (RSNv3) is mandatory to enable pruning.
+
+Both pruning and RSNv3 require PKI 11.3.0 or higher.
+
+## Use Cases
+
+ACME certificates in particular are generally short-lived, and expired certificates can build up quickly in a dynamic environment. An example is a CI system that requests one or more certificates per run. These will build up indefinitely without a way to remove the expired certificates.
+
+Another case is simply a very long-lived installation. Over time as hosts come and go certificates build up.
+
+## How to Use
+
+https://github.com/dogtagpki/pki/wiki/Configuring-CA-Database-Pruning provides a thorough description of the capabilities of the pruning job.
+
+The default configuration is to remove expired certificates and incomplete requests after 30 days.
+
+Pruning is disabled by default.
+
+Configuration is a four-step process:
+
+1. Configure the expiration thresholds
+2. Enable the job
+3. Schedule the job
+4. Restart the CA
+
+The job will be scheduled to use the PKI built-in cron-like timer. It is configured nearly identically to `crontab(5)`. On execution it will remove certificates and requests that fall outside the configured thresholds. LDAP search/time limits can be used to control how many are removed at once.
+
+In addition to the automated schedule it is possible to manually run the pruning job.
+
+The tool will not restart the CA. Restarting is left to the user, who will be notified when a restart is needed.
+
+### Where to use
+
+The pruning configuration is not replicated. It should not be necessary to enable this task on all IPA servers, or even on more than one.
+
+Running the task simultaneously on multiple servers has a few downsides:
+
+* Additional stress on the LDAP server searching for expired certificates and requests
+* Unnecessary replication load deleting the same entries on multiple servers
+
+While enabling this on a single server represents a single point of failure, there should be no catastrophic consequences other than expired certificates and requests potentially building up. This backlog can be cleared by enabling pruning on a different server. Depending on the size of the backlog this could take a couple of executions to catch up.
+
+## Design
+
+There are several operations, most of which act locally and one of which uses the PKI REST API.
+
+1. Updating the job configuration (enable, thresholds, etc.). This will be done by running the `pki-server ca-config-set` command, which modifies `CS.cfg` directly per the PKI wiki. A restart is required.
+
+2. Retrieving the current configuration for display. The `pki-server ca-config-find` command returns the entire configuration so the results will need to be filtered.
+
+3. Managing the job. This can be done using the REST API, https://github.com/dogtagpki/pki/wiki/PKI-REST-API . Operations include enabling the job and triggering it to run now.
+
+Theoretically for operations 1 and 2 we could use existing code to manually update `CS.cfg` and retrieve values. For future-proofing purposes calling `pki-server` is probably the better long-term option given the limited number of times this will be used. Configuration is likely to be one and done.
+
+Four values can be managed for pruning certificates, and the same four for pruning requests:
+
+* expired cert/incomplete request time
+* time unit
+* LDAP search size limit
+* LDAP search time limit
+
+The first two configure when an expired certificate or incomplete request will be deleted. The unit can be one of: minute, hour, day, year. By default it is 30 days.
+
+The LDAP limits control how many entries are returned and how long the search can take. By default it is 1000 entries and unlimited time.
+
+### Configuration settings
+
+The configuration values will be set by running `pki-server ca-config-set`. This will ensure best forward compatibility. The options are case-sensitive and are neither validated nor applied by the CA until it is restarted.
+
+### Configuring job execution time
+
+The CA provides a cron-like interface for scheduling jobs. To configure the job to run at midnight on the first of every month the PKI equivalent command-line is:
+
+```
+pki-server ca-config-set jobsScheduler.job.pruning.cron "0 0 1 * *"
+```
+
+This will be the default when pruning is enabled. A separate configuration option will be available for fine-tuning execution time.
+
+The format is defined at https://access.redhat.com/documentation/en-us/red_hat_certificate_system/9/html/administration_guide/setting_up_specific_jobs#Frequency_Settings_for_Automated_Jobs
+
+### REST Authentication and Authorization
+
+The REST API for pruning is documented at https://github.com/dogtagpki/pki/wiki/PKI-Start-Job-REST-API
+
+A PKI job can define an owner that can manage the job over the REST API. We will automatically define the owner as `ipara` when pruning is enabled.
+
+Manually running the job will be done using the PKI REST API. Authentication to this API for our purposes is done at the `/ca/rest/account/login` endpoint. A cookie is returned which will be used in any subsequent calls. The IPA RA agent certificate will be used for authentication and authorization.
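+
+The login-then-call flow described above can be sketched with only the Python standard library. This is an illustration, not the IPA implementation: the function names are hypothetical, port 8443 is assumed to be the CA's HTTPS port, and the job start path `/ca/rest/jobs/pruning/start` is an assumption that should be checked against the PKI wiki page linked above.
+
+```python
+import http.client
+import ssl
+
+LOGIN_PATH = "/ca/rest/account/login"           # endpoint named in the text
+START_JOB_PATH = "/ca/rest/jobs/pruning/start"  # assumed path, verify first
+
+
+def session_cookie(set_cookie_header):
+    """Keep only the name=value pair from a Set-Cookie header."""
+    return set_cookie_header.split(";", 1)[0].strip()
+
+
+def run_pruning_job(host, cert_file, key_file, ca_file, port=8443):
+    """Log in with the RA agent certificate, then ask the CA to run the job."""
+    context = ssl.create_default_context(cafile=ca_file)
+    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
+    conn = http.client.HTTPSConnection(host, port, context=context)
+    try:
+        conn.request("GET", LOGIN_PATH, headers={"Accept": "application/json"})
+        response = conn.getresponse()
+        response.read()  # drain the body so the connection can be reused
+        if response.status != 200:
+            raise RuntimeError("login failed: HTTP %d" % response.status)
+        cookie = session_cookie(response.getheader("Set-Cookie", ""))
+
+        # The session cookie from the login is sent with the follow-up call.
+        headers = {"Cookie": cookie, "Accept": "application/json"}
+        conn.request("POST", START_JOB_PATH, headers=headers)
+        response = conn.getresponse()
+        response.read()
+        return response.status
+    finally:
+        conn.close()
+```
+
+Calling `run_pruning_job()` needs a running CA and readable PEM files for the RA agent certificate and key, so no sample output is shown.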
+
+### Commands
+
+This will be implemented in the `ipa-acme-manage` command. While pruning is not strictly ACME-related, ACME is the primary driver for it.
+
+A new verb will be added, pruning, to be used for enabling and configuring pruning.
+
+### Enabling pruning
+
+`# ipa-acme-manage pruning --enable=TRUE`
+
+Enabling the job will call
+
+`# pki-server ca-config-set jobsScheduler.job.pruning.enabled true`
+
+This will also set jobsScheduler.job.pruning.cron to `"0 0 1 * *"` if it has not already been set.
+
+Additionally it will set the job owner to `ipara` with:
+
+`# pki-server ca-config-set jobsScheduler.job.pruning.owner ipara`
+
+Disabling the job will call
+
+`# pki-server ca-config-unset jobsScheduler.job.pruning.enabled`
+
+### Cron settings
+
+To modify the cron settings:
+
+`# ipa-acme-manage pruning --cron="Minute Hour Day_of_month Month_of_year Day_of_week"`
+
+Validation of the value will be:
+
+* each of the options is an integer or `*` (the default schedule uses `*`)
+* minute is within 0-59
+* hour is within 0-23
+* day of month is within 1-31
+* month of year is within 1-12
+* day of week is within 0-6
+
+No validation of setting February 31st will be done. That will be left to PKI. Buyer beware.
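+
+A minimal sketch of these checks in Python. `validate_cron()` is a hypothetical name, not the IPA implementation. Two assumptions are made: `*` is accepted in any field, since the default schedule `0 0 1 * *` relies on it, and the day of month starts at 1, since cron has no day 0. As stated above, impossible dates are not rejected.
+
+```python
+# (name, lowest, highest) for each of the five fields, in order.
+FIELDS = [
+    ("minute", 0, 59),
+    ("hour", 0, 23),
+    ("day of month", 1, 31),  # assumption: cron has no day 0
+    ("month of year", 1, 12),
+    ("day of week", 0, 6),
+]
+
+
+def validate_cron(expr):
+    """Raise ValueError unless expr holds five in-range fields."""
+    parts = expr.split()
+    if len(parts) != len(FIELDS):
+        raise ValueError("expected 5 fields, got %d" % len(parts))
+    for value, (name, low, high) in zip(parts, FIELDS):
+        if value == "*":
+            continue  # assumption: the wildcard is always allowed
+        if not value.isdigit():
+            raise ValueError("%s must be an integer: %r" % (name, value))
+        if not low <= int(value) <= high:
+            raise ValueError("%s must be within %d-%d" % (name, low, high))
+
+
+validate_cron("0 0 1 * *")   # the default schedule passes
+validate_cron("0 0 31 2 *")  # February 31st is not caught here
+```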
+
+### Disabling pruning
+
+`# ipa-acme-manage pruning --enable=FALSE`
+
+This will also remove the configuration option for `jobsScheduler.job.pruning.cron` just to be sure the job no longer runs.
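+
+The enable and disable steps from the sections above can be sketched as plain argument lists in Python. `enable_commands()` and `disable_commands()` are hypothetical helper names; they only build the `pki-server` invocations and do not run anything. The keys and values are the ones given in the text.
+
+```python
+PREFIX = "jobsScheduler.job.pruning."
+DEFAULT_CRON = "0 0 1 * *"
+
+
+def enable_commands(current_cron=None):
+    """Return the pki-server invocations for enabling pruning."""
+    commands = [["pki-server", "ca-config-set", PREFIX + "enabled", "true"]]
+    if not current_cron:
+        # The default schedule is applied only when none is configured.
+        commands.append(
+            ["pki-server", "ca-config-set", PREFIX + "cron", DEFAULT_CRON]
+        )
+    commands.append(["pki-server", "ca-config-set", PREFIX + "owner", "ipara"])
+    return commands
+
+
+def disable_commands():
+    """Return the pki-server invocations for disabling pruning."""
+    return [
+        ["pki-server", "ca-config-unset", PREFIX + "enabled"],
+        ["pki-server", "ca-config-unset", PREFIX + "cron"],
+    ]
+```
+
+Each list could be passed to `subprocess.run()` by a caller running as root. The CA still needs a restart afterwards for the change to take effect.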
+
+### Configuration
+
+#### Pruning certificates
+
+`# ipa-acme-manage pruning --certretention=VALUE --certretentionunit=UNIT`
+
+will be the equivalent of (shown with the default values):
+
+`# pki-server ca-config-set jobsScheduler.job.pruning.certRetentionTime 30`
+
+`# pki-server ca-config-set jobsScheduler.job.pruning.certRetentionUnit day`
+
+The unit will always be required when modifying the time.
+
+`# ipa-acme-manage pruning --certsearchsizelimit=VALUE --certsearchtimelimit=VALUE`
+
+will be the equivalent of (shown with the default values):
+
+`# pki-server ca-config-set jobsScheduler.job.pruning.certSearchSizeLimit 1000`
+
+`# pki-server ca-config-set jobsScheduler.job.pruning.certSearchTimeLimit 0`
+
+A value of 0 for the search time limit means unlimited.
+
+#### Pruning requests
+
+`# ipa-acme-manage pruning --requestretention=VALUE --requestretentionunit=UNIT`
+
+will be the equivalent of (shown with the default values):
+
+`# pki-server ca-config-set jobsScheduler.job.pruning.requestRetentionTime 30`
+
+`# pki-server ca-config-set jobsScheduler.job.pruning.requestRetentionUnit day`
+
+The unit will always be required when modifying the time.
+
+`# ipa-acme-manage pruning --requestsearchsizelimit=VALUE --requestsearchtimelimit=VALUE`
+
+will be the equivalent of (shown with the default values):
+
+`# pki-server ca-config-set jobsScheduler.job.pruning.requestSearchSizeLimit 1000`
+
+`# pki-server ca-config-set jobsScheduler.job.pruning.requestSearchTimeLimit 0`
+
+A value of 0 for the search time limit means unlimited.
+
+These options set the client-side limits. The server imposes its own search size and look-through limits. These can be tuned for the `uid=pkidbuser,ou=people,o=ipaca` user; see https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/ldapsearch-ex-complex-range
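+
+The mapping from `ipa-acme-manage` options to CA configuration keys in this section can be written down as a small table. The sketch below, in Python, uses a hypothetical `config_commands()` helper that enforces the rule that a retention time must come with its unit and returns the `pki-server ca-config-set` argument lists without running them.
+
+```python
+PREFIX = "jobsScheduler.job.pruning."
+
+# ipa-acme-manage option name -> CS.cfg key suffix, as listed above.
+OPTION_TO_KEY = {
+    "certretention": "certRetentionTime",
+    "certretentionunit": "certRetentionUnit",
+    "certsearchsizelimit": "certSearchSizeLimit",
+    "certsearchtimelimit": "certSearchTimeLimit",
+    "requestretention": "requestRetentionTime",
+    "requestretentionunit": "requestRetentionUnit",
+    "requestsearchsizelimit": "requestSearchSizeLimit",
+    "requestsearchtimelimit": "requestSearchTimeLimit",
+}
+UNITS = ("minute", "hour", "day", "year")
+
+
+def config_commands(options):
+    """Translate {option: value} into pki-server ca-config-set argv lists."""
+    for kind in ("cert", "request"):
+        time_opt, unit_opt = kind + "retention", kind + "retentionunit"
+        if time_opt in options and unit_opt not in options:
+            raise ValueError("--%s requires --%s" % (time_opt, unit_opt))
+        if unit_opt in options and options[unit_opt] not in UNITS:
+            raise ValueError("unknown unit: %r" % (options[unit_opt],))
+    return [
+        ["pki-server", "ca-config-set", PREFIX + OPTION_TO_KEY[name], str(value)]
+        for name, value in options.items()
+    ]
+
+
+for argv in config_commands({"certretention": 60, "certretentionunit": "day"}):
+    print(" ".join(argv))
+```
+
+Keeping the mapping in one table means the exact, case-sensitive key names appear only once, which matters because the CA does not validate them until restart.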
+
+### Showing the Configuration
+
+To display the current configuration run `pki-server ca-config-find` and filter the results to only those that contain `jobsScheduler.job.pruning`.
+
+Default values are not included in that output, so `ipa-acme-manage` will need to supply them before displaying the configuration.
+
+Output may look something like:
+
+```console
+# ipa-acme-manage pruning --config-show
+Enabled: TRUE
+Certificate retention time: 30 days
+Certificate search size limit: 1000
+Certificate search time limit: 0
+Request retention time: 30 days
+Request search size limit: 1000
+Request search time limit: 0
+Cron: 0 0 1 * *
+```
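+
+A minimal Python sketch of the filtering step, assuming `pki-server ca-config-find` prints one `key=value` pair per line (an assumption about its output format). `pruning_config()` is a hypothetical helper; the fallback values are the defaults stated earlier in this document.
+
+```python
+PREFIX = "jobsScheduler.job.pruning."
+
+# Defaults stated in this document; the CA output omits unset keys.
+DEFAULTS = {
+    "enabled": "false",
+    "certRetentionTime": "30",
+    "certRetentionUnit": "day",
+    "certSearchSizeLimit": "1000",
+    "certSearchTimeLimit": "0",
+    "requestRetentionTime": "30",
+    "requestRetentionUnit": "day",
+    "requestSearchSizeLimit": "1000",
+    "requestSearchTimeLimit": "0",
+}
+
+
+def pruning_config(config_find_output):
+    """Return the pruning settings found in ca-config-find output."""
+    settings = dict(DEFAULTS)
+    for line in config_find_output.splitlines():
+        key, sep, value = line.partition("=")
+        if sep and key.startswith(PREFIX):
+            # Strip the common prefix so only the short key name remains.
+            settings[key[len(PREFIX):]] = value.strip()
+    return settings
+
+
+sample = (
+    "ca.someOtherSetting=ignored\n"
+    "jobsScheduler.job.pruning.enabled=true\n"
+    "jobsScheduler.job.pruning.cron=0 0 1 * *\n"
+)
+print(pruning_config(sample)["cron"])  # 0 0 1 * *
+```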
+
+## Implementation
+
+For online REST operations (login, run job) we will use the `ipaserver/plugins/dogtag.py::RestClient` class to manage the requests. This will take care of the authentication cookie, etc.
+
+The class uses `dogtag.https_request()`, which can take PEM cert and key files as arguments. These will be used for authentication.
+
+For the non-REST operations (configuration, cron settings) the tool will fork out to `pki-server ca-config-set`.
+
+### UI
+
+This will only be configurable on the command-line.
+
+### CLI
+
+Overview of the CLI commands. Example:
+
+
+| Command | Options |
+| --- | ----- |
+| ipa-acme-manage pruning | --enable=TRUE |
+| ipa-acme-manage pruning | --enable=FALSE |
+| ipa-acme-manage pruning | --cron=`"0 0 1 * *"` |
+| ipa-acme-manage pruning | --certretention=30 --certretentionunit=day |
+| ipa-acme-manage pruning | --certsearchsizelimit=1000 --certsearchtimelimit=0 |
+| ipa-acme-manage pruning | --requestretention=30 --requestretentionunit=day |
+| ipa-acme-manage pruning | --requestsearchsizelimit=1000 --requestsearchtimelimit=0 |
+| ipa-acme-manage pruning | --config-show |
+
+ipa-acme-manage can only be run as root.
+
+### Configuration
+
+Configuration changes will be made to `/etc/pki/pki-tomcat/ca/CS.cfg`.
+
+## Upgrade
+
+No expected impact on upgrades.
+
+## Test plan
+
+Testing will consist of:
+
+* using the default configuration
+* enabling the pruning job
+* issuing one or more certificates
+* moving time forward to one day after expiration
+* manually running the job
+* validating that the certificates are removed
+
+For size/time limit testing, create a large number of certificates/requests and set the search limit to a low value, then ensure that the number of deleted certs is equal to the search limit. Testing the time limit in this way may be less predictable, as a non-busy server may need a massive number of entries before a search times out.
+
+## Troubleshooting and debugging
+
+The PKI debug log will contain job information.
+
+```
+2022-12-08 21:14:25 [https-jsse-nio-8443-exec-8] INFO: JobService: Starting job pruning
+2022-12-08 21:14:25 [https-jsse-nio-8443-exec-8] INFO: JobService: - principal: null
+2022-12-08 21:14:51 [https-jsse-nio-8443-exec-10] INFO: JobService: Starting job pruning
+2022-12-08 21:14:51 [https-jsse-nio-8443-exec-10] INFO: JobService: - principal: null
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: PKIRealm: Authenticating certificate chain:
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: PKIRealm: - CN=IPA RA,O=EXAMPLE.TEST
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: PKIRealm: - CN=Certificate Authority,O=EXAMPLE.TEST
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: LDAPSession: Retrieving cn=19072098145751813471503860299601579276,ou=certificateRepository, ou=ca,o=ipaca
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: CertUserDBAuthentication: UID ipara authenticated.
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: PKIRealm: User ipara authenticated
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: UGSubsystem: Retrieving user uid=ipara,ou=People,o=ipaca
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: PKIRealm: User DN: uid=ipara,ou=people,o=ipaca
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: PKIRealm: Roles:
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: PKIRealm: - Certificate Manager Agents
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: PKIRealm: - Registration Manager Agents
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: PKIRealm: - Security Domain Administrators
+2022-12-08 21:15:11 [https-jsse-nio-8443-exec-11] INFO: PKIRealm: - Enterprise ACME Administrators
+2022-12-08 21:15:24 [https-jsse-nio-8443-exec-12] INFO: JobService: Starting job pruning
+2022-12-08 21:15:24 [https-jsse-nio-8443-exec-12] INFO: JobService: - principal: GenericPrincipal[ipara(Certificate Manager Agents,Enterprise ACME Administrators,Registration Manager Agents,Security Domain Administrators,)]
+2022-12-08 21:15:24 [https-jsse-nio-8443-exec-12] INFO: JobsScheduler: Starting job pruning
+2022-12-08 21:15:24 [pruning] INFO: PruningJob: Running pruning job at Thu Dec 08 21:15:24 UTC 2022
+2022-12-08 21:15:24 [pruning] INFO: PruningJob: Pruning certs expired before Tue Nov 08 21:15:24 UTC 2022
+2022-12-08 21:15:24 [pruning] INFO: PruningJob: - filter: (&(x509Cert.notAfter<=1667942124527)(!(x509Cert.notAfter=1667942124527)))
+2022-12-08 21:15:24 [pruning] INFO: LDAPSession: Searching ou=certificateRepository, ou=ca,o=ipaca for (&(notAfter<=20221108211524Z)(!(notAfter=20221108211524Z)))
+2022-12-08 21:15:24 [pruning] INFO: PruningJob: Pruning incomplete requests last modified before Tue Nov 08 21:15:24 UTC 2022
+2022-12-08 21:15:24 [pruning] INFO: PruningJob: - filter: (&(!(requestState=complete))(requestModifyTime<=1667942124527)(!(requestModifyTime=1667942124527)))
+2022-12-08 21:15:24 [pruning] INFO: LDAPSession: Searching ou=ca, ou=requests,o=ipaca for (&(!(requestState=complete))(dateOfModify<=20221108211524Z)(!(dateOfModify=20221108211524Z)))
+```
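+
+The cutoff in the log above (job run on Dec 08, certificates pruned if expired before Nov 08) follows from subtracting the 30-day retention from the run time. A small Python sketch reproduces the values in the LDAP filters. The helper names are hypothetical, and treating a year as 365 days is an assumption, since this document does not say how PKI interprets that unit.
+
+```python
+from datetime import datetime, timedelta, timezone
+
+# Retention units accepted by the job; "year" as 365 days is an assumption.
+UNIT_TO_DELTA = {
+    "minute": timedelta(minutes=1),
+    "hour": timedelta(hours=1),
+    "day": timedelta(days=1),
+    "year": timedelta(days=365),
+}
+
+
+def prune_cutoff(now, retention_time, retention_unit):
+    """Return the instant before which expired entries are pruned."""
+    return now - retention_time * UNIT_TO_DELTA[retention_unit]
+
+
+def ldap_time(moment):
+    """Format a UTC datetime the way the LDAP filter in the log does."""
+    return moment.strftime("%Y%m%d%H%M%SZ")
+
+
+run_at = datetime(2022, 12, 8, 21, 15, 24, tzinfo=timezone.utc)
+cutoff = prune_cutoff(run_at, 30, "day")
+print(ldap_time(cutoff))        # 20221108211524Z
+print(int(cutoff.timestamp()))  # 1667942124
+```
+
+The second value is in seconds. The `x509Cert.notAfter` filter in the log carries the same instant in milliseconds.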
diff --git a/doc/designs/index.rst b/doc/designs/index.rst
index 570e526fe35d510feeac62a44dd59224289e0506..1d41c0f84f0d7d3d5f184a47e31b4e71a890805d 100644
--- a/doc/designs/index.rst
+++ b/doc/designs/index.rst
@@ -14,6 +14,7 @@ FreeIPA design documentation
    hsm.md
    krb-ticket-policy.md
    extdom-plugin-protocol.md
+   expired_certificate_pruning.md
    expiring-password-notification.md
    ldap_grace_period.md
    ldap_pam_passthrough.md
-- 
2.39.1