# Data Retention
Retention periods, purge schedules, and batch sizes for all data categories.
## Retention schedule
The following table lists all data categories, their retention periods, and whether automated purging is currently configured.
| Data category | Table / model | Retention period | Auto-purge |
|---|---|---|---|
| Events (analytics) | Event | 90 days | Yes |
| Marketing events | MarketingEvent | 730 days (2 years) | Yes |
| Marketing consents | MarketingConsent | 730 days (2 years) | Yes |
| Data subject requests | DataSubjectRequest | 1,095 days (3 years) | Yes |
| Sessions | Session | 14 days | Yes |
| CSRF tokens | CsrfToken | 2 days | Yes |
| Rate limit buckets | RateLimitBucket | 7 days | Yes |
| Login locks | LoginLock | 7 days | Yes |
| Webhook deliveries | WebhookDelivery | No purge configured | No |
| Reconciliation runs | ReconciliationRun | No purge configured | No |
| Recurring charges | RecurringCharge | No purge configured | No |
## Purge cron schedule

The data retention purge job runs as a Vercel cron function. The schedule and configuration are defined in `vercel.json`.
| Property | Value |
|---|---|
| Schedule | Daily (once per day) |
| Endpoint | /api/cron/purge |
| Runtime | Vercel serverless function |
| Timeout | Constrained by Vercel function execution limits |
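The properties above correspond to a cron entry in `vercel.json`. A minimal sketch is shown below; the schedule expression `0 3 * * *` (daily at 03:00 UTC) is an assumed example, so check the repository's actual file for the real value:

```json
{
  "crons": [
    { "path": "/api/cron/purge", "schedule": "0 3 * * *" }
  ]
}
```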
## Batch sizes
The purge job deletes records in batches to avoid long-running database transactions that could lock tables or exhaust connection pools. Each table is purged in its own batch operation.
| Data category | Batch size | Notes |
|---|---|---|
| Events | 1,000 rows per batch | High volume table. May need multiple runs if backlog is large. |
| Marketing events | 500 rows per batch | Lower volume. Single batch usually sufficient. |
| Marketing consents | 500 rows per batch | Very low volume. Rarely exceeds a single batch. |
| Data subject requests | 100 rows per batch | Low volume. 3-year retention means deletions are rare. |
| Sessions | 500 rows per batch | Moderate volume. 14-day retention keeps table small. |
| CSRF tokens | 1,000 rows per batch | High volume (one per form render). 2-day retention means large daily deletes. |
| Rate limit buckets | 1,000 rows per batch | Can spike during abuse. Monitor row count weekly. |
| Login locks | 200 rows per batch | Low volume unless there is a credential-stuffing attack. |
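One way to keep the retention periods and batch sizes in sync is a single config map the purge job iterates over. The shape below is only a sketch mirroring the tables in this document; the field names and the choice of timestamp column per table are assumptions, not the actual implementation:

```typescript
// Hypothetical retention config mirroring the tables above.
// Keys are model names; timestampColumn is the column compared
// against the cutoff date when selecting expired rows.
interface RetentionPolicy {
  retentionDays: number;
  batchSize: number;
  timestampColumn: "created_at" | "expires_at";
}

const RETENTION_POLICIES: Record<string, RetentionPolicy> = {
  Event:              { retentionDays: 90,   batchSize: 1000, timestampColumn: "created_at" },
  MarketingEvent:     { retentionDays: 730,  batchSize: 500,  timestampColumn: "created_at" },
  MarketingConsent:   { retentionDays: 730,  batchSize: 500,  timestampColumn: "created_at" },
  DataSubjectRequest: { retentionDays: 1095, batchSize: 100,  timestampColumn: "created_at" },
  Session:            { retentionDays: 14,   batchSize: 500,  timestampColumn: "expires_at" },
  CsrfToken:          { retentionDays: 2,    batchSize: 1000, timestampColumn: "expires_at" },
  RateLimitBucket:    { retentionDays: 7,    batchSize: 1000, timestampColumn: "expires_at" },
  LoginLock:          { retentionDays: 7,    batchSize: 200,  timestampColumn: "expires_at" },
};
```

A map like this keeps the retention policy in one place, so adding a purge for a new table is a one-line change rather than a new code path.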
## How purging works
For each data category, the purge job executes the following sequence:
- Query the table for records where the relevant timestamp column (e.g., `created_at`, `expires_at`) is older than the retention period.
- Select up to `batch_size` rows matching the retention criteria.
- Delete the selected rows in a single `DELETE` statement.
- If the number of deleted rows equals the batch size, there may be more rows to purge; the cron will catch them on the next daily run.
## Monitoring
Routinely check the following to ensure data retention is working correctly:
| Check | Frequency | What to look for |
|---|---|---|
| Purge cron execution | Daily | Verify the cron ran successfully in the Vercel function logs. Look for errors or timeouts. |
| Table row counts | Weekly | Check row counts for Event, CsrfToken, and RateLimitBucket tables. These are the highest-volume tables and most likely to grow unexpectedly. |
| Unpurged table growth | Monthly | Check row counts for WebhookDelivery, ReconciliationRun, and RecurringCharge. These have no automated purge. Plan manual cleanup or implement purge logic if growth is concerning. |
| Database storage usage | Monthly | Monitor total database storage in the provider dashboard. Data retention issues will manifest as unexpected storage growth. |
## Manual purge
If the automated purge falls behind (e.g., after an outage or if a table has accumulated a large backlog), you can run a manual purge:
- Trigger the purge endpoint manually: `POST /api/cron/purge` with the appropriate authorization header.
- Check the response for the count of deleted rows per table.
- If the response indicates the batch limit was hit for any table, run the endpoint again. Repeat until all tables report zero remaining expired rows.
- For very large backlogs (100,000+ rows), consider running a direct database query with a larger batch size to avoid hitting the function timeout. Note that PostgreSQL does not support `LIMIT` directly on `DELETE`, so select the batch in a subquery (using `ctid` avoids assuming a particular primary key column):

```sql
DELETE FROM "Event"
WHERE ctid IN (
  SELECT ctid FROM "Event"
  WHERE "created_at" < NOW() - INTERVAL '90 days'
  LIMIT 10000
);
```
## Future work
The following items are planned but not yet implemented:
- WebhookDelivery purge: Retain successful deliveries for 90 days and failed deliveries for 180 days.
- ReconciliationRun purge: Retain runs for 365 days. Flagged discrepancies should be retained indefinitely.
- RecurringCharge purge: Retain charges for 365 days after the plan is terminated.
- Purge metrics: Emit metrics for rows deleted per table per run, enabling dashboard monitoring and alerting on purge failures.