
Disaster Recovery Plan

Document ID: BCP-DRP-001 | Version: 1.0 | Effective: January 2026


1. Purpose

This plan establishes procedures for recovering Keshless systems and data after a disaster or major service disruption.

Scope: PostgreSQL database, GCP Cloud Storage, API secrets, Cloud Run services


2. Recovery Objectives

Recovery Time Objective (RTO)

| Category | RTO | Systems |
|----------|-----|---------|
| Critical | 2 hours | Transaction processing, authentication |
| High | 4 hours | Dashboard, KYC processing |
| Medium | 8 hours | Reporting, analytics |
| Low | 24 hours | Non-essential features |
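
To make the targets above checkable during an incident, the table can be encoded as data. The following is a minimal sketch and not part of the Keshless tooling; the names `RTO_HOURS` and `isRtoBreached` are hypothetical.

```typescript
// Hypothetical helper: encodes the RTO table and checks an outage against it.
type Category = "Critical" | "High" | "Medium" | "Low";

// RTO targets in hours, copied from the table above.
const RTO_HOURS: Record<Category, number> = {
  Critical: 2,
  High: 4,
  Medium: 8,
  Low: 24,
};

const MS_PER_HOUR = 60 * 60 * 1000;

// Returns true when the outage has lasted longer than the category's RTO.
function isRtoBreached(category: Category, outageStart: Date, now: Date): boolean {
  const elapsedHours = (now.getTime() - outageStart.getTime()) / MS_PER_HOUR;
  return elapsedHours > RTO_HOURS[category];
}
```

An outage that has lasted exactly the RTO is treated as still within target; only a longer outage counts as a breach.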

Recovery Point Objective (RPO)

| Data Type | RPO | Backup Schedule |
|-----------|-----|-----------------|
| PostgreSQL | 24 hours | Daily at 2:00 AM UTC |
| GCS Documents | 24 hours | Daily at 3:00 AM UTC |
| Secrets | 7 days | Weekly on Sundays at 4:00 AM UTC |
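
The RPO follows from the backup schedule: the data lost in a failure is whatever changed since the last backup slot. The sketch below computes that window for a daily job. It assumes the backup completes at its scheduled time, which is a simplification, and `lastDailyBackupBefore` and `dataLossHours` are hypothetical names.

```typescript
// Hypothetical helper: most recent daily backup slot at or before `failure`.
// `hourUtc` is the scheduled hour in UTC (2 for PostgreSQL, 3 for GCS documents).
function lastDailyBackupBefore(failure: Date, hourUtc: number): Date {
  const slot = new Date(Date.UTC(
    failure.getUTCFullYear(),
    failure.getUTCMonth(),
    failure.getUTCDate(),
    hourUtc,
  ));
  // If today's slot has not happened yet, fall back to yesterday's.
  if (slot.getTime() > failure.getTime()) {
    slot.setUTCDate(slot.getUTCDate() - 1);
  }
  return slot;
}

// Hours of changes that would be lost if the system failed at `failure`.
function dataLossHours(failure: Date, hourUtc: number): number {
  const lastBackup = lastDailyBackupBefore(failure, hourUtc);
  return (failure.getTime() - lastBackup.getTime()) / (60 * 60 * 1000);
}
```

A failure at 01:00 UTC, one hour before the PostgreSQL job, gives a 23-hour window, which is close to the worst case the 24-hour RPO allows.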

3. Backup Infrastructure

Backup Schedule

| Job | Schedule | Description |
|-----|----------|-------------|
| PostgreSQL Daily | Daily 2:00 AM | Full database export |
| PostgreSQL Monthly | 1st of month 2:00 AM | Long-term retention |
| GCS Sync | Daily 3:00 AM | Document storage sync |
| Secrets Backup | Sundays 4:00 AM | Encrypted config backup |
| Cleanup | Mondays 5:00 AM | Remove expired backups |

Retention Policy

| Backup Type | Retention |
|-------------|-----------|
| PostgreSQL Daily | 2 years |
| PostgreSQL Monthly | 5 years |
| Secrets Weekly | 2 years |
| GCS Sync Logs | 90 days |
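
The cleanup job has to decide when a backup has outlived its retention period. The following sketch shows one way to express the policy above; it is not the actual cleanup implementation, and the type and function names are hypothetical.

```typescript
// Hypothetical sketch of the retention policy; not the real cleanup job.
type BackupType =
  | "postgresql-daily"
  | "postgresql-monthly"
  | "secrets-weekly"
  | "gcs-sync-logs";

type Retention = { years: number } | { days: number };

// Values copied from the retention table above.
const RETENTION: Record<BackupType, Retention> = {
  "postgresql-daily": { years: 2 },
  "postgresql-monthly": { years: 5 },
  "secrets-weekly": { years: 2 },
  "gcs-sync-logs": { days: 90 },
};

// Date on which a backup created at `createdAt` becomes eligible for removal.
function expiryDate(type: BackupType, createdAt: Date): Date {
  const expiry = new Date(createdAt.getTime());
  const rule = RETENTION[type];
  if ("years" in rule) {
    expiry.setUTCFullYear(expiry.getUTCFullYear() + rule.years);
  } else {
    expiry.setUTCDate(expiry.getUTCDate() + rule.days);
  }
  return expiry;
}

function isExpired(type: BackupType, createdAt: Date, now: Date): boolean {
  return now.getTime() >= expiryDate(type, createdAt).getTime();
}
```

Because cleanup runs weekly, an expired backup may remain in the bucket for up to seven days past its expiry date.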

Storage Location

GCP Cloud Storage: keshless-backups bucket

keshless-backups/
├── postgresql/daily/YYYY-MM-DD/
├── postgresql/monthly/YYYY-MM/
├── documents/
└── secrets/weekly/
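
Given the layout above, the object prefix for a PostgreSQL backup can be derived from its date. This is a minimal sketch that assumes folder names use UTC dates, which the layout does not state; `dailyPrefix` and `monthlyPrefix` are hypothetical helpers.

```typescript
// Hypothetical helpers that build prefixes inside the keshless-backups bucket.
// Assumption: folder names are derived from the UTC date of the backup.

// e.g. postgresql/daily/2026-01-10/
function dailyPrefix(date: Date): string {
  return `postgresql/daily/${date.toISOString().slice(0, 10)}/`;
}

// e.g. postgresql/monthly/2026-01/
function monthlyPrefix(date: Date): string {
  return `postgresql/monthly/${date.toISOString().slice(0, 7)}/`;
}
```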

4. Disaster Scenarios

| Scenario | Category | Recovery Approach |
|----------|----------|-------------------|
| Single collection corruption | Minor | Restore specific collection |
| Database corruption | Major | Full PostgreSQL restore |
| KYC documents deleted | Major | Cloud Storage restore |
| API secrets compromised | Critical | Secrets restore + rotation |
| Complete data center failure | Catastrophic | Full system recovery |
| Ransomware attack | Catastrophic | Clean restore + forensics |

5. Recovery Procedures

PostgreSQL Recovery

| Command | Description |
|---------|-------------|
| --list | List available backups |
| --date YYYY-MM-DD | Restore from specific date |
| --collection NAME | Restore single collection |
| --dry-run | Preview without changes |
| --drop | Drop existing before restore (DANGER) |
| --monthly | Use monthly backup |

Always run --dry-run first.
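
The dry-run rule matters most in combination with `--drop`. The sketch below shows how the flags above could be parsed and how a destructive run could be detected before it executes. It is an illustration only and is not the parser used by `restore-postgresql.ts`; `RestoreOptions`, `parseRestoreArgs` and `isDestructive` are hypothetical names.

```typescript
// Hypothetical sketch: this is NOT the real parser in restore-postgresql.ts.
interface RestoreOptions {
  list: boolean;
  dryRun: boolean;
  drop: boolean;
  monthly: boolean;
  date?: string;       // value of --date
  collection?: string; // value of --collection
}

function parseRestoreArgs(argv: string[]): RestoreOptions {
  const opts: RestoreOptions = { list: false, dryRun: false, drop: false, monthly: false };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    switch (arg) {
      case "--list": opts.list = true; break;
      case "--dry-run": opts.dryRun = true; break;
      case "--drop": opts.drop = true; break;
      case "--monthly": opts.monthly = true; break;
      case "--date":
      case "--collection": {
        const value = argv[i + 1];
        if (value === undefined || value.startsWith("--")) {
          throw new Error(`${arg} requires a value`);
        }
        if (arg === "--date") opts.date = value;
        else opts.collection = value;
        i++; // skip the value that was just consumed
        break;
      }
      default:
        throw new Error(`Unknown flag: ${arg}`);
    }
  }
  return opts;
}

// A run is destructive when it would drop existing data for real.
function isDestructive(opts: RestoreOptions): boolean {
  return opts.drop && !opts.dryRun;
}
```

A wrapper could use `isDestructive` to require an explicit operator confirmation before any run that passes `--drop` without `--dry-run`.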

GCS Document Recovery

| Command | Description |
|---------|-------------|
| --list | List available backups |
| --folder NAME | Restore specific folder |
| --all | Restore all folders |
| --key PATH | Restore specific file |
| --overwrite | Overwrite existing files |

Secrets Recovery

  1. Download encrypted backup from Cloud Storage
  2. Decrypt using SECRETS_ENCRYPTION_KEY (stored in password manager)
  3. Restore to Cloud Run environment
  4. Rotate any compromised secrets

CRITICAL

SECRETS_ENCRYPTION_KEY must be stored in a secure password manager. Without it, secrets cannot be decrypted.


6. Recovery Time Estimates

| Recovery Type | Estimated Time |
|---------------|----------------|
| Single collection | 5-15 minutes |
| Full PostgreSQL | 30-60 minutes |
| Single GCS folder | 15-30 minutes |
| Full GCS restore | 1-2 hours |
| Secrets restore | 15 minutes |
| Full system | 2-4 hours |
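
As a rough cross-check, the component estimates can be summed and compared with the full-system figure. The sketch below assumes the database, document and secrets restores run one after another; the plan does not say whether they can run in parallel. The names are hypothetical.

```typescript
// Hypothetical cross-check of the estimates above, in minutes.
interface MinuteRange {
  min: number;
  max: number;
}

// Values copied from the table above.
const ESTIMATES = {
  fullPostgresql: { min: 30, max: 60 },
  fullGcsRestore: { min: 60, max: 120 },
  secretsRestore: { min: 15, max: 15 },
};

// Adds ranges end to end, i.e. assumes the steps run sequentially.
function sumRanges(ranges: MinuteRange[]): MinuteRange {
  return ranges.reduce(
    (total, r) => ({ min: total.min + r.min, max: total.max + r.max }),
    { min: 0, max: 0 },
  );
}

const restoreTotal = sumRanges([
  ESTIMATES.fullPostgresql,
  ESTIMATES.fullGcsRestore,
  ESTIMATES.secretsRestore,
]);
```

Under that assumption the three restores take between 105 and 195 minutes (1 h 45 min to 3 h 15 min). The rest of the 2-4 hour full-system estimate is then what remains for assessment, preparation and verification.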

7. Full Disaster Recovery Sequence

  1. ASSESS - Determine damage extent, identify latest backup
  2. PREPARE - Access backup bucket, verify integrity
  3. RESTORE DATABASE - List, dry-run, execute
  4. RESTORE DOCUMENTS - Priority: kyc, selfies, vendor-kyc
  5. RESTORE SECRETS - Decrypt and update environment
  6. VERIFY - Health checks, test transactions
  7. RESUME - Deactivate emergency controls, monitor
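
The sequence above is strictly ordered: each phase depends on the one before it. The following sketch shows one way to enforce that order and stop at the first failure. It is synchronous for brevity, whereas real restore steps would likely be asynchronous, and `runSequence` and its handler map are hypothetical.

```typescript
// Hypothetical runner for the seven recovery phases listed above.
const PHASES = [
  "ASSESS",
  "PREPARE",
  "RESTORE DATABASE",
  "RESTORE DOCUMENTS",
  "RESTORE SECRETS",
  "VERIFY",
  "RESUME",
] as const;

type Phase = (typeof PHASES)[number];

interface PhaseResult {
  phase: Phase;
  ok: boolean;
  error?: string;
}

// Runs each handler in order; a handler signals failure by throwing.
function runSequence(handlers: Record<Phase, () => void>): PhaseResult[] {
  const results: PhaseResult[] = [];
  for (const phase of PHASES) {
    try {
      handlers[phase]();
      results.push({ phase, ok: true });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ phase, ok: false, error: message });
      break; // later phases depend on this one, so stop here
    }
  }
  return results;
}
```

The returned list doubles as a record of which phases completed, which can feed the incident timeline.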

Post-Recovery Actions

  • [ ] Document incident timeline and actions
  • [ ] Verify data integrity against manifest
  • [ ] Notify regulators if required (within 72 hours)
  • [ ] Conduct post-mortem within 5 days
  • [ ] Update procedures based on lessons learned

8. DR Testing

| Test Type | Frequency |
|-----------|-----------|
| Backup verification | Weekly (automated) |
| Single collection restore | Monthly |
| Full restore drill | Quarterly |
| Full DR simulation | Annually |

9. Roles and Responsibilities

| Role | Responsibilities |
|------|------------------|
| DR Coordinator | Overall coordination (CTO) |
| Database Admin | PostgreSQL recovery |
| Infrastructure Lead | Cloud resources, secrets |
| Compliance Officer | Regulatory notification |
| Communications | Stakeholder updates |

10. Infrastructure Reference

GCP Resources

| Resource | Value |
|----------|-------|
| Project | contracts-470406 |
| Region | europe-west1 |
| Cloud SQL | eneza-40ab5:europe-west1:eneza-postgres |
| Cloud Run | keshless-api |

Storage Buckets

| Bucket | Purpose |
|--------|---------|
| keshless-documents | Primary documents |
| keshless-backups | Backup storage |

Quick Reference

┌─────────────────────────────────────────────────┐
│          DISASTER RECOVERY QUICK STEPS          │
├─────────────────────────────────────────────────┤
│ 1. Activate emergency control (if needed)       │
│ 2. List backups: restore-postgresql.ts --list   │
│ 3. Dry run: --date YYYY-MM-DD --dry-run         │
│ 4. Execute: --date YYYY-MM-DD [--drop]          │
│ 5. Verify document counts                       │
│ 6. Test authentication and transactions         │
│ 7. Deactivate emergency control                 │
└─────────────────────────────────────────────────┘

Document Control: Version 1.0 | January 2026 | IT Operations Team

Internal use only - Keshless Payment Platform