What Is Disaster Recovery?
Disaster recovery (DR) is the structured approach to restoring IT systems, applications, and data after a disruptive event — whether a cyberattack, hardware failure, natural disaster, or human error. It transforms the question "what happens when systems go down?" into a documented, tested, and auditable process.
A mature DR programme goes beyond backup tapes in a safe. It defines recovery objectives for every critical system, implements appropriate technical strategies, tests regularly, and provides the compliance evidence that frameworks demand.
Key DR Concepts
| Concept | Definition | Why It Matters |
|---|---|---|
| RPO (Recovery Point Objective) | Maximum acceptable data loss measured in time | Determines backup frequency and replication strategy |
| RTO (Recovery Time Objective) | Maximum acceptable downtime before systems must be restored | Determines recovery strategy and infrastructure investment |
| MTPD (Maximum Tolerable Period of Disruption) | Longest time the business can survive without a system | Sets the upper bound for RTO |
| BIA (Business Impact Analysis) | Assessment of impact from system unavailability | Drives RPO/RTO decisions and system criticality tiers |
| DR plan | Documented procedures for system recovery | Operational guide and compliance evidence |
| Failover | Switching operations to the standby environment | Actual recovery mechanism |
| Failback | Returning operations to the primary environment | Restoration after the disaster is resolved |
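These objectives fit together in a strict order: the RPO drives backup frequency, the RTO drives the recovery strategy, and the MTPD caps the RTO. A minimal sketch of that relationship follows; `RecoveryObjectives`, `rpo_breached`, and `rto_breached` are hypothetical names for illustration, and timestamps are assumed to be UTC-aware.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class RecoveryObjectives:
    """Recovery targets for one system, as agreed in the BIA."""
    rpo: timedelta   # maximum acceptable data loss
    rto: timedelta   # maximum acceptable downtime
    mtpd: timedelta  # longest tolerable disruption; upper bound for RTO

    def __post_init__(self) -> None:
        # The MTPD sets the ceiling: an RTO longer than the MTPD is unworkable.
        if self.rto > self.mtpd:
            raise ValueError("RTO must not exceed MTPD")

def rpo_breached(last_backup: datetime, objectives: RecoveryObjectives) -> bool:
    """True if data written since the last good backup exceeds the RPO window."""
    return datetime.now(timezone.utc) - last_backup > objectives.rpo

def rto_breached(outage_start: datetime, objectives: RecoveryObjectives) -> bool:
    """True if the current outage has already run longer than the RTO."""
    return datetime.now(timezone.utc) - outage_start > objectives.rto

# Example: a Tier 1 system with a 15-minute RPO and a 1-hour RTO.
tier1 = RecoveryObjectives(rpo=timedelta(minutes=15),
                           rto=timedelta(hours=1),
                           mtpd=timedelta(hours=4))
```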
DR Strategy Tiers
| Strategy | RPO | RTO | Cost | How It Works |
|---|---|---|---|---|
| Backup and restore | Hours to days | Hours to days | Lowest | Regular backups restored to new infrastructure |
| Pilot light | Minutes to hours | Hours | Low-medium | Minimal core services running, scale up on demand |
| Warm standby | Minutes | Minutes to hours | Medium | Scaled-down replica with recent data, scale up on failover |
| Hot standby | Seconds to minutes | Minutes | High | Full replica with real-time replication |
| Active-active | Near zero | Near zero | Highest | Multiple active sites sharing load, automatic failover |
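Choosing between these tiers is essentially a lookup from the RPO and RTO your BIA produced. The sketch below shows that mapping; the threshold values are illustrative assumptions rather than fixed boundaries, and `select_dr_strategy` is a hypothetical name.

```python
from datetime import timedelta

# Illustrative thresholds mirroring the strategy table above; real
# boundaries should come from your own BIA and cost analysis.
def select_dr_strategy(rpo: timedelta, rto: timedelta) -> str:
    if rpo <= timedelta(seconds=30) and rto <= timedelta(seconds=30):
        return "active-active"
    if rpo <= timedelta(minutes=5) and rto <= timedelta(minutes=30):
        return "hot standby"
    if rpo <= timedelta(minutes=30) and rto <= timedelta(hours=2):
        return "warm standby"
    if rpo <= timedelta(hours=2) and rto <= timedelta(hours=8):
        return "pilot light"
    return "backup and restore"

print(select_dr_strategy(timedelta(minutes=10), timedelta(hours=1)))  # warm standby
```

In practice the decision also weighs the cost column: pick the cheapest strategy whose guarantees still satisfy the objectives.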
System Criticality Classification
| Tier | Description | Typical RPO | Typical RTO | Examples |
|---|---|---|---|---|
| Tier 1 — Mission critical | Systems where downtime causes immediate revenue loss or regulatory breach | < 1 hour | < 1 hour | Payment processing, core database, authentication |
| Tier 2 — Business critical | Systems that significantly impact operations | < 4 hours | < 4 hours | ERP, CRM, email, primary applications |
| Tier 3 — Business important | Systems needed for daily operations but with workarounds | < 24 hours | < 24 hours | File sharing, internal tools, reporting |
| Tier 4 — Non-critical | Systems with minimal immediate business impact | < 72 hours | < 72 hours | Development environments, archives |
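For tooling that enforces these tiers, the table can be encoded directly as data. The sketch below is one way to do that, treating the typical values above as ceilings; `CRITICALITY_TIERS` and `within_tier` are hypothetical names.

```python
from datetime import timedelta

# The classification table encoded as data, so tooling can validate that
# each system's agreed RPO/RTO stays within its tier's ceiling. The values
# mirror the "typical" columns above and are illustrative.
CRITICALITY_TIERS = {
    1: {"name": "mission critical",   "max_rpo": timedelta(hours=1),  "max_rto": timedelta(hours=1)},
    2: {"name": "business critical",  "max_rpo": timedelta(hours=4),  "max_rto": timedelta(hours=4)},
    3: {"name": "business important", "max_rpo": timedelta(hours=24), "max_rto": timedelta(hours=24)},
    4: {"name": "non-critical",       "max_rpo": timedelta(hours=72), "max_rto": timedelta(hours=72)},
}

def within_tier(tier: int, rpo: timedelta, rto: timedelta) -> bool:
    """Check that a system's objectives do not exceed its tier's ceilings."""
    limits = CRITICALITY_TIERS[tier]
    return rpo <= limits["max_rpo"] and rto <= limits["max_rto"]
```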
DR Testing Approaches
| Test Type | Scope | Effort | Frequency | What It Validates |
|---|---|---|---|---|
| Tabletop exercise | Walk through DR plan as a group discussion | Low | Quarterly | Plan completeness, team awareness, decision-making |
| Walkthrough test | Step through procedures without executing | Low-medium | Semi-annually | Procedure accuracy, role assignments |
| Functional test | Test individual recovery components | Medium | Semi-annually | Backup restoration, failover mechanisms |
| Full failover test | Complete switch to DR environment | High | Annually | End-to-end recovery capability |
| Surprise test | Unannounced recovery exercise | High | Annually (optional) | True readiness, undocumented dependencies |
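A common way to keep this cadence auditable is to track the last-run date per test type and flag anything overdue. A minimal sketch follows, assuming the frequencies in the table above; `TEST_CADENCE` and `overdue_tests` are hypothetical names and the day counts are illustrative.

```python
from datetime import date, timedelta

# Required cadence per test type, approximating the table above.
TEST_CADENCE = {
    "tabletop exercise":  timedelta(days=90),   # quarterly
    "walkthrough test":   timedelta(days=182),  # semi-annually
    "functional test":    timedelta(days=182),  # semi-annually
    "full failover test": timedelta(days=365),  # annually
}

def overdue_tests(last_run: dict[str, date], today: date | None = None) -> list[str]:
    """Return test types whose last execution is older than the required cadence."""
    today = today or date.today()
    return [name for name, cadence in TEST_CADENCE.items()
            if today - last_run.get(name, date.min) > cadence]
```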
Cloud DR Patterns
| Pattern | Description | RPO/RTO | Cost Optimisation |
|---|---|---|---|
| Cross-region backup | Backups replicated to another cloud region | Hours/Hours | Pay for storage only, compute on demand |
| Pilot light | Database replication, minimal compute in DR region | Minutes/Hours | Minimal running compute, scale on failover |
| Warm standby | Scaled-down replica in DR region | Minutes/Minutes | Reduced-size instances, auto-scale on failover |
| Multi-region active | Full deployment in multiple regions | Near zero/Near zero | Full infrastructure cost in multiple regions |
| Multi-cloud | DR in a different cloud provider | Varies | Avoids single-provider dependency, higher complexity |
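As one concrete instance of the cross-region backup pattern, the sketch below copies an EBS snapshot into a DR region with boto3. It assumes AWS; the region names and snapshot ID are placeholders, and error handling and pagination are omitted.

```python
import boto3

# Copy an EBS snapshot from the primary region into the DR region.
# Cross-region copies are initiated from the *destination* region.
SOURCE_REGION = "eu-central-1"  # placeholder primary region
DR_REGION = "eu-west-1"         # placeholder DR region

ec2_dr = boto3.client("ec2", region_name=DR_REGION)

response = ec2_dr.copy_snapshot(
    SourceRegion=SOURCE_REGION,
    SourceSnapshotId="snap-0123456789abcdef0",  # placeholder snapshot ID
    Description="Cross-region DR copy",
)
print("DR snapshot:", response["SnapshotId"])
```

A scheduled job running this copy, with its logs retained, doubles as audit evidence that the pattern is actually operating.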
Compliance Requirements
Framework Mapping
| Requirement | ISO 27001 | SOC 2 | NIS2 | DORA |
|---|---|---|---|---|
| DR/continuity policy | A.5.29 | A1.2 | Art. 21(2)(c) | Art. 11(1) |
| Business impact analysis | A.5.29 | A1.2 | Art. 21(2)(c) | Art. 11(2) |
| Recovery objectives (RPO/RTO) | A.5.29 | A1.2 | Art. 21(2)(c) | Art. 11(3) |
| DR plan documentation | A.5.30 | A1.2 | Art. 21(2)(c) | Art. 11(4) |
| Regular DR testing | A.5.30 | A1.3 | Art. 21(2)(c) | Art. 11(6) |
| DR plan review and update | A.5.30 | A1.3 | Art. 21(2)(c) | Art. 11(7) |
| Communication plan | A.5.30 | A1.2 | Art. 23 | Art. 14 |
| Third-party DR requirements | A.5.21 | CC9.2 | Art. 21(2)(d) | Art. 28 |
Audit Evidence
| Evidence Type | Description | Framework |
|---|---|---|
| Business impact analysis | Documented BIA with system criticality tiers | All frameworks |
| DR plan | Complete, current plan with procedures and contacts | All frameworks |
| RPO/RTO definitions | Documented objectives per system with business approval | All frameworks |
| DR test results | Test reports with outcomes, issues found, remediation | All frameworks |
| Backup verification | Regular backup restoration testing records | ISO 27001, SOC 2 |
| Communication plan | Documented escalation and notification procedures | NIS2, DORA |
| Third-party DR assessment | Vendor DR capability evaluation | DORA |
| DR plan review records | Evidence of annual review and update | All frameworks |
Common Mistakes
| Mistake | Risk | Fix |
|---|---|---|
| No tested DR plan | Untested plan fails during actual disaster | Test DR annually, fix issues found |
| RPO/RTO not business-aligned | Over- or under-investing in recovery capability | Conduct BIA with business stakeholders |
| Backup without restoration testing | Backups may be corrupt or incomplete | Test backup restoration monthly (see the verification sketch after this table) |
| Ignoring dependencies | Recovery fails due to unrecovered dependencies | Map and document all system dependencies |
| No communication plan | Chaos during disaster, regulatory notification failures | Document communication procedures and practice |
| DR plan not updated | Outdated plan references decommissioned systems | Review and update DR plan after every major change |
| Single-region cloud | Cloud region outage takes down everything | Implement cross-region DR strategy |
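The restoration-testing fix deserves emphasis: a backup only counts once a restore of it has been verified. The sketch below shows the verification half, comparing a restored file against a checksum recorded at backup time; the restore step itself depends on your backup tooling, and `verify_restore` is a hypothetical name.

```python
import hashlib
from pathlib import Path

def sha256(path: Path) -> str:
    """Stream a file through SHA-256 so large backups fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_restore(restored: Path, expected_sha256: str) -> bool:
    """Compare a restored file against the checksum recorded at backup time."""
    return sha256(restored) == expected_sha256
```

Keeping the pass/fail output of each monthly run provides the backup verification records listed under audit evidence above.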
How Orbiq Supports Disaster Recovery Compliance
Orbiq helps you demonstrate disaster recovery controls:
- Evidence collection — Centralise BIA documents, DR plans, test results, and backup verification records
- Continuous monitoring — Track DR control effectiveness and test schedules
- Trust Center — Share your disaster recovery posture via your Trust Center
- Compliance mapping — Map DR controls to ISO 27001, SOC 2, NIS2, and DORA
- Audit readiness — Pre-built evidence packages for auditor review
Further Reading