# Implement Redundancy Strategies

## High Availability

- **Maximum tolerable downtime (MTD)**
    - Scheduled service intervals vs unplanned outages
- Scalability
    - Increase capacity within similar cost ratio
    - Scale out vs scale up
- Elasticity
    - Cope with changes to demand in real time
- **Fault tolerance and redundancy**

| **Availability** | **Annual Downtime (hh:mm:ss)** |
| ---------------- | ------------------------------ |
| 99.9999%         | 00:00:32                       |
| 99.999%          | 00:05:15                       |
| 99.99%           | 00:52:34                       |
| 99.9%            | 08:45:36                       |
| 99.0%            | 87:36:00                       |

## Power Redundancy

- Power problems
    - Spikes and surges
    - Blackouts and brownouts
- Dual power supplies
    - Component redundancy for server chassis
- Managed power distribution units (PDUs)
    - Protection against spikes, surges, and brownouts
    - Remote monitoring
- Battery backups and uninterruptible power supply (UPS)
    - Battery backup at component level
    - UPS battery backups for servers and appliances
- Generators

## Network Redundancy

- Network interface card (NIC)/adapter teaming
    - Adapters with multiple ports
    - Multiple adapters
        - Grouping multiple NICs together is NIC teaming
    - Can be for redundancy or for load balancing
    - More bandwidth (except during failover)
- Switching and routing
    - Design network with multiple paths
- Load balancers
    - Load balancing switch to distribute workloads
    - Clusters provision multiple redundant servers to share data and session information

![[Pasted image 20230713092328.png]]

## Geographical Redundancy and Replication

- Replication context
    - **Local storage (RAID)**
    - **Storage area network (SAN)**
    - Database
    - Virtual machine (VM)
- Geographic dispersal
- Asynchronous and synchronous replication
    - **Synchronous (must be written at both sites; expensive)**
        - Active-active
    - **Asynchronous (one site is primary and the others secondary)**
    - Optimum distances between sites
- On-premises versus cloud

# Implement Backup Strategies

## Backups and Retention Policy

- Short term retention
    - Version control and recovery from corruption/malware
- Long term retention
    - Regulatory/business requirements
- Recovery window
    - Recovery point objective (RPO)

![[Pasted image 20230713095340.png]]

![[Pasted image 20230713095409.png]]

- Differential
    - All data added/changed since the previous full backup
- Incremental
    - Only data added/changed since the previous backup of any type, whether that was the full backup or the last incremental

## Snapshots

- Snapshots
    - Feature of file system allowing open file copy
    - Volume Shadow Copy Service (VSS)
- VM snapshots and checkpoints
- Image-based backup
    - System images

## Backup Storage Issues

- Backup security
    - Access control and encryption
- Offsite storage
    - Distance consideration
    - Physical transfer
    - Network/cloud backups
- Online versus offline backups
    - Speed of restore operations
    - Risk to online backup data
- 3-2-1 rule

![[Pasted image 20230713095539.png]]

## Backup Media Types

- Disk
    - SOHO backups
    - Lack enterprise-level capacity and manageability
- Network attached storage (NAS)
    - File-level/protocol-based access
    - No offsite option
- Tape
    - Enterprise-level capacity and manageability
- Storage area network (SAN) and cloud
    - Block-level access to storage devices
    - Highly configurable
    - Mix storage technologies to implement performance tiers

## Restoration Order

1) Power delivery systems
2) Switch infrastructure, then routing appliances and systems
3) Network security appliances
4) Critical network servers
5) Backend and middleware; verify data integrity
6) Front-end applications
7) Client workstations and devices, and client browser access

## Non-Persistence

- Separate compute instance from data
- Snapshot/revert to known state
- Rollback to known configuration
- Live boot media
- Provisioning
    - Master image
    - Automated build from template
- Configuration validation

# Implement Cybersecurity Resilience Strategies

## Configuration Management

- Service assets
- Configuration items (CIs)
    - Assets that require configuration management
- Baseline configuration
- Configuration management system (CMS)
- Creating and updating diagrams
    - Workflows
    - Physical and logical network topologies
    - Network rack layouts
    - etc.

## Asset Management

- Inventory/asset management database
- Asset identification and standard naming conventions
    - Barcodes and RFID tags
    - Standard naming conventions for asset IDs
    - Attribute fields and tags
- Internet protocol (IP) schema
    - Static allocation vs DHCP ranges
    - **IP address management (IPAM)** software suites

## Change Control and Change Management

- Change control
    - Assess whether a change should be made
    - **Classifying change (reactive, proactive, risk)**
    - **Request for Change (RFC)**
    - **Change Advisory Board (CAB)**
- Change management
    - Ensure changes are applied with minimum disruption
    - Rollback plan

## Site Resiliency

- Alternate processing sites/recovery sites
    - Provide redundancy for damage to resources stored on the primary site
- Failover to alternate processing site (or system)
- **Hot site**
    - Instant failover
- **Warm site**
    - Some delay or manual configuration before failover occurs
- **Cold site**
    - Significant delay and configuration before failover can occur

## Diversity and Defense in Depth

- Layered security and defense in depth
- Technology and control diversity
    - Provision different classes and types of controls
    - Mix technical, administrative, and physical controls
    - Deploy controls to prevent, deter, detect, and correct
- Vendor diversity
    - Use more than one supplier
- Crypto diversity

## Deception and Disruption Strategies

- Asymmetry of attack and defense
- Active defense
- Fake/decoy assets
    - Honeypots, honeynets, and honeyfiles
    - Breadcrumbs
- Disruption strategies
    - Bogus DNS records
    - Decoy directories and resources
    - Port spoofing to return fake telemetry/monitoring data
    - DNS sinkholes
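
The DNS sinkhole item above can be illustrated with a minimal Python sketch. It is a toy model only, not how any particular DNS server implements sinkholing. All names here (`SINKHOLE_IP`, `is_blocked`, `resolve`), the blocklist entry, and the `upstream` callable are hypothetical and chosen for illustration; `192.0.2.1` is an address from a range reserved for documentation and stands in for a real sinkhole host.

```python
from typing import Callable, Set

# Assumption: a documentation-range address standing in for the sinkhole host.
SINKHOLE_IP = "192.0.2.1"


def is_blocked(domain: str, blocklist: Set[str]) -> bool:
    """Return True if the domain or any of its parent domains is blocklisted."""
    labels = domain.lower().rstrip(".").split(".")
    # Check "a.b.c", then "b.c", then "c" against the blocklist.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def resolve(domain: str, blocklist: Set[str], upstream: Callable[[str], str]) -> str:
    """Answer with the sinkhole address for blocked names, else ask upstream."""
    if is_blocked(domain, blocklist):
        return SINKHOLE_IP
    return upstream(domain)


# Hypothetical blocklist and a stub upstream resolver for demonstration.
blocklist = {"malware.example"}
stub_upstream = lambda name: "10.0.0.5"

blocked_answer = resolve("c2.malware.example", blocklist, stub_upstream)
normal_answer = resolve("intranet.example", blocklist, stub_upstream)
```

In this sketch a query for a blocklisted name (or any subdomain of it) never reaches the upstream resolver, so the client is steered to a host the defender controls, while other names resolve normally.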