Datacenter High Availability
Overview
In SD-WAN solutions, high availability (HA) for Data Centers (DCs) is critical. This setup uses an Active-Passive HA configuration, where each tenant has two DCs (Active and Passive) and a gateway node connecting the DCs to the LAN.
Each DC has a public IP endpoint that terminates the branch IPsec tunnels. If the Active DC fails, the Passive DC takes over and all branches reconnect to it. The dc-monitor micro-service checks the Active DC's status every 20 seconds and initiates a switchover to the Passive DC if necessary. The Open Shortest Path First (OSPF) protocol is configured between the Active DC, the Passive DC, and the Gateway node so that routes are injected automatically during failover.
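The polling behaviour described above can be pictured with a minimal Python sketch. It is not the dc-monitor implementation: the `probe` callable, the `failure_threshold` and `max_polls` parameters, and the function name are hypothetical, and only the 20-second interval comes from this document.

```python
import time
from typing import Callable

POLL_INTERVAL_SECONDS = 20  # polling interval stated in this document


def monitor_active_dc(
    probe: Callable[[], bool],
    failure_threshold: int = 3,  # hypothetical: consecutive failed polls before switchover
    max_polls: int = 10,         # hypothetical: bounds the loop so the sketch terminates
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the Active DC and return the name of the DC that should serve traffic."""
    consecutive_failures = 0
    for _ in range(max_polls):
        if probe():
            # A healthy poll resets the failure counter.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= failure_threshold:
                return "DC-P"  # switch over to the Passive DC
        sleep(POLL_INTERVAL_SECONDS)
    return "DC-A"  # the Active DC stayed healthy for the whole window
```

Requiring several consecutive failures is an assumption made here to avoid switching over on a single missed poll; the real service may use a different rule.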
Topology
This setup involves:
- DC-A (Active)
- DC-P (Passive)
- AMZ-Gateway (Gateway connecting the DCs to LAN)
- BR-1 and BR-2 (Branches)
Setup Configuration
Ensure your setup resembles the above topology. Follow these steps to configure HA:
- Onboard Devices: Add AMZ-Gateway, DC-A, DC-P, BR-1, and BR-2 CPEs via a secure shell. Verify that dc-monitor is running, as it is essential for HA.
- Add Metadata: Add metadata that identifies the DC and Branch CPEs.
- Verify Onboarding: Confirm that all devices are onboarded in the Alpsee UI.
- OSPF Configuration: Configure OSPF on AMZ-Gateway, DC-A, and DC-P for reverse route injection.
OSPF Configuration
AMZ-Gateway Configuration:
- Enable OSPF.
- Add Router ID.
- Create a new area (recommended ID: 0).
- Add active interface for route learning (network type: point_to_multipoint).
- Add passive interface to reach DC LAN servers (network type: broadcast).
DC-A (Active):
- Enable OSPF.
- Add Router ID.
- Configure route redistribution for kernel routes.
- Add area (recommended ID: 0).
- Add active interface (network type: point_to_multipoint).
DC-P (Passive):
- Enable OSPF.
- Add Router ID.
- Configure route redistribution for kernel routes.
- Add area (use the same ID as on DC-A and AMZ-Gateway; recommended ID: 0).
- Add active interface (network type: point_to_multipoint).
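The three checklists above can be summarized as data and checked for consistency. The Python sketch below is illustrative only: the dictionary layout, the field names, and `find_mismatches` are hypothetical and do not reflect the product's configuration schema. It assumes DC-P uses area 0 like the other two nodes, since OSPF neighbors must share an area ID to form an adjacency.

```python
from typing import Dict, List

# Hypothetical representation of the settings listed above.
OSPF_SETTINGS: Dict[str, dict] = {
    "AMZ-Gateway": {
        "area": 0,
        "redistribute_kernel": False,
        "interfaces": [
            {"role": "active", "network_type": "point_to_multipoint"},
            {"role": "passive", "network_type": "broadcast"},
        ],
    },
    "DC-A": {
        "area": 0,
        "redistribute_kernel": True,
        "interfaces": [{"role": "active", "network_type": "point_to_multipoint"}],
    },
    "DC-P": {
        "area": 0,  # assumption: same area as DC-A and AMZ-Gateway
        "redistribute_kernel": True,
        "interfaces": [{"role": "active", "network_type": "point_to_multipoint"}],
    },
}


def find_mismatches(settings: Dict[str, dict]) -> List[str]:
    """Return human-readable problems; an empty list means the settings agree."""
    problems = []
    areas = {node: cfg["area"] for node, cfg in settings.items()}
    if len(set(areas.values())) > 1:
        problems.append(f"area IDs differ: {areas}")
    for node, cfg in settings.items():
        active = [i for i in cfg["interfaces"] if i["role"] == "active"]
        if not active:
            problems.append(f"{node}: no active interface")
        for iface in active:
            if iface["network_type"] != "point_to_multipoint":
                problems.append(f"{node}: active interface is {iface['network_type']}")
        # Both DCs redistribute kernel routes in the checklists above.
        if node.startswith("DC-") and not cfg["redistribute_kernel"]:
            problems.append(f"{node}: kernel route redistribution is off")
    return problems


print(find_mismatches(OSPF_SETTINGS))
```

A check like this is useful mainly as a pre-flight review of planned values before they are entered on each node.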
Verification
- OSPF Routes:
  - AMZ-Gateway: Verify external routes are learned from DC-A.
  - DC-A: Confirm no external routes (Active status).
  - DC-P: Confirm external routes learned from DC-A.
- Route Checks:
  - Verify routes across all CPEs (AMZ-Gateway, DC-A, DC-P, BR-1, BR-2).
- IP Verification:
  - Confirm IP assignment from CPEs for BR-1, BR-2, and DC LAN servers.
- Traffic Testing:
  - Run input/output operations on BR-1 and BR-2 LAN PCs, monitoring traffic on DC LAN servers using tcpdump.
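For the traffic test, a capture on a DC LAN server can be limited to packets coming from the branch LANs. The Python sketch below only assembles a tcpdump command line. The interface name `eth1` and the two subnets are placeholders, not values from this setup, and `build_tcpdump_command` is a hypothetical helper.

```python
import shlex
from typing import List


def build_tcpdump_command(
    interface: str,
    branch_subnets: List[str],
    packet_count: int = 20,
) -> List[str]:
    """Build a tcpdump argument list that captures traffic sourced from the branch subnets."""
    # pcap filter: match packets whose source address is in any branch subnet.
    capture_filter = " or ".join(f"src net {subnet}" for subnet in branch_subnets)
    # -n: no name resolution, -i: capture interface, -c: stop after N packets.
    return ["tcpdump", "-n", "-i", interface, "-c", str(packet_count), capture_filter]


# Placeholder values; substitute the server's LAN interface and the real branch LAN subnets.
command = build_tcpdump_command("eth1", ["192.0.2.0/25", "192.0.2.128/25"])
print(shlex.join(command))
```

The printed command can be run on the DC LAN server with sufficient privileges while the branch PCs generate traffic. After a switchover, the same capture should show the branch traffic arriving through DC-P.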