Also known as: HA
High availability (HA) is the practice of designing systems to keep running with minimal downtime, by removing single points of failure through redundancy, failover, and health monitoring.1
Overview
Availability is often expressed in “nines”: 99.9% uptime allows about nine hours of downtime a year, while 99.999% (“five nines”) allows about five minutes. Reaching high numbers means that no single component — server, disk, network link, power feed — can take the whole system down. The toolkit includes redundant hardware, load balancers that route around dead servers, RAID for disks, replicated databases, and automatic failover to a standby when the primary fails. The data center itself contributes with redundant power and cooling.
Where it fits
High availability is about staying up, while scalability is about handling more load; the two are related but distinct, and many techniques (replication, multiple servers) serve both. Orchestrators like Kubernetes automate failover for containerized apps. A hobby GopherTrunk node rarely justifies HA, but a monitoring site that must never miss a call would replicate capture nodes and back-end servers so any one can fail without an outage.
Sources
-
High availability — Wikipedia, on availability, redundancy, and failover. ↩