High availability

Also known as: HA

High availability (HA) is the practice of designing systems to keep running with minimal downtime, by removing single points of failure through redundancy, failover, and health monitoring.¹

Overview

Availability is often expressed in “nines”: 99.9% uptime allows about nine hours of downtime a year (0.1% of 8,760 hours), while 99.999% (“five nines”) allows about five minutes. Reaching the higher figures means that no single component (server, disk, network link, or power feed) can take the whole system down. The toolkit includes redundant hardware, load balancers that route around dead servers, RAID for disks, replicated databases, and automatic failover to a standby when the primary fails. The data center itself contributes redundant power and cooling.
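The "nines" arithmetic is easy to check by hand. The sketch below is illustrative only: the function name `downtime_hours_per_year` is made up for this example, and it assumes a 365-day year.

```python
# Yearly downtime budget implied by an availability target.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the allowed downtime per year, in hours."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        hours = downtime_hours_per_year(target)
        print(f"{target}% uptime -> {hours:.2f} h/year ({hours * 60:.1f} min)")
```

Each extra nine cuts the downtime budget by a factor of ten, which is why the last nines tend to be the most expensive to achieve.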

Where it fits

High availability is about staying up, while scalability is about handling more load; the two are related but distinct, and many techniques (replication, multiple servers) serve both. Orchestrators like Kubernetes automate failover for containerized apps. A hobby GopherTrunk node rarely justifies HA, but a monitoring site that must never miss a call would replicate capture nodes and back-end servers so any one can fail without an outage.
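To see why replication helps, here is a rough model of a group of replicas that stays up as long as at least one member is up. It is a sketch under a strong assumption: failures are independent, which shared power, network, or software bugs can easily violate. The 99% per-node figure and the name `combined_availability` are hypothetical.

```python
def combined_availability(node_availability: float, replicas: int) -> float:
    """Availability of a group that is up while at least one replica is up.

    Assumes replica failures are independent (optimistic in practice).
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The group is down only when every replica is down at the same time.
    return 1 - (1 - node_availability) ** replicas


if __name__ == "__main__":
    # Hypothetical capture node that is up 99% of the time.
    for n in (1, 2, 3):
        print(f"{n} replica(s): {combined_availability(0.99, n):.6f}")
```

Under this model, two 99% nodes give roughly four nines and three give roughly six. Real deployments usually fall short of that because failures are correlated and failover itself takes time.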

Sources

¹ High availability, Wikipedia (on availability, redundancy, and failover).
