#22: Graceful Degradation
Hello! Today, we will discuss a fundamental concept in reliability: graceful degradation.
Complex systems regularly encounter unexpected events, such as a sudden spike in requests or a failing external dependency. When these events occur, a service can deliberately reduce its quality of service to avoid a complete failure. This is known as graceful degradation.
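To make the idea tangible, here is a minimal sketch in Python. It is not taken from a real system: `get_recommendations`, `fetch_personalized`, and `DEFAULT_ITEMS` are hypothetical names chosen for illustration. The handler tries the full-quality path first and, if the dependency fails, returns a reduced but still useful response instead of an error.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Response:
    items: list[str]
    degraded: bool  # True when we served the reduced-quality answer


# Hypothetical static fallback, e.g. a precomputed list of popular items.
DEFAULT_ITEMS = ["bestseller-1", "bestseller-2"]


def get_recommendations(
    user_id: str,
    fetch_personalized: Callable[[str], list[str]],
) -> Response:
    """Serve personalized results, or a generic list if the dependency fails."""
    try:
        return Response(items=fetch_personalized(user_id), degraded=False)
    except (TimeoutError, ConnectionError):
        # Degrade on purpose: a generic answer beats a failed request.
        return Response(items=list(DEFAULT_ITEMS), degraded=True)
```

The `degraded` flag is a design choice in this sketch: it lets callers and dashboards tell a reduced answer apart from a normal one.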
Let’s look at a concrete example from my experience at a previous company. We had a classic service that exposed a REST API over HTTP and talked to a database through a connection pool. To avoid overwhelming the database, each service instance was configured with a pool of 100 connections.
Here’s what happened during an unexpected traffic surge:
Unexpected load spike: The service suddenly came under heavy load because of a spike in user requests.
Connection pool limit reached: Each request required a database connection, and once the number of concurrent requests exceeded the pool’s capacity, new requests began to queue up.
Autoscaling triggered: Kubernetes autoscaling kicked in and spun up new instances of the service to handle the increased load.
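The queueing behavior in the second step can be reproduced with a toy model. The sketch below is an assumption-laden simulation, not the service described above: `ConnectionPool`, `handle_request`, and `simulate` are hypothetical names, a semaphore stands in for the real pool, and `time.sleep` stands in for a query. It only shows that once every connection is in use, additional requests have to wait.

```python
import threading
import time


class ConnectionPool:
    """Toy stand-in for a database connection pool (not a real driver)."""

    def __init__(self, size: int) -> None:
        self._slots = threading.BoundedSemaphore(size)
        self._lock = threading.Lock()
        self.in_use = 0
        self.peak_in_use = 0
        self.waited = 0  # requests that found the pool exhausted

    def acquire(self) -> None:
        # Try without blocking first so we can count requests that queue.
        if not self._slots.acquire(blocking=False):
            with self._lock:
                self.waited += 1
            self._slots.acquire()  # block until a connection is released
        with self._lock:
            self.in_use += 1
            self.peak_in_use = max(self.peak_in_use, self.in_use)

    def release(self) -> None:
        with self._lock:
            self.in_use -= 1
        self._slots.release()


def handle_request(pool: ConnectionPool, query_seconds: float) -> None:
    pool.acquire()
    try:
        time.sleep(query_seconds)  # simulated database query
    finally:
        pool.release()


def simulate(pool_size: int, requests: int, query_seconds: float = 0.2) -> ConnectionPool:
    """Fire `requests` concurrent requests at a pool of `pool_size` connections."""
    pool = ConnectionPool(pool_size)
    threads = [
        threading.Thread(target=handle_request, args=(pool, query_seconds))
        for _ in range(requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool
```

For example, `simulate(pool_size=2, requests=6)` returns a pool whose `peak_in_use` never exceeds 2, while `waited` counts the requests that had to queue for a connection.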