Sam Newman, independent consultant and author of the book "Building Microservices", talked at the Velocity conference in London
on some of the challenges with hybrid systems relying on both serverless architectures
and traditional infrastructure. In particular, Newman discussed how serverless changes our notion of resiliency and how the two paradigms clash at times of high load in the system.
Resiliency in traditional server systems relies on state (for example, a database connection pool that throttles and controls the number of requests hitting the database at any particular point in time). Stability in this type of system is maintained by controlling incoming load and balancing it across multiple instances. But with ephemeral functions (lambdas) there is no place to store this controlling state, so there needs to be parity between the way functions auto-scale with load and the way the backend databases scale as well.
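The throttling role of a connection pool can be sketched as follows. This is an illustrative example, not code from the talk; the class name and sizes are hypothetical. The key point is the in-memory semaphore: the pool is the piece of state that caps how many concurrent requests ever reach the database, and an ephemeral function has nowhere to keep it.

```python
import threading

class BoundedConnectionPool:
    """Illustrative sketch: a fixed-size pool that throttles callers
    by blocking when all connections are checked out."""

    def __init__(self, create_conn, max_size=10):
        self._create_conn = create_conn          # factory for new connections
        self._slots = threading.Semaphore(max_size)  # caps concurrent checkouts
        self._idle = []                          # connections ready for reuse
        self._lock = threading.Lock()

    def acquire(self):
        # Blocks once max_size connections are in use, so the database
        # never sees more than max_size concurrent requests from this host.
        self._slots.acquire()
        with self._lock:
            return self._idle.pop() if self._idle else self._create_conn()

    def release(self, conn):
        with self._lock:
            self._idle.append(conn)
        self._slots.release()
```

Because the semaphore and the idle list live in the server process, a long-running service gets this back-pressure for free; a fleet of independently scaled lambdas does not.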
Auto-scaling cloud databases such as Amazon’s DynamoDB
or Google’s Bigtable
fit well in the serverless paradigm, but Newman pointed out that the majority of systems rely on traditional databases, so simply "bolting" serverless functions onto a legacy system can have drastic consequences. Newman highlighted the fact that even serverless poster child Bustle
faced unexpected challenges. Although they explicitly set a hard constraint of 1000 lambda connections to any one of their Redis
nodes (known to be able to handle ten times that number of connections), they still saw nodes failing, because lambda functions seem to keep their connections alive for up to three minutes after they have stopped (based on anecdotal evidence). Bustle's engineers had to delve into Redis's inner workings to fix this issue (forcing those zombie connections to time out faster), which highlights the mismatch between how serverless and non-serverless systems handle load and resiliency, Newman argued.
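The article does not detail Bustle's exact fix, but Redis does ship a built-in knob for this kind of problem: the `timeout` directive, which closes client connections that have been idle for the given number of seconds (0, the default, disables the check). A fix along the lines described might look like:

```shell
# Close client connections idle for more than 60 seconds,
# so connections abandoned by terminated lambdas are reaped quickly.
redis-cli CONFIG SET timeout 60
```

Note this applies to all clients of the node, so a low value can also cut off legitimately quiet long-lived clients.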
Another challenge Newman mentioned is that circuit breakers, typically used in microservices to gracefully handle downstream failure – effectively shedding load
and thus making the overall system more resilient – rely on maintaining state across multiple requests, for instance to be able to close the circuit (self-heal) once the downstream service has shown itself to be stable again.
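A minimal circuit-breaker sketch makes the state dependency concrete. The thresholds and names here are illustrative, not from the talk; the point is that `failures`, `state`, and `opened_at` must survive across many requests, which a single ephemeral function invocation cannot guarantee on its own.

```python
import time

class CircuitBreaker:
    """Illustrative sketch of a stateful circuit breaker."""

    def __init__(self, max_failures=5, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds before a probe is allowed
        self.failures = 0               # state carried across requests
        self.state = "closed"
        self.opened_at = None

    def call(self, fn):
        if self.state == "open":
            if time.monotonic() - self.opened_at < self.reset_after:
                # Shed load: fail fast instead of hammering the downstream.
                raise RuntimeError("circuit open: shedding load")
            self.state = "half-open"  # allow one probe request through
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures or self.state == "half-open":
                self.state = "open"
                self.opened_at = time.monotonic()
            raise
        else:
            self.failures = 0
            self.state = "closed"  # downstream healthy again: self-heal
            return result
```

Each decision (open, probe, close) reads counters written by earlier requests, which is exactly the cross-request state that has no natural home in a fleet of short-lived lambdas.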
Newman said service meshes such as Istio or Linkerd might help with some of these issues, acting as persistent stateful proxies that can coordinate load between microservices and functions.
Finally, from a security point of view, functions run in containers and are thus vulnerable to exploits where one container breaks out into another running on the same host. Such attacks become quite hard to mount, however, since the container in which a function runs lives only for a short period of time and cannot be exploited after the function terminates. Security experts such as Guy Podjarny warn, nonetheless, that serverless moves security concerns to the application level, and a chain of function calls can be exploited
if not secured correctly.
Newman also mentioned the concern many people have around lock-in when selecting a particular cloud vendor's implementation of Function-as-a-Service
(FaaS), an issue that was covered in a recent InfoQ eMag. The key to handling this concern is moving the discussion from lock-in to understanding (and accepting) the trade-off between moving faster (with less cognitive load) and the cost of migration (which is decreasing as feature sets become similar across different FaaS providers).