Mark Syznaka
FinOps Architect at CloudeBroker
Unchecked AWS Lambda functions can generate substantial unplanned costs. Learn how this practitioner prevents these anomalies.
It’s exciting to use serverless functions, like AWS Lambda, to process workflows in the cloud at scale. However, without proper monitoring, these functions can loop and create spikes in cloud costs. If not caught, those costs keep accumulating, as the account below shows.
A development team was creating a Lambda function to act as a gatekeeper that triggered analysis of large amounts of data. The Lambda code had a mechanism to check for an event and, if one was detected, to import the data, analyze it, and generate reports. Running in a large development environment, this Lambda encountered a persistent event that caused it to repeat its work endlessly in a tight loop. The code was put in place on a Friday before a three-day weekend.
After the long weekend, the FinOps team detected the cost spike by chance while reviewing general spend levels in a third-party tool. By this point, the Lambda charges and the analytics service charges were over $10,000 combined. Previously, the monthly Lambda charges had never exceeded $500.
The developer was asked to insert and test code that detects looping and stops the program when it is found. After the anomaly was detected, the FinOps team used CloudWatch anomaly detection to alert the stakeholders if spend on either of these two services deviated from its baseline by more than 20% in an hour. The developer was able to update the Lambda program in under two weeks so that it checks for looping and takes action when looping is detected.
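The two safeguards described above can be sketched roughly as follows. This is a minimal illustration, not the team's actual code: `LoopGuard`, `should_stop`, `exceeds_baseline`, the event key, and the default limits are all hypothetical names and values chosen for the example. It also assumes the repeat counts can be held in memory, which is not guaranteed across Lambda execution environments; a real function would likely keep the counter in a shared store.

```python
import time
from collections import deque


class LoopGuard:
    """Hypothetical guard: trips when one event key repeats too often in a time window."""

    def __init__(self, max_repeats=16, window_seconds=60.0):
        # Both defaults are illustrative assumptions, not recommended values.
        self.max_repeats = max_repeats
        self.window_seconds = window_seconds
        self._seen = {}  # event key -> deque of timestamps when it was handled

    def should_stop(self, event_key, now=None):
        """Record one handling of event_key; return True once it looks like a loop."""
        now = time.monotonic() if now is None else now
        stamps = self._seen.setdefault(event_key, deque())
        # Discard timestamps that have aged out of the sliding window.
        while stamps and now - stamps[0] > self.window_seconds:
            stamps.popleft()
        stamps.append(now)
        return len(stamps) > self.max_repeats


def exceeds_baseline(current_spend, baseline_spend, threshold=0.20):
    """Return True when spend deviates from baseline by more than `threshold` (a fraction)."""
    if baseline_spend <= 0:
        # With no baseline to compare against, treat any spend as a deviation.
        return current_spend > 0
    return abs(current_spend - baseline_spend) / baseline_spend > threshold
```

In a handler, the function would call `should_stop` with some stable identifier for the incoming event before doing any work, and exit early (and ideally raise an alert) when it returns `True`. The `exceeds_baseline` check mirrors the 20% hourly rule in plain arithmetic; the managed alerting the FinOps team configured is a separate service and is not reproduced here.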
AWS has since added recursive loop detection to Lambda, in July 2023. Lambda now detects functions that appear to be running in a recursive loop and drops requests once they exceed 16 invocations. Even so, FinOps practitioners and engineers still need to stay vigilant and build policy, practice, and awareness around these types of potential cloud cost spikes.