Running NodeJS in an AWS Lambda will make you lose your mind! Or at least your memory ;)

My lambda keeps running out of memory??

We have a Lambda in AWS running NodeJS code that would occasionally time out. The default Lambda logs print out how much memory is available and how much is currently used. This would show that our used memory was equal to the available memory, which led us to investigate whether there was a memory leak in our application.

While a timeout is manageable (at least it’s not a crash!), I want to fully understand the root cause of the issue. It’s easy to assume that there must be some memory leak in the code causing this; however, after investigating the issue, I discovered that it is an unfortunate casualty of poor default configurations.

TLDR: Increase your lambda’s memory to > 512MB.

Edit (9/27/23): it looks like you can now add NODE_OPTIONS: '--max-old-space-size=<some value less than 128>' as an environment variable to your lambda. Thanks to Mike Bianco for finding this!
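If you’d rather script that environment variable than click through the console, here is a minimal sketch with the AWS SDK for JavaScript v3 (the function name and the value of 100 are placeholders; note that this call replaces all existing environment variables on the function, so merge in any you already have):

<code>
// sketch: set NODE_OPTIONS on a Lambda with the AWS SDK for JavaScript v3
const { LambdaClient, UpdateFunctionConfigurationCommand } = require("@aws-sdk/client-lambda");

async function setNodeOptions() {
  const client = new LambdaClient({});
  await client.send(new UpdateFunctionConfigurationCommand({
    FunctionName: "my-node-lambda", // placeholder function name
    Environment: {
      // NOTE: this replaces ALL existing environment variables on the function
      Variables: { NODE_OPTIONS: "--max-old-space-size=100" },
    },
  }));
}

setNodeOptions().catch(console.error);
</code>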

[Image: Homer Simpson meme]

First off, it helps to be able to see the historical memory usage in a graph. We can get these graphs in CloudWatch by turning on the extra monitoring tools in the lambda’s configuration.

[Image: the Lambda configuration’s “Monitoring and operations tools” panel]

The “Enhanced monitoring” option allows us to start tracking memory usage and gives us a graph that looks like this:

[Image: Enhanced monitoring graph of the Lambda’s memory usage]

Notice that the memory keeps on climbing up until it plateaus at 256MB. This is because there is 256MB of memory allocated to this lambda (the default is 128MB).
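As an aside, you can also confirm the configured limit from inside the function itself. A small sketch of a Node handler that logs it:

<code>
// sketch: log the Lambda's configured memory limit from inside the handler
exports.handler = async function (event, context) {
  // both values reflect the "Memory" setting in the Lambda configuration
  console.log("memoryLimitInMB:", context.memoryLimitInMB);
  console.log("AWS_LAMBDA_FUNCTION_MEMORY_SIZE:", process.env.AWS_LAMBDA_FUNCTION_MEMORY_SIZE);
  return "ok";
};
</code>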

Initially it seems that there must be some memory leak in the code causing the memory to be consumed before finally tipping over and causing an exception.

However, our logs and monitoring show no crashes, only an occasional timeout. If this was a memory leak causing a crash, we probably wouldn’t expect to see the plateaus, but rather an immediate jump down as the lambda resets.

Here’s what the usage looks like after I increase the lambda’s memory to 1024MB:

[Image: memory usage graph after increasing the Lambda’s memory to 1024MB]

No more plateaus, but still a steady increase in memory usage, before the sharp drops.

How do we investigate to find the root cause?

OK. It looks like I’ve mitigated the issue, but the memory usage is still climbing! The problem has only been held off, not fixed! How do I figure out what’s going on here?

With the help of the built-in utility process.memoryUsage()

<code>
/*
  process.memoryUsage() gives us:
  {
    rss:          Resident Set Size, the amount of space occupied in the main
                  memory device (a subset of the total allocated memory) for the
                  process, including all C++ and JavaScript objects and code.
    heapTotal:    V8's memory usage (total heap allocated)
    heapUsed:     V8's memory usage (heap actually in use)
    external:     The memory usage of C++ objects bound to JavaScript objects
                  managed by V8.
    arrayBuffers: The memory allocated for ArrayBuffers and SharedArrayBuffers,
                  including all Node.js Buffers. This is also included in the
                  external value. When Node.js is used as an embedded library,
                  this value may be 0 because allocations for ArrayBuffers may
                  not be tracked in that case.
  }
*/
var startMem = process.memoryUsage();
console.log("used: " + (startMem.heapUsed / 1048576) + "MB");
console.log("external: " + (startMem.external / 1048576) + "MB");
console.log("arrayBuffers: " + (startMem.arrayBuffers / 1048576) + "MB");
</code>

Additionally, we can make use of node-memwatch (on Windows machines, use @floffah/node-memwatch) to hook into garbage collection events and get insight into what the garbage collector is doing.

<code>
var memwatch = require('@floffah/node-memwatch');

// will trigger whenever garbage collection events happen
memwatch.on('stats', function (stat: any) {
  console.log("gc event");
  console.log(stat);
});

// will trigger when memwatch detects a leak
memwatch.on('leak', function (stat: any) {
  console.log("leakleak");
  console.log(stat);
});
</code>

I slapped this bit of code into a hello world Express app, and by hitting refresh a bunch of times I was able to see the following:

[Image: console output from the hello world Express app showing process.memoryUsage() values and memwatch GC stats]

I am only interested in heapUsed, external, and arrayBuffers, since the other values are too high level for this investigation. What we can see is that even in a hello world application the memory steadily increases until, all of a sudden, a garbage collection event is fired. The memwatch hook for leak detection is not firing, and along with everything else we’ve discussed, this is a decent indication that there are indeed no memory leaks in my barebones Express app.
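For reference, the hello world harness behind that output was roughly this shape (a sketch: the route, port, and log format are arbitrary, and it just reuses the snippets from above):

<code>
// sketch: hello world Express app that logs memory stats on every request
var express = require('express');
var memwatch = require('@floffah/node-memwatch');

var app = express();

memwatch.on('stats', function (stat) {
  console.log("gc event", stat);
});

memwatch.on('leak', function (stat) {
  console.log("leakleak", stat);
});

app.get('/', function (req, res) {
  var mem = process.memoryUsage();
  console.log("used: " + (mem.heapUsed / 1048576) + "MB" +
    ", external: " + (mem.external / 1048576) + "MB" +
    ", arrayBuffers: " + (mem.arrayBuffers / 1048576) + "MB");
  res.send('hello world');
});

app.listen(3000);
</code>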

So unless there’s a memory leak in the Express or NodeJS source code, all signs point to this issue not being caused by a memory leak.

So what exactly is going on then?

To understand this better, we can take a closer look at the V8 engine that node runs on top of. This blog post does a really good job explaining how V8 handles memory. Essentially, the V8 engine has 2 methods of running Garbage Collection.

  1. Mark-Sweep-Compact: relatively slow, but frees all non-referenced memory
  2. Scavenge: fast, but incomplete

Looking at our outputs, we can see that “gcScavengeCount” shows 32 runs while “gcMarkSweepCompactCount” shows 4. The memwatch ‘stats’ hook only fired when a Mark-Sweep-Compact was performed, but not when a Scavenge was.

When garbage collection runs, it interrupts the execution of the application. So by waiting to run the full Mark-Sweep-Compact until there is a larger chunk of memory to clean, and using the faster Scavenge method more frequently in between, the V8 engine prioritizes the application’s performance while still keeping memory usage at a reasonable level.
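You can actually see the two regions these collectors work over by dumping V8’s heap space statistics: Scavenge operates on the small “new space”, while Mark-Sweep-Compact cleans the much larger “old space”. A quick sketch:

<code>
// sketch: print each V8 heap space and how much of it is currently in use
var v8 = require('v8');

v8.getHeapSpaceStatistics().forEach(function (space) {
  console.log(space.space_name + ": " +
    (space.space_used_size / 1048576).toFixed(2) + "MB used of " +
    (space.space_size / 1048576).toFixed(2) + "MB");
});
</code>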

It turns out you can manually configure how large that chunk is allowed to grow by using the flag --max-old-space-size. This flag gets passed from the node invocation down to V8 and sets the tipping point after which “V8 will spend more time on garbage collection in an effort to free unused memory”.
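To see the effect of the flag locally, you can ask V8 what limit it ended up with; a minimal sketch (256 is just an example value, and the reported limit also covers V8’s other spaces, so expect a number slightly above the flag):

<code>
// sketch: run with `node --max-old-space-size=256 heap-limit.js`
var v8 = require('v8');
var limitMB = v8.getHeapStatistics().heap_size_limit / 1048576;
console.log("V8 heap size limit: " + limitMB.toFixed(0) + "MB");
</code>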

Now here’s the rub: the default value for this setting is 512MB.

This means that if the memory you have allocated for your lambda is less than 512MB, then very likely your app will run out of memory before the garbage collection kicks in.

AFAIK, in AWS Lambda there is no way to introduce this flag or even use the --expose-gc flag, which allows you to manually run garbage collection (though per the edit at the top, setting it through the NODE_OPTIONS environment variable now appears to work).
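For context, outside of Lambda the --expose-gc flag exposes a global gc() function you can call to force a full collection; a minimal sketch:

<code>
// sketch: run with `node --expose-gc force-gc.js`
if (typeof global.gc === 'function') {
  console.log("before:", process.memoryUsage().heapUsed);
  global.gc(); // force a full garbage collection
  console.log("after: ", process.memoryUsage().heapUsed);
} else {
  console.log("start node with --expose-gc to enable manual GC");
}
</code>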

Conclusion

To summarize, if you are running a NodeJS Lambda in AWS and you’re seeing the memory continue to rise to the point where there is no memory left, all you need to do is increase the memory to greater than 512MB. You can do this by navigating to Configuration > General configuration > Edit

[Image: the Lambda’s General configuration edit screen]
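For functions managed from code rather than the console, the same change is a single configuration update; a sketch with the AWS SDK for JavaScript v3 (the function name is a placeholder):

<code>
// sketch: raise the Lambda's memory allocation above the 512MB threshold
const { LambdaClient, UpdateFunctionConfigurationCommand } = require("@aws-sdk/client-lambda");

async function raiseMemory() {
  const client = new LambdaClient({});
  await client.send(new UpdateFunctionConfigurationCommand({
    FunctionName: "my-node-lambda", // placeholder function name
    MemorySize: 1024,               // in MB; anything comfortably above 512MB
  }));
}

raiseMemory().catch(console.error);
</code>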

Other Considerations

Increasing your Lambda’s memory could increase the cost incurred from running the Lambda. Check here to see the AWS Lambda pricing structure. Ironically though, bumping up the memory for our Lambda may save on costs, as our average latency and run duration were reduced by a third.