As a favour to a good friend who needed a really simple thing done, I decided to write this Lambda function. Basically: load a file, search through it for a particular string and return a flag indicating whether it was found. Simple, right?
Ok. That wasn't really what I had been asked to do, but for the purposes of illustration and this particular treatise it is the core of the matter and what tripped me up. Anyway, to the point at hand. I figured: store the file in S3, read it into Lambda and do a search. Front it with API Gateway and you've got a super-duper cheap (and quick) way of searching this text. Cheap, easy, fast. Just what the doctor ordered.
Only one small obstacle - it's a big file. 800 megabytes or so. But that's not a big deal, right? Lambda has 1.5 GB of memory available (at the time of writing - September 2017), so no problem! Just retrieve the file from S3 and we're done. As long as that doesn't take too long, we're good.
Let's write a bit of code - giving the Lambda function the full 1.5 GB of memory, setting the timeout to (say) 20 seconds and assigning a suitable role.
```python
import boto3

def LoadObject():
    S3 = boto3.client("s3")
    try:
        Response = S3.get_object(Bucket="randombucket", Key="randomkey")
        print("Object size: " + Response["ResponseMetadata"]["HTTPHeaders"]["content-length"])
        FileContents = Response["Body"].read()
    except Exception as e:
        print("Error getting object: " + str(e))

def lambda_handler(event, context):
    LoadObject()
```
The error message from Lambda is interesting:
```
Execution result: failed
START RequestId: 61fd3a48-a8b2-11e7-af8a-f741ba48d44e Version: $LATEST
Object size: 839775828
END RequestId: 61fd3a48-a8b2-11e7-af8a-f741ba48d44e
REPORT RequestId: 61fd3a48-a8b2-11e7-af8a-f741ba48d44e Duration: 17005.67 ms Billed Duration: 17100 ms Memory Size: 1536 MB Max Memory Used: 1536 MB
RequestId: 61fd3a48-a8b2-11e7-af8a-f741ba48d44e Process exited before completing request
```
Say what? We ran out of memory? But we're only loading just over 800 MB! How did we use 1.5 GB of memory and still not load the file?
With a little bit of investigating, it seems that Python (more likely boto3, or perhaps the underlying HTTP/HTTPS retrieval library) allocates memory for some kind of buffer before handing the object over to be assigned into a string. What happens if we try to read just part of the file? Let's work our way down a bit - starting at 800 MB.
```python
import boto3

def LoadObject():
    S3 = boto3.client("s3")
    try:
        Response = S3.get_object(Bucket="randombucket", Key="randomkey")
        print("Object size: " + Response["ResponseMetadata"]["HTTPHeaders"]["content-length"])
        FileContents = Response["Body"].read(amt=800000000)
    except Exception as e:
        print("Error getting object: " + str(e))

def lambda_handler(event, context):
    LoadObject()
```
```
Execution result: failed
START RequestId: c66fefcc-a8b9-11e7-8a4a-33fdf56d58d4 Version: $LATEST
Object size: 839775828
END RequestId: c66fefcc-a8b9-11e7-8a4a-33fdf56d58d4
REPORT RequestId: c66fefcc-a8b9-11e7-8a4a-33fdf56d58d4 Duration: 14292.27 ms Billed Duration: 14300 ms Memory Size: 1536 MB Max Memory Used: 1536 MB
RequestId: c66fefcc-a8b9-11e7-8a4a-33fdf56d58d4 Process exited before completing request
```
Nope. Turns out the sweet spot is somewhere a bit over 700 MB (which makes sense: 700 * 2 = 1400, which is just under the 1536 MB we have), but let's play it safe and go with 500 MB. Now, can we do two reads of 500 MB and load the whole file?
```python
import boto3

def LoadObject():
    FileContents = ""
    S3 = boto3.client("s3")
    try:
        Response = S3.get_object(Bucket="randombucket", Key="randomkey")
        print("Object size: " + Response["ResponseMetadata"]["HTTPHeaders"]["content-length"])
        for i in range(0, 2):
            FileContents += Response["Body"].read(amt=500000000)
    except Exception as e:
        print("Error getting object: " + str(e))

def lambda_handler(event, context):
    LoadObject()
```
```
Execution result: succeeded
START RequestId: c68b22f7-a8ba-11e7-91b9-d1069e754502 Version: $LATEST
Object size: 839775828
END RequestId: c68b22f7-a8ba-11e7-91b9-d1069e754502
REPORT RequestId: c68b22f7-a8ba-11e7-91b9-d1069e754502 Duration: 12269.91 ms Billed Duration: 12300 ms Memory Size: 1536 MB Max Memory Used: 1163 MB
```
Yes! Ok. Loaded; now we can go to work and search for the string we're interested in. Note that if we reduce the read buffer to (say) 250 MB and do four reads, memory consumption drops from 1.1 GB to roughly 900 MB, so that might be a factor later (it's crazy difficult to observe momentary memory usage in Python, so I can't really tell how much is in use at any particular second). If the file size increases it may become a problem, but for now let's go with it. And around twelve seconds to pull the file from S3 isn't a terrible hit for a first search in this use case, so we're cooking with gas now!
First, though, we need to put that string into a global variable, and we should make sure that we only load it on cold function starts.
```python
import boto3

GlobalFileContents = ""

def LoadObject():
    global GlobalFileContents
    if len(GlobalFileContents) == 0:
        S3 = boto3.client("s3")
        try:
            Response = S3.get_object(Bucket="randombucket", Key="randomkey")
            for i in range(0, 2):
                GlobalFileContents += Response["Body"].read(amt=500000000)
        except Exception as e:
            print("Error getting object: " + str(e))

def lambda_handler(event, context):
    LoadObject()
```
```
Execution result: failed
START RequestId: 685a7a14-a8bb-11e7-9a18-0b874190d810 Version: $LATEST
END RequestId: 685a7a14-a8bb-11e7-9a18-0b874190d810
REPORT RequestId: 685a7a14-a8bb-11e7-9a18-0b874190d810 Duration: 13491.09 ms Billed Duration: 13500 ms Memory Size: 1536 MB Max Memory Used: 1537 MB
RequestId: 685a7a14-a8bb-11e7-9a18-0b874190d810 Process exited before completing request
```
Huh? Didn't we just solve this problem? What if we reduce the size of each read from S3 and increase the number of loops? Turns out - no, that doesn't work either. It appears that Python might be allocating a local buffer within the function to hold the file (or using some similar mechanism) before it assigns the global variable's space - and we can't fit 2 x 800 MB into the Lambda function's memory. Fine, Python. Have it your way.
```python
import boto3

GlobalFileContents = ""

def LoadObject():
    global GlobalFileContents
    if len(GlobalFileContents) == 0:
        LocalContents = ""
        S3 = boto3.client("s3")
        try:
            Response = S3.get_object(Bucket="randombucket", Key="randomkey")
            for i in range(0, 2):
                LocalContents += Response["Body"].read(amt=500000000)
            GlobalFileContents = LocalContents
        except Exception as e:
            print("Error getting object: " + str(e))

def lambda_handler(event, context):
    LoadObject()
```
```
Execution result: succeeded
START RequestId: e4fb623c-a8bb-11e7-bce8-e7c17002a174 Version: $LATEST
END RequestId: e4fb623c-a8bb-11e7-bce8-e7c17002a174
REPORT RequestId: e4fb623c-a8bb-11e7-bce8-e7c17002a174 Duration: 12320.69 ms Billed Duration: 12400 ms Memory Size: 1536 MB Max Memory Used: 1163 MB
```
Excellent. Luckily, assignment in Python just binds a name to an existing object - no copy is made - so we only get one copy of the buffer now. The space is allocated inside the function, the global name is then bound to the same object, and (lucky for me) the buffer isn't destroyed when the function exits because the global reference keeps it alive. Ok. NOW we can search.
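If the no-copy behaviour seems surprising, a tiny standalone snippet (nothing to do with boto3) demonstrates that assignment never duplicates the underlying object:

```python
big = "x" * (10 * 1024 * 1024)  # one 10 MB string
alias = big                     # binds a second NAME to the SAME object

# Both names refer to one object; no second 10 MB buffer exists.
print(alias is big)             # True
```

The same rule is what lets `GlobalFileContents = LocalContents` hand the buffer to the global name for free.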
```python
import boto3

GlobalFileContents = ""

def LoadObject():
    global GlobalFileContents
    if len(GlobalFileContents) == 0:
        LocalContents = ""
        S3 = boto3.client("s3")
        try:
            Response = S3.get_object(Bucket="randombucket", Key="randomkey")
            for i in range(0, 2):
                LocalContents += Response["Body"].read(amt=500000000)
            GlobalFileContents = LocalContents
        except Exception as e:
            print("Error getting object: " + str(e))

def lambda_handler(event, context):
    LoadObject()
    SearchText = "FindMe"
    Found = GlobalFileContents.find(SearchText)
    return Found
```
```
Execution result: succeeded
START RequestId: 1eeebdde-a8bc-11e7-b523-530dce425102 Version: $LATEST
END RequestId: 1eeebdde-a8bc-11e7-b523-530dce425102
REPORT RequestId: 1eeebdde-a8bc-11e7-b523-530dce425102 Duration: 13072.65 ms Billed Duration: 13100 ms Memory Size: 1536 MB Max Memory Used: 1163 MB
```
Sweet! And if we run it again, we can see that we're not loading the file a second time - we're getting the benefit of warm function starts.
```
Execution result: succeeded
START RequestId: 56e8ba66-a8bc-11e7-a60e-a19bf1c32d99 Version: $LATEST
END RequestId: 56e8ba66-a8bc-11e7-a60e-a19bf1c32d99
REPORT RequestId: 56e8ba66-a8bc-11e7-a60e-a19bf1c32d99 Duration: 592.48 ms Billed Duration: 600 ms Memory Size: 1536 MB Max Memory Used: 1163 MB
```
600 milliseconds to search 800-odd MB of text isn't bad, right? Cool. Now let's see what happens if we call the function from API Gateway. Luckily we can simulate this right from the console by crafting a test event.
```json
{
  "queryStringParameters": {
    "SearchString": "FindMe"
  }
}
```
```python
import boto3

GlobalFileContents = ""

def LoadObject():
    global GlobalFileContents
    if len(GlobalFileContents) == 0:
        LocalContents = ""
        S3 = boto3.client("s3")
        try:
            Response = S3.get_object(Bucket="randombucket", Key="randomkey")
            for i in range(0, 2):
                LocalContents += Response["Body"].read(amt=500000000)
            GlobalFileContents = LocalContents
        except Exception as e:
            print("Error getting object: " + str(e))

def lambda_handler(event, context):
    LoadObject()
    SearchText = event["queryStringParameters"]["SearchString"]
    Found = GlobalFileContents.find(SearchText)
    return Found
```
```
Execution result: failed
START RequestId: 1b909e6a-a8e1-11e7-91b0-852b465f0ad0 Version: $LATEST
: MemoryError
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 26, in lambda_handler
    Found = GlobalFileContents.find(SearchText)
MemoryError
END RequestId: 1b909e6a-a8e1-11e7-91b0-852b465f0ad0
REPORT RequestId: 1b909e6a-a8e1-11e7-91b0-852b465f0ad0 Duration: 13520.09 ms Billed Duration: 13600 ms Memory Size: 1536 MB Max Memory Used: 1163 MB
```
Here we go again. Why are we getting a memory error? According to the Python wisdom online, it's because we have run out of memory. Well, duh. But Lambda says we're only using our regular 1.1 GB or so. Hmmm. What if Python is trying to allocate more memory than is available, and the allocation itself is failing? That would produce this error message. But why would it be asking for more memory? Searching for a hard-coded string works - why would searching for a string that comes from the event passed to Lambda be any different?
```python
import boto3

GlobalFileContents = ""

def LoadObject():
    global GlobalFileContents
    if len(GlobalFileContents) == 0:
        LocalContents = ""
        S3 = boto3.client("s3")
        try:
            Response = S3.get_object(Bucket="randombucket", Key="randomkey")
            for i in range(0, 2):
                LocalContents += Response["Body"].read(amt=500000000)
            GlobalFileContents = LocalContents
        except Exception as e:
            print("Error getting object: " + str(e))

def lambda_handler(event, context):
    LoadObject()
    OriginalSearchText = "FindMe"
    SearchText = event["queryStringParameters"]["SearchString"]
    print(OriginalSearchText)
    print(SearchText)
    Found = GlobalFileContents.find(SearchText)
    return Found
```
```
Execution result: failed
START RequestId: a684d388-a8e1-11e7-b2cd-4543dc3a3fa5 Version: $LATEST
FindMe
FindMe
: MemoryError
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 30, in lambda_handler
    Found = GlobalFileContents.find(SearchText)
MemoryError
END RequestId: a684d388-a8e1-11e7-b2cd-4543dc3a3fa5
REPORT RequestId: a684d388-a8e1-11e7-b2cd-4543dc3a3fa5 Duration: 10855.90 ms Billed Duration: 10900 ms Memory Size: 1536 MB Max Memory Used: 1163 MB
```
No clues there - the two strings look identical to me (and to the string comparison operators too). So what is "find" doing that is so unusual? Much digging happens at this point, with many hours passing while following curious trails down rabbit holes to no avail, until eventually, by accident, a random posting somewhere offers something of interest. So I try this:
```python
import boto3

GlobalFileContents = ""

def LoadObject():
    global GlobalFileContents
    if len(GlobalFileContents) == 0:
        LocalContents = ""
        S3 = boto3.client("s3")
        try:
            Response = S3.get_object(Bucket="randombucket", Key="randomkey")
            for i in range(0, 2):
                LocalContents += Response["Body"].read(amt=500000000)
            GlobalFileContents = LocalContents
        except Exception as e:
            print("Error getting object: " + str(e))

def lambda_handler(event, context):
    LoadObject()
    SearchText = event["queryStringParameters"]["SearchString"]
    print(type(SearchText))
    Found = GlobalFileContents.find(SearchText)
    return Found
```
```
Execution result: failed
START RequestId: edc9e9c4-a8e1-11e7-a949-8d49f899d7ad Version: $LATEST
<type 'unicode'>
: MemoryError
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 28, in lambda_handler
    Found = GlobalFileContents.find(SearchText)
MemoryError
END RequestId: edc9e9c4-a8e1-11e7-a949-8d49f899d7ad
REPORT RequestId: edc9e9c4-a8e1-11e7-a949-8d49f899d7ad Duration: 12401.82 ms Billed Duration: 12500 ms Memory Size: 1536 MB Max Memory Used: 1163 MB
```
So the event JSON payload is unicode? What does "find" do when it's asked to search a byte string for a unicode string? It converts the whole byte string to unicode first - and in the process attempts to consume far more memory than we have available. Aha! I don't really care about losing any special unicode characters (there aren't any in the target string), so:
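For what it's worth, this was Python 2, where the decode happens silently. Python 3 turns the same mismatch into an outright error - searching `bytes` with a `str` needle raises `TypeError` - which forces you to encode the needle up front. A small standalone sketch:

```python
haystack = b"a large byte string containing FindMe somewhere"
needle = "FindMe"  # str (unicode), as it would arrive in the event payload

try:
    haystack.find(needle)  # Python 3: bytes.find() rejects a str needle
except TypeError as e:
    print("mismatch:", e)

# Encoding the needle keeps the whole search in bytes - no giant decode.
position = haystack.find(needle.encode("utf-8"))
print(position >= 0)  # True
```

Same fix, enforced by the type system instead of discovered via MemoryError.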
```python
import boto3

GlobalFileContents = ""

def LoadObject():
    global GlobalFileContents
    if len(GlobalFileContents) == 0:
        LocalContents = ""
        S3 = boto3.client("s3")
        try:
            Response = S3.get_object(Bucket="randombucket", Key="randomkey")
            for i in range(0, 2):
                LocalContents += Response["Body"].read(amt=500000000)
            GlobalFileContents = LocalContents
        except Exception as e:
            print("Error getting object: " + str(e))

def lambda_handler(event, context):
    LoadObject()
    SearchText = event["queryStringParameters"]["SearchString"].encode("utf-8")
    Found = GlobalFileContents.find(SearchText)
    return Found
```
```
Execution result: succeeded
START RequestId: 6b3c2b34-a8e2-11e7-9528-9721462c0542 Version: $LATEST
END RequestId: 6b3c2b34-a8e2-11e7-9528-9721462c0542
REPORT RequestId: 6b3c2b34-a8e2-11e7-9528-9721462c0542 Duration: 12167.18 ms Billed Duration: 12200 ms Memory Size: 1536 MB Max Memory Used: 1163 MB
```
Phew. Finally. There endeth the lesson.