Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download file from Azure Blob Storage hangs when retrieving final 1-2 blocks #729

Closed
mfontes opened this issue Jan 19, 2024 · 7 comments
Closed

Comments

@mfontes
Copy link

mfontes commented Jan 19, 2024

Investigative information

Please provide the following:

  • Timestamp: 2024-01-18 22:59:00.641
  • Invocation ID: f3822fe-54f2-470a-9d82-64e057ae075e
  • Region: West US 2

Note: I am providing a particular instance where this failed, but this is 100% reproducible for me and my team members when running the FunctionApp locally.

Repro steps

  1. Create a BlockBlobClient using package @azure/storage-blob
  2. Call downloadToBuffer() on the blockBlobClient and store it to a variable

Expected behavior

  1. Expect variable to have file contents

Actual behavior

  1. File download hangs for an unacceptable length of time, over 3 minutes for a 1.3mb file
  2. After ~3 minute hang, file download completes and data is stored in a variable
  3. The time it hangs seems to scale up with the length of the file. 2mb takes over 5 minutes

Known workarounds

Does not hang on smaller file sizes. Tried with 16kb and 32kb without issue. No luck on file sizes greater than about 1.3mb

Related information

I realize this is smelling like a bug in blob storage, however the reason I am filing it here is that this behavior is only reproducible when downloading the blob from the function host. I have tried downloading the blob in a few ways:

  1. Using the NodeJS SDK for Azure blob storage, calling BlockBlobClient.download()
  2. Using the SDK, calling BlockBlobClient.downloadToBuffer() using various blockSizes
  3. Using a simple http.get request to download the file using a SAS URL
  4. Using http head request to get the file content length, followed by a series of http get byte range requests to retrieve the blob in chunks (again using a SAS URL)

In each case, the blob download completes in around 1 second from a simple NodeJS script. However they all seem to hang when downloading from a function host. In cases 1, 2, and 4 I was able to write progress to my console (using context.log) and could see consistently that I would actually retrieve the whole file except the last 1 or 2 chunks of data, at which point it would hang.

For a 1.3MB file, it would hang for 3.5 minutes before actually completing successfully. For a 2.1MB file, the hang was longer than my 5 minute function timeout, so I did not see if it would complete successfully since my function would timeout before it could.

  • Programming language used - NodeJS 18.15.0
  • Bindings used - Service Bus Queue trigger
  • Using @azure/functions: ^4.1.0 programming model

Sample output given the download attempted using the code from Source below:

progress! 262144 of 1311223
progress! 262647 of 1311223
progress! 524791 of 1311223
progress! 786935 of 1311223
progress! 1049079 of 1311223
Source Download file:
  const blobStorage = new BlobServiceClient(
    `https://${account}.blob.core.windows.net`,
    new StorageSharedKeyCredential(account, key)
  );

  const blobClient = blobStorage
    .getContainerClient(containerName)
    .getBlockBlobClient(fileName);

  const { contentLength } = await blobClient.getProperties();
  const fileBuffer = await blobClient.downloadToBuffer(0, undefined, {
    blockSize: Math.pow(2, 10) * 256, // I've tried several block sizes (1k, 16k, 128k, 256k, 1mb, 2mb, and 4mb), all produce the same issue
    onProgress: function ({ loadedBytes }) {
      context.log(`progress! ${loadedBytes} of ${contentLength}`);
    },
  });

// This point is reached after 3.5minutes for a 1.3mb file
  const asString = fileBuffer.toString();
  return JSON.parse(asString);

host.json:

{
  "aggregator": {
    "batchSize": 1000,
    "flushTimeout": "00:00:30"
  },
  "extensions": {
    "serviceBus": {
      "maxConcurrentCalls": 1
    }
  },
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[4.0.0, 5.0.0)"
  },
  "functionTimeout": "00:05:00",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": false
      }
    }
  },
  "version": "2.0"
}
@mfontes
Copy link
Author

mfontes commented Jan 19, 2024

I wanted to add this as a separate comment rather than detract from the original, but I did notice this very odd bit of behavior. If I throw an error immediately after the file download, it looks like the file download completes in a reasonable amount of time:

  const fileBuffer = await blobClient.downloadToBuffer(0, undefined, {
    blockSize: Math.pow(2, 10) * 256,
    onProgress: function ({ loadedBytes }) {
      context.log(`progress! ${loadedBytes} of ${contentLength}`);
    },
  });
  throw new Error('blah');

This logs to the console:

[2024-01-19T22:39:25.925Z] progress! 262144 of 1311223
[2024-01-19T22:39:25.936Z] progress! 524288 of 1311223
[2024-01-19T22:39:25.938Z] progress! 524791 of 1311223
[2024-01-19T22:39:25.944Z] progress! 786935 of 1311223
[2024-01-19T22:39:25.952Z] progress! 1049079 of 1311223
[2024-01-19T22:39:25.961Z] progress! 1311223 of 1311223
[2024-01-19T22:39:25.961Z] Error: blah

Note the timestamps --- there is very little delay between retrieving all of the blocks. The whole download completes in ~36ms. Of course, throwing an error does me no good as it fails my function execution. Swallowing the error in a try/catch just reproduces the original behavior

@ejizba
Copy link
Contributor

ejizba commented Jan 22, 2024

What version of @azure/storage-blob are you using? That being said, it's weird to me that it fails when you send an http.get request without even using the @azure/storage-blob package. That indicates the package theoretically isn't the problem

Could you share the files you're using to test? I'm curious if there's anything weird/unique about them. I used the command mkfile -n 2m 2m to make a 2mb file (not json though) and your code worked just fine for me locally.

@mfontes
Copy link
Author

mfontes commented Jan 22, 2024

Hi @ejizba,

We're using @azure/storage-blob ^12.17.0 which I believe is the latest version currently. But like you said, it doesn't appear to be related to that package considering a standard http get request hangs in the same way.

Here is a sample ~2mb file that I am having trouble downloading
Here is a sample 1.3mb file that I can download, but only after a 3.5 minute hang

@mfontes
Copy link
Author

mfontes commented Jan 22, 2024

Also confirming I tried mkfile -n 2m 2m to generate a file, and I am able to download the file successfully without any hanging.

@ejizba
Copy link
Contributor

ejizba commented Jan 22, 2024

The sample file worked fine for me 🤷‍♂️. Maybe it's something unique about your storage account? You could try a fresh new storage account if you think that could be contributing

@mfontes
Copy link
Author

mfontes commented Jan 22, 2024

@ejizba sorry for bothering you with this. I determined the root cause was in our function app code. After downloading from blob storage, we had a bit of poorly performing code that was taking a long time to operate on the downloaded blob.

It looked like the blob was never finished downloading because I couldn't log the progress to 100%, but it appears that has something to do with using context.log before a long and resource-intensive process is occurring? If I switched to console.log I could see that my code was still executing and stuck in a long-running loop. Fixed the long-running loop and the download appears to complete as it should! It also explains why throwing an error after download appeared to finish the download -- the function execution would complete before my long-running loop could start.

@mfontes mfontes closed this as completed Jan 22, 2024
@ejizba
Copy link
Contributor

ejizba commented Jan 23, 2024

No problem, glad you got it figured out!

There's some known issues around the timing/ordering of context.log. The host team made some changes recently to address this, but it'll be slow to roll out. The fix may or may not have helped in your case
Azure/azure-functions-nodejs-library#191
Azure/azure-functions-host#9238

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants