
download of compressed files on arm-platforms #110

Closed
simonhbw opened this issue Apr 27, 2021 · 8 comments

Comments

@simonhbw

Hi, I have a problem using tools like @azure/storage-blob or node-fetch on armv7 devices with any Node version that uses llhttp instead of the legacy parser. When downloading a compressed file, the response is always decompressed, which causes a data corruption failure. I can reproduce the issue with either @azure/storage-blob or node-fetch directly.

The interesting thing is that the issue only occurs on arm(v7) platforms. Running the same code on an x64/x86 platform doesn't cause any issues,
and changing the Node version to 11.x or 12.x with "http-parser=legacy" also doesn't cause any issues.

code sample:

const { BlockBlobClient } = require('@azure/storage-blob');
const path = require('path');
const fs = require('fs');
const cliProgress = require('cli-progress');

const repositoryPath = '../repository';
const uri = 'placeUrlWithSasHere';

async function startDownload(sasUri) {
    try {
        console.log('starting download from url: %s', sasUri);
        const blobClient = new BlockBlobClient(sasUri);

        // fetch blob metadata first so the progress bar knows the total size
        const props = await blobClient.getProperties();

        // _name is an internal property of the client; used here only to derive the local file name
        const fileName = blobClient._name;
        const filePath = path.join(repositoryPath, fileName);
        const bar = new cliProgress.SingleBar({}, cliProgress.Presets.shades_classic);

        // download the whole blob and report progress as bytes arrive
        const downloadBlockBlobResponse = await blobClient.download(null, null, {
            onProgress: (ev) => {
                bar.update(ev.loadedBytes);
            },
        });

        // pipe the response body into a local file
        const readStream = downloadBlockBlobResponse.readableStreamBody;
        const writeStream = fs.createWriteStream(filePath);

        bar.start(props.contentLength, 0);
        readStream.pipe(writeStream);

        readStream.on('error', (e) => {
            bar.stop();
            console.error(e.message);
        });
        writeStream.on('error', (e) => {
            bar.stop();
            console.error(e.message);
        });

        readStream.on('end', () => {
            bar.stop();
            console.log('readstream finished');
        });
    } catch (error) {
        console.error(error);
    }
}

startDownload(uri);

result on armv7:

Data corruption failure: received less data than required and reached maxRetires limitation. Received data offset: 1473222655, data needed offset: 10063157247, retries: 5, max retries: 5

result on x64:

readstream finished

@Chaphasilor

I'm also having this issue. Check out the original issue I created (nodejs/node#37053), which includes a basic setup for reproducing it :)

@indutny
Member

indutny commented May 12, 2021

Thanks to the undici issue (nodejs/undici#803), I finally know what's going on here. There is a truncating conversion of `uint64_t` to `size_t` in `consume` in llparse. I will work on the fix soon.
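
For a sense of what that truncation does to the numbers in the error message above (on armv7, `size_t` is 32-bit): keeping only the low 32 bits of the 64-bit "needed" offset gives exactly the "received" offset. A quick check in plain Node:

// illustration only: reduce the 64-bit "data needed offset" from the error message
// to its low 32 bits (what a 32-bit size_t can hold) and compare it with the
// "received data offset" reported in the same error
const needed = 10063157247n;   // "data needed offset"
const received = 1473222655n;  // "received data offset"
console.log((needed & 0xFFFFFFFFn) === received); // true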

@indutny
Member

indutny commented May 12, 2021

Fix: nodejs/llparse#44

indutny added a commit to nodejs/llparse that referenced this issue May 12, 2021
On 32bit platforms `size_t` is essentially `uint32_t` (or at times
even meager `uint16_t`). Loading `uint64_t` field value into `size_t` on
these platforms would truncate the high bits and leave only the low 32
(16) bits in place. This leads to various interesting errors in
downstream modules. See:

- nodejs/llhttp#110
- nodejs/undici#803

This patch makes all field loads go into their respective types.
Truncation doesn't happen in this case because C coercion rules will
cast both types to the largest necessary datatype to hold either of
them.

PR-URL: #44
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Daniele Belardi <dwon.dnl@gmail.com>
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
@indutny
Member

indutny commented May 13, 2021

PR for Node.js: nodejs/node#38665

indutny closed this as completed May 13, 2021
@simonhbw
Author

simonhbw commented Jun 7, 2021

For my specific problem, the issue is not fixed.
I'm getting the following results when running the code from the first post, using a large file (~10 GB):

using node@12.22.1

readstream finished

using node@14.x

Data corruption failure: received less data than required and reached maxRetires limitation. Received data offset: 1473222655, data needed offset: 10063157247, retries: 5, max retries: 5

using node@16.3.0

the download progress stops at a random point between 5 GB and 8 GB without any log message.
Note that the "Data corruption failure" does not occur anymore.

@Chaphasilor

using node@16.3.0

the download progress stops at a random point between 5 GB and 8 GB without any log message.
Note that the "Data corruption failure" does not occur anymore.

Can you provide some demo code or the link it happens with? I would love to try it out, given that it worked for me with a 12+ GB file on Node 16.3.0...
Also, Node 16 never gave an error message; it just closed the connection :)

@simonhbw
Author

simonhbw commented Jun 7, 2021

using node@16.3.0

the download progress stops at a random point between 5 GB and 8 GB without any log message.
Note that the "Data corruption failure" does not occur anymore.

Can you provide some demo code or the link it happens with? I would love to try it out, given that it worked for me with a 12+ GB file on Node 16.3.0...
Also, Node 16 never gave an error message; it just closed the connection :)

You can use the code from my first post. Because the file is non-public, I could send you the URL by email. Is that OK for you?

@Chaphasilor

You can use the code from my first post. Because the file is non-public, I could send you the URL by email. Is that OK for you?

Maybe you could try reducing your code further. Is there a way to use that Azure library to get a (temporary) download URL for the file?

If so, you could try downloading the file from that URL using the built-in http(s) module.
This would make sure that the issue isn't in the Azure package :)

That is, unless you already tried that ^^
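
A minimal sketch of that approach (assuming a SAS URL is at hand; the URL below is the same placeholder as in the first post), streaming the blob to disk with Node's built-in https module so that @azure/storage-blob is out of the picture:

// sketch only: download the blob with a plain https request, bypassing @azure/storage-blob,
// to check whether the stalled download still occurs with the built-in client
const https = require('https');
const fs = require('fs');

const sasUrl = 'placeUrlWithSasHere'; // placeholder, same as in the first post
const out = fs.createWriteStream('./download.bin');

https.get(sasUrl, (res) => {
    console.log('status: %d, content-length: %s', res.statusCode, res.headers['content-length']);
    res.pipe(out);
    res.on('end', () => console.log('response stream ended'));
    res.on('error', (e) => console.error('response error: %s', e.message));
}).on('error', (e) => console.error('request error: %s', e.message));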
