[NodeJS] RangeError [ERR_OUT_OF_RANGE] when reading a parquet file #141

Open
ntapsrigowri opened this issue Dec 23, 2022 · 4 comments

ntapsrigowri commented Dec 23, 2022

Reading a Parquet file that contains multiple records (rows) with ParquetReader fails with RangeError [ERR_OUT_OF_RANGE]: The value of "offset" is out of range. It must be >= 0 and <= 79. Received 604307758 (error stack below).
If the Parquet file contains only one record, it reads fine.

"parquetjs": "^0.11.2",
Node Version: v19.0.1
NPM version : 8.19.2
Parquet files are attached to this thread:

Archive.zip
Test Script:


import parquetjs from 'parquetjs';
const { ParquetReader } = parquetjs;

// Read every record from the file and print it.
async function readParquetFile() {
    const reader = await ParquetReader.openFile('doesntwork.parquet');
    try {
        const cursor = reader.getCursor();
        let record;
        // cursor.next() resolves to a falsy value once all rows are consumed.
        while ((record = await cursor.next())) {
            console.log(">>RECORD", record);
        }
    } finally {
        // Release the file handle even if decoding throws.
        await reader.close();
    }
}

readParquetFile();

Error stack:

node src/operations/test.js
/test/node_modules/brotli/build/encode.js:3
1<process.argv.length?process.argv[1].replace(/\\/g,"/"):"unknown-program");b.arguments=process.argv.slice(2);"undefined"!==typeof module&&(module.exports=b);process.on("uncaughtException",function(a){if(!(a instanceof y))throw a;});b.inspect=function(){return"[Emscripten Module object]"}}else if(x)b.print||(b.print=print),"undefined"!=typeof printErr&&(b.printErr=printErr),b.read="undefined"!=typeof read?read:function(){throw"no read() available (jsc?)";},b.readBinary=function(a){if("function"===
                                                                                                                                                                                                                              ^
RangeError [ERR_OUT_OF_RANGE]: The value of "offset" is out of range. It must be >= 0 and <= 79. Received 604307758
    at new NodeError (node:internal/errors:393:5)
    at boundsError (node:internal/buffer:86:9)
    at Buffer.readUInt32LE (node:internal/buffer:220:5)
    at decodeValues_BYTE_ARRAY (/test/node_modules/parquetjs/lib/codec/plain.js:168:29)
    at exports.decodeValues (/test/node_modules/parquetjs/lib/codec/plain.js:266:14)
    at decodeValues (/test/node_modules/parquetjs/lib/reader.js:294:34)
    at decodeDataPage (/test/node_modules/parquetjs/lib/reader.js:389:16)
    at decodeDataPages (/test/node_modules/parquetjs/lib/reader.js:322:20)
    at ParquetEnvelopeReader.readColumnChunk (/test/node_modules/parquetjs/lib/reader.js:255:12)
    at async ParquetEnvelopeReader.readRowGroup (/test/node_modules/parquetjs/lib/reader.js:231:35) {
  code: 'ERR_OUT_OF_RANGE'
}
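
The trace ends in Buffer.readUInt32LE, called from decodeValues_BYTE_ARRAY in parquetjs/lib/codec/plain.js. In the Parquet PLAIN encoding, each BYTE_ARRAY value is stored as a 4-byte little-endian length followed by that many bytes, and Node reports the upper bound for readUInt32LE as the buffer length minus 4, so "<= 79" means the buffer being decoded was 83 bytes long. The sketch below is not the parquetjs source; decodePlainByteArrays and encodePlainByteArrays are hypothetical names used only for illustration. It shows how a decoder of this shape throws the same error code when the bytes are not actually in the length-prefixed layout: the first "length" is garbage, the offset jumps far past the end of the buffer, and the next length read fails. Whether that is the actual cause for the attached files is an assumption, not something confirmed in this thread.

```javascript
// Illustrative sketch only: NOT the parquetjs implementation.
// PLAIN-encoded BYTE_ARRAY values are laid out as
// [4-byte little-endian length][bytes], repeated once per value.
function decodePlainByteArrays(buf, count) {
  const values = [];
  let offset = 0;
  for (let i = 0; i < count; i++) {
    // Throws ERR_OUT_OF_RANGE when offset > buf.length - 4.
    const len = buf.readUInt32LE(offset);
    offset += 4;
    // subarray clamps to the buffer end, so a bogus length does not throw here.
    values.push(buf.subarray(offset, offset + len));
    // A garbage length moves the offset far beyond the buffer.
    offset += len;
  }
  return values;
}

// Hypothetical helper that builds a well-formed PLAIN buffer for the demo.
function encodePlainByteArrays(strings) {
  const parts = [];
  for (const s of strings) {
    const body = Buffer.from(s, 'utf8');
    const len = Buffer.alloc(4);
    len.writeUInt32LE(body.length, 0);
    parts.push(len, body);
  }
  return Buffer.concat(parts);
}

// Well-formed input decodes as expected.
const good = encodePlainByteArrays(['a', 'bc']);
const decoded = decodePlainByteArrays(good, 2).map((b) => b.toString('utf8'));
console.log(decoded);

// Bytes that are not length-prefixed values: the first four bytes are
// read as a huge length, so the second read starts out of range.
const bad = Buffer.from([0x2e, 0x00, 0x05, 0x24, 0x01, 0x02, 0x03, 0x04]);
let errorCode = null;
try {
  decodePlainByteArrays(bad, 2);
} catch (err) {
  errorCode = err.code;
}
console.log(errorCode);
```

In this sketch a count of 1 never reaches the second length read, so the malformed buffer would not throw. That is at least consistent with the report that a single-record file reads without error, though it does not prove the same mechanism is at work in parquetjs.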
@tanishqsaini1306

Is there any solution for the above issue? I encountered the same problem.

@chris-aeviator

This happens to me whenever I try to read a file that was saved with pandas.


tanishqsaini1306 commented Jan 3, 2024

Did you find any workaround for this?


chris-aeviator commented Jan 3, 2024 via email
