Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Readline: emit data events along with current line events #23032

Closed
caub opened this issue Sep 22, 2018 · 5 comments
Closed

Readline: emit data events along with current line events #23032

caub opened this issue Sep 22, 2018 · 5 comments
Labels
experimental Issues and PRs related to experimental features. feature request Issues that request new features to be added to Node.js. promises Issues and PRs related to ECMAScript promises. readline Issues and PRs related to the built-in readline module.

Comments

@caub
Copy link

caub commented Sep 22, 2018

It'd be great if readline could emit 'data' events for each line, so that for await could be used:

const readline = require('readline');
const stream = require('stream');

const input = new stream.Readable();
input.push(`{"some": "json","another":"json"}\n`);
input.push(`{"some": "json2","another":"json2"}\n`);
input.push(null);

// What I wish I would do:
(async () => {
  const rl = readline.createInterface({input});
  const rows = [];
  for await (const row of rl) rows.push(row);
  console.log(rows)
})();

// workaround:
const betterReadLine = ({input}) => {
  const output = new stream.PassThrough({objectMode: true});
  const rl = readline.createInterface({input});
  rl.on('line', line => {
    output.write(JSON.parse(line));
  });
  rl.on('close', () => {
    output.push(null);
  });
  return output;
};

(async () => {
  const rl = betterReadLine({input});
  const rows = [];
  for await (const row of rl) rows.push(row);
  console.log(rows)
})();
@devsnek
Copy link
Member

devsnek commented Sep 22, 2018

related: #18904

@devsnek devsnek added readline Issues and PRs related to the built-in readline module. feature request Issues that request new features to be added to Node.js. experimental Issues and PRs related to experimental features. labels Sep 22, 2018
@devsnek
Copy link
Member

devsnek commented Sep 22, 2018

Duplicate of #18603

@devsnek devsnek marked this as a duplicate of #18603 Sep 22, 2018
@devsnek devsnek closed this as completed Sep 22, 2018
@vsemozhetbyt vsemozhetbyt added the promises Issues and PRs related to ECMAScript promises. label Sep 22, 2018
@vsemozhetbyt
Copy link
Contributor

vsemozhetbyt commented Sep 22, 2018

@caub I was trying to make some workaround with an intermediate async generator that splits chunks and yields a Promise for each line, but such implementation was very slow, maybe due to many Promises involved. So I've made a shift with an intermediate async generator that splits chunks and yields a Promise for an Array with lines from the chunk, for now.

This implementation preserves line ending characters. If I need any granular line processing or transformation, I can use any Array functions (filter, map etc) on the consuming end.

read-lines-module.js:

'use strict';

const { createReadStream } = require('fs');

module.exports = async function* readLines(path, encoding = 'utf8') {
  const readable = createReadStream(path, encoding);
  let remainder = '';

  for await (const chunk of readable) {
    const lines = (remainder === '' ? chunk : `${remainder}${chunk}`)
                  .split(/(?<=\r?\n|\r(?!\n))/u);
    remainder = lines[lines.length - 1].endsWith('\n') ? '' : lines.pop();
    yield lines;
  }

  if (remainder !== '') yield [remainder];
};

test.js:

'use strict';

const { openSync, writeSync } = require('fs');
const readLines = require('./read-lines-module.js');

const output = openSync('big-file-copy.txt', 'w');

(async function main() {
  try {
    for await (const lines of readLines('big-file.txt')) {
      writeSync(output, lines.join(''));
    }
  } catch (err) {
    console.error(err);
  }
})();

@caub
Copy link
Author

caub commented Sep 22, 2018

interesting, it makes sense so (keeping the line returns also, since 'data' event should just chunk the content)

edit: I thought your snippet was what nodejs core would adopt, sry got confused, and hoping for https://github.com/nodejs/node/pull/18904/files to get merged soon

@vsemozhetbyt
Copy link
Contributor

@caub A new attempt to implement: #23916

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
experimental Issues and PRs related to experimental features. feature request Issues that request new features to be added to Node.js. promises Issues and PRs related to ECMAScript promises. readline Issues and PRs related to the built-in readline module.
Projects
None yet
Development

No branches or pull requests

3 participants