stream: add a non-destroying iterator to Readable #38526

Closed

Conversation

Member

@Linkgoron Linkgoron commented May 4, 2021

Add a non-destroying async iterator to Readable.

fixes: #38491

@nodejs/streams

A few things that I think might need attention:

  1. The API itself. Is a new method needed and is the naming OK, or should options be added to Symbol.asyncIterator?
  2. Should the new method receive different defaults than Symbol.asyncIterator?
  3. Possibly unrelated to this PR: should the createAsyncIterator method remove the listeners it adds once the iteration ends? Currently it does not, but that choice was made when the iterator essentially always destroyed the stream, which is no longer the case. (Removing them would be a bit problematic for errors, though, as an error is emitted on the next tick if one was thrown.) I'm not sure it's that bad, as the listeners are mostly no-ops anyway.
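
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, assuming a Node.js release that ships `readable.iterator()` and using only the `destroyOnReturn` option; `Readable.from()`, the sample chunks, and the `compareIterators` name are stand-ins chosen for the example, not code from the PR.

```javascript
const { Readable } = require('node:stream');

async function compareIterators() {
  // Default iterator: leaving the loop early destroys the stream.
  const a = Readable.from([1, 2, 3]);
  for await (const _chunk of a) {
    break;
  }

  // Opt-out iterator: leaving the loop early keeps the stream usable.
  const b = Readable.from([1, 2, 3]);
  for await (const _chunk of b.iterator({ destroyOnReturn: false })) {
    break;
  }

  const result = { defaultDestroyed: a.destroyed, optOutDestroyed: b.destroyed };
  b.destroy(); // With the opt-out, cleanup becomes the caller's job.
  return result;
}

const comparison = compareIterators();
comparison.then((r) => console.log(r));
```

With the opt-out, nothing destroys the stream on an early exit, so the caller has to call `destroy()` (or consume the stream fully) at some point.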

add a non-destroying iterator to Readable

fixes: nodejs#38491
@github-actions github-actions bot added the needs-ci PRs that need a full CI run. label May 4, 2021
@Linkgoron Linkgoron added the stream Issues and PRs related to the stream subsystem. label May 4, 2021
Member

@mcollina mcollina left a comment

Good work! I've left a few notes.

lib/internal/streams/readable.js (outdated, resolved)
lib/internal/streams/readable.js (outdated, resolved)
fully. The stream will be read in chunks of size equal to the `highWaterMark`
option. In the code example above, data will be in a single chunk if the file
has less than 64KB of data because no `highWaterMark` option is provided to
[`fs.createReadStream()`][].

##### `readable.iterator([options])`
Contributor

Just a thought: does this have to be in a new method rather than adding a parameter to the existing Symbol.asyncIterator one?

Member

I'm -0 on naming. I prefer a new method as the Symbol.asyncIterator one has a predefined signature by the standard.

Member

It doesn't feel right to me if the user has to call [Symbol.asyncIterator]() themselves.

Contributor

> the Symbol.asyncIterator one has a predefined signature by the standard.

Not really, all the standard says is that this method is called with no arguments (https://tc39.es/ecma262/#sec-getiterator). The standard gives a clear rule for the returned object (https://tc39.es/ecma262/#sec-asynciterable-interface), but not for the function signature. I personally don't feel strongly either way.
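
A small sketch of the point about call arguments: `for await...of` obtains the iterator by calling `[Symbol.asyncIterator]()` with no arguments, so an options parameter on that method would only be reachable by calling it by hand. The `iterable` object, the `calls` array, and `countArguments` are hypothetical names used for illustration only.

```javascript
// Records how many arguments each [Symbol.asyncIterator]() call receives.
const calls = [];

const iterable = {
  [Symbol.asyncIterator](...args) {
    calls.push(args.length);
    let i = 0;
    return {
      // Yields 0 and 1, then reports completion.
      async next() {
        return i < 2
          ? { value: i++, done: false }
          : { value: undefined, done: true };
      },
    };
  },
};

async function countArguments() {
  // The language calls the method itself here, passing nothing.
  for await (const _value of iterable) {
    // no-op
  }
  // Passing options requires an explicit call by the user.
  const manual = iterable[Symbol.asyncIterator]({ destroyOnReturn: false });
  await manual.next();
  return calls;
}

const argumentCounts = countArguments();
argumentCounts.then((counts) => console.log(counts)); // [ 0, 1 ]
```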

Member

I'd prefer the separate method. The key issue with extending standard-defined APIs is that it makes reasoning about the portability of code far more complicated. A separate method makes it clear. That said, the behavior of the two can be identical such that [Symbol.asyncIterator]() could just defer to readable.iterator() with default arguments.
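
One way to picture the "defer with default arguments" idea is the sketch below. It uses a hypothetical in-memory `NumberSource` class with a made-up `closeOnReturn` option rather than Node.js's `Readable`, so it only illustrates the shape of the design, not the actual implementation.

```javascript
// Hypothetical in-memory source; not Node.js's Readable implementation.
class NumberSource {
  constructor(limit) {
    this.limit = limit;
    this.position = 0;
    this.closed = false;
  }

  // Named method that accepts options.
  iterator({ closeOnReturn = true } = {}) {
    const source = this;
    return {
      [Symbol.asyncIterator]() {
        return this;
      },
      async next() {
        if (source.closed || source.position >= source.limit) {
          return { value: undefined, done: true };
        }
        return { value: source.position++, done: false };
      },
      // Invoked by for await...of on break, return, or throw.
      async return() {
        if (closeOnReturn) source.closed = true;
        return { value: undefined, done: true };
      },
    };
  }

  // Standard entry point: identical to iterator() with default options.
  [Symbol.asyncIterator]() {
    return this.iterator();
  }
}

async function demo() {
  const source = new NumberSource(4);
  const first = [];
  for await (const n of source.iterator({ closeOnReturn: false })) {
    first.push(n);
    if (n === 1) break;
  }
  const openAfterFirst = !source.closed;
  const second = [];
  for await (const n of source) {
    second.push(n);
    break; // default options: this closes the source
  }
  return { first, openAfterFirst, second, closed: source.closed };
}

const outcome = demo();
```

Code that relies only on the standard protocol keeps working unchanged, while the non-standard options stay confined to the explicitly named method.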

Member Author

> That said, the behavior of the two can be identical such that [Symbol.asyncIterator]() could just defer to readable.iterator() with default arguments.

Yeah, the reason one doesn't call the other is legacy streams and `this`. I'd need to use ReflectApply to bind `this`, and I preferred a regular method that takes `this` as its first parameter rather than using primordials.

Member

@mcollina mcollina left a comment

lgtm

doc/api/stream.md Outdated Show resolved Hide resolved
@benjamingr
Member

I'm a bit torn about this because I think there is a good chance we picked the wrong default for return on streams. I think people will often need non-destructive iterators, and for those cases having to write stream.iterator({ destroyOnReturn: false, destroyOnError: false }) isn't very ergonomic.

@mcollina
Member

mcollina commented May 6, 2021

I think the defaults are currently sound as the developer would need to do something to destroy the stream manually. The current defaults are safe.

@benjamingr
Member

So here are the two use cases I have:

for await(const chunk of fs.createReadStream('./foo')) { // this should not leak

}
for await(const chunk of someMethodReturningAStream()) { // this should not leak

}

On the other hand, I want the following to also work:

const stream = fs.createReadStream('./someFile');
for await(const chunk of stream) {
  if (isSpecial(chunk)) break;
  processFirst(chunk); // e.g. read http headers
}
for await(const chunk of stream) { // continues from where the last for await ended
  if (isOtherSpecial(chunk)) break;
  processSecond(chunk); // e.g. read http body
}
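
For reference, the second use case can be approximated with the method added in this PR. The sketch below assumes `readable.iterator({ destroyOnReturn: false })` is available; the `'---'` separator, the sample chunks, and the `parseMessage` name are hypothetical and stand in for `isSpecial`, `processFirst`, and `processSecond`.

```javascript
const { Readable } = require('node:stream');

async function parseMessage(stream) {
  const headers = [];
  const body = [];

  // Phase 1: read "headers" until the separator, without destroying the stream.
  for await (const chunk of stream.iterator({ destroyOnReturn: false })) {
    if (chunk === '---') break;
    headers.push(chunk);
  }

  // Phase 2: the default iterator continues where phase 1 stopped and
  // cleans the stream up once the loop finishes.
  for await (const chunk of stream) {
    body.push(chunk);
  }

  return { headers, body };
}

const parsed = parseMessage(
  Readable.from(['host: example.org', 'accept: */*', '---', 'line 1', 'line 2'])
);
parsed.then((message) => console.log(message));
```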

@Linkgoron
Member Author

Linkgoron commented May 6, 2021

So here are the two use cases I have:
On the other hand, I want the following to also work:

const stream = fs.createReadStream('./someFile');
for await(const chunk of stream) {
  if (isSpecial(chunk)) break;
  processFirst(chunk); // e.g. read http headers
}
for await(const chunk of stream) { // continues from where the last for await ended
  if (isOtherSpecial(chunk)) break;
  processSecond(chunk); // e.g. read http body
}

One option would be to add a setIterationMode method on Readable that sets the default type of iterator that Symbol.asyncIterator returns. Other options include adding another alias for non-destructive iterators (I'm not a fan of having lots of aliases for the same method) or having iterator use different defaults (which I don't think has much support here).

setIterationMode would look something like this:

const stream = fs.createReadStream('./someFile');
stream.setIterationMode({ destroyOnReturn: false, destroyOnError: false });
for await(const chunk of stream) {
  if (isSpecial(chunk)) break;
  processFirst(chunk); // e.g. read http headers
}
for await(const chunk of stream) { // continues from where the last for await ended
  if (isOtherSpecial(chunk)) break;
  processSecond(chunk); // e.g. read http body
}
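
To make the proposal concrete, here is one way `setIterationMode` could be emulated in user land. Note that `setIterationMode` is only an idea floated in this thread, not a Node.js API; `ModalReadable` and `fromArray` are hypothetical helpers, and the sketch assumes `readable.iterator()` exists and uses only its `destroyOnReturn` option.

```javascript
const { Readable } = require('node:stream');

// Hypothetical subclass: remembers iterator options for later for-await loops.
class ModalReadable extends Readable {
  #iterationOptions = undefined;

  setIterationMode(options) {
    this.#iterationOptions = options;
    return this;
  }

  // for await...of calls this with no arguments; the stored options apply.
  [Symbol.asyncIterator]() {
    return this.iterator(this.#iterationOptions);
  }
}

// Hypothetical helper: an object-mode stream backed by an array.
function fromArray(items) {
  const queue = [...items];
  return new ModalReadable({
    objectMode: true,
    read() {
      this.push(queue.length > 0 ? queue.shift() : null);
    },
  });
}

async function demoIterationMode() {
  const stream = fromArray(['a', 'b', 'c', 'd']);
  stream.setIterationMode({ destroyOnReturn: false });

  const first = [];
  for await (const chunk of stream) {
    first.push(chunk);
    if (chunk === 'b') break;
  }
  const openAfterBreak = !stream.destroyed;

  const rest = [];
  for await (const chunk of stream) {
    rest.push(chunk);
  }
  return { first, openAfterBreak, rest };
}

const modeResult = demoIterationMode();
```

The trade-off is that the iteration behaviour becomes hidden state on the stream, so a reader of a given `for await` loop cannot tell from the loop alone whether breaking out of it destroys the stream.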

@mcollina
Member

mcollina commented May 6, 2021

@benjamingr #38526 (comment) requirements are mutually exclusive.

@benjamingr
Member

> @benjamingr #38526 (comment) requirements are mutually exclusive.

That's sort of my point - it's the most concise way I could describe the problem. I'm not sure .iterator is the best way to deal with it. I'm wondering if this should be on the stream rather than the iterator.

It's possible that this is the best we can do - I just think it's a difficult problem and I want to make sure we're not exploring too few options here.

@mcollina
Member

mcollina commented May 7, 2021

How could it be on the stream?

@Linkgoron
Member Author

@benjamingr Do you have any outstanding objections to this getting merged?

@benjamingr
Member

Nope, just uncertainty 😅

@benjamingr
Member

I'd be more comfortable if this was experimental but I'm fine with this landing as stable.

@mcollina
Member

I'd be happy to land it doc-experimental if you prefer @benjamingr? No warnings and we do not backport.

@mcollina mcollina added the baking-for-lts PRs that need to wait before landing in a LTS release. label May 19, 2021
@benjamingr
Member

> I'd be happy to land it doc-experimental if you prefer @benjamingr?

doc-experimental is good to me. If others feel strongly that this is the right API I'd also happily concede.

@mcollina
Member

@Linkgoron can you add the experimental badge in there?

@Linkgoron
Member Author

Added the experimental tag in the method docs

@nodejs-github-bot
Collaborator

nodejs-github-bot commented May 24, 2021

CI: https://ci.nodejs.org/job/node-test-pull-request/38316/ 💚

@Linkgoron Linkgoron added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label May 25, 2021
@jasnell
Member

jasnell commented May 25, 2021

Landed in df85d37

@jasnell jasnell closed this May 25, 2021
jasnell pushed a commit that referenced this pull request May 25, 2021
add a non-destroying iterator to Readable

fixes: #38491

PR-URL: #38526
Fixes: #38491
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
danielleadams pushed a commit that referenced this pull request May 31, 2021
@danielleadams danielleadams mentioned this pull request May 31, 2021
@targos targos removed the baking-for-lts PRs that need to wait before landing in a LTS release. label Sep 16, 2022
Successfully merging this pull request may close these issues.

Add non-destroying AsyncIterator to Readable streams