Skip to content

3.x: unbounded requests from first, take and others #6569

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
davidmoten opened this issue Jul 12, 2019 · 10 comments
Closed

3.x: unbounded requests from first, take and others #6569

davidmoten opened this issue Jul 12, 2019 · 10 comments
Milestone

Comments

@davidmoten
Copy link
Collaborator

In 2.x we talked about some surprising request patterns from operators like first, take and others where despite only one or a limited number of items being needed, Long.MAX_VALUE was requested of upstream and then cancelled after the desired number arrived (#5077). I believe this was a micro-optimization performance boost that improved the Scrabble benchmarks. Any change to the pattern was rejected based on it being a breaking API change.

In essence I'd like us not to be opinionated about the effect on the upstream of over-requesting (particularly over a network boundary). By over-requesting we are fundamentally losing information that can be useful to optimizing upstream processing. An example that springs to mind is that requesting a large number may be translated to an api call to a remote upstream that does a full sort (O(nlogn)) whereas requesting only one can be implemented upstream with a max scan (O(n)). This of course assumes one and only request to create the stream so is not a run-of-the-mill streaming case. I'd also suggest we are not opinionated about the ability of upstream to respond to cancellation (upstream may be performing cpu intensive actions in third-party libraries that aren't cancellable).

I think the effect on the benchmarks of reverting to naturally bounded requests where obvious (first, take, etc) will be very small.

Can we revisit this one for 3.x?

@akarnokd
Copy link
Member

Okay, let's make limit the default and change first/single.

What about the lambda subscribe?

@akarnokd akarnokd added this to the 3.0 milestone Jul 12, 2019
@davidmoten
Copy link
Collaborator Author

Thanks @akarnokd, so I'm suggesting we constrain upstream requests for:

  • elementAt
  • elementAtOrError
  • first (calls elementAt)
  • firstOrError (calls elementAt)
  • take
  • single(default)
  • singleOrError
  • blockingSingle(default) calls single(default)
  • blockingSingle (calls singleOrError)

limit can then be an alias for take.

What about the lambda subscribe?

You've lost me, what are you referring to?

@akarnokd
Copy link
Member

There is no reason to have limit separately then.

lambda subscribe requests max value.

@davidmoten
Copy link
Collaborator Author

lambda subscribe requests max value.

Yes, that seems reasonable to me. Why would it be otherwise?

@davidmoten
Copy link
Collaborator Author

There is no reason to have limit separately then.

ok by me

@akarnokd
Copy link
Member

You were worrying about overrequesting over the network boundary. A straight subscribe would do that too for your case. I'm fine with it as is.

@davidmoten
Copy link
Collaborator Author

You were worrying about overrequesting over the network boundary. A straight subscribe would do that too for your case. I'm fine with it as is.

I'm fine with it also.

@davidmoten
Copy link
Collaborator Author

Would you like a PR for these changes? You are welcome to do it yourself if you want to, but I can help if useful.

@akarnokd
Copy link
Member

I'm still waiting for #6589 but you can go ahead and create a PR.

@davidmoten
Copy link
Collaborator Author

I'm happy to wait for #6589 before starting the PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants