Skip to content
This repository was archived by the owner on Feb 12, 2024. It is now read-only.

perf: debounce pin flush #2634

Closed
wants to merge 2 commits into from
Closed

perf: debounce pin flush #2634

wants to merge 2 commits into from

Conversation

achingbrain
Copy link
Member

@achingbrain achingbrain commented Nov 25, 2019

When importing a directory of files, after every block is added we pin it and flush pins to the repo. This is really slow and if we are adding lots of files, the pinset is out of date almost immediately so this PR debounces flushing pins until after we've stopped adding things.

A call to pinManager.flushPins will create a promise that first waits 100ms to see if any subsequent flushes have been requested in that time - if so that promise will end early and the later flush request will flush the pins.

When importing a directory of files, after every block is added we
pin it and flush pins to the repo.  This is really slow and if we
are adding lots of files, the pinset is out of date almost
immediately so this PR debounces flushing pins until after we've
stopped adding things.

A call to `pinManager.flushPins` will return a promise, and that
promise will resolve once the pins have been flushed, but it will
first wait 100ms and see if any subsequent flushes have been
requested in that time - if so that promise will end early and
return the promise created by the later flush request.
@achingbrain achingbrain marked this pull request as ready for review November 27, 2019 12:38
Copy link
Member

@alanshaw alanshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way we can trigger this on the leading edge as well as the trailing edge? Something like if there's not flush running or debouncing then do flush otherwise be debounced.

I'm just noticing that the minimum time this call will take will now be 100ms + time(flush()), so if we call it just once in a 100ms period then we take the debounce hit even though we didn't debounce any calls.

this._flushingPins = {
controller,
promise: delay(100)
.then(async () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this call has a .then().catch()? We should stick to await as much as possible.

promise: (async () => {
  await delay(100)
  if (controller.signal.aborted) return
  // ...
})()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal preference I guess? I've always found IIFEs a bit, well iffy to look at.


this.log(`Flushed pins with root: ${rootCid}`)
})
.catch(err => this.log(`Could not flush pins: ${err}`))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let us not smother the error!

Secondly, if the flush does fail, any debounced calls succeed...and they succeed before the flush has been performed so there's no way to know if an error happens.

Could we use something like debounce-promise? It would mean the debounced calls would have to wait for the debounce run so perhaps this wouldn't exhibit as much perf improvement but would still be good.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. If I am importing files A, B and C, and we block on flushing pins after importing A, then B will not start to import until A has finished and the pins have flushed, debounced or otherwise so there's no benefit to be gained.

This PR works by doing the flushing in another async context but as you point out, there's no way to know if an error occurred, because the only way to know is to block until the operation has completed, which means the next file in the list will not get imported until the pins have flushed.

@achingbrain
Copy link
Member Author

Thinking about this more, I'm not sure it's a great idea - the errors aren't returned to the calling context and I don't know what will happen if the repo is closed while the debounce timeout is running.

I'm going to have a go at storing pins in the datastore (see #2197), that way we won't have to flush them or do any of these complicated DAG manipulations.

@achingbrain
Copy link
Member Author

Going to close this as refactoring how pins are stored is a better way forwards.

@achingbrain achingbrain closed this Feb 4, 2020
@achingbrain achingbrain deleted the debounce-pin-flush branch February 4, 2020 12:00
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants