-
Notifications
You must be signed in to change notification settings - Fork 20
Builder refactoring, trickle builder and balanced builder #118
Conversation
In the second argument of the importer constructor you can specify the following options: | ||
|
||
* `chunker` (string, defaults to `"fixed"`): the chunking strategy. Now only supports `"fixed"` | ||
* `chunkSize` (positive integer, defaults to `262144`): the maximum chunk size for the `fixed` chunker. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to have a specific chunkerOptions
object, which has in the case for the fixed chunker an option of size
, as for other chunkers this might not make sense as an option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dignifiedquire Implemented your suggestion.
|
||
const defaultOptions = { | ||
chunkSize: 262144 | ||
// chunkSize: 26214 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's up with this lonely commented out option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was used in an experiment and had forgotten to remove it. Removed now.
@pgte good stuff, one thing I'm a bit worried about, is the usage of pushable in a quite a lot of places. I haven't gone through all of them but I think some of them could be removed in favor of more direct transforms. The same goes for the use of pullWritable, as that forces you to use something like pushable if you want to use it in a transform stream. I haven't found simple solutions for where you used these, so this might be the best way for these use cases, but maybe not. @dominictarr maybe you have some good ideas about how the use of pushable could be avoided in these situations. |
instead of using pull-write to write to pull-pushable, |
@dominictarr thanks, good point about the loss of back-pressure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solid! I've little or no comments codewise, big thumbs up for the refactor 👍
Couple of requests:
- We need a couple of exporter tests for the new added importer strategies
- We need a test to make sure that if several roots are passed to one importer instance, it errors out (because it only knows how to return one root). This is a test that was missing.
* `trickle`: builds [a trickle tree](https://github.com/ipfs/specs/pull/57#issuecomment-265205384) | ||
* `maxChildrenPerNode` (positive integer, defaults to `172`): the maximum children per node for the `balanced` and `trickle` DAG builder strategies | ||
* `layerRepeat` (positive integer, defaults to 4): (only applicable to the `trickle` DAG builder strategy). The maximum repetition of parent nodes for each layer of the tree. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dignifiedquire wanna push a commit with jsdoc to this PR so that unixfs-engine gets fresh docs with this refactor? Or perhaps just to a PR to the PR, that would be also good :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can merge this as is and we can add the documentation in a later PR, other wise if @pgte wants to take a stab he can start adding this info as jsdoc as well. It doesn't have to be me to do that.
|
||
const result = pushable() | ||
|
||
reduceToParents(source, function (err, roots) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use arrow functions for unnamed functions, to keep consistent with the rest of our code base
} | ||
} | ||
|
||
module.exports = function (Chunker, ipldResolver, Reducer, _options) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use uppercase variables only for constructors, and if they are constructors call them with new
.
source, | ||
batch(Infinity), | ||
pull.asyncMap(reduce), | ||
pull.collect(function (err, roots) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arrow function
} | ||
|
||
describe('balanced builder', () => { | ||
it('reduces one value into itself', callback => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use parens around the args for arrow functions
) | ||
}) | ||
|
||
it('reduces 3 values into parent', callback => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parens
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add this to the aegir-lint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I always thought standard was linting for this, which is why I adopted it in the first place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the other hand, given this standard/eslint-config-standard#54 (comment) gigantic history I don't feel like touching it tbh
@diasdavid both requested changes are done via: |
@diasdavid there are also some opportunities to create more independent packages from this, namely the pull streams that generate a balanced tree and the trickle tree.. |
@@ -163,7 +163,7 @@ In the second argument of the importer constructor you can specify the following | |||
* `trickle`: builds [a trickle tree](https://github.com/ipfs/specs/pull/57#issuecomment-265205384) | |||
* `maxChildrenPerNode` (positive integer, defaults to `172`): the maximum children per node for the `balanced` and `trickle` DAG builder strategies | |||
* `layerRepeat` (positive integer, defaults to 4): (only applicable to the `trickle` DAG builder strategy). The maximum repetition of parent nodes for each layer of the tree. | |||
|
|||
* `reduceSingleLeafToSelf` (boolean, defaults to `false`): optimization for, when reducing a set of nodes with one node, reduce it to that node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌🏽
@@ -3,7 +3,7 @@ | |||
const balancedReducer = require('./balanced-reducer') | |||
|
|||
const defaultOptions = { | |||
maxChildrenPerNode: 172 | |||
maxChildrenPerNode: 174 | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a comment with a link to https://github.com/ipfs/go-ipfs/blob/master/importer/helpers/helpers.go#L16-L35 so that we remember where this value comes from
* `flat`: flat list of chunks | ||
* `balanced`: builds a balanced tree | ||
* `trickle`: builds [a trickle tree](https://github.com/ipfs/specs/pull/57#issuecomment-265205384) | ||
* `maxChildrenPerNode` (positive integer, defaults to `172`): the maximum children per node for the `balanced` and `trickle` DAG builder strategies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now defaults to 174
🎉🎉🎉🎉🎉🎉🎉🎉 THIS IS HAPPENING :D Thank you @pgte ❤️❤️❤️❤️❤️ |
Refactored the builder constructors and streams.
A builder now is a transform stream that reduces into one root. (some opportunities for externalising these reducers into their own packages).
Each builder can be parametrised by options.
Documented those options.
Client can also choose which builder to use (now defaults to balanced builder).
Adds individual tests for each reducer stream to make sure it's yielding the correct tree structure.
By the way,
Can now easily add a different chunker.
Client can choose chunker strategy using option. Defaults and only supports
fixed
chunker for now.Please review.