Distinguish fanout from non-fanout links #10

Closed
wking opened this issue Jun 3, 2015 · 1 comment

@wking
Contributor

wking commented Jun 3, 2015

In ipfs/kubo#1320, I proposed a hypothetical fanout graph with:

Object    Type            Link
                       Name Hash
========  ===========  ==== ========
<hash-5>  fanout       d*   <hash-6>
<hash-6>  fanout-leaf  d    <hash-7>

and suggested non-recursive resolution of /ipfs/<hash-5>/d should
return /ipfs/<hash-7>.

On Wed, Jun 03, 2015 at 02:32:47PM -0700, Juan Batiz-Benet wrote:

> btw, the fanout links thing should work like a union or a unixfs
> file-- each node should work as a valid root (i.e. no fanout and
> fanout-leaf distinction).

Agreed on valid-root-ness. I haven't looked into our layout
implementation. Fanout-leaf-ness is really just a way to identify
whether a link points to another fanout node, or if it points to an
independent child. If that information isn't contained in the fanout
node (e.g. if it requires you to look up the linked hash and check its
type), then non-recursive resolution for /ipfs/<hash-5>/d would look
like:

  1. Fetch <hash-5> from the Merkle DAG.
  2. Check its type. It's a fanned-out directory (i.e. not a
    multi-chunk file, or a single-chunk file, or a single-chunk
    directory).
  3. Find the fanout strategy used by this object. In this case it
    turns out to be first-name-character fanout.
  4. Look up the fanout child for our name (‘d’). The name starts with a
    ‘d’, so follow the ‘d*’ (or whatever) link to drill down the fanout
    tree to <hash-6>.
  5. Repeat (1), but for <hash-6>. Since <hash-5> was a fanout object,
    we recurse here even though --recursive wasn't set.
  6. Repeat (2), but for <hash-6>. It's a fanned-out directory (if
    we're not distinguishing leaves from non-leaves).
  7. Repeat (3), but for <hash-6>. This time it's “just use the full
    name”.
  8. Repeat (4), but for <hash-6>. The full name is ‘d’, so follow the
    ‘d’ link to <hash-7>.
  9. Repeat (1), but for <hash-7>.
  10. Repeat (2), but for <hash-7>. It's a unixfs-dir, so we've left the
    fanout and should use <hash-7> in the response.
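
A minimal sketch of the ten steps above, assuming a hypothetical
in-memory stand-in for the Merkle DAG. The field names (`type`,
`strategy`, `links`) and the strategy labels are made up for
illustration and are not the real object layout. The point is only to
show that all three objects get fetched:

```python
# Hypothetical mock of the Merkle DAG from the table above.
# Field names and strategy labels are illustrative assumptions.
DAG = {
    "<hash-5>": {"type": "fanout", "strategy": "first-char",
                 "links": {"d*": "<hash-6>"}},
    "<hash-6>": {"type": "fanout", "strategy": "full-name",
                 "links": {"d": "<hash-7>"}},
    "<hash-7>": {"type": "unixfs-dir", "links": {}},
}

fetched = []  # every object fetched, in order


def fetch(h):
    """Steps 1, 5, 9: fetch an object from the (mock) Merkle DAG."""
    fetched.append(h)
    return DAG[h]


def link_name(strategy, name):
    """Steps 3-4 and 7-8: map the requested name to a link name."""
    if strategy == "first-char":
        return name[0] + "*"
    if strategy == "full-name":
        return name
    raise ValueError(f"unknown fanout strategy: {strategy}")


def resolve(root, name):
    """Non-recursive resolution when only the linked object's own type
    tells us whether we have left the fanout tree."""
    node = fetch(root)
    if node["type"] != "fanout":
        return node["links"][name]  # ordinary single-object directory
    while True:
        child = node["links"][link_name(node["strategy"], name)]
        node = fetch(child)  # steps 5 and 9: needed just to check type
        if node["type"] != "fanout":
            return child  # step 10: we have left the fanout tree


print(resolve("<hash-5>", "d"), fetched)
```

With this mock, resolving `d` under `<hash-5>` fetches `<hash-5>`,
`<hash-6>`, and `<hash-7>`, even though `<hash-7>` is only inspected
for its type.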

I'd like to be bailing out after (8), before fetching <hash-7>.
That's going to require some way to distinguish “follow this link to
get another fanout object” (what we had in step 4) from “follow this
link to leave the fanout tree” (what we had in step 8). I don't really
care if that information is encoded at the object level (e.g. “all
links from this node are to fanout objects”) or in special per-link
metadata (e.g. “this link points to another fanout object”), but I
want it somewhere in the linking object so I don't have to fetch the
linked object to find out. In this case “first-name-character fanout”
isn't compatible with links to non-fanout objects and “just use the
full name” isn't compatible with links to fanout objects, so that's
enough to figure this out without a <hash-7> fetch. Are all fanout
strategies going to be so obvious? Or do we need a separate way
to distinguish links that point to fanout nodes from links that point
to non-fanout nodes?
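
A rough sketch of the bail-out-after-(8) variant, assuming the fanout
strategy alone tells us whether links stay inside the fanout tree. The
`LINKS_TO_FANOUT` table, the field names, and the strategy labels are
hypothetical. A per-link flag would work the same way, with the check
reading link metadata instead of the strategy:

```python
# Hypothetical mock of the Merkle DAG. <hash-7> is deliberately absent,
# so any attempt to fetch it would raise KeyError.
DAG = {
    "<hash-5>": {"type": "fanout", "strategy": "first-char",
                 "links": {"d*": "<hash-6>"}},
    "<hash-6>": {"type": "fanout", "strategy": "full-name",
                 "links": {"d": "<hash-7>"}},
}

# Assumption: each strategy implies whether its links point at further
# fanout objects (True) or leave the fanout tree (False).
LINKS_TO_FANOUT = {"first-char": True, "full-name": False}

fetched = []  # every object fetched, in order


def fetch(h):
    fetched.append(h)
    return DAG[h]


def link_name(strategy, name):
    if strategy == "first-char":
        return name[0] + "*"
    if strategy == "full-name":
        return name
    raise ValueError(f"unknown fanout strategy: {strategy}")


def resolve(root, name):
    """Non-recursive resolution that never fetches the final target."""
    node = fetch(root)
    if node["type"] != "fanout":
        return node["links"][name]
    while True:
        strategy = node["strategy"]
        child = node["links"][link_name(strategy, name)]
        if not LINKS_TO_FANOUT[strategy]:
            return child  # link leaves the fanout tree: stop after (8)
        node = fetch(child)
        if node["type"] != "fanout":
            raise ValueError(f"{child} was expected to be a fanout object")


print(resolve("<hash-5>", "d"), fetched)
```

Here only `<hash-5>` and `<hash-6>` are fetched. Whether every fanout
strategy can carry that information implicitly is exactly the open
question.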

@hsanjuan
Contributor

This issue is a bit old. Unixfs directories support a HAMT sharding function when they grow too big (which I suspect is the problem the fanout type was trying to solve?). In any case, I am closing this as it was never followed up.
