Steb's Questions & Feedback #13

Closed
Stebalien opened this issue Jul 13, 2022 · 6 comments

@Stebalien

Some quick questions and "context-free" suggestions (dumped in one place so I don't spam a bunch of issues).

  • merge: what if I want an N-way merge?
    • Consider changing previous to an array of parents (like git).
    • Unless you only support 2-way merges, but I assume you'll need bigger merges.
  • metadata
    • Take a look at https://github.com/ipfs/specs/blob/main/UNIXFS.md#metadata
    • Why is type in metadata? That seems like something that should live in the file itself.
    • What if I have a lot of metadata? It would be nice if I could have (some) metadata in a separate object.
    • What about extensible metadata?
    • Why is "previous" and "merge" not metadata?
    • Can I add my own custom file type?
      • I'd like to be able to add my own that can be understood by a subset of clients. E.g., a special libp2p "socket" file.
  • Symlink
    • Looking at the format, does this mean I can effectively have "alternatives"? I.e.: {ipns: "...", http: "..."}
  • Private data.
    • Can I link to public data from private data?
    • Can I link to private data from public data?
    • Can I link to "more restricted" data from private data?
@matheus23
Member

Thanks for a second set of eyes on this! 🙏


Consider changing previous to an array of parents (like git).

Yes.

Why is type in metadata? That seems like something that should live in the file itself.

Yeah. In WNFS we've previously piggy-backed on existing metadata types out there providing us with info of whether something is a file or not.
Even if only for convenience (& support for common ways of de/serializing stuff) we've been considering moving it a level up for the next version.

What if I have a lot of metadata? It would be nice if I could have (some) metadata in a separate object.

Yes. Nothing should prevent you from linking to a CID in the metadata (& that being understood by IPFS natively, it being dag-cbor).

What about extensible metadata?

Metadata is meant to be extensible. Making that more explicit in the spec is TBD!

Why is "previous" and "merge" not metadata?

I think for these two fields it doesn't really matter where they go (unless we decide to split out metadata into another block entirely at some point or something like that).

I'd like to be able to add my own that can be understood by a subset of clients. E.g., a special libp2p "socket" file.

That sounds like an interesting use case. I don't quite understand what you're trying to achieve with this.
I'm assuming underneath it's still just a bunch of bytes? If so, you could just use a regular file & signal that it's something special in the metadata.
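To illustrate the "regular file plus a metadata hint" idea, here is a minimal sketch. The `x-kind` key, the `libp2p-socket` value, and the handler tables are all hypothetical and not part of the WNFS spec; the point is only that a client that does not know the hint still sees an ordinary file.

```python
def open_node(node: dict, handlers: dict) -> str:
    """Dispatch on an optional metadata hint; unknown hints fall back to 'file'."""
    hint = node.get("metadata", {}).get("x-kind")  # hypothetical metadata key
    handler = handlers.get(hint, handlers["file"])
    return handler(node["content"])


# A client that only knows about plain files.
handlers_basic = {"file": lambda b: f"file:{len(b)} bytes"}

# A client that additionally understands the (hypothetical) socket hint.
handlers_socket_aware = {
    **handlers_basic,
    "libp2p-socket": lambda b: f"socket:{b.decode()}",
}

node = {
    "metadata": {"x-kind": "libp2p-socket"},
    "content": b"/ip4/127.0.0.1/tcp/4001",
}

basic_view = open_node(node, handlers_basic)
socket_view = open_node(node, handlers_socket_aware)
```

Only the subset of clients carrying the extra handler treats the node specially; everyone else reads the same bytes as a regular file.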


Yeah, symlinks are very much underspecified right now. I need to check back with Brooke on what the plans around this exactly are (& with other folks who've worked on existing implementations of them in previous versions of WNFS so far).

@expede
Member

expede commented Jul 15, 2022

@Stebalien thank you very much for the questions and feedback! 🙏

Why is "previous" and "merge" not metadata?

Because it's encoded as UnixFS today. When we write a custom codec, it should absolutely be in the metadata. We needed all of this stuff to work on an arbitrary gateway, but with some changes coming (or better WNFS support) a custom codec would be extremely high on the priority list. We've never been happy with having to use UnixFS for this.

What if I have a lot of metadata? It would be nice if I could have (some) metadata in a separate object.

Interesting. I guess you mean breaking it up into multiple metadata files. Is there a use case for that, or could it be pushed off? If you have that much metadata, it may make sense to store it as a file and link to the CID from the metadata file. I'm not against such a change, but need to think on it more.
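A rough sketch of "store it separately and link to it from the metadata" follows. Everything here is a stand-in chosen for illustration: the in-memory `store`, the `extraLink` key, a SHA-256 hex digest in place of a real CID, and JSON in place of dag-cbor.

```python
import hashlib
import json

store = {}  # toy content-addressed block store (assumption, not a real IPFS API)


def put(obj) -> str:
    """Serialize deterministically and key the block by its hash (stand-in for a CID)."""
    data = json.dumps(obj, sort_keys=True).encode()
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key


def get(key: str):
    """Load a block back by its content address."""
    return json.loads(store[key])


# Bulky metadata lives in its own block...
big = {"exif": {"camera": "example", "iso": 100}, "tags": ["a", "b"]}

# ...and the small inline metadata only carries a link to it.
metadata = {"created": "2022-07-13", "extraLink": put(big)}

resolved = get(metadata["extraLink"])
```

The inline metadata stays small, and a reader that wants the extra fields follows the link on demand.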

Consider changing previous to an array of parents (like git).

We've gone back and forth on this. You can build N-way merges out of binary merges. Assuming that you haven't written an update to the merge, you can treat all of the leaves of the merge tree as an abelian monoid (partially ordered, associative, commutative).

IIRC the tradeoff on binary links is that you get more consistent merge nodes across replicas, but more internal nodes. It's not an "over my dead body", though. We want back and forth on this one.
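As a small sketch of building an N-way merge out of binary merges: the `Leaf` and `Merge` types below are hypothetical stand-ins, not WNFS data structures. Folding the heads in any order yields the same set of leaves, while the shape of the internal nodes depends on the fold order.

```python
from __future__ import annotations

from dataclasses import dataclass
from functools import reduce
from itertools import permutations
from typing import Union


@dataclass(frozen=True)
class Leaf:
    """A concrete revision (hypothetical stand-in for a node's CID)."""
    cid: str


@dataclass(frozen=True)
class Merge:
    """A binary merge node with exactly two parents."""
    left: Node
    right: Node


Node = Union[Leaf, Merge]


def leaves(node: Node) -> frozenset:
    """Collect the set of leaf revisions underneath a merge tree."""
    if isinstance(node, Leaf):
        return frozenset({node.cid})
    return leaves(node.left) | leaves(node.right)


def merge_all(heads: list) -> Node:
    """Fold binary merges left to right over N heads."""
    return reduce(Merge, heads)


heads = [Leaf("a"), Leaf("b"), Leaf("c")]
trees = [merge_all(list(p)) for p in permutations(heads)]
leaf_sets = {leaves(t) for t in trees}
```

Because set union is associative and commutative, every tree in `trees` reports the same leaves even though the trees themselves differ. Merging N heads this way costs N - 1 internal merge nodes.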

Can I add my own custom file type?

Sure. Do you mean arbitrary IPLD, or a file extension?

As an aside that may or may not be relevant, something that Brendan asked for previously (but now seems to think is a bad idea) is "LDFiles" (Linked Data Files), which are arbitrary IPLD that can have internal versioned updates.

Can I link to public data from private data?
Can I link to private data from public data?

Yes to both, via symlink

Can I link to "more restricted" data from private data?

What is "more restricted" in this context? If you mean data that the current part of the cryptree can't read, then yes via symlink.

@expede
Member

expede commented Jul 15, 2022

What about extensible metadata?

Metadata is meant to be extensible. Making that more explicit in the spec is TBD!

Exactly this! It's essentially xattrs

@expede
Member

expede commented Jul 16, 2022

Another few things that you may find interesting, but we haven't actually done anything with yet, in a tweak to an earlier answer:

You can put public files (or directories) in the private directory, and treat them as "unlisted".

You technically could put encrypted files or DAGs in the public tree, but now you're revealing some info about them (the path that they're embedded inside). It's probably best to keep them in one big structure, and use (e.g.) the human-readable file path (which doesn't correlate to the external labels at all).

The private tree can contain an arbitrary number of file systems (or other structures) if kicked off by a super user. This is likely a feature, not a bug. For example, our WIP distributed database's private data MAY end up living here since it has an extremely different layout from the file system and different concept of versioning.

@matheus23
Member

matheus23 commented Aug 23, 2022

I think most of the points raised were addressed. The rest I'd consider to be part of #24

@Stebalien
Author

Yeah, sorry, I wrote up a response then GitHub ate it.

But yes, I have a much better understanding now.

3 participants