This repository has been archived by the owner on Apr 29, 2020. It is now read-only.

[ARCHIVED] JavaScript implementation of the UnixFs importer used by IPFS

ipfs-inactive/js-ipfs-unixfs-importer

🔒 Archived

The contents of this repo have been merged into ipfs/js-ipfs-unixfs.

Please open issues or submit PRs there.

ipfs-unixfs-importer

JavaScript implementation of the layout and chunking mechanisms used by IPFS to handle Files

Lead Maintainer

Alex Potsides

Install

> npm install ipfs-unixfs-importer

Usage

Example

Let's create a little directory to import:

> cd /tmp
> mkdir foo
> echo 'hello' > foo/bar
> echo 'world' > foo/quux

And write the importing logic:

const fs = require('fs')
const importer = require('ipfs-unixfs-importer')

// Import the two files created above
const source = [{
  path: '/tmp/foo/bar',
  content: fs.createReadStream('/tmp/foo/bar')
}, {
  path: '/tmp/foo/quux',
  content: fs.createReadStream('/tmp/foo/quux')
}]

// You need to create and pass an IPLD Resolver instance as `ipld`
// https://github.com/ipld/js-ipld-resolver
// Note: `for await` must run inside an async function
for await (const entry of importer(source, ipld, options)) {
  console.info(entry)
}

When run, metadata about each DAGNode in the created tree is printed, ending with the root:

{
  cid: CID, // see https://github.com/multiformats/js-cid
  path: 'tmp/foo/bar',
  unixfs: UnixFS // see https://github.com/ipfs/js-ipfs-unixfs
}
{
  cid: CID, // see https://github.com/multiformats/js-cid
  path: 'tmp/foo/quux',
  unixfs: UnixFS // see https://github.com/ipfs/js-ipfs-unixfs
}
{
  cid: CID, // see https://github.com/multiformats/js-cid
  path: 'tmp/foo',
  unixfs: UnixFS // see https://github.com/ipfs/js-ipfs-unixfs
}
{
  cid: CID, // see https://github.com/multiformats/js-cid
  path: 'tmp',
  unixfs: UnixFS // see https://github.com/ipfs/js-ipfs-unixfs
}
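
Assuming, as in the output above, that the root is the last entry emitted, a caller that only needs the root can keep the final entry. This is a minimal sketch: `lastEntry` and `fakeImporter` are hypothetical names introduced here, and `fakeImporter` only stands in for `importer(source, ipld, options)` so the snippet runs without an IPLD instance.

```javascript
// Hypothetical helper: drain an async iterable and return its final item.
async function lastEntry (iterable) {
  let last
  for await (const entry of iterable) {
    last = entry
  }
  return last
}

// Stand-in for importer(source, ipld, options): yields entries in the
// same order as the example output above (files, then parents, then root).
async function * fakeImporter () {
  yield { path: 'tmp/foo/bar' }
  yield { path: 'tmp/foo/quux' }
  yield { path: 'tmp/foo' }
  yield { path: 'tmp' }
}

async function main () {
  const root = await lastEntry(fakeImporter())
  console.info(root.path) // prints "tmp"
}

main().catch(console.error)
```

With the real importer, `root.cid` would then be the CID of the whole imported tree under the same assumption.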

API

const importer = require('ipfs-unixfs-importer')

const entries = importer(source, ipld [, options])

The importer function takes a source async iterator and returns an async iterator. The source should yield objects of the form:

{
  path: 'a name',
  content: (Buffer or iterator emitting Buffers),
  mtime: (Number representing seconds since (positive) or before (negative) the Unix Epoch),
  mode: (Number representing ugo-rwx, setuid, setgid and sticky bit)
}
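
As a sketch of the shape described above, a source can be an async generator that yields one object per file. The file names and the `mtime` and `mode` values below are illustrative assumptions, and `collect` is a hypothetical helper used only to show what the generator yields.

```javascript
// Yields entries in the { path, content, mtime, mode } form described above.
async function * source () {
  yield {
    path: 'foo/bar',
    content: Buffer.from('hello\n'),
    mtime: 1588118400, // seconds since the Unix Epoch (illustrative value)
    mode: 0o644 // rw-r--r--
  }
  yield {
    path: 'foo/quux',
    // content may also be an iterator that emits Buffers
    content: [Buffer.from('wor'), Buffer.from('ld\n')][Symbol.iterator](),
    mtime: 1588118400,
    mode: 0o600 // rw-------
  }
}

// Hypothetical helper: gather everything an async iterable yields.
async function collect (iterable) {
  const items = []
  for await (const item of iterable) items.push(item)
  return items
}

collect(source()).then(entries => {
  for (const { path, mode } of entries) {
    console.info(path, mode.toString(8))
  }
})
```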

The async iterator returned by importer outputs file info objects as files get stored in IPFS. When stats on a node are emitted they are guaranteed to have been written.

ipld is an instance of the IPLD Resolver or the js-ipfs DAG API.

The input's file paths and directory structure will be preserved in the created dag-pb nodes.

options is a JavaScript object that might include the following keys:

  • wrap (boolean, defaults to false): if true, a wrapping node will be created
  • shardSplitThreshold (positive integer, defaults to 1000): the number of directory entries above which we decide to use a sharding directory builder (instead of the default flat one)
  • chunker (string, defaults to "fixed"): the chunking strategy. Supports:
    • fixed
    • rabin
  • avgChunkSize (positive integer, defaults to 262144): the average chunk size (rabin chunker only)
  • minChunkSize (positive integer): the minimum chunk size (rabin chunker only)
  • maxChunkSize (positive integer, defaults to 262144): the maximum chunk size
  • strategy (string, defaults to "balanced"): the DAG builder strategy name. Supports:
    • flat: flat list of chunks
    • balanced: builds a balanced tree
    • trickle: builds a trickle tree
  • maxChildrenPerNode (positive integer, defaults to 174): the maximum children per node for the balanced and trickle DAG builder strategies
  • layerRepeat (positive integer, defaults to 4): (only applicable to the trickle DAG builder strategy). The maximum repetition of parent nodes for each layer of the tree.
  • reduceSingleLeafToSelf (boolean, defaults to true): optimization that, when reducing a set of nodes that contains only one node, reduces it to that node.
  • hamtHashFn (async function(string) Buffer): a function that hashes file names to create HAMT shards
  • hamtBucketBits (positive integer, defaults to 8): the number of bits at each bucket of the HAMT
  • progress (function): a function that will be called with the byte length of chunks as a file is added to ipfs.
  • onlyHash (boolean, defaults to false): Only chunk and hash - do not write to disk
  • hashAlg (string): multihash hashing algorithm to use
  • cidVersion (integer, defaults to 0): the CID version to use when storing the data (storage keys are based on the CID, including its version)
  • rawLeaves (boolean, defaults to false): When a file would span multiple DAGNodes, if this is true the leaf nodes will not be wrapped in UnixFS protobufs and will instead contain the raw file bytes
  • leafType (string, defaults to 'file'): what type of UnixFS node leaves should be. Can be 'file' or 'raw' (ignored when rawLeaves is true)
  • blockWriteConcurrency (positive integer, defaults to 10): how many blocks to hash and write to the block store concurrently. For small numbers of large files this should be high (e.g. 50).
  • fileImportConcurrency (number, defaults to 50): how many files to import concurrently. For large numbers of small files this should be high (e.g. 50).
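
To get a feel for how the chunking and layout defaults interact, the sketch below estimates how many leaf chunks a file produces and how many layers of parent nodes a balanced tree needs to cover them. It assumes the fixed chunker cuts every maxChunkSize bytes and that each parent holds up to maxChildrenPerNode children. `estimateLayout` is a hypothetical helper, not part of the importer, and the real DAG builder may differ in detail.

```javascript
// Defaults taken from the option list above.
const defaults = {
  chunker: 'fixed',
  strategy: 'balanced',
  maxChunkSize: 262144,
  maxChildrenPerNode: 174
}

// Hypothetical helper: rough leaf count and parent-layer count for a file.
function estimateLayout (fileSize, options = {}) {
  const { maxChunkSize, maxChildrenPerNode } = { ...defaults, ...options }
  const leaves = Math.max(1, Math.ceil(fileSize / maxChunkSize))

  // Repeatedly group nodes under parents until a single root remains.
  let layers = 0
  let nodes = leaves
  while (nodes > 1) {
    nodes = Math.ceil(nodes / maxChildrenPerNode)
    layers++
  }
  return { leaves, layers }
}

console.info(estimateLayout(1024)) // { leaves: 1, layers: 0 }
console.info(estimateLayout(174 * 262144)) // { leaves: 174, layers: 1 }
console.info(estimateLayout(174 * 262144 + 1)) // { leaves: 175, layers: 2 }
```

Under these assumptions, a single layer of parent nodes covers files of up to 174 × 262144 = 45,613,056 bytes with the default settings.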

Overriding internals

Several aspects of the importer are overridable by specifying functions as part of the options object with these keys:

  • chunkValidator (function): Optional function that supports the signature async function * (source, options)
    • This function takes input from the content field of imported entries and should yield Buffer objects constructed from it, throwing an Error if it cannot.
  • chunker (function): Optional function that supports the signature async function * (source, options) where source is an async generator and options is an options object
    • It should yield Buffer objects.
  • bufferImporter (function): Optional function that supports the signature async function * (entry, source, ipld, options)
    • This function should read Buffers from source and persist them using ipld.put or similar
    • entry is the { path, content } entry, source is an async generator that yields Buffers
    • It should yield functions that return a Promise that resolves to an object with the properties { cid, unixfs, size } where cid is a CID, unixfs is a UnixFS entry and size is a Number that represents the serialized size of the IPLD node that holds the buffer data.
    • Values will be pulled from this generator in parallel - the amount of parallelisation is controlled by the blockWriteConcurrency option (default: 10)
  • dagBuilder (function): Optional function that supports the signature async function * (source, ipld, options)
    • This function should read { path, content } entries from source and turn them into DAGs
    • It should yield a function that returns a Promise that resolves to { cid, path, unixfs, node } where cid is a CID, path is a string, unixfs is a UnixFS entry and node is a DAGNode.
    • Values will be pulled from this generator in parallel - the amount of parallelisation is controlled by the fileImportConcurrency option (default: 50)
  • treeBuilder (function): Optional function that supports the signature async function * (source, ipld, options)
    • This function should read { cid, path, unixfs, node } entries from source and place them in a directory structure
    • It should yield an object with the properties { cid, path, unixfs, size } where cid is a CID, path is a string, unixfs is a UnixFS entry and size is a Number.
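
As an illustration of the first two signatures above, here is a minimal sketch of a custom chunkValidator and chunker. Both are hypothetical examples rather than the importer's built-in implementations: the validator accepts only Buffers and strings, and the chunker re-cuts its input into pieces of `options.maxChunkSize` bytes. `collect` is a helper added so the pair can be exercised without an IPLD instance.

```javascript
// Example chunkValidator: turn each piece of content into a Buffer or throw.
async function * chunkValidator (source, options) {
  for await (const content of source) {
    if (Buffer.isBuffer(content)) {
      yield content
    } else if (typeof content === 'string') {
      yield Buffer.from(content)
    } else {
      throw new Error('Content was invalid')
    }
  }
}

// Example chunker: emit Buffers of exactly options.maxChunkSize bytes,
// followed by a shorter final chunk if any bytes remain.
async function * chunker (source, options) {
  const size = options.maxChunkSize
  let pending = Buffer.alloc(0)
  for await (const buf of source) {
    pending = Buffer.concat([pending, buf])
    while (pending.length >= size) {
      yield pending.subarray(0, size)
      pending = pending.subarray(size)
    }
  }
  if (pending.length > 0) yield pending
}

// Helper for the demo: gather everything an async iterable yields.
async function collect (iterable) {
  const items = []
  for await (const item of iterable) items.push(item)
  return items
}

async function main () {
  const options = { maxChunkSize: 4 }
  const content = ['hello ', Buffer.from('world')]
  const chunks = await collect(chunker(chunkValidator(content, options), options))
  console.info(chunks.map(c => c.toString())) // [ 'hell', 'o wo', 'rld' ]
}

main().catch(console.error)
```

Following the description above, such functions would be supplied through the options object, for example `importer(source, ipld, { chunkValidator, chunker })`.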

Contribute

Feel free to join in. All welcome. Open an issue!

This repository falls under the IPFS Code of Conduct.

License

MIT