Skip to content
This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

feat: pin API #1045

Merged
merged 21 commits into from
Jun 19, 2018
Merged

feat: pin API #1045

merged 21 commits into from
Jun 19, 2018

Conversation

daviddias
Copy link
Member

No description provided.

@ghost ghost assigned daviddias Oct 22, 2017
@ghost ghost added the status/in-progress In progress label Oct 22, 2017

const multihashes = require('multihashes')
const CID = require('cids')
const protobuf = require('protocol-buffers')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should use the protons package instead

@daviddias
Copy link
Member Author

daviddias commented Oct 22, 2017

@AdamStone stunning work with owning the Pin API endeavour on specs, js-ipfs-api and js-ipfs 👏🏽👏🏽👏🏽👏🏽👏🏽👏🏽

I went ahead and merged and released interface-ipfs-core + js-ipfs-api 🌟

As for merging this PR, since it is such a large code addition, I want to review it thoroughly one more time and I also want to invite other js-ipfs contributors to give their feedback.

Let's shoot to merge this week :) @AdamStone will you be around to finish any last details?

PS: I've also invited you to the JavaScript team on the IPFS org on Github, you now can commit directly into this branch or create other branches :)

@daviddias daviddias removed their assignment Oct 22, 2017
@AdamStone
Copy link

@diasdavid Thanks very much, I didn't expect it was that close to ready for merge though. Were you able to resolve the issue with the pin datastore incompatibility with the go repo? As I left it, the JS implementation would save its pins under a different key than the go implementation.

Having finished some high priority things for work, I think I should have some free time again for another week or two, but after that I'll be full time taking care of a newborn.

@@ -0,0 +1,31 @@
'use strict'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any specific reason we're implementing our own set here and not using the built in one?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it this way to work around the problem mentioned in the comment

  // Buffers with identical data are still different objects, so
  // they need to be cast to strings to prevent duplicates in Sets

in order to have set-like storage for multihash buffers but indexed by hash string. But since it's straightforward to convert the hash string back to the buffer object, maybe it would be better to just use a Set to store the strings and use multihashes.fromB58String() where the object is needed.

const maxItems = 8192

// Protobuf interface
const pbSchema = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can probably use backticks here (`) - but that's a nitpick :)

test/cli/pin.js Outdated
})
})

// it('ls (quiet)', () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this still needed?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure about this one. It's part of go-ipfs but wasn't mentioned in the js-ipfs specs, and I had some trouble getting it to work.

// try to accept a variety of hash options including
// multihash Buffers, base58 strings, and ipfs path
// strings, either individually or as an array
if (Buffer.isBuffer(hashes) || typeof hashes.forEach !== 'function') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe use Array.isArray() here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard to remember now, but I think I had used that originally and changed it after someone asked to avoid using it... Or maybe it had been instanceof Array. Another option is hashes.constructor === Array. I'm fine with any of these. Currently in the project I see three examples of instanceof Array and three of Array.isArray().

each(hashes, (hash, cb) => {
if (typeof hash === 'string') {
// example: '/ipfs/QmRootHash/links/by/name'
const matched = hash.match(/^(?:\/ipfs\/)?([^/]+(?:\/[^/]+)*)\/?$/)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to have utils around this :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually wonder if I should be doing this at all here, it seems like there should already be some built-in way to follow paths like this using the existing interface? All this is just for the sake of getting the cid of the thing at the end of the path. I could use files.get with the full path but I'm not sure how to get the cid from the result. @diasdavid any suggestion?


function getRecursive (multihash, callback) {
// gets flat array of all DAGNodes in tree given by multihash
// (should this be part of dag.js API? it was in ipfs-merkle-dag)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen several of this questions, did we get this addressed - @diasdavid?

@AdamStone
Copy link

Thanks for the code review, I'm just checking in for now to mention that my son was born about a week early, so it will be some time before I can follow up with this, probably at least a few weeks.

@dryajov
Copy link
Member

dryajov commented Nov 6, 2017

@AdamStone congrats on your newborn! Hope you're enjoying the moment - it flies by fast 😜

@daviddias
Copy link
Member Author

@AdamStone just read your last message. Congratulations!!! 🎉🌟👶🌟🎉 I hope everything is great with the 3 of you :)

@daviddias daviddias added exp/expert Having worked on the specific codebase is important help wanted Seeking public contribution on this issue P1 High: Likely tackled by core team if no one steps up labels Nov 25, 2017
@daviddias daviddias mentioned this pull request Nov 25, 2017
@AdamStone
Copy link

@diasdavid @dryajov Thanks, we are doing well and getting used to our new life. I found some time to look into this and figured out why my saving and loading of the pin set wasn't working with the go test repo. I have it working now locally, but only if I comment out this assert to enable DAG links with size 0 -- it seems the go repo allows size 0 links and the pin set implementation includes them. Not sure what to do about that. The hashes I see for the same sets of pins pinned through the js interface are also different, so there must still be some discrepancy in the structure of the pin sets, but it doesn't seem to affect the end result.

@AdamStone
Copy link

It seems like I can't push to this PR, so I pushed these changes to my fork.

@daviddias
Copy link
Member Author

@ghost ghost assigned AdamStone Nov 30, 2017
@daviddias
Copy link
Member Author

@AdamStone note that there are merge conflicts due to recent changes in js-ipfs master, it might be wise to resolve them now so that you build on the current structure and avoid a big hairy rebase at the end.

@AdamStone
Copy link

AdamStone commented Nov 30, 2017

@diasdavid Thanks, I probably got dropped before from inactivity. Pushed an update. There were a couple of inline questions above about utility functions, any thoughts on those? Or the matter of zero-size links?

For now I'll take a look at the merge conflicts.

@daviddias
Copy link
Member Author

@AdamStone re: zero size links - The reality is that the size value on a link can't really be trusted since there is no way to create a proof that the size is the real one other than checking manually. In JS land at least we make the obvious claim that if there is a link, at least it has to be bigger than 0 but it seems that Go didn't and ignores the size completely for Pinsets. Given that and that we want to be feature by feature (and sometimes bug by bug), we can remove that condition from ipld/dag-pb. Wanna submit that PR?

@AdamStone
Copy link

AdamStone commented Dec 2, 2017

I ended up moving getRecursive to the dag module because it just seems more sensible to have it there than buried in the pin code (since all it does is recursively traverse a dag node's links).

@AdamStone
Copy link

I merged the latest master, updated the ipfs-api dependency to the latest version, but tests are still failing on "a link requires a size." The latest ipfs-api version requires an ipld-dag-pb version that should include the change merged above to allow empty links, so I'm not sure why the tests are still seeing this assert.

@daviddias
Copy link
Member Author

@AdamStone Just ran a fresh npm install (note, it is important to remote the package-lock.json) and got

» npm ls ipld-dag-pb
ipfs@0.27.5 /Users/imp/code/js-ipfs
├─┬ interface-ipfs-core@0.36.16
│ └── ipld-dag-pb@0.11.4
├─┬ ipfs-api@17.2.5
│ └── ipld-dag-pb@0.11.4  deduped
├─┬ ipfs-unixfs-engine@0.24.2
│ └── ipld-dag-pb@0.11.4  deduped
└─┬ ipld-resolver@0.14.1
  └── ipld-dag-pb@0.11.4  deduped

Which looks good. Running the tests also gave me:

      pin
        /pin/add
          ✓ pins object recursively by default (39ms)
        /pin/add (direct)
          ✓ pins object directly if specified
        /pin/ls (with path)
          ✓ finds specified pinned object
        /pin/ls (without path or type)
          ✓ finds all pinned objects
        /pin/rm (direct)
          ✓ unpins only directly pinned objects if specified
        /pin/rm
          ✓ unpins recursively by default

Perhaps you just didn't fresh install your node modules? I've this alias on my zshrc to avoid any confusion: alias fresh-modules="rm package-lock.json && rm -rf node_modules && npm install" :)

Copy link
Member Author

@daviddias daviddias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tons of good work here @AdamStone!! This PR is super close to being merged and tests are looking good as well.

I'm making some comments asking for some bits of the code to be a lil' more simplified, just to prevent to confuse our future selfs :)

const type = argv.type
const quiet = argv.quiet
argv.ipfs.pin.ls(paths, { type }, (err, results) => {
if (err) { throw err }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this print a useful error message to the user?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose probably not. I just followed the same pattern as the other CLI modules.

const paths = argv.path && argv.path.split(' ')
const type = argv.type
const quiet = argv.quiet
argv.ipfs.pin.ls(paths, { type }, (err, results) => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind being explicit and using { type: type }?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, no problem

const paths = argv['ipfs-path'].split(' ')
const recursive = argv.recursive
argv.ipfs.pin.rm(paths, { recursive }, (err, results) => {
if (err) { throw err }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. Let's make sure that these error messages are printed in a way that it provides useful information to the user and it doesn't look like the whole program crashed

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it looks like all the CLI modules just throw the error, would that be better as a separate PR to replace all of them with some consistent response?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good 👍

@@ -73,6 +75,30 @@ module.exports = function dag (self) {
self._ipldResolver.treeStream(cid, path, options),
pull.collect(callback)
)
}),

getRecursive: promisify((multihash, callback) => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This getRecursive is only working for ipld-dag-pb right now. Either make it an internal method by prefixing with a _ and add a comment saying just that or move it to https://github.com/ipld/js-ipld-resolver and leverage the .tree and .isLink API calls to make it agnostic to any kind of IPLD Format

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll see if I can move it

Copy link

@AdamStone AdamStone Dec 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diasdavid Should the existing treeStream method already do what I'm trying to do here, if the recursive: true option is passed? I tried testing it with a CID for this directory but I'm getting an error in the recursive part. Specifically this new CID(...) errors because paths includes separate objects for link name, size and hash that all have p.link (so it tries to recurse all of them, when only the hash ones should be recursed).

[
   {
      "path":"Links/0/Name",
      "link":{
         "/":"about"
      }
   },
   {
      "path":"Links/0/Tsize",
      "link":{
         "/":1688
      }
   },
   {
      "path":"Links/0/Hash",
      "link":{
         "/":"QmZTR5bcpQD7cFgTorqxZDYaew1Wqgfbd2ud9QqGPAkK2V"
      }
   },
   ...
]

Is it a bug that the hashes array is getting these as separate links? Or should the recursion check specifically for the "hash" links?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that definitely looks like a bug as it brings more info than what it promises. The method was created just to return all the paths. That link property is just a left over.

Copy link

@AdamStone AdamStone Jan 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diasdavid Based on these discussions it looks like the presence of separate "paths" for name, size and hash for each link is correct/necessary, and the bug is that the isLink method (formerly isCID) was meant to identify specifically the "hash" paths but is actually more permissive. The tests pass because the non-valid CID case is checking an empty value, which happens to return false correctly, whereas other non-empty non-CID values would not. I made another PR to address that and update the test.

The same pattern for isLink appears in other types of resolvers as well, so I expect the same problem would occur for those as well. If the fix I made in the above PR looks good, I can do the same for the other resolvers.

hashed[h].push(items[i])
}
const storeItemsCb = (err, child) => {
if (callback.called) { return }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use http://npmjs.com/once for this

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, looks like I had onced the callbacks but missed removing some of these.

})
}
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the function above. This whole method needs to be simplified.

return callback(null, result.payload)
}
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

}),

isPinned: (multihash, callback) => {
// callback (err, pinned, reason)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove if not used

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it is preferable for the callback to be use function (err, result) to follow the same pattern as everything else

}
}

exports.normalizeHashes = function (ipfs, hashes, callback) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add tests specifically to this function?

exports.OFFLINE_ERROR = 'This command must be run in online mode. Try running \'ipfs daemon\' first.'

const splitIpfsPath = exports.splitIpfsPath = function (pathString) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add tests specifically to this function?

JonKrone and others added 21 commits June 18, 2018 16:25
* initial sweep through to understand how pin works. Did make some changes but mostly minor.

* refactor pb schema to it's own file

* fix: don't pin files during files.add if opts.pin === false

* feat: add some http qs parsing, http route/resources cleanup, cleanup core/utils.parseIpfsPath

* feat: expand pin tests. \nFirst draft. still needs some further work.

* feat: Add logging for entry/exit of pins: add/rm/flush/load. Clean some documentation.

* feat: add --pin to files.add, fix: improper pin option parsing in core.

* feat: Use ipfs.files.add to add init-docs instead of directly using the unix-fs importer.

* feat(tests): Add tests for cli --pin option. I know this should be more of an integration test. Should be written in /core. Maybe talk with Victor about testing different layers

* feat: use isIPFS to valiate a multihash.
refactor: change pin.flush logging message
fix: don't need to cast the object.get result with toJSON

revert: use interface-datastore.Key for datastore pin storage
The proper change would be that datastore-level automatically casts operations into Keys

fix: do not invoke callback within a try/catch

feat(test): make cli pin tests more robust
By using files that aren't added on IPFS initialization. Still needs work on files.rm (direct) and ipfs ls (indirect).

fix: remove commented code, traced test failures to pin-set
Got to go for the night, though, so will checkpoint here and address tomorrow.

feat: parseIpfsPath now throws errors for consistency

feat: resolveIpfsPaths error message lists the relative path that failed

feat: use follow.bind instead of mutating the links
Also decided not show relative paths. Less human friendly but probably cleaner otherwise.

refactor: resolveIpfsPaths -> resolvePaths

feat: promisify resolvePaths

test: change parseIpfsPath failure tests to use try/catch

docs: edit resolvePath doc

revert: accidentally deleted commands/pin.js
I think my original rebase for this branch 2 weeks ago might have
changed history for the intervening commits, indirectly causing some of
these missed changes. or I just rebase onto the wrong oldparent.

fix: some onlyHash and pin tests broke after merging
onlyHash and pin interact: shouldn't pin when --only-hash.

fix: trim output for 'pin ls when no hash is passed'

test: indirect pins supersede direct pins: turns out we had a bug

feat: add expectTimeout test utility

feat: promisify some additional pin utils
I think I'll end up moving most tests here.

test: add tests for pin.ls and pin.rm
Based tests on other pin fixtures, need to migrate the isPinned* tests to them as well.

fix: direct pins are now deleted by a default pin.rm(hash)

test: prepare for pin.add tests
'indirect supersedes direct' test exposes a bug in pin.ls

feat: switch away from multihashes for isPinned* tests

test: impl pin.add tests

fix: add fixture files only once

test: add test for a potential bug, clean isPinned* tests

refactor: remove a test that's no longer needed

fix: pin.ls, indirect pins should supersede direct pins

test: naive pin.load, pin.flush tests

feat: remove most pin cli tests as functionality is tested in pin core tests

refactor: rename solarSystem
fix: attempt to find a way to use http-api/inject test structure for pin tests

test: fix pin.rm http-api tests

test: fix pin.add http-api tests

docs: docs and cleanup of http-api pin tests

refactor: renaming

fix: lint errors

fix: resolvePaths tests are failing on CI, it might be long ops, testing a timeout bump

fix: add files explicitly before testing resolvePaths

fix: remove mocha.only from resolvePaths. let's hope tests pass, they are passing CI now

fix: rename test/core/utils.spec.js -> utils.js so it's not run during browser tests
Need to leave computer, this is a checkpoint.

test: add sanity test for walkItems and hasChild, clean others
These tests are more descriptive than really pushing the impl. I'd love others' thoughts on what else should be hit and how. I also need to compare go's pinset impl against ours

fix: stop daemons

feat: documentation and multihash buffer handling for dag.get

fix: lint
fix: pinset.hasChild buffer check

feat: hardcode expected length for flush/load tests
I must have missed a commit during a rebase.
fix: yarg arugment naming for pin cli commands

fix: convert file multihashes to a b58 string

fix: another way of checking for CID-ness

fix: lint

fix: toB58String handles non-buffers

fix: key-exchange core tests now shutdown daemon.
refactor: pinset.hasChild -> pinset.hasDescendent

fix: invoke someCb if we've seen the hash before

refactor: async patterns in dag._getRecursive

refactor: pinset.hasDescendant

refactor: pinset.storeItems async patterns

refactor: pinset.loadSet and pin.walkItem async patterns

docs: add link to go-ipfs' fanout bin implementation

refactor: async patterns of pin.load/flush

refactor: lint

refactor: privatize internal pin key storage

refactor: change encapsulation of ipfs.pin, fix resulting issues

fix: lint

fix: 'files add --pin=false' test was giving a false positive

refactor: use is-ipfs to check CID-ability of a string

refactor: remove last instance of 'once' in the pin code
They're simple enough, documented elsewhere, and not used by any exposed functionality.
@alanshaw alanshaw merged commit 2a5cc5e into master Jun 19, 2018
@ghost ghost removed the status/in-progress In progress label Jun 19, 2018
@alanshaw alanshaw deleted the pin-api branch June 19, 2018 20:53
@daviddias
Copy link
Member Author

@alanshaw are the pin tests being run https://github.com/ipfs/js-ipfs/tree/master/test/core ? doens't seem like it.

@alanshaw
Copy link
Member

🤦‍♂️ oh no it doesn't

@daviddias
Copy link
Member Author

Let's see what #1405 says :)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
exp/expert Having worked on the specific codebase is important P1 High: Likely tackled by core team if no one steps up
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants