# cacache [![npm version](https://img.shields.io/npm/v/cacache.svg)](https://npm.im/cacache) [![license](https://img.shields.io/npm/l/cacache.svg)](https://npm.im/cacache) [![Travis](https://img.shields.io/travis/npm/cacache.svg)](https://travis-ci.org/npm/cacache) [![AppVeyor](https://ci.appveyor.com/api/projects/status/github/npm/cacache?svg=true)](https://ci.appveyor.com/project/npm/cacache) [![Coverage Status](https://coveralls.io/repos/github/npm/cacache/badge.svg?branch=latest)](https://coveralls.io/github/npm/cacache?branch=latest)

[`cacache`](https://github.com/npm/cacache) is a Node.js library for managing
local key and content address caches. It's really fast, really good at
concurrency, and it will never give you corrupted data, even if cache files
get corrupted or manipulated.

On systems that support user and group settings on files, cacache will
match the `uid` and `gid` values to the folder where the cache lives, even
when running as `root`.

It was written to be used as [npm](https://npm.im)'s local cache, but can
just as easily be used on its own.

_Translations: [español](README.es.md)_

## Install

`$ npm install --save cacache`

## Table of Contents

* [Example](#example)
* [Features](#features)
* [Contributing](#contributing)
* [API](#api)
  * [Using localized APIs](#localized-api)
  * Reading
    * [`ls`](#ls)
    * [`ls.stream`](#ls-stream)
    * [`get`](#get-data)
    * [`get.stream`](#get-stream)
    * [`get.info`](#get-info)
    * [`get.hasContent`](#get-hasContent)
  * Writing
    * [`put`](#put-data)
    * [`put.stream`](#put-stream)
    * [`put*` opts](#put-options)
    * [`rm.all`](#rm-all)
    * [`rm.entry`](#rm-entry)
    * [`rm.content`](#rm-content)
  * Utilities
    * [`setLocale`](#set-locale)
    * [`clearMemoized`](#clear-memoized)
    * [`tmp.mkdir`](#tmp-mkdir)
    * [`tmp.fix`](#tmp-fix)
    * [`tmp.withTmp`](#with-tmp)
  * Integrity
    * [Subresource Integrity](#integrity)
    * [`verify`](#verify)
    * [`verify.lastRun`](#verify-last-run)

### Example

```javascript
const cacache = require('cacache/en')
const fs = require('fs')

const tarball = '/path/to/mytar.tgz'
const cachePath = '/tmp/my-toy-cache'
const key = 'my-unique-key-1234'

// Cache it! Use `cachePath` as the root of the content cache
cacache.put(cachePath, key, '10293801983029384').then(integrity => {
  console.log(`Saved content to ${cachePath}.`)
})

const destination = '/tmp/mytar.tgz'

// Copy the contents out of the cache and into their destination!
// But this time, use a stream instead!
cacache.get.stream(
  cachePath, key
).pipe(
  fs.createWriteStream(destination)
).on('finish', () => {
  console.log('done extracting!')
})

// The same thing, but skip the key index.
// `integrityHash` is an integrity string, such as the one `cacache.put` resolves with.
cacache.get.byDigest(cachePath, integrityHash).then(data => {
  fs.writeFile(destination, data, err => {
    if (err) throw err
    console.log('tarball data fetched based on its sha512sum and written out!')
  })
})
```

### Features

* Extraction by key or by content address (shasum, etc)
* [Subresource Integrity](#integrity) web standard support
* Multi-hash support - safely host sha1, sha512, etc, in a single cache
* Automatic content deduplication
* Fault tolerance (immune to corruption, partial writes, process races, etc)
* Consistency guarantees on read and write (full data verification)
* Lockless, high-concurrency cache access
* Streaming support
* Promise support
* Pretty darn fast -- sub-millisecond reads and writes including verification
* Arbitrary metadata storage
* Garbage collection and additional offline verification
* Thorough test coverage
* There's probably a bloom filter in there somewhere. Those are cool, right?

### Contributing

The cacache team enthusiastically welcomes contributions and project participation! There are a bunch of things you can do if you want to contribute! The [Contributor Guide](CONTRIBUTING.md) has all the information you need for everything from reporting bugs to contributing entire new features. Please don't hesitate to jump in if you'd like to, or even ask us questions if something isn't clear.

All participants and maintainers in this project are expected to follow the [Code of Conduct](CODE_OF_CONDUCT.md), and just generally be excellent to each other.

Please refer to the [Changelog](CHANGELOG.md) for project history details, too.

Happy hacking!

### API

#### <a name="localized-api"></a> Using localized APIs

cacache includes a complete API in English, with the same features as other
translations. To use the English API as documented in this README, use
`require('cacache/en')`.
This is also currently the default if you do
`require('cacache')`, but that may change in the future.

cacache also supports other languages! You can find the list of currently
supported ones by looking in `./locales` in the source directory. You can use
the API in that language with `require('cacache/<lang>')`.

Want to add support for a new language? Please go ahead! You should be able to
copy `./locales/en.js` and `./locales/en.json` and fill them in. Translating the
`README.md` is a bit more work, but also appreciated if you get around to it.

#### <a name="ls"></a> `> cacache.ls(cache) -> Promise<Object>`

Lists info for all entries currently in the cache as a single large object. Each
entry in the object will be keyed by the unique index key, with corresponding
[`get.info`](#get-info) objects as the values.

##### Example

```javascript
cacache.ls(cachePath).then(console.log)
// Output
{
  'my-thing': {
    key: 'my-thing',
    integrity: 'sha512-BaSe64/EnCoDED+HAsh==',
    path: '.testcache/content/deadbeef', // joined with `cachePath`
    time: 12345698490,
    size: 4023948,
    metadata: {
      name: 'blah',
      version: '1.2.3',
      description: 'this was once a package but now it is my-thing'
    }
  },
  'other-thing': {
    key: 'other-thing',
    integrity: 'sha1-ANothER+hasH=',
    path: '.testcache/content/bada55',
    time: 11992309289,
    size: 111112
  }
}
```

#### <a name="ls-stream"></a> `> cacache.ls.stream(cache) -> Readable`

Lists info for all entries currently in the cache, one entry at a time.

This works just like [`ls`](#ls), except [`get.info`](#get-info) entries are
returned as `'data'` events on the returned stream.
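
Since each `'data'` event carries one entry, the single object that `ls` resolves with can be rebuilt from the stream. The following is a minimal sketch of that relationship, not cacache's own code: `collectEntries` is a hypothetical helper, and `fakeLsStream` is a stand-in for `cacache.ls.stream(cachePath)` so that the snippet runs without a cache on disk.

```javascript
const { Readable } = require('stream')

// Hypothetical helper: gathers `'data'` events from an object-mode stream
// into one object keyed by each entry's `key`.
function collectEntries (entryStream) {
  return new Promise((resolve, reject) => {
    const entries = {}
    entryStream
      .on('data', entry => { entries[entry.key] = entry })
      .on('error', reject)
      .on('end', () => resolve(entries))
  })
}

// Stand-in for `cacache.ls.stream(cachePath)` (assumption for this sketch).
const fakeLsStream = Readable.from([
  { key: 'my-thing', integrity: 'sha512-BaSe64HaSh', size: 13423 },
  { key: 'other-thing', integrity: 'sha1-ANothER+hasH=', size: 111112 }
])

collectEntries(fakeLsStream).then(entries => {
  console.log(Object.keys(entries))
})
```

Streaming is the better fit when the cache holds many entries, because each entry can be handled and dropped instead of being held in one large object.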

##### Example

```javascript
cacache.ls.stream(cachePath).on('data', console.log)
// Output
{
  key: 'my-thing',
  integrity: 'sha512-BaSe64HaSh',
  path: '.testcache/content/deadbeef', // joined with `cachePath`
  time: 12345698490,
  size: 13423,
  metadata: {
    name: 'blah',
    version: '1.2.3',
    description: 'this was once a package but now it is my-thing'
  }
}

{
  key: 'other-thing',
  integrity: 'whirlpool-WoWSoMuchSupport',
  path: '.testcache/content/bada55',
  time: 11992309289,
  size: 498023984029
}

{
  ...
}
```

#### <a name="get-data"></a> `> cacache.get(cache, key, [opts]) -> Promise({data, metadata, integrity})`

Returns an object with the cached data, digest, and metadata identified by
`key`. The `data` property of this object will be a `Buffer` instance that
presumably holds some data that means something to you. I'm sure you know what
to do with it! cacache just won't care.

`integrity` is a [Subresource Integrity](#integrity) string. That is, a string
that can be used to verify `data`, which looks like
`<hash-algorithm>-<base64-integrity-hash>`.

If there is no content identified by `key`, or if the locally-stored data does
not pass the validity checksum, the promise will be rejected.

A sub-function, `get.byDigest`, may be used for identical behavior, except lookup
will happen by integrity hash, bypassing the index entirely. This version of the
function *only* returns `data` itself, without any wrapper.

##### Note

This function loads the entire cache entry into memory before returning it. If
you're dealing with Very Large data, consider using [`get.stream`](#get-stream)
instead.
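
To make the `<hash-algorithm>-<base64-integrity-hash>` format concrete, here is a rough sketch of checking a `Buffer` against an integrity string by hand, using only Node's `crypto` module. `matchesIntegrity` is a hypothetical helper written for illustration; it is not how cacache performs its own checks, and it ignores the optional `?options` suffix that SRI strings may carry.

```javascript
const crypto = require('crypto')

// Hypothetical helper: recompute the digest of `data` and compare it with
// the base64 digest embedded in `integrity`.
function matchesIntegrity (data, integrity) {
  // Base64 never contains '-', so the last '-' separates algorithm and digest.
  const sep = integrity.lastIndexOf('-')
  if (sep === -1) return false
  const algorithm = integrity.slice(0, sep)
  const expected = integrity.slice(sep + 1)
  const actual = crypto.createHash(algorithm).update(data).digest('base64')
  return actual === expected
}

const data = Buffer.from('foobarbaz')
const integrity =
  'sha512-' + crypto.createHash('sha512').update(data).digest('base64')

console.log(matchesIntegrity(data, integrity)) // true
console.log(matchesIntegrity(Buffer.from('tampered'), integrity)) // false
```

A failed comparison like the second one corresponds to the case where the promise returned by `get` is rejected.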

##### Example

```javascript
// Look up by key
cacache.get(cachePath, 'my-thing').then(console.log)
// Output:
{
  metadata: {
    thingName: 'my'
  },
  integrity: 'sha512-BaSe64HaSh',
  data: Buffer#<deadbeef>,
  size: 9320
}

// Look up by digest
cacache.get.byDigest(cachePath, 'sha512-BaSe64HaSh').then(console.log)
// Output:
Buffer#<deadbeef>
```

#### <a name="get-stream"></a> `> cacache.get.stream(cache, key, [opts]) -> Readable`

Returns a [Readable Stream](https://nodejs.org/api/stream.html#stream_readable_streams) of the cached data identified by `key`.

If there is no content identified by `key`, or if the locally-stored data does
not pass the validity checksum, an error will be emitted.

`metadata` and `integrity` events will be emitted before the stream closes, if
you need to collect that extra data about the cached entry.

A sub-function, `get.stream.byDigest`, may be used for identical behavior,
except lookup will happen by integrity hash, bypassing the index entirely. This
version does not emit the `metadata` and `integrity` events at all.

##### Example

```javascript
// Look up by key
cacache.get.stream(
  cachePath, 'my-thing'
).on('metadata', metadata => {
  console.log('metadata:', metadata)
}).on('integrity', integrity => {
  console.log('integrity:', integrity)
}).pipe(
  fs.createWriteStream('./x.tgz')
)
// Outputs:
metadata: { ... }
integrity: 'sha512-SoMeDIGest+64=='

// Look up by digest
cacache.get.stream.byDigest(
  cachePath, 'sha512-SoMeDIGest+64=='
).pipe(
  fs.createWriteStream('./x.tgz')
)
```

#### <a name="get-info"></a> `> cacache.get.info(cache, key) -> Promise`

Looks up `key` in the cache index, returning information about the entry if
one exists.

##### Fields

* `key` - Key the entry was looked up under. Matches the `key` argument.
* `integrity` - [Subresource Integrity hash](#integrity) for the content this entry refers to.
* `path` - Filesystem path where content is stored, joined with `cache` argument.
* `time` - Timestamp the entry was first added on.
* `metadata` - User-assigned metadata associated with the entry/content.

##### Example

```javascript
cacache.get.info(cachePath, 'my-thing').then(console.log)

// Output
{
  key: 'my-thing',
  integrity: 'sha256-MUSTVERIFY+ALL/THINGS==',
  path: '.testcache/content/deadbeef',
  time: 12345698490,
  size: 849234,
  metadata: {
    name: 'blah',
    version: '1.2.3',
    description: 'this was once a package but now it is my-thing'
  }
}
```

#### <a name="get-hasContent"></a> `> cacache.get.hasContent(cache, integrity) -> Promise`

Looks up a [Subresource Integrity hash](#integrity) in the cache. If content
exists for this `integrity`, it will return an object with the specific single
integrity hash that was found under the `sri` key, and the size of the found
content as `size`. If no content exists for this integrity, it will return `false`.

##### Example

```javascript
cacache.get.hasContent(cachePath, 'sha256-MUSTVERIFY+ALL/THINGS==').then(console.log)

// Output
{
  sri: {
    source: 'sha256-MUSTVERIFY+ALL/THINGS==',
    algorithm: 'sha256',
    digest: 'MUSTVERIFY+ALL/THINGS==',
    options: []
  },
  size: 9001
}

cacache.get.hasContent(cachePath, 'sha512-NOT+IN/CACHE==').then(console.log)

// Output
false
```

#### <a name="put-data"></a> `> cacache.put(cache, key, data, [opts]) -> Promise`

Inserts data passed to it into the cache. The returned Promise resolves with a
digest (generated according to [`opts.algorithms`](#optsalgorithms)) after the
cache entry has been successfully written.
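
As a way to picture how a key index and a content-addressed store fit together, here is a toy in-memory model. It is only a sketch under stated assumptions: `ToyCache` and its `index` and `content` maps are hypothetical names, nothing is written to disk, and it is not how cacache is implemented. It does show why a `put` can resolve with a digest and why identical content stored under two keys is kept only once.

```javascript
const crypto = require('crypto')

// Toy model (NOT cacache's implementation): `index` maps keys to entries,
// `content` maps integrity strings to the stored bytes.
class ToyCache {
  constructor () {
    this.index = new Map()
    this.content = new Map()
  }

  put (key, data, opts = {}) {
    const algorithm = (opts.algorithms || ['sha512'])[0]
    const buf = Buffer.from(data)
    const integrity = `${algorithm}-` +
      crypto.createHash(algorithm).update(buf).digest('base64')
    // Identical bytes hash to the same address, so they are stored once.
    this.content.set(integrity, buf)
    this.index.set(key, { key, integrity, metadata: opts.metadata })
    return Promise.resolve(integrity)
  }
}

const toy = new ToyCache()
Promise.all([
  toy.put('a', 'same bytes'),
  toy.put('b', 'same bytes', { metadata: { name: 'blah' } })
]).then(([first, second]) => {
  console.log(first === second) // true
  console.log(toy.content.size) // 1
})
```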

##### Example

```javascript
// Assumes a WHATWG-style `fetch` (global in recent Node.js versions)
fetch(
  'https://registry.npmjs.org/cacache/-/cacache-1.0.0.tgz'
).then(res => res.arrayBuffer()).then(body => {
  // `fetch` resolves with a Response, so read the body into a Buffer first
  return cacache.put(cachePath, 'registry.npmjs.org|cacache@1.0.0', Buffer.from(body))
}).then(integrity => {
  console.log('integrity hash is', integrity)
})
```

#### <a name="put-stream"></a> `> cacache.put.stream(cache, key, [opts]) -> Writable`

Returns a [Writable
Stream](https://nodejs.org/api/stream.html#stream_writable_streams) that inserts
data written to it into the cache. Emits an `integrity` event with the digest of
written contents when it succeeds.

##### Example

```javascript
request.get(
  'https://registry.npmjs.org/cacache/-/cacache-1.0.0.tgz'
).pipe(
  cacache.put.stream(
    cachePath, 'registry.npmjs.org|cacache@1.0.0'
  ).on('integrity', d => console.log(`integrity digest is ${d}`))
)
```

#### <a name="put-options"></a> `> cacache.put options`

`cacache.put` functions have a number of options in common.

##### `opts.metadata`

Arbitrary metadata to be attached to the inserted key.

##### `opts.size`

If provided, the data stream will be verified to check that enough data was
passed through. If there's more or less data than expected, insertion will fail
with an `EBADSIZE` error.

##### `opts.integrity`

If present, the pre-calculated digest for the inserted content. If this option
is provided and does not match the post-insertion digest, insertion will fail
with an `EINTEGRITY` error.

`algorithms` has no effect if this option is present.

##### `opts.algorithms`

Default: ['sha512']

Hashing algorithms to use when calculating the [subresource integrity
digest](#integrity)
for inserted data. Can use any algorithm listed in `crypto.getHashes()` or
`'omakase'`/`'お任せします'` to pick a random hash algorithm on each insertion.
You may also use any anagram of `'modnar'` to use this feature.

Currently only supports one algorithm at a time (i.e., an array length of
exactly `1`). Has no effect if `opts.integrity` is present.

##### `opts.memoize`

Default: null

If provided, cacache will memoize the given cache insertion in memory, bypassing
any filesystem checks for that key or digest in future cache fetches. Nothing
will be written to the in-memory cache unless this option is explicitly truthy.

If `opts.memoize` is an object or a `Map`-like (that is, an object with `get`
and `set` methods), it will be written to instead of the global memoization
cache.

Reading data from disk can be forced by explicitly passing `memoize: false` to
the reader functions, but their default will be to read from memory.

#### <a name="rm-all"></a> `> cacache.rm.all(cache) -> Promise`

Clears the entire cache, mainly by blowing away the cache directory itself.

##### Example

```javascript
cacache.rm.all(cachePath).then(() => {
  console.log('THE APOCALYPSE IS UPON US')
})
```

#### <a name="rm-entry"></a> `> cacache.rm.entry(cache, key) -> Promise`

Alias: `cacache.rm`

Removes the index entry for `key`. Content will still be accessible if
requested directly by content address ([`get.stream.byDigest`](#get-stream)).

To remove the content itself (which might still be used by other entries), use
[`rm.content`](#rm-content). Or, to safely vacuum any unused content, use
[`verify`](#verify).

##### Example

```javascript
cacache.rm.entry(cachePath, 'my-thing').then(() => {
  console.log('I did not like it anyway')
})
```

#### <a name="rm-content"></a> `> cacache.rm.content(cache, integrity) -> Promise`

Removes the content identified by `integrity`.
Any index entries referring to it
will not be usable again until the content is re-added to the cache with an
identical digest.

##### Example

```javascript
cacache.rm.content(cachePath, 'sha512-SoMeDIGest/IN+BaSE64==').then(() => {
  console.log('data for my-thing is gone!')
})
```

#### <a name="set-locale"></a> `> cacache.setLocale(locale)`

Configure the language/locale used for messages and errors coming from cacache.
The list of available locales is in the `./locales` directory in the project
root.

_Interested in contributing more languages? [Submit a PR](CONTRIBUTING.md)!_

#### <a name="clear-memoized"></a> `> cacache.clearMemoized()`

Completely resets the in-memory entry cache.

#### <a name="tmp-mkdir"></a> `> tmp.mkdir(cache, opts) -> Promise<Path>`

Returns a unique temporary directory inside the cache's `tmp` dir. This
directory will use the same safe user assignment that the rest of the cache
uses.

Once the directory is made, it's the user's responsibility to make sure that
all files within are given the appropriate `gid`/`uid` ownership settings to
match the rest of the cache. If not, you can ask cacache to do it for you by
calling [`tmp.fix()`](#tmp-fix), which will fix all tmp directory
permissions.

If you want automatic cleanup of this directory, use
[`tmp.withTmp()`](#with-tmp).

##### Example

```javascript
cacache.tmp.mkdir(cache).then(dir => {
  fs.writeFile(path.join(dir, 'blablabla'), Buffer#<1234>, ...)
})
```

#### <a name="tmp-fix"></a> `> tmp.fix(cache) -> Promise`

Sets the `uid` and `gid` properties on all files and folders within the tmp
folder to match the rest of the cache.

Use this after manually writing files into [`tmp.mkdir`](#tmp-mkdir) or
[`tmp.withTmp`](#with-tmp).
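
For a feel of what matching ownership involves, here is a rough sketch that walks a directory and aligns each entry's `uid`/`gid` with a reference folder. `matchOwnership` and the `toy-cache-` directory are hypothetical and exist only for illustration; this is not cacache's implementation. The sketch only calls `chown` when ownership actually differs, and it skips symbolic links.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical helper, not cacache's code: give every entry under `dir` the
// `uid`/`gid` of `referenceDir`. Returns the paths whose ownership changed.
function matchOwnership (dir, referenceDir) {
  const { uid, gid } = fs.statSync(referenceDir)
  const changed = []
  const visit = target => {
    const stat = fs.lstatSync(target)
    if (stat.isSymbolicLink()) return // leave symlinks alone in this sketch
    if (stat.uid !== uid || stat.gid !== gid) {
      fs.chownSync(target, uid, gid) // may need elevated privileges
      changed.push(target)
    }
    if (stat.isDirectory()) {
      for (const name of fs.readdirSync(target)) {
        visit(path.join(target, name))
      }
    }
  }
  visit(dir)
  return changed
}

// Throwaway layout standing in for a cache folder and its `tmp` dir.
const cacheDir = fs.mkdtempSync(path.join(os.tmpdir(), 'toy-cache-'))
const tmpDir = path.join(cacheDir, 'tmp')
fs.mkdirSync(tmpDir)
fs.writeFileSync(path.join(tmpDir, 'file'), 'some data')

console.log(matchOwnership(tmpDir, cacheDir))
```

When everything was created by the same user, nothing needs changing. The interesting case is a process running as `root` writing into a cache owned by a regular user.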

##### Example

```javascript
cacache.tmp.mkdir(cache).then(dir => {
  writeFile(path.join(dir, 'file'), someData).then(() => {
    // make sure we didn't just put a root-owned file in the cache
    cacache.tmp.fix(cache).then(() => {
      // all uids and gids match now
    })
  })
})
```

#### <a name="with-tmp"></a> `> tmp.withTmp(cache, opts, cb) -> Promise`

Creates a temporary directory with [`tmp.mkdir()`](#tmp-mkdir) and calls `cb`
with it. The created temporary directory will be removed when the return value
of `cb()` resolves -- that is, if you return a Promise from `cb()`, the tmp
directory will be automatically deleted once that promise completes.

The same caveats apply when it comes to managing permissions for the tmp dir's
contents.

##### Example

```javascript
cacache.tmp.withTmp(cache, dir => {
  return fs.writeFileAsync(path.join(dir, 'blablabla'), Buffer#<1234>, ...)
}).then(() => {
  // `dir` no longer exists
})
```

#### <a name="integrity"></a> Subresource Integrity Digests

For content verification and addressing, cacache uses strings following the
[Subresource
Integrity spec](https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity).
That is, any time cacache expects an `integrity` argument or option, it
should be in the format `<hashAlgorithm>-<base64-hash>`.

One deviation from the current spec is that cacache will support any hash
algorithms supported by the underlying Node.js process. You can use
`crypto.getHashes()` to see which ones you can use.

##### Generating Digests Yourself

Existing content shasums are generally formatted as a
hexadecimal string (that is, a sha1 would look like:
`5f5513f8822fdbe5145af33b64d8d970dcf95c6e`). In order to be compatible with
cacache, you'll need to convert this to an equivalent subresource integrity
string.
For this example, the corresponding hash would be:
`sha1-X1UT+IIv2+UUWvM7ZNjZcNz5XG4=`.

If you want to generate an integrity string yourself for existing data, you can
use something like this:

```javascript
const crypto = require('crypto')
const hashAlgorithm = 'sha512'
const data = 'foobarbaz'

const integrity = (
  hashAlgorithm +
  '-' +
  crypto.createHash(hashAlgorithm).update(data).digest('base64')
)
```

You can also use [`ssri`](https://npm.im/ssri) for a richer set of functionality
around SRI strings, including generation, parsing, and translating from existing
hex-formatted strings.

#### <a name="verify"></a> `> cacache.verify(cache, opts) -> Promise`

Checks out and fixes up your cache:

* Cleans up corrupted or invalid index entries.
* Custom entry filtering options.
* Garbage collects any content entries not referenced by the index.
* Checks integrity for all content entries and removes invalid content.
* Fixes cache ownership.
* Removes the `tmp` directory in the cache and all its contents.

When it's done, it'll return an object with various stats about the verification
process, including amount of storage reclaimed, number of valid entries, number
of entries removed, etc.

##### Options

* `opts.filter` - receives a formatted entry. Return false to remove it.
  Note: might be called more than once on the same entry.

##### Example

```sh
echo somegarbage >> $CACHEPATH/content/deadbeef
```

```javascript
cacache.verify(cachePath).then(stats => {
  // deadbeef collected, because of invalid checksum.
  console.log('cache is much nicer now! stats:', stats)
})
```

#### <a name="verify-last-run"></a> `> cacache.verify.lastRun(cache) -> Promise`

Returns a `Date` representing the last time `cacache.verify` was run on `cache`.

##### Example

```javascript
cacache.verify(cachePath).then(() => {
  cacache.verify.lastRun(cachePath).then(lastTime => {
    console.log('cacache.verify was last called on ' + lastTime)
  })
})
```