# String decoder

<!--introduced_in=v0.10.0-->

> Stability: 2 - Stable

<!-- source_link=lib/string_decoder.js -->

The `string_decoder` module provides an API for decoding `Buffer` objects into
strings in a manner that preserves encoded multi-byte UTF-8 and UTF-16
characters. It can be accessed using:

```js
const { StringDecoder } = require('string_decoder');
```

The following example shows the basic use of the `StringDecoder` class.

```js
const { StringDecoder } = require('string_decoder');
const decoder = new StringDecoder('utf8');

// Each buffer holds one complete UTF-8 character, so it is decoded immediately.
const cent = Buffer.from([0xC2, 0xA2]);
console.log(decoder.write(cent)); // Prints: ¢

const euro = Buffer.from([0xE2, 0x82, 0xAC]);
console.log(decoder.write(euro)); // Prints: €
```

When a `Buffer` instance is written to the `StringDecoder` instance, an
internal buffer is used to ensure that the decoded string does not contain
any incomplete multi-byte characters. These are held in the buffer until the
next call to `stringDecoder.write()` or until `stringDecoder.end()` is called.

In the following example, the three UTF-8 encoded bytes of the European Euro
symbol (`€`) are written over three separate operations:

```js
const { StringDecoder } = require('string_decoder');
const decoder = new StringDecoder('utf8');

decoder.write(Buffer.from([0xE2])); // Returns '': the character is incomplete
decoder.write(Buffer.from([0x82])); // Returns '': still incomplete
console.log(decoder.end(Buffer.from([0xAC]))); // Prints: €
```

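To make the buffering visible, the following sketch decodes the same three
bytes twice: once by calling `buf.toString()` on each chunk independently, and
once through a `StringDecoder`. The variable names are illustrative only.

```js
const { StringDecoder } = require('string_decoder');

// The three UTF-8 bytes of '€', delivered one byte per chunk.
const chunks = [[0xE2], [0x82], [0xAC]].map((bytes) => Buffer.from(bytes));

// Decoding each chunk on its own cannot reassemble the character:
// every lone byte is invalid UTF-8 and becomes U+FFFD.
const naive = chunks.map((chunk) => chunk.toString('utf8')).join('');

// A StringDecoder holds the partial character until it is complete.
const decoder = new StringDecoder('utf8');
const pieces = chunks.map((chunk) => decoder.write(chunk));

console.log(naive === '€');   // false
console.log(pieces);          // [ '', '', '€' ]
console.log(pieces.join('')); // €
```

The first two calls return empty strings because the bytes seen so far do not
yet form a complete character.
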
## Class: `StringDecoder`

### `new StringDecoder([encoding])`
<!-- YAML
added: v0.1.99
-->

* `encoding` {string} The character [encoding][] the `StringDecoder` will use.
  **Default:** `'utf8'`.

Creates a new `StringDecoder` instance.

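As a minimal sketch of passing a non-default encoding, the example below
creates a `'utf16le'` decoder and feeds it a surrogate pair split across two
writes. The code point chosen (U+1F600) and the variable names are arbitrary
illustrations.

```js
const { StringDecoder } = require('string_decoder');

const decoder = new StringDecoder('utf16le');

// U+1F600 (😀) is the surrogate pair D83D DE00, which is the byte
// sequence 3D D8 00 DE in UTF-16LE.
const first = decoder.write(Buffer.from([0x3D, 0xD8]));  // high surrogate only
const second = decoder.write(Buffer.from([0x00, 0xDE])); // low surrogate arrives

console.log(JSON.stringify(first)); // ""
console.log(second);                // 😀
```

The high surrogate is held back until the low surrogate arrives, so the
returned strings never contain half of a pair.
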
### `stringDecoder.end([buffer])`
<!-- YAML
added: v0.9.3
-->

* `buffer` {Buffer|TypedArray|DataView} A `Buffer`, `TypedArray`, or
  `DataView` containing the bytes to decode.
* Returns: {string}

Returns any remaining input stored in the internal buffer as a string. Bytes
representing incomplete UTF-8 and UTF-16 characters will be replaced with
substitution characters appropriate for the character encoding.

If the `buffer` argument is provided, one final call to `stringDecoder.write()`
is performed before returning the remaining input.
After `end()` is called, the `stringDecoder` object can be reused for new input.

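The following sketch illustrates both behaviors described above: substitution
of an incomplete character when the input ends early, and reuse of the same
instance afterwards. The variable names `tail` and `reused` are illustrative
only, and U+FFFD is assumed to be the substitution character for UTF-8.

```js
const { StringDecoder } = require('string_decoder');

const decoder = new StringDecoder('utf8');

// Only two of the three bytes of '€' arrive before the input ends.
decoder.write(Buffer.from([0xE2, 0x82]));
const tail = decoder.end();
console.log(tail.includes('\uFFFD')); // true: the partial character was substituted

// The internal buffer is now empty, so the same instance decodes new input.
const reused = decoder.write(Buffer.from([0xC2, 0xA2]));
console.log(reused); // ¢
```
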
### `stringDecoder.write(buffer)`
<!-- YAML
added: v0.1.99
changes:
  - version: v8.0.0
    pr-url: https://github.com/nodejs/node/pull/9618
    description: Each invalid character is now replaced by a single replacement
                 character instead of one for each individual byte.
-->

* `buffer` {Buffer|TypedArray|DataView} A `Buffer`, `TypedArray`, or
  `DataView` containing the bytes to decode.
* Returns: {string}

Returns a decoded string, ensuring that any incomplete multi-byte characters at
the end of the `Buffer`, `TypedArray`, or `DataView` are omitted from the
returned string and stored in an internal buffer for the next call to
`stringDecoder.write()` or `stringDecoder.end()`.

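As a short sketch of the accepted input types, the example below writes a
`Uint8Array` that ends partway through a character, then a `DataView` carrying
the rest. The complete leading character is returned immediately, and the
trailing partial byte is held for the next call. The variable names `head` and
`rest` are illustrative only.

```js
const { StringDecoder } = require('string_decoder');

const decoder = new StringDecoder('utf8');

// 'a' followed by the first byte of '€': only the complete character is returned.
const head = decoder.write(new Uint8Array([0x61, 0xE2]));

// The remaining two bytes of '€' arrive in a DataView, followed by 'b'.
const rest = decoder.write(new DataView(new Uint8Array([0x82, 0xAC, 0x62]).buffer));

console.log(head); // a
console.log(rest); // €b
```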
[encoding]: buffer.md#buffer_buffers_and_character_encodings