# Stream

In RapidJSON, `rapidjson::Stream` is a concept for reading/writing JSON. Here we first show how to use the streams provided, and then see how to create a custom stream.

[TOC]

# Memory Streams {#MemoryStreams}

Memory streams store JSON in memory.

## StringStream (Input) {#StringStream}

`StringStream` is the most basic input stream. It represents a complete, read-only JSON stored in memory. It is defined in `rapidjson/rapidjson.h`.

~~~~~~~~~~cpp
#include "rapidjson/document.h" // will include "rapidjson/rapidjson.h"

using namespace rapidjson;

// ...
const char json[] = "[1, 2, 3, 4]";
StringStream s(json);

Document d;
d.ParseStream(s);
~~~~~~~~~~

Since this is a very common usage, `Document::Parse(const char*)` is provided to do exactly the same as above:

~~~~~~~~~~cpp
// ...
const char json[] = "[1, 2, 3, 4]";
Document d;
d.Parse(json);
~~~~~~~~~~

Note that `StringStream` is a typedef of `GenericStringStream<UTF8<> >`; users may use other encodings to represent the character set of the stream.

## StringBuffer (Output) {#StringBuffer}

`StringBuffer` is a simple output stream. It allocates a memory buffer for writing the whole JSON. Use `GetString()` to obtain the buffer.

~~~~~~~~~~cpp
#include "rapidjson/stringbuffer.h"
#include "rapidjson/writer.h" // Writer

using namespace rapidjson;

// d is a Document that has already been populated
StringBuffer buffer;
Writer<StringBuffer> writer(buffer);
d.Accept(writer);

const char* output = buffer.GetString();
~~~~~~~~~~

When the buffer is full, it increases its capacity automatically. The default capacity is 256 characters (256 bytes for UTF8, 512 bytes for UTF16, etc.). The user can provide an allocator and an initial capacity.

~~~~~~~~~~cpp
StringBuffer buffer1(0, 1024); // Use its allocator, initial size = 1024
StringBuffer buffer2(allocator, 1024); // allocator: pointer to a user-provided allocator
~~~~~~~~~~

By default, `StringBuffer` will instantiate an internal allocator.

Similarly, `StringBuffer` is a typedef of `GenericStringBuffer<UTF8<> >`.
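
To make the roles of the two memory streams more concrete, the following is a minimal, dependency-free sketch of a read-only input stream and a growable output buffer. `MiniStringStream`, `MiniStringBuffer` and `CopyAll` are hypothetical names invented for this illustration; they are not RapidJSON classes and only model the behavior described above, not the library's actual implementation. Growth of the output buffer is delegated to `std::string`, which is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical read-only input stream over a null-terminated string.
class MiniStringStream {
public:
    typedef char Ch;

    explicit MiniStringStream(const Ch* src) : src_(src), head_(src) {
        assert(src != 0);
    }

    Ch Peek() const { return *src_; }   // look at the current character
    Ch Take() { return *src_++; }       // return it and advance the read cursor
    std::size_t Tell() const { return static_cast<std::size_t>(src_ - head_); }

private:
    const Ch* src_;   // current read position
    const Ch* head_;  // start of the string
};

// Hypothetical growable output buffer; std::string performs the reallocation.
class MiniStringBuffer {
public:
    typedef char Ch;

    explicit MiniStringBuffer(std::size_t capacity = 256) { data_.reserve(capacity); }

    void Put(Ch c) { data_.push_back(c); }  // grows automatically when full
    void Flush() {}                          // nothing to flush for memory
    const Ch* GetString() const { return data_.c_str(); }
    std::size_t GetSize() const { return data_.size(); }

private:
    std::string data_;
};

// Copies every character up to (not including) the terminating '\0'
// and returns the number of characters read.
inline std::size_t CopyAll(MiniStringStream& is, MiniStringBuffer& os) {
    while (is.Peek() != '\0')
        os.Put(is.Take());
    os.Flush();
    return is.Tell();
}
```

Constructing a `MiniStringStream` over `"[1, 2, 3, 4]"` and passing it to `CopyAll` together with a default-constructed `MiniStringBuffer` leaves the same text available through `GetString()`.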

# File Streams {#FileStreams}

When parsing a JSON from file, you may read the whole JSON into memory and use `StringStream` above.

However, if the JSON is big, or memory is limited, you can use `FileReadStream`. It reads only a part of the JSON from the file into a buffer and lets that part be parsed. When the buffer runs out of characters, it reads the next part from the file.

## FileReadStream (Input) {#FileReadStream}

`FileReadStream` reads the file via a `FILE` pointer. The user needs to provide a buffer.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filereadstream.h"
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("big.json", "rb"); // non-Windows use "r"

char readBuffer[65536];
FileReadStream is(fp, readBuffer, sizeof(readBuffer));

Document d;
d.ParseStream(is);

fclose(fp);
~~~~~~~~~~

Unlike string streams, `FileReadStream` is a byte stream. It does not handle encodings. If the file is not UTF-8, the byte stream can be wrapped in an `EncodedInputStream`, which is discussed in the Encoded Streams section below.

Apart from reading files, the user can also use `FileReadStream` to read `stdin`.

## FileWriteStream (Output) {#FileWriteStream}

`FileWriteStream` is a buffered output stream. Its usage is very similar to `FileReadStream`.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filewritestream.h"
#include "rapidjson/writer.h" // Writer
#include <cstdio>

using namespace rapidjson;

Document d;
d.Parse(json);
// ...

FILE* fp = fopen("output.json", "wb"); // non-Windows use "w"

char writeBuffer[65536];
FileWriteStream os(fp, writeBuffer, sizeof(writeBuffer));

Writer<FileWriteStream> writer(os);
d.Accept(writer);

fclose(fp);
~~~~~~~~~~

It can also direct the output to `stdout`.

# Encoded Streams {#EncodedStreams}

Encoded streams do not contain JSON themselves; they wrap byte streams to provide basic encoding/decoding functions.

As mentioned above, UTF-8 byte streams can be read directly. However, UTF-16 and UTF-32 have endianness issues. To handle endianness correctly, bytes must be converted into characters (e.g. `wchar_t` for UTF-16) while reading, and characters into bytes while writing.

Besides, the stream also needs to handle the [byte order mark (BOM)](http://en.wikipedia.org/wiki/Byte_order_mark). When reading from a byte stream, the BOM must be detected, or simply consumed, if it exists. When writing to a byte stream, a BOM can optionally be written.

If the encoding of the stream is known at compile time, you may use `EncodedInputStream` and `EncodedOutputStream`. If the stream can be UTF-8, UTF-16LE, UTF-16BE, UTF-32LE or UTF-32BE JSON, and the encoding is only known at runtime, you may use `AutoUTFInputStream` and `AutoUTFOutputStream`. These streams are defined in `rapidjson/encodedstream.h`.

Note that these encoded streams can be applied to streams other than files. For example, you may have a file in memory, or a custom byte stream, wrapped in encoded streams.

## EncodedInputStream {#EncodedInputStream}

`EncodedInputStream` has two template parameters. The first one is an `Encoding` class, such as `UTF8` or `UTF16LE`, defined in `rapidjson/encodings.h`. The second one is the class of the stream to be wrapped.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filereadstream.h"   // FileReadStream
#include "rapidjson/encodedstream.h"    // EncodedInputStream
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("utf16le.json", "rb"); // non-Windows use "r"

char readBuffer[256];
FileReadStream bis(fp, readBuffer, sizeof(readBuffer));

EncodedInputStream<UTF16LE<>, FileReadStream> eis(bis);  // wraps bis into eis

Document d; // Document is GenericDocument<UTF8<> >
d.ParseStream<0, UTF16LE<> >(eis);  // Parses UTF-16LE file into UTF-8 in memory

fclose(fp);
~~~~~~~~~~

## EncodedOutputStream {#EncodedOutputStream}

`EncodedOutputStream` is similar, but it has a `bool putBOM` parameter in the constructor, which controls whether a BOM is written to the output byte stream.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filewritestream.h"  // FileWriteStream
#include "rapidjson/encodedstream.h"    // EncodedOutputStream
#include "rapidjson/writer.h"           // Writer
#include <cstdio>

using namespace rapidjson;

Document d;         // Document is GenericDocument<UTF8<> >
// ...

FILE* fp = fopen("output_utf32le.json", "wb"); // non-Windows use "w"

char writeBuffer[256];
FileWriteStream bos(fp, writeBuffer, sizeof(writeBuffer));

typedef EncodedOutputStream<UTF32LE<>, FileWriteStream> OutputStream;
OutputStream eos(bos, true);   // Write BOM

// Source encoding (in memory) is UTF-8, target encoding (in file) is UTF-32LE
Writer<OutputStream, UTF8<>, UTF32LE<> > writer(eos);
d.Accept(writer);   // This generates UTF32-LE file from UTF-8 in memory

fclose(fp);
~~~~~~~~~~

## AutoUTFInputStream {#AutoUTFInputStream}

Sometimes an application may want to handle all supported JSON encodings. `AutoUTFInputStream` first tries to detect the encoding from the BOM. If no BOM is available, it uses characteristics of valid JSON to detect the encoding. If neither method succeeds, it falls back to the UTF type provided in the constructor.

Since the characters (code units) may be 8-bit, 16-bit or 32-bit, `AutoUTFInputStream` requires a character type which can hold at least 32 bits. We may use `unsigned` as the template parameter:

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filereadstream.h"   // FileReadStream
#include "rapidjson/encodedstream.h"    // AutoUTFInputStream
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("any.json", "rb"); // non-Windows use "r"

char readBuffer[256];
FileReadStream bis(fp, readBuffer, sizeof(readBuffer));

AutoUTFInputStream<unsigned, FileReadStream> eis(bis);  // wraps bis into eis

Document d;         // Document is GenericDocument<UTF8<> >
d.ParseStream<0, AutoUTF<unsigned> >(eis); // This parses any UTF file into UTF-8 in memory

fclose(fp);
~~~~~~~~~~

When specifying the encoding of the stream, use `AutoUTF<CharType>` as in `ParseStream()` above.

You can obtain the detected UTF type via `UTFType GetType()`, and check whether a BOM was found with `HasBOM()`.

## AutoUTFOutputStream {#AutoUTFOutputStream}

Similarly, to choose the encoding for output at runtime, we can use `AutoUTFOutputStream`. This class is not automatic *per se*. You need to specify the UTF type and whether to write a BOM at runtime.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filewritestream.h"  // FileWriteStream
#include "rapidjson/encodedstream.h"    // AutoUTFOutputStream
#include "rapidjson/writer.h"           // Writer
#include <cstdio>

using namespace rapidjson;

void WriteJSONFile(FILE* fp, UTFType type, bool putBOM, const Document& d) {
    char writeBuffer[256];
    FileWriteStream bos(fp, writeBuffer, sizeof(writeBuffer));

    typedef AutoUTFOutputStream<unsigned, FileWriteStream> OutputStream;
    OutputStream eos(bos, type, putBOM);

    // The writer must be bound to the stream, and AutoUTF needs the same CharType
    Writer<OutputStream, UTF8<>, AutoUTF<unsigned> > writer(eos);
    d.Accept(writer);
}
~~~~~~~~~~

`AutoUTFInputStream` and `AutoUTFOutputStream` are more convenient than `EncodedInputStream` and `EncodedOutputStream`. They only incur a small runtime overhead.
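
The BOM-based detection step described above for `AutoUTFInputStream` can be pictured with the following minimal sketch. It is not RapidJSON's implementation: `BomKind` and `DetectBom` are hypothetical names introduced only for this illustration, and the sketch covers only the BOM check, not the fallback that inspects the characteristics of valid JSON. The byte sequences are the standard Unicode BOM encodings.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type for this sketch; RapidJSON reports its own UTFType instead.
enum BomKind { kNoBom, kBomUtf8, kBomUtf16LE, kBomUtf16BE, kBomUtf32LE, kBomUtf32BE };

// Returns the BOM found at the start of the n bytes at p, and stores the
// number of bytes the BOM occupies in *bomSize (0 if there is no BOM).
inline BomKind DetectBom(const unsigned char* p, std::size_t n, std::size_t* bomSize) {
    assert(p != 0 && bomSize != 0);
    *bomSize = 0;

    // The 4-byte UTF-32 marks are tested first, because the UTF-32LE BOM
    // (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
    if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00) {
        *bomSize = 4;
        return kBomUtf32LE;
    }
    if (n >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF) {
        *bomSize = 4;
        return kBomUtf32BE;
    }
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF) {
        *bomSize = 3;
        return kBomUtf8;
    }
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE) {
        *bomSize = 2;
        return kBomUtf16LE;
    }
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF) {
        *bomSize = 2;
        return kBomUtf16BE;
    }
    return kNoBom;  // caller falls back to another detection method or a default
}
```

A reader built on such a function would skip `*bomSize` bytes before decoding, and would fall back to another method or to a caller-supplied default when `kNoBom` is returned.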

# Custom Stream {#CustomStream}

In addition to memory/file streams, users can create their own stream classes that fit RapidJSON's API. For example, you may create a network stream, a stream from a compressed file, etc.

RapidJSON combines different types using templates. Any class that provides the whole required interface can be a stream. The Stream interface is defined in the comments of `rapidjson/rapidjson.h`:

~~~~~~~~~~cpp
concept Stream {
    typename Ch;    //!< Character type of the stream.

    //! Read the current character from stream without moving the read cursor.
    Ch Peek() const;

    //! Read the current character from stream and moving the read cursor to next character.
    Ch Take();

    //! Get the current read cursor.
    //! \return Number of characters read from start.
    size_t Tell();

    //! Begin writing operation at the current read pointer.
    //! \return The begin writer pointer.
    Ch* PutBegin();

    //! Write a character.
    void Put(Ch c);

    //! Flush the buffer.
    void Flush();

    //! End the writing operation.
    //! \param begin The begin write pointer returned by PutBegin().
    //! \return Number of characters written.
    size_t PutEnd(Ch* begin);
}
~~~~~~~~~~

Input streams must implement `Peek()`, `Take()` and `Tell()`.
Output streams must implement `Put()` and `Flush()`.
There are two special functions, `PutBegin()` and `PutEnd()`, which are only for *in situ* parsing. Normal streams do not implement them. However, even if part of the interface is not needed for a particular stream, the stream still needs a dummy implementation of it; otherwise a compilation error is generated.

## Example: istream wrapper {#ExampleIStreamWrapper}

The following example is a wrapper of `std::istream`, which only implements 3 functions.

~~~~~~~~~~cpp
#include <cassert>
#include <istream>

class IStreamWrapper {
public:
    typedef char Ch;

    IStreamWrapper(std::istream& is) : is_(is) {
    }

    Ch Peek() const { // 1
        int c = is_.peek();
        return c == std::char_traits<char>::eof() ? '\0' : (Ch)c;
    }

    Ch Take() { // 2
        int c = is_.get();
        return c == std::char_traits<char>::eof() ? '\0' : (Ch)c;
    }

    size_t Tell() const { return (size_t)is_.tellg(); } // 3

    // Dummy implementations: not used by an input stream
    Ch* PutBegin() { assert(false); return 0; }
    void Put(Ch) { assert(false); }
    void Flush() { assert(false); }
    size_t PutEnd(Ch*) { assert(false); return 0; }

private:
    // Non-copyable
    IStreamWrapper(const IStreamWrapper&);
    IStreamWrapper& operator=(const IStreamWrapper&);

    std::istream& is_;
};
~~~~~~~~~~

The user can use it to wrap instances of `std::stringstream` or `std::ifstream`.

~~~~~~~~~~cpp
#include <sstream>

const char* json = "[1,2,3,4]";
std::stringstream ss(json);
IStreamWrapper is(ss);

Document d;
d.ParseStream(is);
~~~~~~~~~~

Note that this implementation may not be as efficient as RapidJSON's memory or file streams, due to the internal overheads of the standard library.

## Example: ostream wrapper {#ExampleOStreamWrapper}

The following example is a wrapper of `std::ostream`, which only implements 2 functions.

~~~~~~~~~~cpp
#include <cassert>
#include <ostream>

class OStreamWrapper {
public:
    typedef char Ch;

    OStreamWrapper(std::ostream& os) : os_(os) {
    }

    // Dummy implementations: not used by an output stream
    Ch Peek() const { assert(false); return '\0'; }
    Ch Take() { assert(false); return '\0'; }
    size_t Tell() const { assert(false); return 0; }

    Ch* PutBegin() { assert(false); return 0; }
    void Put(Ch c) { os_.put(c); }                  // 1
    void Flush() { os_.flush(); }                   // 2
    size_t PutEnd(Ch*) { assert(false); return 0; }

private:
    // Non-copyable
    OStreamWrapper(const OStreamWrapper&);
    OStreamWrapper& operator=(const OStreamWrapper&);

    std::ostream& os_;
};
~~~~~~~~~~

The user can use it to wrap instances of `std::stringstream` or `std::ofstream`.

~~~~~~~~~~cpp
#include <sstream>

Document d;
// ...

std::stringstream ss;
OStreamWrapper os(ss);

Writer<OStreamWrapper> writer(os);
d.Accept(writer);
~~~~~~~~~~

Note that this implementation may not be as efficient as RapidJSON's memory or file streams, due to the internal overheads of the standard library.

# Summary {#Summary}

This section describes the stream classes available in RapidJSON. Memory streams are simple. File streams can reduce the memory required during JSON parsing and generation, if the JSON is stored in a file system. Encoded streams convert between byte streams and character streams. Finally, users may create custom streams using a simple interface.