[/
    Copyright (c) 2016-2019 Vinnie Falco (vinnie dot falco at gmail dot com)

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

    Official repository: https://github.com/boostorg/beast
]

[section HTTP Comparison to Other Libraries]

There are a few published C++ libraries which implement some of the HTTP
protocol. We analyze the message model chosen by those libraries and discuss
the advantages and disadvantages relative to Beast.

The general strategy used by the author to evaluate external libraries is
as follows:

* Review the message model. Can it represent a complete request or
  response? What level of allocator support is present? How much
  customization is possible?

* Review the stream abstraction. This is the type of object, such as
  a socket, which may be used to parse or serialize (i.e. read and write).
  Can user defined types be specified? What's the level of conformance
  to Asio or Networking-TS concepts?

* Check treatment of buffers. Does the library manage the buffers
  or can users provide their own buffers?

* How does the library handle corner cases such as trailers,
  Expect: 100-continue, or deferred commitment of the body type?

[note
    Declaration examples from external libraries have been edited:
    portions have been removed for simplification.
]



[heading cpp-netlib]

[@https://github.com/cpp-netlib/cpp-netlib/tree/092cd570fb179d029d1865aade9f25aae90d97b9 [*cpp-netlib]]
is a network programming library previously intended for Boost but not
having gone through formal review. As of this writing it still uses the
Boost name, namespace, and directory structure although the project states
that Boost acceptance is no longer a goal.
The library is based on Boost.Asio
and bills itself as ['"a collection of network related routines/implementations
geared towards providing a robust cross-platform networking library"]. It
cites ['"Common Message Type"] as a feature. As of the branch previously
linked, it uses these declarations:
```
template <class Tag>
struct basic_message {
 public:
    typedef Tag tag;

    typedef typename headers_container<Tag>::type headers_container_type;
    typedef typename headers_container_type::value_type header_type;
    typedef typename string<Tag>::type string_type;

    headers_container_type& headers() { return headers_; }
    headers_container_type const& headers() const { return headers_; }

    string_type& body() { return body_; }
    string_type const& body() const { return body_; }

    string_type& source() { return source_; }
    string_type const& source() const { return source_; }

    string_type& destination() { return destination_; }
    string_type const& destination() const { return destination_; }

 private:
    friend struct detail::directive_base<Tag>;
    friend struct detail::wrapper_base<Tag, basic_message<Tag> >;

    mutable headers_container_type headers_;
    mutable string_type body_;
    mutable string_type source_;
    mutable string_type destination_;
};
```

This container is the base class template used to represent HTTP messages.
It uses "tag" type style specializations of a variety of trait classes,
allowing for customization of the various parts of the message. For example,
a user specializes `headers_container<T>` to determine what container type
holds the header fields. We note some problems with the container declaration:

* The header and body containers may only be default-constructed.

* No stateful allocator support.

* There is no way to defer the commitment of the type for `body_` to
  after the headers are read in.

* The message model includes a "source" and "destination."
  This is
  extraneous metadata associated with the connection which is not part
  of the HTTP protocol specification and belongs elsewhere.

* The use of `string_type` (a customization point) for source,
  destination, and body suggests that `string_type` models a
  [*ForwardRange] whose `value_type` is `char`. This representation
  is less than ideal, considering that the library is built on
  Boost.Asio. Adapting a __DynamicBuffer__ to the required forward
  range destroys information conveyed by the __ConstBufferSequence__
  and __MutableBufferSequence__ used in dynamic buffers. The consequence
  is that cpp-netlib implementations will be less efficient than an
  equivalent __NetTS__ conforming implementation.

* The library uses specializations of `string<Tag>` to change the type
  of string used everywhere, including the body, field name and value
  pairs, and extraneous metadata such as source and destination. The
  user may only choose a single type: field name, field values, and
  the body container will all use the same string type. This limits
  utility of the customization point. The library's use of the string
  trait is limited to selecting between `std::string` and `std::wstring`.
  We do not find this use-case compelling given the limitations.

* The specialized trait classes generate a proliferation of small
  additional framework types. To specialize traits, users need to exit
  their namespace and intrude into the `boost::network::http` namespace.
  The way the traits are used in the library limits the usefulness
  of the traits to trivial purposes.

* The `string<Tag>` customization point constrains user defined body types
  to few possible strategies. There is no way to represent an HTTP message
  body as a filename with accompanying algorithms to store or retrieve data
  from the file system.
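
The trait-based customization criticized in the list above can be sketched in
isolation. The following is a simplified stand-in and not cpp-netlib's actual
code: the namespace `lib`, the tags `default_tag` and `wide_tag`, and the helper
`wide_body_size` are hypothetical names chosen for illustration. It shows that a
user must reopen the library's namespace to specialize the string trait, and
that the one chosen string type is then applied to the body, the field names,
and the field values alike:

```
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a library namespace (not cpp-netlib itself).
namespace lib {

struct default_tag {};

// Trait selecting the single string type used throughout a message.
template <class Tag>
struct string { typedef std::string type; };

// Trait selecting the header container; keys and values share string<Tag>.
template <class Tag>
struct headers_container {
    typedef std::map<typename string<Tag>::type,
                     typename string<Tag>::type> type;
};

// Simplified message: the body and the fields all use string<Tag>::type.
template <class Tag>
struct basic_message {
    typedef typename headers_container<Tag>::type headers_container_type;
    typedef typename string<Tag>::type string_type;

    headers_container_type& headers() { return headers_; }
    string_type& body() { return body_; }

private:
    headers_container_type headers_;
    string_type body_;
};

} // namespace lib

namespace user {
struct wide_tag {};
} // namespace user

// Customizing requires leaving the user's namespace and reopening lib.
namespace lib {
template <>
struct string<user::wide_tag> { typedef std::wstring type; };
} // namespace lib

// The single choice propagates to the body and to the field keys.
static_assert(std::is_same<
    lib::basic_message<user::wide_tag>::string_type,
    std::wstring>::value, "body uses the specialized string type");
static_assert(std::is_same<
    lib::basic_message<user::wide_tag>::headers_container_type::key_type,
    std::wstring>::value, "field names use the same string type");

// Hypothetical helper: fills a wide message and returns its body length.
inline std::size_t wide_body_size() {
    lib::basic_message<user::wide_tag> m;
    m.headers()[L"Host"] = L"example.com";
    m.body() = L"hello";
    return m.body().size();
}
```

Under these assumptions there is no way to select, for example, a narrow string
for field names and a different container for the body: the tag chooses one
type for everything.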

The design of the message container in this library is cumbersome
with its system of customization using trait specializations. The
use of these customizations is extremely limited due to the way they
are used in the container declaration, making the design overly
complex without corresponding benefit.



[heading Boost.HTTP]

[@https://github.com/BoostGSoC14/boost.http/tree/45fc1aa828a9e3810b8d87e669b7f60ec100bff4 [*boost.http]]
is a library resulting from the 2014 Google Summer of Code. It was submitted
for a Boost formal review and rejected in 2015. It is based on Boost.Asio,
and development on the library has continued to the present. As of the branch
previously linked, it uses these message declarations:
```
template<class Headers, class Body>
struct basic_message
{
    typedef Headers headers_type;
    typedef Body body_type;

    headers_type &headers();

    const headers_type &headers() const;

    body_type &body();

    const body_type &body() const;

    headers_type &trailers();

    const headers_type &trailers() const;

private:
    headers_type headers_;
    body_type body_;
    headers_type trailers_;
};

typedef basic_message<boost::http::headers, std::vector<std::uint8_t>> message;

template<class Headers, class Body>
struct is_message<basic_message<Headers, Body>>: public std::true_type {};
```

* This container cannot model a complete message. The ['start-line] items
  (method and target for requests, reason-phrase for responses) are
  communicated out of band, as is the ['http-version]. A function that
  operates on the message including the start line requires additional
  parameters. This is evident in one of the
  [@https://github.com/BoostGSoC14/boost.http/blob/45fc1aa828a9e3810b8d87e669b7f60ec100bff4/example/basic_router.cpp#L81 example programs].
  The `500` and `"OK"` arguments represent the response ['status-code] and
  ['reason-phrase] respectively:
  ```
    ...
    http::message reply;
    ...
    self->socket.async_write_response(500, string_ref("OK"), reply, yield);
  ```

* `headers_`, `body_`, and `trailers_` may only be default-constructed,
  since there are no explicitly declared constructors.

* There is no way to defer the commitment of the [*Body] type to after
  the headers are read in. This is related to the previous limitation
  on default-construction.

* No stateful allocator support. This follows from the previous limitation
  on default-construction. Buffers for start-line strings must be
  managed externally from the message object since they are not members.

* The trailers are stored in a separate object. Aside from the combinatorial
  explosion of the number of additional constructors necessary to fully
  support arbitrary forwarded parameter lists for each of the headers, body,
  and trailers members, the requirement to know in advance whether a
  particular HTTP field will be located in the headers or the trailers
  poses an unnecessary complication for general purpose functions that
  operate on messages.

* The declarations imply that `std::vector` is a model of [*Body].
  More formally, that a body is represented by the [*ForwardRange]
  concept whose `value_type` is an 8-bit integer. This representation
  is less than ideal, considering that the library is built on
  Boost.Asio. Adapting a __DynamicBuffer__ to the required forward range
  destroys information conveyed by the __ConstBufferSequence__ and
  __MutableBufferSequence__ used in dynamic buffers. The consequence is
  that Boost.HTTP implementations will be less efficient when dealing
  with body containers than an equivalent __NetTS__ conforming
  implementation.

* The [*Body] customization point constrains user defined types to
  very limited implementation strategies. For example, there is no way
  to represent an HTTP message body as a filename with accompanying
  algorithms to store or retrieve data from the file system.

This representation addresses a narrow range of use cases. It has
limited potential for customization and performance. It is more difficult
to use because it excludes the start line fields from the model.



[heading C++ REST SDK (cpprestsdk)]

[@https://github.com/Microsoft/cpprestsdk/tree/381f5aa92d0dfb59e37c0c47b4d3771d8024e09a [*cpprestsdk]]
is a Microsoft project which ['"...aims to help C++ developers connect to and
interact with services"]. It offers the most functionality of the libraries
reviewed here, including support for WebSocket services using its websocket++
dependency. It can use native APIs such as HTTP.SYS when building Windows
based applications, and it can use Boost.Asio. The WebSocket module uses
Boost.Asio exclusively.

As cpprestsdk is developed by a large corporation, it contains quite a bit
of functionality and necessarily has more interfaces. We will break down
the interfaces used to model messages into more manageable pieces. This
is the container used to store the HTTP header fields:
```
class http_headers
{
public:
    ...

private:
    std::map<utility::string_t, utility::string_t, _case_insensitive_cmp> m_headers;
};
```

This declaration is quite bare-bones. We note the typical problems of
most field containers:

* The container may only be default-constructed.

* No support for allocators, stateful or otherwise.

* There are no customization points at all.

Now we analyze the structure of
the larger message container. The library uses a handle/body idiom.
There
are two public message container interfaces, one for requests (`http_request`)
and one for responses (`http_response`). Each interface maintains a private
shared pointer to an implementation class. Public member function calls
are routed to the internal implementation. This is the first implementation
class, which forms the base class for both the request and response
implementations:
```
namespace details {

class http_msg_base
{
public:
    http_headers &headers() { return m_headers; }

    _ASYNCRTIMP void set_body(const concurrency::streams::istream &instream, const utf8string &contentType);

    /// Set the stream through which the message body could be read
    void set_instream(const concurrency::streams::istream &instream) { m_inStream = instream; }

    /// Set the stream through which the message body could be written
    void set_outstream(const concurrency::streams::ostream &outstream, bool is_default) { m_outStream = outstream; m_default_outstream = is_default; }

    const pplx::task_completion_event<utility::size64_t> & _get_data_available() const { return m_data_available; }

protected:
    /// Stream to read the message body.
    concurrency::streams::istream m_inStream;

    /// stream to write the msg body
    concurrency::streams::ostream m_outStream;

    http_headers m_headers;
    bool m_default_outstream;

    /// <summary> The TCE is used to signal the availability of the message body. </summary>
    pplx::task_completion_event<utility::size64_t> m_data_available;
};
```

To understand these declarations we need to first understand that cpprestsdk
uses the asynchronous model defined by Microsoft's
[@https://msdn.microsoft.com/en-us/library/dd504870.aspx [*Concurrency Runtime]].
Identifiers from the [@https://msdn.microsoft.com/en-us/library/jj987780.aspx [*`pplx` namespace]]
define common asynchronous patterns such as tasks and events.
The
`concurrency::streams::istream` parameter and `m_data_available` data member
indicate a lack of separation of concerns. The representation of HTTP messages
should not be conflated with the asynchronous model used to serialize or
parse those messages in the message declarations.

The next declaration forms the complete implementation class referenced by the
handle in the public interface (which follows after):
```
/// Internal representation of an HTTP request message.
class _http_request final : public http::details::http_msg_base, public std::enable_shared_from_this<_http_request>
{
public:
    _ASYNCRTIMP _http_request(http::method mtd);

    _ASYNCRTIMP _http_request(std::unique_ptr<http::details::_http_server_context> server_context);

    http::method &method() { return m_method; }

    const pplx::cancellation_token &cancellation_token() const { return m_cancellationToken; }

    _ASYNCRTIMP pplx::task<void> reply(const http_response &response);

private:

    // Actually initiates sending the response, without checking if a response has already been sent.
    pplx::task<void> _reply_impl(http_response response);

    http::method m_method;

    std::shared_ptr<progress_handler> m_progress_handler;
};

} // namespace details
```

As before, we note that the implementation class for HTTP requests concerns
itself more with the mechanics of sending the message asynchronously than
it does with actually modeling the HTTP message as described in __rfc7230__:

* The constructor accepting `std::unique_ptr<http::details::_http_server_context>`
  breaks encapsulation and separation of concerns. This cannot be extended
  for user defined server contexts.

* The "cancellation token" is stored inside the message. This breaks the
  separation of concerns.

* The `_reply_impl` function implies that the message implementation also
  shares responsibility for the means of sending back an HTTP reply. This
  would be better if it were completely separate from the message container.

Finally, here is the public class which represents an HTTP request:
```
class http_request
{
public:
    const http::method &method() const { return _m_impl->method(); }

    void set_method(const http::method &method) const { _m_impl->method() = method; }

    /// Extract the body of the request message as a string value, checking that the content type is a MIME text type.
    /// A body can only be extracted once because in some cases an optimization is made where the data is 'moved' out.
    pplx::task<utility::string_t> extract_string(bool ignore_content_type = false)
    {
        auto impl = _m_impl;
        return pplx::create_task(_m_impl->_get_data_available()).then([impl, ignore_content_type](utility::size64_t) { return impl->extract_string(ignore_content_type); });
    }

    /// Extracts the body of the request message into a json value, checking that the content type is application/json.
    /// A body can only be extracted once because in some cases an optimization is made where the data is 'moved' out.
    pplx::task<json::value> extract_json(bool ignore_content_type = false) const
    {
        auto impl = _m_impl;
        return pplx::create_task(_m_impl->_get_data_available()).then([impl, ignore_content_type](utility::size64_t) { return impl->_extract_json(ignore_content_type); });
    }

    /// Sets the body of the message to the contents of a byte vector. If the 'Content-Type'
    void set_body(const std::vector<unsigned char> &body_data);

    /// Defines a stream that will be relied on to provide the body of the HTTP message when it is
    /// sent.
    void set_body(const concurrency::streams::istream &stream, const utility::string_t &content_type = _XPLATSTR("application/octet-stream"));

    /// Defines a stream that will be relied on to hold the body of the HTTP response message that
    /// results from the request.
    void set_response_stream(const concurrency::streams::ostream &stream)
    {
        return _m_impl->set_response_stream(stream);
    }

    /// Defines a callback function that will be invoked for every chunk of data uploaded or downloaded
    /// as part of the request.
    void set_progress_handler(const progress_handler &handler);

private:
    friend class http::details::_http_request;
    friend class http::client::http_client;

    std::shared_ptr<http::details::_http_request> _m_impl;
};
```

It is clear from this declaration that the goal of the message model in
this library is driven by its use-case (interacting with REST servers)
and not to model HTTP messages generally. We note problems similar to
the other declarations:

* There are no compile-time customization points at all. The only
  customization is in the `concurrency::streams::istream` and
  `concurrency::streams::ostream` reference parameters. Presumably,
  these are abstract interfaces which may be subclassed by users
  to achieve custom behaviors.

* The extraction of the body is conflated with the asynchronous model.

* No way to define an allocator for the container used when extracting
  the body.

* A body can only be extracted once, limiting the use of this container
  when using a functional programming style.

* Setting the body requires either a vector or a `concurrency::streams::istream`.
  No user defined types are possible.

* The HTTP request container conflates HTTP response behavior (see the
  `set_response_stream` member).
  Again this is likely purpose-driven but
  the lack of separation of concerns limits this library to only the
  uses explicitly envisioned by the authors.

The general theme of the HTTP message model in cpprestsdk is "no user
definable customizations". There is no allocator support, and no
separation of concerns. It is designed to perform a specific set of
behaviors. In other words, it does not follow the open/closed principle.

Tasks in the Concurrency Runtime operate in a fashion similar to
`std::future`, but with some improvements such as continuations which
are not yet in the C++ standard. The cost of using a task based
asynchronous interface instead of completion handlers is well
documented: synchronization points along the call chain of composed
task operations which cannot be optimized away. See:
[@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3747.pdf
[*A Universal Model for Asynchronous Operations]] (Kohlhoff).

[endsect]