[/
    Copyright (c) 2016-2019 Vinnie Falco (vinnie dot falco at gmail dot com)

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

    Official repository: https://github.com/boostorg/beast
]

[section HTTP Comparison to Other Libraries]

There are a few published C++ libraries which implement some of the HTTP
protocol. We analyze the message model chosen by those libraries and discuss
the advantages and disadvantages relative to Beast.

The general strategy used by the author to evaluate external libraries is
as follows:

* Review the message model. Can it represent a complete request or
  response? What level of allocator support is present? How much
  customization is possible?

* Review the stream abstraction. This is the type of object, such as
  a socket, which may be used to parse or serialize (i.e. read and write).
  Can user-defined types be specified? What is the level of conformance
  to Asio or Networking-TS concepts?

* Check treatment of buffers. Does the library manage the buffers
  or can users provide their own buffers?

* How does the library handle corner cases such as trailers,
  Expect: 100-continue, or deferred commitment of the body type?

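The last criterion, deferred commitment of the body type, may be unfamiliar.
The following is a minimal sketch of the idea, not a declaration from Beast
or from any library reviewed here. The names `header_sketch`,
`deferred_request_sketch`, and `wants_binary_body` are hypothetical and exist
only for illustration. The header is parsed first as a standalone object, and
the caller picks the body representation only after inspecting it:

```
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical header: the start-line items plus the fields.
struct header_sketch
{
    std::string method;
    std::string target;
    std::map<std::string, std::string> fields;
};

// Hypothetical message: a header plus a body whose type is a parameter.
template<class BodyValue>
struct deferred_request_sketch : header_sketch
{
    // Adopt a header that was parsed before the body type was known.
    explicit deferred_request_sketch(header_sketch h)
        : header_sketch(std::move(h))
    {
    }

    BodyValue body;
};

// Inspect the header alone to decide how the body should be stored.
inline bool wants_binary_body(header_sketch const& h)
{
    auto const it = h.fields.find("Content-Type");
    return it != h.fields.end() &&
        it->second == "application/octet-stream";
}
```

A caller would test `wants_binary_body(h)` and then move `h` into either
`deferred_request_sketch<std::vector<unsigned char>>` or
`deferred_request_sketch<std::string>`. A container whose body type is fixed
before the header is read cannot make this choice.
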
[note
    Declaration examples from external libraries have been edited:
    portions have been removed for simplification.
]



[heading cpp-netlib]

[@https://github.com/cpp-netlib/cpp-netlib/tree/092cd570fb179d029d1865aade9f25aae90d97b9 [*cpp-netlib]]
is a network programming library that was previously intended for Boost but
never went through formal review. As of this writing it still uses the
Boost name, namespace, and directory structure although the project states
that Boost acceptance is no longer a goal. The library is based on Boost.Asio
and bills itself as ['"a collection of network related routines/implementations
geared towards providing a robust cross-platform networking library"]. It
cites ['"Common Message Type"] as a feature. As of the branch previously
linked, it uses these declarations:
```
template <class Tag>
struct basic_message {
 public:
  typedef Tag tag;

  typedef typename headers_container<Tag>::type headers_container_type;
  typedef typename headers_container_type::value_type header_type;
  typedef typename string<Tag>::type string_type;

  headers_container_type& headers() { return headers_; }
  headers_container_type const& headers() const { return headers_; }

  string_type& body() { return body_; }
  string_type const& body() const { return body_; }

  string_type& source() { return source_; }
  string_type const& source() const { return source_; }

  string_type& destination() { return destination_; }
  string_type const& destination() const { return destination_; }

 private:
  friend struct detail::directive_base<Tag>;
  friend struct detail::wrapper_base<Tag, basic_message<Tag> >;

  mutable headers_container_type headers_;
  mutable string_type body_;
  mutable string_type source_;
  mutable string_type destination_;
};
```

This container is the base class template used to represent HTTP messages.
It uses "tag" type specializations of a variety of trait classes,
allowing for customization of the various parts of the message. For example,
a user specializes `headers_container<T>` to determine what container type
holds the header fields. We note some problems with the container declaration:

* The header and body containers may only be default-constructed.

* No stateful allocator support.

* There is no way to defer the commitment of the type for `body_` to
  after the headers are read in.

* The message model includes a "source" and "destination." This is
  extraneous metadata associated with the connection which is not part
  of the HTTP protocol specification and belongs elsewhere.

* The use of `string_type` (a customization point) for source,
  destination, and body suggests that `string_type` models a
  [*ForwardRange] whose `value_type` is `char`. This representation
  is less than ideal, considering that the library is built on
  Boost.Asio. Adapting a __DynamicBuffer__ to the required forward
  range destroys information conveyed by the __ConstBufferSequence__
  and __MutableBufferSequence__ used in dynamic buffers. The consequence
  is that cpp-netlib implementations will be less efficient than an
  equivalent __NetTS__ conforming implementation.

* The library uses specializations of `string<Tag>` to change the type
  of string used everywhere, including the body, field name and value
  pairs, and extraneous metadata such as source and destination. The
  user may only choose a single type: field name, field values, and
  the body container will all use the same string type. This limits
  the utility of the customization point. The library's use of the string
  trait is limited to selecting between `std::string` and `std::wstring`.
  We do not find this use-case compelling given the limitations.

* The specialized trait classes generate a proliferation of small
  additional framework types. To specialize traits, users need to exit
  their namespace and intrude into the `boost::network::http` namespace.
  The way the traits are used in the library limits the usefulness
  of the traits to trivial purposes.

* The `string<Tag>` customization point constrains user-defined body types
  to few possible strategies. There is no way to represent an HTTP message
  body as a filename with accompanying algorithms to store or retrieve data
  from the file system.

The design of the message container in this library is cumbersome
with its system of customization using trait specializations. The
use of these customizations is extremely limited due to the way they
are used in the container declaration, making the design overly
complex without corresponding benefit.

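For contrast, here is a minimal sketch of a container that avoids the first
and sixth problems listed above. It is an illustration under stated
assumptions, not the declaration used by cpp-netlib or by Beast: the names
`message_sketch`, `string_body_sketch`, and `make_example` are hypothetical.
The field container and the body are independent template parameters, and
constructor arguments are forwarded to the body so that it does not have to
be default-constructed:

```
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical body policy: the policy names the stored type,
// so the message is not tied to a single string type.
struct string_body_sketch
{
    using value_type = std::string;
};

// Hypothetical message container. Fields and Body are independent
// template parameters rather than traits looked up through one Tag.
template<class Body, class Fields = std::map<std::string, std::string>>
struct message_sketch
{
    using body_type = typename Body::value_type;

    message_sketch() = default;

    // Forward the remaining constructor arguments to the body.
    template<class... Args>
    explicit message_sketch(Fields f, Args&&... args)
        : fields(std::move(f))
        , body(std::forward<Args>(args)...)
    {
    }

    Fields fields;
    body_type body;
};

// Build a message whose body is constructed from (count, char).
inline message_sketch<string_body_sketch> make_example()
{
    std::map<std::string, std::string> f;
    f["Content-Type"] = "text/plain";
    return message_sketch<string_body_sketch>(std::move(f), 3, 'x');
}
```

Because the body arguments are forwarded, a body type that requires a
stateful allocator or a file name at construction could be used without
changing the container.
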
[heading Boost.HTTP]

[@https://github.com/BoostGSoC14/boost.http/tree/45fc1aa828a9e3810b8d87e669b7f60ec100bff4 [*boost.http]]
is a library resulting from the 2014 Google Summer of Code. It was submitted
for a Boost formal review and rejected in 2015. It is based on Boost.Asio,
and development on the library has continued to the present. As of the branch
previously linked, it uses these message declarations:
```
template<class Headers, class Body>
struct basic_message
{
    typedef Headers headers_type;
    typedef Body body_type;

    headers_type &headers();

    const headers_type &headers() const;

    body_type &body();

    const body_type &body() const;

    headers_type &trailers();

    const headers_type &trailers() const;

private:
    headers_type headers_;
    body_type body_;
    headers_type trailers_;
};

typedef basic_message<boost::http::headers, std::vector<std::uint8_t>> message;

template<class Headers, class Body>
struct is_message<basic_message<Headers, Body>>: public std::true_type {};
```

* This container cannot model a complete message. The ['start-line] items
  (method and target for requests, reason-phrase for responses) are
  communicated out of band, as is the ['http-version]. A function that
  operates on the message including the start line requires additional
  parameters. This is evident in one of the
  [@https://github.com/BoostGSoC14/boost.http/blob/45fc1aa828a9e3810b8d87e669b7f60ec100bff4/example/basic_router.cpp#L81 example programs].
  The `500` and `"OK"` arguments represent the response ['status-code] and
  ['reason-phrase] respectively:
  ```
  ...
  http::message reply;
  ...
  self->socket.async_write_response(500, string_ref("OK"), reply, yield);
  ```

* `headers_`, `body_`, and `trailers_` may only be default-constructed,
  since there are no explicitly declared constructors.

* There is no way to defer the commitment of the [*Body] type to after
  the headers are read in. This is related to the previous limitation
  on default-construction.

* No stateful allocator support. This follows from the previous limitation
  on default-construction. Buffers for start-line strings must be
  managed externally from the message object since they are not members.

* The trailers are stored in a separate object. Aside from the combinatorial
  explosion of the number of additional constructors necessary to fully
  support arbitrary forwarded parameter lists for each of the headers, body,
  and trailers members, the requirement to know in advance whether a
  particular HTTP field will be located in the headers or the trailers
  poses an unnecessary complication for general purpose functions that
  operate on messages.

* The declarations imply that `std::vector` is a model of [*Body].
  More formally, that a body is represented by the [*ForwardRange]
  concept whose `value_type` is an 8-bit integer. This representation
  is less than ideal, considering that the library is built on
  Boost.Asio. Adapting a __DynamicBuffer__ to the required forward range
  destroys information conveyed by the __ConstBufferSequence__ and
  __MutableBufferSequence__ used in dynamic buffers. The consequence is
  that Boost.HTTP implementations will be less efficient when dealing
  with body containers than an equivalent __NetTS__ conforming
  implementation.

* The [*Body] customization point constrains user-defined types to
  very limited implementation strategies. For example, there is no way
  to represent an HTTP message body as a filename with accompanying
  algorithms to store or retrieve data from the file system.

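To make the file-based body in the last item concrete, here is a minimal
sketch of what such a customization point could look like. All names
(`file_body_sketch`, `reader`, `writer`, `put`, `get`) are assumptions made
for this illustration and are not taken from Boost.HTTP or from Beast. The
message would store only the file name, while two nested types carry the
algorithms that move octets to and from the file system:

```
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical body policy: the stored value is a path, and the
// nested types know how to transfer bytes to and from that file.
struct file_body_sketch
{
    using value_type = std::string; // the file name

    // Would be driven by a parser as body octets arrive.
    class reader
    {
        std::ofstream out_;

    public:
        explicit reader(value_type const& path)
            : out_(path, std::ios::binary | std::ios::trunc)
        {
        }

        // Returns false if the file could not be written.
        bool put(char const* data, std::size_t size)
        {
            out_.write(data, static_cast<std::streamsize>(size));
            return static_cast<bool>(out_);
        }
    };

    // Would be driven by a serializer to obtain body octets.
    class writer
    {
        std::ifstream in_;

    public:
        explicit writer(value_type const& path)
            : in_(path, std::ios::binary)
        {
        }

        // Returns the number of bytes placed in dest; zero means done.
        std::size_t get(char* dest, std::size_t size)
        {
            in_.read(dest, static_cast<std::streamsize>(size));
            return static_cast<std::size_t>(in_.gcount());
        }
    };
};
```

A body modeled as a forward range of 8-bit integers has no place for the
`reader` and `writer` algorithms, which is why this strategy cannot be
expressed with the declarations shown earlier.
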
The Boost.HTTP message representation addresses a narrow range of use cases.
It has limited potential for customization and performance. It is more
difficult to use because it excludes the start line fields from the model.

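The cost of excluding the start line can be seen by comparison with a
container that includes it. The sketch below is illustrative only:
`response_sketch` and `to_string` are hypothetical names, and encoding the
version as the integer 11 for HTTP/1.1 is an assumption of this example.
Because the status-code and reason-phrase are members, a function that
operates on the whole message needs a single argument:

```
#include <cassert>
#include <map>
#include <string>

// Hypothetical response type that carries its own start-line items.
struct response_sketch
{
    int version = 11;           // 11 stands for HTTP/1.1
    unsigned status = 200;      // status-code
    std::string reason = "OK";  // reason-phrase
    std::map<std::string, std::string> fields;
    std::string body;
};

// Serialize the complete message; no out-of-band parameters needed.
inline std::string to_string(response_sketch const& res)
{
    std::string out = "HTTP/";
    out += std::to_string(res.version / 10);
    out += '.';
    out += std::to_string(res.version % 10);
    out += ' ';
    out += std::to_string(res.status);
    out += ' ';
    out += res.reason;
    out += "\r\n";
    for(auto const& kv : res.fields)
    {
        out += kv.first;
        out += ": ";
        out += kv.second;
        out += "\r\n";
    }
    out += "\r\n";
    out += res.body;
    return out;
}
```
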
[heading C++ REST SDK (cpprestsdk)]

[@https://github.com/Microsoft/cpprestsdk/tree/381f5aa92d0dfb59e37c0c47b4d3771d8024e09a [*cpprestsdk]]
is a Microsoft project which ['"...aims to help C++ developers connect to and
interact with services"]. It offers the most functionality of the libraries
reviewed here, including support for WebSocket services using its websocket++
dependency. It can use native APIs such as HTTP.SYS when building Windows
based applications, and it can use Boost.Asio. The WebSocket module uses
Boost.Asio exclusively.

As cpprestsdk is developed by a large corporation, it contains quite a bit
of functionality and necessarily has more interfaces. We will break down
the interfaces used to model messages into more manageable pieces. This
is the container used to store the HTTP header fields:
```
class http_headers
{
public:
    ...

private:
    std::map<utility::string_t, utility::string_t, _case_insensitive_cmp> m_headers;
};
```

This declaration is quite bare-bones. We note the typical problems of
most field containers:

* The container may only be default-constructed.

* No support for allocators, stateful or otherwise.

* There are no customization points at all.

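As an illustration of what stateful allocator support could look like in a
field container, here is a minimal sketch that assumes C++17 and the
standard `<memory_resource>` facilities. The names `counting_resource` and
`fields_sketch` are hypothetical and do not correspond to any type in
cpprestsdk or Beast. The memory resource is passed to the constructor, so
the container is not limited to default construction:

```
#include <cassert>
#include <cstddef>
#include <map>
#include <memory_resource>
#include <string>
#include <utility>

// Hypothetical resource that counts allocations, which makes the
// allocator's state observable.
class counting_resource : public std::pmr::memory_resource
{
    std::size_t count_ = 0;

    void* do_allocate(std::size_t bytes, std::size_t align) override
    {
        ++count_;
        return std::pmr::new_delete_resource()->allocate(bytes, align);
    }

    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override
    {
        std::pmr::new_delete_resource()->deallocate(p, bytes, align);
    }

    bool do_is_equal(
        std::pmr::memory_resource const& other) const noexcept override
    {
        return this == &other;
    }

public:
    std::size_t count() const { return count_; }
};

// Hypothetical field container constructed from a memory resource.
class fields_sketch
{
    std::pmr::map<std::pmr::string, std::pmr::string> map_;

public:
    explicit fields_sketch(std::pmr::memory_resource* mr)
        : map_(mr)
    {
    }

    void set(char const* name, char const* value)
    {
        // Build the key with the same resource the map uses.
        std::pmr::string key(name, map_.get_allocator().resource());
        map_.insert_or_assign(std::move(key), value);
    }

    std::string get(char const* name) const
    {
        auto const it = map_.find(std::pmr::string(name));
        if(it == map_.end())
            return std::string();
        return std::string(it->second.data(), it->second.size());
    }
};
```

The resource must outlive the container. A caller could substitute a pool
or arena resource per connection, which is not possible when the container
hard-codes `std::map` with the default allocator.
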
Now we analyze the structure of the larger message container. The library
uses a handle/body idiom. There are two public message container interfaces,
one for requests (`http_request`) and one for responses (`http_response`).
Each interface maintains a private shared pointer to an implementation class.
Public member function calls are routed to the internal implementation. This
is the first implementation class, which forms the base class for both the
request and response implementations:
```
namespace details {

class http_msg_base
{
public:
    http_headers &headers() { return m_headers; }

    _ASYNCRTIMP void set_body(const concurrency::streams::istream &instream, const utf8string &contentType);

    /// Set the stream through which the message body could be read
    void set_instream(const concurrency::streams::istream &instream)  { m_inStream = instream; }

    /// Set the stream through which the message body could be written
    void set_outstream(const concurrency::streams::ostream &outstream, bool is_default)  { m_outStream = outstream; m_default_outstream = is_default; }

    const pplx::task_completion_event<utility::size64_t> & _get_data_available() const { return m_data_available; }

protected:
    /// Stream to read the message body.
    concurrency::streams::istream m_inStream;

    /// stream to write the msg body
    concurrency::streams::ostream m_outStream;

    http_headers m_headers;
    bool m_default_outstream;

    /// <summary> The TCE is used to signal the availability of the message body. </summary>
    pplx::task_completion_event<utility::size64_t> m_data_available;
};
```

To understand these declarations we need to first understand that cpprestsdk
uses the asynchronous model defined by Microsoft's
[@https://msdn.microsoft.com/en-us/library/dd504870.aspx [*Concurrency Runtime]].
Identifiers from the [@https://msdn.microsoft.com/en-us/library/jj987780.aspx [*`pplx` namespace]]
define common asynchronous patterns such as tasks and events. The
`concurrency::streams::istream` parameter and `m_data_available` data member
indicate a lack of separation of concerns. The representation of HTTP messages
should not be conflated, in the message declarations, with the asynchronous
model used to serialize or parse those messages.

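A minimal sketch of the separation argued for here follows. It is not code
from cpprestsdk or Beast: `request_sketch`, `string_stream_sketch`,
`write_request`, and the `write_some` member are hypothetical names chosen
for the example, and header fields are omitted for brevity. The message holds
data only, while the I/O strategy lives in an algorithm that works with any
stream type providing a compatible member function:

```
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical message: data only, with no streams, tasks, or events.
struct request_sketch
{
    std::string method;
    std::string target;
    std::string body;
};

// Hypothetical stream used only for this example. Any type with a
// compatible write_some member would work equally well.
struct string_stream_sketch
{
    std::string written;

    std::size_t write_some(char const* data, std::size_t size)
    {
        // Accept at most 4 bytes per call to imitate short writes.
        std::size_t const n = size < 4 ? size : 4;
        written.append(data, n);
        return n;
    }
};

// The I/O strategy lives in an algorithm outside the message.
template<class Stream>
void write_request(Stream& stream, request_sketch const& req)
{
    std::string const s =
        req.method + " " + req.target + " HTTP/1.1\r\n\r\n" + req.body;
    std::size_t pos = 0;
    while(pos < s.size())
        pos += stream.write_some(s.data() + pos, s.size() - pos);
}
```

With this split, an asynchronous variant of `write_request` could be added
later without touching `request_sketch`.
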
The next declaration forms the complete implementation class referenced by the
handle in the public interface (which follows after):
```
/// Internal representation of an HTTP request message.
class _http_request final : public http::details::http_msg_base, public std::enable_shared_from_this<_http_request>
{
public:
    _ASYNCRTIMP _http_request(http::method mtd);

    _ASYNCRTIMP _http_request(std::unique_ptr<http::details::_http_server_context> server_context);

    http::method &method() { return m_method; }

    const pplx::cancellation_token &cancellation_token() const { return m_cancellationToken; }

    _ASYNCRTIMP pplx::task<void> reply(const http_response &response);

private:

    // Actual initiates sending the response, without checking if a response has already been sent.
    pplx::task<void> _reply_impl(http_response response);

    http::method m_method;

    std::shared_ptr<progress_handler> m_progress_handler;
};

} // namespace details
```

As before, we note that the implementation class for HTTP requests concerns
itself more with the mechanics of sending the message asynchronously than
it does with actually modeling the HTTP message as described in __rfc7230__:

* The constructor accepting `std::unique_ptr<http::details::_http_server_context>`
  breaks encapsulation and separation of concerns. This cannot be extended
  for user-defined server contexts.

* The "cancellation token" is stored inside the message. This breaks the
  separation of concerns.

* The `_reply_impl` function implies that the message implementation also
  shares responsibility for the means of sending back an HTTP reply. It
  would be better if this responsibility were completely separate from the
  message container.

Finally, here is the public class which represents an HTTP request:
```
class http_request
{
public:
    const http::method &method() const { return _m_impl->method(); }

    void set_method(const http::method &method) const { _m_impl->method() = method; }

    /// Extract the body of the request message as a string value, checking that the content type is a MIME text type.
    /// A body can only be extracted once because in some cases an optimization is made where the data is 'moved' out.
    pplx::task<utility::string_t> extract_string(bool ignore_content_type = false)
    {
        auto impl = _m_impl;
        return pplx::create_task(_m_impl->_get_data_available()).then([impl, ignore_content_type](utility::size64_t) { return impl->extract_string(ignore_content_type); });
    }

    /// Extracts the body of the request message into a json value, checking that the content type is application/json.
    /// A body can only be extracted once because in some cases an optimization is made where the data is 'moved' out.
    pplx::task<json::value> extract_json(bool ignore_content_type = false) const
    {
        auto impl = _m_impl;
        return pplx::create_task(_m_impl->_get_data_available()).then([impl, ignore_content_type](utility::size64_t) { return impl->_extract_json(ignore_content_type); });
    }

    /// Sets the body of the message to the contents of a byte vector. If the 'Content-Type'
    void set_body(const std::vector<unsigned char> &body_data);

    /// Defines a stream that will be relied on to provide the body of the HTTP message when it is
    /// sent.
    void set_body(const concurrency::streams::istream &stream, const utility::string_t &content_type = _XPLATSTR("application/octet-stream"));

    /// Defines a stream that will be relied on to hold the body of the HTTP response message that
    /// results from the request.
    void set_response_stream(const concurrency::streams::ostream &stream)
    {
        return _m_impl->set_response_stream(stream);
    }

    /// Defines a callback function that will be invoked for every chunk of data uploaded or downloaded
    /// as part of the request.
    void set_progress_handler(const progress_handler &handler);

private:
    friend class http::details::_http_request;
    friend class http::client::http_client;

    std::shared_ptr<http::details::_http_request> _m_impl;
};
```

It is clear from this declaration that the message model in this library
is driven by its use case (interacting with REST servers) rather than by
the goal of modeling HTTP messages generally. We note problems similar to
the other declarations:

* There are no compile-time customization points at all. The only
  customization is in the `concurrency::streams::istream` and
  `concurrency::streams::ostream` reference parameters. Presumably,
  these are abstract interfaces which may be subclassed by users
  to achieve custom behaviors.

* The extraction of the body is conflated with the asynchronous model.

* No way to define an allocator for the container used when extracting
  the body.

* A body can only be extracted once, limiting the use of this container
  when using a functional programming style.

* Setting the body requires either a vector or a `concurrency::streams::istream`.
  No user-defined types are possible.

* The HTTP request container conflates HTTP response behavior (see the
  `set_response_stream` member). Again this is likely purpose-driven but
  the lack of separation of concerns limits this library to only the
  uses explicitly envisioned by the authors.

The general theme of the HTTP message model in cpprestsdk is "no user
definable customizations". There is no allocator support, and no
separation of concerns. It is designed to perform a specific set of
behaviors. In other words, it does not follow the open/closed principle.

Tasks in the Concurrency Runtime operate in a fashion similar to
`std::future`, but with some improvements such as continuations which
are not yet in the C++ standard. The costs of using a task based
asynchronous interface instead of completion handlers are well
documented: synchronization points along the call chain of composed
task operations which cannot be optimized away. See:
[@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3747.pdf
[*A Universal Model for Asynchronous Operations]] (Kohlhoff).

[endsect]