# curl connection filters

Connection filters are a design in the internals of curl, not visible in its
public API. They were added in curl v7.87.0. This document describes the
concept, its high-level implementation and the motivations.

## Filters

A "connection filter" is a piece of code that is responsible for handling a
range of operations on curl's connections: reading, writing, waiting on
external events, connecting and closing down, to name the most important
ones.

The most important feature of connection filters is that they can be stacked on
top of each other (or "chained" if you prefer that metaphor). In the common
scenario that you want to retrieve an `https:` URL with curl, you need 2 basic
things to send the request and get the response: a TCP connection, represented
by a `socket`, and an SSL instance to en- and decrypt over that socket. You write
your request to the SSL instance, which encrypts and writes that data to the
socket, which then sends the bytes over the network.

With connection filters, curl's internal setup looks something like this
(cf for connection filter):

```
Curl_easy *data         connectdata *conn        cf-ssl        cf-socket
+----------------+      +-----------------+      +-------+     +--------+
|https://curl.se/|----> | properties      |----> | keys  |---> | socket |--> OS --> network
+----------------+      +-----------------+      +-------+     +--------+

 Curl_write(data, buffer)
  --> Curl_cfilter_write(data, data->conn, buffer)
       ---> conn->filter->write(conn->filter, data, buffer)
```

While connection filters all do different things, they look the same from the "outside". The code in `data` and `conn` does not really know **which** filters are installed. `conn` just writes into the first filter, whatever that is.

The same is true for filters. Each filter has a pointer to the `next` filter. When SSL has encrypted the data, it does not write to a socket, it writes to the next filter. Whether that is indeed a socket, or a file, or an HTTP/2 connection is of no concern to the SSL filter.

This allows stacking, as in:

```
Direct:
  http://localhost/      conn -> cf-socket
  https://curl.se/       conn -> cf-ssl -> cf-socket
Via http proxy tunnel:
  http://localhost/      conn -> cf-http-proxy -> cf-socket
  https://curl.se/       conn -> cf-ssl -> cf-http-proxy -> cf-socket
Via https proxy tunnel:
  http://localhost/      conn -> cf-http-proxy -> cf-ssl -> cf-socket
  https://curl.se/       conn -> cf-ssl -> cf-http-proxy -> cf-ssl -> cf-socket
Via http proxy tunnel via SOCKS proxy:
  http://localhost/      conn -> cf-http-proxy -> cf-socks -> cf-socket
```

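The stacking idea can be tried out in isolation with a small toy model. The sketch below is not curl code: every name in it (`toy_filter`, `toy_sink`, `sink_write`, `xor_write`, `pass_write`, `toy_conn_write`) is made up for illustration. An in-memory sink stands in for `cf-socket`, and an XOR transform stands in for `cf-ssl` (it is not encryption). The point it shows is that the caller only talks to the first filter and each filter only talks to `next`. Chaining `xor -> pass -> sink` and writing `"abc"` through `toy_conn_write()` leaves the transformed bytes `"ABC"` in the sink, without any filter knowing what sits below it.

```c
#include <stddef.h>
#include <string.h>

/* Toy types, loosely modeled on the chain described in this document.
   All names here are hypothetical, none of this is curl's internal API. */
struct toy_filter;

typedef size_t toy_write_fn(struct toy_filter *f, const char *buf, size_t len);

struct toy_filter {
  toy_write_fn *write;        /* what this filter does on write */
  struct toy_filter *next;    /* next filter in the chain, or NULL */
  void *ctx;                  /* filter specific state */
};

/* Bottom of the chain: stands in for cf-socket, collects bytes in memory */
struct toy_sink {
  char data[64];
  size_t used;
};

static size_t sink_write(struct toy_filter *f, const char *buf, size_t len)
{
  struct toy_sink *sink = f->ctx;
  size_t room = sizeof(sink->data) - sink->used;
  if(len > room)
    len = room;               /* short write when the sink is full */
  memcpy(sink->data + sink->used, buf, len);
  sink->used += len;
  return len;
}

/* Stands in for cf-ssl: transforms the bytes (XOR, NOT real encryption)
   and hands them to whatever filter comes next */
static size_t xor_write(struct toy_filter *f, const char *buf, size_t len)
{
  char tmp[16];
  size_t i;
  if(len > sizeof(tmp))
    len = sizeof(tmp);        /* process at most one chunk per call */
  for(i = 0; i < len; i++)
    tmp[i] = (char)(buf[i] ^ 0x20);
  return f->next->write(f->next, tmp, len);
}

/* Stands in for a connected proxy filter: passes the call through */
static size_t pass_write(struct toy_filter *f, const char *buf, size_t len)
{
  return f->next->write(f->next, buf, len);
}

/* The "connection" only knows the first filter */
static size_t toy_conn_write(struct toy_filter *first,
                             const char *buf, size_t len)
{
  return first->write(first, buf, len);
}
```
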
### Connecting/Closing

Before `Curl_easy` can send the request, the connection needs to be established. This means that all connection filters have done whatever they need to do: waiting for the socket to be connected, doing the TLS handshake, performing the HTTP tunnel request, etc. This has to be done in reverse order: the last filter has to do its connect first, then the one above can start, etc.

Each filter does in principle the following:

```
static CURLcode
myfilter_cf_connect(struct Curl_cfilter *cf,
                    struct Curl_easy *data,
                    bool blocking,
                    bool *done)
{
  CURLcode result;

  if(cf->connected) {            /* we and all below are done */
    *done = TRUE;
    return CURLE_OK;
  }
                                 /* Let the filters below connect */
  result = cf->next->cft->connect(cf->next, data, blocking, done);
  if(result || !*done)
    return result;               /* below errored/not finished yet */

  /* MYFILTER CONNECT THINGS */  /* below connected, do our thing */
  *done = cf->connected = TRUE;  /* done, remember, return */
  return CURLE_OK;
}
```

Closing a connection then works similarly. The `conn` tells the first filter to close. Contrary to connecting,
the filter does its own things first, before telling the next filter to close.

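The two orderings can be made visible with a toy chain that logs which filter acts when. This is only a sketch under simplifying assumptions: the names (`order_cf`, `order_log`, `order_cf_connect`, `order_cf_close`) are hypothetical, and the toy connect always completes in one call, whereas a real filter may return "not done yet" and be called again. For a chain `S -> P -> T` (think SSL, proxy, TCP), connecting logs `TPS` (bottom up) and closing then appends `spt` (top down).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy filter, not curl's struct Curl_cfilter */
struct order_cf {
  char tag;                 /* one-letter name used in the event log */
  struct order_cf *next;    /* next filter in the chain, or NULL */
  bool connected;
};

static char order_log[32];  /* records the order in which filters act */

static void order_log_add(char c)
{
  size_t n = strlen(order_log);
  if(n + 1 < sizeof(order_log)) {
    order_log[n] = c;
    order_log[n + 1] = '\0';
  }
}

/* Connect: the filters below go first, then this one */
static bool order_cf_connect(struct order_cf *cf)
{
  if(cf->connected)
    return true;
  if(cf->next && !order_cf_connect(cf->next))
    return false;
  order_log_add(cf->tag);   /* this filter's own connect work */
  cf->connected = true;
  return true;
}

/* Close: this filter goes first, then the ones below */
static void order_cf_close(struct order_cf *cf)
{
  if(!cf)
    return;
  if(cf->connected) {
    /* this filter's own close work, logged in lower case */
    order_log_add((char)tolower((unsigned char)cf->tag));
    cf->connected = false;
  }
  order_cf_close(cf->next);
}
```
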
### Efficiency

There are two things curl is concerned about: efficient memory use and fast transfers.

The memory footprint of a filter is relatively small:

```
struct Curl_cfilter {
  const struct Curl_cftype *cft; /* the type providing implementation */
  struct Curl_cfilter *next;     /* next filter in chain */
  void *ctx;                     /* filter type specific settings */
  struct connectdata *conn;      /* the connection this filter belongs to */
  int sockindex;                 /* TODO: like to get rid of this */
  BIT(connected);                /* != 0 iff this filter is connected */
};
```

The filter type `cft` is a singleton, one static struct for each type of filter. The `ctx` is where a filter holds its specific data, which varies by filter type. An http-proxy filter keeps the ongoing state of the CONNECT here, but frees it after the tunnel has been established. The SSL filter keeps the `SSL*` (if OpenSSL is used) here until the connection is closed.

`conn` is a reference to the connection this filter belongs to, so nothing extra besides the pointer itself.

Several things that were previously kept in `struct connectdata` now go into the `filter->ctx` *when needed*. So, the memory footprint for connections that do *not* use an http proxy, or socks, or https is lower.

As to transfer efficiency, writing and reading through a filter comes at near zero cost *if the filter does not transform the data*. An http proxy or socks filter, once it is connected, just passes the calls through. Those filters' implementations look like this:

```
ssize_t Curl_cf_def_send(struct Curl_cfilter *cf, struct Curl_easy *data,
                         const void *buf, size_t len, CURLcode *err)
{
  return cf->next->cft->do_send(cf->next, data, buf, len, err);
}
```

The `recv` implementation is equivalent.

## Filter Types

The currently existing filter types (curl 8.5.0) are:

* `TCP`, `UDP`, `UNIX`: filters that operate on a socket, providing raw I/O.
* `SOCKET-ACCEPT`: special TCP filter whose socket has been `accept()`ed from a `listen()`ing socket.
* `SSL`: filter that applies TLS en-/decryption and handshake. Manages the underlying TLS backend implementation.
* `HTTP-PROXY`, `H1-PROXY`, `H2-PROXY`: the first manages the connection to an
  HTTP proxy server and uses one of the others depending on which ALPN protocol has
  been negotiated.
* `SOCKS-PROXY`: filter for the various SOCKS proxy protocol variations.
* `HAPROXY`: filter for the protocol of the same name, providing client IP information to a server.
* `HTTP/2`: filter for handling multiplexed transfers over an HTTP/2 connection.
* `HTTP/3`: filter for handling multiplexed transfers over an HTTP/3+QUIC connection.
* `HAPPY-EYEBALLS`: meta filter that implements IPv4/IPv6 "happy eyeballing". It creates up to 2 sub-filters that race each other for a connection.
* `SETUP`: meta filter that manages the creation of sub-filter chains for a specific transport (e.g. TCP or QUIC).
* `HTTPS-CONNECT`: meta filter that races a TCP+TLS and a QUIC connection against each other to determine if HTTP/1.1, HTTP/2 or HTTP/3 shall be used for a transfer.

Meta filters combine other filters for a specific purpose, mostly during connection establishment. Other filters like `TCP`, `UDP` and `UNIX` are only found at the end of filter chains. SSL filters provide encryption, of course. Protocol filters change the bytes sent and received.

## Filter Flags

Filter types carry flags that describe what they do. These are (for now):

* `CF_TYPE_IP_CONNECT`: this filter type talks directly to a server. This does not have to be the server the transfer wants to talk to, for example when a proxy server is used.
* `CF_TYPE_SSL`: this filter type provides encryption.
* `CF_TYPE_MULTIPLEX`: this filter type can manage multiple transfers in parallel.

Filter types can combine these flags. For example, the HTTP/3 filter types have `CF_TYPE_IP_CONNECT`, `CF_TYPE_SSL` and `CF_TYPE_MULTIPLEX` set.

Flags are useful to derive properties of a connection. To check if a connection is encrypted, libcurl inspects the filter chain in place, top down, for `CF_TYPE_SSL`. If it finds `CF_TYPE_IP_CONNECT` before any `CF_TYPE_SSL`, the connection is not encrypted.

For example, `conn1` is for an `http:` request using a tunnel through an HTTP/2 `https:` proxy. `conn2` is an `https:` HTTP/2 connection through the same proxy. `conn3` uses HTTP/3 without proxy. The filter chains would look like this (simplified):

```
conn1 --> `HTTP-PROXY` --> `H2-PROXY` --> `SSL` --> `TCP`
flags:                     `IP_CONNECT`   `SSL`     `IP_CONNECT`

conn2 --> `HTTP/2` --> `SSL` --> `HTTP-PROXY` --> `H2-PROXY` --> `SSL` --> `TCP`
flags:                 `SSL`                      `IP_CONNECT`   `SSL`     `IP_CONNECT`

conn3 --> `HTTP/3`
flags:    `SSL|IP_CONNECT`
```

Inspecting the filter chains, `conn1` is seen as unencrypted, since it contains an `IP_CONNECT` filter before any `SSL`. `conn2` is clearly encrypted as an `SSL` flagged filter is seen first. `conn3` is also encrypted as the `SSL` flag is checked before the presence of `IP_CONNECT`.

Similar checks can determine if a connection is multiplexed or not.

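The top-down inspection described above fits in a few lines. The following is a minimal sketch, not libcurl's actual function: the struct `flag_cf`, the helper `flag_chain_is_ssl()` and the numeric flag values are assumptions made for this example, and only the flag names mirror the text. It tests the `SSL` flag before `IP_CONNECT` on each filter, which is why a filter carrying both (like `HTTP/3`) counts as encrypted. Built from the three chains in the diagram, it reports `conn1` as not encrypted and `conn2` and `conn3` as encrypted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag values are made up for this sketch; only the names mirror the text */
#define TOY_TYPE_IP_CONNECT (1 << 0)
#define TOY_TYPE_SSL        (1 << 1)
#define TOY_TYPE_MULTIPLEX  (1 << 2)

/* Hypothetical toy filter carrying only what the check needs */
struct flag_cf {
  int flags;                /* flags of this filter's type */
  struct flag_cf *next;     /* next filter in the chain, or NULL */
};

/* Walk the chain top down. On each filter the SSL flag is tested before
   IP_CONNECT, so a filter that carries both counts as encrypted. */
static bool flag_chain_is_ssl(const struct flag_cf *cf)
{
  for(; cf; cf = cf->next) {
    if(cf->flags & TOY_TYPE_SSL)
      return true;          /* encryption seen first */
    if(cf->flags & TOY_TYPE_IP_CONNECT)
      return false;         /* talks to a server without encryption above */
  }
  return false;             /* no SSL filter anywhere in the chain */
}
```
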
## Filter Tracing

Filters may make use of special trace macros like `CURL_TRC_CF(data, cf, msg, ...)`, with `data` being the transfer and `cf` being the filter instance. These traces are normally not active and their execution is guarded so that they are cheap to ignore.

Users of `curl` may activate them by adding the name of the filter type to the
`--trace-config` argument. For example, in order to get more detailed tracing
of an HTTP/2 request, invoke curl with:

```
> curl -v --trace-config ids,time,http/2 https://curl.se
```

This gives you trace output with time information, transfer and connection ids and details from the `HTTP/2` filter. Filter type names in the trace config are case insensitive. You may use `all` to enable tracing for all filter types. When using `libcurl` you may call `curl_global_trace(config_string)` at the start of your application to enable filter details.

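How a case-insensitive, comma separated config can switch on per-filter tracing behind a cheap guard can be sketched as follows. This is an illustration only and does not reproduce curl's trace code: `toy_trc_feat`, `toy_token_is()`, `toy_trace_config()` and `TOY_TRC_IS_ACTIVE()` are hypothetical names, and tokens such as `ids` and `time` are simply ignored here because the sketch only models filter type names and `all`. The config is parsed once, after which checking whether a filter should trace is a single flag test.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-filter-type trace switch, not curl's real structures */
struct toy_trc_feat {
  const char *name;   /* filter type name, e.g. "HTTP/2" */
  bool active;        /* set once when the config is applied */
};

/* Compare a token of length n against a name, ignoring ASCII case */
static bool toy_token_is(const char *tok, size_t n, const char *name)
{
  size_t i;
  if(strlen(name) != n)
    return false;
  for(i = 0; i < n; i++) {
    if(tolower((unsigned char)tok[i]) != tolower((unsigned char)name[i]))
      return false;
  }
  return true;
}

/* Apply a comma separated config such as "ids,time,http/2". Tokens that do
   not name a known filter type are ignored by this sketch. */
static void toy_trace_config(struct toy_trc_feat *feats, size_t nfeats,
                             const char *config)
{
  const char *p = config;
  while(*p) {
    size_t n = strcspn(p, ",");   /* length of the current token */
    size_t i;
    for(i = 0; i < nfeats; i++) {
      if(toy_token_is(p, n, "all") || toy_token_is(p, n, feats[i].name))
        feats[i].active = true;
    }
    p += n;
    if(*p == ',')
      p++;                        /* skip the separator */
  }
}

/* The guard: a single flag test decides whether any trace work happens */
#define TOY_TRC_IS_ACTIVE(feat) ((feat)->active)
```
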
## Meta Filters

Meta filters is a catch-all name for filter types that do not change the transfer data in any way but provide other important services to curl. In general, it is possible to do all sorts of silly things with them. One of the commonly used, important things is "eyeballing".

The `HAPPY-EYEBALLS` filter is involved in the connect phase. Its job is to
try the various IPv4 and IPv6 addresses that are known for a server. If only
one address family is known (or configured), it tries the addresses one after
the other with timeouts calculated from the number of addresses and the
overall connect timeout.

When more than one address family is to be tried, it splits the address list into IPv4 and IPv6 and makes parallel attempts. The connection filter chain looks like this:

```
* create connection for http://curl.se
conn[curl.se] --> SETUP[TCP] --> HAPPY-EYEBALLS --> NULL
* start connect
conn[curl.se] --> SETUP[TCP] --> HAPPY-EYEBALLS --> NULL
                                 - ballerv4 --> TCP[151.101.1.91]:443
                                 - ballerv6 --> TCP[2a04:4e42:c00::347]:443
* v6 answers, connected
conn[curl.se] --> SETUP[TCP] --> HAPPY-EYEBALLS --> TCP[2a04:4e42:c00::347]:443
* transfer
```

The modular design of connection filters, and the fact that we can plug them into each other, is used to control the parallel attempts. When a `TCP` filter does not connect (in time), it is torn down and another one is created for the next address. This keeps the `TCP` filter simple.

The `HAPPY-EYEBALLS` filter, on the other hand, stays focused on its side of the problem. We can also use it to make other types of connections by just giving it another filter type to try, and so have happy eyeballing for QUIC:

```
* create connection for --http3-only https://curl.se
conn[curl.se] --> SETUP[QUIC] --> HAPPY-EYEBALLS --> NULL
* start connect
conn[curl.se] --> SETUP[QUIC] --> HAPPY-EYEBALLS --> NULL
                                  - ballerv4 --> HTTP/3[151.101.1.91]:443
                                  - ballerv6 --> HTTP/3[2a04:4e42:c00::347]:443
* v6 answers, connected
conn[curl.se] --> SETUP[QUIC] --> HAPPY-EYEBALLS --> HTTP/3[2a04:4e42:c00::347]:443
* transfer
```

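The racing logic is independent of whether the sub-filter is `TCP` or `HTTP/3`, which a small simulation can illustrate. The sketch below is a toy under loud assumptions: all names (`toy_addr`, `toy_baller`, `toy_split`, `toy_baller_poll`, `toy_eyeballs`) are hypothetical, connecting is simulated by counting polls, the per-address limit `TOY_ATTEMPT_LIMIT` is an arbitrary constant rather than curl's timeout calculation, and both families are polled from the start with IPv6 first. It splits the address list by family, lets each "baller" work through its addresses one at a time, drops an attempt that does not connect in time, and returns the first address that connects.

```c
#include <stddef.h>

enum toy_family { TOY_V4, TOY_V6 };

struct toy_addr {
  enum toy_family family;
  int ticks_to_connect;   /* simulated: polls until connected, <0 = never */
};

#define TOY_MAX_ADDRS 8
#define TOY_ATTEMPT_LIMIT 3  /* made-up per-address timeout, in polls */

/* One racing attempt ("baller") working through one address family */
struct toy_baller {
  const struct toy_addr *addrs[TOY_MAX_ADDRS];
  size_t count;           /* number of addresses of this family */
  size_t current;         /* index of the address being tried */
  int ticks;              /* polls spent on the current address */
};

/* Split the address list into an IPv4 and an IPv6 attempt */
static void toy_split(const struct toy_addr *all, size_t n,
                      struct toy_baller *v4, struct toy_baller *v6)
{
  size_t i;
  v4->count = v4->current = 0;
  v4->ticks = 0;
  v6->count = v6->current = 0;
  v6->ticks = 0;
  for(i = 0; i < n; i++) {
    struct toy_baller *b = (all[i].family == TOY_V6) ? v6 : v4;
    if(b->count < TOY_MAX_ADDRS)
      b->addrs[b->count++] = &all[i];
  }
}

/* Advance one attempt by one poll. Returns the connected address or NULL. */
static const struct toy_addr *toy_baller_poll(struct toy_baller *b)
{
  const struct toy_addr *a;
  if(b->current >= b->count)
    return NULL;                       /* out of addresses */
  a = b->addrs[b->current];
  b->ticks++;
  if(a->ticks_to_connect >= 0 && b->ticks >= a->ticks_to_connect)
    return a;                          /* this address connected */
  if(b->ticks >= TOY_ATTEMPT_LIMIT) {  /* tear down, try the next address */
    b->current++;
    b->ticks = 0;
  }
  return NULL;
}

/* Race both attempts; the first one to connect wins */
static const struct toy_addr *toy_eyeballs(const struct toy_addr *all,
                                           size_t n)
{
  struct toy_baller v4, v6;
  toy_split(all, n, &v4, &v6);
  while(v4.current < v4.count || v6.current < v6.count) {
    const struct toy_addr *winner = toy_baller_poll(&v6);
    if(!winner)
      winner = toy_baller_poll(&v4);
    if(winner)
      return winner;
  }
  return NULL;                         /* every address failed */
}
```
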
When we plug these two variants together, we get the `HTTPS-CONNECT` filter
type that is used for `--http3` when **both** HTTP/3 and HTTP/2 or HTTP/1.1
shall be attempted:

```
* create connection for --http3 https://curl.se
conn[curl.se] --> HTTPS-CONNECT --> NULL
* start connect
conn[curl.se] --> HTTPS-CONNECT --> NULL
                  - SETUP[QUIC] --> HAPPY-EYEBALLS --> NULL
                                    - ballerv4 --> HTTP/3[151.101.1.91]:443
                                    - ballerv6 --> HTTP/3[2a04:4e42:c00::347]:443
                  - SETUP[TCP]  --> HAPPY-EYEBALLS --> NULL
                                    - ballerv4 --> TCP[151.101.1.91]:443
                                    - ballerv6 --> TCP[2a04:4e42:c00::347]:443
* v4 QUIC answers, connected
conn[curl.se] --> HTTPS-CONNECT --> SETUP[QUIC] --> HAPPY-EYEBALLS --> HTTP/3[151.101.1.91]:443
* transfer
```