1 <!--
4 SPDX-License-Identifier: curl
5 -->
7 # URL syntax and their use in curl
11 The official "URL syntax" is primarily defined in these two different
14 - [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986) (although URL is called
16 - [The WHATWG URL Specification](https://url.spec.whatwg.org/)
21 The WHATWG URL spec was written later, is incompatible with RFC 3986 and
26 URL parsers as implemented in browsers, libraries and tools usually opt to
28 interpretations and the moving nature of the WHATWG spec does however make it
33 Due to the inherent differences between URL parser implementations, it is
37 For example, if you use one parser to check if a URL uses a good hostname or
38 the correct auth field, and then pass on that same URL to a *second* parser,
39 there is always a risk it treats the same URL differently. There is no right
40 and wrong in URL land, only differences of opinion.
42 libcurl offers a separate API to its URL parser for this reason, among others.
46 URL from an external untrusted party and using it with curl brings several
50 an unfiltered URL can trick your application into accessing a local resource
55 are part of the regular URL format. The combination of a local host and a
59 3. Such a URL might use other schemes than you thought of or planned for.
63 curl recognizes a URL syntax that we call "RFC 3986 plus". It is grounded on
67 curl's URL parser allows a few deviations from the spec in order to
68 inter-operate better with URLs that appear in the wild.
72 A URL provided to curl cannot contain spaces. They must be URL-encoded
73 (as `%20`) to be accepted in a URL by curl.
77 is a violation of RFC 3986 but is fine in the WHATWG spec. curl handles these
78 by re-encoding them to `%20`.
80 ### non-ASCII
82 Byte values in a provided URL that are outside of the printable ASCII range
83 are percent-encoded by curl.
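
A minimal sketch of that encoding step, not curl's actual implementation. The helper name `percent_encode_non_printable` is hypothetical, and both the choice of 0x20 through 0x7E as the "printable" range and the lowercase hex digits are assumptions made for illustration.

```python
def percent_encode_non_printable(url: bytes) -> str:
    """Percent-encode every byte outside printable ASCII (assumed 0x20-0x7E)."""
    out = []
    for byte in url:
        if 0x20 <= byte <= 0x7E:
            # Printable ASCII passes through unchanged.
            out.append(chr(byte))
        else:
            # Any other byte becomes %xx, e.g. 0xC3 becomes "%c3".
            out.append("%{:02x}".format(byte))
    return "".join(out)


# "å" is the two bytes 0xC3 0xA5 in UTF-8, so it turns into two escapes.
print(percent_encode_non_printable("http://example.com/å".encode("utf-8")))
```

The function works on bytes rather than text because the rule is stated in terms of byte values; which character encoding produced those bytes is up to the caller.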
87 An absolute URL always starts with a "scheme" followed by a colon. For all the
89 RFC 3986 but not according to the WHATWG spec - which allows one to infinity
93 valid URL.
95 ### "scheme-less"
103 - `ftp.` means FTP
104 - `dict.` means DICT
105 - `ldap.` means LDAP
106 - `imap.` means IMAP
107 - `smtp.` means SMTP
108 - `pop3.` means POP3
109 - anything else means HTTP
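
The guessing rule in the list above can be sketched as a simple prefix lookup. This is an illustration only: `guess_scheme` is a hypothetical helper, and comparing the hostname without regard to case is an assumption.

```python
# Prefix table copied from the list above.
_PREFIX_TO_SCHEME = [
    ("ftp.", "ftp"),
    ("dict.", "dict"),
    ("ldap.", "ldap"),
    ("imap.", "imap"),
    ("smtp.", "smtp"),
    ("pop3.", "pop3"),
]


def guess_scheme(hostname):
    """Guess a scheme for a scheme-less URL from the start of its hostname."""
    lowered = hostname.lower()  # assumption: the comparison ignores case
    for prefix, scheme in _PREFIX_TO_SCHEME:
        if lowered.startswith(prefix):
            return scheme
    return "http"  # everything else falls back to HTTP


print(guess_scheme("ftp.example.com"))  # matches the "ftp." prefix
print(guess_scheme("www.example.com"))  # no prefix matches
```

Note that the prefix includes the trailing dot, so a hostname such as `ftpserver.example.com` does not match `ftp.` and falls through to HTTP.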
114 create ranges and lists using `[N-M]` and `{one,two,three}` sequences. The
116 legitimately be part of such a URL.
118 They are however not reserved or special in the WHATWG specification, so
120 (using `--globoff`).
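
To illustrate why brackets and braces matter, here is a deliberately simplified sketch of expanding `[N-M]` and `{one,two,three}` sequences. It is not curl's globbing code: `expand_globs` is a hypothetical name, only plain numeric ranges and comma lists are handled, and the order of the results is simply whatever `itertools.product` yields.

```python
import itertools
import re

# Matches either a {a,b,c} list or a numeric [N-M] range.
_GLOB = re.compile(r"\{([^{}]*)\}|\[(\d+)-(\d+)\]")


def expand_globs(url):
    """Return every URL produced by expanding the glob sequences in `url`."""
    parts = []  # literal text and lists of alternatives, in URL order
    pos = 0
    for match in _GLOB.finditer(url):
        parts.append([url[pos:match.start()]])  # literal text before the glob
        if match.group(1) is not None:
            parts.append(match.group(1).split(","))  # {one,two,three}
        else:
            low, high = int(match.group(2)), int(match.group(3))
            parts.append([str(n) for n in range(low, high + 1)])  # [N-M]
        pos = match.end()
    parts.append([url[pos:]])  # literal text after the last glob
    return ["".join(combo) for combo in itertools.product(*parts)]


for expanded in expand_globs("http://example.com/{one,two}/[1-2].txt"):
    print(expanded)
```

In this sketch a bracket pair that does not form a numeric range, such as the one around an IPv6 literal, is left untouched.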
122 # URL syntax details
124 A URL may consist of the following components - many of them are optional:
148 When the URL is specified to identify a proxy, curl recognizes the following
166 The hostname part of the URL contains the address of the server that you want
186 This is done to make sure the host accessed is truly the localhost - the local
192 handle hostnames using non-ASCII characters.
195 to the WHATWG URL spec, but differs from certain browsers that use IDNA 2003
206 number to use, in the range 1 to 65535. curl also supports a blank port number field, but
207 only if the URL starts with a scheme.
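
The port rules described here can be sketched as a small validator. This is only an illustration of the stated rules: `parse_port` and its `has_scheme` parameter are hypothetical, and returning `None` for a blank port (meaning "use the default") is a design choice of this sketch.

```python
def parse_port(port_text, has_scheme):
    """Validate the text after the colon that follows the hostname.

    Returns the port as an int, or None for a blank port field.
    Raises ValueError if the rules described above are violated.
    """
    if port_text == "":
        if not has_scheme:
            raise ValueError("a blank port needs a URL that starts with a scheme")
        return None  # blank port: the caller falls back to the default port
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError("the port must consist of decimal digits")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("the port must be in the range 1 to 65535")
    return port


print(parse_port("8080", has_scheme=False))
print(parse_port("", has_scheme=True))
```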
209 If the port number is not specified in the URL, curl uses a default port
233 When a `FILE://` URL is accessed on Windows systems, it can be crafted in a
237 curl only allows the hostname part of a FILE URL to be one of these three
239 Anything else makes curl fail to parse the URL.
241 ### Windows-specific FILE details
243 curl accepts that the FILE URL's path starts with a "drive letter". That is a
248 This way, a `file://` URL passed to curl *might* be converted into a network
297 Searching via the query part of the URL `?` is a search request for the
300 numbers (`UID`) by using a custom curl request via `-X`. `UID` numbers are
303 want the matching `MAILINDEX` numbers returned then you could search via URL:
309 imap://user:password@mail.example.com/INBOX -X "UID SEARCH TEXT \"foo bar\""
312 information about the individual components of an IMAP URL please see RFC 5092.
315 was specified in the URL. That was a bug fixed in 7.62.0, which added
338 For more information about the individual components of an LDAP URL please
348 The path part of an SCP URL specifies the path and file to retrieve or
357 The path part of an SFTP URL specifies the file to retrieve or upload. If the
365 If the username is embedded in the URL then it must contain the domain name
366 and as such, the separator between them must be URL encoded: `%2f` for a slash (a backslash would be `%5c`).
387 There is no official URL spec for RTMP so libcurl uses the URL syntax supported
389 traditional URL, followed by a space and a series of space-separated