---
c: Copyright (C) Daniel Stenberg, <daniel.se>, et al.
SPDX-License-Identifier: curl
Title: curl_url_get
Section: 3
Source: libcurl
See-also:
  - CURLOPT_CURLU (3)
  - curl_url (3)
  - curl_url_cleanup (3)
  - curl_url_dup (3)
  - curl_url_set (3)
  - curl_url_strerror (3)
---

# NAME

curl_url_get - extract a part from a URL

# SYNOPSIS

~~~c
#include <curl/curl.h>

CURLUcode curl_url_get(const CURLU *url,
                       CURLUPart part,
                       char **content,
                       unsigned int flags);
~~~

# DESCRIPTION

Given a *url* handle of a URL object, this function extracts an individual
piece or the full URL from it.

The *part* argument specifies which part to extract (see list below) and
*content* must point to a 'char *' that the function updates to point to a
newly allocated string holding the contents.

The *flags* argument is a bitmask that enables individual features.

The returned content pointer must be freed with curl_free(3) after use.

# FLAGS

The flags argument is zero, one or more bits set in a bitmask.

## CURLU_DEFAULT_PORT

If the handle has no port stored, this option makes curl_url_get(3)
return the default port for the scheme in use.

## CURLU_DEFAULT_SCHEME

If the handle has no scheme stored, this option makes curl_url_get(3)
return the default scheme instead of an error.

## CURLU_NO_DEFAULT_PORT

Instructs curl_url_get(3) to not return a port number if it matches the
default port for the scheme.

## CURLU_URLDECODE

Asks curl_url_get(3) to URL decode the contents before returning them. It
does not decode the scheme, the port number or the full URL.

The query component also gets plus-to-space conversion as a bonus when this
bit is set.

Note that this URL decoding is charset-unaware and you get a zero-terminated
string back with data that could be intended for a particular encoding.

If there are byte values lower than 32 in the decoded string, the get
operation returns an error instead.

## CURLU_URLENCODE

If set, curl_url_get(3) URL encodes the hostname part when a full URL
is retrieved. If not set (default), libcurl returns the URL with the hostname
"raw" so that IDN names appear as-is. IDN hostnames typically use non-ASCII
bytes that would otherwise get percent-encoded.

Note that even when not asking for URL encoding, the '%' (byte 37) is URL
encoded to make sure the hostname remains valid.

## CURLU_PUNYCODE

If set and *CURLU_URLENCODE* is not set, and asked to retrieve the
**CURLUPART_HOST** or **CURLUPART_URL** parts, libcurl returns the hostname
in its punycode version if it contains any non-ASCII octets (and is an
IDN name).

If libcurl is built without IDN capabilities, using this bit makes
curl_url_get(3) return *CURLUE_LACKS_IDN* if the hostname contains
anything outside the ASCII range.

(Added in curl 7.88.0)

## CURLU_PUNY2IDN

If set and asked to retrieve the **CURLUPART_HOST** or **CURLUPART_URL**
parts, libcurl returns the hostname in its IDN (Internationalized Domain Name)
UTF-8 version if it is otherwise stored in its punycode version. If the
punycode name cannot be converted to IDN correctly, libcurl returns
*CURLUE_BAD_HOSTNAME*.

If libcurl is built without IDN capabilities, using this bit makes
curl_url_get(3) return *CURLUE_LACKS_IDN* if the hostname is using
punycode.

(Added in curl 8.3.0)

# PARTS

## CURLUPART_URL

When asked to return the full URL, curl_url_get(3) returns a normalized
and possibly cleaned up version using all available URL parts.

We advise using the *CURLU_PUNYCODE* option to get the URL as "normalized"
as possible since IDN allows hostnames to be written in many different ways
that still end up the same punycode version.

## CURLUPART_SCHEME

Scheme cannot be URL decoded on get.

## CURLUPART_USER

## CURLUPART_PASSWORD

## CURLUPART_OPTIONS

The options field is an optional field that might follow the password in the
userinfo part. It is only recognized/used when parsing URLs for the following
schemes: pop3, smtp and imap. The URL API still allows users to set and get
this field independently of scheme when not parsing full URLs.

## CURLUPART_HOST

The hostname. If it is an IPv6 numeric address, the zone id is not part of it
but is provided separately in *CURLUPART_ZONEID*. Numerical IPv6 addresses
are returned within brackets ([]).

IPv6 addresses are normalized when set, which should make them as short as
possible while maintaining correct syntax.

## CURLUPART_ZONEID

If the hostname is a numeric IPv6 address, this field might also be set.

## CURLUPART_PORT

A port cannot be URL decoded on get. This number is returned in a string just
like all other parts. That string is guaranteed to hold a valid port number in
ASCII using base 10.

## CURLUPART_PATH

The path is always at least a slash ('/') even if no path was supplied
in the URL. A URL path always starts with a slash.

## CURLUPART_QUERY

The initial question mark that denotes the beginning of the query part is a
delimiter only. It is not part of the query contents.

A not-present query returns *content* set to NULL.
A zero-length query returns *content* as a zero-length string.

The query part gets pluses converted to space when asked to URL decode on get
with the CURLU_URLDECODE bit.

## CURLUPART_FRAGMENT

The initial hash sign that denotes the beginning of the fragment is a
delimiter only. It is not part of the fragment contents.

# EXAMPLE

~~~c
#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
  CURLUcode rc;
  CURLU *url = curl_url();
  if(!url)
    return 1; /* out of memory */
  rc = curl_url_set(url, CURLUPART_URL, "https://example.com", 0);
  if(!rc) {
    char *scheme;
    /* on success, scheme points to a newly allocated string */
    rc = curl_url_get(url, CURLUPART_SCHEME, &scheme, 0);
    if(!rc) {
      printf("the scheme is %s\n", scheme);
      curl_free(scheme);
    }
  }
  /* always release the handle, also when parsing failed */
  curl_url_cleanup(url);
  return rc ? 1 : 0;
}
~~~

# AVAILABILITY

Added in 7.62.0. CURLUPART_ZONEID was added in 7.65.0.

# RETURN VALUE

Returns a CURLUcode error value, which is CURLUE_OK (0) if everything went
fine. See the libcurl-errors(3) man page for the full list with
descriptions.

If this function returns an error, no URL part is returned.
