<?xml version="1.0" encoding="utf-8"?>
<!-- $Revision$ -->
<refentry xml:id="function.parse-url" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
<refnamediv>
<refname>parse_url</refname>
<refpurpose>Parse a URL and return its components</refpurpose>
</refnamediv>
<refsect1 role="description">
&reftitle.description;
<methodsynopsis>
<type class="union"><type>int</type><type>string</type><type>array</type><type>null</type><type>false</type></type><methodname>parse_url</methodname>
<methodparam><type>string</type><parameter>url</parameter></methodparam>
<methodparam choice="opt"><type>int</type><parameter>component</parameter><initializer>-1</initializer></methodparam>
</methodsynopsis>
<para>
This function parses a URL and returns an associative array containing any
of the various components of the URL that are present.
The values of the array elements are <emphasis>not</emphasis> URL decoded.
</para>
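<para>
<example>
<title>Decoding components after parsing (illustrative sketch)</title>
<para>
Because the returned values are not URL decoded, callers usually decode each
component themselves. The following is a minimal sketch, assuming a made-up
URL on <literal>example.com</literal>. It uses <function>rawurldecode</function>
for the path and <function>parse_str</function> for the query, which is one
reasonable choice rather than the only one.
</para>
<programlisting role="php">
<![CDATA[
<?php
// Hypothetical URL with percent-encoded characters in the path and query.
$url = 'http://example.com/my%20docs/report.pdf?title=Q1%20%26%20Q2';
$parts = parse_url($url);

// The raw components still contain the percent-encoding.
echo $parts['path'], PHP_EOL;
echo $parts['query'], PHP_EOL;

// Decode the path explicitly.
echo rawurldecode($parts['path']), PHP_EOL;

// parse_str() splits the query string and decodes names and values.
parse_str($parts['query'], $params);
echo $params['title'], PHP_EOL;
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
/my%20docs/report.pdf
title=Q1%20%26%20Q2
/my docs/report.pdf
Q1 & Q2
]]>
</screen>
</example>
</para>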
<para>
This function is <emphasis role="strong">not</emphasis> meant to validate
the given URL; it only breaks it up into the parts listed below. Partial and
invalid URLs are also accepted, and <function>parse_url</function> tries its
best to parse them correctly.
</para>
<caution>
<simpara>
This function does not follow any established URI or URL standard.
It will return incorrect or nonsensical results for relative or malformed
URLs. Even for valid URLs, the result may differ from that of a
different URL parser, because multiple URL-related standards exist that
target different use cases and differ in their requirements.
</simpara>
<simpara>
Processing a URL with parsers that follow different URL standards is a
common source of security vulnerabilities. For example, validating
a URL against an allow-list of acceptable hostnames with parser A
might be ineffective when the actual retrieval of the resource uses
parser B, which extracts hostnames differently.
</simpara>
<simpara>
The <classname>Uri\Rfc3986\Uri</classname> and <classname>Uri\WhatWg\Url</classname>
classes strictly follow RFC 3986 and the WHATWG URL Standard, respectively.
It is strongly recommended to use these classes for all newly written code
and to migrate existing uses of the <function>parse_url</function> function
to these classes, unless the <function>parse_url</function> behavior needs
to be preserved for compatibility reasons.
</simpara>
</caution>
</refsect1>
<refsect1 role="parameters">
&reftitle.parameters;
<para>
<variablelist>
<varlistentry>
<term><parameter>url</parameter></term>
<listitem>
<para>
The URL to parse.
</para>
</listitem>
</varlistentry>
</variablelist>
<variablelist>
<varlistentry>
<term><parameter>component</parameter></term>
<listitem>
<para>
Specify one of <constant>PHP_URL_SCHEME</constant>,
<constant>PHP_URL_HOST</constant>, <constant>PHP_URL_PORT</constant>,
<constant>PHP_URL_USER</constant>, <constant>PHP_URL_PASS</constant>,
<constant>PHP_URL_PATH</constant>, <constant>PHP_URL_QUERY</constant>
or <constant>PHP_URL_FRAGMENT</constant> to retrieve just a specific
URL component as a <type>string</type> (except when
<constant>PHP_URL_PORT</constant> is given, in which case the return
value will be an <type>int</type>).
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
</refsect1>
<refsect1 role="returnvalues">
&reftitle.returnvalues;
<para>
On seriously malformed URLs, <function>parse_url</function> may return
&false;.
</para>
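<para>
<example>
<title>Checking for a &false; return value (illustrative sketch)</title>
<para>
Since &false; may be returned, it can help to check the result before reading
any array keys. The sketch below assumes two made-up inputs, one of which has
a port but no host, and compares the result strictly against &false;.
</para>
<programlisting role="php">
<![CDATA[
<?php
// Hypothetical inputs: the second one has a port but an empty host.
$candidates = [
    'http://example.com/path',
    'http://:80',
];

foreach ($candidates as $candidate) {
    $parts = parse_url($candidate);

    // A strict comparison separates a failed parse from an array result.
    if ($parts === false) {
        echo "Could not parse: $candidate", PHP_EOL;
        continue;
    }

    // Any individual key may still be missing from a successful result.
    echo 'Host: ', $parts['host'] ?? '(none)', PHP_EOL;
}
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
Host: example.com
Could not parse: http://:80
]]>
</screen>
</example>
</para>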
<para>
If the <parameter>component</parameter> parameter is omitted, an
associative <type>array</type> is returned. At least one element will be
present within the array. Potential keys within this array are:
<itemizedlist>
<listitem>
<simpara>
<varname remap="structfield">scheme</varname> - e.g. <literal>http</literal>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">host</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">port</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">user</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">pass</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">path</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">query</varname> - after the question mark <literal>?</literal>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">fragment</varname> - after the hash mark <literal>#</literal>
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
If the <parameter>component</parameter> parameter is specified,
<function>parse_url</function> returns a <type>string</type> (or an
<type>int</type>, in the case of <constant>PHP_URL_PORT</constant>)
instead of an <type>array</type>. If the requested component doesn't exist
within the given URL, &null; will be returned.
As of PHP 8.0.0, <function>parse_url</function> distinguishes absent and empty
queries and fragments:
</para>
<para>
<informalexample>
<screen>
<![CDATA[
http://example.com/foo → query = null, fragment = null
http://example.com/foo? → query = "", fragment = null
http://example.com/foo# → query = null, fragment = ""
http://example.com/foo?# → query = "", fragment = ""
]]>
</screen>
</informalexample>
</para>
<para>
Prior to PHP 8.0.0, query and fragment were &null; in all of these cases.
</para>
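<para>
<example>
<title>Telling absent and empty components apart (illustrative sketch)</title>
<para>
The distinction between an absent and an empty query or fragment can be
observed with a strict check on the returned value. This is a minimal sketch
that assumes PHP 8.0.0 or later and uses <function>var_export</function> only
to make &null; and the empty string visible in the output.
</para>
<programlisting role="php">
<![CDATA[
<?php
$urls = [
    'http://example.com/foo',
    'http://example.com/foo?',
    'http://example.com/foo#',
    'http://example.com/foo?#',
];

foreach ($urls as $url) {
    $query = parse_url($url, PHP_URL_QUERY);
    $fragment = parse_url($url, PHP_URL_FRAGMENT);

    // var_export() prints NULL for an absent component and '' for an empty one.
    printf(
        "%s => query=%s, fragment=%s\n",
        $url,
        var_export($query, true),
        var_export($fragment, true)
    );
}
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
http://example.com/foo => query=NULL, fragment=NULL
http://example.com/foo? => query='', fragment=NULL
http://example.com/foo# => query=NULL, fragment=''
http://example.com/foo?# => query='', fragment=''
]]>
</screen>
</example>
</para>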
<para>
Note that control characters (cf. <function>ctype_cntrl</function>) in the
components are replaced with underscores (<literal>_</literal>).
</para>
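<para>
<example>
<title>Control characters in components (illustrative sketch)</title>
<para>
The replacement of control characters can be seen with a made-up URL that
contains a tab in the path and a newline in the query. This is only a sketch
of the behavior described above; the URL itself is an assumption chosen for
illustration.
</para>
<programlisting role="php">
<![CDATA[
<?php
// Double quotes are required so that \t and \n become real control characters.
$url = "http://example.com/pa\tth?na\nme=value";
$parts = parse_url($url);

var_dump($parts['path']);
var_dump($parts['query']);
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
string(6) "/pa_th"
string(11) "na_me=value"
]]>
</screen>
</example>
</para>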
</refsect1>
<refsect1 role="changelog">
&reftitle.changelog;
<informaltable>
<tgroup cols="2">
<thead>
<row>
<entry>&Version;</entry>
<entry>&Description;</entry>
</row>
</thead>
<tbody>
<row>
<entry>8.0.0</entry>
<entry>
<function>parse_url</function> now distinguishes absent and empty queries
and fragments.
</entry>
</row>
</tbody>
</tgroup>
</informaltable>
</refsect1>
<refsect1 role="examples">
&reftitle.examples;
<para>
<example>
<title>A <function>parse_url</function> example</title>
<programlisting role="php">
<![CDATA[
<?php
$url = 'http://username:password@hostname:9090/path?arg=value#anchor';
var_dump(parse_url($url));
var_dump(parse_url($url, PHP_URL_SCHEME));
var_dump(parse_url($url, PHP_URL_USER));
var_dump(parse_url($url, PHP_URL_PASS));
var_dump(parse_url($url, PHP_URL_HOST));
var_dump(parse_url($url, PHP_URL_PORT));
var_dump(parse_url($url, PHP_URL_PATH));
var_dump(parse_url($url, PHP_URL_QUERY));
var_dump(parse_url($url, PHP_URL_FRAGMENT));
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
array(8) {
["scheme"]=>
string(4) "http"
["host"]=>
string(8) "hostname"
["port"]=>
int(9090)
["user"]=>
string(8) "username"
["pass"]=>
string(8) "password"
["path"]=>
string(5) "/path"
["query"]=>
string(9) "arg=value"
["fragment"]=>
string(6) "anchor"
}
string(4) "http"
string(8) "username"
string(8) "password"
string(8) "hostname"
int(9090)
string(5) "/path"
string(9) "arg=value"
string(6) "anchor"
]]>
</screen>
</example>
</para>
<para>
<example>
<title>A <function>parse_url</function> example with missing scheme</title>
<programlisting role="php">
<![CDATA[
<?php
$url = '//www.example.com/path?googleguy=googley';
// Prior to PHP 5.4.7 this would show the path as "//www.example.com/path"
var_dump(parse_url($url));
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
array(3) {
["host"]=>
string(15) "www.example.com"
["path"]=>
string(5) "/path"
["query"]=>
string(17) "googleguy=googley"
}
]]>
</screen>
</example>
</para>
</refsect1>
<refsect1 role="notes">
&reftitle.notes;
<note>
<para>
This function is intended specifically for parsing URLs, not URIs.
However, to comply with PHP's backwards compatibility requirements, it
makes an exception for the <literal>file://</literal> scheme, where triple
slashes (<literal>file:///...</literal>) are allowed. For any other scheme,
triple slashes are invalid.
</para>
</note>
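<para>
<example>
<title>Triple slashes with and without the file scheme (illustrative sketch)</title>
<para>
The exception for the <literal>file://</literal> scheme can be compared
against another scheme using the same path. The path
<literal>/etc/hosts</literal> is only an assumed example value.
</para>
<programlisting role="php">
<![CDATA[
<?php
// Triple slashes are accepted for the file scheme.
var_dump(parse_url('file:///etc/hosts'));

// With any other scheme the host would be empty, so parsing fails.
var_dump(parse_url('http:///etc/hosts'));
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
array(2) {
  ["scheme"]=>
  string(4) "file"
  ["path"]=>
  string(10) "/etc/hosts"
}
bool(false)
]]>
</screen>
</example>
</para>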
</refsect1>
<refsect1 role="seealso">
&reftitle.seealso;
<para>
<simplelist>
<member><function>pathinfo</function></member>
<member><function>parse_str</function></member>
<member><function>http_build_query</function></member>
<member><function>dirname</function></member>
<member><function>basename</function></member>
<member><link xlink:href="&url.rfc;3986">RFC 3986</link></member>
</simplelist>
</para>
</refsect1>
</refentry>
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:t
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
indent-tabs-mode:nil
sgml-parent-document:nil
sgml-default-dtd-file:"~/.phpdoc/manual.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
vim600: syn=xml fen fdm=syntax fdl=2 si
vim: et tw=78 syn=sgml
vi: ts=1 sw=1
-->