1<html>
2<head>
3<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
4<title>HTMLparser: interface for an HTML 4.0 non-verifying parser</title>
5<meta name="generator" content="Libxml2 devhelp stylesheet">
6<link rel="start" href="index.html" title="libxml2 Reference Manual">
7<link rel="up" href="general.html" title="API">
8<link rel="stylesheet" href="style.css" type="text/css">
9<link rel="chapter" href="general.html" title="API">
10</head>
11<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
12<table class="navigation" width="100%" summary="Navigation header" cellpadding="2" cellspacing="2"><tr valign="middle">
13<td><a accesskey="u" href="general.html"><img src="up.png" width="24" height="24" border="0" alt="Up"></a></td>
14<td><a accesskey="h" href="index.html"><img src="home.png" width="24" height="24" border="0" alt="Home"></a></td>
15<td><a accesskey="n" href="libxml2-HTMLtree.html"><img src="right.png" width="24" height="24" border="0" alt="Next"></a></td>
16<th width="100%" align="center">libxml2 Reference Manual</th>
17</tr></table>
18<h2><span class="refentrytitle">HTMLparser</span></h2>
19<p>HTMLparser - interface for an HTML 4.0 non-verifying parser</p>
<p>This module implements an HTML 4.0 non-verifying parser with an API compatible with that of the XML parser. It should be able to parse "real world" HTML, even if severely broken from a specification point of view.</p>
21<p>Author(s): Daniel Veillard </p>
22<div class="refsynopsisdiv">
23<h2>Synopsis</h2>
24<pre class="synopsis">#define <a href="#htmlDefaultSubelement">htmlDefaultSubelement</a>(elt);
25#define <a href="#htmlElementAllowedHereDesc">htmlElementAllowedHereDesc</a>(parent, elt);
26#define <a href="#htmlRequiredAttrs">htmlRequiredAttrs</a>(elt);
27typedef <a href="libxml2-tree.html#xmlDocPtr">xmlDocPtr</a> <a href="#htmlDocPtr">htmlDocPtr</a>;
28typedef struct _htmlElemDesc <a href="#htmlElemDesc">htmlElemDesc</a>;
29typedef <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * <a href="#htmlElemDescPtr">htmlElemDescPtr</a>;
30typedef struct _htmlEntityDesc <a href="#htmlEntityDesc">htmlEntityDesc</a>;
31typedef <a href="libxml2-HTMLparser.html#htmlEntityDesc">htmlEntityDesc</a> * <a href="#htmlEntityDescPtr">htmlEntityDescPtr</a>;
32typedef <a href="libxml2-tree.html#xmlNodePtr">xmlNodePtr</a> <a href="#htmlNodePtr">htmlNodePtr</a>;
33typedef <a href="libxml2-tree.html#xmlParserCtxt">xmlParserCtxt</a> <a href="#htmlParserCtxt">htmlParserCtxt</a>;
34typedef <a href="libxml2-tree.html#xmlParserCtxtPtr">xmlParserCtxtPtr</a> <a href="#htmlParserCtxtPtr">htmlParserCtxtPtr</a>;
35typedef <a href="libxml2-tree.html#xmlParserInput">xmlParserInput</a> <a href="#htmlParserInput">htmlParserInput</a>;
36typedef <a href="libxml2-tree.html#xmlParserInputPtr">xmlParserInputPtr</a> <a href="#htmlParserInputPtr">htmlParserInputPtr</a>;
37typedef <a href="libxml2-parser.html#xmlParserNodeInfo">xmlParserNodeInfo</a> <a href="#htmlParserNodeInfo">htmlParserNodeInfo</a>;
38typedef enum <a href="#htmlParserOption">htmlParserOption</a>;
39typedef <a href="libxml2-tree.html#xmlSAXHandler">xmlSAXHandler</a> <a href="#htmlSAXHandler">htmlSAXHandler</a>;
40typedef <a href="libxml2-tree.html#xmlSAXHandlerPtr">xmlSAXHandlerPtr</a> <a href="#htmlSAXHandlerPtr">htmlSAXHandlerPtr</a>;
41typedef enum <a href="#htmlStatus">htmlStatus</a>;
42int	<a href="#UTF8ToHtml">UTF8ToHtml</a>			(unsigned char * out, <br>					 int * outlen, <br>					 const unsigned char * in, <br>					 int * inlen);
43<a href="libxml2-HTMLparser.html#htmlStatus">htmlStatus</a>	<a href="#htmlAttrAllowed">htmlAttrAllowed</a>		(const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * elt, <br>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * attr, <br>					 int legacy);
44int	<a href="#htmlAutoCloseTag">htmlAutoCloseTag</a>		(<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a> doc, <br>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * name, <br>					 <a href="libxml2-HTMLparser.html#htmlNodePtr">htmlNodePtr</a> elem);
45<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	<a href="#htmlCreateFileParserCtxt">htmlCreateFileParserCtxt</a>	(const char * filename, <br>							 const char * encoding);
46<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	<a href="#htmlCreateMemoryParserCtxt">htmlCreateMemoryParserCtxt</a>	(const char * buffer, <br>							 int size);
47<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	<a href="#htmlCreatePushParserCtxt">htmlCreatePushParserCtxt</a>	(<a href="libxml2-HTMLparser.html#htmlSAXHandlerPtr">htmlSAXHandlerPtr</a> sax, <br>							 void * user_data, <br>							 const char * chunk, <br>							 int size, <br>							 const char * filename, <br>							 <a href="libxml2-encoding.html#xmlCharEncoding">xmlCharEncoding</a> enc);
48<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlCtxtReadDoc">htmlCtxtReadDoc</a>		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * cur, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options);
49<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlCtxtReadFd">htmlCtxtReadFd</a>		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 int fd, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options);
50<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlCtxtReadFile">htmlCtxtReadFile</a>	(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 const char * filename, <br>					 const char * encoding, <br>					 int options);
51<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlCtxtReadIO">htmlCtxtReadIO</a>		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 <a href="libxml2-xmlIO.html#xmlInputReadCallback">xmlInputReadCallback</a> ioread, <br>					 <a href="libxml2-xmlIO.html#xmlInputCloseCallback">xmlInputCloseCallback</a> ioclose, <br>					 void * ioctx, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options);
52<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlCtxtReadMemory">htmlCtxtReadMemory</a>	(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 const char * buffer, <br>					 int size, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options);
53void	<a href="#htmlCtxtReset">htmlCtxtReset</a>			(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt);
54int	<a href="#htmlCtxtUseOptions">htmlCtxtUseOptions</a>		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 int options);
55int	<a href="#htmlElementAllowedHere">htmlElementAllowedHere</a>		(const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * parent, <br>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * elt);
56<a href="libxml2-HTMLparser.html#htmlStatus">htmlStatus</a>	<a href="#htmlElementStatusHere">htmlElementStatusHere</a>	(const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * parent, <br>					 const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * elt);
57int	<a href="#htmlEncodeEntities">htmlEncodeEntities</a>		(unsigned char * out, <br>					 int * outlen, <br>					 const unsigned char * in, <br>					 int * inlen, <br>					 int quoteChar);
58const <a href="libxml2-HTMLparser.html#htmlEntityDesc">htmlEntityDesc</a> *	<a href="#htmlEntityLookup">htmlEntityLookup</a>	(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * name);
59const <a href="libxml2-HTMLparser.html#htmlEntityDesc">htmlEntityDesc</a> *	<a href="#htmlEntityValueLookup">htmlEntityValueLookup</a>	(unsigned int value);
60void	<a href="#htmlFreeParserCtxt">htmlFreeParserCtxt</a>		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt);
61int	<a href="#htmlHandleOmittedElem">htmlHandleOmittedElem</a>		(int val);
62void	<a href="#htmlInitAutoClose">htmlInitAutoClose</a>		(void);
63int	<a href="#htmlIsAutoClosed">htmlIsAutoClosed</a>		(<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a> doc, <br>					 <a href="libxml2-HTMLparser.html#htmlNodePtr">htmlNodePtr</a> elem);
64int	<a href="#htmlIsScriptAttribute">htmlIsScriptAttribute</a>		(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * name);
65<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	<a href="#htmlNewParserCtxt">htmlNewParserCtxt</a>	(void);
66<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	<a href="#htmlNewSAXParserCtxt">htmlNewSAXParserCtxt</a>	(<a href="libxml2-HTMLparser.html#htmlSAXHandlerPtr">htmlSAXHandlerPtr</a> sax, <br>						 void * userData);
67<a href="libxml2-HTMLparser.html#htmlStatus">htmlStatus</a>	<a href="#htmlNodeStatus">htmlNodeStatus</a>		(const <a href="libxml2-HTMLparser.html#htmlNodePtr">htmlNodePtr</a> node, <br>					 int legacy);
68int	<a href="#htmlParseCharRef">htmlParseCharRef</a>		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt);
69int	<a href="#htmlParseChunk">htmlParseChunk</a>			(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 const char * chunk, <br>					 int size, <br>					 int terminate);
70<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlParseDoc">htmlParseDoc</a>		(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * cur, <br>					 const char * encoding);
71int	<a href="#htmlParseDocument">htmlParseDocument</a>		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt);
72void	<a href="#htmlParseElement">htmlParseElement</a>		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt);
73const <a href="libxml2-HTMLparser.html#htmlEntityDesc">htmlEntityDesc</a> *	<a href="#htmlParseEntityRef">htmlParseEntityRef</a>	(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>						 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> ** str);
74<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlParseFile">htmlParseFile</a>		(const char * filename, <br>					 const char * encoding);
75<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlReadDoc">htmlReadDoc</a>		(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * cur, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options);
76<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlReadFd">htmlReadFd</a>		(int fd, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options);
77<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlReadFile">htmlReadFile</a>		(const char * filename, <br>					 const char * encoding, <br>					 int options);
78<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlReadIO">htmlReadIO</a>		(<a href="libxml2-xmlIO.html#xmlInputReadCallback">xmlInputReadCallback</a> ioread, <br>					 <a href="libxml2-xmlIO.html#xmlInputCloseCallback">xmlInputCloseCallback</a> ioclose, <br>					 void * ioctx, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options);
79<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlReadMemory">htmlReadMemory</a>		(const char * buffer, <br>					 int size, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options);
80<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlSAXParseDoc">htmlSAXParseDoc</a>		(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * cur, <br>					 const char * encoding, <br>					 <a href="libxml2-HTMLparser.html#htmlSAXHandlerPtr">htmlSAXHandlerPtr</a> sax, <br>					 void * userData);
81<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	<a href="#htmlSAXParseFile">htmlSAXParseFile</a>	(const char * filename, <br>					 const char * encoding, <br>					 <a href="libxml2-HTMLparser.html#htmlSAXHandlerPtr">htmlSAXHandlerPtr</a> sax, <br>					 void * userData);
82const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> *	<a href="#htmlTagLookup">htmlTagLookup</a>	(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * tag);
83</pre>
84</div>
85<div class="refsect1" lang="en"><h2>Description</h2></div>
86<div class="refsect1" lang="en">
87<h2>Details</h2>
88<div class="refsect2" lang="en">
89<div class="refsect2" lang="en">
90<h3>
91<a name="htmlDefaultSubelement">Macro </a>htmlDefaultSubelement</h3>
92<pre class="programlisting">#define <a href="#htmlDefaultSubelement">htmlDefaultSubelement</a>(elt);
93</pre>
<p>Returns the default subelement for this element.</p>
95<div class="variablelist"><table border="0">
96<col align="left">
97<tbody><tr>
98<td><span class="term"><i><tt>elt</tt></i>:</span></td>
99<td>HTML element</td>
100</tr></tbody>
101</table></div>
102</div>
103<hr>
104<div class="refsect2" lang="en">
105<h3>
106<a name="htmlElementAllowedHereDesc">Macro </a>htmlElementAllowedHereDesc</h3>
107<pre class="programlisting">#define <a href="#htmlElementAllowedHereDesc">htmlElementAllowedHereDesc</a>(parent, elt);
108</pre>
109<p>Checks whether an HTML element description may be a direct child of the specified element. Returns 1 if allowed; 0 otherwise.</p>
110<div class="variablelist"><table border="0">
111<col align="left">
112<tbody>
113<tr>
114<td><span class="term"><i><tt>parent</tt></i>:</span></td>
115<td>HTML parent element</td>
116</tr>
117<tr>
118<td><span class="term"><i><tt>elt</tt></i>:</span></td>
119<td>HTML element</td>
120</tr>
121</tbody>
122</table></div>
123</div>
124<hr>
125<div class="refsect2" lang="en">
126<h3>
127<a name="htmlRequiredAttrs">Macro </a>htmlRequiredAttrs</h3>
128<pre class="programlisting">#define <a href="#htmlRequiredAttrs">htmlRequiredAttrs</a>(elt);
129</pre>
130<p>Returns the attributes required for the specified element.</p>
131<div class="variablelist"><table border="0">
132<col align="left">
133<tbody><tr>
134<td><span class="term"><i><tt>elt</tt></i>:</span></td>
135<td>HTML element</td>
136</tr></tbody>
137</table></div>
138</div>
139<hr>
140<div class="refsect2" lang="en">
141<h3>
142<a name="htmlDocPtr">Typedef </a>htmlDocPtr</h3>
143<pre class="programlisting"><a href="libxml2-tree.html#xmlDocPtr">xmlDocPtr</a> htmlDocPtr;
144</pre>
145<p></p>
146</div>
147<hr>
148<div class="refsect2" lang="en">
149<h3>
150<a name="htmlElemDesc">Structure </a>htmlElemDesc</h3>
151<pre class="programlisting">struct _htmlElemDesc {
152    const char *	name	: The tag name
153    char	startTag	: Whether the start tag can be implied
154    char	endTag	: Whether the end tag can be implied
155    char	saveEndTag	: Whether the end tag should be saved
156    char	empty	: Is this an empty element ?
157    char	depr	: Is this a deprecated element ?
158    char	dtd	: 1: only in Loose DTD, 2: only Frameset one
159    char	isinline	: is this a block 0 or inline 1 element
    const char *	desc	: the description (NRK Jan. 2003: new fields encapsulating HTML structure)
161    const char **	subelts	: allowed sub-elements of this element
162    const char *	defaultsubelt	: subelement for suggested auto-repair if necessary or NULL
163    const char **	attrs_opt	: Optional Attributes
164    const char **	attrs_depr	: Additional deprecated attributes
165    const char **	attrs_req	: Required attributes
166} htmlElemDesc;
167</pre>
168<p></p>
169</div>
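<p>The <tt>subelts</tt> and <tt>defaultsubelt</tt> fields can be read as a small content model for each element. The following is a minimal sketch in C, not libxml2's own code: it re-declares a reduced, hypothetical <tt>sketch_elem_desc</tt> structure with three of the fields listed above, fills in an illustrative entry for <tt>ul</tt>, and assumes that <tt>subelts</tt> is a NULL-terminated array of names. Real code would include the libxml2 headers and use <tt>htmlTagLookup</tt> and <tt>htmlElementAllowedHere</tt> instead.</p>

```c
#include <stddef.h>
#include <string.h>

/* Reduced, hypothetical mirror of htmlElemDesc (three fields only). */
struct sketch_elem_desc {
    const char *name;          /* the tag name */
    const char **subelts;      /* allowed sub-elements; assumed NULL-terminated */
    const char *defaultsubelt; /* suggested sub-element for auto-repair, or NULL */
};

/* Illustrative data, not copied from the library's tables. */
static const char *sketch_ul_subelts[] = { "li", NULL };
static const struct sketch_elem_desc sketch_ul = {
    "ul", sketch_ul_subelts, "li"
};

/* Accessor for the illustrative "ul" description. */
const struct sketch_elem_desc *sketch_ul_desc(void)
{
    return &sketch_ul;
}

/* Return 1 if child may be a direct child of parent, 0 otherwise. */
int sketch_allowed_here(const struct sketch_elem_desc *parent,
                        const char *child)
{
    const char **p;

    if (parent == NULL || child == NULL || parent->subelts == NULL)
        return 0;
    for (p = parent->subelts; *p != NULL; p++) {
        if (strcmp(*p, child) == 0)
            return 1;
    }
    return 0;
}
```

<p>With this data, <tt>sketch_allowed_here(sketch_ul_desc(), "li")</tt> yields 1, while asking about <tt>"p"</tt> yields 0, in which case a repair step could fall back to <tt>defaultsubelt</tt>.</p>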
170<hr>
171<div class="refsect2" lang="en">
172<h3>
173<a name="htmlElemDescPtr">Typedef </a>htmlElemDescPtr</h3>
174<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * htmlElemDescPtr;
175</pre>
176<p></p>
177</div>
178<hr>
179<div class="refsect2" lang="en">
180<h3>
181<a name="htmlEntityDesc">Structure </a>htmlEntityDesc</h3>
182<pre class="programlisting">struct _htmlEntityDesc {
183    unsigned int	value	: the UNICODE value for the character
184    const char *	name	: The entity name
185    const char *	desc	: the description
186} htmlEntityDesc;
187</pre>
188<p></p>
189</div>
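<p>An entity description pairs a Unicode value with an entity name, so lookups can go in either direction, as <tt>htmlEntityLookup</tt> and <tt>htmlEntityValueLookup</tt> do. The sketch below illustrates the idea with a hypothetical <tt>sketch_entity_desc</tt> structure and a five-entry table; the table, the function names, and the linear search are assumptions made for illustration and say nothing about how libxml2 stores or searches its own entity table.</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of htmlEntityDesc. */
struct sketch_entity_desc {
    unsigned int value; /* the Unicode value for the character */
    const char *name;   /* the entity name */
    const char *desc;   /* the description */
};

/* A tiny illustrative table; the real table is much larger. */
static const struct sketch_entity_desc sketch_entities[] = {
    { 34,  "quot", "quotation mark" },
    { 38,  "amp",  "ampersand" },
    { 60,  "lt",   "less-than sign" },
    { 62,  "gt",   "greater-than sign" },
    { 160, "nbsp", "no-break space" }
};

#define SKETCH_ENTITY_COUNT \
    (sizeof(sketch_entities) / sizeof(sketch_entities[0]))

/* Find an entity by name; returns NULL if it is not in the table. */
const struct sketch_entity_desc *sketch_entity_by_name(const char *name)
{
    size_t i;

    if (name == NULL)
        return NULL;
    for (i = 0; i < SKETCH_ENTITY_COUNT; i++) {
        if (strcmp(sketch_entities[i].name, name) == 0)
            return &sketch_entities[i];
    }
    return NULL;
}

/* Find an entity by Unicode value; returns NULL if it is not in the table. */
const struct sketch_entity_desc *sketch_entity_by_value(unsigned int value)
{
    size_t i;

    for (i = 0; i < SKETCH_ENTITY_COUNT; i++) {
        if (sketch_entities[i].value == value)
            return &sketch_entities[i];
    }
    return NULL;
}
```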
190<hr>
191<div class="refsect2" lang="en">
192<h3>
193<a name="htmlEntityDescPtr">Typedef </a>htmlEntityDescPtr</h3>
194<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlEntityDesc">htmlEntityDesc</a> * htmlEntityDescPtr;
195</pre>
196<p></p>
197</div>
198<hr>
199<div class="refsect2" lang="en">
200<h3>
201<a name="htmlNodePtr">Typedef </a>htmlNodePtr</h3>
202<pre class="programlisting"><a href="libxml2-tree.html#xmlNodePtr">xmlNodePtr</a> htmlNodePtr;
203</pre>
204<p></p>
205</div>
206<hr>
207<div class="refsect2" lang="en">
208<h3>
209<a name="htmlParserCtxt">Typedef </a>htmlParserCtxt</h3>
210<pre class="programlisting"><a href="libxml2-tree.html#xmlParserCtxt">xmlParserCtxt</a> htmlParserCtxt;
211</pre>
212<p></p>
213</div>
214<hr>
215<div class="refsect2" lang="en">
216<h3>
217<a name="htmlParserCtxtPtr">Typedef </a>htmlParserCtxtPtr</h3>
218<pre class="programlisting"><a href="libxml2-tree.html#xmlParserCtxtPtr">xmlParserCtxtPtr</a> htmlParserCtxtPtr;
219</pre>
220<p></p>
221</div>
222<hr>
223<div class="refsect2" lang="en">
224<h3>
225<a name="htmlParserInput">Typedef </a>htmlParserInput</h3>
226<pre class="programlisting"><a href="libxml2-tree.html#xmlParserInput">xmlParserInput</a> htmlParserInput;
227</pre>
228<p></p>
229</div>
230<hr>
231<div class="refsect2" lang="en">
232<h3>
233<a name="htmlParserInputPtr">Typedef </a>htmlParserInputPtr</h3>
234<pre class="programlisting"><a href="libxml2-tree.html#xmlParserInputPtr">xmlParserInputPtr</a> htmlParserInputPtr;
235</pre>
236<p></p>
237</div>
238<hr>
239<div class="refsect2" lang="en">
240<h3>
241<a name="htmlParserNodeInfo">Typedef </a>htmlParserNodeInfo</h3>
242<pre class="programlisting"><a href="libxml2-parser.html#xmlParserNodeInfo">xmlParserNodeInfo</a> htmlParserNodeInfo;
243</pre>
244<p></p>
245</div>
246<hr>
247<div class="refsect2" lang="en">
248<h3>
249<a name="htmlParserOption">Enum </a>htmlParserOption</h3>
250<pre class="programlisting">enum <a href="#htmlParserOption">htmlParserOption</a> {
251    <a name="HTML_PARSE_RECOVER">HTML_PARSE_RECOVER</a> = 1 /* Relaxed parsing */
252    <a name="HTML_PARSE_NODEFDTD">HTML_PARSE_NODEFDTD</a> = 4 /* do not default a doctype if not found */
253    <a name="HTML_PARSE_NOERROR">HTML_PARSE_NOERROR</a> = 32 /* suppress error reports */
254    <a name="HTML_PARSE_NOWARNING">HTML_PARSE_NOWARNING</a> = 64 /* suppress warning reports */
255    <a name="HTML_PARSE_PEDANTIC">HTML_PARSE_PEDANTIC</a> = 128 /* pedantic error reporting */
256    <a name="HTML_PARSE_NOBLANKS">HTML_PARSE_NOBLANKS</a> = 256 /* remove blank nodes */
257    <a name="HTML_PARSE_NONET">HTML_PARSE_NONET</a> = 2048 /* Forbid network access */
258    <a name="HTML_PARSE_NOIMPLIED">HTML_PARSE_NOIMPLIED</a> = 8192 /* Do not add implied html/body... elements */
259    <a name="HTML_PARSE_COMPACT">HTML_PARSE_COMPACT</a> = 65536 /* compact small text nodes */
260    <a name="HTML_PARSE_IGNORE_ENC">HTML_PARSE_IGNORE_ENC</a> = 2097152 /*  ignore internal document encoding hint */
261};
262</pre>
263<p></p>
264</div>
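<p>Each option above is a distinct bit, so the <tt>options</tt> argument of the reader functions is built with a bitwise OR. A minimal sketch in C, assuming only the numeric values listed in this enum: the constants are re-declared under hypothetical <tt>SKETCH_</tt> names so that the fragment compiles without the libxml2 headers. Real code would include <tt>libxml/HTMLparser.h</tt> and use the <tt>HTML_PARSE_*</tt> names directly.</p>

```c
/* Local mirror of four htmlParserOption values from the listing above.
 * The SKETCH_ names are hypothetical. */
enum sketch_html_parse_option {
    SKETCH_HTML_PARSE_RECOVER   = 1,   /* relaxed parsing */
    SKETCH_HTML_PARSE_NOERROR   = 32,  /* suppress error reports */
    SKETCH_HTML_PARSE_NOWARNING = 64,  /* suppress warning reports */
    SKETCH_HTML_PARSE_NONET     = 2048 /* forbid network access */
};

/* Build a mask for lenient, quiet, offline parsing. */
int sketch_quiet_offline_options(void)
{
    return SKETCH_HTML_PARSE_RECOVER | SKETCH_HTML_PARSE_NOERROR |
           SKETCH_HTML_PARSE_NOWARNING | SKETCH_HTML_PARSE_NONET;
}

/* Report whether every bit of flag is set in options. */
int sketch_has_option(int options, int flag)
{
    return (options & flag) == flag;
}
```

<p>The resulting mask is the kind of value passed as <tt>options</tt> to the <tt>htmlRead*</tt> and <tt>htmlCtxtRead*</tt> functions.</p>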
265<hr>
266<div class="refsect2" lang="en">
267<h3>
268<a name="htmlSAXHandler">Typedef </a>htmlSAXHandler</h3>
269<pre class="programlisting"><a href="libxml2-tree.html#xmlSAXHandler">xmlSAXHandler</a> htmlSAXHandler;
270</pre>
271<p></p>
272</div>
273<hr>
274<div class="refsect2" lang="en">
275<h3>
276<a name="htmlSAXHandlerPtr">Typedef </a>htmlSAXHandlerPtr</h3>
277<pre class="programlisting"><a href="libxml2-tree.html#xmlSAXHandlerPtr">xmlSAXHandlerPtr</a> htmlSAXHandlerPtr;
278</pre>
279<p></p>
280</div>
281<hr>
282<div class="refsect2" lang="en">
283<h3>
284<a name="htmlStatus">Enum </a>htmlStatus</h3>
285<pre class="programlisting">enum <a href="#htmlStatus">htmlStatus</a> {
286    <a name="HTML_NA">HTML_NA</a> = 0 /* something we don't check at all */
287    <a name="HTML_INVALID">HTML_INVALID</a> = 1
288    <a name="HTML_DEPRECATED">HTML_DEPRECATED</a> = 2
289    <a name="HTML_VALID">HTML_VALID</a> = 4
290    <a name="HTML_REQUIRED">HTML_REQUIRED</a> = 12 /*  VALID bit set so ( &amp; HTML_VALID ) is TRUE */
291};
292</pre>
293<p></p>
294</div>
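<p>The comment on <tt>HTML_REQUIRED</tt> points at a bit trick: 12 is 8 + 4, so it includes the <tt>HTML_VALID</tt> bit, and one mask test accepts both "valid" and "required". A minimal sketch, with the enum values re-declared under hypothetical <tt>SKETCH_</tt> names so that it compiles without the libxml2 headers:</p>

```c
/* Local mirror of the htmlStatus values listed above (hypothetical names). */
enum sketch_html_status {
    SKETCH_HTML_NA         = 0,  /* something we don't check at all */
    SKETCH_HTML_INVALID    = 1,
    SKETCH_HTML_DEPRECATED = 2,
    SKETCH_HTML_VALID      = 4,
    SKETCH_HTML_REQUIRED   = 12  /* 8 | 4: the VALID bit is set */
};

/* True for both VALID and REQUIRED, false for every other status. */
int sketch_status_is_valid(int status)
{
    return (status & SKETCH_HTML_VALID) != 0;
}
```

<p>A caller that only cares whether something is acceptable can use the mask test; one that must distinguish required from merely valid compares against the exact value.</p>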
295<hr>
296<div class="refsect2" lang="en">
297<h3>
298<a name="UTF8ToHtml"></a>UTF8ToHtml ()</h3>
299<pre class="programlisting">int	UTF8ToHtml			(unsigned char * out, <br>					 int * outlen, <br>					 const unsigned char * in, <br>					 int * inlen)<br>
300</pre>
301<p>Take a block of UTF-8 chars in and try to convert it to an ASCII plus HTML entities block of chars out.</p>
302<div class="variablelist"><table border="0">
303<col align="left">
304<tbody>
305<tr>
306<td><span class="term"><i><tt>out</tt></i>:</span></td>
307<td>a pointer to an array of bytes to store the result</td>
308</tr>
309<tr>
310<td><span class="term"><i><tt>outlen</tt></i>:</span></td>
311<td>the length of @out</td>
312</tr>
313<tr>
314<td><span class="term"><i><tt>in</tt></i>:</span></td>
315<td>a pointer to an array of UTF-8 chars</td>
316</tr>
317<tr>
318<td><span class="term"><i><tt>inlen</tt></i>:</span></td>
319<td>the length of @in</td>
320</tr>
321<tr>
322<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
<td>0 on success, -2 if the transcoding fails, or -1 otherwise. The value of @inlen after return is the number of octets consumed if the call succeeds, else unpredictable. The value of @outlen after return is the number of octets produced.</td>
324</tr>
325</tbody>
326</table></div>
327</div>
328<hr>
329<div class="refsect2" lang="en">
330<h3>
331<a name="htmlAttrAllowed"></a>htmlAttrAllowed ()</h3>
332<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlStatus">htmlStatus</a>	htmlAttrAllowed		(const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * elt, <br>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * attr, <br>					 int legacy)<br>
333</pre>
<p>Checks whether an <a href="libxml2-SAX.html#attribute">attribute</a> is valid for an element. Has full knowledge of required and deprecated attributes.</p>
335<div class="variablelist"><table border="0">
336<col align="left">
337<tbody>
338<tr>
339<td><span class="term"><i><tt>elt</tt></i>:</span></td>
340<td>HTML element</td>
341</tr>
342<tr>
343<td><span class="term"><i><tt>attr</tt></i>:</span></td>
344<td>HTML <a href="libxml2-SAX.html#attribute">attribute</a>
345</td>
346</tr>
347<tr>
348<td><span class="term"><i><tt>legacy</tt></i>:</span></td>
349<td>whether to allow deprecated attributes</td>
350</tr>
351<tr>
352<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
353<td>one of HTML_REQUIRED, HTML_VALID, HTML_DEPRECATED, <a href="libxml2-HTMLparser.html#HTML_INVALID">HTML_INVALID</a>
354</td>
355</tr>
356</tbody>
357</table></div>
358</div>
359<hr>
360<div class="refsect2" lang="en">
361<h3>
362<a name="htmlAutoCloseTag"></a>htmlAutoCloseTag ()</h3>
363<pre class="programlisting">int	htmlAutoCloseTag		(<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a> doc, <br>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * name, <br>					 <a href="libxml2-HTMLparser.html#htmlNodePtr">htmlNodePtr</a> elem)<br>
364</pre>
<p>The HTML DTD allows a tag to implicitly close other tags. The list is kept in the htmlStartClose array. This function checks whether the element or one of its children would autoclose the given tag.</p>
366<div class="variablelist"><table border="0">
367<col align="left">
368<tbody>
369<tr>
370<td><span class="term"><i><tt>doc</tt></i>:</span></td>
371<td>the HTML document</td>
372</tr>
373<tr>
374<td><span class="term"><i><tt>name</tt></i>:</span></td>
375<td>The tag name</td>
376</tr>
377<tr>
378<td><span class="term"><i><tt>elem</tt></i>:</span></td>
379<td>the HTML element</td>
380</tr>
381<tr>
382<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
383<td>1 if autoclose, 0 otherwise</td>
384</tr>
385</tbody>
386</table></div>
387</div>
388<hr>
389<div class="refsect2" lang="en">
390<h3>
391<a name="htmlCreateFileParserCtxt"></a>htmlCreateFileParserCtxt ()</h3>
392<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	htmlCreateFileParserCtxt	(const char * filename, <br>							 const char * encoding)<br>
393</pre>
<p>Create a parser context for the content of a file. Automatic support for ZLIB/Compress compressed documents is provided by default if found at compile time.</p>
395<div class="variablelist"><table border="0">
396<col align="left">
397<tbody>
398<tr>
399<td><span class="term"><i><tt>filename</tt></i>:</span></td>
400<td>the filename</td>
401</tr>
402<tr>
403<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
404<td>a free form C string describing the HTML document encoding, or NULL</td>
405</tr>
406<tr>
407<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
408<td>the new parser context or NULL</td>
409</tr>
410</tbody>
411</table></div>
412</div>
413<hr>
414<div class="refsect2" lang="en">
415<h3>
416<a name="htmlCreateMemoryParserCtxt"></a>htmlCreateMemoryParserCtxt ()</h3>
417<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	htmlCreateMemoryParserCtxt	(const char * buffer, <br>							 int size)<br>
418</pre>
419<p>Create a parser context for an HTML in-memory document.</p>
420<div class="variablelist"><table border="0">
421<col align="left">
422<tbody>
423<tr>
424<td><span class="term"><i><tt>buffer</tt></i>:</span></td>
425<td>a pointer to a char array</td>
426</tr>
427<tr>
428<td><span class="term"><i><tt>size</tt></i>:</span></td>
429<td>the size of the array</td>
430</tr>
431<tr>
432<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
433<td>the new parser context or NULL</td>
434</tr>
435</tbody>
436</table></div>
437</div>
438<hr>
439<div class="refsect2" lang="en">
440<h3>
441<a name="htmlCreatePushParserCtxt"></a>htmlCreatePushParserCtxt ()</h3>
442<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	htmlCreatePushParserCtxt	(<a href="libxml2-HTMLparser.html#htmlSAXHandlerPtr">htmlSAXHandlerPtr</a> sax, <br>							 void * user_data, <br>							 const char * chunk, <br>							 int size, <br>							 const char * filename, <br>							 <a href="libxml2-encoding.html#xmlCharEncoding">xmlCharEncoding</a> enc)<br>
443</pre>
<p>Create a parser context for using the HTML parser in push mode. The value of @filename is used for fetching external entities and for error/warning reports.</p>
445<div class="variablelist"><table border="0">
446<col align="left">
447<tbody>
448<tr>
449<td><span class="term"><i><tt>sax</tt></i>:</span></td>
450<td>a SAX handler</td>
451</tr>
452<tr>
453<td><span class="term"><i><tt>user_data</tt></i>:</span></td>
454<td>The user data returned on SAX callbacks</td>
455</tr>
456<tr>
457<td><span class="term"><i><tt>chunk</tt></i>:</span></td>
458<td>a pointer to an array of chars</td>
459</tr>
460<tr>
461<td><span class="term"><i><tt>size</tt></i>:</span></td>
462<td>number of chars in the array</td>
463</tr>
464<tr>
465<td><span class="term"><i><tt>filename</tt></i>:</span></td>
466<td>an optional file name or URI</td>
467</tr>
468<tr>
469<td><span class="term"><i><tt>enc</tt></i>:</span></td>
470<td>an optional encoding</td>
471</tr>
472<tr>
473<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
474<td>the new parser context or NULL</td>
475</tr>
476</tbody>
477</table></div>
478</div>
479<hr>
480<div class="refsect2" lang="en">
481<h3>
482<a name="htmlCtxtReadDoc"></a>htmlCtxtReadDoc ()</h3>
483<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlCtxtReadDoc		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * cur, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options)<br>
484</pre>
<p>Parse an HTML in-memory document and build a tree. This reuses the existing @ctxt parser context.</p>
486<div class="variablelist"><table border="0">
487<col align="left">
488<tbody>
489<tr>
490<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
491<td>an HTML parser context</td>
492</tr>
493<tr>
494<td><span class="term"><i><tt>cur</tt></i>:</span></td>
495<td>a pointer to a zero terminated string</td>
496</tr>
497<tr>
498<td><span class="term"><i><tt>URL</tt></i>:</span></td>
499<td>the base URL to use for the document</td>
500</tr>
501<tr>
502<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
503<td>the document encoding, or NULL</td>
504</tr>
505<tr>
506<td><span class="term"><i><tt>options</tt></i>:</span></td>
507<td>a combination of htmlParserOption(s)</td>
508</tr>
509<tr>
510<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
511<td>the resulting document tree</td>
512</tr>
513</tbody>
514</table></div>
515</div>
516<hr>
517<div class="refsect2" lang="en">
518<h3>
519<a name="htmlCtxtReadFd"></a>htmlCtxtReadFd ()</h3>
520<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlCtxtReadFd		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 int fd, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options)<br>
521</pre>
<p>Parse an HTML document from a file descriptor and build a tree. This reuses the existing @ctxt parser context.</p>
523<div class="variablelist"><table border="0">
524<col align="left">
525<tbody>
526<tr>
527<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
528<td>an HTML parser context</td>
529</tr>
530<tr>
531<td><span class="term"><i><tt>fd</tt></i>:</span></td>
532<td>an open file descriptor</td>
533</tr>
534<tr>
535<td><span class="term"><i><tt>URL</tt></i>:</span></td>
536<td>the base URL to use for the document</td>
537</tr>
538<tr>
539<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
540<td>the document encoding, or NULL</td>
541</tr>
542<tr>
543<td><span class="term"><i><tt>options</tt></i>:</span></td>
544<td>a combination of htmlParserOption(s)</td>
545</tr>
546<tr>
547<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
548<td>the resulting document tree</td>
549</tr>
550</tbody>
551</table></div>
552</div>
553<hr>
554<div class="refsect2" lang="en">
555<h3>
556<a name="htmlCtxtReadFile"></a>htmlCtxtReadFile ()</h3>
557<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlCtxtReadFile	(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 const char * filename, <br>					 const char * encoding, <br>					 int options)<br>
558</pre>
<p>Parse an HTML file from the filesystem or the network. This reuses the existing @ctxt parser context.</p>
560<div class="variablelist"><table border="0">
561<col align="left">
562<tbody>
563<tr>
564<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
565<td>an HTML parser context</td>
566</tr>
567<tr>
568<td><span class="term"><i><tt>filename</tt></i>:</span></td>
569<td>a file or URL</td>
570</tr>
571<tr>
572<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
573<td>the document encoding, or NULL</td>
574</tr>
575<tr>
576<td><span class="term"><i><tt>options</tt></i>:</span></td>
577<td>a combination of htmlParserOption(s)</td>
578</tr>
579<tr>
580<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
581<td>the resulting document tree</td>
582</tr>
583</tbody>
584</table></div>
585</div>
586<hr>
587<div class="refsect2" lang="en">
588<h3>
589<a name="htmlCtxtReadIO"></a>htmlCtxtReadIO ()</h3>
590<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlCtxtReadIO		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 <a href="libxml2-xmlIO.html#xmlInputReadCallback">xmlInputReadCallback</a> ioread, <br>					 <a href="libxml2-xmlIO.html#xmlInputCloseCallback">xmlInputCloseCallback</a> ioclose, <br>					 void * ioctx, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options)<br>
591</pre>
592<p>parse an HTML document from I/O functions and source and build a tree. This reuses the existing @ctxt parser context</p>
593<div class="variablelist"><table border="0">
594<col align="left">
595<tbody>
596<tr>
597<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
598<td>an HTML parser context</td>
599</tr>
600<tr>
601<td><span class="term"><i><tt>ioread</tt></i>:</span></td>
602<td>an I/O read function</td>
603</tr>
604<tr>
605<td><span class="term"><i><tt>ioclose</tt></i>:</span></td>
606<td>an I/O close function</td>
607</tr>
608<tr>
609<td><span class="term"><i><tt>ioctx</tt></i>:</span></td>
610<td>an I/O handler</td>
611</tr>
612<tr>
613<td><span class="term"><i><tt>URL</tt></i>:</span></td>
614<td>the base URL to use for the document</td>
615</tr>
616<tr>
617<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
618<td>the document encoding, or NULL</td>
619</tr>
620<tr>
621<td><span class="term"><i><tt>options</tt></i>:</span></td>
622<td>a combination of htmlParserOption(s)</td>
623</tr>
624<tr>
625<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
626<td>the resulting document tree</td>
627</tr>
628</tbody>
629</table></div>
630</div>
631<hr>
632<div class="refsect2" lang="en">
633<h3>
634<a name="htmlCtxtReadMemory"></a>htmlCtxtReadMemory ()</h3>
635<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlCtxtReadMemory	(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 const char * buffer, <br>					 int size, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options)<br>
636</pre>
637<p>Parse an HTML in-memory document and build a tree. This reuses the existing @ctxt parser context.</p>
638<div class="variablelist"><table border="0">
639<col align="left">
640<tbody>
641<tr>
642<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
643<td>an HTML parser context</td>
644</tr>
645<tr>
646<td><span class="term"><i><tt>buffer</tt></i>:</span></td>
647<td>a pointer to a char array</td>
648</tr>
649<tr>
650<td><span class="term"><i><tt>size</tt></i>:</span></td>
651<td>the size of the array</td>
652</tr>
653<tr>
654<td><span class="term"><i><tt>URL</tt></i>:</span></td>
655<td>the base URL to use for the document</td>
656</tr>
657<tr>
658<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
659<td>the document encoding, or NULL</td>
660</tr>
661<tr>
662<td><span class="term"><i><tt>options</tt></i>:</span></td>
663<td>a combination of htmlParserOption(s)</td>
664</tr>
665<tr>
666<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
667<td>the resulting document tree</td>
668</tr>
669</tbody>
670</table></div>
671</div>
672<hr>
673<div class="refsect2" lang="en">
674<h3>
675<a name="htmlCtxtReset"></a>htmlCtxtReset ()</h3>
676<pre class="programlisting">void	htmlCtxtReset			(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt)<br>
677</pre>
678<p>Reset a parser context</p>
679<div class="variablelist"><table border="0">
680<col align="left">
681<tbody><tr>
682<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
683<td>an HTML parser context</td>
684</tr></tbody>
685</table></div>
686</div>
687<hr>
688<div class="refsect2" lang="en">
689<h3>
690<a name="htmlCtxtUseOptions"></a>htmlCtxtUseOptions ()</h3>
691<pre class="programlisting">int	htmlCtxtUseOptions		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 int options)<br>
692</pre>
693<p>Applies the options to the parser context</p>
694<div class="variablelist"><table border="0">
695<col align="left">
696<tbody>
697<tr>
698<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
699<td>an HTML parser context</td>
700</tr>
701<tr>
702<td><span class="term"><i><tt>options</tt></i>:</span></td>
703<td>a combination of htmlParserOption(s)</td>
704</tr>
705<tr>
706<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
707<td>0 in case of success, the set of unknown or unimplemented options in case of error.</td>
708</tr>
709</tbody>
710</table></div>
711</div>
712<hr>
713<div class="refsect2" lang="en">
714<h3>
715<a name="htmlElementAllowedHere"></a>htmlElementAllowedHere ()</h3>
716<pre class="programlisting">int	htmlElementAllowedHere		(const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * parent, <br>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * elt)<br>
717</pre>
718<p>Checks whether an HTML element may be a direct child of a parent element. Note - doesn't check for deprecated elements</p>
719<div class="variablelist"><table border="0">
720<col align="left">
721<tbody>
722<tr>
723<td><span class="term"><i><tt>parent</tt></i>:</span></td>
724<td>HTML parent element</td>
725</tr>
726<tr>
727<td><span class="term"><i><tt>elt</tt></i>:</span></td>
728<td>HTML element</td>
729</tr>
730<tr>
731<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
732<td>1 if allowed; 0 otherwise.</td>
733</tr>
734</tbody>
735</table></div>
736</div>
737<hr>
738<div class="refsect2" lang="en">
739<h3>
740<a name="htmlElementStatusHere"></a>htmlElementStatusHere ()</h3>
741<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlStatus">htmlStatus</a>	htmlElementStatusHere	(const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * parent, <br>					 const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * elt)<br>
742</pre>
743<p>Checks whether an HTML element may be a direct child of a parent element and, if so, whether it is valid or deprecated.</p>
744<div class="variablelist"><table border="0">
745<col align="left">
746<tbody>
747<tr>
748<td><span class="term"><i><tt>parent</tt></i>:</span></td>
749<td>HTML parent element</td>
750</tr>
751<tr>
752<td><span class="term"><i><tt>elt</tt></i>:</span></td>
753<td>HTML element</td>
754</tr>
755<tr>
756<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
757<td>one of HTML_VALID, HTML_DEPRECATED, <a href="libxml2-HTMLparser.html#HTML_INVALID">HTML_INVALID</a>
758</td>
759</tr>
760</tbody>
761</table></div>
762</div>
763<hr>
764<div class="refsect2" lang="en">
765<h3>
766<a name="htmlEncodeEntities"></a>htmlEncodeEntities ()</h3>
767<pre class="programlisting">int	htmlEncodeEntities		(unsigned char * out, <br>					 int * outlen, <br>					 const unsigned char * in, <br>					 int * inlen, <br>					 int quoteChar)<br>
768</pre>
769<p>Take a block of UTF-8 chars in and try to convert it to an ASCII plus HTML entities block of chars out.</p>
770<div class="variablelist"><table border="0">
771<col align="left">
772<tbody>
773<tr>
774<td><span class="term"><i><tt>out</tt></i>:</span></td>
775<td>a pointer to an array of bytes to store the result</td>
776</tr>
777<tr>
778<td><span class="term"><i><tt>outlen</tt></i>:</span></td>
779<td>the length of @out</td>
780</tr>
781<tr>
782<td><span class="term"><i><tt>in</tt></i>:</span></td>
783<td>a pointer to an array of UTF-8 chars</td>
784</tr>
785<tr>
786<td><span class="term"><i><tt>inlen</tt></i>:</span></td>
787<td>the length of @in</td>
788</tr>
789<tr>
790<td><span class="term"><i><tt>quoteChar</tt></i>:</span></td>
791<td>the quote character to escape (' or ") or zero.</td>
792</tr>
793<tr>
794<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
795<td>0 on success, -2 if the transcoding fails, or -1 otherwise. The value of @inlen after return is the number of octets consumed if the return value is non-negative, else unpredictable. The value of @outlen after return is the number of octets produced.</td>
796</tr>
797</tbody>
798</table></div>
799</div>
800<hr>
801<div class="refsect2" lang="en">
802<h3>
803<a name="htmlEntityLookup"></a>htmlEntityLookup ()</h3>
804<pre class="programlisting">const <a href="libxml2-HTMLparser.html#htmlEntityDesc">htmlEntityDesc</a> *	htmlEntityLookup	(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * name)<br>
805</pre>
806<p>Look up the given entity in EntitiesTable. TODO: the linear scan is really ugly; a hash table is really needed.</p>
807<div class="variablelist"><table border="0">
808<col align="left">
809<tbody>
810<tr>
811<td><span class="term"><i><tt>name</tt></i>:</span></td>
812<td>the entity name</td>
813</tr>
814<tr>
815<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
816<td>the associated <a href="libxml2-HTMLparser.html#htmlEntityDescPtr">htmlEntityDescPtr</a> if found, NULL otherwise.</td>
817</tr>
818</tbody>
819</table></div>
820</div>
821<hr>
822<div class="refsect2" lang="en">
823<h3>
824<a name="htmlEntityValueLookup"></a>htmlEntityValueLookup ()</h3>
825<pre class="programlisting">const <a href="libxml2-HTMLparser.html#htmlEntityDesc">htmlEntityDesc</a> *	htmlEntityValueLookup	(unsigned int value)<br>
826</pre>
827<p>Look up the given entity in EntitiesTable by its Unicode value. TODO: the linear scan is really ugly; a hash table is really needed.</p>
828<div class="variablelist"><table border="0">
829<col align="left">
830<tbody>
831<tr>
832<td><span class="term"><i><tt>value</tt></i>:</span></td>
833<td>the entity's unicode value</td>
834</tr>
835<tr>
836<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
837<td>the associated <a href="libxml2-HTMLparser.html#htmlEntityDescPtr">htmlEntityDescPtr</a> if found, NULL otherwise.</td>
838</tr>
839</tbody>
840</table></div>
841</div>
842<hr>
843<div class="refsect2" lang="en">
844<h3>
845<a name="htmlFreeParserCtxt"></a>htmlFreeParserCtxt ()</h3>
846<pre class="programlisting">void	htmlFreeParserCtxt		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt)<br>
847</pre>
848<p>Free all the memory used by a parser context. However the parsed document in ctxt-&gt;myDoc is not freed.</p>
849<div class="variablelist"><table border="0">
850<col align="left">
851<tbody><tr>
852<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
853<td>an HTML parser context</td>
854</tr></tbody>
855</table></div>
856</div>
857<hr>
858<div class="refsect2" lang="en">
859<h3>
860<a name="htmlHandleOmittedElem"></a>htmlHandleOmittedElem ()</h3>
861<pre class="programlisting">int	htmlHandleOmittedElem		(int val)<br>
862</pre>
863<p>Set and return the previous value for handling HTML omitted tags.</p>
864<div class="variablelist"><table border="0">
865<col align="left">
866<tbody>
867<tr>
868<td><span class="term"><i><tt>val</tt></i>:</span></td>
869<td>int 0 or 1</td>
870</tr>
871<tr>
872<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
873<td>the previous value: 0 for no handling, 1 for auto insertion.</td>
874</tr>
875</tbody>
876</table></div>
877</div>
878<hr>
879<div class="refsect2" lang="en">
880<h3>
881<a name="htmlInitAutoClose"></a>htmlInitAutoClose ()</h3>
882<pre class="programlisting">void	htmlInitAutoClose		(void)<br>
883</pre>
884<p>DEPRECATED: This function will be made private. Call <a href="libxml2-parser.html#xmlInitParser">xmlInitParser</a> to initialize the library. This is a no-op now.</p>
885</div>
886<hr>
887<div class="refsect2" lang="en">
888<h3>
889<a name="htmlIsAutoClosed"></a>htmlIsAutoClosed ()</h3>
890<pre class="programlisting">int	htmlIsAutoClosed		(<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a> doc, <br>					 <a href="libxml2-HTMLparser.html#htmlNodePtr">htmlNodePtr</a> elem)<br>
891</pre>
892<p>The HTML DTD allows a tag to implicitly close other tags. The list is kept in the htmlStartClose array. This function checks if a tag is autoclosed by one of its children.</p>
893<div class="variablelist"><table border="0">
894<col align="left">
895<tbody>
896<tr>
897<td><span class="term"><i><tt>doc</tt></i>:</span></td>
898<td>the HTML document</td>
899</tr>
900<tr>
901<td><span class="term"><i><tt>elem</tt></i>:</span></td>
902<td>the HTML element</td>
903</tr>
904<tr>
905<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
906<td>1 if autoclosed, 0 otherwise</td>
907</tr>
908</tbody>
909</table></div>
910</div>
911<hr>
912<div class="refsect2" lang="en">
913<h3>
914<a name="htmlIsScriptAttribute"></a>htmlIsScriptAttribute ()</h3>
915<pre class="programlisting">int	htmlIsScriptAttribute		(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * name)<br>
916</pre>
917<p>Check if an <a href="libxml2-SAX.html#attribute">attribute</a> is of content type Script</p>
918<div class="variablelist"><table border="0">
919<col align="left">
920<tbody>
921<tr>
922<td><span class="term"><i><tt>name</tt></i>:</span></td>
923<td>an <a href="libxml2-SAX.html#attribute">attribute</a> name</td>
924</tr>
925<tr>
926<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
927<td>1 if the <a href="libxml2-SAX.html#attribute">attribute</a> is a script, 0 otherwise</td>
928</tr>
929</tbody>
930</table></div>
931</div>
932<hr>
933<div class="refsect2" lang="en">
934<h3>
935<a name="htmlNewParserCtxt"></a>htmlNewParserCtxt ()</h3>
936<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	htmlNewParserCtxt	(void)<br>
937</pre>
938<p>Allocate and initialize a new parser context.</p>
939<div class="variablelist"><table border="0">
940<col align="left">
941<tbody><tr>
942<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
943<td>the <a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> or NULL in case of allocation error</td>
944</tr></tbody>
945</table></div>
946</div>
947<hr>
948<div class="refsect2" lang="en">
949<h3>
950<a name="htmlNewSAXParserCtxt"></a>htmlNewSAXParserCtxt ()</h3>
951<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	htmlNewSAXParserCtxt	(<a href="libxml2-HTMLparser.html#htmlSAXHandlerPtr">htmlSAXHandlerPtr</a> sax, <br>						 void * userData)<br>
952</pre>
953<p>Allocate and initialize a new parser context.</p>
954<div class="variablelist"><table border="0">
955<col align="left">
956<tbody>
957<tr>
958<td><span class="term"><i><tt>sax</tt></i>:</span></td>
959<td>SAX handler</td>
960</tr>
961<tr>
962<td><span class="term"><i><tt>userData</tt></i>:</span></td>
963<td>user data</td>
964</tr>
965<tr>
966<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
967<td>the <a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> or NULL in case of allocation error</td>
968</tr>
969</tbody>
970</table></div>
971</div>
972<hr>
973<div class="refsect2" lang="en">
974<h3>
975<a name="htmlNodeStatus"></a>htmlNodeStatus ()</h3>
976<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlStatus">htmlStatus</a>	htmlNodeStatus		(const <a href="libxml2-HTMLparser.html#htmlNodePtr">htmlNodePtr</a> node, <br>					 int legacy)<br>
977</pre>
978<p>Checks whether the tree node is valid. Experimental (the author only uses the HTML enhancements in a SAX parser)</p>
979<div class="variablelist"><table border="0">
980<col align="left">
981<tbody>
982<tr>
983<td><span class="term"><i><tt>node</tt></i>:</span></td>
984<td>an <a href="libxml2-HTMLparser.html#htmlNodePtr">htmlNodePtr</a> in a tree</td>
985</tr>
986<tr>
987<td><span class="term"><i><tt>legacy</tt></i>:</span></td>
988<td>whether to allow deprecated elements (YES is faster here for Element nodes)</td>
989</tr>
990<tr>
991<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
992<td>For Element nodes, a return from <a href="libxml2-HTMLparser.html#htmlElementAllowedHere">htmlElementAllowedHere</a> (if legacy is allowed) or <a href="libxml2-HTMLparser.html#htmlElementStatusHere">htmlElementStatusHere</a> (otherwise). For Attribute nodes, a return from <a href="libxml2-HTMLparser.html#htmlAttrAllowed">htmlAttrAllowed</a>. For other nodes, <a href="libxml2-HTMLparser.html#HTML_NA">HTML_NA</a> (no checks performed).</td>
993</tr>
994</tbody>
995</table></div>
996</div>
997<hr>
998<div class="refsect2" lang="en">
999<h3>
1000<a name="htmlParseCharRef"></a>htmlParseCharRef ()</h3>
1001<pre class="programlisting">int	htmlParseCharRef		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt)<br>
1002</pre>
1003<p>Parse a character reference. [66] CharRef ::= '&amp;#' [0-9]+ ';' | '&amp;#x' [0-9a-fA-F]+ ';'</p>
1004<div class="variablelist"><table border="0">
1005<col align="left">
1006<tbody>
1007<tr>
1008<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
1009<td>an HTML parser context</td>
1010</tr>
1011<tr>
1012<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1013<td>the value parsed (as an int)</td>
1014</tr>
1015</tbody>
1016</table></div>
1017</div>
1018<hr>
1019<div class="refsect2" lang="en">
1020<h3>
1021<a name="htmlParseChunk"></a>htmlParseChunk ()</h3>
1022<pre class="programlisting">int	htmlParseChunk			(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>					 const char * chunk, <br>					 int size, <br>					 int terminate)<br>
1023</pre>
1024<p>Parse a chunk of memory.</p>
1025<div class="variablelist"><table border="0">
1026<col align="left">
1027<tbody>
1028<tr>
1029<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
1030<td>an HTML parser context</td>
1031</tr>
1032<tr>
1033<td><span class="term"><i><tt>chunk</tt></i>:</span></td>
1034<td>a char array</td>
1035</tr>
1036<tr>
1037<td><span class="term"><i><tt>size</tt></i>:</span></td>
1038<td>the size in bytes of the chunk</td>
1039</tr>
1040<tr>
1041<td><span class="term"><i><tt>terminate</tt></i>:</span></td>
1042<td>last chunk indicator</td>
1043</tr>
1044<tr>
1045<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1046<td>zero if no error, the <a href="libxml2-xmlerror.html#xmlParserErrors">xmlParserErrors</a> otherwise.</td>
1047</tr>
1048</tbody>
1049</table></div>
1050</div>
1051<hr>
1052<div class="refsect2" lang="en">
1053<h3>
1054<a name="htmlParseDoc"></a>htmlParseDoc ()</h3>
1055<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlParseDoc		(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * cur, <br>					 const char * encoding)<br>
1056</pre>
1057<p>parse an HTML in-memory document and build a tree.</p>
1058<div class="variablelist"><table border="0">
1059<col align="left">
1060<tbody>
1061<tr>
1062<td><span class="term"><i><tt>cur</tt></i>:</span></td>
1063<td>a pointer to an array of <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a>
1064</td>
1065</tr>
1066<tr>
1067<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
1068<td>a free form C string describing the HTML document encoding, or NULL</td>
1069</tr>
1070<tr>
1071<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1072<td>the resulting document tree</td>
1073</tr>
1074</tbody>
1075</table></div>
1076</div>
1077<hr>
1078<div class="refsect2" lang="en">
1079<h3>
1080<a name="htmlParseDocument"></a>htmlParseDocument ()</h3>
1081<pre class="programlisting">int	htmlParseDocument		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt)<br>
1082</pre>
1083<p>parse an HTML document (and build a tree if using the standard SAX interface).</p>
1084<div class="variablelist"><table border="0">
1085<col align="left">
1086<tbody>
1087<tr>
1088<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
1089<td>an HTML parser context</td>
1090</tr>
1091<tr>
1092<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1093<td>0 on success, -1 in case of error. The parser context is augmented as a result of the parsing.</td>
1094</tr>
1095</tbody>
1096</table></div>
1097</div>
1098<hr>
1099<div class="refsect2" lang="en">
1100<h3>
1101<a name="htmlParseElement"></a>htmlParseElement ()</h3>
1102<pre class="programlisting">void	htmlParseElement		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt)<br>
1103</pre>
1104<p>Parse an HTML element. This is highly recursive and is kept for compatibility with previous code versions. [39] element ::= EmptyElemTag | STag content ETag; [41] Attribute ::= Name Eq AttValue</p>
1105<div class="variablelist"><table border="0">
1106<col align="left">
1107<tbody><tr>
1108<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
1109<td>an HTML parser context</td>
1110</tr></tbody>
1111</table></div>
1112</div>
1113<hr>
1114<div class="refsect2" lang="en">
1115<h3>
1116<a name="htmlParseEntityRef"></a>htmlParseEntityRef ()</h3>
1117<pre class="programlisting">const <a href="libxml2-HTMLparser.html#htmlEntityDesc">htmlEntityDesc</a> *	htmlParseEntityRef	(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br>						 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> ** str)<br>
1118</pre>
1119<p>Parse an HTML entity reference. [68] EntityRef ::= '&amp;' Name ';'</p>
1120<div class="variablelist"><table border="0">
1121<col align="left">
1122<tbody>
1123<tr>
1124<td><span class="term"><i><tt>ctxt</tt></i>:</span></td>
1125<td>an HTML parser context</td>
1126</tr>
1127<tr>
1128<td><span class="term"><i><tt>str</tt></i>:</span></td>
1129<td>location to store the entity name</td>
1130</tr>
1131<tr>
1132<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1133<td>the associated <a href="libxml2-HTMLparser.html#htmlEntityDescPtr">htmlEntityDescPtr</a> if found, or NULL otherwise. If non-NULL, *str will have to be freed by the caller.</td>
1134</tr>
1135</tbody>
1136</table></div>
1137</div>
1138<hr>
1139<div class="refsect2" lang="en">
1140<h3>
1141<a name="htmlParseFile"></a>htmlParseFile ()</h3>
1142<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlParseFile		(const char * filename, <br>					 const char * encoding)<br>
1143</pre>
1144<p>Parse an HTML file and build a tree. Automatic support for ZLIB/Compress compressed documents is provided by default if found at compile-time.</p>
1145<div class="variablelist"><table border="0">
1146<col align="left">
1147<tbody>
1148<tr>
1149<td><span class="term"><i><tt>filename</tt></i>:</span></td>
1150<td>the filename</td>
1151</tr>
1152<tr>
1153<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
1154<td>a free form C string describing the HTML document encoding, or NULL</td>
1155</tr>
1156<tr>
1157<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1158<td>the resulting document tree</td>
1159</tr>
1160</tbody>
1161</table></div>
1162</div>
1163<hr>
1164<div class="refsect2" lang="en">
1165<h3>
1166<a name="htmlReadDoc"></a>htmlReadDoc ()</h3>
1167<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlReadDoc		(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * cur, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options)<br>
1168</pre>
1169<p>Parse an HTML in-memory document and build a tree.</p>
1170<div class="variablelist"><table border="0">
1171<col align="left">
1172<tbody>
1173<tr>
1174<td><span class="term"><i><tt>cur</tt></i>:</span></td>
1175<td>a pointer to a zero terminated string</td>
1176</tr>
1177<tr>
1178<td><span class="term"><i><tt>URL</tt></i>:</span></td>
1179<td>the base URL to use for the document</td>
1180</tr>
1181<tr>
1182<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
1183<td>the document encoding, or NULL</td>
1184</tr>
1185<tr>
1186<td><span class="term"><i><tt>options</tt></i>:</span></td>
1187<td>a combination of htmlParserOption(s)</td>
1188</tr>
1189<tr>
1190<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1191<td>the resulting document tree</td>
1192</tr>
1193</tbody>
1194</table></div>
1195</div>
1196<hr>
1197<div class="refsect2" lang="en">
1198<h3>
1199<a name="htmlReadFd"></a>htmlReadFd ()</h3>
1200<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlReadFd		(int fd, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options)<br>
1201</pre>
1202<p>Parse an HTML document from a file descriptor and build a tree. NOTE that the file descriptor will not be closed when the reader is closed or reset.</p>
1203<div class="variablelist"><table border="0">
1204<col align="left">
1205<tbody>
1206<tr>
1207<td><span class="term"><i><tt>fd</tt></i>:</span></td>
1208<td>an open file descriptor</td>
1209</tr>
1210<tr>
1211<td><span class="term"><i><tt>URL</tt></i>:</span></td>
1212<td>the base URL to use for the document</td>
1213</tr>
1214<tr>
1215<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
1216<td>the document encoding, or NULL</td>
1217</tr>
1218<tr>
1219<td><span class="term"><i><tt>options</tt></i>:</span></td>
1220<td>a combination of htmlParserOption(s)</td>
1221</tr>
1222<tr>
1223<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1224<td>the resulting document tree</td>
1225</tr>
1226</tbody>
1227</table></div>
1228</div>
1229<hr>
1230<div class="refsect2" lang="en">
1231<h3>
1232<a name="htmlReadFile"></a>htmlReadFile ()</h3>
1233<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlReadFile		(const char * filename, <br>					 const char * encoding, <br>					 int options)<br>
1234</pre>
1235<p>Parse an HTML file from the filesystem or the network.</p>
1236<div class="variablelist"><table border="0">
1237<col align="left">
1238<tbody>
1239<tr>
1240<td><span class="term"><i><tt>filename</tt></i>:</span></td>
1241<td>a file or URL</td>
1242</tr>
1243<tr>
1244<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
1245<td>the document encoding, or NULL</td>
1246</tr>
1247<tr>
1248<td><span class="term"><i><tt>options</tt></i>:</span></td>
1249<td>a combination of htmlParserOption(s)</td>
1250</tr>
1251<tr>
1252<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1253<td>the resulting document tree</td>
1254</tr>
1255</tbody>
1256</table></div>
1257</div>
1258<hr>
1259<div class="refsect2" lang="en">
1260<h3>
1261<a name="htmlReadIO"></a>htmlReadIO ()</h3>
1262<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlReadIO		(<a href="libxml2-xmlIO.html#xmlInputReadCallback">xmlInputReadCallback</a> ioread, <br>					 <a href="libxml2-xmlIO.html#xmlInputCloseCallback">xmlInputCloseCallback</a> ioclose, <br>					 void * ioctx, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options)<br>
1263</pre>
1264<p>parse an HTML document from I/O functions and source and build a tree.</p>
1265<div class="variablelist"><table border="0">
1266<col align="left">
1267<tbody>
1268<tr>
1269<td><span class="term"><i><tt>ioread</tt></i>:</span></td>
1270<td>an I/O read function</td>
1271</tr>
1272<tr>
1273<td><span class="term"><i><tt>ioclose</tt></i>:</span></td>
1274<td>an I/O close function</td>
1275</tr>
1276<tr>
1277<td><span class="term"><i><tt>ioctx</tt></i>:</span></td>
1278<td>an I/O handler</td>
1279</tr>
1280<tr>
1281<td><span class="term"><i><tt>URL</tt></i>:</span></td>
1282<td>the base URL to use for the document</td>
1283</tr>
1284<tr>
1285<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
1286<td>the document encoding, or NULL</td>
1287</tr>
1288<tr>
1289<td><span class="term"><i><tt>options</tt></i>:</span></td>
1290<td>a combination of htmlParserOption(s)</td>
1291</tr>
1292<tr>
1293<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1294<td>the resulting document tree</td>
1295</tr>
1296</tbody>
1297</table></div>
1298</div>
1299<hr>
1300<div class="refsect2" lang="en">
1301<h3>
1302<a name="htmlReadMemory"></a>htmlReadMemory ()</h3>
1303<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlReadMemory		(const char * buffer, <br>					 int size, <br>					 const char * URL, <br>					 const char * encoding, <br>					 int options)<br>
1304</pre>
1305<p>Parse an HTML in-memory document and build a tree.</p>
1306<div class="variablelist"><table border="0">
1307<col align="left">
1308<tbody>
1309<tr>
1310<td><span class="term"><i><tt>buffer</tt></i>:</span></td>
1311<td>a pointer to a char array</td>
1312</tr>
1313<tr>
1314<td><span class="term"><i><tt>size</tt></i>:</span></td>
1315<td>the size of the array</td>
1316</tr>
1317<tr>
1318<td><span class="term"><i><tt>URL</tt></i>:</span></td>
1319<td>the base URL to use for the document</td>
1320</tr>
1321<tr>
1322<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
1323<td>the document encoding, or NULL</td>
1324</tr>
1325<tr>
1326<td><span class="term"><i><tt>options</tt></i>:</span></td>
1327<td>a combination of htmlParserOption(s)</td>
1328</tr>
1329<tr>
1330<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1331<td>the resulting document tree</td>
1332</tr>
1333</tbody>
1334</table></div>
1335</div>
1336<hr>
1337<div class="refsect2" lang="en">
1338<h3>
1339<a name="htmlSAXParseDoc"></a>htmlSAXParseDoc ()</h3>
1340<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlSAXParseDoc		(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * cur, <br>					 const char * encoding, <br>					 <a href="libxml2-HTMLparser.html#htmlSAXHandlerPtr">htmlSAXHandlerPtr</a> sax, <br>					 void * userData)<br>
1341</pre>
1342<p>Parse an HTML in-memory document. If sax is not NULL, use the SAX callbacks to handle parse events. If sax is NULL, fall back to the default DOM behavior and return a tree.</p>
1343<div class="variablelist"><table border="0">
1344<col align="left">
1345<tbody>
1346<tr>
1347<td><span class="term"><i><tt>cur</tt></i>:</span></td>
1348<td>a pointer to an array of <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a>
1349</td>
1350</tr>
1351<tr>
1352<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
1353<td>a free form C string describing the HTML document encoding, or NULL</td>
1354</tr>
1355<tr>
1356<td><span class="term"><i><tt>sax</tt></i>:</span></td>
1357<td>the SAX handler block</td>
1358</tr>
1359<tr>
1360<td><span class="term"><i><tt>userData</tt></i>:</span></td>
1361<td>if using SAX, this pointer will be provided on callbacks.</td>
1362</tr>
1363<tr>
1364<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1365<td>the resulting document tree, or NULL if no tree was built (for example, when custom SAX callbacks handle the events) or parsing failed.</td>
1366</tr>
1367</tbody>
1368</table></div>
1369</div>
1370<hr>
1371<div class="refsect2" lang="en">
1372<h3>
1373<a name="htmlSAXParseFile"></a>htmlSAXParseFile ()</h3>
1374<pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlSAXParseFile	(const char * filename, <br>					 const char * encoding, <br>					 <a href="libxml2-HTMLparser.html#htmlSAXHandlerPtr">htmlSAXHandlerPtr</a> sax, <br>					 void * userData)<br>
1375</pre>
1376<p>Parse an HTML file and build a tree. Automatic support for ZLIB/Compress compressed documents is provided by default if found at compile-time. It uses the given SAX function block to handle the parsing callbacks. If sax is NULL, fall back to the default DOM tree building routines.</p>
1377<div class="variablelist"><table border="0">
1378<col align="left">
1379<tbody>
1380<tr>
1381<td><span class="term"><i><tt>filename</tt></i>:</span></td>
1382<td>the filename</td>
1383</tr>
1384<tr>
1385<td><span class="term"><i><tt>encoding</tt></i>:</span></td>
1386<td>a free form C string describing the HTML document encoding, or NULL</td>
1387</tr>
1388<tr>
1389<td><span class="term"><i><tt>sax</tt></i>:</span></td>
1390<td>the SAX handler block</td>
1391</tr>
1392<tr>
1393<td><span class="term"><i><tt>userData</tt></i>:</span></td>
1394<td>if using SAX, this pointer will be provided on callbacks.</td>
1395</tr>
1396<tr>
1397<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1398<td>the resulting document tree, or NULL if no tree was built (for example, when custom SAX callbacks handle the events) or parsing failed.</td>
1399</tr>
1400</tbody>
1401</table></div>
1402</div>
1403<hr>
1404<div class="refsect2" lang="en">
1405<h3>
1406<a name="htmlTagLookup"></a>htmlTagLookup ()</h3>
1407<pre class="programlisting">const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> *	htmlTagLookup	(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * tag)<br>
1408</pre>
1409<p>Look up the HTML tag in the ElementTable.</p>
1410<div class="variablelist"><table border="0">
1411<col align="left">
1412<tbody>
1413<tr>
1414<td><span class="term"><i><tt>tag</tt></i>:</span></td>
1415<td>The tag name in lowercase</td>
1416</tr>
1417<tr>
1418<td><span class="term"><i><tt>Returns</tt></i>:</span></td>
1419<td>the related <a href="libxml2-HTMLparser.html#htmlElemDescPtr">htmlElemDescPtr</a> or NULL if not found.</td>
1420</tr>
1421</tbody>
1422</table></div>
1423</div>
1424<hr>
1425</div>
1426</div>
1427</body>
1428</html>
1429