Additions
---------

More of the jQuery API: nextUntil?

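For reference, jQuery's nextUntil returns the siblings that follow an
element, up to but not including the first one that matches a selector.
A rough sketch of that behavior over any iterable of siblings is below.
The name next_until and the predicate argument are hypothetical, not an
existing Beautiful Soup API; with Beautiful Soup the iterable could be
something like tag.next_siblings.

```python
from itertools import takewhile


def next_until(siblings, stop):
    """Collect items from `siblings` until `stop(item)` is true.

    Hypothetical helper: the matching item itself is excluded,
    mirroring how jQuery's nextUntil treats its selector.
    """
    return list(takewhile(lambda node: not stop(node), siblings))


# Plain strings stand in for sibling nodes in this sketch.
siblings = ["p", "p", "h2", "p"]
result = next_until(siblings, lambda name: name == "h2")
print(result)
```

Taking a predicate instead of a selector string keeps the sketch
independent of any particular matching machinery.
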
Optimizations
-------------

The html5lib tree builder doesn't use the standard tree-building API,
which worries me and has resulted in a number of bugs.

markup_attr_map can be optimized since it's always a map now.

Upon encountering UTF-16LE data or some other uncommon serialization
of Unicode, UnicodeDammit will convert the data to Unicode, then
encode it as UTF-8. This is wasteful because it will just get decoded
back to Unicode.

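The redundant round trip described above can be illustrated with plain
str and bytes objects. This is only a sketch of the three conversions,
not UnicodeDammit's actual code path, and the variable names are made
up for the example.

```python
# Sample text, serialized as UTF-16LE the way incoming data might be.
original = "caf\u00e9 \u2603"
raw = original.encode("utf-16-le")

# Step 1: convert the uncommon encoding to Unicode.
as_unicode = raw.decode("utf-16-le")

# Step 2: re-encode that Unicode as UTF-8 ...
as_utf8 = as_unicode.encode("utf-8")

# Step 3: ... only for it to be decoded straight back to Unicode.
round_tripped = as_utf8.decode("utf-8")

# Steps 2 and 3 cancel out: the result of step 1 was already usable.
assert round_tripped == as_unicode
```

Skipping the intermediate UTF-8 encoding would avoid one encode and one
decode per document.
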
CDATA
-----

The elementtree XMLParser has a strip_cdata argument that, when set to
False, should allow Beautiful Soup to preserve CDATA sections instead
of treating them as text. Except it doesn't. (This argument is also
present for HTMLParser, and also does nothing there.)

Currently, html5lib converts CDATA sections into comments. An
as-yet-unreleased version of html5lib changes the parser's handling of
CDATA sections to allow CDATA sections in tags like <svg> and
<math>. The HTML5TreeBuilder will need to be updated to create CData
objects instead of Comment objects in this situation.