title: Extensions API

# Writing Extensions for Python-Markdown

Python-Markdown includes an API for extension writers to plug their own custom functionality and syntax into the
parser. An extension will patch into one or more stages of the parser:

* [*Preprocessors*](#preprocessors) alter the source before it is passed to the parser.
* [*Block Processors*](#blockprocessors) work with blocks of text separated by blank lines.
* [*Tree Processors*](#treeprocessors) modify the constructed ElementTree.
* [*Inline Processors*](#inlineprocessors) are common tree processors for inline elements, such as `*strong*`.
* [*Postprocessors*](#postprocessors) munge the output of the parser just before it is returned.

The parser loads text, applies the preprocessors, creates and builds an [ElementTree][ElementTree] object from the
block processors and inline processors, renders the ElementTree object as Unicode text, and then applies the
postprocessors.

There are classes and helpers provided to ease writing your extension. Each part of the API is discussed in its
respective section below. Additionally, you can walk through the [Tutorial on Writing Extensions][tutorial] or look at
some of the [Available Extensions][] and their [source code][extension source]. As always, you may report bugs, ask
for help, and discuss various other issues on the [bug tracker].

## Phases of processing {: #stages }

### Preprocessors {: #preprocessors }

Preprocessors munge the source text before it is passed to the Markdown parser. This is an excellent place to clean up
bad characters or to extract portions for later processing that the parser may otherwise choke on.

Preprocessors inherit from `markdown.preprocessors.Preprocessor` and implement a `run` method, which takes a single
parameter `lines`. This parameter is the entire source text stored as a list of Unicode strings, one per line.
`run` should return its processed list of Unicode strings, one per line.

#### Example

This simple example removes any lines with 'NO RENDER' before processing:

```python
from markdown.preprocessors import Preprocessor
import re

class NoRender(Preprocessor):
    """ Skip any line with words 'NO RENDER' in it. """
    def run(self, lines):
        new_lines = []
        for line in lines:
            m = re.search("NO RENDER", line)
            if not m:
                # any line without NO RENDER is passed through
                new_lines.append(line)
        return new_lines
```

#### Usages

Some preprocessors in the Markdown source tree include:

| Class                         | Kind      | Description                                                                |
| ------------------------------|-----------|----------------------------------------------------------------------------|
| [`NormalizeWhiteSpace`][c1]   | built-in  | Normalizes whitespace by expanding tabs, fixing `\r` line endings, etc.    |
| [`HtmlBlockPreprocessor`][c2] | built-in  | Removes html blocks from the text and stores them for later processing     |
| [`ReferencePreprocessor`][c3] | built-in  | Removes reference definitions from text and stores for later processing    |
| [`MetaPreprocessor`][c4]      | extension | Strips and records meta data at top of documents                           |
| [`FootnotesPreprocessor`][c5] | extension | Removes footnote blocks from the text and stores them for later processing |

[c1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/preprocessors.py
[c2]: https://github.com/Python-Markdown/markdown/blob/master/markdown/preprocessors.py
[c3]: https://github.com/Python-Markdown/markdown/blob/master/markdown/preprocessors.py
[c4]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/meta.py
[c5]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py

### Block Processors {: #blockprocessors }

A block processor parses blocks of text and adds new elements to the `ElementTree`.
Blocks of text, separated from
other text by blank lines, may have a different syntax and produce a differently structured tree than other Markdown.
Block processors excel at code formatting, equation layouts, and tables.

Block processors inherit from `markdown.blockprocessors.BlockProcessor`, are passed `md.parser` on initialization, and
implement both the `test` and `run` methods:

* `test(self, parent, block)` takes two parameters: `parent` is the parent `ElementTree` element and `block` is a
  single, multi-line, Unicode string of the current block. `test`, often a regular expression match, returns a true
  value if the block processor's `run` method should be called to process starting at that block.
* `run(self, parent, blocks)` has the same `parent` parameter as `test`; `blocks` is the list of all remaining
  blocks in the document, starting with the `block` passed to `test`. `run` may return `False` (not `None`) to signal
  failure, meaning that it did not process the blocks after all. On success, `run` is expected to `pop` one or more
  blocks from the front of `blocks` and attach new nodes to `parent`.

Crafting block processors is more involved and flexible than the other processors, involving controlling recursive
parsing of the block's contents and managing state across invocations. For example, a blank line is allowed in
indented code, so the second invocation of the indented code processor appends to the element tree generated by the
previous call. Other block processors may insert new text into the `blocks` list, signal to future calls of
themselves, and more.

To make writing these complex beasts more tractable, three convenience functions have been provided by the
`BlockProcessor` parent class:

* `lastChild(parent)` returns the last child of the given element or `None` if it has no children.
* `detab(text)` removes one level of indent (four spaces by default) from the front of each line of the given
  multi-line text string, until a non-blank line is indented less.
* `looseDetab(text, level)` removes multiple levels
  of indent from the front of each line of `text` but does not affect lines indented less.

Also, `BlockProcessor` provides the fields `self.tab_length`, the tab length (default 4), and `self.parser`, the
current `BlockParser` instance.

#### BlockParser

`BlockParser`, not to be confused with `BlockProcessor`, is the class used by Markdown to cycle through all the
registered block processors. You should never need to create your own instance; use `self.parser` instead.

The `BlockParser` instance provides a stack of strings for its current state, which your processor can push with
`self.parser.set(state)`, pop with `self.parser.reset()`, or check the top state with
`self.parser.isstate(state)`. Be sure your code pops the states it pushes.

The `BlockParser` instance can also be called recursively, that is, to process blocks from within your block
processor. There are three methods:

* `parseDocument(lines)` parses a list of lines, each a single-line Unicode string, returning a complete
  `ElementTree`.
* `parseChunk(parent, text)` parses a single, multi-line, possibly multi-block, Unicode string `text` and attaches the
  resulting tree to `parent`.
* `parseBlocks(parent, blocks)` takes a list of `blocks`, each a multi-line Unicode string without blank lines, and
  attaches the resulting tree to `parent`.

For perspective, Markdown calls `parseDocument` which calls `parseChunk` which calls `parseBlocks` which calls your
block processor, which, in turn, might call one of these routines.

#### Example

This example calls out important paragraphs by giving them a border.
It looks for a fence line of exclamation points
before and after and renders the fenced blocks into a new, styled `div`. If it does not find the ending fence line,
it does nothing.

Our code, like most block processors, is longer than other examples:

```python
import re
import xml.etree.ElementTree as etree

from markdown.blockprocessors import BlockProcessor
from markdown.extensions import Extension


class BoxBlockProcessor(BlockProcessor):
    RE_FENCE_START = r'^ *!{3,} *\n'  # start line, e.g., `   !!!! `
    RE_FENCE_END = r'\n *!{3,}\s*$'   # last non-blank line, e.g., '!!!\n  \n\n'

    def test(self, parent, block):
        return re.match(self.RE_FENCE_START, block)

    def run(self, parent, blocks):
        original_block = blocks[0]
        blocks[0] = re.sub(self.RE_FENCE_START, '', blocks[0])

        # Find block with ending fence
        for block_num, block in enumerate(blocks):
            if re.search(self.RE_FENCE_END, block):
                # remove fence
                blocks[block_num] = re.sub(self.RE_FENCE_END, '', block)
                # render fenced area inside a new div
                e = etree.SubElement(parent, 'div')
                e.set('style', 'display: inline-block; border: 1px solid red;')
                self.parser.parseBlocks(e, blocks[0:block_num + 1])
                # remove used blocks
                for i in range(0, block_num + 1):
                    blocks.pop(0)
                return True  # or could have had no return statement
        # No closing marker! Restore and do nothing
        blocks[0] = original_block
        return False  # equivalent to our test() routine returning False


class BoxExtension(Extension):
    def extendMarkdown(self, md):
        md.parser.blockprocessors.register(BoxBlockProcessor(md.parser), 'box', 175)
```

Start with this example input:

``` text
A regular paragraph of text.

!!!!!
First paragraph of wrapped text.

Second Paragraph of **wrapped** text.
!!!!!

Another regular paragraph of text.
```

The fenced text adds one node with two children to the tree:

* `div`, with a `style` attribute.
  It renders as
  `<div style="display: inline-block; border: 1px solid red;">...</div>`
    * `p` with text `First paragraph of wrapped text.`
    * `p` with text `Second Paragraph of **wrapped** text`. The conversion to a `<strong>` tag will happen when
      running the inline processors, which will happen after all of the block processors have completed.

The example output might display as follows:

!!! note ""
    <p>A regular paragraph of text.</p>
    <div style="display: inline-block; border: 1px solid red;">
    <p>First paragraph of wrapped text.</p>
    <p>Second Paragraph of <strong>wrapped</strong> text.</p>
    </div>
    <p>Another regular paragraph of text.</p>

#### Usages

Some block processors in the Markdown source tree include:

| Class                       | Kind      | Description                                 |
| ----------------------------|-----------|---------------------------------------------|
| [`HashHeaderProcessor`][b1] | built-in  | Title hashes (`#`), which may split blocks  |
| [`HRProcessor`][b2]         | built-in  | Horizontal lines, e.g., `---`               |
| [`OListProcessor`][b3]      | built-in  | Ordered lists; complex and using `state`    |
| [`Admonition`][b4]          | extension | Render each [Admonition][] in a new `div`   |

[b1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/blockprocessors.py
[b2]: https://github.com/Python-Markdown/markdown/blob/master/markdown/blockprocessors.py
[b3]: https://github.com/Python-Markdown/markdown/blob/master/markdown/blockprocessors.py
[Admonition]: https://python-markdown.github.io/extensions/admonition/
[b4]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/admonition.py

### Tree processors {: #treeprocessors }

Tree processors manipulate the tree created by block processors. They can even create an entirely new ElementTree
object. This is an excellent place for creating summaries, adding collected references, or last-minute adjustments.
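
As a rough illustration of the "creating summaries" use case, the sketch below walks a small tree using only the
standard library. The tree is built by hand as a stand-in for parser output, and `collect_headings` is a hypothetical
helper name, not part of the Markdown API.

```python
import xml.etree.ElementTree as etree

HEADING_TAGS = {'h1', 'h2', 'h3', 'h4', 'h5', 'h6'}


def collect_headings(root):
    """Return (level, text) pairs for every heading element, in document order."""
    headings = []
    for el in root.iter():
        if el.tag in HEADING_TAGS:
            # itertext() also picks up text nested in child elements such as <em>.
            headings.append((int(el.tag[1]), ''.join(el.itertext())))
    return headings


# Hand-built stand-in for the tree a parser would produce (an assumption of this sketch).
root = etree.Element('div')
etree.SubElement(root, 'h1').text = 'Title'
etree.SubElement(root, 'p').text = 'Some text.'
h2 = etree.SubElement(root, 'h2')
h2.text = 'More '
etree.SubElement(h2, 'em').text = 'details'

print(collect_headings(root))  # prints [(1, 'Title'), (2, 'More details')]
```

A real tree processor could perform the same kind of walk on the `root` it receives.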

A tree processor must inherit from `markdown.treeprocessors.Treeprocessor` (note the capitalization). A tree processor
must implement a `run` method which takes a single argument `root`. In most cases `root` would be an
`xml.etree.ElementTree.Element` instance; however, in rare cases it could be some other type of ElementTree object.
The `run` method may return `None`, in which case the (possibly modified) original `root` object is used, or it may
return an entirely new `Element` object, which will replace the existing `root` object and all of its children. It is
generally preferred to modify `root` in place and return `None`, which avoids creating multiple copies of the entire
document tree in memory.

For specifics on manipulating the ElementTree, see [Working with the ElementTree][workingwithetree] below.

#### Example

A pseudo example:

```python
from markdown.treeprocessors import Treeprocessor

class MyTreeprocessor(Treeprocessor):
    def run(self, root):
        root.text = 'modified content'
        # No return statement is same as `return None`
```

#### Usages

The core `InlineProcessor` class is a tree processor. It walks the tree, matches patterns, and splits and creates
nodes on matches.
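
For a version of the pseudo example that runs end to end, here is a minimal sketch. The class names, the
`body-text` CSS class, the registry name `paragraph_class`, and the priority of `5` are all choices made for this
illustration; the low priority is only meant to place the processor after the built-in inline processing.

```python
import markdown
from markdown.extensions import Extension
from markdown.treeprocessors import Treeprocessor


class ParagraphClassTreeprocessor(Treeprocessor):
    """Hypothetical processor: tag every <p> element with a CSS class."""

    def run(self, root):
        for el in root.iter('p'):
            el.set('class', 'body-text')
        # No return statement: the modified original root is used.


class ParagraphClassExtension(Extension):
    def extendMarkdown(self, md):
        # The name and priority are arbitrary values chosen for this sketch.
        md.treeprocessors.register(ParagraphClassTreeprocessor(md), 'paragraph_class', 5)


html = markdown.markdown('Hello *world*.', extensions=[ParagraphClassExtension()])
print(html)
```

The processor modifies `root` in place rather than returning a new element, which follows the preference stated above.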

Additional tree processors in the Markdown source tree include:

| Class                             | Kind      | Description                                                   |
| ----------------------------------|-----------|---------------------------------------------------------------|
| [`PrettifyTreeprocessor`][e1]     | built-in  | Add line breaks to the html document                          |
| [`TocTreeprocessor`][e2]          | extension | Builds a [table of contents][] from the finished tree         |
| [`FootnoteTreeprocessor`][e3]     | extension | Create [footnote][] div at end of document                    |
| [`FootnotePostTreeprocessor`][e4] | extension | Amend div created by `FootnoteTreeprocessor` with duplicates  |

[e1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/treeprocessors.py
[e2]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/toc.py
[e3]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py
[e4]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py
[table of contents]: https://python-markdown.github.io/extensions/toc/
[footnote]: https://python-markdown.github.io/extensions/footnotes/

### Inline Processors {: #inlineprocessors }

Inline processors, previously called inline patterns, are used to add formatting, such as `**emphasis**`, by replacing
a matched pattern with a new element tree node. They are an excellent place for adding new syntax for inline tags.
Inline processor code is often quite short.

Inline processors inherit from `InlineProcessor`, are initialized, and implement `handleMatch`:

* `__init__(self, pattern, md=None)` is the inherited constructor. You do not need to implement your own.
    * `pattern` is the regular expression string that must match within the block of text in order for the
      `handleMatch` method to be called.
    * `md`, an optional parameter, is a pointer to the instance of `markdown.Markdown` and is available as `self.md`
      on the `InlineProcessor` instance.

* `handleMatch(self, m, data)` must be implemented in all `InlineProcessor` subclasses.
    * `m` is the regular expression [match object][] found by the `pattern` passed to `__init__`.
    * `data` is a single, multi-line, Unicode string containing the entire block of text around the pattern. A block
      is text set apart by blank lines.
    * Returns either `(None, None, None)`, indicating the provided match was rejected, or `(el, start, end)`, if the
      match was successfully processed. On success, `el` is the element being added to the tree, and `start` and
      `end` are indexes in `data` that were "consumed" by the pattern. The "consumed" span will be replaced by a
      placeholder. The same inline processor may be called several times on the same block.

Inline Processors can define the property `ANCESTOR_EXCLUDES`, which is either a list or tuple of undesirable
ancestors. The processor will be skipped if it would cause the content to be a descendant of one of the listed tag
names.

##### Convenience Classes

Convenience subclasses of `InlineProcessor` are provided for common operations:

* [`SimpleTextInlineProcessor`][i1] returns the text of `group(1)` of the match.
* [`SubstituteTagInlineProcessor`][i4] is initialized as `SubstituteTagInlineProcessor(pattern, tag)`. It returns a
  new element `tag` whenever `pattern` is matched.
* [`SimpleTagInlineProcessor`][i3] is initialized as `SimpleTagInlineProcessor(pattern, tag)`. It returns an element
  `tag` with a text field of `group(2)` of the match.

##### Example

This example changes `--strike--` to `<del>strike</del>`.

```python
from markdown.inlinepatterns import InlineProcessor
from markdown.extensions import Extension
import xml.etree.ElementTree as etree


class DelInlineProcessor(InlineProcessor):
    def handleMatch(self, m, data):
        el = etree.Element('del')
        el.text = m.group(1)
        return el, m.start(0), m.end(0)

class DelExtension(Extension):
    def extendMarkdown(self, md):
        DEL_PATTERN = r'--(.*?)--'  # like --del--
        md.inlinePatterns.register(DelInlineProcessor(DEL_PATTERN, md), 'del', 175)
```

Use this input example:

``` text
First line of the block.
This is --strike one--.
This is --strike two--.
End of the block.
```

The example output might display as follows:

!!! note ""
    <p>First line of the block.
    This is <del>strike one</del>.
    This is <del>strike two</del>.
    End of the block.</p>

* On the first call to `handleMatch`
    * `m` will be the match for `--strike one--`
    * `data` will be the string:
      `First line of the block.\nThis is --strike one--.\nThis is --strike two--.\nEnd of the block.`

    Because the match was successful, the region between the returned `start` and `end` is replaced with a
    placeholder token and the new element is added to the tree.

* On the second call to `handleMatch`
    * `m` will be the match for `--strike two--`
    * `data` will be the string
      `First line of the block.\nThis is klzzwxh:0000.\nThis is --strike two--.\nEnd of the block.`

Note the placeholder token `klzzwxh:0000`. This allows the regular expression to be run against the entire block,
not just the text contained in an individual element. The placeholders will later be swapped back out for the
actual elements by the parser.

Actually it would not be necessary to create the above inline processor. The fact is, that example is not very DRY
(Don't Repeat Yourself).
A pattern for `**strong**` text would be almost identical, with the exception that it would
create a `strong` element. Therefore, Markdown provides a number of generic `InlineProcessor` subclasses that can
provide some common functionality. For example, strike could be implemented with an instance of the
`SimpleTagInlineProcessor` class as demonstrated below. Feel free to use or extend any of the `InlineProcessor`
subclasses found at `markdown.inlinepatterns`.

```python
from markdown.inlinepatterns import SimpleTagInlineProcessor
from markdown.extensions import Extension

class DelExtension(Extension):
    def extendMarkdown(self, md):
        md.inlinePatterns.register(SimpleTagInlineProcessor(r'()--(.*?)--', 'del'), 'del', 175)
```

##### Usages

Here are some convenience functions and other examples:

| Class                            | Kind      | Description                                                            |
| ---------------------------------|-----------|------------------------------------------------------------------------|
| [`AsteriskProcessor`][i5]        | built-in  | Emphasis processor for handling strong and em matches inside asterisks |
| [`AbbrInlineProcessor`][i6]      | extension | Apply tag to abbreviation registered by preprocessor                   |
| [`WikiLinksInlineProcessor`][i7] | extension | Link `[[article names]]` to wiki given in metadata                     |
| [`FootnoteInlineProcessor`][i8]  | extension | Replaces footnote in text with link to footnote div at bottom          |

[i1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/inlinepatterns.py
[i2]: https://github.com/Python-Markdown/markdown/blob/master/markdown/inlinepatterns.py
[i3]: https://github.com/Python-Markdown/markdown/blob/master/markdown/inlinepatterns.py
[i4]: https://github.com/Python-Markdown/markdown/blob/master/markdown/inlinepatterns.py
[i5]: https://github.com/Python-Markdown/markdown/blob/master/markdown/inlinepatterns.py
[i6]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/abbr.py
[i7]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/wikilinks.py
[i8]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py

### Patterns

In version 3.0, a new, more flexible inline processor was added, `markdown.inlinepatterns.InlineProcessor`. The
original inline patterns, which inherit from `markdown.inlinepatterns.Pattern` or one of its children, are still
supported, though users are encouraged to migrate.

#### Comparison with new `InlineProcessor`

The new `InlineProcessor` provides two major enhancements to `Patterns`:

1. Inline Processors no longer need to match the entire block, so regular expressions no longer need to start with
   `r'^(.*?)'` and end with `r'(.*?)$'`. This runs faster. The returned [match object][] will only contain what is
   explicitly matched in the pattern, and extension pattern groups now start with `m.group(1)`.

2. The `handleMatch` method now takes an additional input called `data`, which is the entire block under analysis,
   not just what is matched with the specified pattern. The method now returns the element *and* the indexes relative
   to `data` that the return element is replacing (usually `m.start(0)` and `m.end(0)`). If the boundaries are
   returned as `None`, it is assumed that the match did not take place, and nothing will be altered in `data`.

    This allows handling of more complex constructs than regular expressions can handle, e.g., matching nested
    brackets, and explicit control of the span "consumed" by the processor.

#### Inline Patterns

Inline Patterns can implement inline HTML element syntax for Markdown such as `*emphasis*` or
`[links](http://example.com)`. Pattern objects should be instances of classes that inherit from
`markdown.inlinepatterns.Pattern` or one of its children.
Each pattern object uses a single regular expression and
must have the following methods:

* **`getCompiledRegExp()`**:

    Returns a compiled regular expression.

* **`handleMatch(m)`**:

    Accepts a match object and returns an ElementTree element or a plain Unicode string.

Inline Patterns can define the property `ANCESTOR_EXCLUDES`, which is either a list or tuple of undesirable ancestors.
The pattern will be skipped if it would cause the content to be a descendant of one of the listed tag names.

Note that any regular expression returned by `getCompiledRegExp` must capture the whole block. Therefore, they should
all start with `r'^(.*?)'` and end with `r'(.*?)$'`. When using the default `getCompiledRegExp()` method provided in
the `Pattern` class, you can pass in a regular expression without that and `getCompiledRegExp` will wrap your
expression for you and set the `re.DOTALL` and `re.UNICODE` flags. This means that the first group of your match will
be `m.group(2)`, as `m.group(1)` will match everything before the pattern.

For an example, consider this simplified emphasis pattern:

```python
from markdown.inlinepatterns import Pattern
import xml.etree.ElementTree as etree

class EmphasisPattern(Pattern):
    def handleMatch(self, m):
        el = etree.Element('em')
        el.text = m.group(2)
        return el
```

As discussed in [Integrating Your Code Into Markdown][], an instance of this class will need to be provided to
Markdown. That instance would be created like so:

```python
# an oversimplified regex
MYPATTERN = r'\*([^*]+)\*'
# pass in pattern and create instance
emphasis = EmphasisPattern(MYPATTERN)
```

### Postprocessors {: #postprocessors }

Postprocessors munge the document after the ElementTree has been serialized into a string. Postprocessors should be
used to work with the text just before output.
Usually, they are used to add back sections that were extracted in a
preprocessor, fix up outgoing encodings, or wrap the whole document.

Postprocessors inherit from `markdown.postprocessors.Postprocessor` and implement a `run` method which takes a single
parameter `text`, the entire HTML document as a single Unicode string. `run` should return a single Unicode string
ready for output. Note that preprocessors use a list of lines while postprocessors use a single multi-line string.

#### Example

Here is a simple example that changes the output to one big page showing the raw html.

```python
from markdown.postprocessors import Postprocessor
import re

class ShowActualHtmlPostprocessor(Postprocessor):
    """ Wrap entire output in <pre> tags as a diagnostic. """
    def run(self, text):
        # Escape every '<' so the browser shows the markup instead of rendering it
        return '<pre>\n' + re.sub('<', '&lt;', text) + '</pre>\n'
```

#### Usages

Some postprocessors in the Markdown source tree include:

| Class                         | Kind      | Description                                                                                  |
| ------------------------------|-----------|----------------------------------------------------------------------------------------------|
| [`raw_html`][p1]              | built-in  | Restore raw html from `htmlStash`, stored by `HtmlBlockPreprocessor`, and code highlighters |
| [`amp_substitute`][p2]        | built-in  | Convert ampersand substitutes to `&`; used in links                                          |
| [`unescape`][p3]              | built-in  | Convert some escaped characters back from integers; used in links                           |
| [`FootnotePostProcessor`][p4] | extension | Replace footnote placeholders with html entities; as set by other stages                    |

 [p1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/postprocessors.py
 [p2]: https://github.com/Python-Markdown/markdown/blob/master/markdown/postprocessors.py
 [p3]: https://github.com/Python-Markdown/markdown/blob/master/markdown/postprocessors.py
 [p4]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py


## Working with the ElementTree {: #working_with_et }

As mentioned, the Markdown parser converts a source document to an [ElementTree][ElementTree] object before
serializing that back to Unicode text. Markdown has provided some helpers to ease that manipulation within the context
of the Markdown module.

First, import the ElementTree module:

```python
import xml.etree.ElementTree as etree
```

Sometimes you may want text inserted into an element to be parsed by [Inline Patterns][]. In such a situation, simply
insert the text as you normally would and the text will be automatically run through the Inline Patterns. However, if
you do *not* want some text to be parsed by Inline Patterns, then insert the text as an `AtomicString`.

```python
from markdown.util import AtomicString
some_element.text = AtomicString(some_text)
```

Here's a basic example which creates an HTML table (note that the contents of the second cell (`td2`) will be run
through Inline Patterns later):

```python
table = etree.Element("table")
table.set("cellpadding", "2")                      # Set cellpadding to 2
tr = etree.SubElement(table, "tr")                 # Add child tr to table
td1 = etree.SubElement(tr, "td")                   # Add child td1 to tr
td1.text = AtomicString("Cell content")            # Add plain text content
td2 = etree.SubElement(tr, "td")                   # Add second td to tr
td2.text = "*text* with **inline** formatting."    # Add markup text
table.tail = "Text after table"                    # Add text after table
```

You can also manipulate an existing tree.
Consider the following example which adds a `class` attribute to `<a>`
elements:

```python
def set_link_class(element):
    for child in element:
        if child.tag == "a":
            child.set("class", "myclass")  # set the class attribute
        set_link_class(child)  # run recursively on children
```

For more information about working with ElementTree see the [ElementTree
Documentation][ElementTree].

## Working with Raw HTML {: #working_with_raw_html }

Occasionally an extension may need to call out to a third party library which returns a pre-made string
of raw HTML that needs to be inserted into the document unmodified. Raw strings can be stashed for later
retrieval using an `htmlStash` instance, rather than converting them into `ElementTree` objects. A raw string
(which may or may not be raw HTML) passed to `self.md.htmlStash.store()` will be saved to the stash and a
placeholder string will be returned which should be inserted into the tree instead. After the tree is
serialized, a postprocessor will replace the placeholder with the raw string. This prevents subsequent
processing steps from modifying the HTML data. For example,

```python
html = "<p>This is some <em>raw</em> HTML data</p>"
el = etree.Element("div")
el.text = self.md.htmlStash.store(html)
```

For the global `htmlStash` instance to be available from a processor, the `markdown.Markdown` instance must
be passed to the processor from [extendMarkdown](#extendmarkdown) and will be available as `self.md.htmlStash`.

## Integrating Your Code Into Markdown {: #integrating_into_markdown }

Once you have the various pieces of your extension built, you need to tell Markdown about them and ensure that they
are run in the proper sequence. Markdown accepts an `Extension` instance for each extension.
Therefore, you will need
to define a class that extends `markdown.extensions.Extension` and overrides the `extendMarkdown` method. Within this
class you will manage configuration options for your extension and attach the various processors and patterns to the
Markdown instance.

It is important to note that the order of the various processors and patterns matters. For example, if we replace
`http://...` links with `<a>` elements, and *then* try to deal with inline HTML, we will end up with a mess.
Therefore, the various types of processors and patterns are stored within an instance of the `markdown.Markdown` class
in a [Registry][]. Your `Extension` class will need to manipulate those registries appropriately. You may `register`
instances of your processors and patterns with an appropriate priority, `deregister` built-in instances, or replace a
built-in instance with your own.

### `extendMarkdown` {: #extendmarkdown }

The `extendMarkdown` method of a `markdown.extensions.Extension` class accepts one argument:

* **`md`**:

    A pointer to the instance of the `markdown.Markdown` class. You should use this to access the
    [Registries][Registry] of processors and patterns. They are found under the following attributes:

    * `md.preprocessors`
    * `md.inlinePatterns`
    * `md.parser.blockprocessors`
    * `md.treeprocessors`
    * `md.postprocessors`

    Some other things you may want to access on the `markdown.Markdown` instance are:

    * `md.htmlStash`
    * `md.output_formats`
    * `md.set_output_format()`
    * `md.output_format`
    * `md.serializer`
    * `md.registerExtension()`
    * `md.tab_length`
    * `md.block_level_elements`
    * `md.isBlockLevel()`

!!! Warning
    With access to the above items, theoretically you have the option to change anything through various
    [monkey_patching][] techniques. However, you should be aware that the various undocumented parts of Markdown may
    change without notice and your monkey_patches may break with a new release. Therefore, what you really should be
    doing is inserting processors and patterns into the Markdown pipeline. Consider yourself warned!

[monkey_patching]: https://en.wikipedia.org/wiki/Monkey_patch

A simple example:

```python
from markdown.extensions import Extension

class MyExtension(Extension):
    def extendMarkdown(self, md):
        # Register instance of 'mypattern' with a priority of 175
        # (`MyPattern` stands for an inline processor class defined elsewhere)
        md.inlinePatterns.register(MyPattern(md), 'mypattern', 175)
```

### registerExtension {: #registerextension }

Some extensions may need to have their state reset between multiple runs of the `markdown.Markdown` class. For
example, consider the following use of the [Footnotes][] extension:

```python
md = markdown.Markdown(extensions=['footnotes'])
html1 = md.convert(text_with_footnote)
md.reset()
html2 = md.convert(text_without_footnote)
```

Without calling `reset`, the footnote definitions from the first document will be inserted into the second document as
they are still stored within the class instance. Therefore the `Extension` class needs to define a `reset` method that
will reset the state of the extension (e.g., `self.footnotes = {}`). However, as many extensions do not have a need
for `reset`, `reset` is only called on extensions that are registered.

To register an extension, call `md.registerExtension` from within your `extendMarkdown` method:

```python
def extendMarkdown(self, md):
    md.registerExtension(self)
    # insert processors and patterns here
```

Then, each time `reset` is called on the `markdown.Markdown` instance, the `reset` method of each registered extension
will be called as well.
You should also note that `reset` will be called on each registered extension after it is
initialized the first time. Keep that in mind when overriding the extension's `reset` method.

### Configuration Settings {: #configsettings }

If an extension uses any parameters that the user may want to change, those parameters should be stored in
`self.config` of your `markdown.extensions.Extension` class in the following format:

```python
import markdown

class MyExtension(markdown.extensions.Extension):
    def __init__(self, **kwargs):
        # Each option maps to a [default value, description] pair.
        self.config = {
            'option1' : ['value1', 'description1'],
            'option2' : ['value2', 'description2']
        }
        super(MyExtension, self).__init__(**kwargs)
```

When implemented this way the configuration parameters can be overridden at run time (thus the call to `super`). For
example:

```python
markdown.Markdown(extensions=[MyExtension(option1='other value')])
```

Note that if a keyword is passed in that is not already defined in `self.config`, then a `KeyError` is raised.

The `markdown.extensions.Extension` class and its subclasses have the following methods available to assist in working
with configuration settings:

* **`getConfig(key [, default])`**:

    Returns the stored value for the given `key` or `default` if the `key` does not exist. If not given, `default`
    is an empty string.

* **`getConfigs()`**:

    Returns a dict of all key/value pairs.

* **`getConfigInfo()`**:

    Returns all configuration descriptions as a list of tuples.

* **`setConfig(key, value)`**:

    Sets a configuration setting for `key` with the given `value`. If `key` is unknown, a `KeyError` is raised. If the
    previous value of `key` was a Boolean value, then `value` is converted to a Boolean value. If the previous value
    of `key` is `None`, then `value` is converted to a Boolean value except when it is `None`.
    No conversion takes place when the previous value of `key` is a string.

* **`setConfigs(items)`**:

    Sets multiple configuration settings given a dict of key/value pairs.

### Naming an Extension { #naming_an_extension }

As noted in the [library reference], an instance of an extension can be passed directly to `markdown.Markdown`. In
fact, this is the preferred way to use third-party extensions.

For example:

```python
import markdown
from path.to.module import MyExtension
md = markdown.Markdown(extensions=[MyExtension(option='value')])
```

However, Markdown also accepts "named" third-party extensions for those occasions when it is impractical to import an
extension directly (from the command line or from within templates). A "name" can either be a registered [entry
point](#entry_point) or a string using Python's [dot notation](#dot_notation).

#### Entry Point { #entry_point }

[Entry points] are defined in a Python package's `setup.py` script. The script must use [setuptools] to support entry
points. Python-Markdown extensions must be assigned to the `markdown.extensions` group. An entry point definition
might look like this:

```python
from setuptools import setup

setup(
    # ...
    entry_points={
        'markdown.extensions': ['myextension = path.to.module:MyExtension']
    }
)
```

After a user installs your extension using the above script, they could then call the extension using the
`myextension` string name like this:

```python
markdown.markdown(text, extensions=['myextension'])
```

Note that if two or more entry points within the same group are assigned the same name, Python-Markdown will only ever
use the first one found and ignore all others. Therefore, be sure to give your extension a unique name.
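
Because only the first matching entry point is used, it can be worth checking whether a name is already taken in the
environment you are testing against. The following is a minimal sketch that relies only on the standard library's
`importlib.metadata`. The helper name `find_extension_entry_points` is hypothetical and not part of Python-Markdown,
and the sketch makes no claim about the order in which Python-Markdown itself discovers entry points.

```python
from importlib.metadata import entry_points


def find_extension_entry_points(name, group="markdown.extensions"):
    """Return all installed entry points in `group` registered as `name`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable interface.
        candidates = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        candidates = eps.get(group, [])
    return [ep for ep in candidates if ep.name == name]


# 'myextension' is the example name used above.
matches = find_extension_entry_points("myextension")
if len(matches) > 1:
    print("Name collision:", [ep.value for ep in matches])
```

An empty list means nothing is registered under that name in the current environment. It says nothing about packages
that are not installed there.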

For more information on writing `setup.py` scripts, see the Python documentation on [Packaging and Distributing
Projects].

#### Dot Notation { #dot_notation }

If an extension does not have a registered entry point, Python's dot notation may be used instead. The extension must
be installed as a Python module on your PYTHONPATH. Generally, a class should be specified in the name. The class must
be at the end of the name and be separated by a colon from the module.

Therefore, if you were to import the class like this:

```python
from path.to.module import MyExtension
```

Then the extension can be loaded as follows:

```python
markdown.markdown(text, extensions=['path.to.module:MyExtension'])
```

You do not need to do anything special to support this feature. As long as your extension class is able to be
imported, a user can include it with the above syntax.

The above two methods are especially useful if you need to implement a large number of extensions with more than one
residing in a module. However, if you do not want to require that your users include the class name in their string,
you must define only one extension per module and that module must contain a module-level function called
`makeExtension` that accepts `**kwargs` and returns an extension instance.

For example:

```python
import markdown

class MyExtension(markdown.extensions.Extension):
    # Define extension here...
    pass

def makeExtension(**kwargs):
    return MyExtension(**kwargs)
```

When `markdown.Markdown` is passed the "name" of your extension as a dot notation string that does not include a class
(for example `path.to.module`), it will import the module and call the `makeExtension` function to instantiate your
extension.
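
The two dot notation forms described above can be sketched with the standard library alone. This is a simplified
illustration, not Python-Markdown's actual loader: `load_by_name`, `demo_ext_module`, and the stand-in `MyExtension`
class are all hypothetical names, and a fake module is registered in `sys.modules` so that the sketch runs without
installing anything.

```python
import importlib
import sys
import types


def load_by_name(name, **configs):
    """Resolve 'pkg.module:ClassName' or 'pkg.module' to an instance."""
    module_name, _, class_name = name.partition(":")
    module = importlib.import_module(module_name)
    if class_name:
        # A class was named explicitly after the colon.
        return getattr(module, class_name)(**configs)
    # No class given: fall back to the module-level factory function.
    return module.makeExtension(**configs)


# Hypothetical stand-in for a real extension class.
class MyExtension:
    def __init__(self, **kwargs):
        self.options = kwargs


# Build a fake importable module that holds the class and the factory.
demo = types.ModuleType("demo_ext_module")
demo.MyExtension = MyExtension
demo.makeExtension = lambda **kwargs: MyExtension(**kwargs)
sys.modules["demo_ext_module"] = demo

ext_a = load_by_name("demo_ext_module:MyExtension", option="value")
ext_b = load_by_name("demo_ext_module", option="other")
```

Both calls produce an instance of the same class. The only difference is whether the class is named in the string or
found through `makeExtension`.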

## Registries

The `markdown.util.Registry` class is a priority sorted registry which Markdown uses internally to determine the
processing order of its various processors and patterns.

A `Registry` instance provides two public methods to alter the data of the registry: `register` and `deregister`. Use
`register` to add items and `deregister` to remove items. See each method for specifics.

When registering an item, a "name" and a "priority" must be provided. All items are automatically sorted by the value
of the "priority" parameter such that the item with the highest value will be processed first. The "name" is used to
remove (`deregister`) and get items.

A `Registry` instance is like a list (which maintains order) when reading data. You may iterate over the items, get an
item and get a count (length) of all items. You may also check that the registry contains an item.

When getting an item you may use either the index of the item or the string-based "name". For example:

```python
from markdown.util import Registry

# `SomeItem` stands for any processor or pattern class.
registry = Registry()
registry.register(SomeItem(), 'itemname', 20)
# Get the item by index
item = registry[0]
# Get the item by name
item = registry['itemname']
```

When checking that the registry contains an item, you may use either the string-based "name", or a reference to the
actual item. For example:

```python
someitem = SomeItem()
registry.register(someitem, 'itemname', 20)
# Contains the name
assert 'itemname' in registry
# Contains the item instance
assert someitem in registry
```

`markdown.util.Registry` has the following methods:

### `Registry.register(self, item, name, priority)` {: #registry.register data-toc-label='Registry.register'}

:   Add an item to the registry with the given name and priority.

    Parameters:

    * `item`: The item being registered.
    * `name`: A string used to reference the item.
    * `priority`: An integer or float used to sort against all items.

    If an item is registered with a "name" which already exists, the existing item is replaced with the new item.
    Be careful as the old item is lost with no way to recover it. The new item will be sorted according to its
    priority and will **not** retain the position of the old item.

### `Registry.deregister(self, name, strict=True)` {: #registry.deregister data-toc-label='Registry.deregister'}

:   Remove an item from the registry.

    Set `strict=False` to fail silently when no item with the given `name` exists.

### `Registry.get_index_for_name(self, name)` {: #registry.get_index_for_name data-toc-label='Registry.get_index_for_name'}

:   Return the index of the given `name`.

[match object]: https://docs.python.org/3/library/re.html#match-objects
[bug tracker]: https://github.com/Python-Markdown/markdown/issues
[extension source]: https://github.com/Python-Markdown/markdown/tree/master/markdown/extensions
[tutorial]: https://github.com/Python-Markdown/markdown/wiki/Tutorial-1---Writing-Extensions-for-Python-Markdown
[workingwithetree]: #working_with_et
[Integrating your code into Markdown]: #integrating_into_markdown
[extendMarkdown]: #extendmarkdown
[Registry]: #registry
[registerExtension]: #registerextension
[Config Settings]: #configsettings
[makeExtension]: #makeextension
[ElementTree]: https://docs.python.org/3/library/xml.etree.elementtree.html
[Available Extensions]: index.md
[Footnotes]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py
[Definition Lists]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/definition_lists
[library reference]: ../reference.md
[setuptools]: https://packaging.python.org/key_projects/#setuptools
[Entry points]: https://setuptools.readthedocs.io/en/latest/setuptools.html#dynamic-discovery-of-services-and-plugins
[Packaging and
Distributing Projects]: https://packaging.python.org/tutorials/distributing-packages/