---
title: CommonMark Spec
author:
- John MacFarlane
version: 2
date: 2014-09-19
...

# Introduction

## What is Markdown?

Markdown is a plain text format for writing structured documents,
based on conventions used for indicating formatting in email and
Usenet posts. It was developed in 2004 by John Gruber, who wrote
the first Markdown-to-HTML converter in Perl, and it soon became
widely used in websites. By 2014 there were dozens of
implementations in many languages. Some of them extended basic
Markdown syntax with conventions for footnotes, definition lists,
tables, and other constructs, and some allowed output not just in
HTML but in LaTeX and many other formats.

## Why is a spec needed?

John Gruber's [canonical description of Markdown's
syntax](http://daringfireball.net/projects/markdown/syntax)
does not specify the syntax unambiguously. Here are some examples of
questions it does not answer:

1.  How much indentation is needed for a sublist? The syntax
    description says that continuation paragraphs need to be indented
    four spaces, but is not fully explicit about sublists. It is
    natural to think that they, too, must be indented four spaces,
    but `Markdown.pl` does not require that. This is hardly a
    "corner case," and divergences between implementations on this
    issue often lead to surprises for users in real documents.
    (See [this comment by John
    Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)

2.  Is a blank line needed before a block quote or header?
    Most implementations do not require the blank line. However,
    this can lead to unexpected results in hard-wrapped text, and
    also to ambiguities in parsing (note that some implementations
    put the header inside the blockquote, while others do not).
    (John Gruber has also spoken [in favor of requiring the blank
    lines](http://article.gmane.org/gmane.text.markdown.general/2146).)

3.  Is a blank line needed before an indented code block?
    (`Markdown.pl` requires it, but this is not mentioned in the
    documentation, and some implementations do not require it.)

    ``` markdown
    paragraph
        code?
    ```

4.  What is the exact rule for determining when list items get
    wrapped in `<p>` tags? Can a list be partially "loose" and partially
    "tight"? What should we do with a list like this?

    ``` markdown
    1. one

    2. two
    3. three
    ```

    Or this?

    ``` markdown
    1.  one
        - a

        - b
    2.  two
    ```

    (There are some relevant comments by John Gruber
    [here](http://article.gmane.org/gmane.text.markdown.general/2554).)

5.  Can list markers be indented? Can ordered list markers be right-aligned?

    ``` markdown
     8. item 1
     9. item 2
    10. item 2a
    ```

6.  Is this one list with a horizontal rule in its second item,
    or two lists separated by a horizontal rule?

    ``` markdown
    * a
    * * * * *
    * b
    ```

7.  When list markers change from numbers to bullets, do we have
    two lists or one? (The Markdown syntax description suggests two,
    but the Perl scripts and many other implementations produce one.)

    ``` markdown
    1. fee
    2. fie
    -  foe
    -  fum
    ```

8.  What are the precedence rules for the markers of inline structure?
    For example, is the following a valid link, or does the code span
    take precedence?

    ``` markdown
    [a backtick (`)](/url) and [another backtick (`)](/url).
    ```

9.  What are the precedence rules for markers of emphasis and strong
    emphasis? For example, how should the following be parsed?

    ``` markdown
    *foo *bar* baz*
    ```

10. What are the precedence rules between block-level and inline-level
    structure? For example, how should the following be parsed?

    ``` markdown
    - `a long code span can contain a hyphen like this
      - and it can screw things up`
    ```

11. Can list items include headers? (`Markdown.pl` does not allow this,
    but headers can occur in blockquotes.)

    ``` markdown
    - # Heading
    ```

12. Can link references be defined inside block quotes or list items?

    ``` markdown
    > Blockquote [foo].
    >
    > [foo]: /url
    ```

13. If there are multiple definitions for the same reference, which takes
    precedence?

    ``` markdown
    [foo]: /url1
    [foo]: /url2

    [foo][]
    ```

In the absence of a spec, early implementers consulted `Markdown.pl`
to resolve these ambiguities. But `Markdown.pl` was quite buggy, and
gave manifestly bad results in many cases, so it was not a
satisfactory replacement for a spec.

Because there is no unambiguous spec, implementations have diverged
considerably. As a result, users are often surprised to find that
a document that renders one way on one system (say, a GitHub wiki)
renders differently on another (say, converting to DocBook using
pandoc). To make matters worse, because nothing in Markdown counts
as a "syntax error," the divergence often isn't discovered right away.

## About this document

This document attempts to specify Markdown syntax unambiguously.
It contains many examples with side-by-side Markdown and
HTML. These are intended to double as conformance tests. An
accompanying script `runtests.pl` can be used to run the tests
against any Markdown program:

    perl runtests.pl spec.txt PROGRAM

Since this document describes how Markdown is to be parsed into
an abstract syntax tree, it would have made sense to use an abstract
representation of the syntax tree instead of HTML. But HTML is capable
of representing the structural distinctions we need to make, and the
choice of HTML for the tests makes it possible to run the tests against
an implementation without writing an abstract syntax tree renderer.
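
The side-by-side examples in this document share a simple layout: a line
holding a single `.`, the Markdown source, a second `.` line, the expected
HTML, and a closing `.` line. As a rough illustration of how a test runner
could pull these pairs out of `spec.txt`, here is a minimal Python sketch.
It is not `runtests.pl`; the function name `extract_examples` is
hypothetical, and the sketch assumes that a lone `.` never occurs as a
line inside an example.

```python
def extract_examples(spec_text):
    """Return a list of (markdown, html) pairs found in spec-style text."""
    examples = []
    state = 0  # 0 = prose, 1 = Markdown part, 2 = HTML part
    markdown, html = [], []
    for line in spec_text.split("\n"):
        if line == ".":
            state = (state + 1) % 3
            if state == 0:  # closing dot: the example is complete
                examples.append(("\n".join(markdown), "\n".join(html)))
                markdown, html = [], []
        elif state == 1:
            markdown.append(line)
        elif state == 2:
            html.append(line)
    return examples


sample = "Wrong characters:\n\n.\n+++\n.\n<p>+++</p>\n.\n"
print(extract_examples(sample))  # [('+++', '<p>+++</p>')]
```

A runner built on this would feed each Markdown part to the program under
test and compare its output with the HTML part.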

This document is generated from a text file, `spec.txt`, written
in Markdown with a small extension for the side-by-side tests.
The script `spec2md.pl` can be used to turn `spec.txt` into pandoc
Markdown, which can then be converted into other formats.

In the examples, the `→` character is used to represent tabs.

# Preprocessing

A [line](#line) <a id="line"></a>
is a sequence of zero or more characters followed by a line
ending (CR, LF, or CRLF) or by the end of
file.

This spec does not specify an encoding; it thinks of lines as composed
of characters rather than bytes. A conforming parser may be limited
to a certain encoding.

Tabs in lines are expanded to spaces, with a tab stop of 4 characters:

.
→foo→baz→→bim
.
<pre><code>foo baz     bim
</code></pre>
.

.
    a→a
    ὐ→a
.
<pre><code>a   a
ὐ   a
</code></pre>
.

Line endings are replaced by newline characters (LF).

A line containing no characters, or a line containing only spaces (after
tab expansion), is called a [blank line](#blank-line).
<a id="blank-line"></a>

# Blocks and inlines

We can think of a document as a sequence of [blocks](#block)<a
id="block"></a>---structural elements like paragraphs, block quotations,
lists, headers, rules, and code blocks. Blocks can contain other
blocks, or they can contain [inline](#inline)<a id="inline"></a> content:
words, spaces, links, emphasized text, images, and inline code.

## Precedence

Indicators of block structure always take precedence over indicators
of inline structure. So, for example, the following is a list with
two items, not a list with one item containing a code span:

.
- `one
- two`
.
<ul>
<li>`one</li>
<li>two`</li>
</ul>
.
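
The example above can be mimicked with a toy two-step pass, given here as
a hedged Python sketch and not as a real parser. The hypothetical helper
`split_bullet_items` handles only single-line `- ` items, and
`code_spans` only looks for single-backtick spans. Because the lines are
divided into list items first, neither item contains a matched pair of
backticks, so no code span is found.

```python
import re


def split_bullet_items(text):
    """Step 1 (block structure): collect the text of each '- ' list item."""
    return [line[2:] for line in text.split("\n") if line.startswith("- ")]


def code_spans(inline_text):
    """Step 2 (inline structure): find single-backtick code spans in one item."""
    return re.findall(r"`([^`]+)`", inline_text)


items = split_bullet_items("- `one\n- two`")
print(items)                           # ['`one', 'two`']
print([code_spans(i) for i in items])  # [[], []]
```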

This means that parsing can proceed in two steps: first, the block
structure of the document can be discerned; second, text lines inside
paragraphs, headers, and other block constructs can be parsed for inline
structure. The second step requires information about link reference
definitions that will be available only at the end of the first
step. Note that the first step requires processing lines in sequence,
but the second can be parallelized, since the inline parsing of
one block element does not affect the inline parsing of any other.

## Container blocks and leaf blocks

We can divide blocks into two types:
[container blocks](#container-block), <a id="container-block"></a>
which can contain other blocks, and [leaf blocks](#leaf-block),
<a id="leaf-block"></a> which cannot.

# Leaf blocks

This section describes the different kinds of leaf block that make up a
Markdown document.

## Horizontal rules

A line consisting of 0-3 spaces of indentation, followed by a sequence
of three or more matching `-`, `_`, or `*` characters, each followed
optionally by any number of spaces, forms a [horizontal
rule](#horizontal-rule). <a id="horizontal-rule"></a>

.
***
---
___
.
<hr />
<hr />
<hr />
.

Wrong characters:

.
+++
.
<p>+++</p>
.

.
===
.
<p>===</p>
.

Not enough characters:

.
--
**
__
.
<p>--
**
__</p>
.

One to three spaces indent are allowed:

.
 ***
  ***
   ***
.
<hr />
<hr />
<hr />
.

Four spaces is too many:

.
    ***
.
<pre><code>***
</code></pre>
.

.
Foo
    ***
.
<p>Foo
***</p>
.

More than three characters may be used:

.
_____________________________________
.
<hr />
.

Spaces are allowed between the characters:

.
 - - -
.
<hr />
.
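
The horizontal rule definition can be written as a single regular
expression. The following Python sketch is one possible reading of that
definition, with the hypothetical name `is_horizontal_rule`. It looks at
one line in isolation (tabs already expanded, no line ending) and ignores
context, so it cannot tell whether a line of hyphens is instead acting as
a header underline.

```python
import re

# 0-3 spaces, then one of - _ *, then at least two more of the same
# character, each optionally preceded by spaces; trailing spaces allowed.
HRULE = re.compile(r"^ {0,3}([-_*])(?: *\1){2,} *$")


def is_horizontal_rule(line):
    return HRULE.match(line) is not None


print(is_horizontal_rule("***"))        # True
print(is_horizontal_rule(" - - -"))     # True
print(is_horizontal_rule("    ***"))    # False: four spaces of indentation
print(is_horizontal_rule("--"))         # False: not enough characters
```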

.
 **  * ** * ** * **
.
<hr />
.

.
-     -      -      -
.
<hr />
.

Spaces are allowed at the end:

.
- - - -    
.
<hr />
.

However, no other characters may occur at the end or the
beginning:

.
_ _ _ _ a

a------
.
<p>_ _ _ _ a</p>
<p>a------</p>
.

It is required that all of the non-space characters be the same.
So, this is not a horizontal rule:

.
 *-*
.
<p><em>-</em></p>
.

Horizontal rules do not need blank lines before or after:

.
- foo
***
- bar
.
<ul>
<li>foo</li>
</ul>
<hr />
<ul>
<li>bar</li>
</ul>
.

Horizontal rules can interrupt a paragraph:

.
Foo
***
bar
.
<p>Foo</p>
<hr />
<p>bar</p>
.

Note, however, that this is a setext header, not a paragraph followed
by a horizontal rule:

.
Foo
---
bar
.
<h2>Foo</h2>
<p>bar</p>
.

When both a horizontal rule and a list item are possible
interpretations of a line, the horizontal rule is preferred:

.
* Foo
* * *
* Bar
.
<ul>
<li>Foo</li>
</ul>
<hr />
<ul>
<li>Bar</li>
</ul>
.

If you want a horizontal rule in a list item, use a different bullet:

.
- Foo
- * * *
.
<ul>
<li>Foo</li>
<li><hr /></li>
</ul>
.

## ATX headers

An [ATX header](#atx-header) <a id="atx-header"></a>
consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped `#` characters and an optional
closing sequence of any number of `#` characters. The opening sequence
of `#` characters cannot be followed directly by a nonspace character.
The closing `#` characters may be followed by spaces only. The opening
`#` character may be indented 0-3 spaces.
The raw contents of the
header are stripped of leading and trailing spaces before being parsed
as inline content. The header level is equal to the number of `#`
characters in the opening sequence.

Simple headers:

.
# foo
## foo
### foo
#### foo
##### foo
###### foo
.
<h1>foo</h1>
<h2>foo</h2>
<h3>foo</h3>
<h4>foo</h4>
<h5>foo</h5>
<h6>foo</h6>
.

More than six `#` characters is not a header:

.
####### foo
.
<p>####### foo</p>
.

A space is required between the `#` characters and the header's
contents. Note that many implementations currently do not require
the space. However, the space was required by the [original ATX
implementation](http://www.aaronsw.com/2002/atx/atx.py), and it helps
prevent things like the following from being parsed as headers:

.
#5 bolt
.
<p>#5 bolt</p>
.

This is not a header, because the first `#` is escaped:

.
\## foo
.
<p>## foo</p>
.

Contents are parsed as inlines:

.
# foo *bar* \*baz\*
.
<h1>foo <em>bar</em> *baz*</h1>
.

Leading and trailing blanks are ignored in parsing inline content:

.
#                  foo                     
.
<h1>foo</h1>
.

One to three spaces indentation are allowed:

.
 ### foo
  ## foo
   # foo
.
<h3>foo</h3>
<h2>foo</h2>
<h1>foo</h1>
.

Four spaces are too much:

.
    # foo
.
<pre><code># foo
</code></pre>
.

.
foo
    # bar
.
<p>foo
# bar</p>
.

A closing sequence of `#` characters is optional:

.
## foo ##
  ###   bar    ###
.
<h2>foo</h2>
<h3>bar</h3>
.

It need not be the same length as the opening sequence:

.
# foo ##################################
##### foo ##
.
<h1>foo</h1>
<h5>foo</h5>
.

Spaces are allowed after the closing sequence:

.
### foo ###     
.
<h3>foo</h3>
.
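
The ATX header rules can be approximated line by line. The Python sketch
below is an illustration under stated assumptions, not a reference
implementation: `parse_atx_header` is a hypothetical name, the input is a
single line with tabs already expanded, and the function returns the
header level together with the raw contents, leaving inline parsing
(including the handling of backslash escapes) to a later step.

```python
import re

# 0-3 spaces, 1-6 '#' characters, then either end of line or at least
# one space followed by the raw contents; trailing spaces are dropped.
ATX = re.compile(r"^ {0,3}(#{1,6})(?: +(.*?))? *$")
# A closing sequence: a trailing run of '#' that does not start right
# after a backslash.
CLOSING = re.compile(r"(?<!\\)#+$")


def parse_atx_header(line):
    """Return (level, raw_contents), or None if the line is not an ATX header."""
    m = ATX.match(line)
    if m is None:
        return None
    raw = m.group(2) or ""
    raw = CLOSING.sub("", raw).strip(" ")
    return len(m.group(1)), raw


print(parse_atx_header("## foo ##"))     # (2, 'foo')
print(parse_atx_header("#5 bolt"))       # None
print(parse_atx_header("####### foo"))   # None
```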

A sequence of `#` characters with a nonspace character following it
is not a closing sequence, but counts as part of the contents of the
header:

.
### foo ### b
.
<h3>foo ### b</h3>
.

Backslash-escaped `#` characters do not count as part
of the closing sequence:

.
### foo \###
## foo \#\##
# foo \#
.
<h3>foo #</h3>
<h2>foo ##</h2>
<h1>foo #</h1>
.

ATX headers need not be separated from surrounding content by blank
lines, and they can interrupt paragraphs:

.
****
## foo
****
.
<hr />
<h2>foo</h2>
<hr />
.

.
Foo bar
# baz
Bar foo
.
<p>Foo bar</p>
<h1>baz</h1>
<p>Bar foo</p>
.

ATX headers can be empty:

.
## 
#
### ###
.
<h2></h2>
<h1></h1>
<h3></h3>
.

## Setext headers

A [setext header](#setext-header) <a id="setext-header"></a>
consists of a line of text, containing at least one nonspace character,
with no more than 3 spaces indentation, followed by a [setext header
underline](#setext-header-underline). A [setext header
underline](#setext-header-underline) <a id="setext-header-underline"></a>
is a sequence of `=` characters or a sequence of `-` characters, with no
more than 3 spaces indentation and any number of trailing
spaces. The header is a level 1 header if `=` characters are used, and
a level 2 header if `-` characters are used. The contents of the header
are the result of parsing the first line as Markdown inline content.

In general, a setext header need not be preceded or followed by a
blank line. However, it cannot interrupt a paragraph, so when a
setext header comes after a paragraph, a blank line is needed between
them.

Simple examples:

.
Foo *bar*
=========

Foo *bar*
---------
.
<h1>Foo <em>bar</em></h1>
<h2>Foo <em>bar</em></h2>
.
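
The underline test and the level rule from the definition above can be
sketched in a few lines of Python. The name `setext_level` is
hypothetical, and the sketch assumes the caller has already established
that the text line is ordinary text that does not continue a paragraph
and is not itself some other block construct; it checks only the shape
of the two lines.

```python
import re

# 0-3 spaces, a run of '=' or a run of '-', then optional trailing spaces.
UNDERLINE = re.compile(r"^ {0,3}(=+|-+) *$")
# A content line: 0-3 spaces of indentation, then a nonspace character.
CONTENT = re.compile(r"^ {0,3}\S")


def setext_level(text_line, next_line):
    """Return 1 or 2 if the two lines have the shape of a setext header."""
    if CONTENT.match(text_line) is None:
        return None
    m = UNDERLINE.match(next_line)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2


print(setext_level("Foo *bar*", "========="))  # 1
print(setext_level("Foo *bar*", "---------"))  # 2
```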

The underlining can be any length:

.
Foo
-------------------------

Foo
=
.
<h2>Foo</h2>
<h1>Foo</h1>
.

The header content can be indented up to three spaces, and need
not line up with the underlining:

.
   Foo
---

  Foo
-----

  Foo
  ===
.
<h2>Foo</h2>
<h2>Foo</h2>
<h1>Foo</h1>
.

Four spaces indent is too much:

.
    Foo
    ---

    Foo
---
.
<pre><code>Foo
---

Foo
</code></pre>
<hr />
.

The setext header underline can be indented up to three spaces, and
may have trailing spaces:

.
Foo
   ----      
.
<h2>Foo</h2>
.

Four spaces is too much:

.
Foo
    ---
.
<p>Foo
---</p>
.

The setext header underline cannot contain internal spaces:

.
Foo
= =

Foo
--- -
.
<p>Foo
= =</p>
<p>Foo</p>
<hr />
.

Trailing spaces in the content line do not cause a line break:

.
Foo  
-----
.
<h2>Foo</h2>
.

Nor does a backslash at the end:

.
Foo\
----
.
<h2>Foo\</h2>
.

Since indicators of block structure take precedence over
indicators of inline structure, the following are setext headers:

.
`Foo
----
`

<a title="a lot
---
of dashes"/>
.
<h2>`Foo</h2>
<p>`</p>
<h2>&lt;a title=&quot;a lot</h2>
<p>of dashes&quot;/&gt;</p>
.

The setext header underline cannot be a lazy line:

.
> Foo
---
.
<blockquote>
<p>Foo</p>
</blockquote>
<hr />
.

A setext header cannot interrupt a paragraph:

.
Foo
Bar
---

Foo
Bar
===
.
<p>Foo
Bar</p>
<hr />
<p>Foo
Bar
===</p>
.

But in general a blank line is not required before or after:

.
---
Foo
---
Bar
---
Baz
.
<hr />
<h2>Foo</h2>
<h2>Bar</h2>
<p>Baz</p>
.

Setext headers cannot be empty:

.

====
.
<p>====</p>
.


## Indented code blocks

An [indented code block](#indented-code-block)
<a id="indented-code-block"></a> is composed of one or more
[indented chunks](#indented-chunk) separated by blank lines.
An [indented chunk](#indented-chunk) <a id="indented-chunk"></a>
is a sequence of non-blank lines, each indented four or more
spaces. An indented code block cannot interrupt a paragraph, so
if it follows a paragraph, there must be an intervening blank
line. (No blank line is needed between an indented code block and a
following paragraph.) The contents of the code block are
the literal contents of the lines, including trailing newlines,
minus four spaces of indentation. An indented code block has no
attributes.

.
    a simple
      indented code block
.
<pre><code>a simple
  indented code block
</code></pre>
.

The contents are literal text, and do not get parsed as Markdown:

.
    <a/>
    *hi*

    - one
.
<pre><code>&lt;a/&gt;
*hi*

- one
</code></pre>
.

Here we have three chunks separated by blank lines:

.
    chunk1

    chunk2



    chunk3
.
<pre><code>chunk1

chunk2



chunk3
</code></pre>
.

Any initial spaces beyond four will be included in the content, even
in interior blank lines:

.
    chunk1
      
      chunk2
.
<pre><code>chunk1
  
  chunk2
</code></pre>
.

An indented code block cannot interrupt a paragraph. (This
allows hanging indents and the like.)

.
Foo
    bar

.
<p>Foo
bar</p>
.

However, any non-blank line with fewer than four leading spaces ends
the code block immediately. So a paragraph may occur immediately
after indented code:

.
    foo
bar
.
<pre><code>foo
</code></pre>
<p>bar</p>
.

And indented code can occur immediately before and after other kinds of
blocks:

.
# Header
    foo
Header
------
    foo
----
.
<h1>Header</h1>
<pre><code>foo
</code></pre>
<h2>Header</h2>
<pre><code>foo
</code></pre>
<hr />
.

The first line can be indented more than four spaces:

.
        foo
    bar
.
<pre><code>    foo
bar
</code></pre>
.

Blank lines preceding or following an indented code block
are not included in it:

.

    
    foo
    

.
<pre><code>foo
</code></pre>
.

Trailing spaces are included in the code block's content:

.
    foo  
.
<pre><code>foo  
</code></pre>
.


## Fenced code blocks

A [code fence](#code-fence) <a id="code-fence"></a> is a sequence
of at least three consecutive backtick characters (`` ` ``) or
tildes (`~`). (Tildes and backticks cannot be mixed.)
A [fenced code block](#fenced-code-block) <a id="fenced-code-block"></a>
begins with a code fence, indented no more than three spaces.

The line with the opening code fence may optionally contain some text
following the code fence; this is trimmed of leading and trailing
spaces and called the [info string](#info-string).
<a id="info-string"></a> The info string may not contain any backtick
characters. (The reason for this restriction is that otherwise
some inline code would be incorrectly interpreted as the
beginning of a fenced code block.)

The content of the code block consists of all subsequent lines, until
a closing [code fence](#code-fence) of the same type as the code block
began with (backticks or tildes), and with at least as many backticks
or tildes as the opening code fence. If the leading code fence is
indented N spaces, then up to N spaces of indentation are removed from
each line of the content (if present). (If a content line is not
indented, it is preserved unchanged.
If it is indented less than N
spaces, all of the indentation is removed.)

The closing code fence may be indented up to three spaces, and may be
followed only by spaces, which are ignored. If the end of the
containing block (or document) is reached and no closing code fence
has been found, the code block contains all of the lines after the
opening code fence until the end of the containing block (or
document). (An alternative spec would require backtracking in the
event that a closing code fence is not found. But this makes parsing
much less efficient, and there seems to be no real downside to the
behavior described here.)

A fenced code block may interrupt a paragraph, and does not require
a blank line either before or after.

The content of a fenced code block is treated as literal text, not
parsed as inlines. The first word of the info string is typically used to
specify the language of the code sample, and rendered in the `class`
attribute of the `code` tag. However, this spec does not mandate any
particular treatment of the info string.

Here is a simple example with backticks:

.
```
<
 >
```
.
<pre><code>&lt;
 &gt;
</code></pre>
.

With tildes:

.
~~~
<
 >
~~~
.
<pre><code>&lt;
 &gt;
</code></pre>
.

The closing code fence must use the same character as the opening
fence:

.
```
aaa
~~~
```
.
<pre><code>aaa
~~~
</code></pre>
.

.
~~~
aaa
```
~~~
.
<pre><code>aaa
```
</code></pre>
.

The closing code fence must be at least as long as the opening fence:

.
````
aaa
```
``````
.
<pre><code>aaa
```
</code></pre>
.

.
~~~~
aaa
~~~
~~~~
.
<pre><code>aaa
~~~
</code></pre>
.
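
The fence-matching rules given so far (same fence character, closing
fence at least as long as the opening one, removal of up to N spaces of
indentation, no backtracking when the block is unclosed) can be sketched
as a small scanner. This Python sketch is illustrative only:
`read_fenced_block` is a hypothetical name, and `lines` is assumed to
hold the lines of the containing block with tabs already expanded.

```python
import re

# Opening fence: 0-3 spaces, 3+ backticks or 3+ tildes, then an optional
# info string that contains no backticks.
OPEN_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,}) *([^`]*?) *$")


def read_fenced_block(lines, start):
    """Parse a fenced code block opening at lines[start].

    Returns (info_string, content_lines, next_index), or None if
    lines[start] is not an opening code fence.
    """
    m = OPEN_FENCE.match(lines[start])
    if m is None:
        return None
    indent, fence, info = len(m.group(1)), m.group(2), m.group(3)
    # Closing fence: same character, at least as long, spaces only after it.
    closing = re.compile(
        r"^ {0,3}" + re.escape(fence[0]) + "{%d,} *$" % len(fence))
    content = []
    i = start + 1
    while i < len(lines):
        if closing.match(lines[i]):
            return info, content, i + 1
        line = lines[i]
        # Remove up to `indent` leading spaces from each content line.
        strip = min(indent, len(line) - len(line.lstrip(" ")))
        content.append(line[strip:])
        i += 1
    return info, content, i  # unclosed: runs to the end of the input


print(read_fenced_block(["````", "aaa", "```", "``````"], 0))
# ('', ['aaa', '```'], 4)
```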

Unclosed code blocks are closed by the end of the document:

.
```
.
<pre><code></code></pre>
.

.
`````

```
aaa
.
<pre><code>
```
aaa
</code></pre>
.

A code block can have all empty lines as its content:

.
```

  
```
.
<pre><code>
  
</code></pre>
.

A code block can be empty:

.
```
```
.
<pre><code></code></pre>
.

Fences can be indented. If the opening fence is indented,
content lines will have equivalent opening indentation removed,
if present:

.
 ```
 aaa
aaa
```
.
<pre><code>aaa
aaa
</code></pre>
.

.
  ```
aaa
  aaa
aaa
  ```
.
<pre><code>aaa
aaa
aaa
</code></pre>
.

.
   ```
   aaa
    aaa
  aaa
   ```
.
<pre><code>aaa
 aaa
aaa
</code></pre>
.

Four spaces indentation produces an indented code block:

.
    ```
    aaa
    ```
.
<pre><code>```
aaa
```
</code></pre>
.

Code fences (opening and closing) cannot contain internal spaces:

.
``` ```
aaa
.
<p><code></code>
aaa</p>
.

.
~~~~~~
aaa
~~~ ~~
.
<pre><code>aaa
~~~ ~~
</code></pre>
.

Fenced code blocks can interrupt paragraphs, and can be followed
directly by paragraphs, without a blank line between:

.
foo
```
bar
```
baz
.
<p>foo</p>
<pre><code>bar
</code></pre>
<p>baz</p>
.

Other blocks can also occur before and after fenced code blocks
without an intervening blank line:

.
foo
---
~~~
bar
~~~
# baz
.
<h2>foo</h2>
<pre><code>bar
</code></pre>
<h1>baz</h1>
.

An [info string](#info-string) can be provided after the opening code fence.
Leading and trailing spaces will be stripped, and the first word, prefixed
with `language-`, is used as the value for the `class` attribute of the
`code` element within the enclosing `pre` element.

.
```ruby
def foo(x)
  return 3
end
```
.
<pre><code class="language-ruby">def foo(x)
  return 3
end
</code></pre>
.

.
~~~~    ruby startline=3 $%@#$
def foo(x)
  return 3
end
~~~~~~~
.
<pre><code class="language-ruby">def foo(x)
  return 3
end
</code></pre>
.

.
````;
````
.
<pre><code class="language-;"></code></pre>
.

Info strings for backtick code blocks cannot contain backticks:

.
``` aa ```
foo
.
<p><code>aa</code>
foo</p>
.

Closing code fences cannot have info strings:

.
```
``` aaa
```
.
<pre><code>``` aaa
</code></pre>
.


## HTML blocks

An [HTML block tag](#html-block-tag) <a id="html-block-tag"></a> is
an [open tag](#open-tag) or [closing tag](#closing-tag) whose tag
name is one of the following (case-insensitive):
`article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `iframe`,
`body`, `li`, `map`, `button`, `object`, `canvas`, `ol`, `caption`,
`output`, `col`, `p`, `colgroup`, `pre`, `dd`, `progress`, `div`,
`section`, `dl`, `table`, `td`, `dt`, `tbody`, `embed`, `textarea`,
`fieldset`, `tfoot`, `figcaption`, `th`, `figure`, `thead`, `footer`,
`tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`,
`video`, `script`, `style`.
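
As a rough illustration of the tag-name test, the following Python sketch
checks whether a line begins with something that looks like an HTML block
tag. It is a simplification under stated assumptions: the full grammar of
open and closing tags is defined elsewhere in this spec and is not
reproduced here, the name `starts_with_html_block_tag` is hypothetical,
and the pattern allows up to three spaces of indentation, as the HTML
block rule in this section does for the initial line.

```python
import re

HTML_BLOCK_TAGS = frozenset("""
article header aside hgroup blockquote hr iframe body li map button
object canvas ol caption output col p colgroup pre dd progress div
section dl table td dt tbody embed textarea fieldset tfoot figcaption
th figure thead footer tr form ul h1 h2 h3 h4 h5 h6 video script style
""".split())

# '<' or '</', then a tag name that ends at whitespace, '/', '>' or the
# end of the line. Attributes are not inspected.
TAG_START = re.compile(r"^ {0,3}</?([A-Za-z][A-Za-z0-9]*)(?=[\s/>]|$)")


def starts_with_html_block_tag(line):
    m = TAG_START.match(line)
    return m is not None and m.group(1).lower() in HTML_BLOCK_TAGS


print(starts_with_html_block_tag('<DIV CLASS="foo">'))  # True
print(starts_with_html_block_tag("</table>"))           # True
print(starts_with_html_block_tag("<foo><a>"))           # False
```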

An [HTML block](#html-block) <a id="html-block"></a> begins with an
[HTML block tag](#html-block-tag), [HTML comment](#html-comment),
[processing instruction](#processing-instruction),
[declaration](#declaration), or [CDATA section](#cdata-section).
It ends when a [blank line](#blank-line) or the end of the
input is encountered. The initial line may be indented up to three
spaces, and subsequent lines may have any indentation. The contents
of the HTML block are interpreted as raw HTML, and will not be escaped
in HTML output.

Some simple examples:

.
<table>
  <tr>
    <td>
           hi
    </td>
  </tr>
</table>

okay.
.
<table>
  <tr>
    <td>
           hi
    </td>
  </tr>
</table>
<p>okay.</p>
.

.
 <div>
  *hello*
         <foo><a>
.
 <div>
  *hello*
         <foo><a>
.

Here we have two HTML blocks with a Markdown paragraph between them:

.
<DIV CLASS="foo">

*Markdown*

</DIV>
.
<DIV CLASS="foo">
<p><em>Markdown</em></p>
</DIV>
.

In the following example, what looks like a Markdown code block
is actually part of the HTML block, which continues until a blank
line or the end of the document is reached:

.
<div></div>
``` c
int x = 33;
```
.
<div></div>
``` c
int x = 33;
```
.

A comment:

.
<!-- Foo
bar
   baz -->
.
<!-- Foo
bar
   baz -->
.

A processing instruction:

.
<?php
  echo 'foo'
?>
.
<?php
  echo 'foo'
?>
.

CDATA:

.
<![CDATA[
function matchwo(a,b)
{
if (a < b && a < 0) then
  {
  return 1;
  }
else
  {
  return 0;
  }
}
]]>
.
<![CDATA[
function matchwo(a,b)
{
if (a < b && a < 0) then
  {
  return 1;
  }
else
  {
  return 0;
  }
}
]]>
.

The opening tag can be indented 1-3 spaces, but not 4:

.
  <!-- foo -->

    <!-- foo -->
.
  <!-- foo -->
<pre><code>&lt;!-- foo --&gt;
</code></pre>
.

An HTML block can interrupt a paragraph, and need not be preceded
by a blank line.

.
Foo
<div>
bar
</div>
.
<p>Foo</p>
<div>
bar
</div>
.

However, a following blank line is always needed, except at the end of
a document:

.
<div>
bar
</div>
*foo*
.
<div>
bar
</div>
*foo*
.

An incomplete HTML block tag may also start an HTML block:

.
<div class
foo
.
<div class
foo
.

This rule differs from John Gruber's original Markdown syntax
specification, which says:

> The only restrictions are that block-level HTML elements —
> e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from
> surrounding content by blank lines, and the start and end tags of the
> block should not be indented with tabs or spaces.

In some ways Gruber's rule is more restrictive than the one given
here:

- It requires that an HTML block be preceded by a blank line.
- It does not allow the start tag to be indented.
- It requires a matching end tag, which it also does not allow to
  be indented.

Indeed, most Markdown implementations, including some of Gruber's
own Perl implementations, do not impose these restrictions.

There is one respect, however, in which Gruber's rule is more liberal
than the one given here, since it allows blank lines to occur inside
an HTML block. There are two reasons for disallowing them here.
First, it removes the need to parse balanced tags, which is
expensive and can require backtracking from the end of the document
if no matching end tag is found. Second, it provides a very simple
and flexible way of including Markdown content inside HTML tags:
simply separate the Markdown from the HTML using blank lines:

.
<div>

*Emphasized* text.

</div>
.
<div>
<p><em>Emphasized</em> text.</p>
</div>
.

Compare:

.
<div>
*Emphasized* text.
</div>
.
<div>
*Emphasized* text.
</div>
.

Some Markdown implementations have adopted a convention of
interpreting content inside tags as Markdown if the open tag has
the attribute `markdown=1`. The rule given above seems a simpler and
more elegant way of achieving the same expressive power, which is also
much simpler to parse.

The main potential drawback is that one can no longer paste HTML
blocks into Markdown documents with 100% reliability. However,
*in most cases* this will work fine, because the blank lines in
HTML are usually followed by HTML block tags. For example:

.
<table>

<tr>

<td>
Hi
</td>

</tr>

</table>
.
<table>
<tr>
<td>
Hi
</td>
</tr>
</table>
.

Moreover, blank lines are usually not necessary and can be
deleted. The exception is inside `<pre>` tags; here, one can
replace the blank lines with `&#10;` entities.

So there is no important loss of expressive power with the new rule.
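
Because an HTML block simply runs to the next blank line, finding where
it ends needs no tag matching at all. A minimal Python sketch, assuming
`lines[start]` has already been recognized as the start of an HTML block
(the function name `html_block_end` is hypothetical):

```python
def html_block_end(lines, start):
    """Return the index just past the HTML block that starts at lines[start]."""
    end = start
    # A blank line has no characters, or only spaces (after tab expansion).
    while end < len(lines) and lines[end].strip(" ") != "":
        end += 1
    return end


separated = ["<div>", "", "*Emphasized* text.", "", "</div>"]
unbroken = ["<div>", "*Emphasized* text.", "</div>"]
print(html_block_end(separated, 0))  # 1: the block is just "<div>"
print(html_block_end(unbroken, 0))   # 3: the block covers all three lines
```

In the first case the middle line falls outside any HTML block and is
parsed as a Markdown paragraph; in the second it stays raw HTML.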
1632 1633## Link reference definitions 1634 1635A [link reference definition](#link-reference-definition) 1636<a id="link-reference-definition"></a> consists of a [link 1637label](#link-label), indented up to three spaces, followed 1638by a colon (`:`), optional blank space (including up to one 1639newline), a [link destination](#link-destination), optional 1640blank space (including up to one newline), and an optional [link 1641title](#link-title), which if it is present must be separated 1642from the [link destination](#link-destination) by whitespace. 1643No further non-space characters may occur on the line. 1644 1645A [link reference definition](#link-reference-definition) 1646does not correspond to a structural element of a document. Instead, it 1647defines a label which can be used in [reference links](#reference-link) 1648and reference-style [images](#image) elsewhere in the document. [Link 1649reference definitions] can come either before or after the links that use 1650them. 1651 1652. 1653[foo]: /url "title" 1654 1655[foo] 1656. 1657<p><a href="/url" title="title">foo</a></p> 1658. 1659 1660. 1661 [foo]: 1662 /url 1663 'the title' 1664 1665[foo] 1666. 1667<p><a href="/url" title="the title">foo</a></p> 1668. 1669 1670. 1671[Foo*bar\]]:my_(url) 'title (with parens)' 1672 1673[Foo*bar\]] 1674. 1675<p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p> 1676. 1677 1678. 1679[Foo bar]: 1680<my url> 1681'title' 1682 1683[Foo bar] 1684. 1685<p><a href="my%20url" title="title">Foo bar</a></p> 1686. 1687 1688The title may be omitted: 1689 1690. 1691[foo]: 1692/url 1693 1694[foo] 1695. 1696<p><a href="/url">foo</a></p> 1697. 1698 1699The link destination may not be omitted: 1700 1701. 1702[foo]: 1703 1704[foo] 1705. 1706<p>[foo]:</p> 1707<p>[foo]</p> 1708. 1709 1710A link can come before its corresponding definition: 1711 1712. 1713[foo] 1714 1715[foo]: url 1716. 1717<p><a href="url">foo</a></p> 1718.
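
Since a definition may follow the link that uses it, one way to handle
this is to collect every definition before resolving any link. The Python
sketch below illustrates that two-pass idea under loud simplifications: it
recognizes only single-line definitions, ignores backslash escapes in
labels, angle-bracket destinations, code blocks, and paragraph context, and
all names (`DEFINITION`, `normalize`, `collect_definitions`, `resolve`) are
hypothetical rather than taken from any implementation.

```python
import re

# Simplified single-line form only: [label]: destination "optional title"
DEFINITION = re.compile(
    r'^ {0,3}\[(?P<label>[^\]]+)\]:[ \t]*(?P<dest>\S+)'
    r'(?:[ \t]+(?P<quote>["\'])(?P<title>.*?)(?P=quote))?[ \t]*$'
)


def normalize(label):
    # Labels are matched case-insensitively; casefold() is an assumption
    # about how that comparison could be done.
    return label.strip().casefold()


def collect_definitions(lines):
    """First pass: map each normalized label to (destination, title)."""
    refs = {}
    for line in lines:
        m = DEFINITION.match(line)
        if m:
            # setdefault keeps the first definition seen for a label.
            refs.setdefault(normalize(m.group("label")),
                            (m.group("dest"), m.group("title")))
    return refs


def resolve(label, refs):
    """Second pass: look up a reference link's label, or return None."""
    return refs.get(normalize(label))


refs = collect_definitions(["[foo]", "", "[foo]: url"])
target = resolve("foo", refs)  # found although the definition comes last
```
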
1719 1720If there are several matching definitions, the first one takes 1721precedence: 1722 1723. 1724[foo] 1725 1726[foo]: first 1727[foo]: second 1728. 1729<p><a href="first">foo</a></p> 1730. 1731 1732As noted in the section on [Links], matching of labels is 1733case-insensitive (see [matches](#matches)). 1734 1735. 1736[FOO]: /url 1737 1738[Foo] 1739. 1740<p><a href="/url">Foo</a></p> 1741. 1742 1743. 1744[ΑΓΩ]: /φου 1745 1746[αγω] 1747. 1748<p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p> 1749. 1750 1751Here is a link reference definition with no corresponding link. 1752It contributes nothing to the document. 1753 1754. 1755[foo]: /url 1756. 1757. 1758 1759This is not a link reference definition, because there are 1760non-space characters after the title: 1761 1762. 1763[foo]: /url "title" ok 1764. 1765<p>[foo]: /url "title" ok</p> 1766. 1767 1768This is not a link reference definition, because it is indented 1769four spaces: 1770 1771. 1772 [foo]: /url "title" 1773 1774[foo] 1775. 1776<pre><code>[foo]: /url "title" 1777</code></pre> 1778<p>[foo]</p> 1779. 1780 1781This is not a link reference definition, because it occurs inside 1782a code block: 1783 1784. 1785``` 1786[foo]: /url 1787``` 1788 1789[foo] 1790. 1791<pre><code>[foo]: /url 1792</code></pre> 1793<p>[foo]</p> 1794. 1795 1796A [link reference definition](#link-reference-definition) cannot 1797interrupt a paragraph. 1798 1799. 1800Foo 1801[bar]: /baz 1802 1803[bar] 1804. 1805<p>Foo 1806[bar]: /baz</p> 1807<p>[bar]</p> 1808. 1809 1810However, it can directly follow other block elements, such as headers 1811and horizontal rules, and it need not be followed by a blank line. 1812 1813. 1814# [Foo] 1815[foo]: /url 1816> bar 1817. 1818<h1><a href="/url">Foo</a></h1> 1819<blockquote> 1820<p>bar</p> 1821</blockquote> 1822. 1823 1824Several [link references](#link-reference) can occur one after another, 1825without intervening blank lines. 1826 1827. 
1828[foo]: /foo-url "foo" 1829[bar]: /bar-url 1830 "bar" 1831[baz]: /baz-url 1832 1833[foo], 1834[bar], 1835[baz] 1836. 1837<p><a href="/foo-url" title="foo">foo</a>, 1838<a href="/bar-url" title="bar">bar</a>, 1839<a href="/baz-url">baz</a></p> 1840. 1841 1842[Link reference definitions](#link-reference-definition) can occur 1843inside block containers, like lists and block quotations. They 1844affect the entire document, not just the container in which they 1845are defined: 1846 1847. 1848[foo] 1849 1850> [foo]: /url 1851. 1852<p><a href="/url">foo</a></p> 1853<blockquote> 1854</blockquote> 1855. 1856 1857 1858## Paragraphs 1859 1860A sequence of non-blank lines that cannot be interpreted as other 1861kinds of blocks forms a [paragraph](#paragraph).<a id="paragraph"></a> 1862The contents of the paragraph are the result of parsing the 1863paragraph's raw content as inlines. The paragraph's raw content 1864is formed by concatenating the lines and removing initial and final 1865spaces. 1866 1867A simple example with two paragraphs: 1868 1869. 1870aaa 1871 1872bbb 1873. 1874<p>aaa</p> 1875<p>bbb</p> 1876. 1877 1878Paragraphs can contain multiple lines, but no blank lines: 1879 1880. 1881aaa 1882bbb 1883 1884ccc 1885ddd 1886. 1887<p>aaa 1888bbb</p> 1889<p>ccc 1890ddd</p> 1891. 1892 1893Multiple blank lines between paragraphs have no effect: 1894 1895. 1896aaa 1897 1898 1899bbb 1900. 1901<p>aaa</p> 1902<p>bbb</p> 1903. 1904 1905Leading spaces are skipped: 1906 1907. 1908 aaa 1909 bbb 1910. 1911<p>aaa 1912bbb</p> 1913. 1914 1915Lines after the first may be indented any amount, since indented 1916code blocks cannot interrupt paragraphs. 1917 1918. 1919aaa 1920 bbb 1921 ccc 1922. 1923<p>aaa 1924bbb 1925ccc</p> 1926. 1927 1928However, the first line may be indented at most three spaces, 1929or an indented code block will be triggered: 1930 1931. 1932 aaa 1933bbb 1934. 1935<p>aaa 1936bbb</p> 1937. 1938 1939. 1940 aaa 1941bbb 1942.
1943<pre><code>aaa 1944</code></pre> 1945<p>bbb</p> 1946. 1947 1948Final spaces are stripped before inline parsing, so a paragraph 1949that ends with two or more spaces will not end with a hard line 1950break: 1951 1952. 1953aaa 1954bbb 1955. 1956<p>aaa<br /> 1957bbb</p> 1958. 1959 1960## Blank lines 1961 1962[Blank lines](#blank-line) between block-level elements are ignored, 1963except for the role they play in determining whether a [list](#list) 1964is [tight](#tight) or [loose](#loose). 1965 1966Blank lines at the beginning and end of the document are also ignored. 1967 1968. 1969 1970 1971aaa 1972 1973 1974# aaa 1975 1976 1977. 1978<p>aaa</p> 1979<h1>aaa</h1> 1980. 1981 1982 1983# Container blocks 1984 1985A [container block](#container-block) is a block that has other 1986blocks as its contents. There are two basic kinds of container blocks: 1987[block quotes](#block-quote) and [list items](#list-item). 1988[Lists](#list) are meta-containers for [list items](#list-item). 1989 1990We define the syntax for container blocks recursively. The general 1991form of the definition is: 1992 1993> If X is a sequence of blocks, then the result of 1994> transforming X in such-and-such a way is a container of type Y 1995> with these blocks as its content. 1996 1997So, we explain what counts as a block quote or list item by explaining 1998how these can be *generated* from their contents. This should suffice 1999to define the syntax, although it does not give a recipe for *parsing* 2000these constructions. (A recipe is provided below in the section entitled 2001[A parsing strategy](#appendix-a-a-parsing-strategy).) 2002 2003## Block quotes 2004 2005A [block quote marker](#block-quote-marker) <a id="block-quote-marker"></a> 2006consists of 0-3 spaces of initial indent, plus (a) the character `>` together 2007with a following space, or (b) a single character `>` not followed by a space. 
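
The definition of a block quote marker translates almost directly into a
regular expression. The following Python sketch (the names `MARKER` and
`strip_block_quote_marker` are hypothetical) removes one marker from a line
and returns what is left, or `None` when the line does not begin with a
marker:

```python
import re

# 0-3 spaces of indent, then `>`, then at most one following space.
MARKER = re.compile(r'^ {0,3}> ?')


def strip_block_quote_marker(line):
    """Return the line without its block quote marker, or None if it has none."""
    m = MARKER.match(line)
    return line[m.end():] if m else None


with_space = strip_block_quote_marker("> # Foo")
without_space = strip_block_quote_marker(">bar")
four_spaces = strip_block_quote_marker("    > # Foo")
```

Only one space is consumed as part of the marker, so any further spaces
stay in the content and count toward the indentation of what follows.
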
2008 2009The following rules define [block quotes](#block-quote): 2010<a id="block-quote"></a> 2011 20121. **Basic case.** If a string of lines *Ls* constitute a sequence 2013 of blocks *Bs*, then the result of prepending a [block quote 2014 marker](#block-quote-marker) to the beginning of each line in *Ls* 2015 is a [block quote](#block-quote) containing *Bs*. 2016 20172. **Laziness.** If a string of lines *Ls* constitute a [block 2018 quote](#block-quote) with contents *Bs*, then the result of deleting 2019 the initial [block quote marker](#block-quote-marker) from one or 2020 more lines in which the next non-space character after the [block 2021 quote marker](#block-quote-marker) is [paragraph continuation 2022 text](#paragraph-continuation-text) is a block quote with *Bs* as 2023 its content. <a id="paragraph-continuation-text"></a> 2024 [Paragraph continuation text](#paragraph-continuation-text) is text 2025 that will be parsed as part of the content of a paragraph, but does 2026 not occur at the beginning of the paragraph. 2027 20283. **Consecutiveness.** A document cannot contain two [block 2029 quotes](#block-quote) in a row unless there is a [blank 2030 line](#blank-line) between them. 2031 2032Nothing else counts as a [block quote](#block-quote). 2033 2034Here is a simple example: 2035 2036. 2037> # Foo 2038> bar 2039> baz 2040. 2041<blockquote> 2042<h1>Foo</h1> 2043<p>bar 2044baz</p> 2045</blockquote> 2046. 2047 2048The spaces after the `>` characters can be omitted: 2049 2050. 2051># Foo 2052>bar 2053> baz 2054. 2055<blockquote> 2056<h1>Foo</h1> 2057<p>bar 2058baz</p> 2059</blockquote> 2060. 2061 2062The `>` characters can be indented 1-3 spaces: 2063 2064. 2065 > # Foo 2066 > bar 2067 > baz 2068. 2069<blockquote> 2070<h1>Foo</h1> 2071<p>bar 2072baz</p> 2073</blockquote> 2074. 2075 2076Four spaces gives us a code block: 2077 2078. 2079 > # Foo 2080 > bar 2081 > baz 2082. 2083<pre><code>> # Foo 2084> bar 2085> baz 2086</code></pre> 2087.
2088 2089The Laziness clause allows us to omit the `>` before a 2090paragraph continuation line: 2091 2092. 2093> # Foo 2094> bar 2095baz 2096. 2097<blockquote> 2098<h1>Foo</h1> 2099<p>bar 2100baz</p> 2101</blockquote> 2102. 2103 2104A block quote can contain some lazy and some non-lazy 2105continuation lines: 2106 2107. 2108> bar 2109baz 2110> foo 2111. 2112<blockquote> 2113<p>bar 2114baz 2115foo</p> 2116</blockquote> 2117. 2118 2119Laziness only applies to lines that are continuations of 2120paragraphs. Lines containing characters or indentation that indicate 2121block structure cannot be lazy. 2122 2123. 2124> foo 2125--- 2126. 2127<blockquote> 2128<p>foo</p> 2129</blockquote> 2130<hr /> 2131. 2132 2133. 2134> - foo 2135- bar 2136. 2137<blockquote> 2138<ul> 2139<li>foo</li> 2140</ul> 2141</blockquote> 2142<ul> 2143<li>bar</li> 2144</ul> 2145. 2146 2147. 2148> foo 2149 bar 2150. 2151<blockquote> 2152<pre><code>foo 2153</code></pre> 2154</blockquote> 2155<pre><code>bar 2156</code></pre> 2157. 2158 2159. 2160> ``` 2161foo 2162``` 2163. 2164<blockquote> 2165<pre><code></code></pre> 2166</blockquote> 2167<p>foo</p> 2168<pre><code></code></pre> 2169. 2170 2171A block quote can be empty: 2172 2173. 2174> 2175. 2176<blockquote> 2177</blockquote> 2178. 2179 2180. 2181> 2182> 2183> 2184. 2185<blockquote> 2186</blockquote> 2187. 2188 2189A block quote can have initial or final blank lines: 2190 2191. 2192> 2193> foo 2194> 2195. 2196<blockquote> 2197<p>foo</p> 2198</blockquote> 2199. 2200 2201A blank line always separates block quotes: 2202 2203. 2204> foo 2205 2206> bar 2207. 2208<blockquote> 2209<p>foo</p> 2210</blockquote> 2211<blockquote> 2212<p>bar</p> 2213</blockquote> 2214. 2215 2216(Most current Markdown implementations, including John Gruber's 2217original `Markdown.pl`, will parse this example as a single block quote 2218with two paragraphs. But it seems better to allow the author to decide 2219whether two block quotes or one are wanted.) 
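
The interaction between the Laziness clause and the blank-line rule can be
sketched in Python. This is a rough illustration under stated assumptions,
not a parsing algorithm: it handles only paragraphs that sit directly inside
the block quote, and `BLOCK_START` is a hypothetical and deliberately
incomplete test for lines that begin some other block (it knows nothing
about HTML blocks or setext header underlines made of `=`, for instance).

```python
import re

MARKER = re.compile(r'^ {0,3}> ?')

# Hypothetical, incomplete check for lines that start a new block.
BLOCK_START = re.compile(
    r'^ {0,3}(?:'
    r'[-+*](?: |$)'                               # bullet list marker
    r'|\d+[.)](?: |$)'                            # ordered list marker
    r'|#{1,6}(?: |$)'                             # ATX header
    r'|`{3,}|~{3,}'                               # code fence
    r'|(?:- *){3,}$|(?:_ *){3,}$|(?:\* *){3,}$'   # horizontal rule
    r')'
)


def is_paragraph_text(content, in_paragraph):
    """True if content would be parsed as a line of a paragraph."""
    if not content.strip() or BLOCK_START.match(content):
        return False
    # Four spaces start an indented code block unless a paragraph is open.
    return in_paragraph or not content.startswith("    ")


def first_block_quote(lines):
    """Split lines into (contents of the quote starting at lines[0], the rest)."""
    contents, in_paragraph = [], False
    for i, line in enumerate(lines):
        m = MARKER.match(line)
        if m:
            content = line[m.end():]
        elif in_paragraph and is_paragraph_text(line, True):
            content = line  # lazy continuation line
        else:
            return contents, lines[i:]
        contents.append(content)
        in_paragraph = is_paragraph_text(content, in_paragraph)
    return contents, []


lazy = first_block_quote(["> bar", "baz", "> foo"])
rule = first_block_quote(["> foo", "---"])
blank = first_block_quote(["> foo", "", "> bar"])
```

In the last call the blank line ends the first quote, so `> bar` is left
over to begin a second one.
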
2220 2221Consecutiveness means that if we put these block quotes together, 2222we get a single block quote: 2223 2224. 2225> foo 2226> bar 2227. 2228<blockquote> 2229<p>foo 2230bar</p> 2231</blockquote> 2232. 2233 2234To get a block quote with two paragraphs, use: 2235 2236. 2237> foo 2238> 2239> bar 2240. 2241<blockquote> 2242<p>foo</p> 2243<p>bar</p> 2244</blockquote> 2245. 2246 2247Block quotes can interrupt paragraphs: 2248 2249. 2250foo 2251> bar 2252. 2253<p>foo</p> 2254<blockquote> 2255<p>bar</p> 2256</blockquote> 2257. 2258 2259In general, blank lines are not needed before or after block 2260quotes: 2261 2262. 2263> aaa 2264*** 2265> bbb 2266. 2267<blockquote> 2268<p>aaa</p> 2269</blockquote> 2270<hr /> 2271<blockquote> 2272<p>bbb</p> 2273</blockquote> 2274. 2275 2276However, because of laziness, a blank line is needed between 2277a block quote and a following paragraph: 2278 2279. 2280> bar 2281baz 2282. 2283<blockquote> 2284<p>bar 2285baz</p> 2286</blockquote> 2287. 2288 2289. 2290> bar 2291 2292baz 2293. 2294<blockquote> 2295<p>bar</p> 2296</blockquote> 2297<p>baz</p> 2298. 2299 2300. 2301> bar 2302> 2303baz 2304. 2305<blockquote> 2306<p>bar</p> 2307</blockquote> 2308<p>baz</p> 2309. 2310 2311It is a consequence of the Laziness rule that any number 2312of initial `>`s may be omitted on a continuation line of a 2313nested block quote: 2314 2315. 2316> > > foo 2317bar 2318. 2319<blockquote> 2320<blockquote> 2321<blockquote> 2322<p>foo 2323bar</p> 2324</blockquote> 2325</blockquote> 2326</blockquote> 2327. 2328 2329. 2330>>> foo 2331> bar 2332>>baz 2333. 2334<blockquote> 2335<blockquote> 2336<blockquote> 2337<p>foo 2338bar 2339baz</p> 2340</blockquote> 2341</blockquote> 2342</blockquote> 2343. 2344 2345When including an indented code block in a block quote, 2346remember that the [block quote marker](#block-quote-marker) includes 2347both the `>` and a following space. So *five spaces* are needed after 2348the `>`: 2349 2350. 
2351> code 2352 2353> not code 2354. 2355<blockquote> 2356<pre><code>code 2357</code></pre> 2358</blockquote> 2359<blockquote> 2360<p>not code</p> 2361</blockquote> 2362. 2363 2364 2365## List items 2366 2367A [list marker](#list-marker) <a id="list-marker"></a> is a 2368[bullet list marker](#bullet-list-marker) or an [ordered list 2369marker](#ordered-list-marker). 2370 2371A [bullet list marker](#bullet-list-marker) <a id="bullet-list-marker"></a> 2372is a `-`, `+`, or `*` character. 2373 2374An [ordered list marker](#ordered-list-marker) <a id="ordered-list-marker"></a> 2375is a sequence of one or more digits (`0-9`), followed by either a 2376`.` character or a `)` character. 2377 2378The following rules define [list items](#list-item): 2379 23801. **Basic case.** If a sequence of lines *Ls* constitute a sequence of 2381 blocks *Bs* starting with a non-space character and not separated 2382 from each other by more than one blank line, and *M* is a list 2383 marker of width *W* followed by 0 < *N* < 5 spaces, then the result 2384 of prepending *M* and the following spaces to the first line of 2385 *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a 2386 list item with *Bs* as its contents. The type of the list item 2387 (bullet or ordered) is determined by the type of its list marker. 2388 If the list item is ordered, then it is also assigned a start 2389 number, based on the ordered list marker. 2390 2391For example, let *Ls* be the lines 2392 2393. 2394A paragraph 2395with two lines. 2396 2397 indented code 2398 2399> A block quote. 2400. 2401<p>A paragraph 2402with two lines.</p> 2403<pre><code>indented code 2404</code></pre> 2405<blockquote> 2406<p>A block quote.</p> 2407</blockquote> 2408. 2409 2410And let *M* be the marker `1.`, and *N* = 2. Then rule #1 says 2411that the following is an ordered list item with start number 1, 2412and the same contents as *Ls*: 2413 2414. 24151. A paragraph 2416 with two lines.
2417 2418 indented code 2419 2420 > A block quote. 2421. 2422<ol> 2423<li><p>A paragraph 2424with two lines.</p> 2425<pre><code>indented code 2426</code></pre> 2427<blockquote> 2428<p>A block quote.</p> 2429</blockquote></li> 2430</ol> 2431. 2432 2433The most important thing to notice is that the position of 2434the text after the list marker determines how much indentation 2435is needed in subsequent blocks in the list item. If the list 2436marker takes up two spaces, and there are three spaces between 2437the list marker and the next nonspace character, then blocks 2438must be indented five spaces in order to fall under the list 2439item. 2440 2441Here are some examples showing how far content must be indented to be 2442put under the list item: 2443 2444. 2445- one 2446 2447 two 2448. 2449<ul> 2450<li>one</li> 2451</ul> 2452<p>two</p> 2453. 2454 2455. 2456- one 2457 2458 two 2459. 2460<ul> 2461<li><p>one</p> 2462<p>two</p></li> 2463</ul> 2464. 2465 2466. 2467 - one 2468 2469 two 2470. 2471<ul> 2472<li>one</li> 2473</ul> 2474<pre><code> two 2475</code></pre> 2476. 2477 2478. 2479 - one 2480 2481 two 2482. 2483<ul> 2484<li><p>one</p> 2485<p>two</p></li> 2486</ul> 2487. 2488 2489It is tempting to think of this in terms of columns: the continuation 2490blocks must be indented at least to the column of the first nonspace 2491character after the list marker. However, that is not quite right. 2492The spaces after the list marker determine how much relative indentation 2493is needed. Which column this indentation reaches will depend on 2494how the list item is embedded in other constructions, as shown by 2495this example: 2496 2497. 2498 > > 1. one 2499>> 2500>> two 2501. 2502<blockquote> 2503<blockquote> 2504<ol> 2505<li><p>one</p> 2506<p>two</p></li> 2507</ol> 2508</blockquote> 2509</blockquote> 2510. 
2511 2512Here `two` occurs in the same column as the list marker `1.`, 2513but is actually contained in the list item, because there is 2514sufficient indentation after the last containing blockquote marker. 2515 2516The converse is also possible. In the following example, the word `two` 2517occurs far to the right of the initial text of the list item, `one`, but 2518it is not considered part of the list item, because it is not indented 2519far enough past the blockquote marker: 2520 2521. 2522>>- one 2523>> 2524 > > two 2525. 2526<blockquote> 2527<blockquote> 2528<ul> 2529<li>one</li> 2530</ul> 2531<p>two</p> 2532</blockquote> 2533</blockquote> 2534. 2535 2536A list item may not contain blocks that are separated by more than 2537one blank line. Thus, two blank lines will end a list, unless the 2538two blanks are contained in a [fenced code block](#fenced-code-block). 2539 2540. 2541- foo 2542 2543 bar 2544 2545- foo 2546 2547 2548 bar 2549 2550- ``` 2551 foo 2552 2553 2554 bar 2555 ``` 2556. 2557<ul> 2558<li><p>foo</p> 2559<p>bar</p></li> 2560<li><p>foo</p></li> 2561</ul> 2562<p>bar</p> 2563<ul> 2564<li><pre><code>foo 2565 2566 2567bar 2568</code></pre></li> 2569</ul> 2570. 2571 2572A list item may contain any kind of block: 2573 2574. 25751. foo 2576 2577 ``` 2578 bar 2579 ``` 2580 2581 baz 2582 2583 > bam 2584. 2585<ol> 2586<li><p>foo</p> 2587<pre><code>bar 2588</code></pre> 2589<p>baz</p> 2590<blockquote> 2591<p>bam</p> 2592</blockquote></li> 2593</ol> 2594. 2595 25962. **Item starting with indented code.** If a sequence of lines *Ls* 2597 constitute a sequence of blocks *Bs* starting with an indented code 2598 block and not separated from each other by more than one blank line, 2599 and *M* is a list marker *M* of width *W* followed by 2600 one space, then the result of prepending *M* and the following 2601 space to the first line of *Ls*, and indenting subsequent lines of 2602 *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents.
2603 If a line is empty, then it need not be indented. The type of the 2604 list item (bullet or ordered) is determined by the type of its list 2605 marker. If the list item is ordered, then it is also assigned a 2606 start number, based on the ordered list marker. 2607 2608An indented code block will have to be indented four spaces beyond 2609the edge of the region where text will be included in the list item. 2610In the following case that is 6 spaces: 2611 2612. 2613- foo 2614 2615 bar 2616. 2617<ul> 2618<li><p>foo</p> 2619<pre><code>bar 2620</code></pre></li> 2621</ul> 2622. 2623 2624And in this case it is 11 spaces: 2625 2626. 2627 10. foo 2628 2629 bar 2630. 2631<ol start="10"> 2632<li><p>foo</p> 2633<pre><code>bar 2634</code></pre></li> 2635</ol> 2636. 2637 2638If the *first* block in the list item is an indented code block, 2639then by rule #2, the contents must be indented *one* space after the 2640list marker: 2641 2642. 2643 indented code 2644 2645paragraph 2646 2647 more code 2648. 2649<pre><code>indented code 2650</code></pre> 2651<p>paragraph</p> 2652<pre><code>more code 2653</code></pre> 2654. 2655 2656. 26571. indented code 2658 2659 paragraph 2660 2661 more code 2662. 2663<ol> 2664<li><pre><code>indented code 2665</code></pre> 2666<p>paragraph</p> 2667<pre><code>more code 2668</code></pre></li> 2669</ol> 2670. 2671 2672Note that an additional space indent is interpreted as space 2673inside the code block: 2674 2675. 26761. indented code 2677 2678 paragraph 2679 2680 more code 2681. 2682<ol> 2683<li><pre><code> indented code 2684</code></pre> 2685<p>paragraph</p> 2686<pre><code>more code 2687</code></pre></li> 2688</ol> 2689. 2690 2691Note that rules #1 and #2 only apply to two cases: (a) cases 2692in which the lines to be included in a list item begin with a nonspace 2693character, and (b) cases in which they begin with an indented code 2694block. 
In a case like the following, where the first block begins with 2695a three-space indent, the rules do not allow us to form a list item by 2696indenting the whole thing and prepending a list marker: 2697 2698. 2699 foo 2700 2701bar 2702. 2703<p>foo</p> 2704<p>bar</p> 2705. 2706 2707. 2708- foo 2709 2710 bar 2711. 2712<ul> 2713<li>foo</li> 2714</ul> 2715<p>bar</p> 2716. 2717 2718This is not a significant restriction, because when a block begins 2719with 1-3 spaces indent, the indentation can always be removed without 2720a change in interpretation, allowing rule #1 to be applied. So, in 2721the above case: 2722 2723. 2724- foo 2725 2726 bar 2727. 2728<ul> 2729<li><p>foo</p> 2730<p>bar</p></li> 2731</ul> 2732. 2733 2734 27353. **Indentation.** If a sequence of lines *Ls* constitutes a list item 2736 according to rule #1 or #2, then the result of indenting each line 2737 of *Ls* by 1-3 spaces (the same for each line) also constitutes a 2738 list item with the same contents and attributes. If a line is 2739 empty, then it need not be indented. 2740 2741Indented one space: 2742 2743. 2744 1. A paragraph 2745 with two lines. 2746 2747 indented code 2748 2749 > A block quote. 2750. 2751<ol> 2752<li><p>A paragraph 2753with two lines.</p> 2754<pre><code>indented code 2755</code></pre> 2756<blockquote> 2757<p>A block quote.</p> 2758</blockquote></li> 2759</ol> 2760. 2761 2762Indented two spaces: 2763 2764. 2765 1. A paragraph 2766 with two lines. 2767 2768 indented code 2769 2770 > A block quote. 2771. 2772<ol> 2773<li><p>A paragraph 2774with two lines.</p> 2775<pre><code>indented code 2776</code></pre> 2777<blockquote> 2778<p>A block quote.</p> 2779</blockquote></li> 2780</ol> 2781. 2782 2783Indented three spaces: 2784 2785. 2786 1. A paragraph 2787 with two lines. 2788 2789 indented code 2790 2791 > A block quote. 2792.
2793<ol> 2794<li><p>A paragraph 2795with two lines.</p> 2796<pre><code>indented code 2797</code></pre> 2798<blockquote> 2799<p>A block quote.</p> 2800</blockquote></li> 2801</ol> 2802. 2803 2804Four spaces indent gives a code block: 2805 2806. 2807 1. A paragraph 2808 with two lines. 2809 2810 indented code 2811 2812 > A block quote. 2813. 2814<pre><code>1. A paragraph 2815 with two lines. 2816 2817 indented code 2818 2819 > A block quote. 2820</code></pre> 2821. 2822 2823 28244. **Laziness.** If a string of lines *Ls* constitute a [list 2825 item](#list-item) with contents *Bs*, then the result of deleting 2826 some or all of the indentation from one or more lines in which the 2827 next non-space character after the indentation is 2828 [paragraph continuation text](#paragraph-continuation-text) is a 2829 list item with the same contents and attributes. 2830 2831Here is an example with lazy continuation lines: 2832 2833. 2834 1. A paragraph 2835with two lines. 2836 2837 indented code 2838 2839 > A block quote. 2840. 2841<ol> 2842<li><p>A paragraph 2843with two lines.</p> 2844<pre><code>indented code 2845</code></pre> 2846<blockquote> 2847<p>A block quote.</p> 2848</blockquote></li> 2849</ol> 2850. 2851 2852Indentation can be partially deleted: 2853 2854. 2855 1. A paragraph 2856 with two lines. 2857. 2858<ol> 2859<li>A paragraph 2860with two lines.</li> 2861</ol> 2862. 2863 2864These examples show how laziness can work in nested structures: 2865 2866. 2867> 1. > Blockquote 2868continued here. 2869. 2870<blockquote> 2871<ol> 2872<li><blockquote> 2873<p>Blockquote 2874continued here.</p> 2875</blockquote></li> 2876</ol> 2877</blockquote> 2878. 2879 2880. 2881> 1. > Blockquote 2882> continued here. 2883. 2884<blockquote> 2885<ol> 2886<li><blockquote> 2887<p>Blockquote 2888continued here.</p> 2889</blockquote></li> 2890</ol> 2891</blockquote> 2892. 2893 2894 28955. 
**That's all.** Nothing that is not counted as a list item by rules 2896 #1--4 counts as a [list item](#list-item). 2897 2898The rules for sublists follow from the general rules above. A sublist 2899must be indented the same number of spaces a paragraph would need to be 2900in order to be included in the list item. 2901 2902So, in this case we need two spaces indent: 2903 2904. 2905- foo 2906 - bar 2907 - baz 2908. 2909<ul> 2910<li>foo 2911<ul> 2912<li>bar 2913<ul> 2914<li>baz</li> 2915</ul></li> 2916</ul></li> 2917</ul> 2918. 2919 2920One is not enough: 2921 2922. 2923- foo 2924 - bar 2925 - baz 2926. 2927<ul> 2928<li>foo</li> 2929<li>bar</li> 2930<li>baz</li> 2931</ul> 2932. 2933 2934Here we need four, because the list marker is wider: 2935 2936. 293710) foo 2938 - bar 2939. 2940<ol start="10"> 2941<li>foo 2942<ul> 2943<li>bar</li> 2944</ul></li> 2945</ol> 2946. 2947 2948Three is not enough: 2949 2950. 295110) foo 2952 - bar 2953. 2954<ol start="10"> 2955<li>foo</li> 2956</ol> 2957<ul> 2958<li>bar</li> 2959</ul> 2960. 2961 2962A list may be the first block in a list item: 2963 2964. 2965- - foo 2966. 2967<ul> 2968<li><ul> 2969<li>foo</li> 2970</ul></li> 2971</ul> 2972. 2973 2974. 29751. - 2. foo 2976. 2977<ol> 2978<li><ul> 2979<li><ol start="2"> 2980<li>foo</li> 2981</ol></li> 2982</ul></li> 2983</ol> 2984. 2985 2986A list item may be empty: 2987 2988. 2989- foo 2990- 2991- bar 2992. 2993<ul> 2994<li>foo</li> 2995<li></li> 2996<li>bar</li> 2997</ul> 2998. 2999 3000. 3001- 3002. 3003<ul> 3004<li></li> 3005</ul> 3006. 3007 3008### Motivation 3009 3010John Gruber's Markdown spec says the following about list items: 3011 30121. "List markers typically start at the left margin, but may be indented 3013 by up to three spaces. List markers must be followed by one or more 3014 spaces or a tab." 3015 30162. "To make lists look nice, you can wrap items with hanging indents.... 3017 But if you don't want to, you don't have to." 3018 30193. 
"List items may consist of multiple paragraphs. Each subsequent 3020 paragraph in a list item must be indented by either 4 spaces or one 3021 tab." 3022 30234. "It looks nice if you indent every line of the subsequent paragraphs, 3024 but here again, Markdown will allow you to be lazy." 3025 30265. "To put a blockquote within a list item, the blockquote's `>` 3027 delimiters need to be indented." 3028 30296. "To put a code block within a list item, the code block needs to be 3030 indented twice — 8 spaces or two tabs." 3031 3032These rules specify that a paragraph under a list item must be indented 3033four spaces (presumably, from the left margin, rather than the start of 3034the list marker, but this is not said), and that code under a list item 3035must be indented eight spaces instead of the usual four. They also say 3036that a block quote must be indented, but not by how much; however, the 3037example given has four spaces indentation. Although nothing is said 3038about other kinds of block-level content, it is certainly reasonable to 3039infer that *all* block elements under a list item, including other 3040lists, must be indented four spaces. This principle has been called the 3041*four-space rule*. 3042 3043The four-space rule is clear and principled, and if the reference 3044implementation `Markdown.pl` had followed it, it probably would have 3045become the standard. However, `Markdown.pl` allowed paragraphs and 3046sublists to start with only two spaces indentation, at least on the 3047outer level. Worse, its behavior was inconsistent: a sublist of an 3048outer-level list needed two spaces indentation, but a sublist of this 3049sublist needed three spaces. It is not surprising, then, that different 3050implementations of Markdown have developed very different rules for 3051determining what comes under a list item. 
(Pandoc and python-Markdown, 3052for example, stuck with Gruber's syntax description and the four-space 3053rule, while discount, redcarpet, marked, PHP Markdown, and others 3054followed `Markdown.pl`'s behavior more closely.) 3055 3056Unfortunately, given the divergences between implementations, there 3057is no way to give a spec for list items that will be guaranteed not 3058to break any existing documents. However, the spec given here should 3059correctly handle lists formatted with either the four-space rule or 3060the more forgiving `Markdown.pl` behavior, provided they are laid out 3061in a way that is natural for a human to read. 3062 3063The strategy here is to let the width and indentation of the list marker 3064determine the indentation necessary for blocks to fall under the list 3065item, rather than having a fixed and arbitrary number. The writer can 3066think of the body of the list item as a unit which gets indented to the 3067right enough to fit the list marker (and any indentation on the list 3068marker). (The laziness rule, #4, then allows continuation lines to be 3069unindented if needed.) 3070 3071This rule is superior, we claim, to any rule requiring a fixed level of 3072indentation from the margin. The four-space rule is clear but 3073unnatural. It is quite unintuitive that 3074 3075``` markdown 3076- foo 3077 3078 bar 3079 3080 - baz 3081``` 3082 3083should be parsed as two lists with an intervening paragraph, 3084 3085``` html 3086<ul> 3087<li>foo</li> 3088</ul> 3089<p>bar</p> 3090<ul> 3091<li>baz</li> 3092</ul> 3093``` 3094 3095as the four-space rule demands, rather than a single list, 3096 3097``` html 3098<ul> 3099<li><p>foo</p> 3100<p>bar</p> 3101<ul> 3102<li>baz</li> 3103</ul></li> 3104</ul> 3105``` 3106 3107The choice of four spaces is arbitrary. It can be learned, but it is 3108not likely to be guessed, and it trips up beginners regularly. 3109 3110Would it help to adopt a two-space rule? 
The problem is that such 3111a rule, together with the rule allowing 1--3 spaces indentation of the 3112initial list marker, allows text that is indented *less than* the 3113original list marker to be included in the list item. For example, 3114`Markdown.pl` parses 3115 3116``` markdown 3117 - one 3118 3119 two 3120``` 3121 3122as a single list item, with `two` a continuation paragraph: 3123 3124``` html 3125<ul> 3126<li><p>one</p> 3127<p>two</p></li> 3128</ul> 3129``` 3130 3131and similarly 3132 3133``` markdown 3134> - one 3135> 3136> two 3137``` 3138 3139as 3140 3141``` html 3142<blockquote> 3143<ul> 3144<li><p>one</p> 3145<p>two</p></li> 3146</ul> 3147</blockquote> 3148``` 3149 3150This is extremely unintuitive. 3151 3152Rather than requiring a fixed indent from the margin, we could require 3153a fixed indent (say, two spaces, or even one space) from the list marker (which 3154may itself be indented). This proposal would remove the last anomaly 3155discussed. Unlike the spec presented above, it would count the following 3156as a list item with a subparagraph, even though the paragraph `bar` 3157is not indented as far as the first paragraph `foo`: 3158 3159``` markdown 3160 10. foo 3161 3162 bar 3163``` 3164 3165Arguably this text does read like a list item with `bar` as a subparagraph, 3166which may count in favor of the proposal. However, on this proposal indented 3167code would have to be indented six spaces after the list marker. And this 3168would break a lot of existing Markdown, which has the pattern: 3169 3170``` markdown 31711. foo 3172 3173 indented code 3174``` 3175 3176where the code is indented eight spaces. The spec above, by contrast, will 3177parse this text as expected, since the code block's indentation is measured 3178from the beginning of `foo`. 3179 3180The one case that needs special treatment is a list item that *starts* 3181with indented code. 
How much indentation is required in that case, since
we don't have a "first paragraph" to measure from? Rule #2 simply stipulates
that in such cases, we require one space indentation from the list marker
(and then the normal four spaces for the indented code). This will match the
four-space rule in cases where the list marker plus its initial indentation
takes four spaces (a common case), but diverge in other cases.

## Lists

A [list](#list) <a id="list"></a> is a sequence of one or more
list items [of the same type](#of-the-same-type). The list items
may be separated by single [blank lines](#blank-line), but two
blank lines end all containing lists.

Two list items are [of the same type](#of-the-same-type)
<a id="of-the-same-type"></a> if they begin with a [list
marker](#list-marker) of the same type. Two list markers are of the
same type if (a) they are bullet list markers using the same character
(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
delimiter (either `.` or `)`).

A list is an [ordered list](#ordered-list) <a id="ordered-list"></a>
if its constituent list items begin with
[ordered list markers](#ordered-list-marker), and a [bullet
list](#bullet-list) <a id="bullet-list"></a> if its constituent list
items begin with [bullet list markers](#bullet-list-marker).

The [start number](#start-number) <a id="start-number"></a>
of an [ordered list](#ordered-list) is determined by the list number of
its initial list item. The numbers of subsequent list items are
disregarded.

A list is [loose](#loose) if any of its constituent list items are
separated by blank lines, or if any of its constituent list items
directly contain two block-level elements with a blank line between
them. Otherwise a list is [tight](#tight).
(The difference in HTML output 3217is that paragraphs in a loose with are wrapped in `<p>` tags, while 3218paragraphs in a tight list are not.) 3219 3220Changing the bullet or ordered list delimiter starts a new list: 3221 3222. 3223- foo 3224- bar 3225+ baz 3226. 3227<ul> 3228<li>foo</li> 3229<li>bar</li> 3230</ul> 3231<ul> 3232<li>baz</li> 3233</ul> 3234. 3235 3236. 32371. foo 32382. bar 32393) baz 3240. 3241<ol> 3242<li>foo</li> 3243<li>bar</li> 3244</ol> 3245<ol start="3"> 3246<li>baz</li> 3247</ol> 3248. 3249 3250There can be blank lines between items, but two blank lines end 3251a list: 3252 3253. 3254- foo 3255 3256- bar 3257 3258 3259- baz 3260. 3261<ul> 3262<li><p>foo</p></li> 3263<li><p>bar</p></li> 3264</ul> 3265<ul> 3266<li>baz</li> 3267</ul> 3268. 3269 3270As illustrated above in the section on [list items](#list-item), 3271two blank lines between blocks *within* a list item will also end a 3272list: 3273 3274. 3275- foo 3276 3277 3278 bar 3279- baz 3280. 3281<ul> 3282<li>foo</li> 3283</ul> 3284<p>bar</p> 3285<ul> 3286<li>baz</li> 3287</ul> 3288. 3289 3290Indeed, two blank lines will end *all* containing lists: 3291 3292. 3293- foo 3294 - bar 3295 - baz 3296 3297 3298 bim 3299. 3300<ul> 3301<li>foo 3302<ul> 3303<li>bar 3304<ul> 3305<li>baz</li> 3306</ul></li> 3307</ul></li> 3308</ul> 3309<pre><code> bim 3310</code></pre> 3311. 3312 3313Thus, two blank lines can be used to separate consecutive lists of 3314the same type, or to separate a list from an indented code block 3315that would otherwise be parsed as a subparagraph of the final list 3316item: 3317 3318. 3319- foo 3320- bar 3321 3322 3323- baz 3324- bim 3325. 3326<ul> 3327<li>foo</li> 3328<li>bar</li> 3329</ul> 3330<ul> 3331<li>baz</li> 3332<li>bim</li> 3333</ul> 3334. 3335 3336. 3337- foo 3338 3339 notcode 3340 3341- foo 3342 3343 3344 code 3345. 3346<ul> 3347<li><p>foo</p> 3348<p>notcode</p></li> 3349<li><p>foo</p></li> 3350</ul> 3351<pre><code>code 3352</code></pre> 3353. 
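The "of the same type" rule and the start-number rule from the beginning of this section can be sketched in a few lines of Python. This is only an illustration, not part of the spec: the names `marker_type` and `group_into_lists` are hypothetical, the input is assumed to be a flat sequence of list markers from consecutive items, and breaks caused by two blank lines are not modeled.

```python
import re


def marker_type(marker):
    """Classify a list marker; two markers are 'of the same type' iff
    this function returns equal values for them."""
    if marker in ("-", "+", "*"):
        # Bullet markers match only if they use the same character.
        return ("bullet", marker)
    m = re.fullmatch(r"(\d+)([.)])", marker)
    if m:
        # Ordered markers match if they share a delimiter; the number is ignored.
        return ("ordered", m.group(2))
    raise ValueError(f"not a list marker: {marker!r}")


def group_into_lists(markers):
    """Group consecutive markers into lists, starting a new list whenever
    the marker type changes (hypothetical helper for illustration)."""
    lists = []
    for marker in markers:
        kind = marker_type(marker)
        if lists and lists[-1]["type"] == kind:
            lists[-1]["markers"].append(marker)
        else:
            entry = {"type": kind, "markers": [marker]}
            if kind[0] == "ordered":
                # The start number comes from the first item only.
                entry["start"] = int(marker[:-1])
            lists.append(entry)
    return lists


print(len(group_into_lists(["-", "-", "+"])))          # 2
print(group_into_lists(["1.", "2.", "3)"])[1]["start"])  # 3
```

The two calls mirror the examples above: a change from `-` to `+` starts a second bullet list, and a change from `.` to `)` starts a second ordered list whose start number is 3.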
3354 3355List items need not be indented to the same level. The following 3356list items will be treated as items at the same list level, 3357since none is indented enough to belong to the previous list 3358item: 3359 3360. 3361- a 3362 - b 3363 - c 3364 - d 3365 - e 3366 - f 3367- g 3368. 3369<ul> 3370<li>a</li> 3371<li>b</li> 3372<li>c</li> 3373<li>d</li> 3374<li>e</li> 3375<li>f</li> 3376<li>g</li> 3377</ul> 3378. 3379 3380This is a loose list, because there is a blank line between 3381two of the list items: 3382 3383. 3384- a 3385- b 3386 3387- c 3388. 3389<ul> 3390<li><p>a</p></li> 3391<li><p>b</p></li> 3392<li><p>c</p></li> 3393</ul> 3394. 3395 3396So is this, with a empty second item: 3397 3398. 3399* a 3400* 3401 3402* c 3403. 3404<ul> 3405<li><p>a</p></li> 3406<li></li> 3407<li><p>c</p></li> 3408</ul> 3409. 3410 3411These are loose lists, even though there is no space between the items, 3412because one of the items directly contains two block-level elements 3413with a blank line between them: 3414 3415. 3416- a 3417- b 3418 3419 c 3420- d 3421. 3422<ul> 3423<li><p>a</p></li> 3424<li><p>b</p> 3425<p>c</p></li> 3426<li><p>d</p></li> 3427</ul> 3428. 3429 3430. 3431- a 3432- b 3433 3434 [ref]: /url 3435- d 3436. 3437<ul> 3438<li><p>a</p></li> 3439<li><p>b</p></li> 3440<li><p>d</p></li> 3441</ul> 3442. 3443 3444This is a tight list, because the blank lines are in a code block: 3445 3446. 3447- a 3448- ``` 3449 b 3450 3451 3452 ``` 3453- c 3454. 3455<ul> 3456<li>a</li> 3457<li><pre><code>b 3458 3459 3460</code></pre></li> 3461<li>c</li> 3462</ul> 3463. 3464 3465This is a tight list, because the blank line is between two 3466paragraphs of a sublist. So the inner list is loose while 3467the other list is tight: 3468 3469. 3470- a 3471 - b 3472 3473 c 3474- d 3475. 3476<ul> 3477<li>a 3478<ul> 3479<li><p>b</p> 3480<p>c</p></li> 3481</ul></li> 3482<li>d</li> 3483</ul> 3484. 
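The loose/tight distinction can be sketched as a small predicate. This is an assumption-laden model rather than a parser: the `Item` class and its two flags are hypothetical, and they presume that an earlier parsing stage has already recorded where blank lines occur. Blank lines inside nested lists, block quotes, or code blocks are deliberately not represented on the outer item.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # True if a blank line separates two block-level elements that are
    # direct children of this item (blank lines nested deeper do not count).
    blank_between_children: bool = False
    # True if a blank line separates this item from the next item.
    blank_before_next: bool = False


def is_loose(items):
    """A list is loose if any two items are separated by a blank line, or
    if any item directly contains two blocks with a blank line between them."""
    separated = any(item.blank_before_next for item in items[:-1])
    spread = any(item.blank_between_children for item in items)
    return separated or spread


# The last example above: the blank line sits between two paragraphs of
# the sublist item "b", so only the inner list is loose.
outer = [Item(), Item()]                      # "- a" (holding the sublist) and "- d"
inner = [Item(blank_between_children=True)]   # "- b", blank line, "c"
print(is_loose(outer), is_loose(inner))       # False True
```

Each list is judged on its own items only, which is why a nested list can be loose while the list containing it stays tight.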
3485 3486This is a tight list, because the blank line is inside the 3487block quote: 3488 3489. 3490* a 3491 > b 3492 > 3493* c 3494. 3495<ul> 3496<li>a 3497<blockquote> 3498<p>b</p> 3499</blockquote></li> 3500<li>c</li> 3501</ul> 3502. 3503 3504This list is tight, because the consecutive block elements 3505are not separated by blank lines: 3506 3507. 3508- a 3509 > b 3510 ``` 3511 c 3512 ``` 3513- d 3514. 3515<ul> 3516<li>a 3517<blockquote> 3518<p>b</p> 3519</blockquote> 3520<pre><code>c 3521</code></pre></li> 3522<li>d</li> 3523</ul> 3524. 3525 3526A single-paragraph list is tight: 3527 3528. 3529- a 3530. 3531<ul> 3532<li>a</li> 3533</ul> 3534. 3535 3536. 3537- a 3538 - b 3539. 3540<ul> 3541<li>a 3542<ul> 3543<li>b</li> 3544</ul></li> 3545</ul> 3546. 3547 3548Here the outer list is loose, the inner list tight: 3549 3550. 3551* foo 3552 * bar 3553 3554 baz 3555. 3556<ul> 3557<li><p>foo</p> 3558<ul> 3559<li>bar</li> 3560</ul> 3561<p>baz</p></li> 3562</ul> 3563. 3564 3565. 3566- a 3567 - b 3568 - c 3569 3570- d 3571 - e 3572 - f 3573. 3574<ul> 3575<li><p>a</p> 3576<ul> 3577<li>b</li> 3578<li>c</li> 3579</ul></li> 3580<li><p>d</p> 3581<ul> 3582<li>e</li> 3583<li>f</li> 3584</ul></li> 3585</ul> 3586. 3587 3588# Inlines 3589 3590Inlines are parsed sequentially from the beginning of the character 3591stream to the end (left to right, in left-to-right languages). 3592Thus, for example, in 3593 3594. 3595`hi`lo` 3596. 3597<p><code>hi</code>lo`</p> 3598. 3599 3600`hi` is parsed as code, leaving the backtick at the end as a literal 3601backtick. 3602 3603## Backslash escapes 3604 3605Any ASCII punctuation character may be backslash-escaped: 3606 3607. 3608\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~ 3609. 3610<p>!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~</p> 3611. 3612 3613Backslashes before other characters are treated as literal 3614backslashes: 3615 3616. 3617\→\A\a\ \3\φ\« 3618. 3619<p>\ \A\a\ \3\φ\«</p> 3620. 
3621 3622Escaped characters are treated as regular characters and do 3623not have their usual Markdown meanings: 3624 3625. 3626\*not emphasized* 3627\<br/> not a tag 3628\[not a link](/foo) 3629\`not code` 36301\. not a list 3631\* not a list 3632\# not a header 3633\[foo]: /url "not a reference" 3634. 3635<p>*not emphasized* 3636<br/> not a tag 3637[not a link](/foo) 3638`not code` 36391. not a list 3640* not a list 3641# not a header 3642[foo]: /url "not a reference"</p> 3643. 3644 3645If a backslash is itself escaped, the following character is not: 3646 3647. 3648\\*emphasis* 3649. 3650<p>\<em>emphasis</em></p> 3651. 3652 3653A backslash at the end of the line is a hard line break: 3654 3655. 3656foo\ 3657bar 3658. 3659<p>foo<br /> 3660bar</p> 3661. 3662 3663Backslash escapes do not work in code blocks, code spans, autolinks, or 3664raw HTML: 3665 3666. 3667`` \[\` `` 3668. 3669<p><code>\[\`</code></p> 3670. 3671 3672. 3673 \[\] 3674. 3675<pre><code>\[\] 3676</code></pre> 3677. 3678 3679. 3680~~~ 3681\[\] 3682~~~ 3683. 3684<pre><code>\[\] 3685</code></pre> 3686. 3687 3688. 3689<http://google.com?find=\*> 3690. 3691<p><a href="http://google.com?find=%5C*">http://google.com?find=\*</a></p> 3692. 3693 3694. 3695<a href="/bar\/)"> 3696. 3697<p><a href="/bar\/)"></p> 3698. 3699 3700But they work in all other contexts, including URLs and link titles, 3701link references, and info strings in [fenced code 3702blocks](#fenced-code-block): 3703 3704. 3705[foo](/bar\* "ti\*tle") 3706. 3707<p><a href="/bar*" title="ti*tle">foo</a></p> 3708. 3709 3710. 3711[foo] 3712 3713[foo]: /bar\* "ti\*tle" 3714. 3715<p><a href="/bar*" title="ti*tle">foo</a></p> 3716. 3717 3718. 3719``` foo\+bar 3720foo 3721``` 3722. 3723<pre><code class="language-foo+bar">foo 3724</code></pre> 3725. 
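The character-level part of these rules can be sketched as a scanner that marks which characters were escaped. This is a minimal sketch under stated assumptions: `scan_escapes` is a hypothetical name, and the function deliberately ignores context, so a real parser would have to skip it inside code blocks, code spans, autolinks, and raw HTML.

```python
import string

# string.punctuation is exactly the 32 ASCII punctuation characters.
ASCII_PUNCT = set(string.punctuation)


def scan_escapes(text):
    """Return a list of (kind, char) tokens, where kind is 'escaped',
    'hardbreak', or 'text' (hypothetical token names)."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == "\\" and nxt in ASCII_PUNCT:
            # Escaped punctuation is an ordinary character with no
            # Markdown meaning; the backslash itself is dropped.
            tokens.append(("escaped", nxt))
            i += 2
        elif ch == "\\" and nxt == "\n":
            # A backslash at the end of a line is a hard line break.
            tokens.append(("hardbreak", "\n"))
            i += 2
        else:
            # Anything else, including a backslash before a
            # non-punctuation character, is literal text.
            tokens.append(("text", ch))
            i += 1
    return tokens


print(scan_escapes("\\\\*"))  # [('escaped', '\\'), ('text', '*')]
```

Keeping the `escaped` flag separate from the character is what lets a later stage treat the `*` in `\\*emphasis*` as a live delimiter while treating the `*` in `\*not emphasized*` as plain text.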

## Entities

With the goal of making this standard as HTML-agnostic as possible, all valid HTML entities in any
context are recognized as such and converted into their actual values (i.e. the UTF8 characters representing
the entity itself) before they are stored in the AST.

This allows implementations that target HTML output to trivially escape the entities when generating HTML,
and simplifies the job of implementations targeting other languages, as these will only need to handle the
UTF8 chars and need not be HTML-entity aware.

[Named entities](#named-entities) <a id="named-entities"></a> consist of `&` + any of the
valid HTML5 entity names + `;`. The [following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json)
is used as an authoritative source of the valid entity names and their corresponding codepoints.

Conforming implementations that target HTML don't need to generate entities for all the valid
named entities that exist, with the exception of `"` (`&quot;`), `&` (`&amp;`), `<` (`&lt;`) and `>` (`&gt;`),
which always need to be written as entities for security reasons.

.
&nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral;
.
<p>  &amp; © Æ Ď ¾ ℋ ⅆ ∲</p>
.

[Decimal entities](#decimal-entities) <a id="decimal-entities"></a>
consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these entities need to be recognised
and transformed into their corresponding UTF8 codepoints. Invalid Unicode codepoints will be written
as the "unknown codepoint" character (`0xFFFD`).

.
&#35; &#1234; &#992; &#98765432;
.
<p># Ӓ Ϡ �</p>
.

[Hexadecimal entities](#hexadecimal-entities) <a id="hexadecimal-entities"></a>
consist of `&#` + either `X` or `x` + a string of 1--8 hexadecimal digits + `;`.
They will also be parsed and turned into their corresponding UTF8 values in the AST.

.
&#X22; &#XD06; &#xcab;
.
<p>&quot; ആ ಫ</p>
.
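The three entity forms described above can be sketched with a single regular expression. This is a rough illustration and not the reference implementation: `decode_entities` is a hypothetical name, Python's `html.entities.html5` table stands in for the `entities.json` document, and "invalid codepoint" is assumed here to mean a surrogate or a value above `0x10FFFF`.

```python
import re
from html.entities import html5

# hex form | decimal form | named form, each requiring the trailing ";"
_ENTITY = re.compile(
    r"&(?:#[Xx]([0-9A-Fa-f]{1,8})|#([0-9]{1,8})|([A-Za-z][A-Za-z0-9]*));"
)


def _codepoint_to_char(cp):
    # Assumption: surrogates and out-of-range values count as invalid
    # and become U+FFFD.
    if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        return "\ufffd"
    return chr(cp)


def decode_entities(text):
    """Replace recognized entities with their characters; leave
    everything else untouched."""
    def repl(match):
        hex_digits, dec_digits, name = match.groups()
        if hex_digits is not None:
            return _codepoint_to_char(int(hex_digits, 16))
        if dec_digits is not None:
            return _codepoint_to_char(int(dec_digits, 10))
        # html5 also lists a few legacy names without ";", so the lookup
        # always includes the semicolon. Unknown names stay literal.
        return html5.get(name + ";", match.group(0))

    return _ENTITY.sub(repl, text)


print(decode_entities("&amp; &copy; &#35; &#xcab;"))  # & © # ಫ
```

Escaping `"`, `&`, `<`, and `>` again when writing HTML is a separate rendering step and is not shown here.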

Here are some nonentities:

.
&nbsp &x; &#; &#x; &ThisIsWayTooLongToBeAnEntityIsntIt; &hi?;
.
<p>&amp;nbsp &amp;x; &amp;#; &amp;#x; &amp;ThisIsWayTooLongToBeAnEntityIsntIt; &amp;hi?;</p>
.

Although HTML5 does accept some entities without a trailing semicolon
(such as `&copy`), these are not recognized as entities here, because
that would make the grammar too ambiguous:

.
&copy
.
<p>&amp;copy</p>
.

Strings that are not on the list of HTML5 named entities are not recognized as entities either:

.
&MadeUpEntity;
.
<p>&amp;MadeUpEntity;</p>
.

Entities are recognized in any context besides code spans or
code blocks, including raw HTML, URLs, [link titles](#link-title), and
[fenced code block](#fenced-code-block) info strings:

.
<a href="&ouml;&ouml;.html">
.
<p><a href="&ouml;&ouml;.html"></p>
.

.
[foo](/f&ouml;&ouml; "f&ouml;&ouml;")
.
<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
.

.
[foo]

[foo]: /f&ouml;&ouml; "f&ouml;&ouml;"
.
<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
.

.
``` f&ouml;&ouml;
foo
```
.
<pre><code class="language-föö">foo
</code></pre>
.

Entities are treated as literal text in code spans and code blocks:

.
`f&ouml;&ouml;`
.
<p><code>f&amp;ouml;&amp;ouml;</code></p>
.

.
    f&ouml;f&ouml;
.
<pre><code>f&amp;ouml;f&amp;ouml;
</code></pre>
.

## Code span

A [backtick string](#backtick-string) <a id="backtick-string"></a>
is a string of one or more backtick characters (`` ` ``) that is neither
preceded nor followed by a backtick.

A code span begins with a backtick string and ends with a backtick
string of equal length. The contents of the code span are the
characters between the two backtick strings, with leading and trailing
spaces and newlines removed, and consecutive spaces and newlines
collapsed to single spaces.
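The definition above translates fairly directly into a scanner. The following Python sketch is illustrative only: `first_code_span` is a hypothetical name, and the function ignores the interaction with HTML tags and autolinks, which a full inline parser would have to handle first.

```python
import re

# A greedy match on runs of backticks yields maximal runs, i.e. backtick
# strings that are neither preceded nor followed by another backtick.
_BACKTICKS = re.compile(r"`+")


def first_code_span(text):
    """Return (start, end, contents) for the first code span in text,
    or None if no backtick string finds a closer of equal length."""
    pos = 0
    while True:
        opener = _BACKTICKS.search(text, pos)
        if opener is None:
            return None
        width = len(opener.group())
        scan = opener.end()
        while True:
            closer = _BACKTICKS.search(text, scan)
            if closer is None:
                break
            if len(closer.group()) == width:
                raw = text[opener.end():closer.start()]
                # Strip leading/trailing spaces and newlines, then collapse
                # interior runs of spaces and newlines to one space.
                contents = re.sub(r"[ \n]+", " ", raw.strip(" \n"))
                return opener.start(), closer.end(), contents
            scan = closer.end()
        # No closer of equal length: these backticks are literal.
        pos = opener.end()


print(first_code_span("`` foo ` bar ``"))  # (0, 15, 'foo ` bar')
```

Because only a backtick string of equal length can close the span, the single backtick inside `` `` foo ` bar `` `` is skipped rather than treated as a delimiter.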
3857 3858This is a simple code span: 3859 3860. 3861`foo` 3862. 3863<p><code>foo</code></p> 3864. 3865 3866Here two backticks are used, because the code contains a backtick. 3867This example also illustrates stripping of leading and trailing spaces: 3868 3869. 3870`` foo ` bar `` 3871. 3872<p><code>foo ` bar</code></p> 3873. 3874 3875This example shows the motivation for stripping leading and trailing 3876spaces: 3877 3878. 3879` `` ` 3880. 3881<p><code>``</code></p> 3882. 3883 3884Newlines are treated like spaces: 3885 3886. 3887`` 3888foo 3889`` 3890. 3891<p><code>foo</code></p> 3892. 3893 3894Interior spaces and newlines are collapsed into single spaces, just 3895as they would be by a browser: 3896 3897. 3898`foo bar 3899 baz` 3900. 3901<p><code>foo bar baz</code></p> 3902. 3903 3904Q: Why not just leave the spaces, since browsers will collapse them 3905anyway? A: Because we might be targeting a non-HTML format, and we 3906shouldn't rely on HTML-specific rendering assumptions. 3907 3908(Existing implementations differ in their treatment of internal 3909spaces and newlines. Some, including `Markdown.pl` and 3910`showdown`, convert an internal newline into a `<br />` tag. 3911But this makes things difficult for those who like to hard-wrap 3912their paragraphs, since a line break in the midst of a code 3913span will cause an unintended line break in the output. Others 3914just leave internal spaces as they are, which is fine if only 3915HTML is being targeted.) 3916 3917. 3918`foo `` bar` 3919. 3920<p><code>foo `` bar</code></p> 3921. 3922 3923Note that backslash escapes do not work in code spans. All backslashes 3924are treated literally: 3925 3926. 3927`foo\`bar` 3928. 3929<p><code>foo\</code>bar`</p> 3930. 3931 3932Backslash escapes are never needed, because one can always choose a 3933string of *n* backtick characters as delimiters, where the code does 3934not contain any strings of exactly *n* backtick characters. 
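Choosing such a delimiter is mechanical. The sketch below is a hedged illustration with a hypothetical function name, `wrap_in_code_span`; it assumes non-empty input and pads with a space only when the code itself starts or ends with a backtick.

```python
import re


def wrap_in_code_span(code):
    """Wrap code in the shortest backtick delimiter that does not occur
    as a backtick string of exactly that length inside the code."""
    run_lengths = {len(m.group()) for m in re.finditer(r"`+", code)}
    n = 1
    while n in run_lengths:
        n += 1
    fence = "`" * n
    # A space keeps a leading or trailing backtick in the code from
    # merging with the delimiter; the parser strips it again.
    pad = " " if code.startswith("`") or code.endswith("`") else ""
    return f"{fence}{pad}{code}{pad}{fence}"


print(wrap_in_code_span("foo ` bar"))  # ``foo ` bar``
print(wrap_in_code_span("``"))         # ` `` `
```

Since code span contents are whitespace-normalized, code with leading, trailing, or repeated spaces will not survive a round trip unchanged; that limitation comes from the rules above, not from the choice of delimiter.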
3935 3936Code span backticks have higher precedence than any other inline 3937constructs except HTML tags and autolinks. Thus, for example, this is 3938not parsed as emphasized text, since the second `*` is part of a code 3939span: 3940 3941. 3942*foo`*` 3943. 3944<p>*foo<code>*</code></p> 3945. 3946 3947And this is not parsed as a link: 3948 3949. 3950[not a `link](/foo`) 3951. 3952<p>[not a <code>link](/foo</code>)</p> 3953. 3954 3955But this is a link: 3956 3957. 3958<http://foo.bar.`baz>` 3959. 3960<p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p> 3961. 3962 3963And this is an HTML tag: 3964 3965. 3966<a href="`">` 3967. 3968<p><a href="`">`</p> 3969. 3970 3971When a backtick string is not closed by a matching backtick string, 3972we just have literal backticks: 3973 3974. 3975```foo`` 3976. 3977<p>```foo``</p> 3978. 3979 3980. 3981`foo 3982. 3983<p>`foo</p> 3984. 3985 3986## Emphasis and strong emphasis 3987 3988John Gruber's original [Markdown syntax 3989description](http://daringfireball.net/projects/markdown/syntax#em) says: 3990 3991> Markdown treats asterisks (`*`) and underscores (`_`) as indicators of 3992> emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML 3993> `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>` 3994> tag. 3995 3996This is enough for most users, but these rules leave much undecided, 3997especially when it comes to nested emphasis. 
The original 3998`Markdown.pl` test suite makes it clear that triple `***` and 3999`___` delimiters can be used for strong emphasis, and most 4000implementations have also allowed the following patterns: 4001 4002``` markdown 4003***strong emph*** 4004***strong** in emph* 4005***emph* in strong** 4006**in strong *emph*** 4007*in emph **strong*** 4008``` 4009 4010The following patterns are less widely supported, but the intent 4011is clear and they are useful (especially in contexts like bibliography 4012entries): 4013 4014``` markdown 4015*emph *with emph* in it* 4016**strong **with strong** in it** 4017``` 4018 4019Many implementations have also restricted intraword emphasis to 4020the `*` forms, to avoid unwanted emphasis in words containing 4021internal underscores. (It is best practice to put these in code 4022spans, but users often do not.) 4023 4024``` markdown 4025internal emphasis: foo*bar*baz 4026no emphasis: foo_bar_baz 4027``` 4028 4029The following rules capture all of these patterns, while allowing 4030for efficient parsing strategies that do not backtrack: 4031 40321. A single `*` character [can open emphasis](#can-open-emphasis) 4033 <a id="can-open-emphasis"></a> iff 4034 4035 (a) it is not part of a sequence of four or more unescaped `*`s, 4036 (b) it is not followed by whitespace, and 4037 (c) either it is not followed by a `*` character or it is 4038 followed immediately by strong emphasis. 4039 40402. A single `_` character [can open emphasis](#can-open-emphasis) iff 4041 4042 (a) it is not part of a sequence of four or more unescaped `_`s, 4043 (b) it is not followed by whitespace, 4044 (c) it is not preceded by an ASCII alphanumeric character, and 4045 (d) either it is not followed by a `_` character or it is 4046 followed immediately by strong emphasis. 4047 40483. 
A single `*` character [can close emphasis](#can-close-emphasis) 4049 <a id="can-close-emphasis"></a> iff 4050 4051 (a) it is not part of a sequence of four or more unescaped `*`s, and 4052 (b) it is not preceded by whitespace. 4053 40544. A single `_` character [can close emphasis](#can-close-emphasis) iff 4055 4056 (a) it is not part of a sequence of four or more unescaped `_`s, 4057 (b) it is not preceded by whitespace, and 4058 (c) it is not followed by an ASCII alphanumeric character. 4059 40605. A double `**` [can open strong emphasis](#can-open-strong-emphasis) 4061 <a id="can-open-strong-emphasis" ></a> iff 4062 4063 (a) it is not part of a sequence of four or more unescaped `*`s, 4064 (b) it is not followed by whitespace, and 4065 (c) either it is not followed by a `*` character or it is 4066 followed immediately by emphasis. 4067 40686. A double `__` [can open strong emphasis](#can-open-strong-emphasis) 4069 iff 4070 4071 (a) it is not part of a sequence of four or more unescaped `_`s, 4072 (b) it is not followed by whitespace, and 4073 (c) it is not preceded by an ASCII alphanumeric character, and 4074 (d) either it is not followed by a `_` character or it is 4075 followed immediately by emphasis. 4076 40777. A double `**` [can close strong emphasis](#can-close-strong-emphasis) 4078 <a id="can-close-strong-emphasis" ></a> iff 4079 4080 (a) it is not part of a sequence of four or more unescaped `*`s, and 4081 (b) it is not preceded by whitespace. 4082 40838. A double `__` [can close strong emphasis](#can-close-strong-emphasis) 4084 iff 4085 4086 (a) it is not part of a sequence of four or more unescaped `_`s, 4087 (b) it is not preceded by whitespace, and 4088 (c) it is not followed by an ASCII alphanumeric character. 4089 40909. 
Emphasis begins with a delimiter that [can open 4091 emphasis](#can-open-emphasis) and includes inlines parsed 4092 sequentially until a delimiter that [can close 4093 emphasis](#can-close-emphasis), and that uses the same 4094 character (`_` or `*`) as the opening delimiter, is reached. 4095 409610. Strong emphasis begins with a delimiter that [can open strong 4097 emphasis](#can-open-strong-emphasis) and includes inlines parsed 4098 sequentially until a delimiter that [can close strong 4099 emphasis](#can-close-strong-emphasis), and that uses the 4100 same character (`_` or `*`) as the opening delimiter, is reached. 4101 4102These rules can be illustrated through a series of examples. 4103 4104Simple emphasis: 4105 4106. 4107*foo bar* 4108. 4109<p><em>foo bar</em></p> 4110. 4111 4112. 4113_foo bar_ 4114. 4115<p><em>foo bar</em></p> 4116. 4117 4118Simple strong emphasis: 4119 4120. 4121**foo bar** 4122. 4123<p><strong>foo bar</strong></p> 4124. 4125 4126. 4127__foo bar__ 4128. 4129<p><strong>foo bar</strong></p> 4130. 4131 4132Emphasis can continue over line breaks: 4133 4134. 4135*foo 4136bar* 4137. 4138<p><em>foo 4139bar</em></p> 4140. 4141 4142. 4143_foo 4144bar_ 4145. 4146<p><em>foo 4147bar</em></p> 4148. 4149 4150. 4151**foo 4152bar** 4153. 4154<p><strong>foo 4155bar</strong></p> 4156. 4157 4158. 4159__foo 4160bar__ 4161. 4162<p><strong>foo 4163bar</strong></p> 4164. 4165 4166Emphasis can contain other inline constructs: 4167 4168. 4169*foo [bar](/url)* 4170. 4171<p><em>foo <a href="/url">bar</a></em></p> 4172. 4173 4174. 4175_foo [bar](/url)_ 4176. 4177<p><em>foo <a href="/url">bar</a></em></p> 4178. 4179 4180. 4181**foo [bar](/url)** 4182. 4183<p><strong>foo <a href="/url">bar</a></strong></p> 4184. 4185 4186. 4187__foo [bar](/url)__ 4188. 4189<p><strong>foo <a href="/url">bar</a></strong></p> 4190. 4191 4192Symbols contained in other inline constructs will not 4193close emphasis: 4194 4195. 4196*foo [bar*](/url) 4197. 
4198<p>*foo <a href="/url">bar*</a></p> 4199. 4200 4201. 4202_foo [bar_](/url) 4203. 4204<p>_foo <a href="/url">bar_</a></p> 4205. 4206 4207. 4208**<a href="**"> 4209. 4210<p>**<a href="**"></p> 4211. 4212 4213. 4214__<a href="__"> 4215. 4216<p>__<a href="__"></p> 4217. 4218 4219. 4220*a `*`* 4221. 4222<p><em>a <code>*</code></em></p> 4223. 4224 4225. 4226_a `_`_ 4227. 4228<p><em>a <code>_</code></em></p> 4229. 4230 4231. 4232**a<http://foo.bar?q=**> 4233. 4234<p>**a<a href="http://foo.bar?q=**">http://foo.bar?q=**</a></p> 4235. 4236 4237. 4238__a<http://foo.bar?q=__> 4239. 4240<p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p> 4241. 4242 4243This is not emphasis, because the opening delimiter is 4244followed by white space: 4245 4246. 4247and * foo bar* 4248. 4249<p>and * foo bar*</p> 4250. 4251 4252. 4253_ foo bar_ 4254. 4255<p>_ foo bar_</p> 4256. 4257 4258. 4259and ** foo bar** 4260. 4261<p>and ** foo bar**</p> 4262. 4263 4264. 4265__ foo bar__ 4266. 4267<p>__ foo bar__</p> 4268. 4269 4270This is not emphasis, because the closing delimiter is 4271preceded by white space: 4272 4273. 4274and *foo bar * 4275. 4276<p>and *foo bar *</p> 4277. 4278 4279. 4280and _foo bar _ 4281. 4282<p>and _foo bar _</p> 4283. 4284 4285. 4286and **foo bar ** 4287. 4288<p>and **foo bar **</p> 4289. 4290 4291. 4292and __foo bar __ 4293. 4294<p>and __foo bar __</p> 4295. 4296 4297The rules imply that a sequence of four or more unescaped `*` or 4298`_` characters will always be parsed as a literal string: 4299 4300. 4301****hi**** 4302. 4303<p>****hi****</p> 4304. 4305 4306. 4307_____hi_____ 4308. 4309<p>_____hi_____</p> 4310. 4311 4312. 4313Sign here: _________ 4314. 4315<p>Sign here: _________</p> 4316. 4317 4318The rules also imply that there can be no empty emphasis or strong 4319emphasis: 4320 4321. 4322** is not an empty emphasis 4323. 4324<p>** is not an empty emphasis</p> 4325. 4326 4327. 4328**** is not an empty strong emphasis 4329. 
4330<p>**** is not an empty strong emphasis</p> 4331. 4332 4333To include `*` or `_` in emphasized sections, use backslash escapes 4334or code spans: 4335 4336. 4337*here is a \** 4338. 4339<p><em>here is a *</em></p> 4340. 4341 4342. 4343__this is a double underscore (`__`)__ 4344. 4345<p><strong>this is a double underscore (<code>__</code>)</strong></p> 4346. 4347 4348`*` delimiters allow intra-word emphasis; `_` delimiters do not: 4349 4350. 4351foo*bar*baz 4352. 4353<p>foo<em>bar</em>baz</p> 4354. 4355 4356. 4357foo_bar_baz 4358. 4359<p>foo_bar_baz</p> 4360. 4361 4362. 4363foo__bar__baz 4364. 4365<p>foo__bar__baz</p> 4366. 4367 4368. 4369_foo_bar_baz_ 4370. 4371<p><em>foo_bar_baz</em></p> 4372. 4373 4374. 437511*15*32 4376. 4377<p>11<em>15</em>32</p> 4378. 4379 4380. 438111_15_32 4382. 4383<p>11_15_32</p> 4384. 4385 4386Internal underscores will be ignored in underscore-delimited 4387emphasis: 4388 4389. 4390_foo_bar_baz_ 4391. 4392<p><em>foo_bar_baz</em></p> 4393. 4394 4395. 4396__foo__bar__baz__ 4397. 4398<p><strong>foo__bar__baz</strong></p> 4399. 4400 4401The rules are sufficient for the following nesting patterns: 4402 4403. 4404***foo bar*** 4405. 4406<p><strong><em>foo bar</em></strong></p> 4407. 4408 4409. 4410___foo bar___ 4411. 4412<p><strong><em>foo bar</em></strong></p> 4413. 4414 4415. 4416***foo** bar* 4417. 4418<p><em><strong>foo</strong> bar</em></p> 4419. 4420 4421. 4422___foo__ bar_ 4423. 4424<p><em><strong>foo</strong> bar</em></p> 4425. 4426 4427. 4428***foo* bar** 4429. 4430<p><strong><em>foo</em> bar</strong></p> 4431. 4432 4433. 4434___foo_ bar__ 4435. 4436<p><strong><em>foo</em> bar</strong></p> 4437. 4438 4439. 4440*foo **bar*** 4441. 4442<p><em>foo <strong>bar</strong></em></p> 4443. 4444 4445. 4446_foo __bar___ 4447. 4448<p><em>foo <strong>bar</strong></em></p> 4449. 4450 4451. 4452**foo *bar*** 4453. 4454<p><strong>foo <em>bar</em></strong></p> 4455. 4456 4457. 4458__foo _bar___ 4459. 
4460<p><strong>foo <em>bar</em></strong></p> 4461. 4462 4463. 4464*foo **bar*** 4465. 4466<p><em>foo <strong>bar</strong></em></p> 4467. 4468 4469. 4470_foo __bar___ 4471. 4472<p><em>foo <strong>bar</strong></em></p> 4473. 4474 4475. 4476*foo *bar* baz* 4477. 4478<p><em>foo <em>bar</em> baz</em></p> 4479. 4480 4481. 4482_foo _bar_ baz_ 4483. 4484<p><em>foo <em>bar</em> baz</em></p> 4485. 4486 4487. 4488**foo **bar** baz** 4489. 4490<p><strong>foo <strong>bar</strong> baz</strong></p> 4491. 4492 4493. 4494__foo __bar__ baz__ 4495. 4496<p><strong>foo <strong>bar</strong> baz</strong></p> 4497. 4498 4499. 4500*foo **bar** baz* 4501. 4502<p><em>foo <strong>bar</strong> baz</em></p> 4503. 4504 4505. 4506_foo __bar__ baz_ 4507. 4508<p><em>foo <strong>bar</strong> baz</em></p> 4509. 4510 4511. 4512**foo *bar* baz** 4513. 4514<p><strong>foo <em>bar</em> baz</strong></p> 4515. 4516 4517. 4518__foo _bar_ baz__ 4519. 4520<p><strong>foo <em>bar</em> baz</strong></p> 4521. 4522 4523Note that you cannot nest emphasis directly inside emphasis 4524using the same delimeter, or strong emphasis directly inside 4525strong emphasis: 4526 4527. 4528**foo** 4529. 4530<p><strong>foo</strong></p> 4531. 4532 4533. 4534****foo**** 4535. 4536<p>****foo****</p> 4537. 4538 4539For these nestings, you need to switch delimiters: 4540 4541. 4542*_foo_* 4543. 4544<p><em><em>foo</em></em></p> 4545. 4546 4547. 4548**__foo__** 4549. 4550<p><strong><strong>foo</strong></strong></p> 4551. 4552 4553Note that a `*` followed by a `*` can close emphasis, and 4554a `**` followed by a `*` can close strong emphasis (and 4555similarly for `_` and `__`): 4556 4557. 4558*foo** 4559. 4560<p><em>foo</em>*</p> 4561. 4562 4563. 4564*foo *bar** 4565. 4566<p><em>foo <em>bar</em></em></p> 4567. 4568 4569. 4570**foo*** 4571. 4572<p><strong>foo</strong>*</p> 4573. 4574 4575. 4576***foo* bar*** 4577. 4578<p><strong><em>foo</em> bar</strong>*</p> 4579. 4580 4581. 4582***foo** bar*** 4583. 
4584<p><em><strong>foo</strong> bar</em>**</p> 4585. 4586 4587The following contains no strong emphasis, because the opening 4588delimiter is closed by the first `*` before `bar`: 4589 4590. 4591*foo**bar*** 4592. 4593<p><em>foo</em><em>bar</em>**</p> 4594. 4595 4596However, a string of four or more `****` can never close emphasis: 4597 4598. 4599*foo**** 4600. 4601<p>*foo****</p> 4602. 4603 4604Note that there are some asymmetries here: 4605 4606. 4607*foo** 4608 4609**foo* 4610. 4611<p><em>foo</em>*</p> 4612<p>**foo*</p> 4613. 4614 4615. 4616*foo *bar** 4617 4618**foo* bar* 4619. 4620<p><em>foo <em>bar</em></em></p> 4621<p>**foo* bar*</p> 4622. 4623 4624More cases with mismatched delimiters: 4625 4626. 4627**foo* bar* 4628. 4629<p>**foo* bar*</p> 4630. 4631 4632. 4633*bar*** 4634. 4635<p><em>bar</em>**</p> 4636. 4637 4638. 4639***foo* 4640. 4641<p>***foo*</p> 4642. 4643 4644. 4645**bar*** 4646. 4647<p><strong>bar</strong>*</p> 4648. 4649 4650. 4651***foo** 4652. 4653<p>***foo**</p> 4654. 4655 4656. 4657***foo *bar* 4658. 4659<p>***foo <em>bar</em></p> 4660. 4661 4662## Links 4663 4664A link contains a [link label](#link-label) (the visible text), 4665a [destination](#destination) (the URI that is the link destination), 4666and optionally a [link title](#link-title). There are two basic kinds 4667of links in Markdown. In [inline links](#inline-links) the destination 4668and title are given immediately after the label. In [reference 4669links](#reference-links) the destination and title are defined elsewhere 4670in the document. 4671 4672A [link label](#link-label) <a id="link-label"></a> consists of 4673 4674- an opening `[`, followed by 4675- zero or more backtick code spans, autolinks, HTML tags, link labels, 4676 backslash-escaped ASCII punctuation characters, or non-`]` characters, 4677 followed by 4678- a closing `]`. 4679 4680These rules are motivated by the following intuitive ideas: 4681 4682- A link label is a container for inline elements. 
- The square brackets bind more tightly than emphasis markers,
  but less tightly than `<>` or `` ` ``.
- Link labels may contain material in matching square brackets.

A [link destination](#link-destination) <a id="link-destination"></a>
consists of either

- a sequence of zero or more characters between an opening `<` and a
  closing `>` that contains no line breaks or unescaped `<` or `>`
  characters, or

- a nonempty sequence of characters that does not include
  ASCII space or control characters, and includes parentheses
  only if (a) they are backslash-escaped or (b) they are part of
  a balanced pair of unescaped parentheses that is not itself
  inside a balanced pair of unescaped parentheses.

A [link title](#link-title) <a id="link-title"></a> consists of either

- a sequence of zero or more characters between straight double-quote
  characters (`"`), including a `"` character only if it is
  backslash-escaped, or

- a sequence of zero or more characters between straight single-quote
  characters (`'`), including a `'` character only if it is
  backslash-escaped, or

- a sequence of zero or more characters between matching parentheses
  (`(...)`), including a `)` character only if it is backslash-escaped.

An [inline link](#inline-link) <a id="inline-link"></a>
consists of a [link label](#link-label) followed immediately
by a left parenthesis `(`, optional whitespace,
an optional [link destination](#link-destination),
an optional [link title](#link-title) separated from the link
destination by whitespace, optional whitespace, and a right
parenthesis `)`. The link's text consists of the label (excluding
the enclosing square brackets) parsed as inlines. The link's
URI consists of the link destination, excluding enclosing `<...>` if
present, with backslash-escapes in effect as described above.
The 4723link's title consists of the link title, excluding its enclosing 4724delimiters, with backslash-escapes in effect as described above. 4725 4726Here is a simple inline link: 4727 4728. 4729[link](/uri "title") 4730. 4731<p><a href="/uri" title="title">link</a></p> 4732. 4733 4734The title may be omitted: 4735 4736. 4737[link](/uri) 4738. 4739<p><a href="/uri">link</a></p> 4740. 4741 4742Both the title and the destination may be omitted: 4743 4744. 4745[link]() 4746. 4747<p><a href="">link</a></p> 4748. 4749 4750. 4751[link](<>) 4752. 4753<p><a href="">link</a></p> 4754. 4755 4756 4757If the destination contains spaces, it must be enclosed in pointy 4758brackets: 4759 4760. 4761[link](/my uri) 4762. 4763<p>[link](/my uri)</p> 4764. 4765 4766. 4767[link](</my uri>) 4768. 4769<p><a href="/my%20uri">link</a></p> 4770. 4771 4772The destination cannot contain line breaks, even with pointy brackets: 4773 4774. 4775[link](foo 4776bar) 4777. 4778<p>[link](foo 4779bar)</p> 4780. 4781 4782One level of balanced parentheses is allowed without escaping: 4783 4784. 4785[link]((foo)and(bar)) 4786. 4787<p><a href="(foo)and(bar)">link</a></p> 4788. 4789 4790However, if you have parentheses within parentheses, you need to escape 4791or use the `<...>` form: 4792 4793. 4794[link](foo(and(bar))) 4795. 4796<p>[link](foo(and(bar)))</p> 4797. 4798 4799. 4800[link](foo(and\(bar\))) 4801. 4802<p><a href="foo(and(bar))">link</a></p> 4803. 4804 4805. 4806[link](<foo(and(bar))>) 4807. 4808<p><a href="foo(and(bar))">link</a></p> 4809. 4810 4811Parentheses and other symbols can also be escaped, as usual 4812in Markdown: 4813 4814. 4815[link](foo\)\:) 4816. 4817<p><a href="foo):">link</a></p> 4818. 4819 4820URL-escaping should be left alone inside the destination, as all URL-escaped characters 4821are also valid URL characters. HTML entities in the destination will be parsed into their UTF-8 4822code points, as usual, and optionally URL-escaped when written as HTML. 4823 4824. 
4825[link](foo%20bä) 4826. 4827<p><a href="foo%20b%C3%A4">link</a></p> 4828. 4829 4830Note that, because titles can often be parsed as destinations, 4831if you try to omit the destination and keep the title, you'll 4832get unexpected results: 4833 4834. 4835[link]("title") 4836. 4837<p><a href="%22title%22">link</a></p> 4838. 4839 4840Titles may be in single quotes, double quotes, or parentheses: 4841 4842. 4843[link](/url "title") 4844[link](/url 'title') 4845[link](/url (title)) 4846. 4847<p><a href="/url" title="title">link</a> 4848<a href="/url" title="title">link</a> 4849<a href="/url" title="title">link</a></p> 4850. 4851 4852Backslash escapes and entities may be used in titles: 4853 4854. 4855[link](/url "title \""") 4856. 4857<p><a href="/url" title="title """>link</a></p> 4858. 4859 4860Nested balanced quotes are not allowed without escaping: 4861 4862. 4863[link](/url "title "and" title") 4864. 4865<p>[link](/url "title "and" title")</p> 4866. 4867 4868But it is easy to work around this by using a different quote type: 4869 4870. 4871[link](/url 'title "and" title') 4872. 4873<p><a href="/url" title="title "and" title">link</a></p> 4874. 4875 4876(Note: `Markdown.pl` did allow double quotes inside a double-quoted 4877title, and its test suite included a test demonstrating this. 4878But it is hard to see a good rationale for the extra complexity this 4879brings, since there are already many ways---backslash escaping, 4880entities, or using a different quote type for the enclosing title---to 4881write titles containing double quotes. `Markdown.pl`'s handling of 4882titles has a number of other strange features. For example, it allows 4883single-quoted titles in inline links, but not reference links. And, in 4884reference links but not inline links, it allows a title to begin with 4885`"` and end with `)`. `Markdown.pl` 1.0.1 even allows titles with no closing 4886quotation mark, though 1.0.2b8 does not. 
It seems preferable to adopt 4887a simple, rational rule that works the same way in inline links and 4888link reference definitions.) 4889 4890Whitespace is allowed around the destination and title: 4891 4892. 4893[link]( /uri 4894 "title" ) 4895. 4896<p><a href="/uri" title="title">link</a></p> 4897. 4898 4899But it is not allowed between the link label and the 4900following parenthesis: 4901 4902. 4903[link] (/uri) 4904. 4905<p>[link] (/uri)</p> 4906. 4907 4908Note that this is not a link, because the closing `]` occurs in 4909an HTML tag: 4910 4911. 4912[foo <bar attr="](baz)"> 4913. 4914<p>[foo <bar attr="](baz)"></p> 4915. 4916 4917 4918There are three kinds of [reference links](#reference-link): 4919<a id="reference-link"></a> 4920 4921A [full reference link](#full-reference-link) <a id="full-reference-link"></a> 4922consists of a [link label](#link-label), optional whitespace, and 4923another [link label](#link-label) that [matches](#matches) a 4924[link reference definition](#link-reference-definition) elsewhere in the 4925document. 4926 4927One label [matches](#matches) <a id="matches"></a> 4928another just in case their normalized forms are equal. To normalize a 4929label, perform the *unicode case fold* and collapse consecutive internal 4930whitespace to a single space. If there are multiple matching reference 4931link definitions, the one that comes first in the document is used. (It 4932is desirable in such cases to emit a warning.) 4933 4934The contents of the first link label are parsed as inlines, which are 4935used as the link's text. The link's URI and title are provided by the 4936matching [link reference definition](#link-reference-definition). 4937 4938Here is a simple example: 4939 4940. 4941[foo][bar] 4942 4943[bar]: /url "title" 4944. 4945<p><a href="/url" title="title">foo</a></p> 4946. 4947 4948The first label can contain inline content: 4949 4950. 4951[*foo\!*][bar] 4952 4953[bar]: /url "title" 4954. 
4955<p><a href="/url" title="title"><em>foo!</em></a></p> 4956. 4957 4958Matching is case-insensitive: 4959 4960. 4961[foo][BaR] 4962 4963[bar]: /url "title" 4964. 4965<p><a href="/url" title="title">foo</a></p> 4966. 4967 4968Unicode case fold is used: 4969 4970. 4971[Толпой][Толпой] is a Russian word. 4972 4973[ТОЛПОЙ]: /url 4974. 4975<p><a href="/url">Толпой</a> is a Russian word.</p> 4976. 4977 4978Consecutive internal whitespace is treated as one space for 4979purposes of determining matching: 4980 4981. 4982[Foo 4983 bar]: /url 4984 4985[Baz][Foo bar] 4986. 4987<p><a href="/url">Baz</a></p> 4988. 4989 4990There can be whitespace between the two labels: 4991 4992. 4993[foo] [bar] 4994 4995[bar]: /url "title" 4996. 4997<p><a href="/url" title="title">foo</a></p> 4998. 4999 5000. 5001[foo] 5002[bar] 5003 5004[bar]: /url "title" 5005. 5006<p><a href="/url" title="title">foo</a></p> 5007. 5008 5009When there are multiple matching [link reference 5010definitions](#link-reference-definition), the first is used: 5011 5012. 5013[foo]: /url1 5014 5015[foo]: /url2 5016 5017[bar][foo] 5018. 5019<p><a href="/url1">bar</a></p> 5020. 5021 5022Note that matching is performed on normalized strings, not parsed 5023inline content. So the following does not match, even though the 5024labels define equivalent inline content: 5025 5026. 5027[bar][foo\!] 5028 5029[foo!]: /url 5030. 5031<p>[bar][foo!]</p> 5032. 5033 5034A [collapsed reference link](#collapsed-reference-link) 5035<a id="collapsed-reference-link"></a> consists of a [link 5036label](#link-label) that [matches](#matches) a [link reference 5037definition](#link-reference-definition) elsewhere in the 5038document, optional whitespace, and the string `[]`. The contents of the 5039first link label are parsed as inlines, which are used as the link's 5040text. The link's URI and title are provided by the matching reference 5041link definition. Thus, `[foo][]` is equivalent to `[foo][foo]`. 5042 5043. 
5044[foo][] 5045 5046[foo]: /url "title" 5047. 5048<p><a href="/url" title="title">foo</a></p> 5049. 5050 5051. 5052[*foo* bar][] 5053 5054[*foo* bar]: /url "title" 5055. 5056<p><a href="/url" title="title"><em>foo</em> bar</a></p> 5057. 5058 5059The link labels are case-insensitive: 5060 5061. 5062[Foo][] 5063 5064[foo]: /url "title" 5065. 5066<p><a href="/url" title="title">Foo</a></p> 5067. 5068 5069 5070As with full reference links, whitespace is allowed 5071between the two sets of brackets: 5072 5073. 5074[foo] 5075[] 5076 5077[foo]: /url "title" 5078. 5079<p><a href="/url" title="title">foo</a></p> 5080. 5081 5082A [shortcut reference link](#shortcut-reference-link) 5083<a id="shortcut-reference-link"></a> consists of a [link 5084label](#link-label) that [matches](#matches) a [link reference 5085definition](#link-reference-definition) elsewhere in the 5086document and is not followed by `[]` or a link label. 5087The contents of the link label are parsed as inlines, 5088which are used as the link's text. The link's URI and title 5089are provided by the matching link reference definition. 5090Thus, `[foo]` is equivalent to `[foo][]`. 5091 5092. 5093[foo] 5094 5095[foo]: /url "title" 5096. 5097<p><a href="/url" title="title">foo</a></p> 5098. 5099 5100. 5101[*foo* bar] 5102 5103[*foo* bar]: /url "title" 5104. 5105<p><a href="/url" title="title"><em>foo</em> bar</a></p> 5106. 5107 5108. 5109[[*foo* bar]] 5110 5111[*foo* bar]: /url "title" 5112. 5113<p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p> 5114. 5115 5116The link labels are case-insensitive: 5117 5118. 5119[Foo] 5120 5121[foo]: /url "title" 5122. 5123<p><a href="/url" title="title">Foo</a></p> 5124. 5125 5126If you just want bracketed text, you can backslash-escape the 5127opening bracket to avoid links: 5128 5129. 5130\[foo] 5131 5132[foo]: /url "title" 5133. 5134<p>[foo]</p> 5135. 5136 5137Note that this is a link, because link labels bind more tightly 5138than emphasis: 5139 5140. 
5141[foo*]: /url 5142 5143*[foo*] 5144. 5145<p>*<a href="/url">foo*</a></p> 5146. 5147 5148However, this is not, because link labels bind less 5149tightly than code backticks: 5150 5151. 5152[foo`]: /url 5153 5154[foo`]` 5155. 5156<p>[foo<code>]</code></p> 5157. 5158 5159Link labels can contain matched square brackets: 5160 5161. 5162[[[foo]]] 5163 5164[[[foo]]]: /url 5165. 5166<p><a href="/url">[[foo]]</a></p> 5167. 5168 5169. 5170[[[foo]]] 5171 5172[[[foo]]]: /url1 5173[foo]: /url2 5174. 5175<p><a href="/url1">[[foo]]</a></p> 5176. 5177 5178For non-matching brackets, use backslash escapes: 5179 5180. 5181[\[foo] 5182 5183[\[foo]: /url 5184. 5185<p><a href="/url">[foo</a></p> 5186. 5187 5188Full references take precedence over shortcut references: 5189 5190. 5191[foo][bar] 5192 5193[foo]: /url1 5194[bar]: /url2 5195. 5196<p><a href="/url2">foo</a></p> 5197. 5198 5199In the following case `[bar][baz]` is parsed as a reference, 5200`[foo]` as normal text: 5201 5202. 5203[foo][bar][baz] 5204 5205[baz]: /url 5206. 5207<p>[foo]<a href="/url">bar</a></p> 5208. 5209 5210Here, though, `[foo][bar]` is parsed as a reference, since 5211`[bar]` is defined: 5212 5213. 5214[foo][bar][baz] 5215 5216[baz]: /url1 5217[bar]: /url2 5218. 5219<p><a href="/url2">foo</a><a href="/url1">baz</a></p> 5220. 5221 5222Here `[foo]` is not parsed as a shortcut reference, because it 5223is followed by a link label (even though `[bar]` is not defined): 5224 5225. 5226[foo][bar][baz] 5227 5228[baz]: /url1 5229[foo]: /url2 5230. 5231<p>[foo]<a href="/url1">bar</a></p> 5232. 5233 5234 5235## Images 5236 5237An (unescaped) exclamation mark (`!`) followed by a reference or 5238inline link will be parsed as an image. The link label will be 5239used as the image's alt text, and the link title, if any, will 5240be used as the image's title. 5241 5242. 5243![foo](/url "title") 5244. 5245<p><img src="/url" alt="foo" title="title" /></p> 5246. 5247 5248. 
5249![foo *bar*] 5250 5251[foo *bar*]: train.jpg "train & tracks" 5252. 5253<p><img src="train.jpg" alt="foo <em>bar</em>" title="train & tracks" /></p> 5254. 5255 5256. 5257![foo *bar*][] 5258 5259[foo *bar*]: train.jpg "train & tracks" 5260. 5261<p><img src="train.jpg" alt="foo <em>bar</em>" title="train & tracks" /></p> 5262. 5263 5264. 5265![foo *bar*][foobar] 5266 5267[FOOBAR]: train.jpg "train & tracks" 5268. 5269<p><img src="train.jpg" alt="foo <em>bar</em>" title="train & tracks" /></p> 5270. 5271 5272. 5273![foo](train.jpg) 5274. 5275<p><img src="train.jpg" alt="foo" /></p> 5276. 5277 5278. 5279My ![foo bar](/path/to/train.jpg "title" ) 5280. 5281<p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p> 5282. 5283 5284. 5285![foo](<url>) 5286. 5287<p><img src="url" alt="foo" /></p> 5288. 5289 5290. 5291![](/url) 5292. 5293<p><img src="/url" alt="" /></p> 5294. 5295 5296Reference-style: 5297 5298. 5299![foo] [bar] 5300 5301[bar]: /url 5302. 5303<p><img src="/url" alt="foo" /></p> 5304. 5305 5306. 5307![foo] [bar] 5308 5309[BAR]: /url 5310. 5311<p><img src="/url" alt="foo" /></p> 5312. 5313 5314Collapsed: 5315 5316. 5317![foo][] 5318 5319[foo]: /url "title" 5320. 5321<p><img src="/url" alt="foo" title="title" /></p> 5322. 5323 5324. 5325![*foo* bar][] 5326 5327[*foo* bar]: /url "title" 5328. 5329<p><img src="/url" alt="<em>foo</em> bar" title="title" /></p> 5330. 5331 5332The labels are case-insensitive: 5333 5334. 5335![Foo][] 5336 5337[foo]: /url "title" 5338. 5339<p><img src="/url" alt="Foo" title="title" /></p> 5340. 5341 5342As with full reference links, whitespace is allowed 5343between the two sets of brackets: 5344 5345. 5346![foo] 5347[] 5348 5349[foo]: /url "title" 5350. 5351<p><img src="/url" alt="foo" title="title" /></p> 5352. 5353 5354Shortcut: 5355 5356. 5357![foo] 5358 5359[foo]: /url "title" 5360. 5361<p><img src="/url" alt="foo" title="title" /></p> 5362. 5363 5364. 5365![*foo* bar] 5366 5367[*foo* bar]: /url "title" 5368. 
5369<p><img src="/url" alt="<em>foo</em> bar" title="title" /></p> 5370. 5371 5372. 5373![[foo]] 5374 5375[[foo]]: /url "title" 5376. 5377<p><img src="/url" alt="[foo]" title="title" /></p> 5378. 5379 5380The link labels are case-insensitive: 5381 5382. 5383![Foo] 5384 5385[foo]: /url "title" 5386. 5387<p><img src="/url" alt="Foo" title="title" /></p> 5388. 5389 5390If you just want bracketed text, you can backslash-escape the 5391opening `!` and `[`: 5392 5393. 5394\!\[foo] 5395 5396[foo]: /url "title" 5397. 5398<p>![foo]</p> 5399. 5400 5401If you want a link after a literal `!`, backslash-escape the 5402`!`: 5403 5404. 5405\![foo] 5406 5407[foo]: /url "title" 5408. 5409<p>!<a href="/url" title="title">foo</a></p> 5410. 5411 5412## Autolinks 5413 5414Autolinks are absolute URIs and email addresses inside `<` and `>`. 5415They are parsed as links, with the URL or email address as the link 5416label. 5417 5418A [URI autolink](#uri-autolink) <a id="uri-autolink"></a> 5419consists of `<`, followed by an [absolute 5420URI](#absolute-uri) not containing `<`, followed by `>`. It is parsed 5421as a link to the URI, with the URI as the link's label. 5422 5423An [absolute URI](#absolute-uri), <a id="absolute-uri"></a> 5424for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`) 5425followed by zero or more characters other than ASCII whitespace and 5426control characters, `<`, and `>`. If the URI includes these characters, 5427you must use percent-encoding (e.g. `%20` for a space). 
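
As a rough illustration, the definition of an absolute URI above can be sketched as a regular expression in Python. This is a minimal sketch under stated assumptions, not a normative part of the spec: the names `SCHEMES`, `ABSOLUTE_URI`, and `match_uri_autolink` are hypothetical, and `SCHEMES` holds only a small subset of the schemes recognized in this section.

```python
import re

# Hypothetical subset of the recognized schemes, for illustration only.
SCHEMES = ("http", "https", "ftp", "irc", "mailto")

# A scheme, a colon, then zero or more characters other than ASCII
# whitespace, control characters, `<`, and `>`.
ABSOLUTE_URI = re.compile(
    r"(?:%s):[^\x00-\x20\x7f<>]*" % "|".join(re.escape(s) for s in SCHEMES),
    re.IGNORECASE,  # schemes are matched case-insensitively
)


def match_uri_autolink(text):
    """Return the URI if `text` is exactly `<` + absolute URI + `>`, else None."""
    if len(text) < 2 or text[0] != "<" or text[-1] != ">":
        return None
    inner = text[1:-1]
    return inner if ABSOLUTE_URI.fullmatch(inner) else None
```

Under these assumptions, `match_uri_autolink("<http://foo.bar.baz>")` returns the URI, while input containing a space or an unlisted scheme returns `None`.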
5428 5429The following [schemes](#scheme) <a id="scheme"></a> 5430are recognized (case-insensitive): 5431`coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`, 5432`cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`, 5433`gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`, 5434`ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`, 5435`mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`, 5436`ni`, `nih`, `nntp`, `opaquelocktoken`, `pop`, `pres`, `rtsp`, 5437`service`, `session`, `shttp`, `sieve`, `sip`, `sips`, `sms`, `snmp`, 5438`soap.beep`, `soap.beeps`, `tag`, `tel`, `telnet`, `tftp`, `thismessage`, 5439`tn3270`, `tip`, `tv`, `urn`, `vemmi`, `ws`, `wss`, `xcon`, 5440`xcon-userid`, `xmlrpc.beep`, `xmlrpc.beeps`, `xmpp`, `z39.50r`, 5441`z39.50s`, `adiumxtra`, `afp`, `afs`, `aim`, `apt`, `attachment`, `aw`, 5442`beshare`, `bitcoin`, `bolo`, `callto`, `chrome`, `chrome-extension`, 5443`com-eventbrite-attendee`, `content`, `cvs`, `dlna-playsingle`, 5444`dlna-playcontainer`, `dtn`, `dvb`, `ed2k`, `facetime`, `feed`, 5445`finger`, `fish`, `gg`, `git`, `gizmoproject`, `gtalk`, `hcp`, `icon`, 5446`ipn`, `irc`, `irc6`, `ircs`, `itms`, `jar`, `jms`, `keyparc`, `lastfm`, 5447`ldaps`, `magnet`, `maps`, `market`, `message`, `mms`, `ms-help`, 5448`msnim`, `mumble`, `mvn`, `notes`, `oid`, `palm`, `paparazzi`, 5449`platform`, `proxy`, `psyc`, `query`, `res`, `resource`, `rmi`, `rsync`, 5450`rtmp`, `secondlife`, `sftp`, `sgn`, `skype`, `smb`, `soldat`, 5451`spotify`, `ssh`, `steam`, `svn`, `teamspeak`, `things`, `udp`, 5452`unreal`, `ut2004`, `ventrilo`, `view-source`, `webcal`, `wtai`, 5453`wyciwyg`, `xfire`, `xri`, `ymsgr`. 5454 5455Here are some valid autolinks: 5456 5457. 5458<http://foo.bar.baz> 5459. 5460<p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p> 5461. 5462 5463. 5464<http://foo.bar.baz?q=hello&id=22&boolean> 5465. 
5466<p><a href="http://foo.bar.baz?q=hello&id=22&boolean">http://foo.bar.baz?q=hello&id=22&boolean</a></p> 5467. 5468 5469. 5470<irc://foo.bar:2233/baz> 5471. 5472<p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p> 5473. 5474 5475Uppercase is also fine: 5476 5477. 5478<MAILTO:FOO@BAR.BAZ> 5479. 5480<p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p> 5481. 5482 5483Spaces are not allowed in autolinks: 5484 5485. 5486<http://foo.bar/baz bim> 5487. 5488<p><http://foo.bar/baz bim></p> 5489. 5490 5491An [email autolink](#email-autolink) <a id="email-autolink"></a> 5492consists of `<`, followed by an [email address](#email-address), 5493followed by `>`. The link's label is the email address, 5494and the URL is `mailto:` followed by the email address. 5495 5496An [email address](#email-address), <a id="email-address"></a> 5497for these purposes, is anything that matches 5498the [non-normative regex from the HTML5 5499spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-mail-state-%28type=email%29): 5500 5501 /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? 5502 (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/ 5503 5504Examples of email autolinks: 5505 5506. 5507<foo@bar.baz.com> 5508. 5509<p><a href="mailto:foo@bar.baz.com">foo@bar.baz.com</a></p> 5510. 5511 5512. 5513<foo+special@Bar.baz-bar0.com> 5514. 5515<p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p> 5516. 5517 5518These are not autolinks: 5519 5520. 5521<> 5522. 5523<p><></p> 5524. 5525 5526. 5527<heck://bing.bong> 5528. 5529<p><heck://bing.bong></p> 5530. 5531 5532. 5533< http://foo.bar > 5534. 5535<p>< http://foo.bar ></p> 5536. 5537 5538. 5539<foo.bar.baz> 5540. 5541<p><foo.bar.baz></p> 5542. 5543 5544. 5545<localhost:5001/foo> 5546. 5547<p><localhost:5001/foo></p> 5548. 5549 5550. 5551http://google.com 5552. 5553<p>http://google.com</p> 5554. 5555 5556. 5557foo@bar.baz.com 5558. 
5559<p>foo@bar.baz.com</p> 5560. 5561 5562## Raw HTML 5563 5564Text between `<` and `>` that looks like an HTML tag is parsed as a 5565raw HTML tag and will be rendered in HTML without escaping. 5566Tag and attribute names are not limited to current HTML tags, 5567so custom tags (and even, say, DocBook tags) may be used. 5568 5569Here is the grammar for tags: 5570 5571A [tag name](#tag-name) <a id="tag-name"></a> consists of an ASCII letter 5572followed by zero or more ASCII letters or digits. 5573 5574An [attribute](#attribute) <a id="attribute"></a> consists of whitespace, 5575an **attribute name**, and an optional **attribute value 5576specification**. 5577 5578An [attribute name](#attribute-name) <a id="attribute-name"></a> 5579consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII 5580letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML 5581specification restricted to ASCII. HTML5 is laxer.) 5582 5583An [attribute value specification](#attribute-value-specification) 5584<a id="attribute-value-specification"></a> consists of optional whitespace, 5585a `=` character, optional whitespace, and an [attribute 5586value](#attribute-value). 5587 5588An [attribute value](#attribute-value) <a id="attribute-value"></a> 5589consists of an [unquoted attribute value](#unquoted-attribute-value), 5590a [single-quoted attribute value](#single-quoted-attribute-value), 5591or a [double-quoted attribute value](#double-quoted-attribute-value). 5592 5593An [unquoted attribute value](#unquoted-attribute-value) 5594<a id="unquoted-attribute-value"></a> is a nonempty string of characters not 5595including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. 5596 5597A [single-quoted attribute value](#single-quoted-attribute-value) 5598<a id="single-quoted-attribute-value"></a> consists of `'`, zero or more 5599characters not including `'`, and a final `'`. 
5600 5601A [double-quoted attribute value](#double-quoted-attribute-value) 5602<a id="double-quoted-attribute-value"></a> consists of `"`, zero or more 5603characters not including `"`, and a final `"`. 5604 5605An [open tag](#open-tag) <a id="open-tag"></a> consists of a `<` character, 5606a [tag name](#tag-name), zero or more [attributes](#attribute), 5607optional whitespace, an optional `/` character, and a `>` character. 5608 5609A [closing tag](#closing-tag) <a id="closing-tag"></a> consists of the 5610string `</`, a [tag name](#tag-name), optional whitespace, and the 5611character `>`. 5612 5613An [HTML comment](#html-comment) <a id="html-comment"></a> consists of the 5614string `<!--`, a string of characters not including the string `--`, and 5615the string `-->`. 5616 5617A [processing instruction](#processing-instruction) 5618<a id="processing-instruction"></a> consists of the string `<?`, a string 5619of characters not including the string `?>`, and the string 5620`?>`. 5621 5622A [declaration](#declaration) <a id="declaration"></a> consists of the 5623string `<!`, a name consisting of one or more uppercase ASCII letters, 5624whitespace, a string of characters not including the character `>`, and 5625the character `>`. 5626 5627A [CDATA section](#cdata-section) <a id="cdata-section"></a> consists of 5628the string `<![CDATA[`, a string of characters not including the string 5629`]]>`, and the string `]]>`. 5630 5631An [HTML tag](#html-tag) <a id="html-tag"></a> consists of an [open 5632tag](#open-tag), a [closing tag](#closing-tag), an [HTML 5633comment](#html-comment), a [processing 5634instruction](#processing-instruction), a 5635[declaration](#declaration), or a [CDATA 5636section](#cdata-section). 5637 5638Here are some simple open tags: 5639 5640. 5641<a><bab><c2c> 5642. 5643<p><a><bab><c2c></p> 5644. 5645 5646Empty elements: 5647 5648. 5649<a/><b2/> 5650. 5651<p><a/><b2/></p> 5652. 
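
The tag grammar above can be sketched as a set of composed regular expressions in Python. This is only an illustrative sketch, not a reference implementation: every name here (`OPEN_TAG`, `HTML_TAG`, `is_html_tag`, and so on) is hypothetical, and it assumes that "spaces" in the definition of an unquoted attribute value means any whitespace character.

```python
import re

TAG_NAME = r"[A-Za-z][A-Za-z0-9]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
# Assumption: "spaces" is read as any whitespace character.
UNQUOTED = r"""[^\s"'=<>`]+"""
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = "(?:%s|%s|%s)" % (UNQUOTED, SINGLE_QUOTED, DOUBLE_QUOTED)
ATTR_VALUE_SPEC = r"\s*=\s*" + ATTR_VALUE
ATTRIBUTE = r"\s+" + ATTR_NAME + "(?:" + ATTR_VALUE_SPEC + ")?"

OPEN_TAG = "<" + TAG_NAME + "(?:" + ATTRIBUTE + ")*" + r"\s*/?>"
CLOSING_TAG = "</" + TAG_NAME + r"\s*>"
# Each body consumes one character at a time, refusing to step
# onto the forbidden string (`--`, `?>`, or `]]>`).
COMMENT = r"<!--(?:(?!--).)*-->"
PROCESSING_INSTRUCTION = r"<\?(?:(?!\?>).)*\?>"
DECLARATION = r"<![A-Z]+\s+[^>]*>"
CDATA = r"<!\[CDATA\[(?:(?!\]\]>).)*\]\]>"

HTML_TAG = re.compile(
    "|".join([OPEN_TAG, CLOSING_TAG, COMMENT,
              PROCESSING_INSTRUCTION, DECLARATION, CDATA]),
    re.DOTALL,  # let `.` match newlines inside comments and similar
)


def is_html_tag(text):
    """Return True if `text` as a whole is a single HTML tag per the grammar."""
    return HTML_TAG.fullmatch(text) is not None
```

Building the pattern from named pieces keeps each regular expression next to the grammar rule it mirrors, which makes it easier to check against the prose definitions.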
5653 5654Whitespace is allowed: 5655 5656. 5657<a /><b2 5658data="foo" > 5659. 5660<p><a /><b2 5661data="foo" ></p> 5662. 5663 5664With attributes: 5665 5666. 5667<a foo="bar" bam = 'baz <em>"</em>' 5668_boolean zoop:33=zoop:33 /> 5669. 5670<p><a foo="bar" bam = 'baz <em>"</em>' 5671_boolean zoop:33=zoop:33 /></p> 5672. 5673 5674Illegal tag names, not parsed as HTML: 5675 5676. 5677<33> <__> 5678. 5679<p><33> <__></p> 5680. 5681 5682Illegal attribute names: 5683 5684. 5685<a h*#ref="hi"> 5686. 5687<p><a h*#ref="hi"></p> 5688. 5689 5690Illegal attribute values: 5691 5692. 5693<a href="hi'> <a href=hi'> 5694. 5695<p><a href="hi'> <a href=hi'></p> 5696. 5697 5698Illegal whitespace: 5699 5700. 5701< a>< 5702foo><bar/ > 5703. 5704<p>< a>< 5705foo><bar/ ></p> 5706. 5707 5708Missing whitespace: 5709 5710. 5711<a href='bar'title=title> 5712. 5713<p><a href='bar'title=title></p> 5714. 5715 5716Closing tags: 5717 5718. 5719</a> 5720</foo > 5721. 5722<p></a> 5723</foo ></p> 5724. 5725 5726Illegal attributes in closing tag: 5727 5728. 5729</a href="foo"> 5730. 5731<p></a href="foo"></p> 5732. 5733 5734Comments: 5735 5736. 5737foo <!-- this is a 5738comment - with hyphen --> 5739. 5740<p>foo <!-- this is a 5741comment - with hyphen --></p> 5742. 5743 5744. 5745foo <!-- not a comment -- two hyphens --> 5746. 5747<p>foo <!-- not a comment -- two hyphens --></p> 5748. 5749 5750Processing instructions: 5751 5752. 5753foo <?php echo $a; ?> 5754. 5755<p>foo <?php echo $a; ?></p> 5756. 5757 5758Declarations: 5759 5760. 5761foo <!ELEMENT br EMPTY> 5762. 5763<p>foo <!ELEMENT br EMPTY></p> 5764. 5765 5766CDATA sections: 5767 5768. 5769foo <![CDATA[>&<]]> 5770. 5771<p>foo <![CDATA[>&<]]></p> 5772. 5773 5774Entities are preserved in HTML attributes: 5775 5776. 5777<a href="ö"> 5778. 5779<p><a href="ö"></p> 5780. 5781 5782Backslash escapes do not work in HTML attributes: 5783 5784. 5785<a href="\*"> 5786. 5787<p><a href="\*"></p> 5788. 5789 5790. 5791<a href="\""> 5792. 
5793<p><a href="""></p> 5794. 5795 5796## Hard line breaks 5797 5798A line break (not in a code span or HTML tag) that is preceded 5799by two or more spaces is parsed as a linebreak (rendered 5800in HTML as a `<br />` tag): 5801 5802. 5803foo 5804baz 5805. 5806<p>foo<br /> 5807baz</p> 5808. 5809 5810For a more visible alternative, a backslash before the newline may be 5811used instead of two spaces: 5812 5813. 5814foo\ 5815baz 5816. 5817<p>foo<br /> 5818baz</p> 5819. 5820 5821More than two spaces can be used: 5822 5823. 5824foo 5825baz 5826. 5827<p>foo<br /> 5828baz</p> 5829. 5830 5831Leading spaces at the beginning of the next line are ignored: 5832 5833. 5834foo 5835 bar 5836. 5837<p>foo<br /> 5838bar</p> 5839. 5840 5841. 5842foo\ 5843 bar 5844. 5845<p>foo<br /> 5846bar</p> 5847. 5848 5849Line breaks can occur inside emphasis, links, and other constructs 5850that allow inline content: 5851 5852. 5853*foo 5854bar* 5855. 5856<p><em>foo<br /> 5857bar</em></p> 5858. 5859 5860. 5861*foo\ 5862bar* 5863. 5864<p><em>foo<br /> 5865bar</em></p> 5866. 5867 5868Line breaks do not occur inside code spans 5869 5870. 5871`code 5872span` 5873. 5874<p><code>code span</code></p> 5875. 5876 5877. 5878`code\ 5879span` 5880. 5881<p><code>code\ span</code></p> 5882. 5883 5884or HTML tags: 5885 5886. 5887<a href="foo 5888bar"> 5889. 5890<p><a href="foo 5891bar"></p> 5892. 5893 5894. 5895<a href="foo\ 5896bar"> 5897. 5898<p><a href="foo\ 5899bar"></p> 5900. 5901 5902## Soft line breaks 5903 5904A regular line break (not in a code span or HTML tag) that is not 5905preceded by two or more spaces is parsed as a softbreak. (A 5906softbreak may be rendered in HTML either as a newline or as a space. 5907The result will be the same in browsers. In the examples here, a 5908newline will be used.) 5909 5910. 5911foo 5912baz 5913. 5914<p>foo 5915baz</p> 5916. 5917 5918Spaces at the end of the line and beginning of the next line are 5919removed: 5920 5921. 5922foo 5923 baz 5924. 
5925<p>foo 5926baz</p> 5927. 5928 5929A conforming parser may render a soft line break in HTML either as a 5930line break or as a space. 5931 5932A renderer may also provide an option to render soft line breaks 5933as hard line breaks. 5934 5935## Strings 5936 5937Any characters not given an interpretation by the above rules will 5938be parsed as string content. 5939 5940. 5941hello $.;'there 5942. 5943<p>hello $.;'there</p> 5944. 5945 5946. 5947Foo χρῆν 5948. 5949<p>Foo χρῆν</p> 5950. 5951 5952Internal spaces are preserved verbatim: 5953 5954. 5955Multiple spaces 5956. 5957<p>Multiple spaces</p> 5958. 5959 5960<!-- END TESTS --> 5961 5962# Appendix A: A parsing strategy {-} 5963 5964## Overview {-} 5965 5966Parsing has two phases: 5967 59681. In the first phase, lines of input are consumed and the block 5969structure of the document---its division into paragraphs, block quotes, 5970list items, and so on---is constructed. Text is assigned to these 5971blocks but not parsed. Link reference definitions are parsed and a 5972map of links is constructed. 5973 59742. In the second phase, the raw text contents of paragraphs and headers 5975are parsed into sequences of Markdown inline elements (strings, 5976code spans, links, emphasis, and so on), using the map of link 5977references constructed in phase 1. 5978 5979## The document tree {-} 5980 5981At each point in processing, the document is represented as a tree of 5982**blocks**. The root of the tree is a `document` block. The `document` 5983may have any number of other blocks as **children**. These children 5984may, in turn, have other blocks as children. The last child of a block 5985is normally considered **open**, meaning that subsequent lines of input 5986can alter its contents. (Blocks that are not open are **closed**.) 
5987Here, for example, is a possible document tree, with the open blocks 5988marked by arrows: 5989 5990``` tree 5991-> document 5992 -> block_quote 5993 paragraph 5994 "Lorem ipsum dolor\nsit amet." 5995 -> list (type=bullet tight=true bullet_char=-) 5996 list_item 5997 paragraph 5998 "Qui *quodsi iracundia*" 5999 -> list_item 6000 -> paragraph 6001 "aliquando id" 6002``` 6003 6004## How source lines alter the document tree {-} 6005 6006Each line that is processed has an effect on this tree. The line is 6007analyzed and, depending on its contents, the document may be altered 6008in one or more of the following ways: 6009 60101. One or more open blocks may be closed. 60112. One or more new blocks may be created as children of the 6012 last open block. 60133. Text may be added to the last (deepest) open block remaining 6014 on the tree. 6015 6016Once a line has been incorporated into the tree in this way, 6017it can be discarded, so input can be read in a stream. 6018 6019We can see how this works by considering how the tree above is 6020generated by four lines of Markdown: 6021 6022``` markdown 6023> Lorem ipsum dolor 6024sit amet. 6025> - Qui *quodsi iracundia* 6026> - aliquando id 6027``` 6028 6029At the outset, our document model is just 6030 6031``` tree 6032-> document 6033``` 6034 6035The first line of our text, 6036 6037``` markdown 6038> Lorem ipsum dolor 6039``` 6040 6041causes a `block_quote` block to be created as a child of our 6042open `document` block, and a `paragraph` block as a child of 6043the `block_quote`. Then the text is added to the last open 6044block, the `paragraph`: 6045 6046``` tree 6047-> document 6048 -> block_quote 6049 -> paragraph 6050 "Lorem ipsum dolor" 6051``` 6052 6053The next line, 6054 6055``` markdown 6056sit amet. 
6057``` 6058 6059is a "lazy continuation" of the open `paragraph`, so it gets added 6060to the paragraph's text: 6061 6062``` tree 6063-> document 6064 -> block_quote 6065 -> paragraph 6066 "Lorem ipsum dolor\nsit amet." 6067``` 6068 6069The third line, 6070 6071``` markdown 6072> - Qui *quodsi iracundia* 6073``` 6074 6075causes the `paragraph` block to be closed, and a new `list` block 6076opened as a child of the `block_quote`. A `list_item` is also 6077added as a child of the `list`, and a `paragraph` as a child of 6078the `list_item`. The text is then added to the new `paragraph`: 6079 6080``` tree 6081-> document 6082 -> block_quote 6083 paragraph 6084 "Lorem ipsum dolor\nsit amet." 6085 -> list (type=bullet tight=true bullet_char=-) 6086 -> list_item 6087 -> paragraph 6088 "Qui *quodsi iracundia*" 6089``` 6090 6091The fourth line, 6092 6093``` markdown 6094> - aliquando id 6095``` 6096 6097causes the `list_item` (and its child the `paragraph`) to be closed, 6098and a new `list_item` opened up as child of the `list`. A `paragraph` 6099is added as a child of the new `list_item`, to contain the text. 6100We thus obtain the final tree: 6101 6102``` tree 6103-> document 6104 -> block_quote 6105 paragraph 6106 "Lorem ipsum dolor\nsit amet." 6107 -> list (type=bullet tight=true bullet_char=-) 6108 list_item 6109 paragraph 6110 "Qui *quodsi iracundia*" 6111 -> list_item 6112 -> paragraph 6113 "aliquando id" 6114``` 6115 6116## From block structure to the final document {-} 6117 6118Once all of the input has been parsed, all open blocks are closed. 6119 6120We then "walk the tree," visiting every node, and parse raw 6121string contents of paragraphs and headers as inlines. At this 6122point we have seen all the link reference definitions, so we can 6123resolve reference links as we go. 6124 6125``` tree 6126document 6127 block_quote 6128 paragraph 6129 str "Lorem ipsum dolor" 6130 softbreak 6131 str "sit amet." 
6132 list (type=bullet tight=true bullet_char=-) 6133 list_item 6134 paragraph 6135 str "Qui " 6136 emph 6137 str "quodsi iracundia" 6138 list_item 6139 paragraph 6140 str "aliquando id" 6141``` 6142 6143Notice how the newline in the first paragraph has been parsed as 6144a `softbreak`, and the asterisks in the first list item have become 6145an `emph`. 6146 6147The document can be rendered as HTML, or in any other format, given 6148an appropriate renderer. 6149 6150 6151
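
The three tree operations described under "How source lines alter the document tree" (closing open blocks, adding children, adding text to the deepest open block) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `Block` class and the functions `deepest_open` and `build_example` are hypothetical names, and the code replays the four-line example by hand rather than analyzing the source lines, so it is not a parser.

```python
class Block:
    """A node in the document tree; `tag` names the block type."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []
        self.lines = []   # raw text, left unparsed until the second phase
        self.open = True

    def close(self):
        # Closing a block also closes any descendants that are still open.
        for child in self.children:
            if child.open:
                child.close()
        self.open = False

    def add_child(self, tag):
        # Only the last child may be open, so close the previous one first.
        if self.children and self.children[-1].open:
            self.children[-1].close()
        child = Block(tag)
        self.children.append(child)
        return child


def deepest_open(block):
    """Follow the chain of open last children down from `block`."""
    while block.children and block.children[-1].open:
        block = block.children[-1]
    return block


def build_example():
    """Replay, by hand, the effect of the four example lines."""
    doc = Block("document")
    # Line 1: "> Lorem ipsum dolor"
    quote = doc.add_child("block_quote")
    quote.add_child("paragraph")
    deepest_open(doc).lines.append("Lorem ipsum dolor")
    # Line 2: "sit amet." (lazy continuation of the open paragraph)
    deepest_open(doc).lines.append("sit amet.")
    # Line 3: "> - Qui *quodsi iracundia*" (closes the paragraph)
    lst = quote.add_child("list")
    lst.add_child("list_item").add_child("paragraph")
    deepest_open(doc).lines.append("Qui *quodsi iracundia*")
    # Line 4: "> - aliquando id" (closes the first list item)
    lst.add_child("list_item").add_child("paragraph")
    deepest_open(doc).lines.append("aliquando id")
    return doc
```

Calling `close()` on the root once the input is exhausted corresponds to the step in which all open blocks are closed before inline parsing begins.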