1---
2title: CommonMark Spec
3author:
4- John MacFarlane
5version: 2
6date: 2014-09-19
7...
8
9# Introduction
10
11## What is Markdown?
12
13Markdown is a plain text format for writing structured documents,
14based on conventions used for indicating formatting in email and
15usenet posts.  It was developed in 2004 by John Gruber, who wrote
16the first Markdown-to-HTML converter in Perl, and it soon became
17widely used in websites.  By 2014 there were dozens of
18implementations in many languages.  Some of them extended basic
19Markdown syntax with conventions for footnotes, definition lists,
20tables, and other constructs, and some allowed output not just in
21HTML but in LaTeX and many other formats.
22
23## Why is a spec needed?
24
25John Gruber's [canonical description of Markdown's
26syntax](http://daringfireball.net/projects/markdown/syntax)
27does not specify the syntax unambiguously.  Here are some examples of
28questions it does not answer:
29
301.  How much indentation is needed for a sublist?  The spec says that
31    continuation paragraphs need to be indented four spaces, but is
32    not fully explicit about sublists.  It is natural to think that
33    they, too, must be indented four spaces, but `Markdown.pl` does
34    not require that.  This is hardly a "corner case," and divergences
35    between implementations on this issue often lead to surprises for
36    users in real documents. (See [this comment by John
37    Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)
38
392.  Is a blank line needed before a block quote or header?
40    Most implementations do not require the blank line.  However,
41    this can lead to unexpected results in hard-wrapped text, and
42    also to ambiguities in parsing (note that some implementations
43    put the header inside the blockquote, while others do not).
44    (John Gruber has also spoken [in favor of requiring the blank
45    lines](http://article.gmane.org/gmane.text.markdown.general/2146).)
46
473.  Is a blank line needed before an indented code block?
48    (`Markdown.pl` requires it, but this is not mentioned in the
49    documentation, and some implementations do not require it.)
50
51    ``` markdown
52    paragraph
53        code?
54    ```
55
564.  What is the exact rule for determining when list items get
57    wrapped in `<p>` tags?  Can a list be partially "loose" and partially
58    "tight"?  What should we do with a list like this?
59
60    ``` markdown
61    1. one
62
63    2. two
64    3. three
65    ```
66
67    Or this?
68
69    ``` markdown
70    1.  one
71        - a
72
73        - b
74    2.  two
75    ```
76
77    (There are some relevant comments by John Gruber
78    [here](http://article.gmane.org/gmane.text.markdown.general/2554).)
79
805.  Can list markers be indented?  Can ordered list markers be right-aligned?
81
82    ``` markdown
83     8. item 1
84     9. item 2
85    10. item 2a
86    ```
87
886.  Is this one list with a horizontal rule in its second item,
89    or two lists separated by a horizontal rule?
90
91    ``` markdown
92    * a
93    * * * * *
94    * b
95    ```
96
977.  When list markers change from numbers to bullets, do we have
98    two lists or one?  (The Markdown syntax description suggests two,
99    but the perl scripts and many other implementations produce one.)
100
101    ``` markdown
102    1. fee
103    2. fie
104    -  foe
105    -  fum
106    ```
107
1088.  What are the precedence rules for the markers of inline structure?
109    For example, is the following a valid link, or does the code span
110    take precedence?
111
112    ``` markdown
113    [a backtick (`)](/url) and [another backtick (`)](/url).
114    ```
115
1169.  What are the precedence rules for markers of emphasis and strong
117    emphasis?  For example, how should the following be parsed?
118
119    ``` markdown
120    *foo *bar* baz*
121    ```
122
12310. What are the precedence rules between block-level and inline-level
124    structure?  For example, how should the following be parsed?
125
126    ``` markdown
127    - `a long code span can contain a hyphen like this
128      - and it can screw things up`
129    ```
130
13111. Can list items include headers?  (`Markdown.pl` does not allow this,
132    but headers can occur in blockquotes.)
133
134    ``` markdown
135    - # Heading
136    ```
137
13812. Can link references be defined inside block quotes or list items?
139
140    ``` markdown
141    > Blockquote [foo].
142    >
143    > [foo]: /url
144    ```
145
14613. If there are multiple definitions for the same reference, which takes
147    precedence?
148
149    ``` markdown
150    [foo]: /url1
151    [foo]: /url2
152
153    [foo][]
154    ```
155
156In the absence of a spec, early implementers consulted `Markdown.pl`
157to resolve these ambiguities.  But `Markdown.pl` was quite buggy, and
158gave manifestly bad results in many cases, so it was not a
159satisfactory replacement for a spec.
160
161Because there is no unambiguous spec, implementations have diverged
162considerably.  As a result, users are often surprised to find that
163a document that renders one way on one system (say, a github wiki)
164renders differently on another (say, converting to docbook using
165pandoc).  To make matters worse, because nothing in Markdown counts
166as a "syntax error," the divergence often isn't discovered right away.
167
168## About this document
169
170This document attempts to specify Markdown syntax unambiguously.
171It contains many examples with side-by-side Markdown and
172HTML.  These are intended to double as conformance tests.  An
173accompanying script `runtests.pl` can be used to run the tests
174against any Markdown program:
175
176    perl runtests.pl spec.txt PROGRAM
177
178Since this document describes how Markdown is to be parsed into
179an abstract syntax tree, it would have made sense to use an abstract
180representation of the syntax tree instead of HTML.  But HTML is capable
181of representing the structural distinctions we need to make, and the
182choice of HTML for the tests makes it possible to run the tests against
183an implementation without writing an abstract syntax tree renderer.
184
185This document is generated from a text file, `spec.txt`, written
186in Markdown with a small extension for the side-by-side tests.
187The script `spec2md.pl` can be used to turn `spec.txt` into pandoc
188Markdown, which can then be converted into other formats.
189
190In the examples, the `→` character is used to represent tabs.
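
The side-by-side examples in this document appear to follow a simple layout: a line containing only `.`, the Markdown source, a second `.` line, the expected HTML, and a closing `.` line.  The following Python sketch reads such examples back out of the text.  It is not the logic of `runtests.pl`; the name `extract_examples` is hypothetical, and the sketch assumes that no prose line consists of a single `.`.

```python
def extract_examples(spec_text):
    """Return a list of (markdown, html) pairs found in the spec text."""
    examples = []
    state = 0  # 0 = prose, 1 = Markdown source, 2 = expected HTML
    markdown, html = [], []
    for line in spec_text.split("\n"):
        if line == ".":
            if state == 2:
                # The third "." closes the example.
                examples.append(("".join(markdown), "".join(html)))
                markdown, html = [], []
            state = (state + 1) % 3
        elif state == 1:
            # The arrow character stands for a tab in the examples.
            markdown.append(line.replace("→", "\t") + "\n")
        elif state == 2:
            html.append(line.replace("→", "\t") + "\n")
    return examples
```

Each pair can then be fed to the program under test, comparing its output for the first element against the second.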
191
192# Preprocessing
193
194A [line](#line) <a id="line"></a>
195is a sequence of zero or more characters followed by a line
196ending (CR, LF, or CRLF) or by the end of
197file.
198
199This spec does not specify an encoding; it treats lines as composed
200of characters rather than bytes.  A conforming parser may be limited
201to a certain encoding.
202
203Tabs in lines are expanded to spaces, with a tab stop of 4 characters:
204
205.
206→foo→baz→→bim
207.
208<pre><code>foo baz     bim
209</code></pre>
210.
211
212.
213    a→a
214    ὐ→a
215.
216<pre><code>a   a
217ὐ   a
218</code></pre>
219.
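
The tab expansion rule can be sketched in Python as follows.  This is an illustration only; the function name `expand_tabs` is hypothetical, and the sketch counts columns in characters rather than bytes, in line with the remark on encodings above.

```python
def expand_tabs(line, tab_stop=4):
    """Replace each tab with spaces up to the next multiple of tab_stop."""
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            # Advance to the next tab stop; this always adds 1..tab_stop spaces.
            width = tab_stop - (column % tab_stop)
            out.append(" " * width)
            column += width
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```

For a single line without line endings, Python's built-in `str.expandtabs(4)` should give the same result.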
220
221Line endings are replaced by newline characters (LF).
222
223A line containing no characters, or a line containing only spaces (after
224tab expansion), is called a [blank line](#blank-line).
225<a id="blank-line"></a>
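
A minimal sketch of these two preprocessing rules, assuming the input is already a decoded string.  The names `split_lines` and `is_blank` are hypothetical.

```python
import re

def split_lines(text):
    """Split text into lines at CRLF, CR, or LF; line endings are dropped."""
    lines = re.split(r"\r\n|\r|\n", text)
    # A line ending at the very end of the input does not start another line.
    if lines and lines[-1] == "":
        lines.pop()
    return lines

def is_blank(line):
    """True if the line is empty or holds only spaces after tab expansion."""
    return line.expandtabs(4).strip(" ") == ""
```

Joining the result of `split_lines` with `"\n"` yields text in which every line ending has been replaced by LF.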
226
227# Blocks and inlines
228
229We can think of a document as a sequence of [blocks](#block)<a
230id="block"></a>---structural elements like paragraphs, block quotations,
231lists, headers, rules, and code blocks.  Blocks can contain other
232blocks, or they can contain [inline](#inline)<a id="inline"></a> content:
233words, spaces, links, emphasized text, images, and inline code.
234
235## Precedence
236
237Indicators of block structure always take precedence over indicators
238of inline structure.  So, for example, the following is a list with
239two items, not a list with one item containing a code span:
240
241.
242- `one
243- two`
244.
245<ul>
246<li>`one</li>
247<li>two`</li>
248</ul>
249.
250
251This means that parsing can proceed in two steps:  first, the block
252structure of the document can be discerned; second, text lines inside
253paragraphs, headers, and other block constructs can be parsed for inline
254structure.  The second step requires information about link reference
255definitions that will be available only at the end of the first
256step.  Note that the first step requires processing lines in sequence,
257but the second can be parallelized, since the inline parsing of
258one block element does not affect the inline parsing of any other.
259
260## Container blocks and leaf blocks
261
262We can divide blocks into two types:
263[container blocks](#container-block), <a id="container-block"></a>
264which can contain other blocks, and [leaf blocks](#leaf-block),
265<a id="leaf-block"></a> which cannot.
266
267# Leaf blocks
268
269This section describes the different kinds of leaf block that make up a
270Markdown document.
271
272## Horizontal rules
273
274A line consisting of 0-3 spaces of indentation, followed by a sequence
275of three or more matching `-`, `_`, or `*` characters, each followed
276optionally by any number of spaces, forms a [horizontal
277rule](#horizontal-rule). <a id="horizontal-rule"></a>
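
The definition above can be approximated with a single regular expression.  The following Python sketch is one possible reading of the rule, not a reference implementation; `is_horizontal_rule` is a hypothetical name, and the line is assumed to have had its tabs expanded and its line ending removed.

```python
import re

# 0-3 spaces, then three or more of the same rule character,
# each optionally followed by any number of spaces.
HRULE = re.compile(r" {0,3}([-_*]) *(?:\1 *){2,}")

def is_horizontal_rule(line):
    """True if the whole line matches the horizontal rule pattern."""
    return HRULE.fullmatch(line) is not None
```

The check looks at one line in isolation.  Whether a line such as `---` ends up as a rule or as a setext header underline depends on what precedes it, which this sketch does not decide.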
278
279.
280***
281---
282___
283.
284<hr />
285<hr />
286<hr />
287.
288
289Wrong characters:
290
291.
292+++
293.
294<p>+++</p>
295.
296
297.
298===
299.
300<p>===</p>
301.
302
303Not enough characters:
304
305.
306--
307**
308__
309.
310<p>--
311**
312__</p>
313.
314
315One to three spaces of indentation are allowed:
316
317.
318 ***
319  ***
320   ***
321.
322<hr />
323<hr />
324<hr />
325.
326
327Four spaces is too many:
328
329.
330    ***
331.
332<pre><code>***
333</code></pre>
334.
335
336.
337Foo
338    ***
339.
340<p>Foo
341***</p>
342.
343
344More than three characters may be used:
345
346.
347_____________________________________
348.
349<hr />
350.
351
352Spaces are allowed between the characters:
353
354.
355 - - -
356.
357<hr />
358.
359
360.
361 **  * ** * ** * **
362.
363<hr />
364.
365
366.
367-     -      -      -
368.
369<hr />
370.
371
372Spaces are allowed at the end:
373
374.
375- - - -
376.
377<hr />
378.
379
380However, no other characters may occur at the end or the
381beginning:
382
383.
384_ _ _ _ a
385
386a------
387.
388<p>_ _ _ _ a</p>
389<p>a------</p>
390.
391
392It is required that all of the non-space characters be the same.
393So, this is not a horizontal rule:
394
395.
396 *-*
397.
398<p><em>-</em></p>
399.
400
401Horizontal rules do not need blank lines before or after:
402
403.
404- foo
405***
406- bar
407.
408<ul>
409<li>foo</li>
410</ul>
411<hr />
412<ul>
413<li>bar</li>
414</ul>
415.
416
417Horizontal rules can interrupt a paragraph:
418
419.
420Foo
421***
422bar
423.
424<p>Foo</p>
425<hr />
426<p>bar</p>
427.
428
429Note, however, that this is a setext header, not a paragraph followed
430by a horizontal rule:
431
432.
433Foo
434---
435bar
436.
437<h2>Foo</h2>
438<p>bar</p>
439.
440
441When both a horizontal rule and a list item are possible
442interpretations of a line, the horizontal rule is preferred:
443
444.
445* Foo
446* * *
447* Bar
448.
449<ul>
450<li>Foo</li>
451</ul>
452<hr />
453<ul>
454<li>Bar</li>
455</ul>
456.
457
458If you want a horizontal rule in a list item, use a different bullet:
459
460.
461- Foo
462- * * *
463.
464<ul>
465<li>Foo</li>
466<li><hr /></li>
467</ul>
468.
469
470## ATX headers
471
472An [ATX header](#atx-header) <a id="atx-header"></a>
473consists of a string of characters, parsed as inline content, between an
474opening sequence of 1--6 unescaped `#` characters and an optional
475closing sequence of any number of `#` characters.  The opening sequence
476of `#` characters cannot be followed directly by a nonspace character.
477The closing `#` characters may be followed by spaces only.  The opening
478`#` character may be indented 0-3 spaces.  The raw contents of the
479header are stripped of leading and trailing spaces before being parsed
480as inline content.  The header level is equal to the number of `#`
481characters in the opening sequence.
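
As an illustration, the following Python sketch recognizes an ATX header line and returns its level and raw contents.  It is a simplified reading of the paragraph above: `parse_atx_header` is a hypothetical name, the contents are returned unparsed (so backslash escapes are still present), and a doubled backslash before a trailing `#` is not handled.

```python
import re

# 0-3 spaces, 1-6 "#" characters, then a space or the end of the line.
ATX_OPEN = re.compile(r"^ {0,3}(#{1,6})(?: +|$)")

def parse_atx_header(line):
    """Return (level, raw_contents), or None if the line is not an ATX header."""
    m = ATX_OPEN.match(line)
    if m is None:
        return None
    level = len(m.group(1))
    rest = line[m.end():].rstrip(" ")
    closing = re.search(r"#+$", rest)
    if closing is not None:
        start = closing.start()
        # A "#" preceded by a backslash is escaped and stays in the contents.
        if start > 0 and rest[start - 1] == "\\":
            start += 1
        rest = rest[:start]
    return level, rest.strip(" ")
```

For example, `parse_atx_header("## foo ##")` gives `(2, "foo")`, while `parse_atx_header("#5 bolt")` gives `None` because no space follows the `#`.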
482
483Simple headers:
484
485.
486# foo
487## foo
488### foo
489#### foo
490##### foo
491###### foo
492.
493<h1>foo</h1>
494<h2>foo</h2>
495<h3>foo</h3>
496<h4>foo</h4>
497<h5>foo</h5>
498<h6>foo</h6>
499.
500
501More than six `#` characters is not a header:
502
503.
504####### foo
505.
506<p>####### foo</p>
507.
508
509A space is required between the `#` characters and the header's
510contents.  Note that many implementations currently do not require
511the space.  However, the space was required by the [original ATX
512implementation](http://www.aaronsw.com/2002/atx/atx.py), and it helps
513prevent things like the following from being parsed as headers:
514
515.
516#5 bolt
517.
518<p>#5 bolt</p>
519.
520
521This is not a header, because the first `#` is escaped:
522
523.
524\## foo
525.
526<p>## foo</p>
527.
528
529Contents are parsed as inlines:
530
531.
532# foo *bar* \*baz\*
533.
534<h1>foo <em>bar</em> *baz*</h1>
535.
536
537Leading and trailing blanks are ignored in parsing inline content:
538
539.
540#                  foo
541.
542<h1>foo</h1>
543.
544
545One to three spaces of indentation are allowed:
546
547.
548 ### foo
549  ## foo
550   # foo
551.
552<h3>foo</h3>
553<h2>foo</h2>
554<h1>foo</h1>
555.
556
557Four spaces are too much:
558
559.
560    # foo
561.
562<pre><code># foo
563</code></pre>
564.
565
566.
567foo
568    # bar
569.
570<p>foo
571# bar</p>
572.
573
574A closing sequence of `#` characters is optional:
575
576.
577## foo ##
578  ###   bar    ###
579.
580<h2>foo</h2>
581<h3>bar</h3>
582.
583
584It need not be the same length as the opening sequence:
585
586.
587# foo ##################################
588##### foo ##
589.
590<h1>foo</h1>
591<h5>foo</h5>
592.
593
594Spaces are allowed after the closing sequence:
595
596.
597### foo ###
598.
599<h3>foo</h3>
600.
601
602A sequence of `#` characters with a nonspace character following it
603is not a closing sequence, but counts as part of the contents of the
604header:
605
606.
607### foo ### b
608.
609<h3>foo ### b</h3>
610.
611
612Backslash-escaped `#` characters do not count as part
613of the closing sequence:
614
615.
616### foo \###
617## foo \#\##
618# foo \#
619.
620<h3>foo #</h3>
621<h2>foo ##</h2>
622<h1>foo #</h1>
623.
624
625ATX headers need not be separated from surrounding content by blank
626lines, and they can interrupt paragraphs:
627
628.
629****
630## foo
631****
632.
633<hr />
634<h2>foo</h2>
635<hr />
636.
637
638.
639Foo bar
640# baz
641Bar foo
642.
643<p>Foo bar</p>
644<h1>baz</h1>
645<p>Bar foo</p>
646.
647
648ATX headers can be empty:
649
650.
651##
652#
653### ###
654.
655<h2></h2>
656<h1></h1>
657<h3></h3>
658.
659
660## Setext headers
661
662A [setext header](#setext-header) <a id="setext-header"></a>
663consists of a line of text, containing at least one nonspace character,
664with no more than 3 spaces indentation, followed by a [setext header
665underline](#setext-header-underline).  A [setext header
666underline](#setext-header-underline) <a id="setext-header-underline"></a>
667is a sequence of `=` characters or a sequence of `-` characters, with no
668more than 3 spaces indentation and any number of trailing
669spaces.  The header is a level 1 header if `=` characters are used, and
670a level 2 header if `-` characters are used.  The contents of the header
671are the result of parsing the first line as Markdown inline content.
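
A short Python sketch of the underline test, assuming a line with tabs expanded and the line ending removed.  The name `setext_level` is hypothetical.

```python
import re

# Up to 3 spaces, a run of "=" or a run of "-", then optional trailing spaces.
SETEXT_UNDERLINE = re.compile(r" {0,3}(=+|-+) *")

def setext_level(line):
    """Return 1 for an "=" underline, 2 for a "-" underline, otherwise None."""
    m = SETEXT_UNDERLINE.fullmatch(line)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2
```

The function only classifies the underline itself.  Whether the preceding line can serve as the header text (for instance, whether it belongs to a paragraph that cannot be interrupted) has to be decided by the caller.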
672
673In general, a setext header need not be preceded or followed by a
674blank line.  However, it cannot interrupt a paragraph, so when a
675setext header comes after a paragraph, a blank line is needed between
676them.
677
678Simple examples:
679
680.
681Foo *bar*
682=========
683
684Foo *bar*
685---------
686.
687<h1>Foo <em>bar</em></h1>
688<h2>Foo <em>bar</em></h2>
689.
690
691The underlining can be any length:
692
693.
694Foo
695-------------------------
696
697Foo
698=
699.
700<h2>Foo</h2>
701<h1>Foo</h1>
702.
703
704The header content can be indented up to three spaces, and need
705not line up with the underlining:
706
707.
708   Foo
709---
710
711  Foo
712-----
713
714  Foo
715  ===
716.
717<h2>Foo</h2>
718<h2>Foo</h2>
719<h1>Foo</h1>
720.
721
722Four spaces of indentation is too much:
723
724.
725    Foo
726    ---
727
728    Foo
729---
730.
731<pre><code>Foo
732---
733
734Foo
735</code></pre>
736<hr />
737.
738
739The setext header underline can be indented up to three spaces, and
740may have trailing spaces:
741
742.
743Foo
744   ----
745.
746<h2>Foo</h2>
747.
748
749Four spaces is too much:
750
751.
752Foo
753     ---
754.
755<p>Foo
756---</p>
757.
758
759The setext header underline cannot contain internal spaces:
760
761.
762Foo
763= =
764
765Foo
766--- -
767.
768<p>Foo
769= =</p>
770<p>Foo</p>
771<hr />
772.
773
774Trailing spaces in the content line do not cause a line break:
775
776.
777Foo
778-----
779.
780<h2>Foo</h2>
781.
782
783Nor does a backslash at the end:
784
785.
786Foo\
787----
788.
789<h2>Foo\</h2>
790.
791
792Since indicators of block structure take precedence over
793indicators of inline structure, the following are setext headers:
794
795.
796`Foo
797----
798`
799
800<a title="a lot
801---
802of dashes"/>
803.
804<h2>`Foo</h2>
805<p>`</p>
806<h2>&lt;a title=&quot;a lot</h2>
807<p>of dashes&quot;/&gt;</p>
808.
809
810The setext header underline cannot be a lazy line:
811
812.
813> Foo
814---
815.
816<blockquote>
817<p>Foo</p>
818</blockquote>
819<hr />
820.
821
822A setext header cannot interrupt a paragraph:
823
824.
825Foo
826Bar
827---
828
829Foo
830Bar
831===
832.
833<p>Foo
834Bar</p>
835<hr />
836<p>Foo
837Bar
838===</p>
839.
840
841But in general a blank line is not required before or after:
842
843.
844---
845Foo
846---
847Bar
848---
849Baz
850.
851<hr />
852<h2>Foo</h2>
853<h2>Bar</h2>
854<p>Baz</p>
855.
856
857Setext headers cannot be empty:
858
859.
860
861====
862.
863<p>====</p>
864.
865
866
867## Indented code blocks
868
869An [indented code block](#indented-code-block)
870<a id="indented-code-block"></a> is composed of one or more
871[indented chunks](#indented-chunk) separated by blank lines.
872An [indented chunk](#indented-chunk) <a id="indented-chunk"></a>
873is a sequence of non-blank lines, each indented four or more
874spaces.  An indented code block cannot interrupt a paragraph, so
875if it occurs before or after a paragraph, there must be an
876intervening blank line.  The contents of the code block are
877the literal contents of the lines, including trailing newlines,
878minus four spaces of indentation. An indented code block has no
879attributes.
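
The following Python sketch shows how the contents of an indented code block could be computed once its lines have been collected.  It is an assumption-laden illustration: `indented_code_content` is a hypothetical name, and the caller is assumed to pass only blank lines and lines indented four or more spaces.

```python
def indented_code_content(lines):
    """Return the literal contents of an indented code block."""
    start, end = 0, len(lines)
    # Blank lines before and after the block are not part of it.
    while start < end and lines[start].strip(" ") == "":
        start += 1
    while end > start and lines[end - 1].strip(" ") == "":
        end -= 1
    body = []
    for line in lines[start:end]:
        if line.startswith("    "):
            # Remove four spaces; anything beyond that is kept.
            body.append(line[4:])
        else:
            # Only interior blank lines with fewer than four spaces get here.
            body.append("")
    return "".join(text + "\n" for text in body)
```

Deciding where the block starts and stops (for example, that it cannot interrupt a paragraph) is left to the surrounding block parser.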
880
881.
882    a simple
883      indented code block
884.
885<pre><code>a simple
886  indented code block
887</code></pre>
888.
889
890The contents are literal text, and do not get parsed as Markdown:
891
892.
893    <a/>
894    *hi*
895
896    - one
897.
898<pre><code>&lt;a/&gt;
899*hi*
900
901- one
902</code></pre>
903.
904
905Here we have three chunks separated by blank lines:
906
907.
908    chunk1
909
910    chunk2
911
912
913
914    chunk3
915.
916<pre><code>chunk1
917
918chunk2
919
920
921
922chunk3
923</code></pre>
924.
925
926Any initial spaces beyond four will be included in the content, even
927in interior blank lines:
928
929.
930    chunk1
931
932      chunk2
933.
934<pre><code>chunk1
935
936  chunk2
937</code></pre>
938.
939
940An indented code block cannot interrupt a paragraph.  (This
941allows hanging indents and the like.)
942
943.
944Foo
945    bar
946
947.
948<p>Foo
949bar</p>
950.
951
952However, any non-blank line with fewer than four leading spaces ends
953the code block immediately.  So a paragraph may occur immediately
954after indented code:
955
956.
957    foo
958bar
959.
960<pre><code>foo
961</code></pre>
962<p>bar</p>
963.
964
965And indented code can occur immediately before and after other kinds of
966blocks:
967
968.
969# Header
970    foo
971Header
972------
973    foo
974----
975.
976<h1>Header</h1>
977<pre><code>foo
978</code></pre>
979<h2>Header</h2>
980<pre><code>foo
981</code></pre>
982<hr />
983.
984
985The first line can be indented more than four spaces:
986
987.
988        foo
989    bar
990.
991<pre><code>    foo
992bar
993</code></pre>
994.
995
996Blank lines preceding or following an indented code block
997are not included in it:
998
999.
1000
1001
1002    foo
1003
1004
1005.
1006<pre><code>foo
1007</code></pre>
1008.
1009
1010Trailing spaces are included in the code block's content:
1011
1012.
1013    foo
1014.
1015<pre><code>foo
1016</code></pre>
1017.
1018
1019
1020## Fenced code blocks
1021
1022A [code fence](#code-fence) <a id="code-fence"></a> is a sequence
1023of at least three consecutive backtick characters (`` ` ``) or
1024tildes (`~`).  (Tildes and backticks cannot be mixed.)
1025A [fenced code block](#fenced-code-block) <a id="fenced-code-block"></a>
1026begins with a code fence, indented no more than three spaces.
1027
1028The line with the opening code fence may optionally contain some text
1029following the code fence; this is trimmed of leading and trailing
1030spaces and called the [info string](#info-string).
1031<a id="info-string"></a> The info string may not contain any backtick
1032characters.  (The reason for this restriction is that otherwise
1033some inline code would be incorrectly interpreted as the
1034beginning of a fenced code block.)
1035
1036The content of the code block consists of all subsequent lines, until
1037a closing [code fence](#code-fence) of the same type as the code block
1038began with (backticks or tildes), and with at least as many backticks
1039or tildes as the opening code fence.  If the leading code fence is
1040indented N spaces, then up to N spaces of indentation are removed from
1041each line of the content (if present).  (If a content line is not
1042indented, it is preserved unchanged.  If it is indented less than N
1043spaces, all of the indentation is removed.)
1044
1045The closing code fence may be indented up to three spaces, and may be
1046followed only by spaces, which are ignored.  If the end of the
1047containing block (or document) is reached and no closing code fence
1048has been found, the code block contains all of the lines after the
1049opening code fence until the end of the containing block (or
1050document).  (An alternative spec would require backtracking in the
1051event that a closing code fence is not found.  But this makes parsing
1052much less efficient, and there seems to be no real down side to the
1053behavior described here.)
1054
1055A fenced code block may interrupt a paragraph, and does not require
1056a blank line either before or after.
1057
1058The content of a fenced code block is treated as literal text, not parsed
1059as inlines.  The first word of the info string is typically used to
1060specify the language of the code sample, and rendered in the `class`
1061attribute of the `code` tag.  However, this spec does not mandate any
1062particular treatment of the info string.
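
The rules above can be sketched in Python as follows.  This is a simplified illustration rather than a reference implementation: `parse_fenced_block` is a hypothetical name, the input is assumed to be a list of lines belonging to one containing block with the candidate opening fence first, and tabs are assumed to be expanded already.

```python
import re

# Up to 3 spaces, a fence of 3+ backticks or 3+ tildes, then an info
# string that contains no backticks.
OPEN_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})([^`]*)$")

def parse_fenced_block(lines):
    """Return (info_string, content_lines, lines_consumed), or None."""
    m = OPEN_FENCE.match(lines[0])
    if m is None:
        return None
    indent = len(m.group(1))
    fence = m.group(2)
    info = m.group(3).strip(" ")
    # Closing fence: same character, at least as long, followed only by spaces.
    close = re.compile(
        r"^ {0,3}" + re.escape(fence[0]) + "{%d,} *$" % len(fence)
    )
    content = []
    for i, line in enumerate(lines[1:], start=1):
        if close.match(line):
            return info, content, i + 1
        # Remove up to `indent` leading spaces, if present.
        removed = 0
        while removed < indent and line[removed:removed + 1] == " ":
            removed += 1
        content.append(line[removed:])
    # No closing fence: the block runs to the end of the containing block.
    return info, content, len(lines)
```

Because an unclosed block simply extends to the end of its container, the function never has to backtrack.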
1063
1064Here is a simple example with backticks:
1065
1066.
1067```
1068<
1069 >
1070```
1071.
1072<pre><code>&lt;
1073 &gt;
1074</code></pre>
1075.
1076
1077With tildes:
1078
1079.
1080~~~
1081<
1082 >
1083~~~
1084.
1085<pre><code>&lt;
1086 &gt;
1087</code></pre>
1088.
1089
1090The closing code fence must use the same character as the opening
1091fence:
1092
1093.
1094```
1095aaa
1096~~~
1097```
1098.
1099<pre><code>aaa
1100~~~
1101</code></pre>
1102.
1103
1104.
1105~~~
1106aaa
1107```
1108~~~
1109.
1110<pre><code>aaa
1111```
1112</code></pre>
1113.
1114
1115The closing code fence must be at least as long as the opening fence:
1116
1117.
1118````
1119aaa
1120```
1121``````
1122.
1123<pre><code>aaa
1124```
1125</code></pre>
1126.
1127
1128.
1129~~~~
1130aaa
1131~~~
1132~~~~
1133.
1134<pre><code>aaa
1135~~~
1136</code></pre>
1137.
1138
1139Unclosed code blocks are closed by the end of the document:
1140
1141.
1142```
1143.
1144<pre><code></code></pre>
1145.
1146
1147.
1148`````
1149
1150```
1151aaa
1152.
1153<pre><code>
1154```
1155aaa
1156</code></pre>
1157.
1158
1159A code block can have all empty lines as its content:
1160
1161.
1162```
1163
1164
1165```
1166.
1167<pre><code>
1168
1169</code></pre>
1170.
1171
1172A code block can be empty:
1173
1174.
1175```
1176```
1177.
1178<pre><code></code></pre>
1179.
1180
1181Fences can be indented.  If the opening fence is indented,
1182content lines will have equivalent opening indentation removed,
1183if present:
1184
1185.
1186 ```
1187 aaa
1188aaa
1189```
1190.
1191<pre><code>aaa
1192aaa
1193</code></pre>
1194.
1195
1196.
1197  ```
1198aaa
1199  aaa
1200aaa
1201  ```
1202.
1203<pre><code>aaa
1204aaa
1205aaa
1206</code></pre>
1207.
1208
1209.
1210   ```
1211   aaa
1212    aaa
1213  aaa
1214   ```
1215.
1216<pre><code>aaa
1217 aaa
1218aaa
1219</code></pre>
1220.
1221
1222Four spaces indentation produces an indented code block:
1223
1224.
1225    ```
1226    aaa
1227    ```
1228.
1229<pre><code>```
1230aaa
1231```
1232</code></pre>
1233.
1234
1235Code fences (opening and closing) cannot contain internal spaces:
1236
1237.
1238``` ```
1239aaa
1240.
1241<p><code></code>
1242aaa</p>
1243.
1244
1245.
1246~~~~~~
1247aaa
1248~~~ ~~
1249.
1250<pre><code>aaa
1251~~~ ~~
1252</code></pre>
1253.
1254
1255Fenced code blocks can interrupt paragraphs, and can be followed
1256directly by paragraphs, without a blank line between:
1257
1258.
1259foo
1260```
1261bar
1262```
1263baz
1264.
1265<p>foo</p>
1266<pre><code>bar
1267</code></pre>
1268<p>baz</p>
1269.
1270
1271Other blocks can also occur before and after fenced code blocks
1272without an intervening blank line:
1273
1274.
1275foo
1276---
1277~~~
1278bar
1279~~~
1280# baz
1281.
1282<h2>foo</h2>
1283<pre><code>bar
1284</code></pre>
1285<h1>baz</h1>
1286.
1287
1288An [info string](#info-string) can be provided after the opening code fence.
1289Opening and closing spaces will be stripped, and the first word, prefixed
1290with `language-`, is used as the value for the `class` attribute of the
1291`code` element within the enclosing `pre` element.
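
A small Python sketch of this rendering convention.  The name `code_open_tag` is hypothetical, and escaping the word with `html.escape` is an assumption of the sketch rather than something this section specifies.

```python
import html

def code_open_tag(info_string):
    """Build the opening <code> tag for a fenced code block."""
    words = info_string.strip(" ").split()
    if not words:
        # No info string: no class attribute.
        return "<code>"
    language = html.escape(words[0], quote=True)
    return '<code class="language-%s">' % language
```

Only the first word is used; anything after it, such as `startline=3`, is ignored by this sketch.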
1292
1293.
1294```ruby
1295def foo(x)
1296  return 3
1297end
1298```
1299.
1300<pre><code class="language-ruby">def foo(x)
1301  return 3
1302end
1303</code></pre>
1304.
1305
1306.
1307~~~~    ruby startline=3 $%@#$
1308def foo(x)
1309  return 3
1310end
1311~~~~~~~
1312.
1313<pre><code class="language-ruby">def foo(x)
1314  return 3
1315end
1316</code></pre>
1317.
1318
1319.
1320````;
1321````
1322.
1323<pre><code class="language-;"></code></pre>
1324.
1325
1326Info strings for backtick code blocks cannot contain backticks:
1327
1328.
1329``` aa ```
1330foo
1331.
1332<p><code>aa</code>
1333foo</p>
1334.
1335
1336Closing code fences cannot have info strings:
1337
1338.
1339```
1340``` aaa
1341```
1342.
1343<pre><code>``` aaa
1344</code></pre>
1345.
1346
1347
1348## HTML blocks
1349
1350An [HTML block tag](#html-block-tag) <a id="html-block-tag"></a> is
1351an [open tag](#open-tag) or [closing tag](#closing-tag) whose tag
1352name is one of the following (case-insensitive):
1353`article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `iframe`,
1354`body`, `li`, `map`, `button`, `object`, `canvas`, `ol`, `caption`,
1355`output`, `col`, `p`, `colgroup`, `pre`, `dd`, `progress`, `div`,
1356`section`, `dl`, `table`, `td`, `dt`, `tbody`, `embed`, `textarea`,
1357`fieldset`, `tfoot`, `figcaption`, `th`, `figure`, `thead`, `footer`,
1358`tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`,
1359`video`, `script`, `style`.
1360
1361An [HTML block](#html-block) <a id="html-block"></a> begins with an
1362[HTML block tag](#html-block-tag), [HTML comment](#html-comment),
1363[processing instruction](#processing-instruction),
1364[declaration](#declaration), or [CDATA section](#cdata-section).
1365It ends when a [blank line](#blank-line) or the end of the
1366input is encountered.  The initial line may be indented up to three
1367spaces, and subsequent lines may have any indentation.  The contents
1368of the HTML block are interpreted as raw HTML, and will not be escaped
1369in HTML output.
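
As a rough illustration, the start condition can be approximated by looking only at the beginning of a line.  The Python sketch below is an approximation under stated assumptions: `starts_html_block` is a hypothetical name, the tag list is copied from the paragraph above, and the precise definitions of open tag, closing tag, comment, processing instruction, declaration, and CDATA section are replaced by simple prefix checks.

```python
import re

HTML_BLOCK_TAGS = (
    "article header aside hgroup blockquote hr iframe body li map button "
    "object canvas ol caption output col p colgroup pre dd progress div "
    "section dl table td dt tbody embed textarea fieldset tfoot figcaption "
    "th figure thead footer tr form ul h1 h2 h3 h4 h5 h6 video script style"
).split()

HTML_BLOCK_START = re.compile(
    r" {0,3}<(?:"
    r"/?(?:%s)(?=[\s/>]|$)"  # open or closing block tag, possibly incomplete
    r"|!--"                  # HTML comment
    r"|\?"                   # processing instruction
    r"|![A-Z]"               # declaration
    r"|!\[CDATA\["           # CDATA section
    r")" % "|".join(HTML_BLOCK_TAGS),
    re.IGNORECASE,
)

def starts_html_block(line):
    """True if the line looks like the first line of an HTML block."""
    return HTML_BLOCK_START.match(line) is not None
```

Once a block has started, it continues until a blank line or the end of the input, so no tag matching is needed.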
1370
1371Some simple examples:
1372
1373.
1374<table>
1375  <tr>
1376    <td>
1377           hi
1378    </td>
1379  </tr>
1380</table>
1381
1382okay.
1383.
1384<table>
1385  <tr>
1386    <td>
1387           hi
1388    </td>
1389  </tr>
1390</table>
1391<p>okay.</p>
1392.
1393
1394.
1395 <div>
1396  *hello*
1397         <foo><a>
1398.
1399 <div>
1400  *hello*
1401         <foo><a>
1402.
1403
1404Here we have two HTML blocks with a Markdown paragraph between them:
1405
1406.
1407<DIV CLASS="foo">
1408
1409*Markdown*
1410
1411</DIV>
1412.
1413<DIV CLASS="foo">
1414<p><em>Markdown</em></p>
1415</DIV>
1416.
1417
1418In the following example, what looks like a Markdown code block
1419is actually part of the HTML block, which continues until a blank
1420line or the end of the document is reached:
1421
1422.
1423<div></div>
1424``` c
1425int x = 33;
1426```
1427.
1428<div></div>
1429``` c
1430int x = 33;
1431```
1432.
1433
1434A comment:
1435
1436.
1437<!-- Foo
1438bar
1439   baz -->
1440.
1441<!-- Foo
1442bar
1443   baz -->
1444.
1445
1446A processing instruction:
1447
1448.
1449<?php
1450  echo 'foo'
1451?>
1452.
1453<?php
1454  echo 'foo'
1455?>
1456.
1457
1458CDATA:
1459
1460.
1461<![CDATA[
1462function matchwo(a,b)
1463{
1464if (a < b && a < 0) then
1465  {
1466  return 1;
1467  }
1468else
1469  {
1470  return 0;
1471  }
1472}
1473]]>
1474.
1475<![CDATA[
1476function matchwo(a,b)
1477{
1478if (a < b && a < 0) then
1479  {
1480  return 1;
1481  }
1482else
1483  {
1484  return 0;
1485  }
1486}
1487]]>
1488.
1489
1490The opening tag can be indented 1-3 spaces, but not 4:
1491
1492.
1493  <!-- foo -->
1494
1495    <!-- foo -->
1496.
1497  <!-- foo -->
1498<pre><code>&lt;!-- foo --&gt;
1499</code></pre>
1500.
1501
1502An HTML block can interrupt a paragraph, and need not be preceded
1503by a blank line.
1504
1505.
1506Foo
1507<div>
1508bar
1509</div>
1510.
1511<p>Foo</p>
1512<div>
1513bar
1514</div>
1515.
1516
1517However, a following blank line is always needed, except at the end of
1518a document:
1519
1520.
1521<div>
1522bar
1523</div>
1524*foo*
1525.
1526<div>
1527bar
1528</div>
1529*foo*
1530.
1531
1532An incomplete HTML block tag may also start an HTML block:
1533
1534.
1535<div class
1536foo
1537.
1538<div class
1539foo
1540.
1541
1542This rule differs from John Gruber's original Markdown syntax
1543specification, which says:
1544
1545> The only restrictions are that block-level HTML elements —
1546> e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from
1547> surrounding content by blank lines, and the start and end tags of the
1548> block should not be indented with tabs or spaces.
1549
1550In some ways Gruber's rule is more restrictive than the one given
1551here:
1552
1553- It requires that an HTML block be preceded by a blank line.
1554- It does not allow the start tag to be indented.
1555- It requires a matching end tag, which it also does not allow to
1556  be indented.
1557
1558Indeed, most Markdown implementations, including some of Gruber's
1559own perl implementations, do not impose these restrictions.
1560
1561There is one respect, however, in which Gruber's rule is more liberal
1562than the one given here, since it allows blank lines to occur inside
1563an HTML block.  There are two reasons for disallowing them here.
1564First, it removes the need to parse balanced tags, which is
1565expensive and can require backtracking from the end of the document
1566if no matching end tag is found. Second, it provides a very simple
1567and flexible way of including Markdown content inside HTML tags:
1568simply separate the Markdown from the HTML using blank lines:
1569
1570.
1571<div>
1572
1573*Emphasized* text.
1574
1575</div>
1576.
1577<div>
1578<p><em>Emphasized</em> text.</p>
1579</div>
1580.
1581
1582Compare:
1583
1584.
1585<div>
1586*Emphasized* text.
1587</div>
1588.
1589<div>
1590*Emphasized* text.
1591</div>
1592.
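
Because a blank line always ends an HTML block, the end of a block can be found with a single forward scan and no tag matching. The following Python sketch illustrates the idea. The function name `html_block_end` and the list-of-lines input are assumptions made for this example, not part of the spec, and the caller is assumed to have already decided that `lines[start]` begins an HTML block.

```python
def html_block_end(lines, start):
    """Return the index just past the HTML block that begins at lines[start].

    Assumes lines[start] has already been recognized as the start of an
    HTML block; the block then runs up to the next blank line or to the
    end of the document, whichever comes first.
    """
    end = start
    while end < len(lines) and lines[end].strip() != "":
        end += 1
    return end


doc = ["<div>", "", "*Emphasized* text.", "", "</div>"]
print(html_block_end(doc, 0))
```

With the blank lines present, the block that starts at `<div>` contains only that one line, which leaves `*Emphasized* text.` to be parsed as an ordinary paragraph.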
1593
1594Some Markdown implementations have adopted a convention of
1595interpreting content inside tags as Markdown if the open tag has
1596the attribute `markdown=1`.  The rule given above seems a simpler and
1597more elegant way of achieving the same expressive power, and it is also
1598much simpler to parse.
1599
1600The main potential drawback is that one can no longer paste HTML
1601blocks into Markdown documents with 100% reliability.  However,
1602*in most cases* this will work fine, because the blank lines in
1603HTML are usually followed by HTML block tags.  For example:
1604
1605.
1606<table>
1607
1608<tr>
1609
1610<td>
1611Hi
1612</td>
1613
1614</tr>
1615
1616</table>
1617.
1618<table>
1619<tr>
1620<td>
1621Hi
1622</td>
1623</tr>
1624</table>
1625.
1626
1627Moreover, blank lines are usually not necessary and can be
1628deleted.  The exception is inside `<pre>` tags; here, one can
1629replace the blank lines with `&#10;` entities.
1630
1631So there is no important loss of expressive power with the new rule.
1632
1633## Link reference definitions
1634
1635A [link reference definition](#link-reference-definition)
1636<a id="link-reference-definition"></a> consists of a [link
1637label](#link-label), indented up to three spaces, followed
1638by a colon (`:`), optional blank space (including up to one
1639newline), a [link destination](#link-destination), optional
1640blank space (including up to one newline), and an optional [link
1641title](#link-title), which if it is present must be separated
1642from the [link destination](#link-destination) by whitespace.
1643No further non-space characters may occur on the line.
1644
1645A [link reference definition](#link-reference-definition)
1646does not correspond to a structural element of a document.  Instead, it
1647defines a label which can be used in [reference links](#reference-link)
1648and reference-style [images](#image) elsewhere in the document.  [Link
1649reference definitions] can come either before or after the links that use
1650them.
1651
1652.
1653[foo]: /url "title"
1654
1655[foo]
1656.
1657<p><a href="/url" title="title">foo</a></p>
1658.
1659
1660.
1661   [foo]:
1662      /url
1663           'the title'
1664
1665[foo]
1666.
1667<p><a href="/url" title="the title">foo</a></p>
1668.
1669
1670.
1671[Foo*bar\]]:my_(url) 'title (with parens)'
1672
1673[Foo*bar\]]
1674.
1675<p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
1676.
1677
1678.
1679[Foo bar]:
1680<my url>
1681'title'
1682
1683[Foo bar]
1684.
1685<p><a href="my%20url" title="title">Foo bar</a></p>
1686.
1687
1688The title may be omitted:
1689
1690.
1691[foo]:
1692/url
1693
1694[foo]
1695.
1696<p><a href="/url">foo</a></p>
1697.
1698
1699The link destination may not be omitted:
1700
1701.
1702[foo]:
1703
1704[foo]
1705.
1706<p>[foo]:</p>
1707<p>[foo]</p>
1708.
1709
1710A link can come before its corresponding definition:
1711
1712.
1713[foo]
1714
1715[foo]: url
1716.
1717<p><a href="url">foo</a></p>
1718.
1719
1720If there are several matching definitions, the first one takes
1721precedence:
1722
1723.
1724[foo]
1725
1726[foo]: first
1727[foo]: second
1728.
1729<p><a href="first">foo</a></p>
1730.
1731
1732As noted in the section on [Links], matching of labels is
1733case-insensitive (see [matches](#matches)).
1734
1735.
1736[FOO]: /url
1737
1738[Foo]
1739.
1740<p><a href="/url">Foo</a></p>
1741.
1742
1743.
1744[ΑΓΩ]: /φου
1745
1746[αγω]
1747.
1748<p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p>
1749.
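
The precedence and case-insensitivity rules can be sketched as a lookup table keyed by a normalized label. This is only an illustration: the names `normalize_label` and `build_reference_map` are hypothetical, the input is assumed to be already-parsed `(label, destination, title)` triples, Python's `str.casefold` stands in for Unicode case folding, and the whitespace collapsing is an assumption that this section does not itself state.

```python
def normalize_label(label):
    """Case-fold a label and collapse runs of whitespace (assumed normalization)."""
    return " ".join(label.casefold().split())


def build_reference_map(definitions):
    """Map normalized labels to (destination, title); the first definition wins."""
    refmap = {}
    for label, destination, title in definitions:
        # setdefault leaves an existing entry untouched, so a later
        # definition with a matching label is ignored.
        refmap.setdefault(normalize_label(label), (destination, title))
    return refmap


refs = build_reference_map([
    ("foo", "first", None),
    ("foo", "second", None),
    ("ΑΓΩ", "/φου", None),
])
print(refs[normalize_label("FOO")])
print(refs[normalize_label("αγω")])
```

Since the table is filled before any link is resolved, a definition can be found whether it comes before or after the link that uses it.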
1750
1751Here is a link reference definition with no corresponding link.
1752It contributes nothing to the document.
1753
1754.
1755[foo]: /url
1756.
1757.
1758
1759This is not a link reference definition, because there are
1760non-space characters after the title:
1761
1762.
1763[foo]: /url "title" ok
1764.
1765<p>[foo]: /url &quot;title&quot; ok</p>
1766.
1767
1768This is not a link reference definition, because it is indented
1769four spaces:
1770
1771.
1772    [foo]: /url "title"
1773
1774[foo]
1775.
1776<pre><code>[foo]: /url &quot;title&quot;
1777</code></pre>
1778<p>[foo]</p>
1779.
1780
1781This is not a link reference definition, because it occurs inside
1782a code block:
1783
1784.
1785```
1786[foo]: /url
1787```
1788
1789[foo]
1790.
1791<pre><code>[foo]: /url
1792</code></pre>
1793<p>[foo]</p>
1794.
1795
1796A [link reference definition](#link-reference-definition) cannot
1797interrupt a paragraph.
1798
1799.
1800Foo
1801[bar]: /baz
1802
1803[bar]
1804.
1805<p>Foo
1806[bar]: /baz</p>
1807<p>[bar]</p>
1808.
1809
1810However, it can directly follow other block elements, such as headers
1811and horizontal rules, and it need not be followed by a blank line.
1812
1813.
1814# [Foo]
1815[foo]: /url
1816> bar
1817.
1818<h1><a href="/url">Foo</a></h1>
1819<blockquote>
1820<p>bar</p>
1821</blockquote>
1822.
1823
1824Several [link reference definitions](#link-reference-definition) can occur one after another,
1825without intervening blank lines.
1826
1827.
1828[foo]: /foo-url "foo"
1829[bar]: /bar-url
1830  "bar"
1831[baz]: /baz-url
1832
1833[foo],
1834[bar],
1835[baz]
1836.
1837<p><a href="/foo-url" title="foo">foo</a>,
1838<a href="/bar-url" title="bar">bar</a>,
1839<a href="/baz-url">baz</a></p>
1840.
1841
1842[Link reference definitions](#link-reference-definition) can occur
1843inside block containers, like lists and block quotations.  They
1844affect the entire document, not just the container in which they
1845are defined:
1846
1847.
1848[foo]
1849
1850> [foo]: /url
1851.
1852<p><a href="/url">foo</a></p>
1853<blockquote>
1854</blockquote>
1855.
1856
1857
1858## Paragraphs
1859
1860A sequence of non-blank lines that cannot be interpreted as other
1861kinds of blocks forms a [paragraph](#paragraph).<a id="paragraph"></a>
1862The contents of the paragraph are the result of parsing the
1863paragraph's raw content as inlines.  The paragraph's raw content
1864is formed by concatenating the lines and removing initial and final
1865spaces.
1866
1867A simple example with two paragraphs:
1868
1869.
1870aaa
1871
1872bbb
1873.
1874<p>aaa</p>
1875<p>bbb</p>
1876.
1877
1878Paragraphs can contain multiple lines, but no blank lines:
1879
1880.
1881aaa
1882bbb
1883
1884ccc
1885ddd
1886.
1887<p>aaa
1888bbb</p>
1889<p>ccc
1890ddd</p>
1891.
1892
1893Multiple blank lines between paragraphs have no effect:
1894
1895.
1896aaa
1897
1898
1899bbb
1900.
1901<p>aaa</p>
1902<p>bbb</p>
1903.
1904
1905Leading spaces are skipped:
1906
1907.
1908  aaa
1909 bbb
1910.
1911<p>aaa
1912bbb</p>
1913.
1914
1915Lines after the first may be indented any amount, since indented
1916code blocks cannot interrupt paragraphs.
1917
1918.
1919aaa
1920             bbb
1921                                       ccc
1922.
1923<p>aaa
1924bbb
1925ccc</p>
1926.
1927
1928However, the first line may be indented at most three spaces,
1929or an indented code block will be triggered:
1930
1931.
1932   aaa
1933bbb
1934.
1935<p>aaa
1936bbb</p>
1937.
1938
1939.
1940    aaa
1941bbb
1942.
1943<pre><code>aaa
1944</code></pre>
1945<p>bbb</p>
1946.
1947
1948Final spaces are stripped before inline parsing, so a paragraph
1949that ends with two or more spaces will not end with a hard line
1950break:
1951
1952.
1953aaa     
1954bbb     
1955.
1956<p>aaa<br />
1957bbb</p>
1958.
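
As a rough illustration of how a paragraph's raw content is assembled, here is a minimal Python sketch. The function name `paragraph_raw_content` is an assumption for the example, and the input is assumed to be the lines already known to belong to one paragraph. Spaces at the end of interior lines are kept, so that the inline parser can still recognize hard line breaks; only the spaces at the very end of the paragraph are removed.

```python
def paragraph_raw_content(lines):
    """Join a paragraph's lines into its raw content.

    Leading spaces on every line are skipped, and the spaces at the end
    of the paragraph are stripped before inline parsing.
    """
    joined = "\n".join(line.lstrip(" ") for line in lines)
    return joined.rstrip(" ")


print(repr(paragraph_raw_content(["  aaa", " bbb"])))
```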
1959
1960## Blank lines
1961
1962[Blank lines](#blank-line) between block-level elements are ignored,
1963except for the role they play in determining whether a [list](#list)
1964is [tight](#tight) or [loose](#loose).
1965
1966Blank lines at the beginning and end of the document are also ignored.
1967
1968.
1969
1970
1971aaa
1972
1973
1974# aaa
1975
1976
1977.
1978<p>aaa</p>
1979<h1>aaa</h1>
1980.
1981
1982
1983# Container blocks
1984
1985A [container block](#container-block) is a block that has other
1986blocks as its contents.  There are two basic kinds of container blocks:
1987[block quotes](#block-quote) and [list items](#list-item).
1988[Lists](#list) are meta-containers for [list items](#list-item).
1989
1990We define the syntax for container blocks recursively.  The general
1991form of the definition is:
1992
1993> If X is a sequence of blocks, then the result of
1994> transforming X in such-and-such a way is a container of type Y
1995> with these blocks as its content.
1996
1997So, we explain what counts as a block quote or list item by explaining
1998how these can be *generated* from their contents. This should suffice
1999to define the syntax, although it does not give a recipe for *parsing*
2000these constructions.  (A recipe is provided below in the section entitled
2001[A parsing strategy](#appendix-a-a-parsing-strategy).)
2002
2003## Block quotes
2004
2005A [block quote marker](#block-quote-marker) <a id="block-quote-marker"></a>
2006consists of 0-3 spaces of initial indent, plus (a) the character `>` together
2007with a following space, or (b) a single character `>` not followed by a space.
2008
2009The following rules define [block quotes](#block-quote):
2010<a id="block-quote"></a>
2011
20121.  **Basic case.**  If a string of lines *Ls* constitute a sequence
2013    of blocks *Bs*, then the result of prepending a [block quote
2014    marker](#block-quote-marker) to the beginning of each line in *Ls*
2015    is a [block quote](#block-quote) containing *Bs*.
2016
20172.  **Laziness.**  If a string of lines *Ls* constitute a [block
2018    quote](#block-quote) with contents *Bs*, then the result of deleting
2019    the initial [block quote marker](#block-quote-marker) from one or
2020    more lines in which the next non-space character after the [block
2021    quote marker](#block-quote-marker) is [paragraph continuation
2022    text](#paragraph-continuation-text) is a block quote with *Bs* as
2023    its content.  <a id="paragraph-continuation-text"></a>
2024    [Paragraph continuation text](#paragraph-continuation-text) is text
2025    that will be parsed as part of the content of a paragraph, but does
2026    not occur at the beginning of the paragraph.
2027
20283.  **Consecutiveness.**  A document cannot contain two [block
2029    quotes](#block-quote) in a row unless there is a [blank
2030    line](#blank-line) between them.
2031
2032Nothing else counts as a [block quote](#block-quote).
2033
2034Here is a simple example:
2035
2036.
2037> # Foo
2038> bar
2039> baz
2040.
2041<blockquote>
2042<h1>Foo</h1>
2043<p>bar
2044baz</p>
2045</blockquote>
2046.
2047
2048The spaces after the `>` characters can be omitted:
2049
2050.
2051># Foo
2052>bar
2053> baz
2054.
2055<blockquote>
2056<h1>Foo</h1>
2057<p>bar
2058baz</p>
2059</blockquote>
2060.
2061
2062The `>` characters can be indented 1-3 spaces:
2063
2064.
2065   > # Foo
2066   > bar
2067 > baz
2068.
2069<blockquote>
2070<h1>Foo</h1>
2071<p>bar
2072baz</p>
2073</blockquote>
2074.
2075
2076Four spaces give us a code block:
2077
2078.
2079    > # Foo
2080    > bar
2081    > baz
2082.
2083<pre><code>&gt; # Foo
2084&gt; bar
2085&gt; baz
2086</code></pre>
2087.
2088
2089The Laziness clause allows us to omit the `>` before a
2090paragraph continuation line:
2091
2092.
2093> # Foo
2094> bar
2095baz
2096.
2097<blockquote>
2098<h1>Foo</h1>
2099<p>bar
2100baz</p>
2101</blockquote>
2102.
2103
2104A block quote can contain some lazy and some non-lazy
2105continuation lines:
2106
2107.
2108> bar
2109baz
2110> foo
2111.
2112<blockquote>
2113<p>bar
2114baz
2115foo</p>
2116</blockquote>
2117.
2118
2119Laziness only applies to lines that are continuations of
2120paragraphs. Lines containing characters or indentation that indicate
2121block structure cannot be lazy.
2122
2123.
2124> foo
2125---
2126.
2127<blockquote>
2128<p>foo</p>
2129</blockquote>
2130<hr />
2131.
2132
2133.
2134> - foo
2135- bar
2136.
2137<blockquote>
2138<ul>
2139<li>foo</li>
2140</ul>
2141</blockquote>
2142<ul>
2143<li>bar</li>
2144</ul>
2145.
2146
2147.
2148>     foo
2149    bar
2150.
2151<blockquote>
2152<pre><code>foo
2153</code></pre>
2154</blockquote>
2155<pre><code>bar
2156</code></pre>
2157.
2158
2159.
2160> ```
2161foo
2162```
2163.
2164<blockquote>
2165<pre><code></code></pre>
2166</blockquote>
2167<p>foo</p>
2168<pre><code></code></pre>
2169.
2170
2171A block quote can be empty:
2172
2173.
2174>
2175.
2176<blockquote>
2177</blockquote>
2178.
2179
2180.
2181>
2182>
2183>
2184.
2185<blockquote>
2186</blockquote>
2187.
2188
2189A block quote can have initial or final blank lines:
2190
2191.
2192>
2193> foo
2194>
2195.
2196<blockquote>
2197<p>foo</p>
2198</blockquote>
2199.
2200
2201A blank line always separates block quotes:
2202
2203.
2204> foo
2205
2206> bar
2207.
2208<blockquote>
2209<p>foo</p>
2210</blockquote>
2211<blockquote>
2212<p>bar</p>
2213</blockquote>
2214.
2215
2216(Most current Markdown implementations, including John Gruber's
2217original `Markdown.pl`, will parse this example as a single block quote
2218with two paragraphs.  But it seems better to allow the author to decide
2219whether two block quotes or one are wanted.)
2220
2221Consecutiveness means that if we put these block quotes together,
2222we get a single block quote:
2223
2224.
2225> foo
2226> bar
2227.
2228<blockquote>
2229<p>foo
2230bar</p>
2231</blockquote>
2232.
2233
2234To get a block quote with two paragraphs, use:
2235
2236.
2237> foo
2238>
2239> bar
2240.
2241<blockquote>
2242<p>foo</p>
2243<p>bar</p>
2244</blockquote>
2245.
2246
2247Block quotes can interrupt paragraphs:
2248
2249.
2250foo
2251> bar
2252.
2253<p>foo</p>
2254<blockquote>
2255<p>bar</p>
2256</blockquote>
2257.
2258
2259In general, blank lines are not needed before or after block
2260quotes:
2261
2262.
2263> aaa
2264***
2265> bbb
2266.
2267<blockquote>
2268<p>aaa</p>
2269</blockquote>
2270<hr />
2271<blockquote>
2272<p>bbb</p>
2273</blockquote>
2274.
2275
2276However, because of laziness, a blank line is needed between
2277a block quote and a following paragraph:
2278
2279.
2280> bar
2281baz
2282.
2283<blockquote>
2284<p>bar
2285baz</p>
2286</blockquote>
2287.
2288
2289.
2290> bar
2291
2292baz
2293.
2294<blockquote>
2295<p>bar</p>
2296</blockquote>
2297<p>baz</p>
2298.
2299
2300.
2301> bar
2302>
2303baz
2304.
2305<blockquote>
2306<p>bar</p>
2307</blockquote>
2308<p>baz</p>
2309.
2310
2311It is a consequence of the Laziness rule that any number
2312of initial `>`s may be omitted on a continuation line of a
2313nested block quote:
2314
2315.
2316> > > foo
2317bar
2318.
2319<blockquote>
2320<blockquote>
2321<blockquote>
2322<p>foo
2323bar</p>
2324</blockquote>
2325</blockquote>
2326</blockquote>
2327.
2328
2329.
2330>>> foo
2331> bar
2332>>baz
2333.
2334<blockquote>
2335<blockquote>
2336<blockquote>
2337<p>foo
2338bar
2339baz</p>
2340</blockquote>
2341</blockquote>
2342</blockquote>
2343.
2344
2345When including an indented code block in a block quote,
2346remember that the [block quote marker](#block-quote-marker) includes
2347both the `>` and a following space.  So *five spaces* are needed after
2348the `>`:
2349
2350.
2351>     code
2352
2353>    not code
2354.
2355<blockquote>
2356<pre><code>code
2357</code></pre>
2358</blockquote>
2359<blockquote>
2360<p>not code</p>
2361</blockquote>
2362.
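
The marker definition can be made concrete with a short Python sketch that peels block quote markers off the front of a line. The names `MARKER` and `split_quote_markers` are assumptions for this example. The sketch only counts markers; it does not decide whether a line with fewer markers than the currently open block quotes is a lazy continuation, since that also depends on whether the remaining text is paragraph continuation text.

```python
import re

# 0-3 spaces of indent, then ">", then at most one following space.
MARKER = re.compile(r" {0,3}> ?")


def split_quote_markers(line):
    """Return (depth, rest): how many block quote markers open the line,
    and the text that remains once they are removed."""
    depth = 0
    while True:
        match = MARKER.match(line)
        if match is None:
            return depth, line
        depth += 1
        line = line[match.end():]


print(split_quote_markers(">     code"))
print(split_quote_markers(">    not code"))
```

Because the marker consumes one space after the `>`, five spaces leave the four that an indented code block needs, while four spaces leave only three.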
2363
2364
2365## List items
2366
2367A [list marker](#list-marker) <a id="list-marker"></a> is a
2368[bullet list marker](#bullet-list-marker) or an [ordered list
2369marker](#ordered-list-marker).
2370
2371A [bullet list marker](#bullet-list-marker) <a id="bullet-list-marker"></a>
2372is a `-`, `+`, or `*` character.
2373
2374An [ordered list marker](#ordered-list-marker) <a id="ordered-list-marker"></a>
2375is a sequence of one or more digits (`0-9`), followed by either a
2376`.` character or a `)` character.
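
The two kinds of list marker can be told apart with a pair of regular expressions. The Python sketch below is illustrative only; the name `classify_list_marker` and its return shape are assumptions. For an ordered marker it also returns the number formed by the digits, which is the natural basis for a start number.

```python
import re

BULLET = re.compile(r"[-+*]")
# [0-9] rather than \d, so that only ASCII digits are accepted.
ORDERED = re.compile(r"([0-9]+)([.)])")


def classify_list_marker(marker):
    """Return ("bullet", None), ("ordered", number), or None if the
    string is not a list marker."""
    if BULLET.fullmatch(marker):
        return ("bullet", None)
    match = ORDERED.fullmatch(marker)
    if match:
        return ("ordered", int(match.group(1)))
    return None


print(classify_list_marker("10)"))
```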
2377
2378The following rules define [list items](#list-item):
2379
23801.  **Basic case.**  If a sequence of lines *Ls* constitute a sequence of
2381    blocks *Bs* starting with a non-space character and not separated
2382    from each other by more than one blank line, and *M* is a list
2383    marker of width *W* followed by 0 < *N* < 5 spaces, then the result
2384    of prepending *M* and the following spaces to the first line of
2385    *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a
2386    list item with *Bs* as its contents.  The type of the list item
2387    (bullet or ordered) is determined by the type of its list marker.
2388    If the list item is ordered, then it is also assigned a start
2389    number, based on the ordered list marker.
2390
2391For example, let *Ls* be the lines
2392
2393.
2394A paragraph
2395with two lines.
2396
2397    indented code
2398
2399> A block quote.
2400.
2401<p>A paragraph
2402with two lines.</p>
2403<pre><code>indented code
2404</code></pre>
2405<blockquote>
2406<p>A block quote.</p>
2407</blockquote>
2408.
2409
2410And let *M* be the marker `1.`, and *N* = 2.  Then rule #1 says
2411that the following is an ordered list item with start number 1,
2412and the same contents as *Ls*:
2413
2414.
24151.  A paragraph
2416    with two lines.
2417
2418        indented code
2419
2420    > A block quote.
2421.
2422<ol>
2423<li><p>A paragraph
2424with two lines.</p>
2425<pre><code>indented code
2426</code></pre>
2427<blockquote>
2428<p>A block quote.</p>
2429</blockquote></li>
2430</ol>
2431.
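
Rule #1 is stated as a way of generating a list item from its contents, and that direction is easy to sketch in Python. The function name `make_list_item` is hypothetical, and leaving blank lines empty rather than padding them with spaces is an assumption of this sketch.

```python
def make_list_item(lines, marker, n):
    """Apply rule #1: prefix the first line with the marker and n spaces,
    and indent the remaining lines by len(marker) + n spaces."""
    if not 0 < n < 5:
        raise ValueError("rule #1 requires 1 to 4 spaces after the marker")
    indent = " " * (len(marker) + n)
    first = marker + " " * n + lines[0]
    # Blank lines are left empty rather than padded (an assumption).
    rest = [indent + line if line.strip() else "" for line in lines[1:]]
    return [first] + rest


ls = [
    "A paragraph",
    "with two lines.",
    "",
    "    indented code",
    "",
    "> A block quote.",
]
print("\n".join(make_list_item(ls, "1.", 2)))
```

With *M* = `1.` and *N* = 2 this reproduces the ordered list item shown above: every line after the first is shifted right by *W + N* = 4 spaces.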
2432
2433The most important thing to notice is that the position of
2434the text after the list marker determines how much indentation
2435is needed in subsequent blocks in the list item.  If the list
2436marker takes up two spaces, and there are three spaces between
2437the list marker and the next nonspace character, then blocks
2438must be indented five spaces in order to fall under the list
2439item.
2440
2441Here are some examples showing how far content must be indented to be
2442put under the list item:
2443
2444.
2445- one
2446
2447 two
2448.
2449<ul>
2450<li>one</li>
2451</ul>
2452<p>two</p>
2453.
2454
2455.
2456- one
2457
2458  two
2459.
2460<ul>
2461<li><p>one</p>
2462<p>two</p></li>
2463</ul>
2464.
2465
2466.
2467 -    one
2468
2469     two
2470.
2471<ul>
2472<li>one</li>
2473</ul>
2474<pre><code> two
2475</code></pre>
2476.
2477
2478.
2479 -    one
2480
2481      two
2482.
2483<ul>
2484<li><p>one</p>
2485<p>two</p></li>
2486</ul>
2487.
2488
2489It is tempting to think of this in terms of columns:  the continuation
2490blocks must be indented at least to the column of the first nonspace
2491character after the list marker.  However, that is not quite right.
2492The spaces after the list marker determine how much relative indentation
2493is needed.  Which column this indentation reaches will depend on
2494how the list item is embedded in other constructions, as shown by
2495this example:
2496
2497.
2498   > > 1.  one
2499>>
2500>>     two
2501.
2502<blockquote>
2503<blockquote>
2504<ol>
2505<li><p>one</p>
2506<p>two</p></li>
2507</ol>
2508</blockquote>
2509</blockquote>
2510.
2511
2512Here `two` occurs in the same column as the list marker `1.`,
2513but is actually contained in the list item, because there is
2514sufficient indentation after the last containing blockquote marker.
2515
2516The converse is also possible.  In the following example, the word `two`
2517occurs far to the right of the initial text of the list item, `one`, but
2518it is not considered part of the list item, because it is not indented
2519far enough past the blockquote marker:
2520
2521.
2522>>- one
2523>>
2524  >  > two
2525.
2526<blockquote>
2527<blockquote>
2528<ul>
2529<li>one</li>
2530</ul>
2531<p>two</p>
2532</blockquote>
2533</blockquote>
2534.
2535
2536A list item may not contain blocks that are separated by more than
2537one blank line.  Thus, two blank lines will end a list, unless the
2538two blanks are contained in a [fenced code block](#fenced-code-block).
2539
2540.
2541- foo
2542
2543  bar
2544
2545- foo
2546
2547
2548  bar
2549
2550- ```
2551  foo
2552
2553
2554  bar
2555  ```
2556.
2557<ul>
2558<li><p>foo</p>
2559<p>bar</p></li>
2560<li><p>foo</p></li>
2561</ul>
2562<p>bar</p>
2563<ul>
2564<li><pre><code>foo
2565
2566
2567bar
2568</code></pre></li>
2569</ul>
2570.
2571
2572A list item may contain any kind of block:
2573
2574.
25751.  foo
2576
2577    ```
2578    bar
2579    ```
2580
2581    baz
2582
2583    > bam
2584.
2585<ol>
2586<li><p>foo</p>
2587<pre><code>bar
2588</code></pre>
2589<p>baz</p>
2590<blockquote>
2591<p>bam</p>
2592</blockquote></li>
2593</ol>
2594.
2595
25962.  **Item starting with indented code.**  If a sequence of lines *Ls*
2597    constitute a sequence of blocks *Bs* starting with an indented code
2598    block and not separated from each other by more than one blank line,
2599    and *M* is a list marker of width *W* followed by
2600    one space, then the result of prepending *M* and the following
2601    space to the first line of *Ls*, and indenting subsequent lines of
2602    *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents.
2603    If a line is empty, then it need not be indented.  The type of the
2604    list item (bullet or ordered) is determined by the type of its list
2605    marker.  If the list item is ordered, then it is also assigned a
2606    start number, based on the ordered list marker.
2607
2608An indented code block will have to be indented four spaces beyond
2609the edge of the region where text will be included in the list item.
2610In the following case that is 6 spaces:
2611
2612.
2613- foo
2614
2615      bar
2616.
2617<ul>
2618<li><p>foo</p>
2619<pre><code>bar
2620</code></pre></li>
2621</ul>
2622.
2623
2624And in this case it is 11 spaces:
2625
2626.
2627  10.  foo
2628
2629           bar
2630.
2631<ol start="10">
2632<li><p>foo</p>
2633<pre><code>bar
2634</code></pre></li>
2635</ol>
2636.
2637
2638If the *first* block in the list item is an indented code block,
2639then by rule #2, the contents must be indented *one* space after the
2640list marker:
2641
2642.
2643    indented code
2644
2645paragraph
2646
2647    more code
2648.
2649<pre><code>indented code
2650</code></pre>
2651<p>paragraph</p>
2652<pre><code>more code
2653</code></pre>
2654.
2655
2656.
26571.     indented code
2658
2659   paragraph
2660
2661       more code
2662.
2663<ol>
2664<li><pre><code>indented code
2665</code></pre>
2666<p>paragraph</p>
2667<pre><code>more code
2668</code></pre></li>
2669</ol>
2670.
2671
2672Note that an additional space indent is interpreted as space
2673inside the code block:
2674
2675.
26761.      indented code
2677
2678   paragraph
2679
2680       more code
2681.
2682<ol>
2683<li><pre><code> indented code
2684</code></pre>
2685<p>paragraph</p>
2686<pre><code>more code
2687</code></pre></li>
2688</ol>
2689.
2690
2691Note that rules #1 and #2 only apply to two cases:  (a) cases
2692in which the lines to be included in a list item begin with a nonspace
2693character, and (b) cases in which they begin with an indented code
2694block.  In a case like the following, where the first block begins with
2695a three-space indent, the rules do not allow us to form a list item by
2696indenting the whole thing and prepending a list marker:
2697
2698.
2699   foo
2700
2701bar
2702.
2703<p>foo</p>
2704<p>bar</p>
2705.
2706
2707.
2708-    foo
2709
2710  bar
2711.
2712<ul>
2713<li>foo</li>
2714</ul>
2715<p>bar</p>
2716.
2717
2718This is not a significant restriction, because when a block begins
2719with 1-3 spaces indent, the indentation can always be removed without
2720a change in interpretation, allowing rule #1 to be applied.  So, in
2721the above case:
2722
2723.
2724-  foo
2725
2726   bar
2727.
2728<ul>
2729<li><p>foo</p>
2730<p>bar</p></li>
2731</ul>
2732.
2733
2734
27353.  **Indentation.**  If a sequence of lines *Ls* constitutes a list item
2736    according to rule #1 or #2, then the result of indenting each line
2737    of *Ls* by 1-3 spaces (the same for each line) also constitutes a
2738    list item with the same contents and attributes.  If a line is
2739    empty, then it need not be indented.
2740
2741Indented one space:
2742
2743.
2744 1.  A paragraph
2745     with two lines.
2746
2747         indented code
2748
2749     > A block quote.
2750.
2751<ol>
2752<li><p>A paragraph
2753with two lines.</p>
2754<pre><code>indented code
2755</code></pre>
2756<blockquote>
2757<p>A block quote.</p>
2758</blockquote></li>
2759</ol>
2760.
2761
2762Indented two spaces:
2763
2764.
2765  1.  A paragraph
2766      with two lines.
2767
2768          indented code
2769
2770      > A block quote.
2771.
2772<ol>
2773<li><p>A paragraph
2774with two lines.</p>
2775<pre><code>indented code
2776</code></pre>
2777<blockquote>
2778<p>A block quote.</p>
2779</blockquote></li>
2780</ol>
2781.
2782
2783Indented three spaces:
2784
2785.
2786   1.  A paragraph
2787       with two lines.
2788
2789           indented code
2790
2791       > A block quote.
2792.
2793<ol>
2794<li><p>A paragraph
2795with two lines.</p>
2796<pre><code>indented code
2797</code></pre>
2798<blockquote>
2799<p>A block quote.</p>
2800</blockquote></li>
2801</ol>
2802.
2803
2804Four spaces indent gives a code block:
2805
2806.
2807    1.  A paragraph
2808        with two lines.
2809
2810            indented code
2811
2812        > A block quote.
2813.
2814<pre><code>1.  A paragraph
2815    with two lines.
2816
2817        indented code
2818
2819    &gt; A block quote.
2820</code></pre>
2821.
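
Read in the parsing direction, rules #1 to #3 determine how far later blocks must be indented to belong to a list item. The Python sketch below computes that amount from the item's first line. It is a simplification under stated assumptions: the names `LIST_ITEM_START` and `content_indent` are hypothetical, empty list items are not handled, and the line is assumed to have had any enclosing block quote markers removed already, so the result is relative to the containing block.

```python
import re

# Optional 0-3 space indent (rule #3), a list marker, one or more spaces,
# and then a non-space character that starts the item's content.
LIST_ITEM_START = re.compile(r"( {0,3})([-+*]|[0-9]+[.)])( +)(?=\S)")


def content_indent(first_line):
    """Return the indentation later blocks need in order to fall under
    the list item opened by first_line, or None if no item starts there."""
    match = LIST_ITEM_START.match(first_line)
    if match is None:
        return None
    indent, width, spaces = (len(group) for group in match.groups())
    if spaces > 4:
        # Rule #2: the content begins with indented code, so only one of
        # the spaces after the marker counts toward the item's indentation.
        spaces = 1
    return indent + width + spaces


print(content_indent("  10.  foo"))
```

For `  10.  foo` the result is 7, so an indented code block inside that item needs 7 + 4 = 11 spaces, in line with the example given earlier.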
2822
2823
28244.  **Laziness.**  If a string of lines *Ls* constitute a [list
2825    item](#list-item) with contents *Bs*, then the result of deleting
2826    some or all of the indentation from one or more lines in which the
2827    next non-space character after the indentation is
2828    [paragraph continuation text](#paragraph-continuation-text) is a
2829    list item with the same contents and attributes.
2830
2831Here is an example with lazy continuation lines:
2832
2833.
2834  1.  A paragraph
2835with two lines.
2836
2837          indented code
2838
2839      > A block quote.
2840.
2841<ol>
2842<li><p>A paragraph
2843with two lines.</p>
2844<pre><code>indented code
2845</code></pre>
2846<blockquote>
2847<p>A block quote.</p>
2848</blockquote></li>
2849</ol>
2850.
2851
2852Indentation can be partially deleted:
2853
2854.
2855  1.  A paragraph
2856    with two lines.
2857.
2858<ol>
2859<li>A paragraph
2860with two lines.</li>
2861</ol>
2862.
2863
2864These examples show how laziness can work in nested structures:
2865
2866.
2867> 1. > Blockquote
2868continued here.
2869.
2870<blockquote>
2871<ol>
2872<li><blockquote>
2873<p>Blockquote
2874continued here.</p>
2875</blockquote></li>
2876</ol>
2877</blockquote>
2878.
2879
2880.
2881> 1. > Blockquote
2882> continued here.
2883.
2884<blockquote>
2885<ol>
2886<li><blockquote>
2887<p>Blockquote
2888continued here.</p>
2889</blockquote></li>
2890</ol>
2891</blockquote>
2892.
2893
2894
28955.  **That's all.** Nothing that is not counted as a list item by rules
2896    #1--4 counts as a [list item](#list-item).
2897
2898The rules for sublists follow from the general rules above.  A sublist
2899must be indented the same number of spaces a paragraph would need to be
2900in order to be included in the list item.
2901
2902So, in this case we need two spaces indent:
2903
2904.
2905- foo
2906  - bar
2907    - baz
2908.
2909<ul>
2910<li>foo
2911<ul>
2912<li>bar
2913<ul>
2914<li>baz</li>
2915</ul></li>
2916</ul></li>
2917</ul>
2918.
2919
2920One is not enough:
2921
2922.
2923- foo
2924 - bar
2925  - baz
2926.
2927<ul>
2928<li>foo</li>
2929<li>bar</li>
2930<li>baz</li>
2931</ul>
2932.
2933
2934Here we need four, because the list marker is wider:
2935
2936.
293710) foo
2938    - bar
2939.
2940<ol start="10">
2941<li>foo
2942<ul>
2943<li>bar</li>
2944</ul></li>
2945</ol>
2946.
2947
2948Three is not enough:
2949
2950.
295110) foo
2952   - bar
2953.
2954<ol start="10">
2955<li>foo</li>
2956</ol>
2957<ul>
2958<li>bar</li>
2959</ul>
2960.
2961
2962A list may be the first block in a list item:
2963
2964.
2965- - foo
2966.
2967<ul>
2968<li><ul>
2969<li>foo</li>
2970</ul></li>
2971</ul>
2972.
2973
2974.
29751. - 2. foo
2976.
2977<ol>
2978<li><ul>
2979<li><ol start="2">
2980<li>foo</li>
2981</ol></li>
2982</ul></li>
2983</ol>
2984.
2985
2986A list item may be empty:
2987
2988.
2989- foo
2990-
2991- bar
2992.
2993<ul>
2994<li>foo</li>
2995<li></li>
2996<li>bar</li>
2997</ul>
2998.
2999
3000.
3001-
3002.
3003<ul>
3004<li></li>
3005</ul>
3006.
3007
3008### Motivation
3009
3010John Gruber's Markdown spec says the following about list items:
3011
30121. "List markers typically start at the left margin, but may be indented
3013   by up to three spaces. List markers must be followed by one or more
3014   spaces or a tab."
3015
30162. "To make lists look nice, you can wrap items with hanging indents....
3017   But if you don't want to, you don't have to."
3018
30193. "List items may consist of multiple paragraphs. Each subsequent
3020   paragraph in a list item must be indented by either 4 spaces or one
3021   tab."
3022
30234. "It looks nice if you indent every line of the subsequent paragraphs,
3024   but here again, Markdown will allow you to be lazy."
3025
30265. "To put a blockquote within a list item, the blockquote's `>`
3027   delimiters need to be indented."
3028
30296. "To put a code block within a list item, the code block needs to be
3030   indented twice — 8 spaces or two tabs."
3031
3032These rules specify that a paragraph under a list item must be indented
3033four spaces (presumably, from the left margin, rather than the start of
3034the list marker, but this is not said), and that code under a list item
3035must be indented eight spaces instead of the usual four.  They also say
3036that a block quote must be indented, but not by how much; however, the
3037example given has four spaces indentation.  Although nothing is said
3038about other kinds of block-level content, it is certainly reasonable to
3039infer that *all* block elements under a list item, including other
3040lists, must be indented four spaces.  This principle has been called the
3041*four-space rule*.
3042
3043The four-space rule is clear and principled, and if the reference
3044implementation `Markdown.pl` had followed it, it probably would have
3045become the standard.  However, `Markdown.pl` allowed paragraphs and
3046sublists to start with only two spaces indentation, at least on the
3047outer level.  Worse, its behavior was inconsistent: a sublist of an
3048outer-level list needed two spaces indentation, but a sublist of this
3049sublist needed three spaces.  It is not surprising, then, that different
3050implementations of Markdown have developed very different rules for
3051determining what comes under a list item.  (Pandoc and python-Markdown,
3052for example, stuck with Gruber's syntax description and the four-space
3053rule, while discount, redcarpet, marked, PHP Markdown, and others
3054followed `Markdown.pl`'s behavior more closely.)
3055
3056Unfortunately, given the divergences between implementations, there
3057is no way to give a spec for list items that will be guaranteed not
3058to break any existing documents.  However, the spec given here should
3059correctly handle lists formatted with either the four-space rule or
3060the more forgiving `Markdown.pl` behavior, provided they are laid out
3061in a way that is natural for a human to read.
3062
3063The strategy here is to let the width and indentation of the list marker
3064determine the indentation necessary for blocks to fall under the list
3065item, rather than having a fixed and arbitrary number.  The writer can
3066think of the body of the list item as a unit which gets indented to the
3067right enough to fit the list marker (and any indentation on the list
3068marker).  (The laziness rule, #4, then allows continuation lines to be
3069unindented if needed.)
3070
3071This rule is superior, we claim, to any rule requiring a fixed level of
3072indentation from the margin.  The four-space rule is clear but
3073unnatural. It is quite unintuitive that
3074
3075``` markdown
3076- foo
3077
3078  bar
3079
3080  - baz
3081```
3082
3083should be parsed as two lists with an intervening paragraph,
3084
3085``` html
3086<ul>
3087<li>foo</li>
3088</ul>
3089<p>bar</p>
3090<ul>
3091<li>baz</li>
3092</ul>
3093```
3094
3095as the four-space rule demands, rather than a single list,
3096
3097``` html
3098<ul>
3099<li><p>foo</p>
3100<p>bar</p>
3101<ul>
3102<li>baz</li>
3103</ul></li>
3104</ul>
3105```
3106
3107The choice of four spaces is arbitrary.  It can be learned, but it is
3108not likely to be guessed, and it trips up beginners regularly.
3109
3110Would it help to adopt a two-space rule?  The problem is that such
3111a rule, together with the rule allowing 1--3 spaces indentation of the
3112initial list marker, allows text that is indented *less than* the
3113original list marker to be included in the list item. For example,
3114`Markdown.pl` parses
3115
3116``` markdown
3117   - one
3118
3119  two
3120```
3121
3122as a single list item, with `two` a continuation paragraph:
3123
3124``` html
3125<ul>
3126<li><p>one</p>
3127<p>two</p></li>
3128</ul>
3129```
3130
3131and similarly
3132
3133``` markdown
3134>   - one
3135>
3136>  two
3137```
3138
3139as
3140
3141``` html
3142<blockquote>
3143<ul>
3144<li><p>one</p>
3145<p>two</p></li>
3146</ul>
3147</blockquote>
3148```
3149
3150This is extremely unintuitive.
3151
3152Rather than requiring a fixed indent from the margin, we could require
3153a fixed indent (say, two spaces, or even one space) from the list marker (which
3154may itself be indented).  This proposal would remove the last anomaly
3155discussed.  Unlike the spec presented above, it would count the following
3156as a list item with a subparagraph, even though the paragraph `bar`
3157is not indented as far as the first paragraph `foo`:
3158
3159``` markdown
3160 10. foo
3161
3162   bar
3163```
3164
3165Arguably this text does read like a list item with `bar` as a subparagraph,
3166which may count in favor of the proposal.  However, under this proposal indented
3167code would have to be indented six spaces after the list marker.  And this
3168would break a lot of existing Markdown, which has the pattern:
3169
3170``` markdown
31711.  foo
3172
3173        indented code
3174```
3175
3176where the code is indented eight spaces.  The spec above, by contrast, will
3177parse this text as expected, since the code block's indentation is measured
3178from the beginning of `foo`.
3179
3180The one case that needs special treatment is a list item that *starts*
3181with indented code.  How much indentation is required in that case, since
3182we don't have a "first paragraph" to measure from?  Rule #2 simply stipulates
3183that in such cases, we require one space indentation from the list marker
3184(and then the normal four spaces for the indented code).  This will match the
3185four-space rule in cases where the list marker plus its initial indentation
3186takes four spaces (a common case), but diverge in other cases.
3187
3188## Lists
3189
3190A [list](#list) <a id="list"></a> is a sequence of one or more
3191list items [of the same type](#of-the-same-type).  The list items
3192may be separated by single [blank lines](#blank-line), but two
3193blank lines end all containing lists.
3194
3195Two list items are [of the same type](#of-the-same-type)
3196<a id="of-the-same-type"></a> if they begin with a [list
3197marker](#list-marker) of the same type.  Two list markers are of the
3198same type if (a) they are bullet list markers using the same character
3199(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
3200delimiter (either `.` or `)`).
3201
3202A list is an [ordered list](#ordered-list) <a id="ordered-list"></a>
3203if its constituent list items begin with
3204[ordered list markers](#ordered-list-marker), and a [bullet
3205list](#bullet-list) <a id="bullet-list"></a> if its constituent list
3206items begin with [bullet list markers](#bullet-list-marker).
3207
3208The [start number](#start-number) <a id="start-number"></a>
3209of an [ordered list](#ordered-list) is determined by the list number of
3210its initial list item.  The numbers of subsequent list items are
3211disregarded.
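
The definitions above can be sketched as follows. The helper names (`marker_type`, `same_type`, `start_number`) are hypothetical, and the sketch assumes the list marker has already been isolated from the rest of the line.

```python
import re

BULLET = re.compile(r"^[-+*]$")
ORDERED = re.compile(r"^([0-9]+)([.)])$")


def marker_type(marker):
    """Return a key that is equal for two markers exactly when
    they are of the same type."""
    if BULLET.match(marker):
        return ("bullet", marker)          # same bullet character
    m = ORDERED.match(marker)
    if m:
        return ("ordered", m.group(2))     # same delimiter, number ignored
    raise ValueError(f"not a list marker: {marker!r}")


def same_type(a, b):
    return marker_type(a) == marker_type(b)


def start_number(first_marker):
    """Start number of an ordered list, taken from its first item only."""
    m = ORDERED.match(first_marker)
    if m is None:
        raise ValueError(f"not an ordered list marker: {first_marker!r}")
    return int(m.group(1))
```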
3212
3213A list is [loose](#loose) if any of its constituent list items are
3214separated by blank lines, or if any of its constituent list items
3215directly contain two block-level elements with a blank line between
3216them.  Otherwise a list is [tight](#tight).  (The difference in HTML output
3217is that paragraphs in a loose list are wrapped in `<p>` tags, while
3218paragraphs in a tight list are not.)
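
A minimal sketch of this definition, assuming a parser has already recorded where blank lines occur. The `Item` record and its two fields are a hypothetical data model chosen for illustration, not something the spec prescribes.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # A blank line separates this item from the next item in the list.
    followed_by_blank: bool
    # Two block-level elements directly contained in this item are
    # separated by a blank line.
    blank_between_blocks: bool


def is_loose(items):
    separated = any(item.followed_by_blank for item in items[:-1])
    internal = any(item.blank_between_blocks for item in items)
    return separated or internal


def render_paragraph(text, loose):
    """Paragraphs get <p> tags only in a loose list."""
    return f"<p>{text}</p>" if loose else text
```

Blank lines inside a nested list, block quote, or code block belong to that inner container, so a parser using this model would not set either flag on the outer item for them.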
3219
3220Changing the bullet or ordered list delimiter starts a new list:
3221
3222.
3223- foo
3224- bar
3225+ baz
3226.
3227<ul>
3228<li>foo</li>
3229<li>bar</li>
3230</ul>
3231<ul>
3232<li>baz</li>
3233</ul>
3234.
3235
3236.
32371. foo
32382. bar
32393) baz
3240.
3241<ol>
3242<li>foo</li>
3243<li>bar</li>
3244</ol>
3245<ol start="3">
3246<li>baz</li>
3247</ol>
3248.
3249
3250There can be blank lines between items, but two blank lines end
3251a list:
3252
3253.
3254- foo
3255
3256- bar
3257
3258
3259- baz
3260.
3261<ul>
3262<li><p>foo</p></li>
3263<li><p>bar</p></li>
3264</ul>
3265<ul>
3266<li>baz</li>
3267</ul>
3268.
3269
3270As illustrated above in the section on [list items](#list-item),
3271two blank lines between blocks *within* a list item will also end a
3272list:
3273
3274.
3275- foo
3276
3277
3278  bar
3279- baz
3280.
3281<ul>
3282<li>foo</li>
3283</ul>
3284<p>bar</p>
3285<ul>
3286<li>baz</li>
3287</ul>
3288.
3289
3290Indeed, two blank lines will end *all* containing lists:
3291
3292.
3293- foo
3294  - bar
3295    - baz
3296
3297
3298      bim
3299.
3300<ul>
3301<li>foo
3302<ul>
3303<li>bar
3304<ul>
3305<li>baz</li>
3306</ul></li>
3307</ul></li>
3308</ul>
3309<pre><code>  bim
3310</code></pre>
3311.
3312
3313Thus, two blank lines can be used to separate consecutive lists of
3314the same type, or to separate a list from an indented code block
3315that would otherwise be parsed as a subparagraph of the final list
3316item:
3317
3318.
3319- foo
3320- bar
3321
3322
3323- baz
3324- bim
3325.
3326<ul>
3327<li>foo</li>
3328<li>bar</li>
3329</ul>
3330<ul>
3331<li>baz</li>
3332<li>bim</li>
3333</ul>
3334.
3335
3336.
3337-   foo
3338
3339    notcode
3340
3341-   foo
3342
3343
3344    code
3345.
3346<ul>
3347<li><p>foo</p>
3348<p>notcode</p></li>
3349<li><p>foo</p></li>
3350</ul>
3351<pre><code>code
3352</code></pre>
3353.
3354
3355List items need not be indented to the same level.  The following
3356list items will be treated as items at the same list level,
3357since none is indented enough to belong to the previous list
3358item:
3359
3360.
3361- a
3362 - b
3363  - c
3364   - d
3365  - e
3366 - f
3367- g
3368.
3369<ul>
3370<li>a</li>
3371<li>b</li>
3372<li>c</li>
3373<li>d</li>
3374<li>e</li>
3375<li>f</li>
3376<li>g</li>
3377</ul>
3378.
3379
3380This is a loose list, because there is a blank line between
3381two of the list items:
3382
3383.
3384- a
3385- b
3386
3387- c
3388.
3389<ul>
3390<li><p>a</p></li>
3391<li><p>b</p></li>
3392<li><p>c</p></li>
3393</ul>
3394.
3395
3396So is this, with an empty second item:
3397
3398.
3399* a
3400*
3401
3402* c
3403.
3404<ul>
3405<li><p>a</p></li>
3406<li></li>
3407<li><p>c</p></li>
3408</ul>
3409.
3410
3411These are loose lists, even though there are no blank lines between the items,
3412because one of the items directly contains two block-level elements
3413with a blank line between them:
3414
3415.
3416- a
3417- b
3418
3419  c
3420- d
3421.
3422<ul>
3423<li><p>a</p></li>
3424<li><p>b</p>
3425<p>c</p></li>
3426<li><p>d</p></li>
3427</ul>
3428.
3429
3430.
3431- a
3432- b
3433
3434  [ref]: /url
3435- d
3436.
3437<ul>
3438<li><p>a</p></li>
3439<li><p>b</p></li>
3440<li><p>d</p></li>
3441</ul>
3442.
3443
3444This is a tight list, because the blank lines are in a code block:
3445
3446.
3447- a
3448- ```
3449  b
3450
3451
3452  ```
3453- c
3454.
3455<ul>
3456<li>a</li>
3457<li><pre><code>b
3458
3459
3460</code></pre></li>
3461<li>c</li>
3462</ul>
3463.
3464
3465This is a tight list, because the blank line is between two
3466paragraphs of a sublist.  So the inner list is loose while
3467the outer list is tight:
3468
3469.
3470- a
3471  - b
3472
3473    c
3474- d
3475.
3476<ul>
3477<li>a
3478<ul>
3479<li><p>b</p>
3480<p>c</p></li>
3481</ul></li>
3482<li>d</li>
3483</ul>
3484.
3485
3486This is a tight list, because the blank line is inside the
3487block quote:
3488
3489.
3490* a
3491  > b
3492  >
3493* c
3494.
3495<ul>
3496<li>a
3497<blockquote>
3498<p>b</p>
3499</blockquote></li>
3500<li>c</li>
3501</ul>
3502.
3503
3504This list is tight, because the consecutive block elements
3505are not separated by blank lines:
3506
3507.
3508- a
3509  > b
3510  ```
3511  c
3512  ```
3513- d
3514.
3515<ul>
3516<li>a
3517<blockquote>
3518<p>b</p>
3519</blockquote>
3520<pre><code>c
3521</code></pre></li>
3522<li>d</li>
3523</ul>
3524.
3525
3526A single-paragraph list is tight:
3527
3528.
3529- a
3530.
3531<ul>
3532<li>a</li>
3533</ul>
3534.
3535
3536.
3537- a
3538  - b
3539.
3540<ul>
3541<li>a
3542<ul>
3543<li>b</li>
3544</ul></li>
3545</ul>
3546.
3547
3548Here the outer list is loose, the inner list tight:
3549
3550.
3551* foo
3552  * bar
3553
3554  baz
3555.
3556<ul>
3557<li><p>foo</p>
3558<ul>
3559<li>bar</li>
3560</ul>
3561<p>baz</p></li>
3562</ul>
3563.
3564
3565.
3566- a
3567  - b
3568  - c
3569
3570- d
3571  - e
3572  - f
3573.
3574<ul>
3575<li><p>a</p>
3576<ul>
3577<li>b</li>
3578<li>c</li>
3579</ul></li>
3580<li><p>d</p>
3581<ul>
3582<li>e</li>
3583<li>f</li>
3584</ul></li>
3585</ul>
3586.
3587
3588# Inlines
3589
3590Inlines are parsed sequentially from the beginning of the character
3591stream to the end (left to right, in left-to-right languages).
3592Thus, for example, in
3593
3594.
3595`hi`lo`
3596.
3597<p><code>hi</code>lo`</p>
3598.
3599
3600`hi` is parsed as code, leaving the backtick at the end as a literal
3601backtick.
3602
3603## Backslash escapes
3604
3605Any ASCII punctuation character may be backslash-escaped:
3606
3607.
3608\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
3609.
3610<p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p>
3611.
3612
3613Backslashes before other characters are treated as literal
3614backslashes:
3615
3616.
3617\→\A\a\ \3\φ\«
3618.
3619<p>\   \A\a\ \3\φ\«</p>
3620.
3621
3622Escaped characters are treated as regular characters and do
3623not have their usual Markdown meanings:
3624
3625.
3626\*not emphasized*
3627\<br/> not a tag
3628\[not a link](/foo)
3629\`not code`
36301\. not a list
3631\* not a list
3632\# not a header
3633\[foo]: /url "not a reference"
3634.
3635<p>*not emphasized*
3636&lt;br/&gt; not a tag
3637[not a link](/foo)
3638`not code`
36391. not a list
3640* not a list
3641# not a header
3642[foo]: /url &quot;not a reference&quot;</p>
3643.
3644
3645If a backslash is itself escaped, the following character is not:
3646
3647.
3648\\*emphasis*
3649.
3650<p>\<em>emphasis</em></p>
3651.
3652
3653A backslash at the end of the line is a hard line break:
3654
3655.
3656foo\
3657bar
3658.
3659<p>foo<br />
3660bar</p>
3661.
3662
3663Backslash escapes do not work in code blocks, code spans, autolinks, or
3664raw HTML:
3665
3666.
3667`` \[\` ``
3668.
3669<p><code>\[\`</code></p>
3670.
3671
3672.
3673    \[\]
3674.
3675<pre><code>\[\]
3676</code></pre>
3677.
3678
3679.
3680~~~
3681\[\]
3682~~~
3683.
3684<pre><code>\[\]
3685</code></pre>
3686.
3687
3688.
3689<http://google.com?find=\*>
3690.
3691<p><a href="http://google.com?find=%5C*">http://google.com?find=\*</a></p>
3692.
3693
3694.
3695<a href="/bar\/)">
3696.
3697<p><a href="/bar\/)"></p>
3698.
3699
3700But they work in all other contexts, including URLs and link titles,
3701link references, and info strings in [fenced code
3702blocks](#fenced-code-block):
3703
3704.
3705[foo](/bar\* "ti\*tle")
3706.
3707<p><a href="/bar*" title="ti*tle">foo</a></p>
3708.
3709
3710.
3711[foo]
3712
3713[foo]: /bar\* "ti\*tle"
3714.
3715<p><a href="/bar*" title="ti*tle">foo</a></p>
3716.
3717
3718.
3719``` foo\+bar
3720foo
3721```
3722.
3723<pre><code class="language-foo+bar">foo
3724</code></pre>
3725.
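
The core of the escaping rule can be sketched in a few lines. The function name `unescape` is an assumption made for illustration. The sketch covers only the removal of backslashes before ASCII punctuation in contexts where escapes apply; it does not handle hard line breaks or decide whether a given piece of text is inside a code span, code block, autolink, or raw HTML.

```python
import re
import string

# string.punctuation holds exactly the 32 ASCII punctuation characters.
ESCAPED = re.compile("\\\\([" + re.escape(string.punctuation) + "])")


def unescape(text):
    """Drop a backslash that precedes an ASCII punctuation character.

    Backslashes before any other character are left in place.
    """
    return ESCAPED.sub(r"\1", text)
```

Because matches are consumed left to right, an escaped backslash yields a literal backslash and leaves the following character unescaped, as in the `\\*emphasis*` example.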
3726
3727
3728## Entities
3729
3730With the goal of making this standard as HTML-agnostic as possible, all valid HTML entities in any
3731context are recognized as such and converted into their actual values (i.e. the UTF-8 characters representing
3732the entity itself) before they are stored in the AST.
3733
3734This allows implementations that target HTML output to trivially escape the entities when generating HTML,
3735and simplifies the job of implementations targeting other languages, as these will only need to handle the
3736UTF-8 chars and need not be HTML-entity aware.
3737
3738[Named entities](#named-entities) <a id="named-entities"></a> consist of `&`
3739+ any of the valid HTML5 entity names + `;`. The [following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json)
3740is used as an authoritative source of the valid entity names and their corresponding codepoints.
3741
3742Conforming implementations that target HTML don't need to generate entities for all the valid
3743named entities that exist, with the exception of `"` (`&quot;`), `&` (`&amp;`), `<` (`&lt;`) and `>` (`&gt;`),
3744which always need to be written as entities for security reasons.
3745
3746.
3747&nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral;
3748.
3749<p>  &amp; © Æ Ď ¾ ℋ ⅆ ∲</p>
3750.
3751
3752[Decimal entities](#decimal-entities) <a id="decimal-entities"></a>
3753consist of `&#` + a string of 1--8 Arabic digits + `;`. Again, these entities need to be recognized
3754and transformed into their corresponding UTF-8 codepoints. Invalid Unicode codepoints will be written
3755as the "unknown codepoint" character (`0xFFFD`).
3756
3757.
3758&#35; &#1234; &#992; &#98765432;
3759.
3760<p># Ӓ Ϡ �</p>
3761.
3762
3763[Hexadecimal entities](#hexadecimal-entities) <a id="hexadecimal-entities"></a>
3764consist of `&#` + either `X` or `x` + a string of 1--8 hexadecimal digits
3765+ `;`. They will also be parsed and turned into their corresponding UTF-8 values in the AST.
3766
3767.
3768&#X22; &#XD06; &#xcab;
3769.
3770<p>&quot; ആ ಫ</p>
3771.
3772
3773Here are some nonentities:
3774
3775.
3776&nbsp &x; &#; &#x; &ThisIsWayTooLongToBeAnEntityIsntIt; &hi?;
3777.
3778<p>&amp;nbsp &amp;x; &amp;#; &amp;#x; &amp;ThisIsWayTooLongToBeAnEntityIsntIt; &amp;hi?;</p>
3779.
3780
3781Although HTML5 does accept some entities without a trailing semicolon
3782(such as `&copy`), these are not recognized as entities here, because that would make the grammar too ambiguous:
3783
3784.
3785&copy
3786.
3787<p>&amp;copy</p>
3788.
3789
3790Strings that are not on the list of HTML5 named entities are not recognized as entities either:
3791
3792.
3793&MadeUpEntity;
3794.
3795<p>&amp;MadeUpEntity;</p>
3796.
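
The three entity forms can be decoded with one pass over the text, as in the sketch below. It relies on the `html5` table from Python's standard `html.entities` module as a stand-in for the entity list referenced above. The function names are hypothetical, and treating zero, surrogates, and values above `0x10FFFF` as the invalid codepoints is an assumption of this sketch.

```python
import re
from html.entities import html5  # maps names such as "amp;" to characters

# Decimal, hexadecimal, or named entity; the trailing ";" is mandatory.
ENTITY = re.compile(
    r"&(?:#([0-9]{1,8})|#[xX]([0-9a-fA-F]{1,8})|([A-Za-z][A-Za-z0-9]*));"
)
REPLACEMENT = "\ufffd"


def _from_codepoint(cp):
    # Assumption: zero, surrogates, and out-of-range values are invalid.
    if cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        return REPLACEMENT
    return chr(cp)


def decode_entities(text):
    def replace(m):
        dec, hexa, name = m.groups()
        if dec is not None:
            return _from_codepoint(int(dec))
        if hexa is not None:
            return _from_codepoint(int(hexa, 16))
        # Unknown names are left untouched.
        return html5.get(name + ";", m.group(0))

    return ENTITY.sub(replace, text)
```

A renderer targeting HTML would then escape `"`, `&`, `<`, and `>` in the decoded text on output. Code spans and code blocks would bypass `decode_entities` altogether.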
3797
3798Entities are recognized in any context besides code spans or
3799code blocks, including raw HTML, URLs, [link titles](#link-title), and
3800[fenced code block](#fenced-code-block) info strings:
3801
3802.
3803<a href="&ouml;&ouml;.html">
3804.
3805<p><a href="&ouml;&ouml;.html"></p>
3806.
3807
3808.
3809[foo](/f&ouml;&ouml; "f&ouml;&ouml;")
3810.
3811<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
3812.
3813
3814.
3815[foo]
3816
3817[foo]: /f&ouml;&ouml; "f&ouml;&ouml;"
3818.
3819<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
3820.
3821
3822.
3823``` f&ouml;&ouml;
3824foo
3825```
3826.
3827<pre><code class="language-föö">foo
3828</code></pre>
3829.
3830
3831Entities are treated as literal text in code spans and code blocks:
3832
3833.
3834`f&ouml;&ouml;`
3835.
3836<p><code>f&amp;ouml;&amp;ouml;</code></p>
3837.
3838
3839.
3840    f&ouml;f&ouml;
3841.
3842<pre><code>f&amp;ouml;f&amp;ouml;
3843</code></pre>
3844.
3845
3846## Code span
3847
3848A [backtick string](#backtick-string) <a id="backtick-string"></a>
3849is a string of one or more backtick characters (`` ` ``) that is neither
3850preceded nor followed by a backtick.
3851
3852A code span begins with a backtick string and ends with a backtick
3853string of equal length.  The contents of the code span are the
3854characters between the two backtick strings, with leading and trailing
3855spaces and newlines removed, and consecutive spaces and newlines
3856collapsed to single spaces.
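
The normalization of code span contents might be sketched as follows; the function name `normalize_code_span` is an assumption, and the input is taken to be the raw characters between the two backtick strings.

```python
import re


def normalize_code_span(raw):
    """Collapse runs of spaces and newlines to one space, then strip
    leading and trailing spaces."""
    collapsed = re.sub(r"[ \n]+", " ", raw)
    return collapsed.strip(" ")
```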
3857
3858This is a simple code span:
3859
3860.
3861`foo`
3862.
3863<p><code>foo</code></p>
3864.
3865
3866Here two backticks are used, because the code contains a backtick.
3867This example also illustrates stripping of leading and trailing spaces:
3868
3869.
3870`` foo ` bar  ``
3871.
3872<p><code>foo ` bar</code></p>
3873.
3874
3875This example shows the motivation for stripping leading and trailing
3876spaces:
3877
3878.
3879` `` `
3880.
3881<p><code>``</code></p>
3882.
3883
3884Newlines are treated like spaces:
3885
3886.
3887``
3888foo
3889``
3890.
3891<p><code>foo</code></p>
3892.
3893
3894Interior spaces and newlines are collapsed into single spaces, just
3895as they would be by a browser:
3896
3897.
3898`foo   bar
3899  baz`
3900.
3901<p><code>foo bar baz</code></p>
3902.
3903
3904Q: Why not just leave the spaces, since browsers will collapse them
3905anyway?  A:  Because we might be targeting a non-HTML format, and we
3906shouldn't rely on HTML-specific rendering assumptions.
3907
3908(Existing implementations differ in their treatment of internal
3909spaces and newlines.  Some, including `Markdown.pl` and
3910`showdown`, convert an internal newline into a `<br />` tag.
3911But this makes things difficult for those who like to hard-wrap
3912their paragraphs, since a line break in the midst of a code
3913span will cause an unintended line break in the output.  Others
3914just leave internal spaces as they are, which is fine if only
3915HTML is being targeted.)
3916
3917.
3918`foo `` bar`
3919.
3920<p><code>foo `` bar</code></p>
3921.
3922
3923Note that backslash escapes do not work in code spans. All backslashes
3924are treated literally:
3925
3926.
3927`foo\`bar`
3928.
3929<p><code>foo\</code>bar`</p>
3930.
3931
3932Backslash escapes are never needed, because one can always choose a
3933string of *n* backtick characters as delimiters, where the code does
3934not contain any strings of exactly *n* backtick characters.
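
To illustrate, a writer or tool producing Markdown could choose the delimiter as in this sketch. The name `wrap_in_code_span` is hypothetical, and the sketch assumes the code is nonempty; leading and trailing spaces in the code itself would still be stripped when parsed.

```python
import re


def wrap_in_code_span(code):
    """Wrap code in the shortest backtick string that does not occur
    in it as a complete run of backticks."""
    run_lengths = {len(run) for run in re.findall(r"`+", code)}
    n = 1
    while n in run_lengths:
        n += 1
    fence = "`" * n
    # A space keeps a leading or trailing backtick in the code from
    # merging with the delimiter; the parser strips it again.
    pad = " " if code.startswith("`") or code.endswith("`") else ""
    return f"{fence}{pad}{code}{pad}{fence}"
```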
3935
3936Code span backticks have higher precedence than any other inline
3937constructs except HTML tags and autolinks.  Thus, for example, this is
3938not parsed as emphasized text, since the second `*` is part of a code
3939span:
3940
3941.
3942*foo`*`
3943.
3944<p>*foo<code>*</code></p>
3945.
3946
3947And this is not parsed as a link:
3948
3949.
3950[not a `link](/foo`)
3951.
3952<p>[not a <code>link](/foo</code>)</p>
3953.
3954
3955But this is a link:
3956
3957.
3958<http://foo.bar.`baz>`
3959.
3960<p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>
3961.
3962
3963And this is an HTML tag:
3964
3965.
3966<a href="`">`
3967.
3968<p><a href="`">`</p>
3969.
3970
3971When a backtick string is not closed by a matching backtick string,
3972we just have literal backticks:
3973
3974.
3975```foo``
3976.
3977<p>```foo``</p>
3978.
3979
3980.
3981`foo
3982.
3983<p>`foo</p>
3984.
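
The matching of backtick strings can be sketched as a left-to-right scan. This is a simplified illustration, not the reference algorithm: the function name `split_code_spans` is an assumption, and the sketch ignores the precedence of HTML tags and autolinks over code spans.

```python
import re

BACKTICKS = re.compile(r"`+")


def split_code_spans(text):
    """Split text into ("text", s) and ("code", s) pieces."""
    pieces, literal, pos = [], [], 0
    while pos < len(text):
        opener = BACKTICKS.search(text, pos)
        if opener is None:
            literal.append(text[pos:])
            break
        literal.append(text[pos:opener.start()])
        # The closer is the next backtick string of exactly equal length.
        closer = next(
            (c for c in BACKTICKS.finditer(text, opener.end())
             if len(c.group()) == len(opener.group())),
            None,
        )
        if closer is None:
            literal.append(opener.group())   # unmatched: literal backticks
            pos = opener.end()
            continue
        if "".join(literal):
            pieces.append(("text", "".join(literal)))
        literal = []
        raw = text[opener.end():closer.start()]
        code = re.sub(r"[ \n]+", " ", raw).strip(" ")
        pieces.append(("code", code))
        pos = closer.end()
    if "".join(literal):
        pieces.append(("text", "".join(literal)))
    return pieces
```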
3985
3986## Emphasis and strong emphasis
3987
3988John Gruber's original [Markdown syntax
3989description](http://daringfireball.net/projects/markdown/syntax#em) says:
3990
3991> Markdown treats asterisks (`*`) and underscores (`_`) as indicators of
3992> emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML
3993> `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>`
3994> tag.
3995
3996This is enough for most users, but these rules leave much undecided,
3997especially when it comes to nested emphasis.  The original
3998`Markdown.pl` test suite makes it clear that triple `***` and
3999`___` delimiters can be used for strong emphasis, and most
4000implementations have also allowed the following patterns:
4001
4002``` markdown
4003***strong emph***
4004***strong** in emph*
4005***emph* in strong**
4006**in strong *emph***
4007*in emph **strong***
4008```
4009
4010The following patterns are less widely supported, but the intent
4011is clear and they are useful (especially in contexts like bibliography
4012entries):
4013
4014``` markdown
4015*emph *with emph* in it*
4016**strong **with strong** in it**
4017```
4018
4019Many implementations have also restricted intraword emphasis to
4020the `*` forms, to avoid unwanted emphasis in words containing
4021internal underscores.  (It is best practice to put these in code
4022spans, but users often do not.)
4023
4024``` markdown
4025internal emphasis: foo*bar*baz
4026no emphasis: foo_bar_baz
4027```
4028
4029The following rules capture all of these patterns, while allowing
4030for efficient parsing strategies that do not backtrack:
4031
40321.  A single `*` character [can open emphasis](#can-open-emphasis)
4033    <a id="can-open-emphasis"></a> iff
4034
4035    (a) it is not part of a sequence of four or more unescaped `*`s,
4036    (b) it is not followed by whitespace, and
4037    (c) either it is not followed by a `*` character or it is
4038        followed immediately by strong emphasis.
4039
40402.  A single `_` character [can open emphasis](#can-open-emphasis) iff
4041
4042    (a) it is not part of a sequence of four or more unescaped `_`s,
4043    (b) it is not followed by whitespace,
4044    (c) it is not preceded by an ASCII alphanumeric character, and
4045    (d) either it is not followed by a `_` character or it is
4046        followed immediately by strong emphasis.
4047
40483.  A single `*` character [can close emphasis](#can-close-emphasis)
4049    <a id="can-close-emphasis"></a> iff
4050
4051    (a) it is not part of a sequence of four or more unescaped `*`s, and
4052    (b) it is not preceded by whitespace.
4053
40544.  A single `_` character [can close emphasis](#can-close-emphasis) iff
4055
4056    (a) it is not part of a sequence of four or more unescaped `_`s,
4057    (b) it is not preceded by whitespace, and
4058    (c) it is not followed by an ASCII alphanumeric character.
4059
40605.  A double `**` [can open strong emphasis](#can-open-strong-emphasis)
4061    <a id="can-open-strong-emphasis" ></a> iff
4062
4063    (a) it is not part of a sequence of four or more unescaped `*`s,
4064    (b) it is not followed by whitespace, and
4065    (c) either it is not followed by a `*` character or it is
4066        followed immediately by emphasis.
4067
40686.  A double `__` [can open strong emphasis](#can-open-strong-emphasis)
4069    iff
4070
4071    (a) it is not part of a sequence of four or more unescaped `_`s,
4072    (b) it is not followed by whitespace,
4073    (c) it is not preceded by an ASCII alphanumeric character, and
4074    (d) either it is not followed by a `_` character or it is
4075        followed immediately by emphasis.
4076
40777.  A double `**` [can close strong emphasis](#can-close-strong-emphasis)
4078    <a id="can-close-strong-emphasis" ></a> iff
4079
4080    (a) it is not part of a sequence of four or more unescaped `*`s, and
4081    (b) it is not preceded by whitespace.
4082
40838.  A double `__` [can close strong emphasis](#can-close-strong-emphasis)
4084    iff
4085
4086    (a) it is not part of a sequence of four or more unescaped `_`s,
4087    (b) it is not preceded by whitespace, and
4088    (c) it is not followed by an ASCII alphanumeric character.
4089
40909.  Emphasis begins with a delimiter that [can open
4091    emphasis](#can-open-emphasis) and includes inlines parsed
4092    sequentially until a delimiter that [can close
4093    emphasis](#can-close-emphasis), and that uses the same
4094    character (`_` or `*`) as the opening delimiter, is reached.
4095
409610. Strong emphasis begins with a delimiter that [can open strong
4097    emphasis](#can-open-strong-emphasis) and includes inlines parsed
4098    sequentially until a delimiter that [can close strong
4099    emphasis](#can-close-strong-emphasis), and that uses the
4100    same character (`_` or `*`) as the opening delimiter, is reached.
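
The purely local conditions in rules 1--4 can be checked by looking only at the characters around a delimiter, as in the sketch below. It is deliberately partial: the function names are hypothetical, backslash escapes are not considered, the start and end of the text are treated like whitespace (an assumption), and the conditions that depend on whether strong emphasis follows (1c and 2d) are left out.

```python
def _run_length(text, i):
    """Length of the maximal run of text[i] that contains position i."""
    ch, start, end = text[i], i, i
    while start > 0 and text[start - 1] == ch:
        start -= 1
    while end + 1 < len(text) and text[end + 1] == ch:
        end += 1
    return end - start + 1


def _is_ascii_alnum(ch):
    return ch.isascii() and ch.isalnum()


def may_open(text, i):
    ch = text[i]
    if ch not in "*_" or _run_length(text, i) >= 4:
        return False
    following = text[i + 1] if i + 1 < len(text) else " "
    if following.isspace():
        return False
    # Only `_` is barred from opening emphasis inside a word.
    if ch == "_" and i > 0 and _is_ascii_alnum(text[i - 1]):
        return False
    return True


def may_close(text, i):
    ch = text[i]
    if ch not in "*_" or _run_length(text, i) >= 4:
        return False
    preceding = text[i - 1] if i > 0 else " "
    if preceding.isspace():
        return False
    # Only `_` is barred from closing emphasis inside a word.
    if ch == "_" and i + 1 < len(text) and _is_ascii_alnum(text[i + 1]):
        return False
    return True
```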
4101
4102These rules can be illustrated through a series of examples.
4103
4104Simple emphasis:
4105
4106.
4107*foo bar*
4108.
4109<p><em>foo bar</em></p>
4110.
4111
4112.
4113_foo bar_
4114.
4115<p><em>foo bar</em></p>
4116.
4117
4118Simple strong emphasis:
4119
4120.
4121**foo bar**
4122.
4123<p><strong>foo bar</strong></p>
4124.
4125
4126.
4127__foo bar__
4128.
4129<p><strong>foo bar</strong></p>
4130.
4131
4132Emphasis can continue over line breaks:
4133
4134.
4135*foo
4136bar*
4137.
4138<p><em>foo
4139bar</em></p>
4140.
4141
4142.
4143_foo
4144bar_
4145.
4146<p><em>foo
4147bar</em></p>
4148.
4149
4150.
4151**foo
4152bar**
4153.
4154<p><strong>foo
4155bar</strong></p>
4156.
4157
4158.
4159__foo
4160bar__
4161.
4162<p><strong>foo
4163bar</strong></p>
4164.
4165
4166Emphasis can contain other inline constructs:
4167
4168.
4169*foo [bar](/url)*
4170.
4171<p><em>foo <a href="/url">bar</a></em></p>
4172.
4173
4174.
4175_foo [bar](/url)_
4176.
4177<p><em>foo <a href="/url">bar</a></em></p>
4178.
4179
4180.
4181**foo [bar](/url)**
4182.
4183<p><strong>foo <a href="/url">bar</a></strong></p>
4184.
4185
4186.
4187__foo [bar](/url)__
4188.
4189<p><strong>foo <a href="/url">bar</a></strong></p>
4190.
4191
4192Symbols contained in other inline constructs will not
4193close emphasis:
4194
4195.
4196*foo [bar*](/url)
4197.
4198<p>*foo <a href="/url">bar*</a></p>
4199.
4200
4201.
4202_foo [bar_](/url)
4203.
4204<p>_foo <a href="/url">bar_</a></p>
4205.
4206
4207.
4208**<a href="**">
4209.
4210<p>**<a href="**"></p>
4211.
4212
4213.
4214__<a href="__">
4215.
4216<p>__<a href="__"></p>
4217.
4218
4219.
4220*a `*`*
4221.
4222<p><em>a <code>*</code></em></p>
4223.
4224
4225.
4226_a `_`_
4227.
4228<p><em>a <code>_</code></em></p>
4229.
4230
4231.
4232**a<http://foo.bar?q=**>
4233.
4234<p>**a<a href="http://foo.bar?q=**">http://foo.bar?q=**</a></p>
4235.
4236
4237.
4238__a<http://foo.bar?q=__>
4239.
4240<p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p>
4241.
4242
4243This is not emphasis, because the opening delimiter is
4244followed by white space:
4245
4246.
4247and * foo bar*
4248.
4249<p>and * foo bar*</p>
4250.
4251
4252.
4253_ foo bar_
4254.
4255<p>_ foo bar_</p>
4256.
4257
4258.
4259and ** foo bar**
4260.
4261<p>and ** foo bar**</p>
4262.
4263
4264.
4265__ foo bar__
4266.
4267<p>__ foo bar__</p>
4268.
4269
4270This is not emphasis, because the closing delimiter is
4271preceded by white space:
4272
4273.
4274and *foo bar *
4275.
4276<p>and *foo bar *</p>
4277.
4278
4279.
4280and _foo bar _
4281.
4282<p>and _foo bar _</p>
4283.
4284
4285.
4286and **foo bar **
4287.
4288<p>and **foo bar **</p>
4289.
4290
4291.
4292and __foo bar __
4293.
4294<p>and __foo bar __</p>
4295.
4296
4297The rules imply that a sequence of four or more unescaped `*` or
4298`_` characters will always be parsed as a literal string:
4299
4300.
4301****hi****
4302.
4303<p>****hi****</p>
4304.
4305
4306.
4307_____hi_____
4308.
4309<p>_____hi_____</p>
4310.
4311
4312.
4313Sign here: _________
4314.
4315<p>Sign here: _________</p>
4316.
4317
4318The rules also imply that there can be no empty emphasis or strong
4319emphasis:
4320
4321.
4322** is not an empty emphasis
4323.
4324<p>** is not an empty emphasis</p>
4325.
4326
4327.
4328**** is not an empty strong emphasis
4329.
4330<p>**** is not an empty strong emphasis</p>
4331.
4332
4333To include `*` or `_` in emphasized sections, use backslash escapes
4334or code spans:
4335
4336.
4337*here is a \**
4338.
4339<p><em>here is a *</em></p>
4340.
4341
4342.
4343__this is a double underscore (`__`)__
4344.
4345<p><strong>this is a double underscore (<code>__</code>)</strong></p>
4346.
4347
4348`*` delimiters allow intra-word emphasis; `_` delimiters do not:
4349
4350.
4351foo*bar*baz
4352.
4353<p>foo<em>bar</em>baz</p>
4354.
4355
4356.
4357foo_bar_baz
4358.
4359<p>foo_bar_baz</p>
4360.
4361
4362.
4363foo__bar__baz
4364.
4365<p>foo__bar__baz</p>
4366.
4367
4368.
4369_foo_bar_baz_
4370.
4371<p><em>foo_bar_baz</em></p>
4372.
4373
4374.
437511*15*32
4376.
4377<p>11<em>15</em>32</p>
4378.
4379
4380.
438111_15_32
4382.
4383<p>11_15_32</p>
4384.
4385
4386Internal underscores will be ignored in underscore-delimited
4387emphasis:
4388
4389.
4390_foo_bar_baz_
4391.
4392<p><em>foo_bar_baz</em></p>
4393.
4394
4395.
4396__foo__bar__baz__
4397.
4398<p><strong>foo__bar__baz</strong></p>
4399.
4400
4401The rules are sufficient for the following nesting patterns:
4402
4403.
4404***foo bar***
4405.
4406<p><strong><em>foo bar</em></strong></p>
4407.
4408
4409.
4410___foo bar___
4411.
4412<p><strong><em>foo bar</em></strong></p>
4413.
4414
4415.
4416***foo** bar*
4417.
4418<p><em><strong>foo</strong> bar</em></p>
4419.
4420
4421.
4422___foo__ bar_
4423.
4424<p><em><strong>foo</strong> bar</em></p>
4425.
4426
4427.
4428***foo* bar**
4429.
4430<p><strong><em>foo</em> bar</strong></p>
4431.
4432
4433.
4434___foo_ bar__
4435.
4436<p><strong><em>foo</em> bar</strong></p>
4437.
4438
4439.
4440*foo **bar***
4441.
4442<p><em>foo <strong>bar</strong></em></p>
4443.
4444
4445.
4446_foo __bar___
4447.
4448<p><em>foo <strong>bar</strong></em></p>
4449.
4450
4451.
4452**foo *bar***
4453.
4454<p><strong>foo <em>bar</em></strong></p>
4455.
4456
4457.
4458__foo _bar___
4459.
4460<p><strong>foo <em>bar</em></strong></p>
4461.
4462
4463.
4464*foo **bar***
4465.
4466<p><em>foo <strong>bar</strong></em></p>
4467.
4468
4469.
4470_foo __bar___
4471.
4472<p><em>foo <strong>bar</strong></em></p>
4473.
4474
4475.
4476*foo *bar* baz*
4477.
4478<p><em>foo <em>bar</em> baz</em></p>
4479.
4480
4481.
4482_foo _bar_ baz_
4483.
4484<p><em>foo <em>bar</em> baz</em></p>
4485.
4486
4487.
4488**foo **bar** baz**
4489.
4490<p><strong>foo <strong>bar</strong> baz</strong></p>
4491.
4492
4493.
4494__foo __bar__ baz__
4495.
4496<p><strong>foo <strong>bar</strong> baz</strong></p>
4497.
4498
4499.
4500*foo **bar** baz*
4501.
4502<p><em>foo <strong>bar</strong> baz</em></p>
4503.
4504
4505.
4506_foo __bar__ baz_
4507.
4508<p><em>foo <strong>bar</strong> baz</em></p>
4509.
4510
4511.
4512**foo *bar* baz**
4513.
4514<p><strong>foo <em>bar</em> baz</strong></p>
4515.
4516
4517.
4518__foo _bar_ baz__
4519.
4520<p><strong>foo <em>bar</em> baz</strong></p>
4521.
4522
4523Note that you cannot nest emphasis directly inside emphasis
4524using the same delimiter, or strong emphasis directly inside
4525strong emphasis:
4526
4527.
4528**foo**
4529.
4530<p><strong>foo</strong></p>
4531.
4532
4533.
4534****foo****
4535.
4536<p>****foo****</p>
4537.
4538
4539For these nestings, you need to switch delimiters:
4540
4541.
4542*_foo_*
4543.
4544<p><em><em>foo</em></em></p>
4545.
4546
4547.
4548**__foo__**
4549.
4550<p><strong><strong>foo</strong></strong></p>
4551.
4552
4553Note that a `*` followed by a `*` can close emphasis, and
4554a `**` followed by a `*` can close strong emphasis (and
4555similarly for `_` and `__`):
4556
4557.
4558*foo**
4559.
4560<p><em>foo</em>*</p>
4561.
4562
4563.
4564*foo *bar**
4565.
4566<p><em>foo <em>bar</em></em></p>
4567.
4568
4569.
4570**foo***
4571.
4572<p><strong>foo</strong>*</p>
4573.
4574
4575.
4576***foo* bar***
4577.
4578<p><strong><em>foo</em> bar</strong>*</p>
4579.
4580
4581.
4582***foo** bar***
4583.
4584<p><em><strong>foo</strong> bar</em>**</p>
4585.
4586
4587The following contains no strong emphasis, because the opening
4588delimiter is closed by the first `*` before `bar`:
4589
4590.
4591*foo**bar***
4592.
4593<p><em>foo</em><em>bar</em>**</p>
4594.
4595
4596However, a string of four or more `*` characters can never close emphasis:
4597
4598.
4599*foo****
4600.
4601<p>*foo****</p>
4602.
4603
4604Note that there are some asymmetries here:
4605
4606.
4607*foo**
4608
4609**foo*
4610.
4611<p><em>foo</em>*</p>
4612<p>**foo*</p>
4613.
4614
4615.
4616*foo *bar**
4617
4618**foo* bar*
4619.
4620<p><em>foo <em>bar</em></em></p>
4621<p>**foo* bar*</p>
4622.
4623
4624More cases with mismatched delimiters:
4625
4626.
4627**foo* bar*
4628.
4629<p>**foo* bar*</p>
4630.
4631
4632.
4633*bar***
4634.
4635<p><em>bar</em>**</p>
4636.
4637
4638.
4639***foo*
4640.
4641<p>***foo*</p>
4642.
4643
4644.
4645**bar***
4646.
4647<p><strong>bar</strong>*</p>
4648.
4649
4650.
4651***foo**
4652.
4653<p>***foo**</p>
4654.
4655
4656.
4657***foo *bar*
4658.
4659<p>***foo <em>bar</em></p>
4660.
4661
4662## Links
4663
4664A link contains a [link label](#link-label) (the visible text),
4665a [destination](#link-destination) (the URI that is the link destination),
4666and optionally a [link title](#link-title).  There are two basic kinds
4667of links in Markdown.  In [inline links](#inline-link) the destination
4668and title are given immediately after the label.  In [reference
4669links](#reference-link) the destination and title are defined elsewhere
4670in the document.
4671
4672A [link label](#link-label) <a id="link-label"></a>  consists of
4673
4674- an opening `[`, followed by
4675- zero or more backtick code spans, autolinks, HTML tags, link labels,
4676  backslash-escaped ASCII punctuation characters, or non-`]` characters,
4677  followed by
4678- a closing `]`.
4679
4680These rules are motivated by the following intuitive ideas:
4681
4682- A link label is a container for inline elements.
4683- The square brackets bind more tightly than emphasis markers,
4684  but less tightly than `<>` or `` ` ``.
4685- Link labels may contain material in matching square brackets.
4686
4687A [link destination](#link-destination) <a id="link-destination"></a>
4688consists of either
4689
4690- a sequence of zero or more characters between an opening `<` and a
4691  closing `>` that contains no line breaks or unescaped `<` or `>`
4692  characters, or
4693
4694- a nonempty sequence of characters that does not include
4695  ASCII space or control characters, and includes parentheses
4696  only if (a) they are backslash-escaped or (b) they are part of
4697  a balanced pair of unescaped parentheses that is not itself
4698  inside a balanced pair of unescaped parentheses.
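
The two forms of link destination can be checked mechanically.  The
following Python sketch is illustrative only and is not part of the
specification: the function names are hypothetical, and it assumes that a
backslash escapes any ASCII punctuation character (taken here from
`string.punctuation`).

```python
import string


def _is_escape(s: str, i: int) -> bool:
    """True if s[i] is a backslash escaping an ASCII punctuation character."""
    return s[i] == "\\" and i + 1 < len(s) and s[i + 1] in string.punctuation


def _is_pointy(s: str) -> bool:
    """First form: <...> with no line breaks or unescaped < or > inside."""
    if not s.startswith("<"):
        return False
    i = 1
    while i < len(s):
        if _is_escape(s, i):
            i += 2
            continue
        c = s[i]
        if c == ">":
            # The first unescaped > must be the closing delimiter.
            return i == len(s) - 1
        if c in "<\n\r":
            return False
        i += 1
    return False


def _is_bare(s: str) -> bool:
    """Second form: no spaces or control characters, one level of parentheses."""
    if not s:
        return False
    depth = 0
    i = 0
    while i < len(s):
        if _is_escape(s, i):
            i += 2
            continue
        c = s[i]
        if c == " " or ord(c) < 0x20 or ord(c) == 0x7F:
            return False
        if c == "(":
            depth += 1
            if depth > 1:  # a pair inside another unescaped pair
                return False
        elif c == ")":
            if depth == 0:  # unbalanced closing parenthesis
                return False
            depth -= 1
        i += 1
    return depth == 0


def is_link_destination(s: str) -> bool:
    return _is_pointy(s) or _is_bare(s)
```

Under these assumptions `is_link_destination("(foo)and(bar)")` holds, while
`is_link_destination("foo(and(bar))")` does not, because the inner pair is
nested.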
4699
4700A [link title](#link-title) <a id="link-title"></a>  consists of either
4701
4702- a sequence of zero or more characters between straight double-quote
4703  characters (`"`), including a `"` character only if it is
4704  backslash-escaped, or
4705
4706- a sequence of zero or more characters between straight single-quote
4707  characters (`'`), including a `'` character only if it is
4708  backslash-escaped, or
4709
4710- a sequence of zero or more characters between matching parentheses
4711  (`(...)`), including a `)` character only if it is backslash-escaped.
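
The three title forms differ only in their delimiters, so one scanner can
cover all of them.  This is a minimal Python sketch, not a normative
algorithm; `is_link_title` is a hypothetical name, and backslash escapes
are assumed to apply to ASCII punctuation only.

```python
import string

# Opening delimiter -> required closing delimiter.
CLOSERS = {'"': '"', "'": "'", "(": ")"}


def is_link_title(s: str) -> bool:
    """True if s is a complete link title, including its delimiters."""
    if len(s) < 2 or s[0] not in CLOSERS:
        return False
    closer = CLOSERS[s[0]]
    i = 1
    while i < len(s):
        c = s[i]
        if c == "\\" and i + 1 < len(s) and s[i + 1] in string.punctuation:
            i += 2  # skip the escaped character
            continue
        if c == closer:
            # The first unescaped closer must end the title.
            return i == len(s) - 1
        i += 1
    return False
```

For instance, `is_link_title('"title "and" title"')` is false because the
second `"` closes the title early, whereas the single-quoted form
`'title "and" title'` is accepted.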
4712
4713An [inline link](#inline-link) <a id="inline-link"></a>
4714consists of a [link label](#link-label) followed immediately
4715by a left parenthesis `(`, optional whitespace,
4716an optional [link destination](#link-destination),
4717an optional [link title](#link-title) separated from the link
4718destination by whitespace, optional whitespace, and a right
4719parenthesis `)`.  The link's text consists of the label (excluding
4720the enclosing square brackets) parsed as inlines.  The link's
4721URI consists of the link destination, excluding enclosing `<...>` if
4722present, with backslash-escapes in effect as described above.  The
4723link's title consists of the link title, excluding its enclosing
4724delimiters, with backslash-escapes in effect as described above.
4725
4726Here is a simple inline link:
4727
4728.
4729[link](/uri "title")
4730.
4731<p><a href="/uri" title="title">link</a></p>
4732.
4733
4734The title may be omitted:
4735
4736.
4737[link](/uri)
4738.
4739<p><a href="/uri">link</a></p>
4740.
4741
4742Both the title and the destination may be omitted:
4743
4744.
4745[link]()
4746.
4747<p><a href="">link</a></p>
4748.
4749
4750.
4751[link](<>)
4752.
4753<p><a href="">link</a></p>
4754.
4755
4756
4757If the destination contains spaces, it must be enclosed in pointy
4758braces:
4759
4760.
4761[link](/my uri)
4762.
4763<p>[link](/my uri)</p>
4764.
4765
4766.
4767[link](</my uri>)
4768.
4769<p><a href="/my%20uri">link</a></p>
4770.
4771
4772The destination cannot contain line breaks, even with pointy braces:
4773
4774.
4775[link](foo
4776bar)
4777.
4778<p>[link](foo
4779bar)</p>
4780.
4781
4782One level of balanced parentheses is allowed without escaping:
4783
4784.
4785[link]((foo)and(bar))
4786.
4787<p><a href="(foo)and(bar)">link</a></p>
4788.
4789
4790However, if you have parentheses within parentheses, you need to escape
4791or use the `<...>` form:
4792
4793.
4794[link](foo(and(bar)))
4795.
4796<p>[link](foo(and(bar)))</p>
4797.
4798
4799.
4800[link](foo(and\(bar\)))
4801.
4802<p><a href="foo(and(bar))">link</a></p>
4803.
4804
4805.
4806[link](<foo(and(bar))>)
4807.
4808<p><a href="foo(and(bar))">link</a></p>
4809.
4810
4811Parentheses and other symbols can also be escaped, as usual
4812in Markdown:
4813
4814.
4815[link](foo\)\:)
4816.
4817<p><a href="foo):">link</a></p>
4818.
4819
4820URL-escaping should be left alone inside the destination, as all URL-escaped characters
4821are also valid URL characters. HTML entities in the destination will be parsed into the
4822corresponding Unicode code points, as usual, and optionally URL-escaped when written as HTML.
4823
4824.
4825[link](foo%20b&auml;)
4826.
4827<p><a href="foo%20b%C3%A4">link</a></p>
4828.
4829
4830Note that, because titles can often be parsed as destinations,
4831if you try to omit the destination and keep the title, you'll
4832get unexpected results:
4833
4834.
4835[link]("title")
4836.
4837<p><a href="%22title%22">link</a></p>
4838.
4839
4840Titles may be in single quotes, double quotes, or parentheses:
4841
4842.
4843[link](/url "title")
4844[link](/url 'title')
4845[link](/url (title))
4846.
4847<p><a href="/url" title="title">link</a>
4848<a href="/url" title="title">link</a>
4849<a href="/url" title="title">link</a></p>
4850.
4851
4852Backslash escapes and entities may be used in titles:
4853
4854.
4855[link](/url "title \"&quot;")
4856.
4857<p><a href="/url" title="title &quot;&quot;">link</a></p>
4858.
4859
4860Nested balanced quotes are not allowed without escaping:
4861
4862.
4863[link](/url "title "and" title")
4864.
4865<p>[link](/url &quot;title &quot;and&quot; title&quot;)</p>
4866.
4867
4868But it is easy to work around this by using a different quote type:
4869
4870.
4871[link](/url 'title "and" title')
4872.
4873<p><a href="/url" title="title &quot;and&quot; title">link</a></p>
4874.
4875
4876(Note:  `Markdown.pl` did allow double quotes inside a double-quoted
4877title, and its test suite included a test demonstrating this.
4878But it is hard to see a good rationale for the extra complexity this
4879brings, since there are already many ways---backslash escaping,
4880entities, or using a different quote type for the enclosing title---to
4881write titles containing double quotes.  `Markdown.pl`'s handling of
4882titles has a number of other strange features.  For example, it allows
4883single-quoted titles in inline links, but not reference links.  And, in
4884reference links but not inline links, it allows a title to begin with
4885`"` and end with `)`.  `Markdown.pl` 1.0.1 even allows titles with no closing
4886quotation mark, though 1.0.2b8 does not.  It seems preferable to adopt
4887a simple, rational rule that works the same way in inline links and
4888link reference definitions.)
4889
4890Whitespace is allowed around the destination and title:
4891
4892.
4893[link](   /uri
4894  "title"  )
4895.
4896<p><a href="/uri" title="title">link</a></p>
4897.
4898
4899But it is not allowed between the link label and the
4900following parenthesis:
4901
4902.
4903[link] (/uri)
4904.
4905<p>[link] (/uri)</p>
4906.
4907
4908Note that this is not a link, because the closing `]` occurs in
4909an HTML tag:
4910
4911.
4912[foo <bar attr="](baz)">
4913.
4914<p>[foo <bar attr="](baz)"></p>
4915.
4916
4917
4918There are three kinds of [reference links](#reference-link):
4919<a id="reference-link"></a>
4920
4921A [full reference link](#full-reference-link) <a id="full-reference-link"></a>
4922consists of a [link label](#link-label), optional whitespace, and
4923another [link label](#link-label) that [matches](#matches) a
4924[link reference definition](#link-reference-definition) elsewhere in the
4925document.
4926
4927One label [matches](#matches) <a id="matches"></a>
4928another just in case their normalized forms are equal.  To normalize a
4929label, perform the *Unicode case fold* and collapse consecutive internal
4930whitespace to a single space.  If there are multiple matching link
4931reference definitions, the one that comes first in the document is used.  (It
4932is desirable in such cases to emit a warning.)
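
As an illustration, label matching might be sketched in Python as follows.
The helper names are hypothetical, and `str.casefold` is assumed to be an
adequate stand-in for the Unicode case fold; leading and trailing
whitespace is left untouched, since the rule above mentions only internal
whitespace.

```python
import re


def normalize_label(label: str) -> str:
    """Collapse each run of whitespace to one space, then case-fold."""
    return re.sub(r"\s+", " ", label).casefold()


def labels_match(a: str, b: str) -> bool:
    """Two labels match when their normalized forms are equal."""
    return normalize_label(a) == normalize_label(b)


def add_definition(refmap: dict, label: str, url: str) -> None:
    """Record a definition; an earlier definition for the same label wins."""
    refmap.setdefault(normalize_label(label), url)


refmap = {}
add_definition(refmap, "foo", "/url1")
add_definition(refmap, "FOO", "/url2")  # ignored: "foo" is already defined
```

Note that matching works on the raw label text, so `foo\!` and `foo!` do
not match even though they produce the same inline content.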
4933
4934The contents of the first link label are parsed as inlines, which are
4935used as the link's text.  The link's URI and title are provided by the
4936matching [link reference definition](#link-reference-definition).
4937
4938Here is a simple example:
4939
4940.
4941[foo][bar]
4942
4943[bar]: /url "title"
4944.
4945<p><a href="/url" title="title">foo</a></p>
4946.
4947
4948The first label can contain inline content:
4949
4950.
4951[*foo\!*][bar]
4952
4953[bar]: /url "title"
4954.
4955<p><a href="/url" title="title"><em>foo!</em></a></p>
4956.
4957
4958Matching is case-insensitive:
4959
4960.
4961[foo][BaR]
4962
4963[bar]: /url "title"
4964.
4965<p><a href="/url" title="title">foo</a></p>
4966.
4967
4968Unicode case fold is used:
4969
4970.
4971[Толпой][Толпой] is a Russian word.
4972
4973[ТОЛПОЙ]: /url
4974.
4975<p><a href="/url">Толпой</a> is a Russian word.</p>
4976.
4977
4978Consecutive internal whitespace is treated as one space for
4979purposes of determining matching:
4980
4981.
4982[Foo
4983  bar]: /url
4984
4985[Baz][Foo bar]
4986.
4987<p><a href="/url">Baz</a></p>
4988.
4989
4990There can be whitespace between the two labels:
4991
4992.
4993[foo] [bar]
4994
4995[bar]: /url "title"
4996.
4997<p><a href="/url" title="title">foo</a></p>
4998.
4999
5000.
5001[foo]
5002[bar]
5003
5004[bar]: /url "title"
5005.
5006<p><a href="/url" title="title">foo</a></p>
5007.
5008
5009When there are multiple matching [link reference
5010definitions](#link-reference-definition), the first is used:
5011
5012.
5013[foo]: /url1
5014
5015[foo]: /url2
5016
5017[bar][foo]
5018.
5019<p><a href="/url1">bar</a></p>
5020.
5021
5022Note that matching is performed on normalized strings, not parsed
5023inline content.  So the following does not match, even though the
5024labels define equivalent inline content:
5025
5026.
5027[bar][foo\!]
5028
5029[foo!]: /url
5030.
5031<p>[bar][foo!]</p>
5032.
5033
5034A [collapsed reference link](#collapsed-reference-link)
5035<a id="collapsed-reference-link"></a> consists of a [link
5036label](#link-label) that [matches](#matches) a [link reference
5037definition](#link-reference-definition) elsewhere in the
5038document, optional whitespace, and the string `[]`.  The contents of the
5039first link label are parsed as inlines, which are used as the link's
5040text.  The link's URI and title are provided by the matching link
5041reference definition.  Thus, `[foo][]` is equivalent to `[foo][foo]`.
5042
5043.
5044[foo][]
5045
5046[foo]: /url "title"
5047.
5048<p><a href="/url" title="title">foo</a></p>
5049.
5050
5051.
5052[*foo* bar][]
5053
5054[*foo* bar]: /url "title"
5055.
5056<p><a href="/url" title="title"><em>foo</em> bar</a></p>
5057.
5058
5059The link labels are case-insensitive:
5060
5061.
5062[Foo][]
5063
5064[foo]: /url "title"
5065.
5066<p><a href="/url" title="title">Foo</a></p>
5067.
5068
5069
5070As with full reference links, whitespace is allowed
5071between the two sets of brackets:
5072
5073.
5074[foo]
5075[]
5076
5077[foo]: /url "title"
5078.
5079<p><a href="/url" title="title">foo</a></p>
5080.
5081
5082A [shortcut reference link](#shortcut-reference-link)
5083<a id="shortcut-reference-link"></a> consists of a [link
5084label](#link-label) that [matches](#matches) a [link reference
5085definition](#link-reference-definition)  elsewhere in the
5086document and is not followed by `[]` or a link label.
5087The contents of the first link label are parsed as inlines,
5088which are used as the link's text.  The link's URI and title
5089are provided by the matching link reference definition.
5090Thus, `[foo]` is equivalent to `[foo][]`.
5091
5092.
5093[foo]
5094
5095[foo]: /url "title"
5096.
5097<p><a href="/url" title="title">foo</a></p>
5098.
5099
5100.
5101[*foo* bar]
5102
5103[*foo* bar]: /url "title"
5104.
5105<p><a href="/url" title="title"><em>foo</em> bar</a></p>
5106.
5107
5108.
5109[[*foo* bar]]
5110
5111[*foo* bar]: /url "title"
5112.
5113<p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
5114.
5115
5116The link labels are case-insensitive:
5117
5118.
5119[Foo]
5120
5121[foo]: /url "title"
5122.
5123<p><a href="/url" title="title">Foo</a></p>
5124.
5125
5126If you just want bracketed text, you can backslash-escape the
5127opening bracket to avoid links:
5128
5129.
5130\[foo]
5131
5132[foo]: /url "title"
5133.
5134<p>[foo]</p>
5135.
5136
5137Note that this is a link, because link labels bind more tightly
5138than emphasis:
5139
5140.
5141[foo*]: /url
5142
5143*[foo*]
5144.
5145<p>*<a href="/url">foo*</a></p>
5146.
5147
5148However, this is not, because link labels bind less
5149tightly than code backticks:
5150
5151.
5152[foo`]: /url
5153
5154[foo`]`
5155.
5156<p>[foo<code>]</code></p>
5157.
5158
5159Link labels can contain matched square brackets:
5160
5161.
5162[[[foo]]]
5163
5164[[[foo]]]: /url
5165.
5166<p><a href="/url">[[foo]]</a></p>
5167.
5168
5169.
5170[[[foo]]]
5171
5172[[[foo]]]: /url1
5173[foo]: /url2
5174.
5175<p><a href="/url1">[[foo]]</a></p>
5176.
5177
5178For non-matching brackets, use backslash escapes:
5179
5180.
5181[\[foo]
5182
5183[\[foo]: /url
5184.
5185<p><a href="/url">[foo</a></p>
5186.
5187
5188Full references take precedence over shortcut references:
5189
5190.
5191[foo][bar]
5192
5193[foo]: /url1
5194[bar]: /url2
5195.
5196<p><a href="/url2">foo</a></p>
5197.
5198
5199In the following case `[bar][baz]` is parsed as a reference,
5200`[foo]` as normal text:
5201
5202.
5203[foo][bar][baz]
5204
5205[baz]: /url
5206.
5207<p>[foo]<a href="/url">bar</a></p>
5208.
5209
5210Here, though, `[foo][bar]` is parsed as a reference, since
5211`[bar]` is defined:
5212
5213.
5214[foo][bar][baz]
5215
5216[baz]: /url1
5217[bar]: /url2
5218.
5219<p><a href="/url2">foo</a><a href="/url1">baz</a></p>
5220.
5221
5222Here `[foo]` is not parsed as a shortcut reference, because it
5223is followed by a link label (even though `[bar]` is not defined):
5224
5225.
5226[foo][bar][baz]
5227
5228[baz]: /url1
5229[foo]: /url2
5230.
5231<p>[foo]<a href="/url1">bar</a></p>
5232.
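
The precedence among full, collapsed, and shortcut references in the
examples above can be summarized in a short Python sketch.  It is a
simplification under stated assumptions: the input is already split into a
list of adjacent label texts (an empty string standing for `[]`),
whitespace between labels is ignored, label normalization is reduced to
`str.casefold`, and `resolve` is a hypothetical name.

```python
def resolve(labels, defs):
    """Render adjacent bracketed labels left to right.

    labels: label texts in source order, "" for an empty `[]`.
    defs:   mapping from case-folded label to URL.
    """
    out = []
    i = 0
    while i < len(labels):
        text = labels[i]
        nxt = labels[i + 1] if i + 1 < len(labels) else None
        if nxt is not None:
            # Followed by a label (or `[]`): only a full or collapsed
            # reference is possible, never a shortcut.
            key = (text if nxt == "" else nxt).casefold()
            if key in defs:
                out.append(f'<a href="{defs[key]}">{text}</a>')
                i += 2
            else:
                out.append(f"[{text}]")
                i += 1
        elif text.casefold() in defs:
            # Not followed by anything: shortcut reference.
            out.append(f'<a href="{defs[text.casefold()]}">{text}</a>')
            i += 1
        else:
            out.append(f"[{text}]")
            i += 1
    return "".join(out)
```

With only `baz` defined, `resolve(["foo", "bar", "baz"], {"baz": "/url"})`
leaves `[foo]` as text and links `bar`, as in the first of the three
examples.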
5233
5234
5235## Images
5236
5237An (unescaped) exclamation mark (`!`) followed by a reference or
5238inline link will be parsed as an image.  The link label will be
5239used as the image's alt text, and the link title, if any, will
5240be used as the image's title.
5241
5242.
5243![foo](/url "title")
5244.
5245<p><img src="/url" alt="foo" title="title" /></p>
5246.
5247
5248.
5249![foo *bar*]
5250
5251[foo *bar*]: train.jpg "train & tracks"
5252.
5253<p><img src="train.jpg" alt="foo &lt;em&gt;bar&lt;/em&gt;" title="train &amp; tracks" /></p>
5254.
5255
5256.
5257![foo *bar*][]
5258
5259[foo *bar*]: train.jpg "train & tracks"
5260.
5261<p><img src="train.jpg" alt="foo &lt;em&gt;bar&lt;/em&gt;" title="train &amp; tracks" /></p>
5262.
5263
5264.
5265![foo *bar*][foobar]
5266
5267[FOOBAR]: train.jpg "train & tracks"
5268.
5269<p><img src="train.jpg" alt="foo &lt;em&gt;bar&lt;/em&gt;" title="train &amp; tracks" /></p>
5270.
5271
5272.
5273![foo](train.jpg)
5274.
5275<p><img src="train.jpg" alt="foo" /></p>
5276.
5277
5278.
5279My ![foo bar](/path/to/train.jpg  "title"   )
5280.
5281<p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
5282.
5283
5284.
5285![foo](<url>)
5286.
5287<p><img src="url" alt="foo" /></p>
5288.
5289
5290.
5291![](/url)
5292.
5293<p><img src="/url" alt="" /></p>
5294.
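
Once the parts of an image have been parsed, producing the HTML shown in
these examples is straightforward.  The Python sketch below assumes the alt
text has already been reduced to a plain string; `render_image` is a
hypothetical helper, not something the specification defines.

```python
import html


def render_image(src, alt, title=None):
    """Build an <img> tag; the title attribute is omitted when absent."""
    attrs = f'src="{html.escape(src)}" alt="{html.escape(alt)}"'
    if title is not None:
        attrs += f' title="{html.escape(title)}"'
    return f"<img {attrs} />"
```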
5295
5296Reference-style:
5297
5298.
5299![foo] [bar]
5300
5301[bar]: /url
5302.
5303<p><img src="/url" alt="foo" /></p>
5304.
5305
5306.
5307![foo] [bar]
5308
5309[BAR]: /url
5310.
5311<p><img src="/url" alt="foo" /></p>
5312.
5313
5314Collapsed:
5315
5316.
5317![foo][]
5318
5319[foo]: /url "title"
5320.
5321<p><img src="/url" alt="foo" title="title" /></p>
5322.
5323
5324.
5325![*foo* bar][]
5326
5327[*foo* bar]: /url "title"
5328.
5329<p><img src="/url" alt="&lt;em&gt;foo&lt;/em&gt; bar" title="title" /></p>
5330.
5331
5332The labels are case-insensitive:
5333
5334.
5335![Foo][]
5336
5337[foo]: /url "title"
5338.
5339<p><img src="/url" alt="Foo" title="title" /></p>
5340.
5341
5342As with full reference links, whitespace is allowed
5343between the two sets of brackets:
5344
5345.
5346![foo]
5347[]
5348
5349[foo]: /url "title"
5350.
5351<p><img src="/url" alt="foo" title="title" /></p>
5352.
5353
5354Shortcut:
5355
5356.
5357![foo]
5358
5359[foo]: /url "title"
5360.
5361<p><img src="/url" alt="foo" title="title" /></p>
5362.
5363
5364.
5365![*foo* bar]
5366
5367[*foo* bar]: /url "title"
5368.
5369<p><img src="/url" alt="&lt;em&gt;foo&lt;/em&gt; bar" title="title" /></p>
5370.
5371
5372.
5373![[foo]]
5374
5375[[foo]]: /url "title"
5376.
5377<p><img src="/url" alt="[foo]" title="title" /></p>
5378.
5379
5380The link labels are case-insensitive:
5381
5382.
5383![Foo]
5384
5385[foo]: /url "title"
5386.
5387<p><img src="/url" alt="Foo" title="title" /></p>
5388.
5389
5390If you just want bracketed text, you can backslash-escape the
5391opening `!` and `[`:
5392
5393.
5394\!\[foo]
5395
5396[foo]: /url "title"
5397.
5398<p>![foo]</p>
5399.
5400
5401If you want a link after a literal `!`, backslash-escape the
5402`!`:
5403
5404.
5405\![foo]
5406
5407[foo]: /url "title"
5408.
5409<p>!<a href="/url" title="title">foo</a></p>
5410.
5411
5412## Autolinks
5413
5414Autolinks are absolute URIs and email addresses inside `<` and `>`.
5415They are parsed as links, with the URL or email address as the link
5416label.
5417
5418A [URI autolink](#uri-autolink) <a id="uri-autolink"></a>
5419consists of `<`, followed by an [absolute
5420URI](#absolute-uri) not containing `<`, followed by `>`.  It is parsed
5421as a link to the URI, with the URI as the link's label.
5422
5423An [absolute URI](#absolute-uri), <a id="absolute-uri"></a>
5424for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`)
5425followed by zero or more characters other than ASCII whitespace and
5426control characters, `<`, and `>`.  If the URI includes these characters,
5427you must use percent-encoding (e.g. `%20` for a space).
5428
5429The following [schemes](#scheme) <a id="scheme"></a>
5430are recognized (case-insensitive):
5431`coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`,
5432`cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`,
5433`gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`,
5434`ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`,
5435`mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`,
5436`ni`, `nih`, `nntp`, `opaquelocktoken`, `pop`, `pres`, `rtsp`,
5437`service`, `session`, `shttp`, `sieve`, `sip`, `sips`, `sms`, `snmp`,
5438`soap.beep`, `soap.beeps`, `tag`, `tel`, `telnet`, `tftp`, `thismessage`,
5439`tn3270`, `tip`, `tv`, `urn`, `vemmi`, `ws`, `wss`, `xcon`,
5440`xcon-userid`, `xmlrpc.beep`, `xmlrpc.beeps`, `xmpp`, `z39.50r`,
5441`z39.50s`, `adiumxtra`, `afp`, `afs`, `aim`, `apt`, `attachment`, `aw`,
5442`beshare`, `bitcoin`, `bolo`, `callto`, `chrome`, `chrome-extension`,
5443`com-eventbrite-attendee`, `content`, `cvs`, `dlna-playsingle`,
5444`dlna-playcontainer`, `dtn`, `dvb`, `ed2k`, `facetime`, `feed`,
5445`finger`, `fish`, `gg`, `git`, `gizmoproject`, `gtalk`, `hcp`, `icon`,
5446`ipn`, `irc`, `irc6`, `ircs`, `itms`, `jar`, `jms`, `keyparc`, `lastfm`,
5447`ldaps`, `magnet`, `maps`, `market`, `message`, `mms`, `ms-help`,
5448`msnim`, `mumble`, `mvn`, `notes`, `oid`, `palm`, `paparazzi`,
5449`platform`, `proxy`, `psyc`, `query`, `res`, `resource`, `rmi`, `rsync`,
5450`rtmp`, `secondlife`, `sftp`, `sgn`, `skype`, `smb`, `soldat`,
5451`spotify`, `ssh`, `steam`, `svn`, `teamspeak`, `things`, `udp`,
5452`unreal`, `ut2004`, `ventrilo`, `view-source`, `webcal`, `wtai`,
5453`wyciwyg`, `xfire`, `xri`, `ymsgr`.
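
A URI autolink recognizer along these lines might look like the following
Python sketch.  It is an illustration only: `SCHEMES` holds just a handful
of the schemes listed above (a real implementation would include all of
them), and `uri_autolink` is a hypothetical name.

```python
import html
import re

# A small, deliberately incomplete subset of the recognized schemes.
SCHEMES = {"http", "https", "ftp", "mailto", "irc", "file"}

# <scheme:rest>, where rest has no ASCII whitespace, control characters, < or >.
URI_AUTOLINK = re.compile(r"<([A-Za-z][A-Za-z0-9.+-]*):([^\x00-\x20\x7f<>]*)>")


def uri_autolink(text):
    """Return the HTML for a URI autolink, or None if text is not one."""
    m = URI_AUTOLINK.fullmatch(text)
    if m is None or m.group(1).lower() not in SCHEMES:
        return None
    uri = html.escape(text[1:-1])
    return f'<a href="{uri}">{uri}</a>'
```

The scheme comparison is case-insensitive, but the URI itself is emitted
as written, so `<MAILTO:FOO@BAR.BAZ>` keeps its uppercase form.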
5454
5455Here are some valid autolinks:
5456
5457.
5458<http://foo.bar.baz>
5459.
5460<p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
5461.
5462
5463.
5464<http://foo.bar.baz?q=hello&id=22&boolean>
5465.
5466<p><a href="http://foo.bar.baz?q=hello&amp;id=22&amp;boolean">http://foo.bar.baz?q=hello&amp;id=22&amp;boolean</a></p>
5467.
5468
5469.
5470<irc://foo.bar:2233/baz>
5471.
5472<p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>
5473.
5474
5475Uppercase is also fine:
5476
5477.
5478<MAILTO:FOO@BAR.BAZ>
5479.
5480<p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>
5481.
5482
5483Spaces are not allowed in autolinks:
5484
5485.
5486<http://foo.bar/baz bim>
5487.
5488<p>&lt;http://foo.bar/baz bim&gt;</p>
5489.
5490
5491An [email autolink](#email-autolink) <a id="email-autolink"></a>
5492consists of `<`, followed by an [email address](#email-address),
5493followed by `>`.  The link's label is the email address,
5494and the URL is `mailto:` followed by the email address.
5495
5496An [email address](#email-address), <a id="email-address"></a>
5497for these purposes, is anything that matches
5498the [non-normative regex from the HTML5
5499spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-mail-state-%28type=email%29):
5500
5501    /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
5502    (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
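
The same pattern carries over to other regex dialects.  Here is a hedged
Python sketch; `email_autolink` is a hypothetical helper, and
`re.fullmatch` stands in for the `^` and `$` anchors.

```python
import html
import re

# The pattern above, split across lines for readability.
EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def email_autolink(text):
    """Return the HTML for an email autolink, or None if text is not one."""
    if len(text) < 2 or text[0] != "<" or text[-1] != ">":
        return None
    address = text[1:-1]
    if EMAIL.fullmatch(address) is None:
        return None
    escaped = html.escape(address)
    return f'<a href="mailto:{escaped}">{escaped}</a>'
```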
5503
5504Examples of email autolinks:
5505
5506.
5507<foo@bar.baz.com>
5508.
5509<p><a href="mailto:foo@bar.baz.com">foo@bar.baz.com</a></p>
5510.
5511
5512.
5513<foo+special@Bar.baz-bar0.com>
5514.
5515<p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>
5516.
5517
5518These are not autolinks:
5519
5520.
5521<>
5522.
5523<p>&lt;&gt;</p>
5524.
5525
5526.
5527<heck://bing.bong>
5528.
5529<p>&lt;heck://bing.bong&gt;</p>
5530.
5531
5532.
5533< http://foo.bar >
5534.
5535<p>&lt; http://foo.bar &gt;</p>
5536.
5537
5538.
5539<foo.bar.baz>
5540.
5541<p>&lt;foo.bar.baz&gt;</p>
5542.
5543
5544.
5545<localhost:5001/foo>
5546.
5547<p>&lt;localhost:5001/foo&gt;</p>
5548.
5549
5550.
5551http://google.com
5552.
5553<p>http://google.com</p>
5554.
5555
5556.
5557foo@bar.baz.com
5558.
5559<p>foo@bar.baz.com</p>
5560.
5561
5562## Raw HTML
5563
5564Text between `<` and `>` that looks like an HTML tag is parsed as a
5565raw HTML tag and will be rendered in HTML without escaping.
5566Tag and attribute names are not limited to current HTML tags,
5567so custom tags (and even, say, DocBook tags) may be used.
5568
5569Here is the grammar for tags:
5570
5571A [tag name](#tag-name) <a id="tag-name"></a> consists of an ASCII letter
5572followed by zero or more ASCII letters or digits.
5573
5574An [attribute](#attribute) <a id="attribute"></a> consists of whitespace,
5575an [attribute name](#attribute-name), and an optional [attribute value
5576specification](#attribute-value-specification).
5577
5578An [attribute name](#attribute-name) <a id="attribute-name"></a>
5579consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
5580letters, digits, `_`, `.`, `:`, or `-`.  (Note:  This is the XML
5581specification restricted to ASCII.  HTML5 is laxer.)
5582
5583An [attribute value specification](#attribute-value-specification)
5584<a id="attribute-value-specification"></a> consists of optional whitespace,
5585a `=` character, optional whitespace, and an [attribute
5586value](#attribute-value).
5587
5588An [attribute value](#attribute-value) <a id="attribute-value"></a>
5589consists of an [unquoted attribute value](#unquoted-attribute-value),
5590a [single-quoted attribute value](#single-quoted-attribute-value),
5591or a [double-quoted attribute value](#double-quoted-attribute-value).
5592
5593An [unquoted attribute value](#unquoted-attribute-value)
5594<a id="unquoted-attribute-value"></a> is a nonempty string of characters not
5595including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``.
5596
5597A [single-quoted attribute value](#single-quoted-attribute-value)
5598<a id="single-quoted-attribute-value"></a> consists of `'`, zero or more
5599characters not including `'`, and a final `'`.
5600
5601A [double-quoted attribute value](#double-quoted-attribute-value)
5602<a id="double-quoted-attribute-value"></a> consists of `"`, zero or more
5603characters not including `"`, and a final `"`.
5604
5605An [open tag](#open-tag) <a id="open-tag"></a> consists of a `<` character,
5606a [tag name](#tag-name), zero or more [attributes](#attribute),
5607optional whitespace, an optional `/` character, and a `>` character.
5608
5609A [closing tag](#closing-tag) <a id="closing-tag"></a> consists of the
5610string `</`, a [tag name](#tag-name), optional whitespace, and the
5611character `>`.
5612
5613An [HTML comment](#html-comment) <a id="html-comment"></a> consists of the
5614string `<!--`, a string of characters not including the string `--`, and
5615the string `-->`.
5616
5617A [processing instruction](#processing-instruction)
5618<a id="processing-instruction"></a> consists of the string `<?`, a string
5619of characters not including the string `?>`, and the string
5620`?>`.
5621
5622A [declaration](#declaration) <a id="declaration"></a> consists of the
5623string `<!`, a name consisting of one or more uppercase ASCII letters,
5624whitespace, a string of characters not including the character `>`, and
5625the character `>`.
5626
5627A [CDATA section](#cdata-section) <a id="cdata-section"></a> consists of
5628the string `<![CDATA[`, a string of characters not including the string
5629`]]>`, and the string `]]>`.
5630
5631An [HTML tag](#html-tag) <a id="html-tag"></a> consists of an [open
5632tag](#open-tag), a [closing tag](#closing-tag), an [HTML
5633comment](#html-comment), a [processing
5634instruction](#processing-instruction), a
5635[declaration](#declaration), or a [CDATA
5636section](#cdata-section).
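
The grammar for open and closing tags translates almost directly into
regular expressions.  The following Python sketch is one possible
transcription, not a normative one: the constant and function names are
hypothetical, "whitespace" is taken to be the regex class `\s`, and
comments, processing instructions, declarations, and CDATA sections are
left out.

```python
import re

# Building blocks, following the definitions above (ASCII only).
TAG_NAME = r"[A-Za-z][A-Za-z0-9]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED = r"""[^\s"'=<>`]+"""
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = f"(?:{UNQUOTED}|{SINGLE_QUOTED}|{DOUBLE_QUOTED})"
ATTR_VALUE_SPEC = rf"(?:\s*=\s*{ATTR_VALUE})"
# An attribute always starts with whitespace; its value is optional.
ATTRIBUTE = rf"(?:\s+{ATTR_NAME}{ATTR_VALUE_SPEC}?)"

OPEN_TAG = re.compile(rf"<{TAG_NAME}{ATTRIBUTE}*\s*/?>")
CLOSING_TAG = re.compile(rf"</{TAG_NAME}\s*>")


def is_open_tag(text: str) -> bool:
    return OPEN_TAG.fullmatch(text) is not None


def is_closing_tag(text: str) -> bool:
    return CLOSING_TAG.fullmatch(text) is not None
```

Because each attribute must begin with whitespace, a string such as
`<a href='bar'title=title>` is rejected: nothing separates the first
attribute value from the next attribute name.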
5637
5638Here are some simple open tags:
5639
5640.
5641<a><bab><c2c>
5642.
5643<p><a><bab><c2c></p>
5644.
5645
5646Empty elements:
5647
5648.
5649<a/><b2/>
5650.
5651<p><a/><b2/></p>
5652.
5653
5654Whitespace is allowed:
5655
5656.
5657<a  /><b2
5658data="foo" >
5659.
5660<p><a  /><b2
5661data="foo" ></p>
5662.
5663
5664With attributes:
5665
5666.
5667<a foo="bar" bam = 'baz <em>"</em>'
5668_boolean zoop:33=zoop:33 />
5669.
5670<p><a foo="bar" bam = 'baz <em>"</em>'
5671_boolean zoop:33=zoop:33 /></p>
5672.
5673
5674Illegal tag names, not parsed as HTML:
5675
5676.
5677<33> <__>
5678.
5679<p>&lt;33&gt; &lt;__&gt;</p>
5680.
5681
5682Illegal attribute names:
5683
5684.
5685<a h*#ref="hi">
5686.
5687<p>&lt;a h*#ref=&quot;hi&quot;&gt;</p>
5688.
5689
5690Illegal attribute values:
5691
5692.
5693<a href="hi'> <a href=hi'>
5694.
5695<p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
5696.
5697
5698Illegal whitespace:
5699
5700.
5701< a><
5702foo><bar/ >
5703.
5704<p>&lt; a&gt;&lt;
5705foo&gt;&lt;bar/ &gt;</p>
5706.
5707
5708Missing whitespace:
5709
5710.
5711<a href='bar'title=title>
5712.
5713<p>&lt;a href='bar'title=title&gt;</p>
5714.
5715
5716Closing tags:
5717
5718.
5719</a>
5720</foo >
5721.
5722<p></a>
5723</foo ></p>
5724.
5725
5726Illegal attributes in closing tag:
5727
5728.
5729</a href="foo">
5730.
5731<p>&lt;/a href=&quot;foo&quot;&gt;</p>
5732.
5733
5734Comments:
5735
5736.
5737foo <!-- this is a
5738comment - with hyphen -->
5739.
5740<p>foo <!-- this is a
5741comment - with hyphen --></p>
5742.
5743
5744.
5745foo <!-- not a comment -- two hyphens -->
5746.
5747<p>foo &lt;!-- not a comment -- two hyphens --&gt;</p>
5748.
5749
5750Processing instructions:
5751
5752.
5753foo <?php echo $a; ?>
5754.
5755<p>foo <?php echo $a; ?></p>
5756.
5757
5758Declarations:
5759
5760.
5761foo <!ELEMENT br EMPTY>
5762.
5763<p>foo <!ELEMENT br EMPTY></p>
5764.
5765
5766CDATA sections:
5767
5768.
5769foo <![CDATA[>&<]]>
5770.
5771<p>foo <![CDATA[>&<]]></p>
5772.
5773
5774Entities are preserved in HTML attributes:
5775
5776.
5777<a href="&ouml;">
5778.
5779<p><a href="&ouml;"></p>
5780.
5781
5782Backslash escapes do not work in HTML attributes:
5783
5784.
5785<a href="\*">
5786.
5787<p><a href="\*"></p>
5788.
5789
5790.
5791<a href="\"">
5792.
5793<p>&lt;a href=&quot;&quot;&quot;&gt;</p>
5794.
5795
5796## Hard line breaks
5797
5798A line break (not in a code span or HTML tag) that is preceded
5799by two or more spaces is parsed as a hard line break (rendered
5800in HTML as a `<br />` tag):
5801
5802.
5803foo
5804baz
5805.
5806<p>foo<br />
5807baz</p>
5808.
5809
5810For a more visible alternative, a backslash before the newline may be
5811used instead of two spaces:
5812
5813.
5814foo\
5815baz
5816.
5817<p>foo<br />
5818baz</p>
5819.
5820
5821More than two spaces can be used:
5822
5823.
5824foo
5825baz
5826.
5827<p>foo<br />
5828baz</p>
5829.
5830
5831Leading spaces at the beginning of the next line are ignored:
5832
5833.
5834foo
5835     bar
5836.
5837<p>foo<br />
5838bar</p>
5839.
5840
5841.
5842foo\
5843     bar
5844.
5845<p>foo<br />
5846bar</p>
5847.
5848
5849Line breaks can occur inside emphasis, links, and other constructs
5850that allow inline content:
5851
5852.
5853*foo
5854bar*
5855.
5856<p><em>foo<br />
5857bar</em></p>
5858.
5859
5860.
5861*foo\
5862bar*
5863.
5864<p><em>foo<br />
5865bar</em></p>
5866.
5867
5868Line breaks do not occur inside code spans
5869
5870.
5871`code
5872span`
5873.
5874<p><code>code span</code></p>
5875.
5876
5877.
5878`code\
5879span`
5880.
5881<p><code>code\ span</code></p>
5882.
5883
5884or HTML tags:
5885
5886.
5887<a href="foo
5888bar">
5889.
5890<p><a href="foo
5891bar"></p>
5892.
5893
5894.
5895<a href="foo\
5896bar">
5897.
5898<p><a href="foo\
5899bar"></p>
5900.
5901
5902## Soft line breaks
5903
5904A regular line break (not in a code span or HTML tag) that is not
5905preceded by two or more spaces is parsed as a softbreak.  (A
5906softbreak may be rendered in HTML either as a newline or as a space.
5907The result will be the same in browsers. In the examples here, a
5908newline will be used.)
5909
5910.
5911foo
5912baz
5913.
5914<p>foo
5915baz</p>
5916.
5917
5918Spaces at the end of the line and beginning of the next line are
5919removed:
5920
5921.
5922foo
5923 baz
5924.
5925<p>foo
5926baz</p>
5927.
5928
5929A conforming parser may render a soft line break in HTML either as a
5930line break or as a space.
5931
5932A renderer may also provide an option to render soft line breaks
5933as hard line breaks.
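
Because trailing spaces are invisible in printed examples, it may help to
see hard and soft breaks with the whitespace spelled out in string
literals.  This Python sketch covers plain paragraph text only and ignores
code spans and HTML tags; `render_breaks` is a hypothetical name, and soft
breaks are rendered as newlines, as in the examples here.

```python
def render_breaks(paragraph: str) -> str:
    """Render the line breaks of a paragraph's raw text as HTML."""
    lines = paragraph.split("\n")
    out = []
    for index, line in enumerate(lines):
        if index > 0:
            # Spaces at the beginning of a continuation line are dropped.
            line = line.lstrip(" ")
        if index == len(lines) - 1:
            out.append(line)
        elif line.endswith("\\"):
            # Backslash before the newline: hard break.
            out.append(line[:-1] + "<br />\n")
        elif line.endswith("  "):
            # Two or more trailing spaces: hard break.
            out.append(line.rstrip(" ") + "<br />\n")
        else:
            # Anything else: soft break, trailing spaces removed.
            out.append(line.rstrip(" ") + "\n")
    return "".join(out)
```

For example, `render_breaks("foo  \nbaz")` yields a `<br />` after `foo`,
while `render_breaks("foo \n baz")` yields only a newline.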
5934
5935## Strings
5936
5937Any characters not given an interpretation by the above rules will
5938be parsed as string content.
5939
5940.
5941hello $.;'there
5942.
5943<p>hello $.;'there</p>
5944.
5945
5946.
5947Foo χρῆν
5948.
5949<p>Foo χρῆν</p>
5950.
5951
5952Internal spaces are preserved verbatim:
5953
5954.
5955Multiple     spaces
5956.
5957<p>Multiple     spaces</p>
5958.
5959
5960<!-- END TESTS -->
5961
# Appendix A: A parsing strategy {-}

## Overview {-}

Parsing has two phases:

1. In the first phase, lines of input are consumed and the block
structure of the document---its division into paragraphs, block quotes,
list items, and so on---is constructed.  Text is assigned to these
blocks but not parsed.  Link reference definitions are parsed and a
map of link references is constructed.

2. In the second phase, the raw text contents of paragraphs and headers
are parsed into sequences of Markdown inline elements (strings,
code spans, links, emphasis, and so on), using the map of link
references constructed in phase 1.
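
The reason for building the map in phase 1 is that a reference may
appear before its definition.  The following sketch shows one way the
map might be collected.  It rests on several assumptions that are not
taken from the text above: every definition fits on one line of the
form `[label]: destination`, titles are ignored, labels are compared
after case folding and whitespace collapsing, and the first
definition of a label is kept.  The names `collect_link_references`
and `normalize_label` are hypothetical.

```python
import re

# Simplifying assumption: a definition is a single line "[label]: destination".
DEFINITION = re.compile(r"^ {0,3}\[([^\]]+)\]:\s*(\S+)\s*$")

def normalize_label(label):
    """Collapse internal whitespace and fold case so labels compare loosely."""
    return " ".join(label.split()).casefold()

def collect_link_references(lines):
    """Phase 1 sketch: split lines into a reference map and remaining text."""
    refmap, remaining = {}, []
    for line in lines:
        match = DEFINITION.match(line)
        if match:
            # setdefault keeps the first definition seen for a label.
            refmap.setdefault(normalize_label(match.group(1)), match.group(2))
        else:
            remaining.append(line)
    return refmap, remaining

lines = ["See [the spec][CM].", "", "[cm]: http://example.com/spec"]
refmap, text_lines = collect_link_references(lines)
print(refmap[normalize_label("CM")])
```

By the time phase 2 reaches `[the spec][CM]`, the map is complete, so
the lookup succeeds even though the definition came last.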

## The document tree {-}

At each point in processing, the document is represented as a tree of
**blocks**.  The root of the tree is a `document` block.  The `document`
may have any number of other blocks as **children**.  These children
may, in turn, have other blocks as children.  The last child of a block
is normally considered **open**, meaning that subsequent lines of input
can alter its contents.  (Blocks that are not open are **closed**.)
Here, for example, is a possible document tree, with the open blocks
marked by arrows:

``` tree
-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
         list_item
           paragraph
             "Qui *quodsi iracundia*"
      -> list_item
        -> paragraph
             "aliquando id"
```
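
One possible in-memory representation of such a tree is sketched
below.  The `Block` class, its field names, and the `open_blocks`
helper are assumptions made for illustration; list attributes such as
`tight` are left out.  The sketch enforces the rule that only the last
child of a block can be open by closing the previous last child
whenever a new child is added.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Block:
    kind: str
    text: str = ""                      # raw, unparsed text
    children: List["Block"] = field(default_factory=list)
    is_open: bool = True

    def close(self):
        """Close this block and everything below it."""
        for child in self.children:
            child.close()
        self.is_open = False

    def add_child(self, child):
        """Append a child; the previous last child can no longer be open."""
        if self.children:
            self.children[-1].close()
        self.children.append(child)
        return child

def open_blocks(root):
    """Return the chain of open blocks, from the root to the deepest one."""
    chain = [root]
    while chain[-1].children and chain[-1].children[-1].is_open:
        chain.append(chain[-1].children[-1])
    return chain

# Rebuild the example tree shown above.
doc = Block("document")
quote = doc.add_child(Block("block_quote"))
quote.add_child(Block("paragraph", text="Lorem ipsum dolor\nsit amet."))
lst = quote.add_child(Block("list"))
item1 = lst.add_child(Block("list_item"))
item1.add_child(Block("paragraph", text="Qui *quodsi iracundia*"))
item2 = lst.add_child(Block("list_item"))
item2.add_child(Block("paragraph", text="aliquando id"))

print([block.kind for block in open_blocks(doc)])
```

The blocks returned by `open_blocks` correspond to the ones marked
with arrows in the tree.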

## How source lines alter the document tree {-}

Each line that is processed has an effect on this tree.  The line is
analyzed and, depending on its contents, the document may be altered
in one or more of the following ways:

1. One or more open blocks may be closed.
2. One or more new blocks may be created as children of the
   last open block.
3. Text may be added to the last (deepest) open block remaining
   on the tree.

Once a line has been incorporated into the tree in this way,
it can be discarded, so input can be read in a stream.

We can see how this works by considering how the tree above is
generated by four lines of Markdown:

``` markdown
> Lorem ipsum dolor
sit amet.
> - Qui *quodsi iracundia*
> - aliquando id
```

At the outset, our document model is just

``` tree
-> document
```

The first line of our text,

``` markdown
> Lorem ipsum dolor
```

causes a `block_quote` block to be created as a child of our
open `document` block, and a `paragraph` block as a child of
the `block_quote`.  Then the text is added to the last open
block, the `paragraph`:

``` tree
-> document
  -> block_quote
    -> paragraph
         "Lorem ipsum dolor"
```

The next line,

``` markdown
sit amet.
```

is a "lazy continuation" of the open `paragraph`, so it gets added
to the paragraph's text:

``` tree
-> document
  -> block_quote
    -> paragraph
         "Lorem ipsum dolor\nsit amet."
```

The third line,

``` markdown
> - Qui *quodsi iracundia*
```

causes the `paragraph` block to be closed, and a new `list` block
opened as a child of the `block_quote`.  A `list_item` is also
added as a child of the `list`, and a `paragraph` as a child of
the `list_item`.  The text is then added to the new `paragraph`:

``` tree
-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
      -> list_item
        -> paragraph
             "Qui *quodsi iracundia*"
```

The fourth line,

``` markdown
> - aliquando id
```

causes the `list_item` (and its child the `paragraph`) to be closed,
and a new `list_item` opened up as child of the `list`.  A `paragraph`
is added as a child of the new `list_item`, to contain the text.
We thus obtain the final tree:

``` tree
-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
         list_item
           paragraph
             "Qui *quodsi iracundia*"
      -> list_item
        -> paragraph
             "aliquando id"
```
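
The walk-through above can be reproduced with a deliberately small
sketch.  It is not a conforming parser: it recognizes only `>` block
quote markers, `- ` bullet markers with a fixed two-space content
indent, and paragraph text, and it ignores list attributes and most
blank-line rules.  All names (`Block`, `process_line`, `outline`) are
hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Block:
    kind: str
    text: str = ""
    children: List["Block"] = field(default_factory=list)
    is_open: bool = True

def close(block):
    for child in block.children:
        close(child)
    block.is_open = False

def add_child(parent, kind):
    if parent.children:
        close(parent.children[-1])       # only the last child may stay open
    parent.children.append(Block(kind))
    return parent.children[-1]

def last_open_child(block):
    last = block.children[-1] if block.children else None
    return last if last is not None and last.is_open else None

def strip_quote_marker(rest):
    rest = rest[1:]                      # drop ">"
    return rest[1:] if rest.startswith(" ") else rest

def process_line(doc, line):
    # Descend through the open containers that this line continues.
    path, rest = [doc], line
    while True:
        child = last_open_child(path[-1])
        if child is None or child.kind == "paragraph":
            break
        if child.kind == "block_quote":
            if not rest.startswith(">"):
                break
            rest = strip_quote_marker(rest)
        elif child.kind == "list_item":
            if not rest.startswith("  "):
                break
            rest = rest[2:]
        path.append(child)               # a list itself always matches
    # Collect markers that start new containers.
    starts = []
    while rest.startswith(">") or rest.startswith("- "):
        if rest.startswith(">"):
            starts.append("block_quote")
            rest = strip_quote_marker(rest)
        else:
            starts.append("list_item")
            rest = rest[2:]
    # Text with no new markers extends an open paragraph, even when some
    # containers were not matched ("lazy continuation").
    tip = doc
    while last_open_child(tip) is not None:
        tip = last_open_child(tip)
    if not starts and rest.strip() and tip.kind == "paragraph":
        tip.text += "\n" + rest
        return
    # 1. Close the open blocks that the line did not continue.
    container = path[-1]
    if last_open_child(container) is not None:
        close(last_open_child(container))
    if not starts and not rest.strip():
        return                           # blank line: nothing to add
    if container.kind == "list" and starts[:1] != ["list_item"]:
        close(container)                 # a list can only hold list items
        container = path[-2]
    # 2. Create new blocks as children of the last open block.
    for kind in starts:
        if kind == "list_item" and container.kind != "list":
            container = add_child(container, "list")
        container = add_child(container, kind)
    # 3. Add the remaining text to the deepest open block.
    if rest.strip():
        add_child(container, "paragraph").text = rest

def outline(block, depth=0):
    arrow = "-> " if block.is_open else "   "
    label = block.kind + (f" {block.text!r}" if block.text else "")
    lines = ["  " * depth + arrow + label]
    for child in block.children:
        lines.extend(outline(child, depth + 1))
    return lines

doc = Block("document")
for line in ["> Lorem ipsum dolor", "sit amet.",
             "> - Qui *quodsi iracundia*", "> - aliquando id"]:
    process_line(doc, line)
print("\n".join(outline(doc)))
```

Because `process_line` keeps no state apart from the tree itself,
each line can be discarded once it has been processed, which is what
allows input to be read as a stream.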

## From block structure to the final document {-}

Once all of the input has been parsed, all open blocks are closed.

We then "walk the tree," visiting every node, and parse the raw
string contents of paragraphs and headers as inlines.  At this
point we have seen all the link reference definitions, so we can
resolve reference links as we go.  For the example above, the
result is:

``` tree
document
  block_quote
    paragraph
      str "Lorem ipsum dolor"
      softbreak
      str "sit amet."
    list (type=bullet tight=true bullet_char=-)
      list_item
        paragraph
          str "Qui "
          emph
            str "quodsi iracundia"
      list_item
        paragraph
          str "aliquando id"
```

Notice how the newline in the first paragraph has been parsed as
a `softbreak`, and the asterisks in the first list item have become
an `emph`.

The document can be rendered as HTML, or in any other format, given
an appropriate renderer.
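
The tree walk can be sketched as follows.  The inline rules here are
far simpler than the ones this spec defines: the sketch handles only
plain text, newlines (always treated as soft breaks), and single
`*...*` spans, and it does not resolve links.  The nested
`(kind, content)` tuples and the names `parse_inlines` and `walk` are
assumptions made for this example.

```python
import re

# One token per match: an emphasized span, a newline, a run of plain
# text, or a stray asterisk that has no partner.
TOKEN = re.compile(r"\*([^*\n]+)\*|\n|[^*\n]+|\*")

def parse_inlines(raw):
    """Turn the raw text of a paragraph into a list of inline nodes."""
    nodes = []
    for match in TOKEN.finditer(raw):
        if match.group(1) is not None:
            nodes.append(("emph", [("str", match.group(1))]))
        elif match.group(0) == "\n":
            nodes.append(("softbreak",))
        else:
            nodes.append(("str", match.group(0)))
    return nodes

def walk(block):
    """Visit every node, parsing paragraph text and recursing elsewhere."""
    kind, content = block
    if kind == "paragraph":
        return (kind, parse_inlines(content))
    return (kind, [walk(child) for child in content])

tree = ("document", [
    ("block_quote", [
        ("paragraph", "Lorem ipsum dolor\nsit amet."),
        ("list", [
            ("list_item", [("paragraph", "Qui *quodsi iracundia*")]),
            ("list_item", [("paragraph", "aliquando id")]),
        ]),
    ]),
])
parsed = walk(tree)
print(parsed)
```

A renderer would then traverse `parsed` and emit output for each node
kind.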