what-is-markdown.md (202662B)
# Introduction

## What is Markdown?

Markdown is a plain text format for writing structured documents,
based on conventions for indicating formatting in email
and usenet posts. It was developed by John Gruber (with
help from Aaron Swartz) and released in 2004 in the form of a
[syntax description](http://daringfireball.net/projects/markdown/syntax)
and a Perl script (`Markdown.pl`) for converting Markdown to
HTML. In the next decade, dozens of implementations were
developed in many languages. Some extended the original
Markdown syntax with conventions for footnotes, tables, and
other document elements. Some allowed Markdown documents to be
rendered in formats other than HTML. Websites like Reddit,
StackOverflow, and GitHub had millions of people using Markdown.
And Markdown started to be used beyond the web, to author books,
articles, slide shows, letters, and lecture notes.

What distinguishes Markdown from many other lightweight markup
syntaxes, which are often easier to write, is its readability.
As Gruber writes:

> The overriding design goal for Markdown's formatting syntax is
> to make it as readable as possible. The idea is that a
> Markdown-formatted document should be publishable as-is, as
> plain text, without looking like it's been marked up with tags
> or formatting instructions.
> (<http://daringfireball.net/projects/markdown/>)

The point can be illustrated by comparing a sample of
[AsciiDoc](http://www.methods.co.nz/asciidoc/) with
an equivalent sample of Markdown. Here is a sample of
AsciiDoc from the AsciiDoc manual:

```
1. List item one.
+
List item one continued with a second paragraph followed by an
Indented block.
+
.................
$ ls *.sh
$ mv *.sh ~/tmp
.................
+
List item continued with a third paragraph.

2. List item two continued with an open block.
+
--
This paragraph is part of the preceding list item.

a. This list is nested and does not require explicit item
continuation.
+
This paragraph is part of the preceding list item.

b. List item b.

This paragraph belongs to item two of the outer list.
--
```

And here is the equivalent in Markdown:
```
1.  List item one.

    List item one continued with a second paragraph followed by an
    Indented block.

        $ ls *.sh
        $ mv *.sh ~/tmp

    List item continued with a third paragraph.

2.  List item two continued with an open block.

    This paragraph is part of the preceding list item.

    1. This list is nested and does not require explicit item continuation.

       This paragraph is part of the preceding list item.

    2. List item b.

    This paragraph belongs to item two of the outer list.
```

The AsciiDoc version is, arguably, easier to write. You don't need
to worry about indentation. But the Markdown version is much easier
to read. The nesting of list items is apparent to the eye in the
source, not just in the processed document.

## Why is a spec needed?

John Gruber's [canonical description of Markdown's
syntax](http://daringfireball.net/projects/markdown/syntax)
does not specify the syntax unambiguously. Here are some examples of
questions it does not answer:

1.  How much indentation is needed for a sublist? The spec says that
    continuation paragraphs need to be indented four spaces, but is
    not fully explicit about sublists. It is natural to think that
    they, too, must be indented four spaces, but `Markdown.pl` does
    not require that. This is hardly a "corner case," and divergences
    between implementations on this issue often lead to surprises for
    users in real documents. (See [this comment by John
    Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)

2.  Is a blank line needed before a block quote or heading?
    Most implementations do not require the blank line. However,
    this can lead to unexpected results in hard-wrapped text, and
    also to ambiguities in parsing (note that some implementations
    put the heading inside the blockquote, while others do not).
    (John Gruber has also spoken [in favor of requiring the blank
    lines](http://article.gmane.org/gmane.text.markdown.general/2146).)

3.  Is a blank line needed before an indented code block?
    (`Markdown.pl` requires it, but this is not mentioned in the
    documentation, and some implementations do not require it.)

    ``` markdown
    paragraph
        code?
    ```

4.  What is the exact rule for determining when list items get
    wrapped in `<p>` tags? Can a list be partially "loose" and partially
    "tight"? What should we do with a list like this?

    ``` markdown
    1. one

    2. two
    3. three
    ```

    Or this?

    ``` markdown
    1.  one
        - a

        - b
    2.  two
    ```

    (There are some relevant comments by John Gruber
    [here](http://article.gmane.org/gmane.text.markdown.general/2554).)

5.  Can list markers be indented? Can ordered list markers be right-aligned?

    ``` markdown
     8. item 1
     9. item 2
    10. item 2a
    ```

6.  Is this one list with a thematic break in its second item,
    or two lists separated by a thematic break?

    ``` markdown
    * a
    * * * * *
    * b
    ```

7.  When list markers change from numbers to bullets, do we have
    two lists or one? (The Markdown syntax description suggests two,
    but the Perl scripts and many other implementations produce one.)

    ``` markdown
    1. fee
    2. fie
    -  foe
    -  fum
    ```

8.  What are the precedence rules for the markers of inline structure?
    For example, is the following a valid link, or does the code span
    take precedence?

    ``` markdown
    [a backtick (`)](/url) and [another backtick (`)](/url).
    ```

9.  What are the precedence rules for markers of emphasis and strong
    emphasis? For example, how should the following be parsed?

    ``` markdown
    *foo *bar* baz*
    ```

10. What are the precedence rules between block-level and inline-level
    structure? For example, how should the following be parsed?

    ``` markdown
    - `a long code span can contain a hyphen like this
      - and it can screw things up`
    ```

11. Can list items include section headings? (`Markdown.pl` does not
    allow this, but does allow blockquotes to include headings.)

    ``` markdown
    - # Heading
    ```

12. Can list items be empty?

    ``` markdown
    * a
    *
    * b
    ```

13. Can link references be defined inside block quotes or list items?

    ``` markdown
    > Blockquote [foo].
    >
    > [foo]: /url
    ```

14. If there are multiple definitions for the same reference, which takes
    precedence?

    ``` markdown
    [foo]: /url1
    [foo]: /url2

    [foo][]
    ```

In the absence of a spec, early implementers consulted `Markdown.pl`
to resolve these ambiguities. But `Markdown.pl` was quite buggy, and
gave manifestly bad results in many cases, so it was not a
satisfactory replacement for a spec.

Because there is no unambiguous spec, implementations have diverged
considerably. As a result, users are often surprised to find that
a document that renders one way on one system (say, a GitHub wiki)
renders differently on another (say, converting to docbook using
pandoc). To make matters worse, because nothing in Markdown counts
as a "syntax error," the divergence often isn't discovered right away.

## About this document

This document attempts to specify Markdown syntax unambiguously.
It contains many examples with side-by-side Markdown and
HTML. These are intended to double as conformance tests. An
accompanying script `spec_tests.py` can be used to run the tests
against any Markdown program:

    python test/spec_tests.py --spec spec.txt --program PROGRAM

Since this document describes how Markdown is to be parsed into
an abstract syntax tree, it would have made sense to use an abstract
representation of the syntax tree instead of HTML. But HTML is capable
of representing the structural distinctions we need to make, and the
choice of HTML for the tests makes it possible to run the tests against
an implementation without writing an abstract syntax tree renderer.

This document is generated from a text file, `spec.txt`, written
in Markdown with a small extension for the side-by-side tests.
The script `tools/makespec.py` can be used to convert `spec.txt` into
HTML or CommonMark (which can then be converted into other formats).

In the examples, the `→` character is used to represent tabs.

# Preliminaries

## Characters and lines

Any sequence of [characters] is a valid CommonMark
document.

A [character](@) is a Unicode code point. Although some
code points (for example, combining accents) do not correspond to
characters in an intuitive sense, all code points count as characters
for purposes of this spec.

This spec does not specify an encoding; it thinks of lines as composed
of [characters] rather than bytes. A conforming parser may be limited
to a certain encoding.

A [line](@) is a sequence of zero or more [characters]
other than newline (`U+000A`) or carriage return (`U+000D`),
followed by a [line ending] or by the end of file.

A [line ending](@) is a newline (`U+000A`), a carriage return
(`U+000D`) not followed by a newline, or a carriage return and a
following newline.
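
As a non-normative illustration, the definitions of [line] and
[line ending] above can be sketched in Python. The name `split_lines`
is our own and does not come from any CommonMark implementation, and
the sketch assumes the input has already been decoded into a string of
code points:

```python
import re

# The three line endings defined above. "\r\n" is listed first so that a
# carriage return followed by a newline counts as a single line ending.
_LINE_ENDING = re.compile(r"\r\n|\r|\n")


def split_lines(text: str) -> list[str]:
    """Split text into lines (hypothetical helper, not part of the spec)."""
    lines = _LINE_ENDING.split(text)
    # A line may be terminated by a line ending or by the end of file, so a
    # final line ending does not introduce an extra empty line.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

Python's built-in `str.splitlines` would not be a faithful substitute
here, since it also breaks on characters such as `U+000B` and `U+000C`,
which are not line endings in the sense defined above.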

A line containing no characters, or a line containing only spaces
(`U+0020`) or tabs (`U+0009`), is called a [blank line](@).

The following definitions of character classes will be used in this spec:

A [whitespace character](@) is a space
(`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`),
form feed (`U+000C`), or carriage return (`U+000D`).

[Whitespace](@) is a sequence of one or more [whitespace
characters].

A [Unicode whitespace character](@) is
any code point in the Unicode `Zs` general category, or a tab (`U+0009`),
carriage return (`U+000D`), newline (`U+000A`), or form feed
(`U+000C`).

[Unicode whitespace](@) is a sequence of one
or more [Unicode whitespace characters].

A [space](@) is `U+0020`.

A [non-whitespace character](@) is any character
that is not a [whitespace character].

An [ASCII punctuation character](@)
is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
`*`, `+`, `,`, `-`, `.`, `/` (U+0021–2F),
`:`, `;`, `<`, `=`, `>`, `?`, `@` (U+003A–0040),
`[`, `\`, `]`, `^`, `_`, `` ` `` (U+005B–0060),
`{`, `|`, `}`, or `~` (U+007B–007E).

A [punctuation character](@) is an [ASCII
punctuation character] or anything in
the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.

## Tabs

Tabs in lines are not expanded to [spaces]. However,
in contexts where whitespace helps to define block structure,
tabs behave as if they were replaced by spaces with a tab stop
of 4 characters.

Thus, for example, a tab can be used instead of four spaces
in an indented code block. (Note, however, that internal
tabs are passed through as literal tabs, not expanded to
spaces.)

```````````````````````````````` example
→foo→baz→→bim
.
<pre><code>foo→baz→→bim
</code></pre>
````````````````````````````````

```````````````````````````````` example
  →foo→baz→→bim
.
<pre><code>foo→baz→→bim
</code></pre>
````````````````````````````````

```````````````````````````````` example
    a→a
    ὐ→a
.
<pre><code>a→a
ὐ→a
</code></pre>
````````````````````````````````

In the following example, a continuation paragraph of a list
item is indented with a tab; this has exactly the same effect
as indentation with four spaces would:

```````````````````````````````` example
  - foo

→bar
.
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
````````````````````````````````

```````````````````````````````` example
- foo

→→bar
.
<ul>
<li>
<p>foo</p>
<pre><code>  bar
</code></pre>
</li>
</ul>
````````````````````````````````

Normally the `>` that begins a block quote may be followed
optionally by a space, which is not considered part of the
content. In the following case `>` is followed by a tab,
which is treated as if it were expanded into three spaces.
Since one of these spaces is considered part of the
delimiter, `foo` is considered to be indented six spaces
inside the block quote context, so we get an indented
code block starting with two spaces.

```````````````````````````````` example
>→→foo
.
<blockquote>
<pre><code>  foo
</code></pre>
</blockquote>
````````````````````````````````

```````````````````````````````` example
-→→foo
.
<ul>
<li>
<pre><code>  foo
</code></pre>
</li>
</ul>
````````````````````````````````


```````````````````````````````` example
    foo
→bar
.
<pre><code>foo
bar
</code></pre>
````````````````````````````````

```````````````````````````````` example
 - foo
   - bar
→ - baz
.
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>baz</li>
</ul>
</li>
</ul>
</li>
</ul>
````````````````````````````````

```````````````````````````````` example
#→Foo
.
<h1>Foo</h1>
````````````````````````````````

```````````````````````````````` example
*→*→*→
.
<hr />
````````````````````````````````


## Insecure characters

For security reasons, the Unicode character `U+0000` must be replaced
with the REPLACEMENT CHARACTER (`U+FFFD`).

# Blocks and inlines

We can think of a document as a sequence of
[blocks](@)---structural elements like paragraphs, block
quotations, lists, headings, rules, and code blocks. Some blocks (like
block quotes and list items) contain other blocks; others (like
headings and paragraphs) contain [inline](@) content---text,
links, emphasized text, images, code spans, and so on.

## Precedence

Indicators of block structure always take precedence over indicators
of inline structure. So, for example, the following is a list with
two items, not a list with one item containing a code span:

```````````````````````````````` example
- `one
- two`
.
<ul>
<li>`one</li>
<li>two`</li>
</ul>
````````````````````````````````


This means that parsing can proceed in two steps: first, the block
structure of the document can be discerned; second, text lines inside
paragraphs, headings, and other block constructs can be parsed for inline
structure. The second step requires information about link reference
definitions that will be available only at the end of the first
step. Note that the first step requires processing lines in sequence,
but the second can be parallelized, since the inline parsing of
one block element does not affect the inline parsing of any other.

## Container blocks and leaf blocks

We can divide blocks into two types:
[container blocks](@),
which can contain other blocks, and [leaf blocks](@),
which cannot.

# Leaf blocks

This section describes the different kinds of leaf block that make up a
Markdown document.

## Thematic breaks

A line consisting of 0-3 spaces of indentation, followed by a sequence
of three or more matching `-`, `_`, or `*` characters, each followed
optionally by any number of spaces or tabs, forms a
[thematic break](@).

```````````````````````````````` example
***
---
___
.
<hr />
<hr />
<hr />
````````````````````````````````


Wrong characters:

```````````````````````````````` example
+++
.
<p>+++</p>
````````````````````````````````


```````````````````````````````` example
===
.
<p>===</p>
````````````````````````````````


Not enough characters:

```````````````````````````````` example
--
**
__
.
<p>--
**
__</p>
````````````````````````````````


One to three spaces indent are allowed:

```````````````````````````````` example
 ***
  ***
   ***
.
<hr />
<hr />
<hr />
````````````````````````````````


Four spaces is too many:

```````````````````````````````` example
    ***
.
<pre><code>***
</code></pre>
````````````````````````````````


```````````````````````````````` example
Foo
    ***
.
<p>Foo
***</p>
````````````````````````````````


More than three characters may be used:

```````````````````````````````` example
_____________________________________
.
<hr />
````````````````````````````````


Spaces are allowed between the characters:

```````````````````````````````` example
 - - -
.
<hr />
````````````````````````````````


```````````````````````````````` example
 **  * ** * ** * **
.
<hr />
````````````````````````````````


```````````````````````````````` example
-     -      -      -
.
<hr />
````````````````````````````````


Spaces are allowed at the end:

```````````````````````````````` example
- - - -    
.
<hr />
````````````````````````````````


However, no other characters may occur in the line:

```````````````````````````````` example
_ _ _ _ a

a------

---a---
.
<p>_ _ _ _ a</p>
<p>a------</p>
<p>---a---</p>
````````````````````````````````


It is required that all of the [non-whitespace characters] be the same.
So, this is not a thematic break:

```````````````````````````````` example
 *-*
.
<p><em>-</em></p>
````````````````````````````````


Thematic breaks do not need blank lines before or after:

```````````````````````````````` example
- foo
***
- bar
.
<ul>
<li>foo</li>
</ul>
<hr />
<ul>
<li>bar</li>
</ul>
````````````````````````````````


Thematic breaks can interrupt a paragraph:

```````````````````````````````` example
Foo
***
bar
.
<p>Foo</p>
<hr />
<p>bar</p>
````````````````````````````````


If a line of dashes that meets the above conditions for being a
thematic break could also be interpreted as the underline of a [setext
heading], the interpretation as a
[setext heading] takes precedence. Thus, for example,
this is a setext heading, not a paragraph followed by a thematic break:

```````````````````````````````` example
Foo
---
bar
.
<h2>Foo</h2>
<p>bar</p>
````````````````````````````````


When both a thematic break and a list item are possible
interpretations of a line, the thematic break takes precedence:

```````````````````````````````` example
* Foo
* * *
* Bar
.
<ul>
<li>Foo</li>
</ul>
<hr />
<ul>
<li>Bar</li>
</ul>
````````````````````````````````


If you want a thematic break in a list item, use a different bullet:

```````````````````````````````` example
- Foo
- * * *
.
<ul>
<li>Foo</li>
<li>
<hr />
</li>
</ul>
````````````````````````````````


## ATX headings

An [ATX heading](@)
consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped `#` characters and an optional
closing sequence of any number of unescaped `#` characters.
The opening sequence of `#` characters must be followed by a
[space] or by the end of line. The optional closing sequence of `#`s must be
preceded by a [space] and may be followed by spaces only. The opening
`#` character may be indented 0-3 spaces. The raw contents of the
heading are stripped of leading and trailing spaces before being parsed
as inline content. The heading level is equal to the number of `#`
characters in the opening sequence.
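
The rule above can be sketched, non-normatively, as a single-line
recognizer in Python. The function name `parse_atx_heading` and both
regular expressions are our own illustration rather than code from any
implementation. The sketch looks at one line in isolation, so it does
not know whether the line continues a paragraph; it returns the raw
contents without parsing them as inlines; and it assumes that a tab
after the opening sequence is accepted like a space, as in the `#→Foo`
example in the section on tabs:

```python
import re
from typing import Optional, Tuple

# 0-3 spaces of indentation, 1-6 '#' characters, then either the end of the
# line or at least one space (assumption: a tab is accepted as well).
_OPENING = re.compile(r" {0,3}(#{1,6})(?:[ \t]+|$)(.*)")

# Optional closing sequence: a run of '#' at the end of the raw contents,
# preceded by a space or making up the whole of the contents.
_CLOSING = re.compile(r"(?:^|[ \t]+)#+$")


def parse_atx_heading(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, raw contents), or None if the line is not an ATX heading."""
    m = _OPENING.match(line)
    if m is None:
        return None
    level = len(m.group(1))
    raw = m.group(2).rstrip(" \t")   # spaces after a closing sequence are allowed
    raw = _CLOSING.sub("", raw)      # drop the closing sequence, if there is one
    return level, raw.strip(" \t")
```

A backslash-escaped `#` needs no special handling here: an escaped
opening `#` makes the line start with `\`, so the opening pattern fails,
and an escaped closing `#` is preceded by `\` rather than a space, so it
stays in the raw contents for the inline parser to handle.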

Simple headings:

```````````````````````````````` example
# foo
## foo
### foo
#### foo
##### foo
###### foo
.
<h1>foo</h1>
<h2>foo</h2>
<h3>foo</h3>
<h4>foo</h4>
<h5>foo</h5>
<h6>foo</h6>
````````````````````````````````


More than six `#` characters is not a heading:

```````````````````````````````` example
####### foo
.
<p>####### foo</p>
````````````````````````````````


At least one space is required between the `#` characters and the
heading's contents, unless the heading is empty. Note that many
implementations currently do not require the space. However, the
space was required by the
[original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
and it helps prevent things like the following from being parsed as
headings:

```````````````````````````````` example
#5 bolt

#hashtag
.
<p>#5 bolt</p>
<p>#hashtag</p>
````````````````````````````````


This is not a heading, because the first `#` is escaped:

```````````````````````````````` example
\## foo
.
<p>## foo</p>
````````````````````````````````


Contents are parsed as inlines:

```````````````````````````````` example
# foo *bar* \*baz\*
.
<h1>foo <em>bar</em> *baz*</h1>
````````````````````````````````


Leading and trailing [whitespace] is ignored in parsing inline content:

```````````````````````````````` example
#                  foo                     
.
<h1>foo</h1>
````````````````````````````````


One to three spaces indentation are allowed:

```````````````````````````````` example
 ### foo
  ## foo
   # foo
.
<h3>foo</h3>
<h2>foo</h2>
<h1>foo</h1>
````````````````````````````````


Four spaces are too much:

```````````````````````````````` example
    # foo
.
<pre><code># foo
</code></pre>
````````````````````````````````


```````````````````````````````` example
foo
    # bar
.
<p>foo
# bar</p>
````````````````````````````````


A closing sequence of `#` characters is optional:

```````````````````````````````` example
## foo ##
  ###   bar    ###
.
<h2>foo</h2>
<h3>bar</h3>
````````````````````````````````


It need not be the same length as the opening sequence:

```````````````````````````````` example
# foo ##################################
##### foo ##
.
<h1>foo</h1>
<h5>foo</h5>
````````````````````````````````


Spaces are allowed after the closing sequence:

```````````````````````````````` example
### foo ###     
.
<h3>foo</h3>
````````````````````````````````


A sequence of `#` characters with anything but [spaces] following it
is not a closing sequence, but counts as part of the contents of the
heading:

```````````````````````````````` example
### foo ### b
.
<h3>foo ### b</h3>
````````````````````````````````


The closing sequence must be preceded by a space:

```````````````````````````````` example
# foo#
.
<h1>foo#</h1>
````````````````````````````````


Backslash-escaped `#` characters do not count as part
of the closing sequence:

```````````````````````````````` example
### foo \###
## foo #\##
# foo \#
.
<h3>foo ###</h3>
<h2>foo ###</h2>
<h1>foo #</h1>
````````````````````````````````


ATX headings need not be separated from surrounding content by blank
lines, and they can interrupt paragraphs:

```````````````````````````````` example
****
## foo
****
.
<hr />
<h2>foo</h2>
<hr />
````````````````````````````````


```````````````````````````````` example
Foo bar
# baz
Bar foo
.
<p>Foo bar</p>
<h1>baz</h1>
<p>Bar foo</p>
````````````````````````````````


ATX headings can be empty:

```````````````````````````````` example
## 
#
### ###
.
<h2></h2>
<h1></h1>
<h3></h3>
````````````````````````````````


## Setext headings

A [setext heading](@) consists of one or more
lines of text, each containing at least one [non-whitespace
character], with no more than 3 spaces indentation, followed by
a [setext heading underline]. The lines of text must be such
that, were they not followed by the setext heading underline,
they would be interpreted as a paragraph: they cannot be
interpretable as a [code fence], [ATX heading][ATX headings],
[block quote][block quotes], [thematic break][thematic breaks],
[list item][list items], or [HTML block][HTML blocks].

A [setext heading underline](@) is a sequence of
`=` characters or a sequence of `-` characters, with no more than 3
spaces indentation and any number of trailing spaces. If a line
containing a single `-` can be interpreted as an
empty [list item][list items], it should be interpreted this way
and not as a [setext heading underline].

The heading is a level 1 heading if `=` characters are used in
the [setext heading underline], and a level 2 heading if `-`
characters are used. The contents of the heading are the result
of parsing the preceding lines of text as CommonMark inline
content.

In general, a setext heading need not be preceded or followed by a
blank line. However, it cannot interrupt a paragraph, so when a
setext heading comes after a paragraph, a blank line is needed between
them.
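
As a non-normative sketch, the shape of a [setext heading underline]
and the heading level it implies can be checked in Python as follows.
The name `setext_underline_level` is hypothetical. The function tests
only the form of a single line; whether the preceding lines form a
paragraph, and whether a lone `-` should instead be read as an empty
list item, are questions of context that the caller would have to
settle first:

```python
import re
from typing import Optional

# At most 3 spaces of indentation, a run of '=' or a run of '-' with no
# internal spaces, then any number of trailing spaces.
_UNDERLINE = re.compile(r" {0,3}(=+|-+) *$")


def setext_underline_level(line: str) -> Optional[int]:
    """Return 1 for an '=' underline, 2 for a '-' underline, otherwise None."""
    m = _UNDERLINE.match(line)
    if m is None:
        return None
    return 1 if m.group(1).startswith("=") else 2
```

Because the run of `=` or `-` characters must be contiguous, a line such
as `--- -` is rejected here; a block parser would then go on to try it
as a thematic break.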

Simple examples:

```````````````````````````````` example
Foo *bar*
=========

Foo *bar*
---------
.
<h1>Foo <em>bar</em></h1>
<h2>Foo <em>bar</em></h2>
````````````````````````````````


The content of the heading may span more than one line:

```````````````````````````````` example
Foo *bar
baz*
====
.
<h1>Foo <em>bar
baz</em></h1>
````````````````````````````````

The contents are the result of parsing the heading's raw
content as inlines. The heading's raw content is formed by
concatenating the lines and removing initial and final
[whitespace].

```````````````````````````````` example
  Foo *bar
baz*→
====
.
<h1>Foo <em>bar
baz</em></h1>
````````````````````````````````


The underlining can be any length:

```````````````````````````````` example
Foo
-------------------------

Foo
=
.
<h2>Foo</h2>
<h1>Foo</h1>
````````````````````````````````


The heading content can be indented up to three spaces, and need
not line up with the underlining:

```````````````````````````````` example
   Foo
---

  Foo
-----

  Foo
  ===
.
<h2>Foo</h2>
<h2>Foo</h2>
<h1>Foo</h1>
````````````````````````````````


Four spaces indent is too much:

```````````````````````````````` example
    Foo
    ---

    Foo
---
.
<pre><code>Foo
---

Foo
</code></pre>
<hr />
````````````````````````````````


The setext heading underline can be indented up to three spaces, and
may have trailing spaces:

```````````````````````````````` example
Foo
   ----      
.
<h2>Foo</h2>
````````````````````````````````


Four spaces is too much:

```````````````````````````````` example
Foo
    ---
.
<p>Foo
---</p>
````````````````````````````````


The setext heading underline cannot contain internal spaces:

```````````````````````````````` example
Foo
= =

Foo
--- -
.
<p>Foo
= =</p>
<p>Foo</p>
<hr />
````````````````````````````````


Trailing spaces in the content line do not cause a line break:

```````````````````````````````` example
Foo  
-----
.
<h2>Foo</h2>
````````````````````````````````


Nor does a backslash at the end:

```````````````````````````````` example
Foo\
----
.
<h2>Foo\</h2>
````````````````````````````````


Since indicators of block structure take precedence over
indicators of inline structure, the following are setext headings:

```````````````````````````````` example
`Foo
----
`

<a title="a lot
---
of dashes"/>
.
<h2>`Foo</h2>
<p>`</p>
<h2>&lt;a title=&quot;a lot</h2>
<p>of dashes&quot;/&gt;</p>
````````````````````````````````


The setext heading underline cannot be a [lazy continuation
line] in a list item or block quote:

```````````````````````````````` example
> Foo
---
.
<blockquote>
<p>Foo</p>
</blockquote>
<hr />
````````````````````````````````


```````````````````````````````` example
> foo
bar
===
.
<blockquote>
<p>foo
bar
===</p>
</blockquote>
````````````````````````````````


```````````````````````````````` example
- Foo
---
.
<ul>
<li>Foo</li>
</ul>
<hr />
````````````````````````````````


A blank line is needed between a paragraph and a following
setext heading, since otherwise the paragraph becomes part
of the heading's content:

```````````````````````````````` example
Foo
Bar
---
.
<h2>Foo
Bar</h2>
````````````````````````````````


But in general a blank line is not required before or after
setext headings:

```````````````````````````````` example
---
Foo
---
Bar
---
Baz
.
<hr />
<h2>Foo</h2>
<h2>Bar</h2>
<p>Baz</p>
````````````````````````````````


Setext headings cannot be empty:

```````````````````````````````` example

====
.
<p>====</p>
````````````````````````````````


Setext heading text lines must not be interpretable as block
constructs other than paragraphs. So, the line of dashes
in these examples gets interpreted as a thematic break:

```````````````````````````````` example
---
---
.
<hr />
<hr />
````````````````````````````````


```````````````````````````````` example
- foo
-----
.
<ul>
<li>foo</li>
</ul>
<hr />
````````````````````````````````


```````````````````````````````` example
    foo
---
.
<pre><code>foo
</code></pre>
<hr />
````````````````````````````````


```````````````````````````````` example
> foo
-----
.
<blockquote>
<p>foo</p>
</blockquote>
<hr />
````````````````````````````````


If you want a heading with `> foo` as its literal text, you can
use backslash escapes:

```````````````````````````````` example
\> foo
------
.
1296 <h2>&gt; foo</h2> 1297 ```````````````````````````````` 1298 1299 1300 **Compatibility note:** Most existing Markdown implementations 1301 do not allow the text of setext headings to span multiple lines. 1302 But there is no consensus about how to interpret 1303 1304 ``` markdown 1305 Foo 1306 bar 1307 --- 1308 baz 1309 ``` 1310 1311 One can find four different interpretations: 1312 1313 1. paragraph "Foo", heading "bar", paragraph "baz" 1314 2. paragraph "Foo bar", thematic break, paragraph "baz" 1315 3. paragraph "Foo bar --- baz" 1316 4. heading "Foo bar", paragraph "baz" 1317 1318 We find interpretation 4 most natural, and it also 1319 increases the expressive power of CommonMark by allowing 1320 multiline headings. Authors who want interpretation 1 can 1321 put a blank line after the first paragraph: 1322 1323 ```````````````````````````````` example 1324 Foo 1325 1326 bar 1327 --- 1328 baz 1329 . 1330 <p>Foo</p> 1331 <h2>bar</h2> 1332 <p>baz</p> 1333 ```````````````````````````````` 1334 1335 1336 Authors who want interpretation 2 can put blank lines around 1337 the thematic break, 1338 1339 ```````````````````````````````` example 1340 Foo 1341 bar 1342 1343 --- 1344 1345 baz 1346 . 1347 <p>Foo 1348 bar</p> 1349 <hr /> 1350 <p>baz</p> 1351 ```````````````````````````````` 1352 1353 1354 or use a thematic break that cannot count as a [setext heading 1355 underline], such as 1356 1357 ```````````````````````````````` example 1358 Foo 1359 bar 1360 * * * 1361 baz 1362 . 1363 <p>Foo 1364 bar</p> 1365 <hr /> 1366 <p>baz</p> 1367 ```````````````````````````````` 1368 1369 1370 Authors who want interpretation 3 can use backslash escapes: 1371 1372 ```````````````````````````````` example 1373 Foo 1374 bar 1375 \--- 1376 baz 1377 .
1378 <p>Foo 1379 bar 1380 --- 1381 baz</p> 1382 ```````````````````````````````` 1383 1384 1385 ## Indented code blocks 1386 1387 An [indented code block](@) is composed of one or more 1388 [indented chunks] separated by blank lines. 1389 An [indented chunk](@) is a sequence of non-blank lines, 1390 each indented four or more spaces. The contents of the code block are 1391 the literal contents of the lines, including trailing 1392 [line endings], minus four spaces of indentation. 1393 An indented code block has no [info string]. 1394 1395 An indented code block cannot interrupt a paragraph, so there must be 1396 a blank line between a paragraph and a following indented code block. 1397 (A blank line is not needed, however, between a code block and a following 1398 paragraph.) 1399 1400 ```````````````````````````````` example 1401 a simple 1402 indented code block 1403 . 1404 <pre><code>a simple 1405 indented code block 1406 </code></pre> 1407 ```````````````````````````````` 1408 1409 1410 If there is any ambiguity between an interpretation of indentation 1411 as a code block and as indicating that material belongs to a [list 1412 item][list items], the list item interpretation takes precedence: 1413 1414 ```````````````````````````````` example 1415 - foo 1416 1417 bar 1418 . 1419 <ul> 1420 <li> 1421 <p>foo</p> 1422 <p>bar</p> 1423 </li> 1424 </ul> 1425 ```````````````````````````````` 1426 1427 1428 ```````````````````````````````` example 1429 1. foo 1430 1431 - bar 1432 . 1433 <ol> 1434 <li> 1435 <p>foo</p> 1436 <ul> 1437 <li>bar</li> 1438 </ul> 1439 </li> 1440 </ol> 1441 ```````````````````````````````` 1442 1443 1444 1445 The contents of a code block are literal text, and do not get parsed 1446 as Markdown: 1447 1448 ```````````````````````````````` example 1449 <a/> 1450 *hi* 1451 1452 - one 1453 . 
1454 <pre><code>&lt;a/&gt; 1455 *hi* 1456 1457 - one 1458 </code></pre> 1459 ```````````````````````````````` 1460 1461 1462 Here we have three chunks separated by blank lines: 1463 1464 ```````````````````````````````` example 1465 chunk1 1466 1467 chunk2 1468 1469 1470 1471 chunk3 1472 . 1473 <pre><code>chunk1 1474 1475 chunk2 1476 1477 1478 1479 chunk3 1480 </code></pre> 1481 ```````````````````````````````` 1482 1483 1484 Any initial spaces beyond four will be included in the content, even 1485 in interior blank lines: 1486 1487 ```````````````````````````````` example 1488 chunk1 1489 1490 chunk2 1491 . 1492 <pre><code>chunk1 1493 1494 chunk2 1495 </code></pre> 1496 ```````````````````````````````` 1497 1498 1499 An indented code block cannot interrupt a paragraph. (This 1500 allows hanging indents and the like.) 1501 1502 ```````````````````````````````` example 1503 Foo 1504 bar 1505 1506 . 1507 <p>Foo 1508 bar</p> 1509 ```````````````````````````````` 1510 1511 1512 However, any non-blank line with fewer than four leading spaces ends 1513 the code block immediately. So a paragraph may occur immediately 1514 after indented code: 1515 1516 ```````````````````````````````` example 1517 foo 1518 bar 1519 . 1520 <pre><code>foo 1521 </code></pre> 1522 <p>bar</p> 1523 ```````````````````````````````` 1524 1525 1526 And indented code can occur immediately before and after other kinds of 1527 blocks: 1528 1529 ```````````````````````````````` example 1530 # Heading 1531 foo 1532 Heading 1533 ------ 1534 foo 1535 ---- 1536 . 1537 <h1>Heading</h1> 1538 <pre><code>foo 1539 </code></pre> 1540 <h2>Heading</h2> 1541 <pre><code>foo 1542 </code></pre> 1543 <hr /> 1544 ```````````````````````````````` 1545 1546 1547 The first line can be indented more than four spaces: 1548 1549 ```````````````````````````````` example 1550 foo 1551 bar 1552 .
1553 <pre><code> foo 1554 bar 1555 </code></pre> 1556 ```````````````````````````````` 1557 1558 1559 Blank lines preceding or following an indented code block 1560 are not included in it: 1561 1562 ```````````````````````````````` example 1563 1564 1565 foo 1566 1567 1568 . 1569 <pre><code>foo 1570 </code></pre> 1571 ```````````````````````````````` 1572 1573 1574 Trailing spaces are included in the code block's content: 1575 1576 ```````````````````````````````` example 1577 foo 1578 . 1579 <pre><code>foo 1580 </code></pre> 1581 ```````````````````````````````` 1582 1583 1584 1585 ## Fenced code blocks 1586 1587 A [code fence](@) is a sequence 1588 of at least three consecutive backtick characters (`` ` ``) or 1589 tildes (`~`). (Tildes and backticks cannot be mixed.) 1590 A [fenced code block](@) 1591 begins with a code fence, indented no more than three spaces. 1592 1593 The line with the opening code fence may optionally contain some text 1594 following the code fence; this is trimmed of leading and trailing 1595 whitespace and called the [info string](@). If the [info string] comes 1596 after a backtick fence, it may not contain any backtick 1597 characters. (The reason for this restriction is that otherwise 1598 some inline code would be incorrectly interpreted as the 1599 beginning of a fenced code block.) 1600 1601 The content of the code block consists of all subsequent lines, until 1602 a closing [code fence] of the same type as the code block 1603 began with (backticks or tildes), and with at least as many backticks 1604 or tildes as the opening code fence. If the leading code fence is 1605 indented N spaces, then up to N spaces of indentation are removed from 1606 each line of the content (if present). (If a content line is not 1607 indented, it is preserved unchanged. If it is indented less than N 1608 spaces, all of the indentation is removed.) 
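
As a rough illustration of the rules above for collecting a fenced code block's content, the following Python sketch recognizes an opening fence, strips up to N spaces of indentation from each content line, and stops at the first closing fence that uses the same character and is at least as long. The function names (`parse_opening_fence`, `is_closing_fence`, `strip_fence_indent`, `fenced_code_content`) are hypothetical and do not come from any CommonMark implementation. The sketch assumes that indentation consists of spaces only (no tabs), that lines are passed without their line endings, and that a closing fence may be indented up to three spaces and followed only by spaces.

```python
import re

# Illustrative sketch only; the helper names below are hypothetical.
# Up to three spaces, then a run of 3+ backticks or 3+ tildes, then the rest.
_FENCE_RE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")


def parse_opening_fence(line):
    """Return (indent, fence_char, fence_length, info_string), or None."""
    match = _FENCE_RE.match(line)
    if match is None:
        return None
    indent, fence, rest = match.groups()
    # The info string of a backtick fence may not contain backticks.
    if fence[0] == "`" and "`" in rest:
        return None
    return len(indent), fence[0], len(fence), rest.strip()


def is_closing_fence(line, fence_char, fence_length):
    """True if `line` closes a block opened by fence_length x fence_char."""
    stripped = line.lstrip(" ")
    if len(line) - len(stripped) > 3:
        return False  # indented four or more spaces: not a closing fence
    body = stripped.rstrip(" ")  # trailing spaces are ignored
    return len(body) >= fence_length and set(body) == {fence_char}


def strip_fence_indent(line, indent):
    """Remove up to `indent` leading spaces from a content line."""
    leading = len(line) - len(line.lstrip(" "))
    return line[min(indent, leading):]


def fenced_code_content(lines):
    """Return the content lines of the fenced block that starts at lines[0].

    If no closing fence is found, every remaining line is content.
    """
    opening = parse_opening_fence(lines[0])
    if opening is None:
        raise ValueError("first line is not an opening code fence")
    indent, fence_char, fence_length, _info = opening
    content = []
    for line in lines[1:]:
        if is_closing_fence(line, fence_char, fence_length):
            break
        content.append(strip_fence_indent(line, indent))
    return content
```

With an opening fence indented three spaces, for instance, a content line indented four spaces keeps one leading space, while a content line indented two spaces loses all of its indentation.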
1609 1610 The closing code fence may be indented up to three spaces, and may be 1611 followed only by spaces, which are ignored. If the end of the 1612 containing block (or document) is reached and no closing code fence 1613 has been found, the code block contains all of the lines after the 1614 opening code fence until the end of the containing block (or 1615 document). (An alternative spec would require backtracking in the 1616 event that a closing code fence is not found. But this makes parsing 1617 much less efficient, and there seems to be no real downside to the 1618 behavior described here.) 1619 1620 A fenced code block may interrupt a paragraph, and does not require 1621 a blank line either before or after. 1622 1623 The content of a fenced code block is treated as literal text, not parsed 1624 as inlines. The first word of the [info string] is typically used to 1625 specify the language of the code sample, and rendered in the `class` 1626 attribute of the `code` tag. However, this spec does not mandate any 1627 particular treatment of the [info string]. 1628 1629 Here is a simple example with backticks: 1630 1631 ```````````````````````````````` example 1632 ``` 1633 < 1634 > 1635 ``` 1636 . 1637 <pre><code>&lt; 1638 &gt; 1639 </code></pre> 1640 ```````````````````````````````` 1641 1642 1643 With tildes: 1644 1645 ```````````````````````````````` example 1646 ~~~ 1647 < 1648 > 1649 ~~~ 1650 . 1651 <pre><code>&lt; 1652 &gt; 1653 </code></pre> 1654 ```````````````````````````````` 1655 1656 Fewer than three backticks is not enough: 1657 1658 ```````````````````````````````` example 1659 `` 1660 foo 1661 `` 1662 . 1663 <p><code>foo</code></p> 1664 ```````````````````````````````` 1665 1666 The closing code fence must use the same character as the opening 1667 fence: 1668 1669 ```````````````````````````````` example 1670 ``` 1671 aaa 1672 ~~~ 1673 ``` 1674 .
1675 <pre><code>aaa 1676 ~~~ 1677 </code></pre> 1678 ```````````````````````````````` 1679 1680 1681 ```````````````````````````````` example 1682 ~~~ 1683 aaa 1684 ``` 1685 ~~~ 1686 . 1687 <pre><code>aaa 1688 ``` 1689 </code></pre> 1690 ```````````````````````````````` 1691 1692 1693 The closing code fence must be at least as long as the opening fence: 1694 1695 ```````````````````````````````` example 1696 ```` 1697 aaa 1698 ``` 1699 `````` 1700 . 1701 <pre><code>aaa 1702 ``` 1703 </code></pre> 1704 ```````````````````````````````` 1705 1706 1707 ```````````````````````````````` example 1708 ~~~~ 1709 aaa 1710 ~~~ 1711 ~~~~ 1712 . 1713 <pre><code>aaa 1714 ~~~ 1715 </code></pre> 1716 ```````````````````````````````` 1717 1718 1719 Unclosed code blocks are closed by the end of the document 1720 (or the enclosing [block quote][block quotes] or [list item][list items]): 1721 1722 ```````````````````````````````` example 1723 ``` 1724 . 1725 <pre><code></code></pre> 1726 ```````````````````````````````` 1727 1728 1729 ```````````````````````````````` example 1730 ````` 1731 1732 ``` 1733 aaa 1734 . 1735 <pre><code> 1736 ``` 1737 aaa 1738 </code></pre> 1739 ```````````````````````````````` 1740 1741 1742 ```````````````````````````````` example 1743 > ``` 1744 > aaa 1745 1746 bbb 1747 . 1748 <blockquote> 1749 <pre><code>aaa 1750 </code></pre> 1751 </blockquote> 1752 <p>bbb</p> 1753 ```````````````````````````````` 1754 1755 1756 A code block can have all empty lines as its content: 1757 1758 ```````````````````````````````` example 1759 ``` 1760 1761 1762 ``` 1763 . 1764 <pre><code> 1765 1766 </code></pre> 1767 ```````````````````````````````` 1768 1769 1770 A code block can be empty: 1771 1772 ```````````````````````````````` example 1773 ``` 1774 ``` 1775 . 1776 <pre><code></code></pre> 1777 ```````````````````````````````` 1778 1779 1780 Fences can be indented. 
If the opening fence is indented, 1781 content lines will have equivalent opening indentation removed, 1782 if present: 1783 1784 ```````````````````````````````` example 1785 ``` 1786 aaa 1787 aaa 1788 ``` 1789 . 1790 <pre><code>aaa 1791 aaa 1792 </code></pre> 1793 ```````````````````````````````` 1794 1795 1796 ```````````````````````````````` example 1797 ``` 1798 aaa 1799 aaa 1800 aaa 1801 ``` 1802 . 1803 <pre><code>aaa 1804 aaa 1805 aaa 1806 </code></pre> 1807 ```````````````````````````````` 1808 1809 1810 ```````````````````````````````` example 1811 ``` 1812 aaa 1813 aaa 1814 aaa 1815 ``` 1816 . 1817 <pre><code>aaa 1818 aaa 1819 aaa 1820 </code></pre> 1821 ```````````````````````````````` 1822 1823 1824 Four spaces indentation produces an indented code block: 1825 1826 ```````````````````````````````` example 1827 ``` 1828 aaa 1829 ``` 1830 . 1831 <pre><code>``` 1832 aaa 1833 ``` 1834 </code></pre> 1835 ```````````````````````````````` 1836 1837 1838 Closing fences may be indented by 0-3 spaces, and their indentation 1839 need not match that of the opening fence: 1840 1841 ```````````````````````````````` example 1842 ``` 1843 aaa 1844 ``` 1845 . 1846 <pre><code>aaa 1847 </code></pre> 1848 ```````````````````````````````` 1849 1850 1851 ```````````````````````````````` example 1852 ``` 1853 aaa 1854 ``` 1855 . 1856 <pre><code>aaa 1857 </code></pre> 1858 ```````````````````````````````` 1859 1860 1861 This is not a closing fence, because it is indented 4 spaces: 1862 1863 ```````````````````````````````` example 1864 ``` 1865 aaa 1866 ``` 1867 . 1868 <pre><code>aaa 1869 ``` 1870 </code></pre> 1871 ```````````````````````````````` 1872 1873 1874 1875 Code fences (opening and closing) cannot contain internal spaces: 1876 1877 ```````````````````````````````` example 1878 ``` ``` 1879 aaa 1880 . 
1881 <p><code> </code> 1882 aaa</p> 1883 ```````````````````````````````` 1884 1885 1886 ```````````````````````````````` example 1887 ~~~~~~ 1888 aaa 1889 ~~~ ~~ 1890 . 1891 <pre><code>aaa 1892 ~~~ ~~ 1893 </code></pre> 1894 ```````````````````````````````` 1895 1896 1897 Fenced code blocks can interrupt paragraphs, and can be followed 1898 directly by paragraphs, without a blank line between: 1899 1900 ```````````````````````````````` example 1901 foo 1902 ``` 1903 bar 1904 ``` 1905 baz 1906 . 1907 <p>foo</p> 1908 <pre><code>bar 1909 </code></pre> 1910 <p>baz</p> 1911 ```````````````````````````````` 1912 1913 1914 Other blocks can also occur before and after fenced code blocks 1915 without an intervening blank line: 1916 1917 ```````````````````````````````` example 1918 foo 1919 --- 1920 ~~~ 1921 bar 1922 ~~~ 1923 # baz 1924 . 1925 <h2>foo</h2> 1926 <pre><code>bar 1927 </code></pre> 1928 <h1>baz</h1> 1929 ```````````````````````````````` 1930 1931 1932 An [info string] can be provided after the opening code fence. 1933 Although this spec doesn't mandate any particular treatment of 1934 the info string, the first word is typically used to specify 1935 the language of the code block. In HTML output, the language is 1936 normally indicated by adding a class to the `code` element consisting 1937 of `language-` followed by the language name. 1938 1939 ```````````````````````````````` example 1940 ```ruby 1941 def foo(x) 1942 return 3 1943 end 1944 ``` 1945 . 1946 <pre><code class="language-ruby">def foo(x) 1947 return 3 1948 end 1949 </code></pre> 1950 ```````````````````````````````` 1951 1952 1953 ```````````````````````````````` example 1954 ~~~~ ruby startline=3 $%@#$ 1955 def foo(x) 1956 return 3 1957 end 1958 ~~~~~~~ 1959 . 1960 <pre><code class="language-ruby">def foo(x) 1961 return 3 1962 end 1963 </code></pre> 1964 ```````````````````````````````` 1965 1966 1967 ```````````````````````````````` example 1968 ````; 1969 ```` 1970 . 
1971 <pre><code class="language-;"></code></pre> 1972 ```````````````````````````````` 1973 1974 1975 [Info strings] for backtick code blocks cannot contain backticks: 1976 1977 ```````````````````````````````` example 1978 ``` aa ``` 1979 foo 1980 . 1981 <p><code>aa</code> 1982 foo</p> 1983 ```````````````````````````````` 1984 1985 1986 [Info strings] for tilde code blocks can contain backticks and tildes: 1987 1988 ```````````````````````````````` example 1989 ~~~ aa ``` ~~~ 1990 foo 1991 ~~~ 1992 . 1993 <pre><code class="language-aa">foo 1994 </code></pre> 1995 ```````````````````````````````` 1996 1997 1998 Closing code fences cannot have [info strings]: 1999 2000 ```````````````````````````````` example 2001 ``` 2002 ``` aaa 2003 ``` 2004 . 2005 <pre><code>``` aaa 2006 </code></pre> 2007 ```````````````````````````````` 2008 2009 2010 2011 ## HTML blocks 2012 2013 An [HTML block](@) is a group of lines that is treated 2014 as raw HTML (and will not be escaped in HTML output). 2015 2016 There are seven kinds of [HTML block], which can be defined by their 2017 start and end conditions. The block begins with a line that meets a 2018 [start condition](@) (after up to three spaces optional indentation). 2019 It ends with the first subsequent line that meets a matching [end 2020 condition](@), or the last line of the document, or the last line of 2021 the [container block](#container-blocks) containing the current HTML 2022 block, if no line is encountered that meets the [end condition]. If 2023 the first line meets both the [start condition] and the [end 2024 condition], the block will contain just that line. 2025 2026 1. **Start condition:** line begins with the string `<script`, 2027 `<pre`, or `<style` (case-insensitive), followed by whitespace, 2028 the string `>`, or the end of the line.\ 2029 **End condition:** line contains an end tag 2030 `</script>`, `</pre>`, or `</style>` (case-insensitive; it 2031 need not match the start tag). 2032 2033 2. 
**Start condition:** line begins with the string `<!--`.\ 2034 **End condition:** line contains the string `-->`. 2035 2036 3. **Start condition:** line begins with the string `<?`.\ 2037 **End condition:** line contains the string `?>`. 2038 2039 4. **Start condition:** line begins with the string `<!` 2040 followed by an uppercase ASCII letter.\ 2041 **End condition:** line contains the character `>`. 2042 2043 5. **Start condition:** line begins with the string 2044 `<![CDATA[`.\ 2045 **End condition:** line contains the string `]]>`. 2046 2047 6. **Start condition:** line begins with the string `<` or `</` 2048 followed by one of the strings (case-insensitive) `address`, 2049 `article`, `aside`, `base`, `basefont`, `blockquote`, `body`, 2050 `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`, 2051 `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`, 2052 `footer`, `form`, `frame`, `frameset`, 2053 `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`, 2054 `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`, 2055 `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`, 2056 `section`, `source`, `summary`, `table`, `tbody`, `td`, 2057 `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed 2058 by [whitespace], the end of the line, the string `>`, or 2059 the string `/>`.\ 2060 **End condition:** line is followed by a [blank line]. 2061 2062 7. **Start condition:** line begins with a complete [open tag] 2063 (with any [tag name] other than `script`, 2064 `style`, or `pre`) or a complete [closing tag], 2065 followed only by [whitespace] or the end of the line.\ 2066 **End condition:** line is followed by a [blank line]. 2067 2068 HTML blocks continue until they are closed by their appropriate 2069 [end condition], or the last line of the document or other [container 2070 block](#container-blocks).
This means any HTML **within an HTML 2071 block** that might otherwise be recognised as a start condition will 2072 be ignored by the parser and passed through as-is, without changing 2073 the parser's state. 2074 2075 For instance, `<pre>` within an HTML block started by `<table>` will not affect 2076 the parser state; as the HTML block was started by start condition 6, it 2077 will end at any blank line. This can be surprising: 2078 2079 ```````````````````````````````` example 2080 <table><tr><td> 2081 <pre> 2082 **Hello**, 2083 2084 _world_. 2085 </pre> 2086 </td></tr></table> 2087 . 2088 <table><tr><td> 2089 <pre> 2090 **Hello**, 2091 <p><em>world</em>. 2092 </pre></p> 2093 </td></tr></table> 2094 ```````````````````````````````` 2095 2096 In this case, the HTML block is terminated by the blank line (the `**Hello**` 2097 text remains verbatim) and regular parsing resumes, with a paragraph, 2098 emphasised `world` and inline and block HTML following. 2099 2100 All types of [HTML blocks] except type 7 may interrupt 2101 a paragraph. Blocks of type 7 may not interrupt a paragraph. 2102 (This restriction is intended to prevent unwanted interpretation 2103 of long tags inside a wrapped paragraph as starting HTML blocks.) 2104 2105 Some simple examples follow. Here are some basic HTML blocks 2106 of type 6: 2107 2108 ```````````````````````````````` example 2109 <table> 2110 <tr> 2111 <td> 2112 hi 2113 </td> 2114 </tr> 2115 </table> 2116 2117 okay. 2118 . 2119 <table> 2120 <tr> 2121 <td> 2122 hi 2123 </td> 2124 </tr> 2125 </table> 2126 <p>okay.</p> 2127 ```````````````````````````````` 2128 2129 2130 ```````````````````````````````` example 2131 <div> 2132 *hello* 2133 <foo><a> 2134 . 2135 <div> 2136 *hello* 2137 <foo><a> 2138 ```````````````````````````````` 2139 2140 2141 A block can also start with a closing tag: 2142 2143 ```````````````````````````````` example 2144 </div> 2145 *foo* 2146 .
2147 </div> 2148 *foo* 2149 ```````````````````````````````` 2150 2151 2152 Here we have two HTML blocks with a Markdown paragraph between them: 2153 2154 ```````````````````````````````` example 2155 <DIV CLASS="foo"> 2156 2157 *Markdown* 2158 2159 </DIV> 2160 . 2161 <DIV CLASS="foo"> 2162 <p><em>Markdown</em></p> 2163 </DIV> 2164 ```````````````````````````````` 2165 2166 2167 The tag on the first line can be partial, as long 2168 as it is split where there would be whitespace: 2169 2170 ```````````````````````````````` example 2171 <div id="foo" 2172 class="bar"> 2173 </div> 2174 . 2175 <div id="foo" 2176 class="bar"> 2177 </div> 2178 ```````````````````````````````` 2179 2180 2181 ```````````````````````````````` example 2182 <div id="foo" class="bar 2183 baz"> 2184 </div> 2185 . 2186 <div id="foo" class="bar 2187 baz"> 2188 </div> 2189 ```````````````````````````````` 2190 2191 2192 An open tag need not be closed: 2193 ```````````````````````````````` example 2194 <div> 2195 *foo* 2196 2197 *bar* 2198 . 2199 <div> 2200 *foo* 2201 <p><em>bar</em></p> 2202 ```````````````````````````````` 2203 2204 2205 2206 A partial tag need not even be completed (garbage 2207 in, garbage out): 2208 2209 ```````````````````````````````` example 2210 <div id="foo" 2211 *hi* 2212 . 2213 <div id="foo" 2214 *hi* 2215 ```````````````````````````````` 2216 2217 2218 ```````````````````````````````` example 2219 <div class 2220 foo 2221 . 2222 <div class 2223 foo 2224 ```````````````````````````````` 2225 2226 2227 The initial tag doesn't even need to be a valid 2228 tag, as long as it starts like one: 2229 2230 ```````````````````````````````` example 2231 <div *???-&&&-<--- 2232 *foo* 2233 . 2234 <div *???-&&&-<--- 2235 *foo* 2236 ```````````````````````````````` 2237 2238 2239 In type 6 blocks, the initial tag need not be on a line by 2240 itself: 2241 2242 ```````````````````````````````` example 2243 <div><a href="bar">*foo*</a></div> 2244 . 
2245 <div><a href="bar">*foo*</a></div> 2246 ```````````````````````````````` 2247 2248 2249 ```````````````````````````````` example 2250 <table><tr><td> 2251 foo 2252 </td></tr></table> 2253 . 2254 <table><tr><td> 2255 foo 2256 </td></tr></table> 2257 ```````````````````````````````` 2258 2259 2260 Everything until the next blank line or end of document 2261 gets included in the HTML block. So, in the following 2262 example, what looks like a Markdown code block 2263 is actually part of the HTML block, which continues until a blank 2264 line or the end of the document is reached: 2265 2266 ```````````````````````````````` example 2267 <div></div> 2268 ``` c 2269 int x = 33; 2270 ``` 2271 . 2272 <div></div> 2273 ``` c 2274 int x = 33; 2275 ``` 2276 ```````````````````````````````` 2277 2278 2279 To start an [HTML block] with a tag that is *not* in the 2280 list of block-level tags in (6), you must put the tag by 2281 itself on the first line (and it must be complete): 2282 2283 ```````````````````````````````` example 2284 <a href="foo"> 2285 *bar* 2286 </a> 2287 . 2288 <a href="foo"> 2289 *bar* 2290 </a> 2291 ```````````````````````````````` 2292 2293 2294 In type 7 blocks, the [tag name] can be anything: 2295 2296 ```````````````````````````````` example 2297 <Warning> 2298 *bar* 2299 </Warning> 2300 . 2301 <Warning> 2302 *bar* 2303 </Warning> 2304 ```````````````````````````````` 2305 2306 2307 ```````````````````````````````` example 2308 <i class="foo"> 2309 *bar* 2310 </i> 2311 . 2312 <i class="foo"> 2313 *bar* 2314 </i> 2315 ```````````````````````````````` 2316 2317 2318 ```````````````````````````````` example 2319 </ins> 2320 *bar* 2321 . 2322 </ins> 2323 *bar* 2324 ```````````````````````````````` 2325 2326 2327 These rules are designed to allow us to work with tags that 2328 can function as either block-level or inline-level tags. 2329 The `<del>` tag is a nice example. We can surround content with 2330 `<del>` tags in three different ways. 
In this case, we get a raw 2331 HTML block, because the `<del>` tag is on a line by itself: 2332 2333 ```````````````````````````````` example 2334 <del> 2335 *foo* 2336 </del> 2337 . 2338 <del> 2339 *foo* 2340 </del> 2341 ```````````````````````````````` 2342 2343 2344 In this case, we get a raw HTML block that just includes 2345 the `<del>` tag (because it ends with the following blank 2346 line). So the contents get interpreted as CommonMark: 2347 2348 ```````````````````````````````` example 2349 <del> 2350 2351 *foo* 2352 2353 </del> 2354 . 2355 <del> 2356 <p><em>foo</em></p> 2357 </del> 2358 ```````````````````````````````` 2359 2360 2361 Finally, in this case, the `<del>` tags are interpreted 2362 as [raw HTML] *inside* the CommonMark paragraph. (Because 2363 the tag is not on a line by itself, we get inline HTML 2364 rather than an [HTML block].) 2365 2366 ```````````````````````````````` example 2367 <del>*foo*</del> 2368 . 2369 <p><del><em>foo</em></del></p> 2370 ```````````````````````````````` 2371 2372 2373 HTML tags designed to contain literal content 2374 (`script`, `style`, `pre`), comments, processing instructions, 2375 and declarations are treated somewhat differently. 2376 Instead of ending at the first blank line, these blocks 2377 end at the first line containing a corresponding end tag. 2378 As a result, these blocks can contain blank lines: 2379 2380 A pre tag (type 1): 2381 2382 ```````````````````````````````` example 2383 <pre language="haskell"><code> 2384 import Text.HTML.TagSoup 2385 2386 main :: IO () 2387 main = print $ parseTags tags 2388 </code></pre> 2389 okay 2390 . 
2391 <pre language="haskell"><code> 2392 import Text.HTML.TagSoup 2393 2394 main :: IO () 2395 main = print $ parseTags tags 2396 </code></pre> 2397 <p>okay</p> 2398 ```````````````````````````````` 2399 2400 2401 A script tag (type 1): 2402 2403 ```````````````````````````````` example 2404 <script type="text/javascript"> 2405 // JavaScript example 2406 2407 document.getElementById("demo").innerHTML = "Hello JavaScript!"; 2408 </script> 2409 okay 2410 . 2411 <script type="text/javascript"> 2412 // JavaScript example 2413 2414 document.getElementById("demo").innerHTML = "Hello JavaScript!"; 2415 </script> 2416 <p>okay</p> 2417 ```````````````````````````````` 2418 2419 2420 A style tag (type 1): 2421 2422 ```````````````````````````````` example 2423 <style 2424 type="text/css"> 2425 h1 {color:red;} 2426 2427 p {color:blue;} 2428 </style> 2429 okay 2430 . 2431 <style 2432 type="text/css"> 2433 h1 {color:red;} 2434 2435 p {color:blue;} 2436 </style> 2437 <p>okay</p> 2438 ```````````````````````````````` 2439 2440 2441 If there is no matching end tag, the block will end at the 2442 end of the document (or the enclosing [block quote][block quotes] 2443 or [list item][list items]): 2444 2445 ```````````````````````````````` example 2446 <style 2447 type="text/css"> 2448 2449 foo 2450 . 2451 <style 2452 type="text/css"> 2453 2454 foo 2455 ```````````````````````````````` 2456 2457 2458 ```````````````````````````````` example 2459 > <div> 2460 > foo 2461 2462 bar 2463 . 2464 <blockquote> 2465 <div> 2466 foo 2467 </blockquote> 2468 <p>bar</p> 2469 ```````````````````````````````` 2470 2471 2472 ```````````````````````````````` example 2473 - <div> 2474 - foo 2475 . 2476 <ul> 2477 <li> 2478 <div> 2479 </li> 2480 <li>foo</li> 2481 </ul> 2482 ```````````````````````````````` 2483 2484 2485 The end tag can occur on the same line as the start tag: 2486 2487 ```````````````````````````````` example 2488 <style>p{color:red;}</style> 2489 *foo* 2490 . 
2491 <style>p{color:red;}</style> 2492 <p><em>foo</em></p> 2493 ```````````````````````````````` 2494 2495 2496 ```````````````````````````````` example 2497 <!-- foo -->*bar* 2498 *baz* 2499 . 2500 <!-- foo -->*bar* 2501 <p><em>baz</em></p> 2502 ```````````````````````````````` 2503 2504 2505 Note that anything on the last line after the 2506 end tag will be included in the [HTML block]: 2507 2508 ```````````````````````````````` example 2509 <script> 2510 foo 2511 </script>1. *bar* 2512 . 2513 <script> 2514 foo 2515 </script>1. *bar* 2516 ```````````````````````````````` 2517 2518 2519 A comment (type 2): 2520 2521 ```````````````````````````````` example 2522 <!-- Foo 2523 2524 bar 2525 baz --> 2526 okay 2527 . 2528 <!-- Foo 2529 2530 bar 2531 baz --> 2532 <p>okay</p> 2533 ```````````````````````````````` 2534 2535 2536 2537 A processing instruction (type 3): 2538 2539 ```````````````````````````````` example 2540 <?php 2541 2542 echo '>'; 2543 2544 ?> 2545 okay 2546 . 2547 <?php 2548 2549 echo '>'; 2550 2551 ?> 2552 <p>okay</p> 2553 ```````````````````````````````` 2554 2555 2556 A declaration (type 4): 2557 2558 ```````````````````````````````` example 2559 <!DOCTYPE html> 2560 . 2561 <!DOCTYPE html> 2562 ```````````````````````````````` 2563 2564 2565 CDATA (type 5): 2566 2567 ```````````````````````````````` example 2568 <![CDATA[ 2569 function matchwo(a,b) 2570 { 2571 if (a < b && a < 0) then { 2572 return 1; 2573 2574 } else { 2575 2576 return 0; 2577 } 2578 } 2579 ]]> 2580 okay 2581 . 2582 <![CDATA[ 2583 function matchwo(a,b) 2584 { 2585 if (a < b && a < 0) then { 2586 return 1; 2587 2588 } else { 2589 2590 return 0; 2591 } 2592 } 2593 ]]> 2594 <p>okay</p> 2595 ```````````````````````````````` 2596 2597 2598 The opening tag can be indented 1-3 spaces, but not 4: 2599 2600 ```````````````````````````````` example 2601 <!-- foo --> 2602 2603 <!-- foo --> 2604 . 
2605 <!-- foo --> 2606 <pre><code>&lt;!-- foo --&gt; 2607 </code></pre> 2608 ```````````````````````````````` 2609 2610 2611 ```````````````````````````````` example 2612 <div> 2613 2614 <div> 2615 . 2616 <div> 2617 <pre><code>&lt;div&gt; 2618 </code></pre> 2619 ```````````````````````````````` 2620 2621 2622 An HTML block of types 1--6 can interrupt a paragraph, and need not be 2623 preceded by a blank line. 2624 2625 ```````````````````````````````` example 2626 Foo 2627 <div> 2628 bar 2629 </div> 2630 . 2631 <p>Foo</p> 2632 <div> 2633 bar 2634 </div> 2635 ```````````````````````````````` 2636 2637 2638 However, a following blank line is needed, except at the end of 2639 a document, and except for blocks of types 1--5, [above][HTML 2640 block]: 2641 2642 ```````````````````````````````` example 2643 <div> 2644 bar 2645 </div> 2646 *foo* 2647 . 2648 <div> 2649 bar 2650 </div> 2651 *foo* 2652 ```````````````````````````````` 2653 2654 2655 HTML blocks of type 7 cannot interrupt a paragraph: 2656 2657 ```````````````````````````````` example 2658 Foo 2659 <a href="bar"> 2660 baz 2661 . 2662 <p>Foo 2663 <a href="bar"> 2664 baz</p> 2665 ```````````````````````````````` 2666 2667 2668 This rule differs from John Gruber's original Markdown syntax 2669 specification, which says: 2670 2671 > The only restrictions are that block-level HTML elements — 2672 > e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from 2673 > surrounding content by blank lines, and the start and end tags of the 2674 > block should not be indented with tabs or spaces. 2675 2676 In some ways Gruber's rule is more restrictive than the one given 2677 here: 2678 2679 - It requires that an HTML block be preceded by a blank line. 2680 - It does not allow the start tag to be indented. 2681 - It requires a matching end tag, which it also does not allow to 2682 be indented. 2683 2684 Most Markdown implementations (including some of Gruber's own) do not 2685 respect all of these restrictions.
2686 2687 There is one respect, however, in which Gruber's rule is more liberal 2688 than the one given here, since it allows blank lines to occur inside 2689 an HTML block. There are two reasons for disallowing them here. 2690 First, it removes the need to parse balanced tags, which is 2691 expensive and can require backtracking from the end of the document 2692 if no matching end tag is found. Second, it provides a very simple 2693 and flexible way of including Markdown content inside HTML tags: 2694 simply separate the Markdown from the HTML using blank lines: 2695 2696 Compare: 2697 2698 ```````````````````````````````` example 2699 <div> 2700 2701 *Emphasized* text. 2702 2703 </div> 2704 . 2705 <div> 2706 <p><em>Emphasized</em> text.</p> 2707 </div> 2708 ```````````````````````````````` 2709 2710 2711 ```````````````````````````````` example 2712 <div> 2713 *Emphasized* text. 2714 </div> 2715 . 2716 <div> 2717 *Emphasized* text. 2718 </div> 2719 ```````````````````````````````` 2720 2721 2722 Some Markdown implementations have adopted a convention of 2723 interpreting content inside tags as text if the open tag has 2724 the attribute `markdown=1`. The rule given above seems a simpler and 2725 more elegant way of achieving the same expressive power, which is also 2726 much simpler to parse. 2727 2728 The main potential drawback is that one can no longer paste HTML 2729 blocks into Markdown documents with 100% reliability. However, 2730 *in most cases* this will work fine, because the blank lines in 2731 HTML are usually followed by HTML block tags. For example: 2732 2733 ```````````````````````````````` example 2734 <table> 2735 2736 <tr> 2737 2738 <td> 2739 Hi 2740 </td> 2741 2742 </tr> 2743 2744 </table> 2745 . 
2746 <table> 2747 <tr> 2748 <td> 2749 Hi 2750 </td> 2751 </tr> 2752 </table> 2753 ```````````````````````````````` 2754 2755 2756 There are problems, however, if the inner tags are indented 2757 *and* separated by blank lines, as then they will be interpreted as 2758 an indented code block: 2759 2760 ```````````````````````````````` example 2761 <table> 2762 2763 <tr> 2764 2765 <td> 2766 Hi 2767 </td> 2768 2769 </tr> 2770 2771 </table> 2772 . 2773 <table> 2774 <tr> 2775 <pre><code><td> 2776 Hi 2777 </td> 2778 </code></pre> 2779 </tr> 2780 </table> 2781 ```````````````````````````````` 2782 2783 2784 Fortunately, blank lines are usually not necessary and can be 2785 deleted. The exception is inside `<pre>` tags, but as described 2786 [above][HTML blocks], raw HTML blocks starting with `<pre>` 2787 *can* contain blank lines. 2788 2789 ## Link reference definitions 2790 2791 A [link reference definition](@) 2792 consists of a [link label], indented up to three spaces, followed 2793 by a colon (`:`), optional [whitespace] (including up to one 2794 [line ending]), a [link destination], 2795 optional [whitespace] (including up to one 2796 [line ending]), and an optional [link 2797 title], which if it is present must be separated 2798 from the [link destination] by [whitespace]. 2799 No further [non-whitespace characters] may occur on the line. 2800 2801 A [link reference definition] 2802 does not correspond to a structural element of a document. Instead, it 2803 defines a label which can be used in [reference links] 2804 and reference-style [images] elsewhere in the document. [Link 2805 reference definitions] can come either before or after the links that use 2806 them. 2807 2808 ```````````````````````````````` example 2809 [foo]: /url "title" 2810 2811 [foo] 2812 . 2813 <p><a href="/url" title="title">foo</a></p> 2814 ```````````````````````````````` 2815 2816 2817 ```````````````````````````````` example 2818 [foo]: 2819 /url 2820 'the title' 2821 2822 [foo] 2823 .
2824 <p><a href="/url" title="the title">foo</a></p> 2825 ```````````````````````````````` 2826 2827 2828 ```````````````````````````````` example 2829 [Foo*bar\]]:my_(url) 'title (with parens)' 2830 2831 [Foo*bar\]] 2832 . 2833 <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p> 2834 ```````````````````````````````` 2835 2836 2837 ```````````````````````````````` example 2838 [Foo bar]: 2839 <my url> 2840 'title' 2841 2842 [Foo bar] 2843 . 2844 <p><a href="my%20url" title="title">Foo bar</a></p> 2845 ```````````````````````````````` 2846 2847 2848 The title may extend over multiple lines: 2849 2850 ```````````````````````````````` example 2851 [foo]: /url ' 2852 title 2853 line1 2854 line2 2855 ' 2856 2857 [foo] 2858 . 2859 <p><a href="/url" title=" 2860 title 2861 line1 2862 line2 2863 ">foo</a></p> 2864 ```````````````````````````````` 2865 2866 2867 However, it may not contain a [blank line]: 2868 2869 ```````````````````````````````` example 2870 [foo]: /url 'title 2871 2872 with blank line' 2873 2874 [foo] 2875 . 2876 <p>[foo]: /url 'title</p> 2877 <p>with blank line'</p> 2878 <p>[foo]</p> 2879 ```````````````````````````````` 2880 2881 2882 The title may be omitted: 2883 2884 ```````````````````````````````` example 2885 [foo]: 2886 /url 2887 2888 [foo] 2889 . 2890 <p><a href="/url">foo</a></p> 2891 ```````````````````````````````` 2892 2893 2894 The link destination may not be omitted: 2895 2896 ```````````````````````````````` example 2897 [foo]: 2898 2899 [foo] 2900 . 2901 <p>[foo]:</p> 2902 <p>[foo]</p> 2903 ```````````````````````````````` 2904 2905 However, an empty link destination may be specified using 2906 angle brackets: 2907 2908 ```````````````````````````````` example 2909 [foo]: <> 2910 2911 [foo] 2912 . 
2913 <p><a href="">foo</a></p> 2914 ```````````````````````````````` 2915 2916 The title must be separated from the link destination by 2917 whitespace: 2918 2919 ```````````````````````````````` example 2920 [foo]: <bar>(baz) 2921 2922 [foo] 2923 . 2924 <p>[foo]: <bar>(baz)</p> 2925 <p>[foo]</p> 2926 ```````````````````````````````` 2927 2928 2929 Both title and destination can contain backslash escapes 2930 and literal backslashes: 2931 2932 ```````````````````````````````` example 2933 [foo]: /url\bar\*baz "foo\"bar\baz" 2934 2935 [foo] 2936 . 2937 <p><a href="/url%5Cbar*baz" title="foo"bar\baz">foo</a></p> 2938 ```````````````````````````````` 2939 2940 2941 A link can come before its corresponding definition: 2942 2943 ```````````````````````````````` example 2944 [foo] 2945 2946 [foo]: url 2947 . 2948 <p><a href="url">foo</a></p> 2949 ```````````````````````````````` 2950 2951 2952 If there are several matching definitions, the first one takes 2953 precedence: 2954 2955 ```````````````````````````````` example 2956 [foo] 2957 2958 [foo]: first 2959 [foo]: second 2960 . 2961 <p><a href="first">foo</a></p> 2962 ```````````````````````````````` 2963 2964 2965 As noted in the section on [Links], matching of labels is 2966 case-insensitive (see [matches]). 2967 2968 ```````````````````````````````` example 2969 [FOO]: /url 2970 2971 [Foo] 2972 . 2973 <p><a href="/url">Foo</a></p> 2974 ```````````````````````````````` 2975 2976 2977 ```````````````````````````````` example 2978 [ΑΓΩ]: /φου 2979 2980 [αγω] 2981 . 2982 <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p> 2983 ```````````````````````````````` 2984 2985 2986 Here is a link reference definition with no corresponding link. 2987 It contributes nothing to the document. 2988 2989 ```````````````````````````````` example 2990 [foo]: /url 2991 . 2992 ```````````````````````````````` 2993 2994 2995 Here is another one: 2996 2997 ```````````````````````````````` example 2998 [ 2999 foo 3000 ]: /url 3001 bar 3002 . 
3003 <p>bar</p> 3004 ```````````````````````````````` 3005 3006 3007 This is not a link reference definition, because there are 3008 [non-whitespace characters] after the title: 3009 3010 ```````````````````````````````` example 3011 [foo]: /url "title" ok 3012 . 3013 <p>[foo]: /url "title" ok</p> 3014 ```````````````````````````````` 3015 3016 3017 This is a link reference definition, but it has no title: 3018 3019 ```````````````````````````````` example 3020 [foo]: /url 3021 "title" ok 3022 . 3023 <p>"title" ok</p> 3024 ```````````````````````````````` 3025 3026 3027 This is not a link reference definition, because it is indented 3028 four spaces: 3029 3030 ```````````````````````````````` example 3031 [foo]: /url "title" 3032 3033 [foo] 3034 . 3035 <pre><code>[foo]: /url "title" 3036 </code></pre> 3037 <p>[foo]</p> 3038 ```````````````````````````````` 3039 3040 3041 This is not a link reference definition, because it occurs inside 3042 a code block: 3043 3044 ```````````````````````````````` example 3045 ``` 3046 [foo]: /url 3047 ``` 3048 3049 [foo] 3050 . 3051 <pre><code>[foo]: /url 3052 </code></pre> 3053 <p>[foo]</p> 3054 ```````````````````````````````` 3055 3056 3057 A [link reference definition] cannot interrupt a paragraph. 3058 3059 ```````````````````````````````` example 3060 Foo 3061 [bar]: /baz 3062 3063 [bar] 3064 . 3065 <p>Foo 3066 [bar]: /baz</p> 3067 <p>[bar]</p> 3068 ```````````````````````````````` 3069 3070 3071 However, it can directly follow other block elements, such as headings 3072 and thematic breaks, and it need not be followed by a blank line. 3073 3074 ```````````````````````````````` example 3075 # [Foo] 3076 [foo]: /url 3077 > bar 3078 . 3079 <h1><a href="/url">Foo</a></h1> 3080 <blockquote> 3081 <p>bar</p> 3082 </blockquote> 3083 ```````````````````````````````` 3084 3085 ```````````````````````````````` example 3086 [foo]: /url 3087 bar 3088 === 3089 [foo] 3090 . 
3091 <h1>bar</h1> 3092 <p><a href="/url">foo</a></p> 3093 ```````````````````````````````` 3094 3095 ```````````````````````````````` example 3096 [foo]: /url 3097 === 3098 [foo] 3099 . 3100 <p>=== 3101 <a href="/url">foo</a></p> 3102 ```````````````````````````````` 3103 3104 3105 Several [link reference definitions] 3106 can occur one after another, without intervening blank lines. 3107 3108 ```````````````````````````````` example 3109 [foo]: /foo-url "foo" 3110 [bar]: /bar-url 3111 "bar" 3112 [baz]: /baz-url 3113 3114 [foo], 3115 [bar], 3116 [baz] 3117 . 3118 <p><a href="/foo-url" title="foo">foo</a>, 3119 <a href="/bar-url" title="bar">bar</a>, 3120 <a href="/baz-url">baz</a></p> 3121 ```````````````````````````````` 3122 3123 3124 [Link reference definitions] can occur 3125 inside block containers, like lists and block quotations. They 3126 affect the entire document, not just the container in which they 3127 are defined: 3128 3129 ```````````````````````````````` example 3130 [foo] 3131 3132 > [foo]: /url 3133 . 3134 <p><a href="/url">foo</a></p> 3135 <blockquote> 3136 </blockquote> 3137 ```````````````````````````````` 3138 3139 3140 Whether something is a [link reference definition] is 3141 independent of whether the link reference it defines is 3142 used in the document. Thus, for example, the following 3143 document contains just a link reference definition, and 3144 no visible content: 3145 3146 ```````````````````````````````` example 3147 [foo]: /url 3148 . 3149 ```````````````````````````````` 3150 3151 3152 ## Paragraphs 3153 3154 A sequence of non-blank lines that cannot be interpreted as other 3155 kinds of blocks forms a [paragraph](@). 3156 The contents of the paragraph are the result of parsing the 3157 paragraph's raw content as inlines. The paragraph's raw content 3158 is formed by concatenating the lines and removing initial and final 3159 [whitespace]. 
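
The construction of a paragraph's raw content can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function name `paragraph_raw_content` is hypothetical, the input is assumed to be the list of lines (without line endings) already identified as belonging to one paragraph, and only spaces and tabs are treated as whitespace. Initial whitespace is removed from every line, while final whitespace is removed only at the end of the paragraph, so trailing spaces on interior lines survive for inline parsing.

```python
def paragraph_raw_content(lines):
    """Join paragraph lines into raw content (illustrative sketch only)."""
    # Strip initial whitespace from each line. Trailing whitespace is kept
    # for now, because two or more trailing spaces on an interior line are
    # later interpreted as a hard line break by the inline parser.
    stripped = [line.lstrip(" \t") for line in lines]
    # Concatenate the lines, then strip the final whitespace of the paragraph.
    return "\n".join(stripped).rstrip(" \t\n")
```

Under these assumptions, `paragraph_raw_content(["  aaa", " bbb"])` yields `"aaa\nbbb"`, and trailing spaces on the last line are dropped before inline parsing begins.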
3160 3161 A simple example with two paragraphs: 3162 3163 ```````````````````````````````` example 3164 aaa 3165 3166 bbb 3167 . 3168 <p>aaa</p> 3169 <p>bbb</p> 3170 ```````````````````````````````` 3171 3172 3173 Paragraphs can contain multiple lines, but no blank lines: 3174 3175 ```````````````````````````````` example 3176 aaa 3177 bbb 3178 3179 ccc 3180 ddd 3181 . 3182 <p>aaa 3183 bbb</p> 3184 <p>ccc 3185 ddd</p> 3186 ```````````````````````````````` 3187 3188 3189 Multiple blank lines between paragraphs have no effect: 3190 3191 ```````````````````````````````` example 3192 aaa 3193 3194 3195 bbb 3196 . 3197 <p>aaa</p> 3198 <p>bbb</p> 3199 ```````````````````````````````` 3200 3201 3202 Leading spaces are skipped: 3203 3204 ```````````````````````````````` example 3205 aaa 3206 bbb 3207 . 3208 <p>aaa 3209 bbb</p> 3210 ```````````````````````````````` 3211 3212 3213 Lines after the first may be indented any amount, since indented 3214 code blocks cannot interrupt paragraphs. 3215 3216 ```````````````````````````````` example 3217 aaa 3218 bbb 3219 ccc 3220 . 3221 <p>aaa 3222 bbb 3223 ccc</p> 3224 ```````````````````````````````` 3225 3226 3227 However, the first line may be indented at most three spaces, 3228 or an indented code block will be triggered: 3229 3230 ```````````````````````````````` example 3231 aaa 3232 bbb 3233 . 3234 <p>aaa 3235 bbb</p> 3236 ```````````````````````````````` 3237 3238 3239 ```````````````````````````````` example 3240 aaa 3241 bbb 3242 . 3243 <pre><code>aaa 3244 </code></pre> 3245 <p>bbb</p> 3246 ```````````````````````````````` 3247 3248 3249 Final spaces are stripped before inline parsing, so a paragraph 3250 that ends with two or more spaces will not end with a [hard line 3251 break]: 3252 3253 ```````````````````````````````` example 3254 aaa 3255 bbb 3256 .
3257 <p>aaa<br /> 3258 bbb</p> 3259 ```````````````````````````````` 3260 3261 3262 ## Blank lines 3263 3264 [Blank lines] between block-level elements are ignored, 3265 except for the role they play in determining whether a [list] 3266 is [tight] or [loose]. 3267 3268 Blank lines at the beginning and end of the document are also ignored. 3269 3270 ```````````````````````````````` example 3271 3272 3273 aaa 3274 3275 3276 # aaa 3277 3278 3279 . 3280 <p>aaa</p> 3281 <h1>aaa</h1> 3282 ```````````````````````````````` 3283 3284 3285 3286 # Container blocks 3287 3288 A [container block](#container-blocks) is a block that has other 3289 blocks as its contents. There are two basic kinds of container blocks: 3290 [block quotes] and [list items]. 3291 [Lists] are meta-containers for [list items]. 3292 3293 We define the syntax for container blocks recursively. The general 3294 form of the definition is: 3295 3296 > If X is a sequence of blocks, then the result of 3297 > transforming X in such-and-such a way is a container of type Y 3298 > with these blocks as its content. 3299 3300 So, we explain what counts as a block quote or list item by explaining 3301 how these can be *generated* from their contents. This should suffice 3302 to define the syntax, although it does not give a recipe for *parsing* 3303 these constructions. (A recipe is provided below in the section entitled 3304 [A parsing strategy](#appendix-a-parsing-strategy).) 3305 3306 ## Block quotes 3307 3308 A [block quote marker](@) 3309 consists of 0-3 spaces of initial indent, plus (a) the character `>` together 3310 with a following space, or (b) a single character `>` not followed by a space. 3311 3312 The following rules define [block quotes]: 3313 3314 1. **Basic case.** If a string of lines *Ls* constitute a sequence 3315 of blocks *Bs*, then the result of prepending a [block quote 3316 marker] to the beginning of each line in *Ls* 3317 is a [block quote](#block-quotes) containing *Bs*. 3318 3319 2. 
**Laziness.** If a string of lines *Ls* constitute a [block 3320 quote](#block-quotes) with contents *Bs*, then the result of deleting 3321 the initial [block quote marker] from one or 3322 more lines in which the next [non-whitespace character] after the [block 3323 quote marker] is [paragraph continuation 3324 text] is a block quote with *Bs* as its content. 3325 [Paragraph continuation text](@) is text 3326 that will be parsed as part of the content of a paragraph, but does 3327 not occur at the beginning of the paragraph. 3328 3329 3. **Consecutiveness.** A document cannot contain two [block 3330 quotes] in a row unless there is a [blank line] between them. 3331 3332 Nothing else counts as a [block quote](#block-quotes). 3333 3334 Here is a simple example: 3335 3336 ```````````````````````````````` example 3337 > # Foo 3338 > bar 3339 > baz 3340 . 3341 <blockquote> 3342 <h1>Foo</h1> 3343 <p>bar 3344 baz</p> 3345 </blockquote> 3346 ```````````````````````````````` 3347 3348 3349 The spaces after the `>` characters can be omitted: 3350 3351 ```````````````````````````````` example 3352 ># Foo 3353 >bar 3354 > baz 3355 . 3356 <blockquote> 3357 <h1>Foo</h1> 3358 <p>bar 3359 baz</p> 3360 </blockquote> 3361 ```````````````````````````````` 3362 3363 3364 The `>` characters can be indented 1-3 spaces: 3365 3366 ```````````````````````````````` example 3367 > # Foo 3368 > bar 3369 > baz 3370 . 3371 <blockquote> 3372 <h1>Foo</h1> 3373 <p>bar 3374 baz</p> 3375 </blockquote> 3376 ```````````````````````````````` 3377 3378 3379 Four spaces gives us a code block: 3380 3381 ```````````````````````````````` example 3382 > # Foo 3383 > bar 3384 > baz 3385 . 3386 <pre><code>> # Foo 3387 > bar 3388 > baz 3389 </code></pre> 3390 ```````````````````````````````` 3391 3392 3393 The Laziness clause allows us to omit the `>` before 3394 [paragraph continuation text]: 3395 3396 ```````````````````````````````` example 3397 > # Foo 3398 > bar 3399 baz 3400 . 
3401 <blockquote> 3402 <h1>Foo</h1> 3403 <p>bar 3404 baz</p> 3405 </blockquote> 3406 ```````````````````````````````` 3407 3408 3409 A block quote can contain some lazy and some non-lazy 3410 continuation lines: 3411 3412 ```````````````````````````````` example 3413 > bar 3414 baz 3415 > foo 3416 . 3417 <blockquote> 3418 <p>bar 3419 baz 3420 foo</p> 3421 </blockquote> 3422 ```````````````````````````````` 3423 3424 3425 Laziness only applies to lines that would have been continuations of 3426 paragraphs had they been prepended with [block quote markers]. 3427 For example, the `> ` cannot be omitted in the second line of 3428 3429 ``` markdown 3430 > foo 3431 > --- 3432 ``` 3433 3434 without changing the meaning: 3435 3436 ```````````````````````````````` example 3437 > foo 3438 --- 3439 . 3440 <blockquote> 3441 <p>foo</p> 3442 </blockquote> 3443 <hr /> 3444 ```````````````````````````````` 3445 3446 3447 Similarly, if we omit the `> ` in the second line of 3448 3449 ``` markdown 3450 > - foo 3451 > - bar 3452 ``` 3453 3454 then the block quote ends after the first line: 3455 3456 ```````````````````````````````` example 3457 > - foo 3458 - bar 3459 . 3460 <blockquote> 3461 <ul> 3462 <li>foo</li> 3463 </ul> 3464 </blockquote> 3465 <ul> 3466 <li>bar</li> 3467 </ul> 3468 ```````````````````````````````` 3469 3470 3471 For the same reason, we can't omit the `> ` in front of 3472 subsequent lines of an indented or fenced code block: 3473 3474 ```````````````````````````````` example 3475 > foo 3476 bar 3477 . 3478 <blockquote> 3479 <pre><code>foo 3480 </code></pre> 3481 </blockquote> 3482 <pre><code>bar 3483 </code></pre> 3484 ```````````````````````````````` 3485 3486 3487 ```````````````````````````````` example 3488 > ``` 3489 foo 3490 ``` 3491 . 
3492 <blockquote> 3493 <pre><code></code></pre> 3494 </blockquote> 3495 <p>foo</p> 3496 <pre><code></code></pre> 3497 ```````````````````````````````` 3498 3499 3500 Note that in the following case, we have a [lazy 3501 continuation line]: 3502 3503 ```````````````````````````````` example 3504 > foo 3505 - bar 3506 . 3507 <blockquote> 3508 <p>foo 3509 - bar</p> 3510 </blockquote> 3511 ```````````````````````````````` 3512 3513 3514 To see why, note that in 3515 3516 ```markdown 3517 > foo 3518 > - bar 3519 ``` 3520 3521 the `- bar` is indented too far to start a list, and can't 3522 be an indented code block because indented code blocks cannot 3523 interrupt paragraphs, so it is [paragraph continuation text]. 3524 3525 A block quote can be empty: 3526 3527 ```````````````````````````````` example 3528 > 3529 . 3530 <blockquote> 3531 </blockquote> 3532 ```````````````````````````````` 3533 3534 3535 ```````````````````````````````` example 3536 > 3537 > 3538 > 3539 . 3540 <blockquote> 3541 </blockquote> 3542 ```````````````````````````````` 3543 3544 3545 A block quote can have initial or final blank lines: 3546 3547 ```````````````````````````````` example 3548 > 3549 > foo 3550 > 3551 . 3552 <blockquote> 3553 <p>foo</p> 3554 </blockquote> 3555 ```````````````````````````````` 3556 3557 3558 A blank line always separates block quotes: 3559 3560 ```````````````````````````````` example 3561 > foo 3562 3563 > bar 3564 . 3565 <blockquote> 3566 <p>foo</p> 3567 </blockquote> 3568 <blockquote> 3569 <p>bar</p> 3570 </blockquote> 3571 ```````````````````````````````` 3572 3573 3574 (Most current Markdown implementations, including John Gruber's 3575 original `Markdown.pl`, will parse this example as a single block quote 3576 with two paragraphs. But it seems better to allow the author to decide 3577 whether two block quotes or one are wanted.) 
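
The [block quote marker] defined at the start of this section can be recognized with a short regular expression. The following Python sketch is illustrative only: the names `BLOCK_QUOTE_MARKER` and `strip_block_quote_marker` are hypothetical, tabs are ignored, and laziness is not handled. It removes one marker from a line, or returns `None` when the line does not begin with one.

```python
import re

# 0-3 spaces of initial indent, then `>`, then at most one following space.
# (Assumption: tabs are not considered in this sketch.)
BLOCK_QUOTE_MARKER = re.compile(r" {0,3}> ?")


def strip_block_quote_marker(line):
    """Return the line with one block quote marker removed, or None."""
    match = BLOCK_QUOTE_MARKER.match(line)
    if match is None:
        # Four or more spaces of indent, or no `>` at all: not a marker.
        return None
    return line[match.end():]
```

Because the marker consumes at most one space after the `>`, any further spaces stay in the content. For instance, a `>` followed by five spaces and `code` leaves four spaces before `code`, which is the indentation an indented code block inside a block quote requires.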
3578 3579 Consecutiveness means that if we put these block quotes together, 3580 we get a single block quote: 3581 3582 ```````````````````````````````` example 3583 > foo 3584 > bar 3585 . 3586 <blockquote> 3587 <p>foo 3588 bar</p> 3589 </blockquote> 3590 ```````````````````````````````` 3591 3592 3593 To get a block quote with two paragraphs, use: 3594 3595 ```````````````````````````````` example 3596 > foo 3597 > 3598 > bar 3599 . 3600 <blockquote> 3601 <p>foo</p> 3602 <p>bar</p> 3603 </blockquote> 3604 ```````````````````````````````` 3605 3606 3607 Block quotes can interrupt paragraphs: 3608 3609 ```````````````````````````````` example 3610 foo 3611 > bar 3612 . 3613 <p>foo</p> 3614 <blockquote> 3615 <p>bar</p> 3616 </blockquote> 3617 ```````````````````````````````` 3618 3619 3620 In general, blank lines are not needed before or after block 3621 quotes: 3622 3623 ```````````````````````````````` example 3624 > aaa 3625 *** 3626 > bbb 3627 . 3628 <blockquote> 3629 <p>aaa</p> 3630 </blockquote> 3631 <hr /> 3632 <blockquote> 3633 <p>bbb</p> 3634 </blockquote> 3635 ```````````````````````````````` 3636 3637 3638 However, because of laziness, a blank line is needed between 3639 a block quote and a following paragraph: 3640 3641 ```````````````````````````````` example 3642 > bar 3643 baz 3644 . 3645 <blockquote> 3646 <p>bar 3647 baz</p> 3648 </blockquote> 3649 ```````````````````````````````` 3650 3651 3652 ```````````````````````````````` example 3653 > bar 3654 3655 baz 3656 . 3657 <blockquote> 3658 <p>bar</p> 3659 </blockquote> 3660 <p>baz</p> 3661 ```````````````````````````````` 3662 3663 3664 ```````````````````````````````` example 3665 > bar 3666 > 3667 baz 3668 . 
3669 <blockquote> 3670 <p>bar</p> 3671 </blockquote> 3672 <p>baz</p> 3673 ```````````````````````````````` 3674 3675 3676 It is a consequence of the Laziness rule that any number 3677 of initial `>`s may be omitted on a continuation line of a 3678 nested block quote: 3679 3680 ```````````````````````````````` example 3681 > > > foo 3682 bar 3683 . 3684 <blockquote> 3685 <blockquote> 3686 <blockquote> 3687 <p>foo 3688 bar</p> 3689 </blockquote> 3690 </blockquote> 3691 </blockquote> 3692 ```````````````````````````````` 3693 3694 3695 ```````````````````````````````` example 3696 >>> foo 3697 > bar 3698 >>baz 3699 . 3700 <blockquote> 3701 <blockquote> 3702 <blockquote> 3703 <p>foo 3704 bar 3705 baz</p> 3706 </blockquote> 3707 </blockquote> 3708 </blockquote> 3709 ```````````````````````````````` 3710 3711 3712 When including an indented code block in a block quote, 3713 remember that the [block quote marker] includes 3714 both the `>` and a following space. So *five spaces* are needed after 3715 the `>`: 3716 3717 ```````````````````````````````` example 3718 > code 3719 3720 > not code 3721 . 3722 <blockquote> 3723 <pre><code>code 3724 </code></pre> 3725 </blockquote> 3726 <blockquote> 3727 <p>not code</p> 3728 </blockquote> 3729 ```````````````````````````````` 3730 3731 3732 3733 ## List items 3734 3735 A [list marker](@) is a 3736 [bullet list marker] or an [ordered list marker]. 3737 3738 A [bullet list marker](@) 3739 is a `-`, `+`, or `*` character. 3740 3741 An [ordered list marker](@) 3742 is a sequence of 1--9 arabic digits (`0-9`), followed by either a 3743 `.` character or a `)` character. (The reason for the length 3744 limit is that with 10 digits we start seeing integer overflows 3745 in some browsers.) 3746 3747 The following rules define [list items]: 3748 3749 1. 
**Basic case.** If a sequence of lines *Ls* constitute a sequence of 3750 blocks *Bs* starting with a [non-whitespace character], and *M* is a 3751 list marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result 3752 of prepending *M* and the following spaces to the first line of 3753 *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a 3754 list item with *Bs* as its contents. The type of the list item 3755 (bullet or ordered) is determined by the type of its list marker. 3756 If the list item is ordered, then it is also assigned a start 3757 number, based on the ordered list marker. 3758 3759 Exceptions: 3760 3761 1. When the first list item in a [list] interrupts 3762 a paragraph---that is, when it starts on a line that would 3763 otherwise count as [paragraph continuation text]---then (a) 3764 the lines *Ls* must not begin with a blank line, and (b) if 3765 the list item is ordered, the start number must be 1. 3766 2. If any line is a [thematic break][thematic breaks] then 3767 that line is not a list item. 3768 3769 For example, let *Ls* be the lines 3770 3771 ```````````````````````````````` example 3772 A paragraph 3773 with two lines. 3774 3775 indented code 3776 3777 > A block quote. 3778 . 3779 <p>A paragraph 3780 with two lines.</p> 3781 <pre><code>indented code 3782 </code></pre> 3783 <blockquote> 3784 <p>A block quote.</p> 3785 </blockquote> 3786 ```````````````````````````````` 3787 3788 3789 And let *M* be the marker `1.`, and *N* = 2. Then rule #1 says 3790 that the following is an ordered list item with start number 1, 3791 and the same contents as *Ls*: 3792 3793 ```````````````````````````````` example 3794 1. A paragraph 3795 with two lines. 3796 3797 indented code 3798 3799 > A block quote. 3800 . 
3801 <ol> 3802 <li> 3803 <p>A paragraph 3804 with two lines.</p> 3805 <pre><code>indented code 3806 </code></pre> 3807 <blockquote> 3808 <p>A block quote.</p> 3809 </blockquote> 3810 </li> 3811 </ol> 3812 ```````````````````````````````` 3813 3814 3815 The most important thing to notice is that the position of 3816 the text after the list marker determines how much indentation 3817 is needed in subsequent blocks in the list item. If the list 3818 marker takes up two spaces, and there are three spaces between 3819 the list marker and the next [non-whitespace character], then blocks 3820 must be indented five spaces in order to fall under the list 3821 item. 3822 3823 Here are some examples showing how far content must be indented to be 3824 put under the list item: 3825 3826 ```````````````````````````````` example 3827 - one 3828 3829 two 3830 . 3831 <ul> 3832 <li>one</li> 3833 </ul> 3834 <p>two</p> 3835 ```````````````````````````````` 3836 3837 3838 ```````````````````````````````` example 3839 - one 3840 3841 two 3842 . 3843 <ul> 3844 <li> 3845 <p>one</p> 3846 <p>two</p> 3847 </li> 3848 </ul> 3849 ```````````````````````````````` 3850 3851 3852 ```````````````````````````````` example 3853 - one 3854 3855 two 3856 . 3857 <ul> 3858 <li>one</li> 3859 </ul> 3860 <pre><code> two 3861 </code></pre> 3862 ```````````````````````````````` 3863 3864 3865 ```````````````````````````````` example 3866 - one 3867 3868 two 3869 . 3870 <ul> 3871 <li> 3872 <p>one</p> 3873 <p>two</p> 3874 </li> 3875 </ul> 3876 ```````````````````````````````` 3877 3878 3879 It is tempting to think of this in terms of columns: the continuation 3880 blocks must be indented at least to the column of the first 3881 [non-whitespace character] after the list marker. However, that is not quite right. 3882 The spaces after the list marker determine how much relative indentation 3883 is needed. 
Which column this indentation reaches will depend on 3884 how the list item is embedded in other constructions, as shown by 3885 this example: 3886 3887 ```````````````````````````````` example 3888 > > 1. one 3889 >> 3890 >> two 3891 . 3892 <blockquote> 3893 <blockquote> 3894 <ol> 3895 <li> 3896 <p>one</p> 3897 <p>two</p> 3898 </li> 3899 </ol> 3900 </blockquote> 3901 </blockquote> 3902 ```````````````````````````````` 3903 3904 3905 Here `two` occurs in the same column as the list marker `1.`, 3906 but is actually contained in the list item, because there is 3907 sufficient indentation after the last containing blockquote marker. 3908 3909 The converse is also possible. In the following example, the word `two` 3910 occurs far to the right of the initial text of the list item, `one`, but 3911 it is not considered part of the list item, because it is not indented 3912 far enough past the blockquote marker: 3913 3914 ```````````````````````````````` example 3915 >>- one 3916 >> 3917 > > two 3918 . 3919 <blockquote> 3920 <blockquote> 3921 <ul> 3922 <li>one</li> 3923 </ul> 3924 <p>two</p> 3925 </blockquote> 3926 </blockquote> 3927 ```````````````````````````````` 3928 3929 3930 Note that at least one space is needed between the list marker and 3931 any following content, so these are not list items: 3932 3933 ```````````````````````````````` example 3934 -one 3935 3936 2.two 3937 . 3938 <p>-one</p> 3939 <p>2.two</p> 3940 ```````````````````````````````` 3941 3942 3943 A list item may contain blocks that are separated by more than 3944 one blank line. 3945 3946 ```````````````````````````````` example 3947 - foo 3948 3949 3950 bar 3951 . 3952 <ul> 3953 <li> 3954 <p>foo</p> 3955 <p>bar</p> 3956 </li> 3957 </ul> 3958 ```````````````````````````````` 3959 3960 3961 A list item may contain any kind of block: 3962 3963 ```````````````````````````````` example 3964 1. foo 3965 3966 ``` 3967 bar 3968 ``` 3969 3970 baz 3971 3972 > bam 3973 . 
3974 <ol> 3975 <li> 3976 <p>foo</p> 3977 <pre><code>bar 3978 </code></pre> 3979 <p>baz</p> 3980 <blockquote> 3981 <p>bam</p> 3982 </blockquote> 3983 </li> 3984 </ol> 3985 ```````````````````````````````` 3986 3987 3988 A list item that contains an indented code block will preserve 3989 empty lines within the code block verbatim. 3990 3991 ```````````````````````````````` example 3992 - Foo 3993 3994 bar 3995 3996 3997 baz 3998 . 3999 <ul> 4000 <li> 4001 <p>Foo</p> 4002 <pre><code>bar 4003 4004 4005 baz 4006 </code></pre> 4007 </li> 4008 </ul> 4009 ```````````````````````````````` 4010 4011 Note that ordered list start numbers must be nine digits or less: 4012 4013 ```````````````````````````````` example 4014 123456789. ok 4015 . 4016 <ol start="123456789"> 4017 <li>ok</li> 4018 </ol> 4019 ```````````````````````````````` 4020 4021 4022 ```````````````````````````````` example 4023 1234567890. not ok 4024 . 4025 <p>1234567890. not ok</p> 4026 ```````````````````````````````` 4027 4028 4029 A start number may begin with 0s: 4030 4031 ```````````````````````````````` example 4032 0. ok 4033 . 4034 <ol start="0"> 4035 <li>ok</li> 4036 </ol> 4037 ```````````````````````````````` 4038 4039 4040 ```````````````````````````````` example 4041 003. ok 4042 . 4043 <ol start="3"> 4044 <li>ok</li> 4045 </ol> 4046 ```````````````````````````````` 4047 4048 4049 A start number may not be negative: 4050 4051 ```````````````````````````````` example 4052 -1. not ok 4053 . 4054 <p>-1. not ok</p> 4055 ```````````````````````````````` 4056 4057 4058 4059 2. **Item starting with indented code.** If a sequence of lines *Ls* 4060 constitute a sequence of blocks *Bs* starting with an indented code 4061 block, and *M* is a list marker of width *W* followed by 4062 one space, then the result of prepending *M* and the following 4063 space to the first line of *Ls*, and indenting subsequent lines of 4064 *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. 
4065 If a line is empty, then it need not be indented. The type of the 4066 list item (bullet or ordered) is determined by the type of its list 4067 marker. If the list item is ordered, then it is also assigned a 4068 start number, based on the ordered list marker. 4069 4070 An indented code block will have to be indented four spaces beyond 4071 the edge of the region where text will be included in the list item. 4072 In the following case that is 6 spaces: 4073 4074 ```````````````````````````````` example 4075 - foo 4076 4077 bar 4078 . 4079 <ul> 4080 <li> 4081 <p>foo</p> 4082 <pre><code>bar 4083 </code></pre> 4084 </li> 4085 </ul> 4086 ```````````````````````````````` 4087 4088 4089 And in this case it is 11 spaces: 4090 4091 ```````````````````````````````` example 4092 10. foo 4093 4094 bar 4095 . 4096 <ol start="10"> 4097 <li> 4098 <p>foo</p> 4099 <pre><code>bar 4100 </code></pre> 4101 </li> 4102 </ol> 4103 ```````````````````````````````` 4104 4105 4106 If the *first* block in the list item is an indented code block, 4107 then by rule #2, the contents must be indented *one* space after the 4108 list marker: 4109 4110 ```````````````````````````````` example 4111 indented code 4112 4113 paragraph 4114 4115 more code 4116 . 4117 <pre><code>indented code 4118 </code></pre> 4119 <p>paragraph</p> 4120 <pre><code>more code 4121 </code></pre> 4122 ```````````````````````````````` 4123 4124 4125 ```````````````````````````````` example 4126 1. indented code 4127 4128 paragraph 4129 4130 more code 4131 . 4132 <ol> 4133 <li> 4134 <pre><code>indented code 4135 </code></pre> 4136 <p>paragraph</p> 4137 <pre><code>more code 4138 </code></pre> 4139 </li> 4140 </ol> 4141 ```````````````````````````````` 4142 4143 4144 Note that an additional space indent is interpreted as space 4145 inside the code block: 4146 4147 ```````````````````````````````` example 4148 1. indented code 4149 4150 paragraph 4151 4152 more code 4153 . 
<ol>
<li>
<pre><code> indented code
</code></pre>
<p>paragraph</p>
<pre><code>more code
</code></pre>
</li>
</ol>
````````````````````````````````


Note that rules #1 and #2 only apply to two cases: (a) cases
in which the lines to be included in a list item begin with a
[non-whitespace character], and (b) cases in which
they begin with an indented code
block. In a case like the following, where the first block begins with
a three-space indent, the rules do not allow us to form a list item by
indenting the whole thing and prepending a list marker:

```````````````````````````````` example
   foo

bar
.
<p>foo</p>
<p>bar</p>
````````````````````````````````


```````````````````````````````` example
-    foo

  bar
.
<ul>
<li>foo</li>
</ul>
<p>bar</p>
````````````````````````````````


This is not a significant restriction, because when a block begins
with 1-3 spaces indent, the indentation can always be removed without
a change in interpretation, allowing rule #1 to be applied. So, in
the above case:

```````````````````````````````` example
-  foo

   bar
.
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
````````````````````````````````


3.  **Item starting with a blank line.** If a sequence of lines *Ls*
    starting with a single [blank line] constitutes a (possibly empty)
    sequence of blocks *Bs*, not separated from each other by more than
    one blank line, and *M* is a list marker of width *W*,
    then the result of prepending *M* to the first line of *Ls*, and
    indenting subsequent lines of *Ls* by *W + 1* spaces, is a list
    item with *Bs* as its contents.
    If a line is empty, then it need not be indented.
    The type of the
    list item (bullet or ordered) is determined by the type of its list
    marker. If the list item is ordered, then it is also assigned a
    start number, based on the ordered list marker.

Here are some list items that start with a blank line but are not empty:

```````````````````````````````` example
-
  foo
-
  ```
  bar
  ```
-
      baz
.
<ul>
<li>foo</li>
<li>
<pre><code>bar
</code></pre>
</li>
<li>
<pre><code>baz
</code></pre>
</li>
</ul>
````````````````````````````````

When the list item starts with a blank line, the number of spaces
following the list marker doesn't change the required indentation:

```````````````````````````````` example
-   
  foo
.
<ul>
<li>foo</li>
</ul>
````````````````````````````````


A list item can begin with at most one blank line.
In the following example, `foo` is not part of the list
item:

```````````````````````````````` example
-

  foo
.
<ul>
<li></li>
</ul>
<p>foo</p>
````````````````````````````````


Here is an empty bullet list item:

```````````````````````````````` example
- foo
-
- bar
.
<ul>
<li>foo</li>
<li></li>
<li>bar</li>
</ul>
````````````````````````````````


It does not matter whether there are spaces following the [list marker]:

```````````````````````````````` example
- foo
-   
- bar
.
<ul>
<li>foo</li>
<li></li>
<li>bar</li>
</ul>
````````````````````````````````


Here is an empty ordered list item:

```````````````````````````````` example
1. foo
2.
3. bar
.
<ol>
<li>foo</li>
<li></li>
<li>bar</li>
</ol>
````````````````````````````````


A list may start or end with an empty list item:

```````````````````````````````` example
*
.
<ul>
<li></li>
</ul>
````````````````````````````````

However, an empty list item cannot interrupt a paragraph:

```````````````````````````````` example
foo
*

foo
1.
.
<p>foo
*</p>
<p>foo
1.</p>
````````````````````````````````


4.  **Indentation.** If a sequence of lines *Ls* constitutes a list item
    according to rule #1, #2, or #3, then the result of indenting each line
    of *Ls* by 1-3 spaces (the same for each line) also constitutes a
    list item with the same contents and attributes. If a line is
    empty, then it need not be indented.

Indented one space:

```````````````````````````````` example
 1.  A paragraph
     with two lines.

         indented code

     > A block quote.
.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>
````````````````````````````````


Indented two spaces:

```````````````````````````````` example
  1.  A paragraph
      with two lines.

          indented code

      > A block quote.
.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>
````````````````````````````````


Indented three spaces:

```````````````````````````````` example
   1.  A paragraph
       with two lines.

           indented code

       > A block quote.
.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>
````````````````````````````````


Four spaces indent gives a code block:

```````````````````````````````` example
    1.  A paragraph
        with two lines.

            indented code

        > A block quote.
.
<pre><code>1.  A paragraph
    with two lines.

        indented code

    &gt; A block quote.
</code></pre>
````````````````````````````````



5.  **Laziness.** If a string of lines *Ls* constitutes a [list
    item](#list-items) with contents *Bs*, then the result of deleting
    some or all of the indentation from one or more lines in which the
    next [non-whitespace character] after the indentation is
    [paragraph continuation text] is a
    list item with the same contents and attributes. The unindented
    lines are called
    [lazy continuation line](@)s.

Here is an example with [lazy continuation lines]:

```````````````````````````````` example
  1.  A paragraph
with two lines.

          indented code

      > A block quote.
.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>
````````````````````````````````


Indentation can be partially deleted:

```````````````````````````````` example
  1.  A paragraph
    with two lines.
.
<ol>
<li>A paragraph
with two lines.</li>
</ol>
````````````````````````````````


These examples show how laziness can work in nested structures:

```````````````````````````````` example
> 1. > Blockquote
continued here.
.
<blockquote>
<ol>
<li>
<blockquote>
<p>Blockquote
continued here.</p>
</blockquote>
</li>
</ol>
</blockquote>
````````````````````````````````


```````````````````````````````` example
> 1. > Blockquote
> continued here.
.
<blockquote>
<ol>
<li>
<blockquote>
<p>Blockquote
continued here.</p>
</blockquote>
</li>
</ol>
</blockquote>
````````````````````````````````



6.  **That's all.** Nothing that is not counted as a list item by rules
    #1--5 counts as a [list item](#list-items).

The rules for sublists follow from the general rules
[above][List items]. A sublist must be indented the same number
of spaces a paragraph would need to be in order to be included
in the list item.

So, in this case we need two spaces indent:

```````````````````````````````` example
- foo
  - bar
    - baz
      - boo
.
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>baz
<ul>
<li>boo</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
````````````````````````````````


One is not enough:

```````````````````````````````` example
- foo
 - bar
  - baz
   - boo
.
<ul>
<li>foo</li>
<li>bar</li>
<li>baz</li>
<li>boo</li>
</ul>
````````````````````````````````


Here we need four, because the list marker is wider:

```````````````````````````````` example
10) foo
    - bar
.
<ol start="10">
<li>foo
<ul>
<li>bar</li>
</ul>
</li>
</ol>
````````````````````````````````


Three is not enough:

```````````````````````````````` example
10) foo
   - bar
.
<ol start="10">
<li>foo</li>
</ol>
<ul>
<li>bar</li>
</ul>
````````````````````````````````


A list may be the first block in a list item:

```````````````````````````````` example
- - foo
.
<ul>
<li>
<ul>
<li>foo</li>
</ul>
</li>
</ul>
````````````````````````````````


```````````````````````````````` example
1. - 2. foo
.
<ol>
<li>
<ul>
<li>
<ol start="2">
<li>foo</li>
</ol>
</li>
</ul>
</li>
</ol>
````````````````````````````````


A list item can contain a heading:

```````````````````````````````` example
- # Foo
- Bar
  ---
  baz
.
<ul>
<li>
<h1>Foo</h1>
</li>
<li>
<h2>Bar</h2>
baz</li>
</ul>
````````````````````````````````


### Motivation

John Gruber's Markdown spec says the following about list items:

1. "List markers typically start at the left margin, but may be indented
   by up to three spaces. List markers must be followed by one or more
   spaces or a tab."

2. "To make lists look nice, you can wrap items with hanging indents....
   But if you don't want to, you don't have to."

3. "List items may consist of multiple paragraphs. Each subsequent
   paragraph in a list item must be indented by either 4 spaces or one
   tab."

4. "It looks nice if you indent every line of the subsequent paragraphs,
   but here again, Markdown will allow you to be lazy."

5. "To put a blockquote within a list item, the blockquote's `>`
   delimiters need to be indented."

6. "To put a code block within a list item, the code block needs to be
   indented twice --- 8 spaces or two tabs."
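
Read literally, the quoted statements amount to a fixed-indentation test.
The sketch below is one possible reading, not Gruber's code and not
`Markdown.pl`'s behavior. It assumes that indentation is measured from the
left margin and that a tab advances to the next multiple of four columns,
neither of which the quoted text states. The names `leading_indent` and
`classify_under_item` are hypothetical.

```python
def leading_indent(line: str, tab_width: int = 4) -> int:
    """Count leading columns of indentation, expanding tabs (assumed width 4)."""
    expanded = line.expandtabs(tab_width)
    return len(expanded) - len(expanded.lstrip(" "))


def classify_under_item(line: str) -> str:
    """Classify a line that follows a list item, per a literal four-space reading.

    Assumption: indentation is measured from the left margin.
    """
    if not line.strip():
        return "blank"
    indent = leading_indent(line)
    if indent >= 8:
        return "code"      # "indented twice": 8 spaces or two tabs
    if indent >= 4:
        return "block"     # continuation paragraph, block quote, and so on
    return "outside"       # too shallow to belong to the list item
```

On this reading a line indented by only two spaces is classified as
`"outside"` the item.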

These rules specify that a paragraph under a list item must be indented
four spaces (presumably, from the left margin, rather than the start of
the list marker, but this is not said), and that code under a list item
must be indented eight spaces instead of the usual four. They also say
that a block quote must be indented, but not by how much; however, the
example given has four spaces indentation. Although nothing is said
about other kinds of block-level content, it is certainly reasonable to
infer that *all* block elements under a list item, including other
lists, must be indented four spaces. This principle has been called the
*four-space rule*.

The four-space rule is clear and principled, and if the reference
implementation `Markdown.pl` had followed it, it probably would have
become the standard. However, `Markdown.pl` allowed paragraphs and
sublists to start with only two spaces indentation, at least on the
outer level. Worse, its behavior was inconsistent: a sublist of an
outer-level list needed two spaces indentation, but a sublist of this
sublist needed three spaces. It is not surprising, then, that different
implementations of Markdown have developed very different rules for
determining what comes under a list item. (Pandoc and python-Markdown,
for example, stuck with Gruber's syntax description and the four-space
rule, while discount, redcarpet, marked, PHP Markdown, and others
followed `Markdown.pl`'s behavior more closely.)

Unfortunately, given the divergences between implementations, there
is no way to give a spec for list items that will be guaranteed not
to break any existing documents.
However, the spec given here should
correctly handle lists formatted with either the four-space rule or
the more forgiving `Markdown.pl` behavior, provided they are laid out
in a way that is natural for a human to read.

The strategy here is to let the width and indentation of the list marker
determine the indentation necessary for blocks to fall under the list
item, rather than having a fixed and arbitrary number. The writer can
think of the body of the list item as a unit which gets indented to the
right enough to fit the list marker (and any indentation on the list
marker). (The laziness rule, #5, then allows continuation lines to be
unindented if needed.)

This rule is superior, we claim, to any rule requiring a fixed level of
indentation from the margin. The four-space rule is clear but
unnatural. It is quite unintuitive that

``` markdown
- foo

  bar

  - baz
```

should be parsed as two lists with an intervening paragraph,

``` html
<ul>
<li>foo</li>
</ul>
<p>bar</p>
<ul>
<li>baz</li>
</ul>
```

as the four-space rule demands, rather than a single list,

``` html
<ul>
<li>
<p>foo</p>
<p>bar</p>
<ul>
<li>baz</li>
</ul>
</li>
</ul>
```

The choice of four spaces is arbitrary. It can be learned, but it is
not likely to be guessed, and it trips up beginners regularly.

Would it help to adopt a two-space rule? The problem is that such
a rule, together with the rule allowing 1--3 spaces indentation of the
initial list marker, allows text that is indented *less than* the
original list marker to be included in the list item.
For example,
`Markdown.pl` parses

``` markdown
   - one

  two
```

as a single list item, with `two` a continuation paragraph:

``` html
<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>
```

and similarly

``` markdown
>   - one
>
>  two
```

as

``` html
<blockquote>
<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>
</blockquote>
```

This is extremely unintuitive.

Rather than requiring a fixed indent from the margin, we could require
a fixed indent (say, two spaces, or even one space) from the list marker (which
may itself be indented). This proposal would remove the last anomaly
discussed. Unlike the spec presented above, it would count the following
as a list item with a subparagraph, even though the paragraph `bar`
is not indented as far as the first paragraph `foo`:

``` markdown
 10. foo

   bar
```

Arguably this text does read like a list item with `bar` as a subparagraph,
which may count in favor of the proposal. However, on this proposal indented
code would have to be indented six spaces after the list marker. And this
would break a lot of existing Markdown, which has the pattern:

``` markdown
1.  foo

        indented code
```

where the code is indented eight spaces. The spec above, by contrast, will
parse this text as expected, since the code block's indentation is measured
from the beginning of `foo`.

The one case that needs special treatment is a list item that *starts*
with indented code. How much indentation is required in that case, since
we don't have a "first paragraph" to measure from?
Rule #2 simply stipulates
that in such cases, we require one space indentation from the list marker
(and then the normal four spaces for the indented code). This will match the
four-space rule in cases where the list marker plus its initial indentation
takes four spaces (a common case), but diverge in other cases.

## Lists

A [list](@) is a sequence of one or more
list items [of the same type]. The list items
may be separated by any number of blank lines.

Two list items are [of the same type](@)
if they begin with a [list marker] of the same type.
Two list markers are of the
same type if (a) they are bullet list markers using the same character
(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
delimiter (either `.` or `)`).

A list is an [ordered list](@)
if its constituent list items begin with
[ordered list markers], and a
[bullet list](@) if its constituent list
items begin with [bullet list markers].

The [start number](@)
of an [ordered list] is determined by the list number of
its initial list item. The numbers of subsequent list items are
disregarded.

A list is [loose](@) if any of its constituent
list items are separated by blank lines, or if any of its constituent
list items directly contain two block-level elements with a blank line
between them. Otherwise a list is [tight](@).
(The difference in HTML output is that paragraphs in a loose list are
wrapped in `<p>` tags, while paragraphs in a tight list are not.)

Changing the bullet or ordered list delimiter starts a new list:

```````````````````````````````` example
- foo
- bar
+ baz
.
<ul>
<li>foo</li>
<li>bar</li>
</ul>
<ul>
<li>baz</li>
</ul>
````````````````````````````````


```````````````````````````````` example
1. foo
2. bar
3) baz
.
<ol>
<li>foo</li>
<li>bar</li>
</ol>
<ol start="3">
<li>baz</li>
</ol>
````````````````````````````````


In CommonMark, a list can interrupt a paragraph. That is,
no blank line is needed to separate a paragraph from a following
list:

```````````````````````````````` example
Foo
- bar
- baz
.
<p>Foo</p>
<ul>
<li>bar</li>
<li>baz</li>
</ul>
````````````````````````````````

`Markdown.pl` does not allow this, for fear of triggering a list
via a numeral in a hard-wrapped line:

``` markdown
The number of windows in my house is
14.  The number of doors is 6.
```

Oddly, though, `Markdown.pl` *does* allow a blockquote to
interrupt a paragraph, even though the same considerations might
apply.

In CommonMark, we do allow lists to interrupt paragraphs, for
two reasons. First, it is natural and not uncommon for people
to start lists without blank lines:

``` markdown
I need to buy
- new shoes
- a coat
- a plane ticket
```

Second, we are attracted to a

> [principle of uniformity](@):
> if a chunk of text has a certain
> meaning, it will continue to have the same meaning when put into a
> container block (such as a list item or blockquote).

(Indeed, the spec for [list items] and [block quotes] presupposes
this principle.)
This principle implies that if

``` markdown
  * I need to buy
    - new shoes
    - a coat
    - a plane ticket
```

is a list item containing a paragraph followed by a nested sublist,
as all Markdown implementations agree it is (though the paragraph
may be rendered without `<p>` tags, since the list is "tight"),
then

``` markdown
I need to buy
- new shoes
- a coat
- a plane ticket
```

by itself should be a paragraph followed by a nested sublist.

Since it is well-established Markdown practice to allow lists to
interrupt paragraphs inside list items, the [principle of
uniformity] requires us to allow this outside list items as
well. ([reStructuredText](http://docutils.sourceforge.net/rst.html)
takes a different approach, requiring blank lines before lists
even inside other list items.)

In order to solve the problem of unwanted lists in paragraphs with
hard-wrapped numerals, we allow only ordered lists starting with `1` to
interrupt paragraphs. Thus,

```````````````````````````````` example
The number of windows in my house is
14.  The number of doors is 6.
.
<p>The number of windows in my house is
14.  The number of doors is 6.</p>
````````````````````````````````

We may still get an unintended result in cases like

```````````````````````````````` example
The number of windows in my house is
1.  The number of doors is 6.
.
<p>The number of windows in my house is</p>
<ol>
<li>The number of doors is 6.</li>
</ol>
````````````````````````````````

but this rule should prevent most spurious list captures.

There can be any number of blank lines between items:

```````````````````````````````` example
- foo

- bar


- baz
.
<ul>
<li>
<p>foo</p>
</li>
<li>
<p>bar</p>
</li>
<li>
<p>baz</p>
</li>
</ul>
````````````````````````````````

```````````````````````````````` example
- foo
  - bar
    - baz


      bim
.
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>
<p>baz</p>
<p>bim</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
````````````````````````````````


To separate consecutive lists of the same type, or to separate a
list from an indented code block that would otherwise be parsed
as a subparagraph of the final list item, you can insert a blank HTML
comment:

```````````````````````````````` example
- foo
- bar

<!-- -->

- baz
- bim
.
<ul>
<li>foo</li>
<li>bar</li>
</ul>
<!-- -->
<ul>
<li>baz</li>
<li>bim</li>
</ul>
````````````````````````````````


```````````````````````````````` example
-   foo

    notcode

-   foo

<!-- -->

    code
.
<ul>
<li>
<p>foo</p>
<p>notcode</p>
</li>
<li>
<p>foo</p>
</li>
</ul>
<!-- -->
<pre><code>code
</code></pre>
````````````````````````````````


List items need not be indented to the same level. The following
list items will be treated as items at the same list level,
since none is indented enough to belong to the previous list
item:

```````````````````````````````` example
- a
 - b
  - c
   - d
  - e
 - f
- g
.
<ul>
<li>a</li>
<li>b</li>
<li>c</li>
<li>d</li>
<li>e</li>
<li>f</li>
<li>g</li>
</ul>
````````````````````````````````


```````````````````````````````` example
1. a

  2. b

   3. c
.
<ol>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>c</p>
</li>
</ol>
````````````````````````````````

Note, however, that list items may not be indented more than
three spaces. Here `- e` is treated as a paragraph continuation
line, because it is indented more than three spaces:

```````````````````````````````` example
- a
 - b
  - c
   - d
    - e
.
<ul>
<li>a</li>
<li>b</li>
<li>c</li>
<li>d
- e</li>
</ul>
````````````````````````````````

And here, `3. c` is treated as an indented code block,
because it is indented four spaces and preceded by a
blank line.

```````````````````````````````` example
1. a

  2. b

    3. c
.
<ol>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
</ol>
<pre><code>3. c
</code></pre>
````````````````````````````````


This is a loose list, because there is a blank line between
two of the list items:

```````````````````````````````` example
- a
- b

- c
.
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>c</p>
</li>
</ul>
````````````````````````````````


So is this, with an empty second item:

```````````````````````````````` example
* a
*

* c
.
<ul>
<li>
<p>a</p>
</li>
<li></li>
<li>
<p>c</p>
</li>
</ul>
````````````````````````````````


These are loose lists, even though there are no blank lines between the
items, because one of the items directly contains two block-level elements
with a blank line between them:

```````````````````````````````` example
- a
- b

  c
- d
.
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
<p>c</p>
</li>
<li>
<p>d</p>
</li>
</ul>
````````````````````````````````


```````````````````````````````` example
- a
- b

  [ref]: /url
- d
.
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>d</p>
</li>
</ul>
````````````````````````````````


This is a tight list, because the blank lines are in a code block:

```````````````````````````````` example
- a
- ```
  b


  ```
- c
.
<ul>
<li>a</li>
<li>
<pre><code>b


</code></pre>
</li>
<li>c</li>
</ul>
````````````````````````````````


This is a tight list, because the blank line is between two
paragraphs of a sublist. So the sublist is loose while
the outer list is tight:

```````````````````````````````` example
- a
  - b

    c
- d
.
<ul>
<li>a
<ul>
<li>
<p>b</p>
<p>c</p>
</li>
</ul>
</li>
<li>d</li>
</ul>
````````````````````````````````


This is a tight list, because the blank line is inside the
block quote:

```````````````````````````````` example
* a
  > b
  >
* c
.
<ul>
<li>a
<blockquote>
<p>b</p>
</blockquote>
</li>
<li>c</li>
</ul>
````````````````````````````````


This list is tight, because the consecutive block elements
are not separated by blank lines:

```````````````````````````````` example
- a
  > b
  ```
  c
  ```
- d
.
<ul>
<li>a
<blockquote>
<p>b</p>
</blockquote>
<pre><code>c
</code></pre>
</li>
<li>d</li>
</ul>
````````````````````````````````


A single-paragraph list is tight:

```````````````````````````````` example
- a
.
<ul>
<li>a</li>
</ul>
````````````````````````````````


```````````````````````````````` example
- a
  - b
.
<ul>
<li>a
<ul>
<li>b</li>
</ul>
</li>
</ul>
````````````````````````````````


This list is loose, because of the blank line between the
two block elements in the list item:

```````````````````````````````` example
1. ```
   foo
   ```

   bar
.
<ol>
<li>
<pre><code>foo
</code></pre>
<p>bar</p>
</li>
</ol>
````````````````````````````````


Here the outer list is loose, the inner list tight:

```````````````````````````````` example
* foo
  * bar

  baz
.
<ul>
<li>
<p>foo</p>
<ul>
<li>bar</li>
</ul>
<p>baz</p>
</li>
</ul>
````````````````````````````````


```````````````````````````````` example
- a
  - b
  - c

- d
  - e
  - f
.
<ul>
<li>
<p>a</p>
<ul>
<li>b</li>
<li>c</li>
</ul>
</li>
<li>
<p>d</p>
<ul>
<li>e</li>
<li>f</li>
</ul>
</li>
</ul>
````````````````````````````````


# Inlines

Inlines are parsed sequentially from the beginning of the character
stream to the end (left to right, in left-to-right languages).
Thus, for example, in

```````````````````````````````` example
`hi`lo`
.
<p><code>hi</code>lo`</p>
````````````````````````````````

`hi` is parsed as code, leaving the backtick at the end as a literal
backtick.


## Backslash escapes

Any ASCII punctuation character may be backslash-escaped:

```````````````````````````````` example
\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
.
<p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p>
````````````````````````````````


Backslashes before other characters are treated as literal
backslashes:

```````````````````````````````` example
\→\A\a\ \3\φ\«
.
<p>\→\A\a\ \3\φ\«</p>
````````````````````````````````


Escaped characters are treated as regular characters and do
not have their usual Markdown meanings:

```````````````````````````````` example
\*not emphasized*
\<br/> not a tag
\[not a link](/foo)
\`not code`
1\. not a list
\* not a list
\# not a heading
\[foo]: /url "not a reference"
\&ouml; not a character entity
.
<p>*not emphasized*
&lt;br/&gt; not a tag
[not a link](/foo)
`not code`
1. not a list
* not a list
# not a heading
[foo]: /url &quot;not a reference&quot;
&amp;ouml; not a character entity</p>
````````````````````````````````


If a backslash is itself escaped, the following character is not:

```````````````````````````````` example
\\*emphasis*
.
<p>\<em>emphasis</em></p>
````````````````````````````````


A backslash at the end of the line is a [hard line break]:

```````````````````````````````` example
foo\
bar
.
<p>foo<br />
bar</p>
````````````````````````````````


Backslash escapes do not work in code blocks, code spans, autolinks, or
raw HTML:

```````````````````````````````` example
`` \[\` ``
.
<p><code>\[\`</code></p>
````````````````````````````````


```````````````````````````````` example
    \[\]
.
<pre><code>\[\]
</code></pre>
````````````````````````````````


```````````````````````````````` example
~~~
\[\]
~~~
.
<pre><code>\[\]
</code></pre>
````````````````````````````````


```````````````````````````````` example
<http://example.com?find=\*>
.
<p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p>
````````````````````````````````


```````````````````````````````` example
<a href="/bar\/)">
.
<a href="/bar\/)">
````````````````````````````````


But they work in all other contexts, including URLs and link titles,
link references, and [info strings] in [fenced code blocks]:

```````````````````````````````` example
[foo](/bar\* "ti\*tle")
.
<p><a href="/bar*" title="ti*tle">foo</a></p>
````````````````````````````````


```````````````````````````````` example
[foo]

[foo]: /bar\* "ti\*tle"
.
<p><a href="/bar*" title="ti*tle">foo</a></p>
````````````````````````````````


```````````````````````````````` example
``` foo\+bar
foo
```
.
<pre><code class="language-foo+bar">foo
</code></pre>
````````````````````````````````



## Entity and numeric character references

Valid HTML entity references and numeric character references
can be used in place of the corresponding Unicode character,
with the following exceptions:

- Entity and character references are not recognized in code
  blocks and code spans.

- Entity and character references cannot stand in place of
  special characters that define structural elements in
  CommonMark.
For example, although `&#42;` can be used 5652 in place of a literal `*` character, `&#42;` cannot replace 5653 `*` in emphasis delimiters, bullet list markers, or thematic 5654 breaks. 5655 5656 Conforming CommonMark parsers need not store information about 5657 whether a particular character was represented in the source 5658 using a Unicode character or an entity reference. 5659 5660 [Entity references](@) consist of `&` + any of the valid 5661 HTML5 entity names + `;`. The 5662 document <https://html.spec.whatwg.org/multipage/entities.json> 5663 is used as an authoritative source for the valid entity 5664 references and their corresponding code points. 5665 5666 ```````````````````````````````` example 5667 &nbsp; &amp; &copy; &AElig; &Dcaron; 5668 &frac34; &HilbertSpace; &DifferentialD; 5669 &ClockwiseContourIntegral; &ngE; 5670 . 5671 <p>  &amp; © Æ Ď 5672 ¾ ℋ ⅆ 5673 ∲ ≧̸</p> 5674 ```````````````````````````````` 5675 5676 5677 [Decimal numeric character 5678 references](@) 5679 consist of `&#` + a string of 1--7 arabic digits + `;`. A 5680 numeric character reference is parsed as the corresponding 5681 Unicode character. Invalid Unicode code points will be replaced by 5682 the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons, 5683 the code point `U+0000` will also be replaced by `U+FFFD`. 5684 5685 ```````````````````````````````` example 5686 &#35; &#1234; &#992; &#0; 5687 . 5688 <p># Ӓ Ϡ �</p> 5689 ```````````````````````````````` 5690 5691 5692 [Hexadecimal numeric character 5693 references](@) consist of `&#` + 5694 either `X` or `x` + a string of 1--6 hexadecimal digits + `;`. 5695 They too are parsed as the corresponding Unicode character (this 5696 time specified with a hexadecimal numeral instead of decimal). 5697 5698 ```````````````````````````````` example 5699 &#X22; &#XD06; &#xcab; 5700 . 5701 <p>&quot; ആ ಫ</p> 5702 ```````````````````````````````` 5703 5704 5705 Here are some nonentities: 5706 5707 ```````````````````````````````` example 5708 &nbsp &x; &#; &#x; 5709 &#87654321; 5710 &#abcdef0; 5711 &ThisIsNotDefined; &hi?; 5712 .
5713 <p>&amp;nbsp &amp;x; &amp;#; &amp;#x; 5714 &amp;#87654321; 5715 &amp;#abcdef0; 5716 &amp;ThisIsNotDefined; &amp;hi?;</p> 5717 ```````````````````````````````` 5718 5719 5720 Although HTML5 does accept some entity references 5721 without a trailing semicolon (such as `&copy`), these are not 5722 recognized here, because accepting them would make the grammar too ambiguous: 5723 5724 ```````````````````````````````` example 5725 &copy 5726 . 5727 <p>&amp;copy</p> 5728 ```````````````````````````````` 5729 5730 5731 Strings that are not on the list of HTML5 named entities are not 5732 recognized as entity references either: 5733 5734 ```````````````````````````````` example 5735 &MadeUpEntity; 5736 . 5737 <p>&amp;MadeUpEntity;</p> 5738 ```````````````````````````````` 5739 5740 5741 Entity and numeric character references are recognized in any 5742 context besides code spans or code blocks, including 5743 URLs, [link titles], and [fenced code block][] [info strings]: 5744 5745 ```````````````````````````````` example 5746 <a href="&ouml;&ouml;.html"> 5747 . 5748 <a href="&ouml;&ouml;.html"> 5749 ```````````````````````````````` 5750 5751 5752 ```````````````````````````````` example 5753 [foo](/f&ouml;&ouml; "f&ouml;&ouml;") 5754 . 5755 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> 5756 ```````````````````````````````` 5757 5758 5759 ```````````````````````````````` example 5760 [foo] 5761 5762 [foo]: /f&ouml;&ouml; "f&ouml;&ouml;" 5763 . 5764 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> 5765 ```````````````````````````````` 5766 5767 5768 ```````````````````````````````` example 5769 ``` f&ouml;&ouml; 5770 foo 5771 ``` 5772 . 5773 <pre><code class="language-föö">foo 5774 </code></pre> 5775 ```````````````````````````````` 5776 5777 5778 Entity and numeric character references are treated as literal 5779 text in code spans and code blocks: 5780 5781 ```````````````````````````````` example 5782 `f&ouml;&ouml;` 5783 . 5784 <p><code>f&amp;ouml;&amp;ouml;</code></p> 5785 ```````````````````````````````` 5786 5787 5788 ```````````````````````````````` example 5789 f&ouml;f&ouml; 5790 .
5791 <pre><code>f&amp;ouml;f&amp;ouml; 5792 </code></pre> 5793 ```````````````````````````````` 5794 5795 5796 Entity and numeric character references cannot be used 5797 in place of symbols indicating structure in CommonMark 5798 documents. 5799 5800 ```````````````````````````````` example 5801 &#42;foo&#42; 5802 *foo* 5803 . 5804 <p>*foo* 5805 <em>foo</em></p> 5806 ```````````````````````````````` 5807 5808 ```````````````````````````````` example 5809 &#42; foo 5810 5811 * foo 5812 . 5813 <p>* foo</p> 5814 <ul> 5815 <li>foo</li> 5816 </ul> 5817 ```````````````````````````````` 5818 5819 ```````````````````````````````` example 5820 foo&#10;&#10;bar 5821 . 5822 <p>foo 5823 5824 bar</p> 5825 ```````````````````````````````` 5826 5827 ```````````````````````````````` example 5828 &#9;foo 5829 . 5830 <p>→foo</p> 5831 ```````````````````````````````` 5832 5833 5834 ```````````````````````````````` example 5835 [a](url &quot;tit&quot;) 5836 . 5837 <p>[a](url &quot;tit&quot;)</p> 5838 ```````````````````````````````` 5839 5840 5841 ## Code spans 5842 5843 A [backtick string](@) 5844 is a string of one or more backtick characters (`` ` ``) that is neither 5845 preceded nor followed by a backtick. 5846 5847 A [code span](@) begins with a backtick string and ends with 5848 a backtick string of equal length. The contents of the code span are 5849 the characters between the two backtick strings, normalized in the 5850 following ways: 5851 5852 - First, [line endings] are converted to [spaces]. 5853 - If the resulting string both begins *and* ends with a [space] 5854 character, but does not consist entirely of [space] 5855 characters, a single [space] character is removed from the 5856 front and back. This allows you to include code that begins 5857 or ends with backtick characters, which must be separated by 5858 whitespace from the opening or closing backtick strings. 5859 5860 This is a simple code span: 5861 5862 ```````````````````````````````` example 5863 `foo` 5864 .
5865 <p><code>foo</code></p> 5866 ```````````````````````````````` 5867 5868 5869 Here two backticks are used, because the code contains a backtick. 5870 This example also illustrates stripping of a single leading and 5871 trailing space: 5872 5873 ```````````````````````````````` example 5874 `` foo ` bar `` 5875 . 5876 <p><code>foo ` bar</code></p> 5877 ```````````````````````````````` 5878 5879 5880 This example shows the motivation for stripping leading and trailing 5881 spaces: 5882 5883 ```````````````````````````````` example 5884 ` `` ` 5885 . 5886 <p><code>``</code></p> 5887 ```````````````````````````````` 5888 5889 Note that only *one* space is stripped: 5890 5891 ```````````````````````````````` example 5892 ` `` ` 5893 . 5894 <p><code> `` </code></p> 5895 ```````````````````````````````` 5896 5897 The stripping only happens if the space is on both 5898 sides of the string: 5899 5900 ```````````````````````````````` example 5901 ` a` 5902 . 5903 <p><code> a</code></p> 5904 ```````````````````````````````` 5905 5906 Only [spaces], and not [unicode whitespace] in general, are 5907 stripped in this way: 5908 5909 ```````````````````````````````` example 5910 ` b ` 5911 . 5912 <p><code> b </code></p> 5913 ```````````````````````````````` 5914 5915 No stripping occurs if the code span contains only spaces: 5916 5917 ```````````````````````````````` example 5918 ` ` 5919 ` ` 5920 . 5921 <p><code> </code> 5922 <code> </code></p> 5923 ```````````````````````````````` 5924 5925 5926 [Line endings] are treated like spaces: 5927 5928 ```````````````````````````````` example 5929 `` 5930 foo 5931 bar 5932 baz 5933 `` 5934 . 5935 <p><code>foo bar baz</code></p> 5936 ```````````````````````````````` 5937 5938 ```````````````````````````````` example 5939 `` 5940 foo 5941 `` 5942 . 
5943 <p><code>foo </code></p> 5944 ```````````````````````````````` 5945 5946 5947 Interior spaces are not collapsed: 5948 5949 ```````````````````````````````` example 5950 `foo bar 5951 baz` 5952 . 5953 <p><code>foo bar baz</code></p> 5954 ```````````````````````````````` 5955 5956 Note that browsers will typically collapse consecutive spaces 5957 when rendering `<code>` elements, so it is recommended that 5958 the following CSS be used: 5959 5960 code{white-space: pre-wrap;} 5961 5962 5963 Note that backslash escapes do not work in code spans. All backslashes 5964 are treated literally: 5965 5966 ```````````````````````````````` example 5967 `foo\`bar` 5968 . 5969 <p><code>foo\</code>bar`</p> 5970 ```````````````````````````````` 5971 5972 5973 Backslash escapes are never needed, because one can always choose a 5974 string of *n* backtick characters as delimiters, where the code does 5975 not contain any strings of exactly *n* backtick characters. 5976 5977 ```````````````````````````````` example 5978 ``foo`bar`` 5979 . 5980 <p><code>foo`bar</code></p> 5981 ```````````````````````````````` 5982 5983 ```````````````````````````````` example 5984 ` foo `` bar ` 5985 . 5986 <p><code>foo `` bar</code></p> 5987 ```````````````````````````````` 5988 5989 5990 Code span backticks have higher precedence than any other inline 5991 constructs except HTML tags and autolinks. Thus, for example, this is 5992 not parsed as emphasized text, since the second `*` is part of a code 5993 span: 5994 5995 ```````````````````````````````` example 5996 *foo`*` 5997 . 5998 <p>*foo<code>*</code></p> 5999 ```````````````````````````````` 6000 6001 6002 And this is not parsed as a link: 6003 6004 ```````````````````````````````` example 6005 [not a `link](/foo`) 6006 . 6007 <p>[not a <code>link](/foo</code>)</p> 6008 ```````````````````````````````` 6009 6010 6011 Code spans, HTML tags, and autolinks have the same precedence. 
6012 Thus, this is code: 6013 6014 ```````````````````````````````` example 6015 `<a href="`">` 6016 . 6017 <p><code>&lt;a href=&quot;</code>&quot;&gt;`</p> 6018 ```````````````````````````````` 6019 6020 6021 But this is an HTML tag: 6022 6023 ```````````````````````````````` example 6024 <a href="`">` 6025 . 6026 <p><a href="`">`</p> 6027 ```````````````````````````````` 6028 6029 6030 And this is code: 6031 6032 ```````````````````````````````` example 6033 `<http://foo.bar.`baz>` 6034 . 6035 <p><code>&lt;http://foo.bar.</code>baz&gt;`</p> 6036 ```````````````````````````````` 6037 6038 6039 But this is an autolink: 6040 6041 ```````````````````````````````` example 6042 <http://foo.bar.`baz>` 6043 . 6044 <p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p> 6045 ```````````````````````````````` 6046 6047 6048 When a backtick string is not closed by a matching backtick string, 6049 we just have literal backticks: 6050 6051 ```````````````````````````````` example 6052 ```foo`` 6053 . 6054 <p>```foo``</p> 6055 ```````````````````````````````` 6056 6057 6058 ```````````````````````````````` example 6059 `foo 6060 . 6061 <p>`foo</p> 6062 ```````````````````````````````` 6063 6064 The following case also illustrates the need for opening and 6065 closing backtick strings to be equal in length: 6066 6067 ```````````````````````````````` example 6068 `foo``bar`` 6069 . 6070 <p>`foo<code>bar</code></p> 6071 ```````````````````````````````` 6072 6073 6074 ## Emphasis and strong emphasis 6075 6076 John Gruber's original [Markdown syntax 6077 description](http://daringfireball.net/projects/markdown/syntax#em) says: 6078 6079 > Markdown treats asterisks (`*`) and underscores (`_`) as indicators of 6080 > emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML 6081 > `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>` 6082 > tag.
6083 6084 This is enough for most users, but these rules leave much undecided, 6085 especially when it comes to nested emphasis. The original 6086 `Markdown.pl` test suite makes it clear that triple `***` and 6087 `___` delimiters can be used for strong emphasis, and most 6088 implementations have also allowed the following patterns: 6089 6090 ``` markdown 6091 ***strong emph*** 6092 ***strong** in emph* 6093 ***emph* in strong** 6094 **in strong *emph*** 6095 *in emph **strong*** 6096 ``` 6097 6098 The following patterns are less widely supported, but the intent 6099 is clear and they are useful (especially in contexts like bibliography 6100 entries): 6101 6102 ``` markdown 6103 *emph *with emph* in it* 6104 **strong **with strong** in it** 6105 ``` 6106 6107 Many implementations have also restricted intraword emphasis to 6108 the `*` forms, to avoid unwanted emphasis in words containing 6109 internal underscores. (It is best practice to put these in code 6110 spans, but users often do not.) 6111 6112 ``` markdown 6113 internal emphasis: foo*bar*baz 6114 no emphasis: foo_bar_baz 6115 ``` 6116 6117 The rules given below capture all of these patterns, while allowing 6118 for efficient parsing strategies that do not backtrack. 6119 6120 First, some definitions. A [delimiter run](@) is either 6121 a sequence of one or more `*` characters that is not preceded or 6122 followed by a non-backslash-escaped `*` character, or a sequence 6123 of one or more `_` characters that is not preceded or followed by 6124 a non-backslash-escaped `_` character. 6125 6126 A [left-flanking delimiter run](@) is 6127 a [delimiter run] that is (1) not followed by [Unicode whitespace], 6128 and either (2a) not followed by a [punctuation character], or 6129 (2b) followed by a [punctuation character] and 6130 preceded by [Unicode whitespace] or a [punctuation character]. 6131 For purposes of this definition, the beginning and the end of 6132 the line count as Unicode whitespace. 
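
As a rough, non-normative sketch, the left-flanking test just defined might be written in Python as follows. The function names (`is_unicode_whitespace`, `is_punctuation`, `is_left_flanking`) are hypothetical and not part of this spec, and the two character-class helpers only approximate the definitions of [Unicode whitespace] and [punctuation character] given elsewhere in this document.

```python
import string
import unicodedata


def is_unicode_whitespace(ch: str) -> bool:
    # Assumption: Unicode whitespace is any code point in general
    # category Zs, plus tab, line feed, form feed, and carriage return.
    return ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch: str) -> bool:
    # Assumption: a punctuation character is an ASCII punctuation
    # character or anything in a Unicode "P*" general category.
    return ch in string.punctuation or unicodedata.category(ch).startswith("P")


def is_left_flanking(line: str, start: int, end: int) -> bool:
    """Decide whether the delimiter run line[start:end] is left-flanking."""
    # The beginning and the end of the line count as Unicode whitespace.
    before = line[start - 1] if start > 0 else " "
    after = line[end] if end < len(line) else " "
    if is_unicode_whitespace(after):
        return False  # fails condition (1)
    if not is_punctuation(after):
        return True   # condition (2a)
    # condition (2b): followed by punctuation, so the run must be
    # preceded by Unicode whitespace or punctuation
    return is_unicode_whitespace(before) or is_punctuation(before)


print(is_left_flanking("***abc", 0, 3))    # True
print(is_left_flanking("abc***", 3, 6))    # False
print(is_left_flanking('a*"foo"*', 1, 2))  # False
```

The caller is assumed to have already located the delimiter run and to pass its start and end offsets; finding the runs themselves is outside the scope of this sketch.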
6133 6134 A [right-flanking delimiter run](@) is 6135 a [delimiter run] that is (1) not preceded by [Unicode whitespace], 6136 and either (2a) not preceded by a [punctuation character], or 6137 (2b) preceded by a [punctuation character] and 6138 followed by [Unicode whitespace] or a [punctuation character]. 6139 For purposes of this definition, the beginning and the end of 6140 the line count as Unicode whitespace. 6141 6142 Here are some examples of delimiter runs. 6143 6144 - left-flanking but not right-flanking: 6145 6146 ``` 6147 ***abc 6148 _abc 6149 **"abc" 6150 _"abc" 6151 ``` 6152 6153 - right-flanking but not left-flanking: 6154 6155 ``` 6156 abc*** 6157 abc_ 6158 "abc"** 6159 "abc"_ 6160 ``` 6161 6162 - Both left and right-flanking: 6163 6164 ``` 6165 abc***def 6166 "abc"_"def" 6167 ``` 6168 6169 - Neither left nor right-flanking: 6170 6171 ``` 6172 abc *** def 6173 a _ b 6174 ``` 6175 6176 (The idea of distinguishing left-flanking and right-flanking 6177 delimiter runs based on the character before and the character 6178 after comes from Roopesh Chander's 6179 [vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags). 6180 vfmd uses the terminology "emphasis indicator string" instead of "delimiter 6181 run," and its rules for distinguishing left- and right-flanking runs 6182 are a bit more complex than the ones given here.) 6183 6184 The following rules define emphasis and strong emphasis: 6185 6186 1. A single `*` character [can open emphasis](@) 6187 iff (if and only if) it is part of a [left-flanking delimiter run]. 6188 6189 2. A single `_` character [can open emphasis] iff 6190 it is part of a [left-flanking delimiter run] 6191 and either (a) not part of a [right-flanking delimiter run] 6192 or (b) part of a [right-flanking delimiter run] 6193 preceded by punctuation. 6194 6195 3. A single `*` character [can close emphasis](@) 6196 iff it is part of a [right-flanking delimiter run]. 6197 6198 4. 
A single `_` character [can close emphasis] iff 6199 it is part of a [right-flanking delimiter run] 6200 and either (a) not part of a [left-flanking delimiter run] 6201 or (b) part of a [left-flanking delimiter run] 6202 followed by punctuation. 6203 6204 5. A double `**` [can open strong emphasis](@) 6205 iff it is part of a [left-flanking delimiter run]. 6206 6207 6. A double `__` [can open strong emphasis] iff 6208 it is part of a [left-flanking delimiter run] 6209 and either (a) not part of a [right-flanking delimiter run] 6210 or (b) part of a [right-flanking delimiter run] 6211 preceded by punctuation. 6212 6213 7. A double `**` [can close strong emphasis](@) 6214 iff it is part of a [right-flanking delimiter run]. 6215 6216 8. A double `__` [can close strong emphasis] iff 6217 it is part of a [right-flanking delimiter run] 6218 and either (a) not part of a [left-flanking delimiter run] 6219 or (b) part of a [left-flanking delimiter run] 6220 followed by punctuation. 6221 6222 9. Emphasis begins with a delimiter that [can open emphasis] and ends 6223 with a delimiter that [can close emphasis], and that uses the same 6224 character (`_` or `*`) as the opening delimiter. The 6225 opening and closing delimiters must belong to separate 6226 [delimiter runs]. If one of the delimiters can both 6227 open and close emphasis, then the sum of the lengths of the 6228 delimiter runs containing the opening and closing delimiters 6229 must not be a multiple of 3 unless both lengths are 6230 multiples of 3. 6231 6232 10. Strong emphasis begins with a delimiter that 6233 [can open strong emphasis] and ends with a delimiter that 6234 [can close strong emphasis], and that uses the same character 6235 (`_` or `*`) as the opening delimiter. The 6236 opening and closing delimiters must belong to separate 6237 [delimiter runs]. 
If one of the delimiters can both open 6238 and close strong emphasis, then the sum of the lengths of 6239 the delimiter runs containing the opening and closing 6240 delimiters must not be a multiple of 3 unless both lengths 6241 are multiples of 3. 6242 6243 11. A literal `*` character cannot occur at the beginning or end of 6244 `*`-delimited emphasis or `**`-delimited strong emphasis, unless it 6245 is backslash-escaped. 6246 6247 12. A literal `_` character cannot occur at the beginning or end of 6248 `_`-delimited emphasis or `__`-delimited strong emphasis, unless it 6249 is backslash-escaped. 6250 6251 Where rules 1--12 above are compatible with multiple parsings, 6252 the following principles resolve ambiguity: 6253 6254 13. The number of nestings should be minimized. Thus, for example, 6255 an interpretation `<strong>...</strong>` is always preferred to 6256 `<em><em>...</em></em>`. 6257 6258 14. An interpretation `<em><strong>...</strong></em>` is always 6259 preferred to `<strong><em>...</em></strong>`. 6260 6261 15. When two potential emphasis or strong emphasis spans overlap, 6262 so that the second begins before the first ends and ends after 6263 the first ends, the first takes precedence. Thus, for example, 6264 `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather 6265 than `*foo <em>bar* baz</em>`. 6266 6267 16. When there are two potential emphasis or strong emphasis spans 6268 with the same closing delimiter, the shorter one (the one that 6269 opens later) takes precedence. Thus, for example, 6270 `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>` 6271 rather than `<strong>foo **bar baz</strong>`. 6272 6273 17. Inline code spans, links, images, and HTML tags group more tightly 6274 than emphasis. So, when there is a choice between an interpretation 6275 that contains one of these elements and one that does not, the 6276 former always wins. 
Thus, for example, `*[foo*](bar)` is 6277 parsed as `*<a href="bar">foo*</a>` rather than as 6278 `<em>[foo</em>](bar)`. 6279 6280 These rules can be illustrated through a series of examples. 6281 6282 Rule 1: 6283 6284 ```````````````````````````````` example 6285 *foo bar* 6286 . 6287 <p><em>foo bar</em></p> 6288 ```````````````````````````````` 6289 6290 6291 This is not emphasis, because the opening `*` is followed by 6292 whitespace, and hence not part of a [left-flanking delimiter run]: 6293 6294 ```````````````````````````````` example 6295 a * foo bar* 6296 . 6297 <p>a * foo bar*</p> 6298 ```````````````````````````````` 6299 6300 6301 This is not emphasis, because the opening `*` is preceded 6302 by an alphanumeric and followed by punctuation, and hence 6303 not part of a [left-flanking delimiter run]: 6304 6305 ```````````````````````````````` example 6306 a*"foo"* 6307 . 6308 <p>a*"foo"*</p> 6309 ```````````````````````````````` 6310 6311 6312 Unicode nonbreaking spaces count as whitespace, too: 6313 6314 ```````````````````````````````` example 6315 * a * 6316 . 6317 <p>* a *</p> 6318 ```````````````````````````````` 6319 6320 6321 Intraword emphasis with `*` is permitted: 6322 6323 ```````````````````````````````` example 6324 foo*bar* 6325 . 6326 <p>foo<em>bar</em></p> 6327 ```````````````````````````````` 6328 6329 6330 ```````````````````````````````` example 6331 5*6*78 6332 . 6333 <p>5<em>6</em>78</p> 6334 ```````````````````````````````` 6335 6336 6337 Rule 2: 6338 6339 ```````````````````````````````` example 6340 _foo bar_ 6341 . 6342 <p><em>foo bar</em></p> 6343 ```````````````````````````````` 6344 6345 6346 This is not emphasis, because the opening `_` is followed by 6347 whitespace: 6348 6349 ```````````````````````````````` example 6350 _ foo bar_ 6351 . 
6352 <p>_ foo bar_</p> 6353 ```````````````````````````````` 6354 6355 6356 This is not emphasis, because the opening `_` is preceded 6357 by an alphanumeric and followed by punctuation: 6358 6359 ```````````````````````````````` example 6360 a_"foo"_ 6361 . 6362 <p>a_"foo"_</p> 6363 ```````````````````````````````` 6364 6365 6366 Emphasis with `_` is not allowed inside words: 6367 6368 ```````````````````````````````` example 6369 foo_bar_ 6370 . 6371 <p>foo_bar_</p> 6372 ```````````````````````````````` 6373 6374 6375 ```````````````````````````````` example 6376 5_6_78 6377 . 6378 <p>5_6_78</p> 6379 ```````````````````````````````` 6380 6381 6382 ```````````````````````````````` example 6383 пристаням_стремятся_ 6384 . 6385 <p>пристаням_стремятся_</p> 6386 ```````````````````````````````` 6387 6388 6389 Here `_` does not generate emphasis, because the first delimiter run 6390 is right-flanking and the second left-flanking: 6391 6392 ```````````````````````````````` example 6393 aa_"bb"_cc 6394 . 6395 <p>aa_"bb"_cc</p> 6396 ```````````````````````````````` 6397 6398 6399 This is emphasis, even though the opening delimiter is 6400 both left- and right-flanking, because it is preceded by 6401 punctuation: 6402 6403 ```````````````````````````````` example 6404 foo-_(bar)_ 6405 . 6406 <p>foo-<em>(bar)</em></p> 6407 ```````````````````````````````` 6408 6409 6410 Rule 3: 6411 6412 This is not emphasis, because the closing delimiter does 6413 not match the opening delimiter: 6414 6415 ```````````````````````````````` example 6416 _foo* 6417 . 6418 <p>_foo*</p> 6419 ```````````````````````````````` 6420 6421 6422 This is not emphasis, because the closing `*` is preceded by 6423 whitespace: 6424 6425 ```````````````````````````````` example 6426 *foo bar * 6427 . 6428 <p>*foo bar *</p> 6429 ```````````````````````````````` 6430 6431 6432 A newline also counts as whitespace: 6433 6434 ```````````````````````````````` example 6435 *foo bar 6436 * 6437 . 
6438 <p>*foo bar 6439 *</p> 6440 ```````````````````````````````` 6441 6442 6443 This is not emphasis, because the second `*` is 6444 preceded by punctuation and followed by an alphanumeric 6445 (hence it is not part of a [right-flanking delimiter run]): 6446 6447 ```````````````````````````````` example 6448 *(*foo) 6449 . 6450 <p>*(*foo)</p> 6451 ```````````````````````````````` 6452 6453 6454 The point of this restriction is more easily appreciated 6455 with this example: 6456 6457 ```````````````````````````````` example 6458 *(*foo*)* 6459 . 6460 <p><em>(<em>foo</em>)</em></p> 6461 ```````````````````````````````` 6462 6463 6464 Intraword emphasis with `*` is allowed: 6465 6466 ```````````````````````````````` example 6467 *foo*bar 6468 . 6469 <p><em>foo</em>bar</p> 6470 ```````````````````````````````` 6471 6472 6473 6474 Rule 4: 6475 6476 This is not emphasis, because the closing `_` is preceded by 6477 whitespace: 6478 6479 ```````````````````````````````` example 6480 _foo bar _ 6481 . 6482 <p>_foo bar _</p> 6483 ```````````````````````````````` 6484 6485 6486 This is not emphasis, because the second `_` is 6487 preceded by punctuation and followed by an alphanumeric: 6488 6489 ```````````````````````````````` example 6490 _(_foo) 6491 . 6492 <p>_(_foo)</p> 6493 ```````````````````````````````` 6494 6495 6496 This is emphasis within emphasis: 6497 6498 ```````````````````````````````` example 6499 _(_foo_)_ 6500 . 6501 <p><em>(<em>foo</em>)</em></p> 6502 ```````````````````````````````` 6503 6504 6505 Intraword emphasis is disallowed for `_`: 6506 6507 ```````````````````````````````` example 6508 _foo_bar 6509 . 6510 <p>_foo_bar</p> 6511 ```````````````````````````````` 6512 6513 6514 ```````````````````````````````` example 6515 _пристаням_стремятся 6516 . 6517 <p>_пристаням_стремятся</p> 6518 ```````````````````````````````` 6519 6520 6521 ```````````````````````````````` example 6522 _foo_bar_baz_ 6523 .
6524 <p><em>foo_bar_baz</em></p> 6525 ```````````````````````````````` 6526 6527 6528 This is emphasis, even though the closing delimiter is 6529 both left- and right-flanking, because it is followed by 6530 punctuation: 6531 6532 ```````````````````````````````` example 6533 _(bar)_. 6534 . 6535 <p><em>(bar)</em>.</p> 6536 ```````````````````````````````` 6537 6538 6539 Rule 5: 6540 6541 ```````````````````````````````` example 6542 **foo bar** 6543 . 6544 <p><strong>foo bar</strong></p> 6545 ```````````````````````````````` 6546 6547 6548 This is not strong emphasis, because the opening delimiter is 6549 followed by whitespace: 6550 6551 ```````````````````````````````` example 6552 ** foo bar** 6553 . 6554 <p>** foo bar**</p> 6555 ```````````````````````````````` 6556 6557 6558 This is not strong emphasis, because the opening `**` is preceded 6559 by an alphanumeric and followed by punctuation, and hence 6560 not part of a [left-flanking delimiter run]: 6561 6562 ```````````````````````````````` example 6563 a**"foo"** 6564 . 6565 <p>a**"foo"**</p> 6566 ```````````````````````````````` 6567 6568 6569 Intraword strong emphasis with `**` is permitted: 6570 6571 ```````````````````````````````` example 6572 foo**bar** 6573 . 6574 <p>foo<strong>bar</strong></p> 6575 ```````````````````````````````` 6576 6577 6578 Rule 6: 6579 6580 ```````````````````````````````` example 6581 __foo bar__ 6582 . 6583 <p><strong>foo bar</strong></p> 6584 ```````````````````````````````` 6585 6586 6587 This is not strong emphasis, because the opening delimiter is 6588 followed by whitespace: 6589 6590 ```````````````````````````````` example 6591 __ foo bar__ 6592 . 6593 <p>__ foo bar__</p> 6594 ```````````````````````````````` 6595 6596 6597 A newline counts as whitespace: 6598 ```````````````````````````````` example 6599 __ 6600 foo bar__ 6601 . 
6602 <p>__ 6603 foo bar__</p> 6604 ```````````````````````````````` 6605 6606 6607 This is not strong emphasis, because the opening `__` is preceded 6608 by an alphanumeric and followed by punctuation: 6609 6610 ```````````````````````````````` example 6611 a__"foo"__ 6612 . 6613 <p>a__"foo"__</p> 6614 ```````````````````````````````` 6615 6616 6617 Intraword strong emphasis is forbidden with `__`: 6618 6619 ```````````````````````````````` example 6620 foo__bar__ 6621 . 6622 <p>foo__bar__</p> 6623 ```````````````````````````````` 6624 6625 6626 ```````````````````````````````` example 6627 5__6__78 6628 . 6629 <p>5__6__78</p> 6630 ```````````````````````````````` 6631 6632 6633 ```````````````````````````````` example 6634 пристаням__стремятся__ 6635 . 6636 <p>пристаням__стремятся__</p> 6637 ```````````````````````````````` 6638 6639 6640 ```````````````````````````````` example 6641 __foo, __bar__, baz__ 6642 . 6643 <p><strong>foo, <strong>bar</strong>, baz</strong></p> 6644 ```````````````````````````````` 6645 6646 6647 This is strong emphasis, even though the opening delimiter is 6648 both left- and right-flanking, because it is preceded by 6649 punctuation: 6650 6651 ```````````````````````````````` example 6652 foo-__(bar)__ 6653 . 6654 <p>foo-<strong>(bar)</strong></p> 6655 ```````````````````````````````` 6656 6657 6658 6659 Rule 7: 6660 6661 This is not strong emphasis, because the closing delimiter is preceded 6662 by whitespace: 6663 6664 ```````````````````````````````` example 6665 **foo bar ** 6666 . 6667 <p>**foo bar **</p> 6668 ```````````````````````````````` 6669 6670 6671 (Nor can it be interpreted as an emphasized `*foo bar *`, because of 6672 Rule 11.) 6673 6674 This is not strong emphasis, because the second `**` is 6675 preceded by punctuation and followed by an alphanumeric: 6676 6677 ```````````````````````````````` example 6678 **(**foo) 6679 . 
6680 <p>**(**foo)</p> 6681 ```````````````````````````````` 6682 6683 6684 The point of this restriction is more easily appreciated 6685 with these examples: 6686 6687 ```````````````````````````````` example 6688 *(**foo**)* 6689 . 6690 <p><em>(<strong>foo</strong>)</em></p> 6691 ```````````````````````````````` 6692 6693 6694 ```````````````````````````````` example 6695 **Gomphocarpus (*Gomphocarpus physocarpus*, syn. 6696 *Asclepias physocarpa*)** 6697 . 6698 <p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn. 6699 <em>Asclepias physocarpa</em>)</strong></p> 6700 ```````````````````````````````` 6701 6702 6703 ```````````````````````````````` example 6704 **foo "*bar*" foo** 6705 . 6706 <p><strong>foo "<em>bar</em>" foo</strong></p> 6707 ```````````````````````````````` 6708 6709 6710 Intraword emphasis: 6711 6712 ```````````````````````````````` example 6713 **foo**bar 6714 . 6715 <p><strong>foo</strong>bar</p> 6716 ```````````````````````````````` 6717 6718 6719 Rule 8: 6720 6721 This is not strong emphasis, because the closing delimiter is 6722 preceded by whitespace: 6723 6724 ```````````````````````````````` example 6725 __foo bar __ 6726 . 6727 <p>__foo bar __</p> 6728 ```````````````````````````````` 6729 6730 6731 This is not strong emphasis, because the second `__` is 6732 preceded by punctuation and followed by an alphanumeric: 6733 6734 ```````````````````````````````` example 6735 __(__foo) 6736 . 6737 <p>__(__foo)</p> 6738 ```````````````````````````````` 6739 6740 6741 The point of this restriction is more easily appreciated 6742 with this example: 6743 6744 ```````````````````````````````` example 6745 _(__foo__)_ 6746 . 6747 <p><em>(<strong>foo</strong>)</em></p> 6748 ```````````````````````````````` 6749 6750 6751 Intraword strong emphasis is forbidden with `__`: 6752 6753 ```````````````````````````````` example 6754 __foo__bar 6755 . 
6756 <p>__foo__bar</p> 6757 ```````````````````````````````` 6758 6759 6760 ```````````````````````````````` example 6761 __пристаням__стремятся 6762 . 6763 <p>__пристаням__стремятся</p> 6764 ```````````````````````````````` 6765 6766 6767 ```````````````````````````````` example 6768 __foo__bar__baz__ 6769 . 6770 <p><strong>foo__bar__baz</strong></p> 6771 ```````````````````````````````` 6772 6773 6774 This is strong emphasis, even though the closing delimiter is 6775 both left- and right-flanking, because it is followed by 6776 punctuation: 6777 6778 ```````````````````````````````` example 6779 __(bar)__. 6780 . 6781 <p><strong>(bar)</strong>.</p> 6782 ```````````````````````````````` 6783 6784 6785 Rule 9: 6786 6787 Any nonempty sequence of inline elements can be the contents of an 6788 emphasized span. 6789 6790 ```````````````````````````````` example 6791 *foo [bar](/url)* 6792 . 6793 <p><em>foo <a href="/url">bar</a></em></p> 6794 ```````````````````````````````` 6795 6796 6797 ```````````````````````````````` example 6798 *foo 6799 bar* 6800 . 6801 <p><em>foo 6802 bar</em></p> 6803 ```````````````````````````````` 6804 6805 6806 In particular, emphasis and strong emphasis can be nested 6807 inside emphasis: 6808 6809 ```````````````````````````````` example 6810 _foo __bar__ baz_ 6811 . 6812 <p><em>foo <strong>bar</strong> baz</em></p> 6813 ```````````````````````````````` 6814 6815 6816 ```````````````````````````````` example 6817 _foo _bar_ baz_ 6818 . 6819 <p><em>foo <em>bar</em> baz</em></p> 6820 ```````````````````````````````` 6821 6822 6823 ```````````````````````````````` example 6824 __foo_ bar_ 6825 . 6826 <p><em><em>foo</em> bar</em></p> 6827 ```````````````````````````````` 6828 6829 6830 ```````````````````````````````` example 6831 *foo *bar** 6832 . 6833 <p><em>foo <em>bar</em></em></p> 6834 ```````````````````````````````` 6835 6836 6837 ```````````````````````````````` example 6838 *foo **bar** baz* 6839 . 
6840 <p><em>foo <strong>bar</strong> baz</em></p> 6841 ```````````````````````````````` 6842 6843 ```````````````````````````````` example 6844 *foo**bar**baz* 6845 . 6846 <p><em>foo<strong>bar</strong>baz</em></p> 6847 ```````````````````````````````` 6848 6849 Note that in the preceding case, the interpretation 6850 6851 ``` markdown 6852 <p><em>foo</em><em>bar<em></em>baz</em></p> 6853 ``` 6854 6855 6856 is precluded by the condition that a delimiter that 6857 can both open and close (like the `*` after `foo`) 6858 cannot form emphasis if the sum of the lengths of 6859 the delimiter runs containing the opening and 6860 closing delimiters is a multiple of 3 unless 6861 both lengths are multiples of 3. 6862 6863 6864 For the same reason, we don't get two consecutive 6865 emphasis sections in this example: 6866 6867 ```````````````````````````````` example 6868 *foo**bar* 6869 . 6870 <p><em>foo**bar</em></p> 6871 ```````````````````````````````` 6872 6873 6874 The same condition ensures that the following 6875 cases are all strong emphasis nested inside 6876 emphasis, even when the interior spaces are 6877 omitted: 6878 6879 6880 ```````````````````````````````` example 6881 ***foo** bar* 6882 . 6883 <p><em><strong>foo</strong> bar</em></p> 6884 ```````````````````````````````` 6885 6886 6887 ```````````````````````````````` example 6888 *foo **bar*** 6889 . 6890 <p><em>foo <strong>bar</strong></em></p> 6891 ```````````````````````````````` 6892 6893 6894 ```````````````````````````````` example 6895 *foo**bar*** 6896 . 6897 <p><em>foo<strong>bar</strong></em></p> 6898 ```````````````````````````````` 6899 6900 6901 When the lengths of the interior closing and opening 6902 delimiter runs are *both* multiples of 3, though, 6903 they can match to create emphasis: 6904 6905 ```````````````````````````````` example 6906 foo***bar***baz 6907 . 
6908 <p>foo<em><strong>bar</strong></em>baz</p> 6909 ```````````````````````````````` 6910 6911 ```````````````````````````````` example 6912 foo******bar*********baz 6913 . 6914 <p>foo<strong><strong><strong>bar</strong></strong></strong>***baz</p> 6915 ```````````````````````````````` 6916 6917 6918 Indefinite levels of nesting are possible: 6919 6920 ```````````````````````````````` example 6921 *foo **bar *baz* bim** bop* 6922 . 6923 <p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p> 6924 ```````````````````````````````` 6925 6926 6927 ```````````````````````````````` example 6928 *foo [*bar*](/url)* 6929 . 6930 <p><em>foo <a href="/url"><em>bar</em></a></em></p> 6931 ```````````````````````````````` 6932 6933 6934 There can be no empty emphasis or strong emphasis: 6935 6936 ```````````````````````````````` example 6937 ** is not an empty emphasis 6938 . 6939 <p>** is not an empty emphasis</p> 6940 ```````````````````````````````` 6941 6942 6943 ```````````````````````````````` example 6944 **** is not an empty strong emphasis 6945 . 6946 <p>**** is not an empty strong emphasis</p> 6947 ```````````````````````````````` 6948 6949 6950 6951 Rule 10: 6952 6953 Any nonempty sequence of inline elements can be the contents of a 6954 strongly emphasized span. 6955 6956 ```````````````````````````````` example 6957 **foo [bar](/url)** 6958 . 6959 <p><strong>foo <a href="/url">bar</a></strong></p> 6960 ```````````````````````````````` 6961 6962 6963 ```````````````````````````````` example 6964 **foo 6965 bar** 6966 . 6967 <p><strong>foo 6968 bar</strong></p> 6969 ```````````````````````````````` 6970 6971 6972 In particular, emphasis and strong emphasis can be nested 6973 inside strong emphasis: 6974 6975 ```````````````````````````````` example 6976 __foo _bar_ baz__ 6977 . 6978 <p><strong>foo <em>bar</em> baz</strong></p> 6979 ```````````````````````````````` 6980 6981 6982 ```````````````````````````````` example 6983 __foo __bar__ baz__ 6984 .
6985 <p><strong>foo <strong>bar</strong> baz</strong></p> 6986 ```````````````````````````````` 6987 6988 6989 ```````````````````````````````` example 6990 ____foo__ bar__ 6991 . 6992 <p><strong><strong>foo</strong> bar</strong></p> 6993 ```````````````````````````````` 6994 6995 6996 ```````````````````````````````` example 6997 **foo **bar**** 6998 . 6999 <p><strong>foo <strong>bar</strong></strong></p> 7000 ```````````````````````````````` 7001 7002 7003 ```````````````````````````````` example 7004 **foo *bar* baz** 7005 . 7006 <p><strong>foo <em>bar</em> baz</strong></p> 7007 ```````````````````````````````` 7008 7009 7010 ```````````````````````````````` example 7011 **foo*bar*baz** 7012 . 7013 <p><strong>foo<em>bar</em>baz</strong></p> 7014 ```````````````````````````````` 7015 7016 7017 ```````````````````````````````` example 7018 ***foo* bar** 7019 . 7020 <p><strong><em>foo</em> bar</strong></p> 7021 ```````````````````````````````` 7022 7023 7024 ```````````````````````````````` example 7025 **foo *bar*** 7026 . 7027 <p><strong>foo <em>bar</em></strong></p> 7028 ```````````````````````````````` 7029 7030 7031 Indefinite levels of nesting are possible: 7032 7033 ```````````````````````````````` example 7034 **foo *bar **baz** 7035 bim* bop** 7036 . 7037 <p><strong>foo <em>bar <strong>baz</strong> 7038 bim</em> bop</strong></p> 7039 ```````````````````````````````` 7040 7041 7042 ```````````````````````````````` example 7043 **foo [*bar*](/url)** 7044 . 7045 <p><strong>foo <a href="/url"><em>bar</em></a></strong></p> 7046 ```````````````````````````````` 7047 7048 7049 There can be no empty emphasis or strong emphasis: 7050 7051 ```````````````````````````````` example 7052 __ is not an empty emphasis 7053 . 7054 <p>__ is not an empty emphasis</p> 7055 ```````````````````````````````` 7056 7057 7058 ```````````````````````````````` example 7059 ____ is not an empty strong emphasis 7060 . 
7061 <p>____ is not an empty strong emphasis</p> 7062 ```````````````````````````````` 7063 7064 7065 7066 Rule 11: 7067 7068 ```````````````````````````````` example 7069 foo *** 7070 . 7071 <p>foo ***</p> 7072 ```````````````````````````````` 7073 7074 7075 ```````````````````````````````` example 7076 foo *\** 7077 . 7078 <p>foo <em>*</em></p> 7079 ```````````````````````````````` 7080 7081 7082 ```````````````````````````````` example 7083 foo *_* 7084 . 7085 <p>foo <em>_</em></p> 7086 ```````````````````````````````` 7087 7088 7089 ```````````````````````````````` example 7090 foo ***** 7091 . 7092 <p>foo *****</p> 7093 ```````````````````````````````` 7094 7095 7096 ```````````````````````````````` example 7097 foo **\*** 7098 . 7099 <p>foo <strong>*</strong></p> 7100 ```````````````````````````````` 7101 7102 7103 ```````````````````````````````` example 7104 foo **_** 7105 . 7106 <p>foo <strong>_</strong></p> 7107 ```````````````````````````````` 7108 7109 7110 Note that when delimiters do not match evenly, Rule 11 determines 7111 that the excess literal `*` characters will appear outside of the 7112 emphasis, rather than inside it: 7113 7114 ```````````````````````````````` example 7115 **foo* 7116 . 7117 <p>*<em>foo</em></p> 7118 ```````````````````````````````` 7119 7120 7121 ```````````````````````````````` example 7122 *foo** 7123 . 7124 <p><em>foo</em>*</p> 7125 ```````````````````````````````` 7126 7127 7128 ```````````````````````````````` example 7129 ***foo** 7130 . 7131 <p>*<strong>foo</strong></p> 7132 ```````````````````````````````` 7133 7134 7135 ```````````````````````````````` example 7136 ****foo* 7137 . 7138 <p>***<em>foo</em></p> 7139 ```````````````````````````````` 7140 7141 7142 ```````````````````````````````` example 7143 **foo*** 7144 . 7145 <p><strong>foo</strong>*</p> 7146 ```````````````````````````````` 7147 7148 7149 ```````````````````````````````` example 7150 *foo**** 7151 . 
7152 <p><em>foo</em>***</p> 7153 ```````````````````````````````` 7154 7155 7156 7157 Rule 12: 7158 7159 ```````````````````````````````` example 7160 foo ___ 7161 . 7162 <p>foo ___</p> 7163 ```````````````````````````````` 7164 7165 7166 ```````````````````````````````` example 7167 foo _\__ 7168 . 7169 <p>foo <em>_</em></p> 7170 ```````````````````````````````` 7171 7172 7173 ```````````````````````````````` example 7174 foo _*_ 7175 . 7176 <p>foo <em>*</em></p> 7177 ```````````````````````````````` 7178 7179 7180 ```````````````````````````````` example 7181 foo _____ 7182 . 7183 <p>foo _____</p> 7184 ```````````````````````````````` 7185 7186 7187 ```````````````````````````````` example 7188 foo __\___ 7189 . 7190 <p>foo <strong>_</strong></p> 7191 ```````````````````````````````` 7192 7193 7194 ```````````````````````````````` example 7195 foo __*__ 7196 . 7197 <p>foo <strong>*</strong></p> 7198 ```````````````````````````````` 7199 7200 7201 ```````````````````````````````` example 7202 __foo_ 7203 . 7204 <p>_<em>foo</em></p> 7205 ```````````````````````````````` 7206 7207 7208 Note that when delimiters do not match evenly, Rule 12 determines 7209 that the excess literal `_` characters will appear outside of the 7210 emphasis, rather than inside it: 7211 7212 ```````````````````````````````` example 7213 _foo__ 7214 . 7215 <p><em>foo</em>_</p> 7216 ```````````````````````````````` 7217 7218 7219 ```````````````````````````````` example 7220 ___foo__ 7221 . 7222 <p>_<strong>foo</strong></p> 7223 ```````````````````````````````` 7224 7225 7226 ```````````````````````````````` example 7227 ____foo_ 7228 . 7229 <p>___<em>foo</em></p> 7230 ```````````````````````````````` 7231 7232 7233 ```````````````````````````````` example 7234 __foo___ 7235 . 7236 <p><strong>foo</strong>_</p> 7237 ```````````````````````````````` 7238 7239 7240 ```````````````````````````````` example 7241 _foo____ 7242 . 
7243 <p><em>foo</em>___</p> 7244 ```````````````````````````````` 7245 7246 7247 Rule 13 implies that if you want emphasis nested directly inside 7248 emphasis, you must use different delimiters: 7249 7250 ```````````````````````````````` example 7251 **foo** 7252 . 7253 <p><strong>foo</strong></p> 7254 ```````````````````````````````` 7255 7256 7257 ```````````````````````````````` example 7258 *_foo_* 7259 . 7260 <p><em><em>foo</em></em></p> 7261 ```````````````````````````````` 7262 7263 7264 ```````````````````````````````` example 7265 __foo__ 7266 . 7267 <p><strong>foo</strong></p> 7268 ```````````````````````````````` 7269 7270 7271 ```````````````````````````````` example 7272 _*foo*_ 7273 . 7274 <p><em><em>foo</em></em></p> 7275 ```````````````````````````````` 7276 7277 7278 However, strong emphasis within strong emphasis is possible without 7279 switching delimiters: 7280 7281 ```````````````````````````````` example 7282 ****foo**** 7283 . 7284 <p><strong><strong>foo</strong></strong></p> 7285 ```````````````````````````````` 7286 7287 7288 ```````````````````````````````` example 7289 ____foo____ 7290 . 7291 <p><strong><strong>foo</strong></strong></p> 7292 ```````````````````````````````` 7293 7294 7295 7296 Rule 13 can be applied to arbitrarily long sequences of 7297 delimiters: 7298 7299 ```````````````````````````````` example 7300 ******foo****** 7301 . 7302 <p><strong><strong><strong>foo</strong></strong></strong></p> 7303 ```````````````````````````````` 7304 7305 7306 Rule 14: 7307 7308 ```````````````````````````````` example 7309 ***foo*** 7310 . 7311 <p><em><strong>foo</strong></em></p> 7312 ```````````````````````````````` 7313 7314 7315 ```````````````````````````````` example 7316 _____foo_____ 7317 . 7318 <p><em><strong><strong>foo</strong></strong></em></p> 7319 ```````````````````````````````` 7320 7321 7322 Rule 15: 7323 7324 ```````````````````````````````` example 7325 *foo _bar* baz_ 7326 . 
7327 <p><em>foo _bar</em> baz_</p> 7328 ```````````````````````````````` 7329 7330 7331 ```````````````````````````````` example 7332 *foo __bar *baz bim__ bam* 7333 . 7334 <p><em>foo <strong>bar *baz bim</strong> bam</em></p> 7335 ```````````````````````````````` 7336 7337 7338 Rule 16: 7339 7340 ```````````````````````````````` example 7341 **foo **bar baz** 7342 . 7343 <p>**foo <strong>bar baz</strong></p> 7344 ```````````````````````````````` 7345 7346 7347 ```````````````````````````````` example 7348 *foo *bar baz* 7349 . 7350 <p>*foo <em>bar baz</em></p> 7351 ```````````````````````````````` 7352 7353 7354 Rule 17: 7355 7356 ```````````````````````````````` example 7357 *[bar*](/url) 7358 . 7359 <p>*<a href="/url">bar*</a></p> 7360 ```````````````````````````````` 7361 7362 7363 ```````````````````````````````` example 7364 _foo [bar_](/url) 7365 . 7366 <p>_foo <a href="/url">bar_</a></p> 7367 ```````````````````````````````` 7368 7369 7370 ```````````````````````````````` example 7371 *<img src="foo" title="*"/> 7372 . 7373 <p>*<img src="foo" title="*"/></p> 7374 ```````````````````````````````` 7375 7376 7377 ```````````````````````````````` example 7378 **<a href="**"> 7379 . 7380 <p>**<a href="**"></p> 7381 ```````````````````````````````` 7382 7383 7384 ```````````````````````````````` example 7385 __<a href="__"> 7386 . 7387 <p>__<a href="__"></p> 7388 ```````````````````````````````` 7389 7390 7391 ```````````````````````````````` example 7392 *a `*`* 7393 . 7394 <p><em>a <code>*</code></em></p> 7395 ```````````````````````````````` 7396 7397 7398 ```````````````````````````````` example 7399 _a `_`_ 7400 . 7401 <p><em>a <code>_</code></em></p> 7402 ```````````````````````````````` 7403 7404 7405 ```````````````````````````````` example 7406 **a<http://foo.bar/?q=**> 7407 . 
7408 <p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p> 7409 ```````````````````````````````` 7410 7411 7412 ```````````````````````````````` example 7413 __a<http://foo.bar/?q=__> 7414 . 7415 <p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p> 7416 ```````````````````````````````` 7417 7418 7419 7420 ## Links 7421 7422 A link contains [link text] (the visible text), a [link destination] 7423 (the URI that is the link destination), and optionally a [link title]. 7424 There are two basic kinds of links in Markdown. In [inline links] the 7425 destination and title are given immediately after the link text. In 7426 [reference links] the destination and title are defined elsewhere in 7427 the document. 7428 7429 A [link text](@) consists of a sequence of zero or more 7430 inline elements enclosed by square brackets (`[` and `]`). The 7431 following rules apply: 7432 7433 - Links may not contain other links, at any level of nesting. If 7434 multiple otherwise valid link definitions appear nested inside each 7435 other, the inner-most definition is used. 7436 7437 - Brackets are allowed in the [link text] only if (a) they 7438 are backslash-escaped or (b) they appear as a matched pair of brackets, 7439 with an open bracket `[`, a sequence of zero or more inlines, and 7440 a close bracket `]`. 7441 7442 - Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly 7443 than the brackets in link text. Thus, for example, 7444 `` [foo`]` `` could not be a link text, since the second `]` 7445 is part of a code span. 7446 7447 - The brackets in link text bind more tightly than markers for 7448 [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link. 
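
As an informal illustration of the bracket rule for link text (not part of the normative text), the following Python sketch checks whether a candidate link text contains only backslash-escaped or matched brackets. The function name `brackets_balanced` is hypothetical, and the sketch assumes a simplified setting: it ignores code spans, autolinks, and raw HTML tags, which bind more tightly than brackets and would have to be skipped by a real parser.

```python
import string


def brackets_balanced(link_text: str) -> bool:
    """Return True if every unescaped bracket in link_text is part of a matched pair.

    Simplifying assumption: code spans, autolinks, and raw HTML are not
    recognized here, although they bind more tightly than brackets.
    """
    depth = 0
    i = 0
    while i < len(link_text):
        ch = link_text[i]
        # A backslash before ASCII punctuation is an escape; skip both characters.
        if ch == "\\" and i + 1 < len(link_text) and link_text[i + 1] in string.punctuation:
            i += 2
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                # A close bracket with no matching open bracket.
                return False
        i += 1
    # Any open bracket left unmatched makes the text unbalanced.
    return depth == 0
```

Under these assumptions, `link [foo [bar]]` and `link \[bar` pass the check, while `link] bar` and `link [bar` do not.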
7449 7450 A [link destination](@) consists of either 7451 7452 - a sequence of zero or more characters between an opening `<` and a 7453 closing `>` that contains no line breaks or unescaped 7454 `<` or `>` characters, or 7455 7456 - a nonempty sequence of characters that does not start with 7457 `<`, does not include ASCII space or control characters, and 7458 includes parentheses only if (a) they are backslash-escaped or 7459 (b) they are part of a balanced pair of unescaped parentheses. 7460 (Implementations may impose limits on parentheses nesting to 7461 avoid performance issues, but at least three levels of nesting 7462 should be supported.) 7463 7464 A [link title](@) consists of either 7465 7466 - a sequence of zero or more characters between straight double-quote 7467 characters (`"`), including a `"` character only if it is 7468 backslash-escaped, or 7469 7470 - a sequence of zero or more characters between straight single-quote 7471 characters (`'`), including a `'` character only if it is 7472 backslash-escaped, or 7473 7474 - a sequence of zero or more characters between matching parentheses 7475 (`(...)`), including a `(` or `)` character only if it is 7476 backslash-escaped. 7477 7478 Although [link titles] may span multiple lines, they may not contain 7479 a [blank line]. 7480 7481 An [inline link](@) consists of a [link text] followed immediately 7482 by a left parenthesis `(`, optional [whitespace], an optional 7483 [link destination], an optional [link title] separated from the link 7484 destination by [whitespace], optional [whitespace], and a right 7485 parenthesis `)`. The link's text consists of the inlines contained 7486 in the [link text] (excluding the enclosing square brackets). 7487 The link's URI consists of the link destination, excluding enclosing 7488 `<...>` if present, with backslash-escapes in effect as described 7489 above. 
The link's title consists of the link title, excluding its 7490 enclosing delimiters, with backslash-escapes in effect as described 7491 above. 7492 7493 Here is a simple inline link: 7494 7495 ```````````````````````````````` example 7496 [link](/uri "title") 7497 . 7498 <p><a href="/uri" title="title">link</a></p> 7499 ```````````````````````````````` 7500 7501 7502 The title may be omitted: 7503 7504 ```````````````````````````````` example 7505 [link](/uri) 7506 . 7507 <p><a href="/uri">link</a></p> 7508 ```````````````````````````````` 7509 7510 7511 Both the title and the destination may be omitted: 7512 7513 ```````````````````````````````` example 7514 [link]() 7515 . 7516 <p><a href="">link</a></p> 7517 ```````````````````````````````` 7518 7519 7520 ```````````````````````````````` example 7521 [link](<>) 7522 . 7523 <p><a href="">link</a></p> 7524 ```````````````````````````````` 7525 7526 The destination can only contain spaces if it is 7527 enclosed in pointy brackets: 7528 7529 ```````````````````````````````` example 7530 [link](/my uri) 7531 . 7532 <p>[link](/my uri)</p> 7533 ```````````````````````````````` 7534 7535 ```````````````````````````````` example 7536 [link](</my uri>) 7537 . 7538 <p><a href="/my%20uri">link</a></p> 7539 ```````````````````````````````` 7540 7541 The destination cannot contain line breaks, 7542 even if enclosed in pointy brackets: 7543 7544 ```````````````````````````````` example 7545 [link](foo 7546 bar) 7547 . 7548 <p>[link](foo 7549 bar)</p> 7550 ```````````````````````````````` 7551 7552 ```````````````````````````````` example 7553 [link](<foo 7554 bar>) 7555 . 7556 <p>[link](&lt;foo 7557 bar&gt;)</p> 7558 ```````````````````````````````` 7559 7560 The destination can contain `)` if it is enclosed 7561 in pointy brackets: 7562 7563 ```````````````````````````````` example 7564 [a](<b)c>) 7565 .
7566 <p><a href="b)c">a</a></p> 7567 ```````````````````````````````` 7568 7569 Pointy brackets that enclose links must be unescaped: 7570 7571 ```````````````````````````````` example 7572 [link](<foo\>) 7573 . 7574 <p>[link](&lt;foo&gt;)</p> 7575 ```````````````````````````````` 7576 7577 These are not links, because the opening pointy bracket 7578 is not matched properly: 7579 7580 ```````````````````````````````` example 7581 [a](<b)c 7582 [a](<b)c> 7583 [a](<b>c) 7584 . 7585 <p>[a](&lt;b)c 7586 [a](&lt;b)c&gt; 7587 [a](<b>c)</p> 7588 ```````````````````````````````` 7589 7590 Parentheses inside the link destination may be escaped: 7591 7592 ```````````````````````````````` example 7593 [link](\(foo\)) 7594 . 7595 <p><a href="(foo)">link</a></p> 7596 ```````````````````````````````` 7597 7598 Any number of parentheses are allowed without escaping, as long as they are 7599 balanced: 7600 7601 ```````````````````````````````` example 7602 [link](foo(and(bar))) 7603 . 7604 <p><a href="foo(and(bar))">link</a></p> 7605 ```````````````````````````````` 7606 7607 However, if you have unbalanced parentheses, you need to escape or use the 7608 `<...>` form: 7609 7610 ```````````````````````````````` example 7611 [link](foo\(and\(bar\)) 7612 . 7613 <p><a href="foo(and(bar)">link</a></p> 7614 ```````````````````````````````` 7615 7616 7617 ```````````````````````````````` example 7618 [link](<foo(and(bar)>) 7619 . 7620 <p><a href="foo(and(bar)">link</a></p> 7621 ```````````````````````````````` 7622 7623 7624 Parentheses and other symbols can also be escaped, as usual 7625 in Markdown: 7626 7627 ```````````````````````````````` example 7628 [link](foo\)\:) 7629 . 7630 <p><a href="foo):">link</a></p> 7631 ```````````````````````````````` 7632 7633 7634 A link can contain fragment identifiers and queries: 7635 7636 ```````````````````````````````` example 7637 [link](#fragment) 7638 7639 [link](http://example.com#fragment) 7640 7641 [link](http://example.com?foo=3#frag) 7642 .
7643 <p><a href="#fragment">link</a></p> 7644 <p><a href="http://example.com#fragment">link</a></p> 7645 <p><a href="http://example.com?foo=3#frag">link</a></p> 7646 ```````````````````````````````` 7647 7648 7649 Note that a backslash before a non-escapable character is 7650 just a backslash: 7651 7652 ```````````````````````````````` example 7653 [link](foo\bar) 7654 . 7655 <p><a href="foo%5Cbar">link</a></p> 7656 ```````````````````````````````` 7657 7658 7659 URL-escaping should be left alone inside the destination, as all 7660 URL-escaped characters are also valid URL characters. Entity and 7661 numerical character references in the destination will be parsed 7662 into the corresponding Unicode code points, as usual. These may 7663 be optionally URL-escaped when written as HTML, but this spec 7664 does not enforce any particular policy for rendering URLs in 7665 HTML or other formats. Renderers may make different decisions 7666 about how to escape or normalize URLs in the output. 7667 7668 ```````````````````````````````` example 7669 [link](foo%20bä) 7670 . 7671 <p><a href="foo%20b%C3%A4">link</a></p> 7672 ```````````````````````````````` 7673 7674 7675 Note that, because titles can often be parsed as destinations, 7676 if you try to omit the destination and keep the title, you'll 7677 get unexpected results: 7678 7679 ```````````````````````````````` example 7680 [link]("title") 7681 . 7682 <p><a href="%22title%22">link</a></p> 7683 ```````````````````````````````` 7684 7685 7686 Titles may be in single quotes, double quotes, or parentheses: 7687 7688 ```````````````````````````````` example 7689 [link](/url "title") 7690 [link](/url 'title') 7691 [link](/url (title)) 7692 . 
7693 <p><a href="/url" title="title">link</a> 7694 <a href="/url" title="title">link</a> 7695 <a href="/url" title="title">link</a></p> 7696 ```````````````````````````````` 7697 7698 7699 Backslash escapes and entity and numeric character references 7700 may be used in titles: 7701 7702 ```````````````````````````````` example 7703 [link](/url "title \"&quot;") 7704 . 7705 <p><a href="/url" title="title &quot;&quot;">link</a></p> 7706 ```````````````````````````````` 7707 7708 7709 Titles must be separated from the link destination by [whitespace]. 7710 Other [Unicode whitespace], such as a non-breaking space, does not work. 7711 7712 ```````````````````````````````` example 7713 [link](/url "title") 7714 . 7715 <p><a href="/url%C2%A0%22title%22">link</a></p> 7716 ```````````````````````````````` 7717 7718 7719 Nested balanced quotes are not allowed without escaping: 7720 7721 ```````````````````````````````` example 7722 [link](/url "title "and" title") 7723 . 7724 <p>[link](/url &quot;title &quot;and&quot; title&quot;)</p> 7725 ```````````````````````````````` 7726 7727 7728 But it is easy to work around this by using a different quote type: 7729 7730 ```````````````````````````````` example 7731 [link](/url 'title "and" title') 7732 . 7733 <p><a href="/url" title="title &quot;and&quot; title">link</a></p> 7734 ```````````````````````````````` 7735 7736 7737 (Note: `Markdown.pl` did allow double quotes inside a double-quoted 7738 title, and its test suite included a test demonstrating this. 7739 But it is hard to see a good rationale for the extra complexity this 7740 brings, since there are already many ways---backslash escaping, 7741 entity and numeric character references, or using a different 7742 quote type for the enclosing title---to write titles containing 7743 double quotes. `Markdown.pl`'s handling of titles has a number 7744 of other strange features. For example, it allows single-quoted 7745 titles in inline links, but not reference links.
And, in 7746 reference links but not inline links, it allows a title to begin 7747 with `"` and end with `)`. `Markdown.pl` 1.0.1 even allows 7748 titles with no closing quotation mark, though 1.0.2b8 does not. 7749 It seems preferable to adopt a simple, rational rule that works 7750 the same way in inline links and link reference definitions.) 7751 7752 [Whitespace] is allowed around the destination and title: 7753 7754 ```````````````````````````````` example 7755 [link]( /uri 7756 "title" ) 7757 . 7758 <p><a href="/uri" title="title">link</a></p> 7759 ```````````````````````````````` 7760 7761 7762 But it is not allowed between the link text and the 7763 following parenthesis: 7764 7765 ```````````````````````````````` example 7766 [link] (/uri) 7767 . 7768 <p>[link] (/uri)</p> 7769 ```````````````````````````````` 7770 7771 7772 The link text may contain balanced brackets, but not unbalanced ones, 7773 unless they are escaped: 7774 7775 ```````````````````````````````` example 7776 [link [foo [bar]]](/uri) 7777 . 7778 <p><a href="/uri">link [foo [bar]]</a></p> 7779 ```````````````````````````````` 7780 7781 7782 ```````````````````````````````` example 7783 [link] bar](/uri) 7784 . 7785 <p>[link] bar](/uri)</p> 7786 ```````````````````````````````` 7787 7788 7789 ```````````````````````````````` example 7790 [link [bar](/uri) 7791 . 7792 <p>[link <a href="/uri">bar</a></p> 7793 ```````````````````````````````` 7794 7795 7796 ```````````````````````````````` example 7797 [link \[bar](/uri) 7798 . 7799 <p><a href="/uri">link [bar</a></p> 7800 ```````````````````````````````` 7801 7802 7803 The link text may contain inline content: 7804 7805 ```````````````````````````````` example 7806 [link *foo **bar** `#`*](/uri) 7807 . 7808 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p> 7809 ```````````````````````````````` 7810 7811 7812 ```````````````````````````````` example 7813 [![moon](moon.jpg)](/uri) 7814 . 
7815 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p> 7816 ```````````````````````````````` 7817 7818 7819 However, links may not contain other links, at any level of nesting. 7820 7821 ```````````````````````````````` example 7822 [foo [bar](/uri)](/uri) 7823 . 7824 <p>[foo <a href="/uri">bar</a>](/uri)</p> 7825 ```````````````````````````````` 7826 7827 7828 ```````````````````````````````` example 7829 [foo *[bar [baz](/uri)](/uri)*](/uri) 7830 . 7831 <p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p> 7832 ```````````````````````````````` 7833 7834 7835 ```````````````````````````````` example 7836 ![[[foo](uri1)](uri2)](uri3) 7837 . 7838 <p><img src="uri3" alt="[foo](uri2)" /></p> 7839 ```````````````````````````````` 7840 7841 7842 These cases illustrate the precedence of link text grouping over 7843 emphasis grouping: 7844 7845 ```````````````````````````````` example 7846 *[foo*](/uri) 7847 . 7848 <p>*<a href="/uri">foo*</a></p> 7849 ```````````````````````````````` 7850 7851 7852 ```````````````````````````````` example 7853 [foo *bar](baz*) 7854 . 7855 <p><a href="baz*">foo *bar</a></p> 7856 ```````````````````````````````` 7857 7858 7859 Note that brackets that *aren't* part of links do not take 7860 precedence: 7861 7862 ```````````````````````````````` example 7863 *foo [bar* baz] 7864 . 7865 <p><em>foo [bar</em> baz]</p> 7866 ```````````````````````````````` 7867 7868 7869 These cases illustrate the precedence of HTML tags, code spans, 7870 and autolinks over link grouping: 7871 7872 ```````````````````````````````` example 7873 [foo <bar attr="](baz)"> 7874 . 7875 <p>[foo <bar attr="](baz)"></p> 7876 ```````````````````````````````` 7877 7878 7879 ```````````````````````````````` example 7880 [foo`](/uri)` 7881 . 7882 <p>[foo<code>](/uri)</code></p> 7883 ```````````````````````````````` 7884 7885 7886 ```````````````````````````````` example 7887 [foo<http://example.com/?search=](uri)> 7888 . 
7889 <p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p> 7890 ```````````````````````````````` 7891 7892 7893 There are three kinds of [reference link](@)s: 7894 [full](#full-reference-link), [collapsed](#collapsed-reference-link), 7895 and [shortcut](#shortcut-reference-link). 7896 7897 A [full reference link](@) 7898 consists of a [link text] immediately followed by a [link label] 7899 that [matches] a [link reference definition] elsewhere in the document. 7900 7901 A [link label](@) begins with a left bracket (`[`) and ends 7902 with the first right bracket (`]`) that is not backslash-escaped. 7903 Between these brackets there must be at least one [non-whitespace character]. 7904 Unescaped square bracket characters are not allowed inside the 7905 opening and closing square brackets of [link labels]. A link 7906 label can have at most 999 characters inside the square 7907 brackets. 7908 7909 One label [matches](@) 7910 another just in case their normalized forms are equal. To normalize a 7911 label, strip off the opening and closing brackets, 7912 perform the *Unicode case fold*, strip leading and trailing 7913 [whitespace] and collapse consecutive internal 7914 [whitespace] to a single space. If there are multiple 7915 matching reference link definitions, the one that comes first in the 7916 document is used. (It is desirable in such cases to emit a warning.) 7917 7918 The contents of the first link label are parsed as inlines, which are 7919 used as the link's text. The link's URI and title are provided by the 7920 matching [link reference definition]. 7921 7922 Here is a simple example: 7923 7924 ```````````````````````````````` example 7925 [foo][bar] 7926 7927 [bar]: /url "title" 7928 . 7929 <p><a href="/url" title="title">foo</a></p> 7930 ```````````````````````````````` 7931 7932 7933 The rules for the [link text] are the same as with 7934 [inline links]. 
Thus: 7935 7936 The link text may contain balanced brackets, but not unbalanced ones, 7937 unless they are escaped: 7938 7939 ```````````````````````````````` example 7940 [link [foo [bar]]][ref] 7941 7942 [ref]: /uri 7943 . 7944 <p><a href="/uri">link [foo [bar]]</a></p> 7945 ```````````````````````````````` 7946 7947 7948 ```````````````````````````````` example 7949 [link \[bar][ref] 7950 7951 [ref]: /uri 7952 . 7953 <p><a href="/uri">link [bar</a></p> 7954 ```````````````````````````````` 7955 7956 7957 The link text may contain inline content: 7958 7959 ```````````````````````````````` example 7960 [link *foo **bar** `#`*][ref] 7961 7962 [ref]: /uri 7963 . 7964 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p> 7965 ```````````````````````````````` 7966 7967 7968 ```````````````````````````````` example 7969 [![moon](moon.jpg)][ref] 7970 7971 [ref]: /uri 7972 . 7973 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p> 7974 ```````````````````````````````` 7975 7976 7977 However, links may not contain other links, at any level of nesting. 7978 7979 ```````````````````````````````` example 7980 [foo [bar](/uri)][ref] 7981 7982 [ref]: /uri 7983 . 7984 <p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p> 7985 ```````````````````````````````` 7986 7987 7988 ```````````````````````````````` example 7989 [foo *bar [baz][ref]*][ref] 7990 7991 [ref]: /uri 7992 . 7993 <p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p> 7994 ```````````````````````````````` 7995 7996 7997 (In the examples above, we have two [shortcut reference links] 7998 instead of one [full reference link].) 7999 8000 The following cases illustrate the precedence of link text grouping over 8001 emphasis grouping: 8002 8003 ```````````````````````````````` example 8004 *[foo*][ref] 8005 8006 [ref]: /uri 8007 . 
8008 <p>*<a href="/uri">foo*</a></p> 8009 ```````````````````````````````` 8010 8011 8012 ```````````````````````````````` example 8013 [foo *bar][ref] 8014 8015 [ref]: /uri 8016 . 8017 <p><a href="/uri">foo *bar</a></p> 8018 ```````````````````````````````` 8019 8020 8021 These cases illustrate the precedence of HTML tags, code spans, 8022 and autolinks over link grouping: 8023 8024 ```````````````````````````````` example 8025 [foo <bar attr="][ref]"> 8026 8027 [ref]: /uri 8028 . 8029 <p>[foo <bar attr="][ref]"></p> 8030 ```````````````````````````````` 8031 8032 8033 ```````````````````````````````` example 8034 [foo`][ref]` 8035 8036 [ref]: /uri 8037 . 8038 <p>[foo<code>][ref]</code></p> 8039 ```````````````````````````````` 8040 8041 8042 ```````````````````````````````` example 8043 [foo<http://example.com/?search=][ref]> 8044 8045 [ref]: /uri 8046 . 8047 <p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p> 8048 ```````````````````````````````` 8049 8050 8051 Matching is case-insensitive: 8052 8053 ```````````````````````````````` example 8054 [foo][BaR] 8055 8056 [bar]: /url "title" 8057 . 8058 <p><a href="/url" title="title">foo</a></p> 8059 ```````````````````````````````` 8060 8061 8062 Unicode case fold is used: 8063 8064 ```````````````````````````````` example 8065 [Толпой][Толпой] is a Russian word. 8066 8067 [ТОЛПОЙ]: /url 8068 . 8069 <p><a href="/url">Толпой</a> is a Russian word.</p> 8070 ```````````````````````````````` 8071 8072 8073 Consecutive internal [whitespace] is treated as one space for 8074 purposes of determining matching: 8075 8076 ```````````````````````````````` example 8077 [Foo 8078 bar]: /url 8079 8080 [Baz][Foo bar] 8081 . 
8082 <p><a href="/url">Baz</a></p> 8083 ```````````````````````````````` 8084 8085 8086 No [whitespace] is allowed between the [link text] and the 8087 [link label]: 8088 8089 ```````````````````````````````` example 8090 [foo] [bar] 8091 8092 [bar]: /url "title" 8093 . 8094 <p>[foo] <a href="/url" title="title">bar</a></p> 8095 ```````````````````````````````` 8096 8097 8098 ```````````````````````````````` example 8099 [foo] 8100 [bar] 8101 8102 [bar]: /url "title" 8103 . 8104 <p>[foo] 8105 <a href="/url" title="title">bar</a></p> 8106 ```````````````````````````````` 8107 8108 8109 This is a departure from John Gruber's original Markdown syntax 8110 description, which explicitly allows whitespace between the link 8111 text and the link label. It brings reference links in line with 8112 [inline links], which (according to both original Markdown and 8113 this spec) cannot have whitespace after the link text. More 8114 importantly, it prevents inadvertent capture of consecutive 8115 [shortcut reference links]. If whitespace is allowed between the 8116 link text and the link label, then in the following we will have 8117 a single reference link, not two shortcut reference links, as 8118 intended: 8119 8120 ``` markdown 8121 [foo] 8122 [bar] 8123 8124 [foo]: /url1 8125 [bar]: /url2 8126 ``` 8127 8128 (Note that [shortcut reference links] were introduced by Gruber 8129 himself in a beta version of `Markdown.pl`, but never included 8130 in the official syntax description. Without shortcut reference 8131 links, it is harmless to allow space between the link text and 8132 link label; but once shortcut references are introduced, it is 8133 too dangerous to allow this, as it frequently leads to 8134 unintended results.) 8135 8136 When there are multiple matching [link reference definitions], 8137 the first is used: 8138 8139 ```````````````````````````````` example 8140 [foo]: /url1 8141 8142 [foo]: /url2 8143 8144 [bar][foo] 8145 . 
8146 <p><a href="/url1">bar</a></p> 8147 ```````````````````````````````` 8148 8149 8150 Note that matching is performed on normalized strings, not parsed 8151 inline content. So the following does not match, even though the 8152 labels define equivalent inline content: 8153 8154 ```````````````````````````````` example 8155 [bar][foo\!] 8156 8157 [foo!]: /url 8158 . 8159 <p>[bar][foo!]</p> 8160 ```````````````````````````````` 8161 8162 8163 [Link labels] cannot contain brackets, unless they are 8164 backslash-escaped: 8165 8166 ```````````````````````````````` example 8167 [foo][ref[] 8168 8169 [ref[]: /uri 8170 . 8171 <p>[foo][ref[]</p> 8172 <p>[ref[]: /uri</p> 8173 ```````````````````````````````` 8174 8175 8176 ```````````````````````````````` example 8177 [foo][ref[bar]] 8178 8179 [ref[bar]]: /uri 8180 . 8181 <p>[foo][ref[bar]]</p> 8182 <p>[ref[bar]]: /uri</p> 8183 ```````````````````````````````` 8184 8185 8186 ```````````````````````````````` example 8187 [[[foo]]] 8188 8189 [[[foo]]]: /url 8190 . 8191 <p>[[[foo]]]</p> 8192 <p>[[[foo]]]: /url</p> 8193 ```````````````````````````````` 8194 8195 8196 ```````````````````````````````` example 8197 [foo][ref\[] 8198 8199 [ref\[]: /uri 8200 . 8201 <p><a href="/uri">foo</a></p> 8202 ```````````````````````````````` 8203 8204 8205 Note that in this example `]` is not backslash-escaped: 8206 8207 ```````````````````````````````` example 8208 [bar\\]: /uri 8209 8210 [bar\\] 8211 . 8212 <p><a href="/uri">bar\</a></p> 8213 ```````````````````````````````` 8214 8215 8216 A [link label] must contain at least one [non-whitespace character]: 8217 8218 ```````````````````````````````` example 8219 [] 8220 8221 []: /uri 8222 . 8223 <p>[]</p> 8224 <p>[]: /uri</p> 8225 ```````````````````````````````` 8226 8227 8228 ```````````````````````````````` example 8229 [ 8230 ] 8231 8232 [ 8233 ]: /uri 8234 . 
<p>[
]</p>
<p>[
]: /uri</p>
````````````````````````````````


A [collapsed reference link](@)
consists of a [link label] that [matches] a
[link reference definition] elsewhere in the
document, followed by the string `[]`.
The contents of the link label are parsed as inlines,
which are used as the link's text. The link's URI and title are
provided by the matching [link reference definition]. Thus,
`[foo][]` is equivalent to `[foo][foo]`.

```````````````````````````````` example
[foo][]

[foo]: /url "title"
.
<p><a href="/url" title="title">foo</a></p>
````````````````````````````````


```````````````````````````````` example
[*foo* bar][]

[*foo* bar]: /url "title"
.
<p><a href="/url" title="title"><em>foo</em> bar</a></p>
````````````````````````````````


The link labels are case-insensitive:

```````````````````````````````` example
[Foo][]

[foo]: /url "title"
.
<p><a href="/url" title="title">Foo</a></p>
````````````````````````````````



As with full reference links, [whitespace] is not
allowed between the two sets of brackets:

```````````````````````````````` example
[foo]
[]

[foo]: /url "title"
.
<p><a href="/url" title="title">foo</a>
[]</p>
````````````````````````````````


A [shortcut reference link](@)
consists of a [link label] that [matches] a
[link reference definition] elsewhere in the
document and is not followed by `[]` or a link label.
The contents of the link label are parsed as inlines,
which are used as the link's text. The link's URI and title
are provided by the matching [link reference definition].
Thus, `[foo]` is equivalent to `[foo][]`.
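As a rough illustration (not taken from the reference implementations), label matching and the three reference forms can be sketched in Python. The names `normalize_label`, `build_refmap`, and `resolve_reference` are hypothetical, and the whitespace set is an assumption: the ASCII characters space, tab, newline, line tabulation, form feed, and carriage return. The sketch treats the leading bracketed text as a plain label, which ignores the extra freedom that [link text] has in a full reference link.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Assumed whitespace set: space, tab, newline, line tabulation, form feed, CR.
_WS_CHARS = " \t\n\v\f\r"
_WS_RUN = re.compile("[" + _WS_CHARS + "]+")


def normalize_label(label: str) -> str:
    """Normalize a bracketed label such as "[Foo  bar]" for matching."""
    inner = label[1:-1]              # strip the opening and closing brackets
    inner = inner.casefold()         # Unicode case fold
    inner = inner.strip(_WS_CHARS)   # strip leading and trailing whitespace
    return _WS_RUN.sub(" ", inner)   # collapse internal whitespace runs


def build_refmap(definitions: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map normalized labels to destinations; the first definition wins."""
    refmap: Dict[str, str] = {}
    for label, destination in definitions:
        refmap.setdefault(normalize_label(label), destination)
    return refmap


def resolve_reference(first: str, second: Optional[str],
                      refmap: Dict[str, str]) -> Optional[str]:
    """Look up the destination for a full, collapsed, or shortcut reference.

    `first` is the leading bracketed text. `second` is what follows it:
    a label for a full reference, "[]" for a collapsed one, None for a
    shortcut. Returns None when no definition matches.
    """
    key = second if second not in (None, "[]") else first
    return refmap.get(normalize_label(key))
```

Because normalization works on the raw label string rather than on parsed inlines, `[foo\!]` and `[foo!]` normalize to different keys in this sketch, which is consistent with the matching rule stated earlier.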
8303 8304 ```````````````````````````````` example 8305 [foo] 8306 8307 [foo]: /url "title" 8308 . 8309 <p><a href="/url" title="title">foo</a></p> 8310 ```````````````````````````````` 8311 8312 8313 ```````````````````````````````` example 8314 [*foo* bar] 8315 8316 [*foo* bar]: /url "title" 8317 . 8318 <p><a href="/url" title="title"><em>foo</em> bar</a></p> 8319 ```````````````````````````````` 8320 8321 8322 ```````````````````````````````` example 8323 [[*foo* bar]] 8324 8325 [*foo* bar]: /url "title" 8326 . 8327 <p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p> 8328 ```````````````````````````````` 8329 8330 8331 ```````````````````````````````` example 8332 [[bar [foo] 8333 8334 [foo]: /url 8335 . 8336 <p>[[bar <a href="/url">foo</a></p> 8337 ```````````````````````````````` 8338 8339 8340 The link labels are case-insensitive: 8341 8342 ```````````````````````````````` example 8343 [Foo] 8344 8345 [foo]: /url "title" 8346 . 8347 <p><a href="/url" title="title">Foo</a></p> 8348 ```````````````````````````````` 8349 8350 8351 A space after the link text should be preserved: 8352 8353 ```````````````````````````````` example 8354 [foo] bar 8355 8356 [foo]: /url 8357 . 8358 <p><a href="/url">foo</a> bar</p> 8359 ```````````````````````````````` 8360 8361 8362 If you just want bracketed text, you can backslash-escape the 8363 opening bracket to avoid links: 8364 8365 ```````````````````````````````` example 8366 \[foo] 8367 8368 [foo]: /url "title" 8369 . 8370 <p>[foo]</p> 8371 ```````````````````````````````` 8372 8373 8374 Note that this is a link, because a link label ends with the first 8375 following closing bracket: 8376 8377 ```````````````````````````````` example 8378 [foo*]: /url 8379 8380 *[foo*] 8381 . 
<p>*<a href="/url">foo*</a></p>
````````````````````````````````


Full and collapsed references take precedence over shortcut
references:

```````````````````````````````` example
[foo][bar]

[foo]: /url1
[bar]: /url2
.
<p><a href="/url2">foo</a></p>
````````````````````````````````

```````````````````````````````` example
[foo][]

[foo]: /url1
.
<p><a href="/url1">foo</a></p>
````````````````````````````````

Inline links also take precedence:

```````````````````````````````` example
[foo]()

[foo]: /url1
.
<p><a href="">foo</a></p>
````````````````````````````````

```````````````````````````````` example
[foo](not a link)

[foo]: /url1
.
<p><a href="/url1">foo</a>(not a link)</p>
````````````````````````````````

In the following case `[bar][baz]` is parsed as a reference,
`[foo]` as normal text:

```````````````````````````````` example
[foo][bar][baz]

[baz]: /url
.
<p>[foo]<a href="/url">bar</a></p>
````````````````````````````````


Here, though, `[foo][bar]` is parsed as a reference, since
`[bar]` is defined:

```````````````````````````````` example
[foo][bar][baz]

[baz]: /url1
[bar]: /url2
.
<p><a href="/url2">foo</a><a href="/url1">baz</a></p>
````````````````````````````````


Here `[foo]` is not parsed as a shortcut reference, because it
is followed by a link label (even though `[bar]` is not defined):

```````````````````````````````` example
[foo][bar][baz]

[baz]: /url1
[foo]: /url2
.
<p>[foo]<a href="/url1">bar</a></p>
````````````````````````````````



## Images

Syntax for images is like the syntax for links, with one
difference.
Instead of [link text], we have an 8467 [image description](@). The rules for this are the 8468 same as for [link text], except that (a) an 8469 image description starts with `![` rather than `[`, and 8470 (b) an image description may contain links. 8471 An image description has inline elements 8472 as its contents. When an image is rendered to HTML, 8473 this is standardly used as the image's `alt` attribute. 8474 8475 ```````````````````````````````` example 8476 ![foo](/url "title") 8477 . 8478 <p><img src="/url" alt="foo" title="title" /></p> 8479 ```````````````````````````````` 8480 8481 8482 ```````````````````````````````` example 8483 ![foo *bar*] 8484 8485 [foo *bar*]: train.jpg "train & tracks" 8486 . 8487 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p> 8488 ```````````````````````````````` 8489 8490 8491 ```````````````````````````````` example 8492 ![foo ![bar](/url)](/url2) 8493 . 8494 <p><img src="/url2" alt="foo bar" /></p> 8495 ```````````````````````````````` 8496 8497 8498 ```````````````````````````````` example 8499 ![foo [bar](/url)](/url2) 8500 . 8501 <p><img src="/url2" alt="foo bar" /></p> 8502 ```````````````````````````````` 8503 8504 8505 Though this spec is concerned with parsing, not rendering, it is 8506 recommended that in rendering to HTML, only the plain string content 8507 of the [image description] be used. Note that in 8508 the above example, the alt attribute's value is `foo bar`, not `foo 8509 [bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string 8510 content is rendered, without formatting. 8511 8512 ```````````````````````````````` example 8513 ![foo *bar*][] 8514 8515 [foo *bar*]: train.jpg "train & tracks" 8516 . 8517 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p> 8518 ```````````````````````````````` 8519 8520 8521 ```````````````````````````````` example 8522 ![foo *bar*][foobar] 8523 8524 [FOOBAR]: train.jpg "train & tracks" 8525 . 
8526 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p> 8527 ```````````````````````````````` 8528 8529 8530 ```````````````````````````````` example 8531 ![foo](train.jpg) 8532 . 8533 <p><img src="train.jpg" alt="foo" /></p> 8534 ```````````````````````````````` 8535 8536 8537 ```````````````````````````````` example 8538 My ![foo bar](/path/to/train.jpg "title" ) 8539 . 8540 <p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p> 8541 ```````````````````````````````` 8542 8543 8544 ```````````````````````````````` example 8545 ![foo](<url>) 8546 . 8547 <p><img src="url" alt="foo" /></p> 8548 ```````````````````````````````` 8549 8550 8551 ```````````````````````````````` example 8552 ![](/url) 8553 . 8554 <p><img src="/url" alt="" /></p> 8555 ```````````````````````````````` 8556 8557 8558 Reference-style: 8559 8560 ```````````````````````````````` example 8561 ![foo][bar] 8562 8563 [bar]: /url 8564 . 8565 <p><img src="/url" alt="foo" /></p> 8566 ```````````````````````````````` 8567 8568 8569 ```````````````````````````````` example 8570 ![foo][bar] 8571 8572 [BAR]: /url 8573 . 8574 <p><img src="/url" alt="foo" /></p> 8575 ```````````````````````````````` 8576 8577 8578 Collapsed: 8579 8580 ```````````````````````````````` example 8581 ![foo][] 8582 8583 [foo]: /url "title" 8584 . 8585 <p><img src="/url" alt="foo" title="title" /></p> 8586 ```````````````````````````````` 8587 8588 8589 ```````````````````````````````` example 8590 ![*foo* bar][] 8591 8592 [*foo* bar]: /url "title" 8593 . 8594 <p><img src="/url" alt="foo bar" title="title" /></p> 8595 ```````````````````````````````` 8596 8597 8598 The labels are case-insensitive: 8599 8600 ```````````````````````````````` example 8601 ![Foo][] 8602 8603 [foo]: /url "title" 8604 . 
8605 <p><img src="/url" alt="Foo" title="title" /></p> 8606 ```````````````````````````````` 8607 8608 8609 As with reference links, [whitespace] is not allowed 8610 between the two sets of brackets: 8611 8612 ```````````````````````````````` example 8613 ![foo] 8614 [] 8615 8616 [foo]: /url "title" 8617 . 8618 <p><img src="/url" alt="foo" title="title" /> 8619 []</p> 8620 ```````````````````````````````` 8621 8622 8623 Shortcut: 8624 8625 ```````````````````````````````` example 8626 ![foo] 8627 8628 [foo]: /url "title" 8629 . 8630 <p><img src="/url" alt="foo" title="title" /></p> 8631 ```````````````````````````````` 8632 8633 8634 ```````````````````````````````` example 8635 ![*foo* bar] 8636 8637 [*foo* bar]: /url "title" 8638 . 8639 <p><img src="/url" alt="foo bar" title="title" /></p> 8640 ```````````````````````````````` 8641 8642 8643 Note that link labels cannot contain unescaped brackets: 8644 8645 ```````````````````````````````` example 8646 ![[foo]] 8647 8648 [[foo]]: /url "title" 8649 . 8650 <p>![[foo]]</p> 8651 <p>[[foo]]: /url "title"</p> 8652 ```````````````````````````````` 8653 8654 8655 The link labels are case-insensitive: 8656 8657 ```````````````````````````````` example 8658 ![Foo] 8659 8660 [foo]: /url "title" 8661 . 8662 <p><img src="/url" alt="Foo" title="title" /></p> 8663 ```````````````````````````````` 8664 8665 8666 If you just want a literal `!` followed by bracketed text, you can 8667 backslash-escape the opening `[`: 8668 8669 ```````````````````````````````` example 8670 !\[foo] 8671 8672 [foo]: /url "title" 8673 . 8674 <p>![foo]</p> 8675 ```````````````````````````````` 8676 8677 8678 If you want a link after a literal `!`, backslash-escape the 8679 `!`: 8680 8681 ```````````````````````````````` example 8682 \![foo] 8683 8684 [foo]: /url "title" 8685 . 
8686 <p>!<a href="/url" title="title">foo</a></p> 8687 ```````````````````````````````` 8688 8689 8690 ## Autolinks 8691 8692 [Autolink](@)s are absolute URIs and email addresses inside 8693 `<` and `>`. They are parsed as links, with the URL or email address 8694 as the link label. 8695 8696 A [URI autolink](@) consists of `<`, followed by an 8697 [absolute URI] followed by `>`. It is parsed as 8698 a link to the URI, with the URI as the link's label. 8699 8700 An [absolute URI](@), 8701 for these purposes, consists of a [scheme] followed by a colon (`:`) 8702 followed by zero or more characters other than ASCII 8703 [whitespace] and control characters, `<`, and `>`. If 8704 the URI includes these characters, they must be percent-encoded 8705 (e.g. `%20` for a space). 8706 8707 For purposes of this spec, a [scheme](@) is any sequence 8708 of 2--32 characters beginning with an ASCII letter and followed 8709 by any combination of ASCII letters, digits, or the symbols plus 8710 ("+"), period ("."), or hyphen ("-"). 8711 8712 Here are some valid autolinks: 8713 8714 ```````````````````````````````` example 8715 <http://foo.bar.baz> 8716 . 8717 <p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p> 8718 ```````````````````````````````` 8719 8720 8721 ```````````````````````````````` example 8722 <http://foo.bar.baz/test?q=hello&id=22&boolean> 8723 . 8724 <p><a href="http://foo.bar.baz/test?q=hello&id=22&boolean">http://foo.bar.baz/test?q=hello&id=22&boolean</a></p> 8725 ```````````````````````````````` 8726 8727 8728 ```````````````````````````````` example 8729 <irc://foo.bar:2233/baz> 8730 . 8731 <p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p> 8732 ```````````````````````````````` 8733 8734 8735 Uppercase is also fine: 8736 8737 ```````````````````````````````` example 8738 <MAILTO:FOO@BAR.BAZ> 8739 . 
8740 <p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p> 8741 ```````````````````````````````` 8742 8743 8744 Note that many strings that count as [absolute URIs] for 8745 purposes of this spec are not valid URIs, because their 8746 schemes are not registered or because of other problems 8747 with their syntax: 8748 8749 ```````````````````````````````` example 8750 <a+b+c:d> 8751 . 8752 <p><a href="a+b+c:d">a+b+c:d</a></p> 8753 ```````````````````````````````` 8754 8755 8756 ```````````````````````````````` example 8757 <made-up-scheme://foo,bar> 8758 . 8759 <p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p> 8760 ```````````````````````````````` 8761 8762 8763 ```````````````````````````````` example 8764 <http://../> 8765 . 8766 <p><a href="http://../">http://../</a></p> 8767 ```````````````````````````````` 8768 8769 8770 ```````````````````````````````` example 8771 <localhost:5001/foo> 8772 . 8773 <p><a href="localhost:5001/foo">localhost:5001/foo</a></p> 8774 ```````````````````````````````` 8775 8776 8777 Spaces are not allowed in autolinks: 8778 8779 ```````````````````````````````` example 8780 <http://foo.bar/baz bim> 8781 . 8782 <p><http://foo.bar/baz bim></p> 8783 ```````````````````````````````` 8784 8785 8786 Backslash-escapes do not work inside autolinks: 8787 8788 ```````````````````````````````` example 8789 <http://example.com/\[\> 8790 . 8791 <p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p> 8792 ```````````````````````````````` 8793 8794 8795 An [email autolink](@) 8796 consists of `<`, followed by an [email address], 8797 followed by `>`. The link's label is the email address, 8798 and the URL is `mailto:` followed by the email address. 
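The URI autolink rules above (a scheme, a colon, then any characters other than whitespace, control characters, `<`, and `>`) can be approximated with a single regular expression. This is only a sketch: the function name `parse_uri_autolink` is hypothetical, and "control characters" is assumed here to mean the ASCII range U+0000 to U+001F plus U+007F.

```python
import re
from typing import Optional

# Scheme: 2-32 characters, an ASCII letter followed by letters, digits,
# "+", ".", or "-".
_SCHEME = r"[A-Za-z][A-Za-z0-9+.\-]{1,31}"

# After the colon: zero or more characters that are not ASCII whitespace,
# ASCII control characters (assumed range), "<", or ">".
_URI_AUTOLINK = re.compile(r"<(" + _SCHEME + r":[^\x00-\x20\x7f<>]*)>")


def parse_uri_autolink(text: str) -> Optional[str]:
    """Return the URI if `text` is exactly one URI autolink, else None."""
    match = _URI_AUTOLINK.fullmatch(text)
    return match.group(1) if match else None
```

Under these assumptions `<localhost:5001/foo>` is accepted (its "scheme" is `localhost`), while `<m:abc>` is rejected because the scheme has only one character. No backslash handling is needed, since backslash-escapes do not work inside autolinks.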
8799 8800 An [email address](@), 8801 for these purposes, is anything that matches 8802 the [non-normative regex from the HTML5 8803 spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)): 8804 8805 /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? 8806 (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/ 8807 8808 Examples of email autolinks: 8809 8810 ```````````````````````````````` example 8811 <foo@bar.example.com> 8812 . 8813 <p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p> 8814 ```````````````````````````````` 8815 8816 8817 ```````````````````````````````` example 8818 <foo+special@Bar.baz-bar0.com> 8819 . 8820 <p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p> 8821 ```````````````````````````````` 8822 8823 8824 Backslash-escapes do not work inside email autolinks: 8825 8826 ```````````````````````````````` example 8827 <foo\+@bar.example.com> 8828 . 8829 <p><foo+@bar.example.com></p> 8830 ```````````````````````````````` 8831 8832 8833 These are not autolinks: 8834 8835 ```````````````````````````````` example 8836 <> 8837 . 8838 <p><></p> 8839 ```````````````````````````````` 8840 8841 8842 ```````````````````````````````` example 8843 < http://foo.bar > 8844 . 8845 <p>< http://foo.bar ></p> 8846 ```````````````````````````````` 8847 8848 8849 ```````````````````````````````` example 8850 <m:abc> 8851 . 8852 <p><m:abc></p> 8853 ```````````````````````````````` 8854 8855 8856 ```````````````````````````````` example 8857 <foo.bar.baz> 8858 . 8859 <p><foo.bar.baz></p> 8860 ```````````````````````````````` 8861 8862 8863 ```````````````````````````````` example 8864 http://example.com 8865 . 8866 <p>http://example.com</p> 8867 ```````````````````````````````` 8868 8869 8870 ```````````````````````````````` example 8871 foo@bar.example.com 8872 . 
<p>foo@bar.example.com</p>
````````````````````````````````


## Raw HTML

Text between `<` and `>` that looks like an HTML tag is parsed as a
raw HTML tag and will be rendered in HTML without escaping.
Tag and attribute names are not limited to current HTML tags,
so custom tags (and even, say, DocBook tags) may be used.

Here is the grammar for tags:

A [tag name](@) consists of an ASCII letter
followed by zero or more ASCII letters, digits, or
hyphens (`-`).

An [attribute](@) consists of [whitespace],
an [attribute name], and an optional
[attribute value specification].

An [attribute name](@)
consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML
specification restricted to ASCII. HTML5 is laxer.)

An [attribute value specification](@)
consists of optional [whitespace],
an `=` character, optional [whitespace], and an [attribute
value].

An [attribute value](@)
consists of an [unquoted attribute value],
a [single-quoted attribute value], or a [double-quoted attribute value].

An [unquoted attribute value](@)
is a nonempty string of characters not
including [whitespace], `"`, `'`, `=`, `<`, `>`, or `` ` ``.

A [single-quoted attribute value](@)
consists of `'`, zero or more
characters not including `'`, and a final `'`.

A [double-quoted attribute value](@)
consists of `"`, zero or more
characters not including `"`, and a final `"`.

An [open tag](@) consists of a `<` character, a [tag name],
zero or more [attributes], optional [whitespace], an optional `/`
character, and a `>` character.

A [closing tag](@) consists of the string `</`, a
[tag name], optional [whitespace], and the character `>`.
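The open-tag and closing-tag grammar above translates fairly directly into regular expressions, one fragment per definition. The following Python sketch is illustrative only: the constant names are hypothetical, and the whitespace set is assumed to be space, tab, newline, line tabulation, form feed, and carriage return.

```python
import re

_WS = r"[ \t\n\v\f\r]"                       # assumed whitespace set
_TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"
_ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
_UNQUOTED = r'''[^ \t\n\v\f\r"'=<>`]+'''     # nonempty, no whitespace or " ' = < > `
_SINGLE_QUOTED = r"'[^']*'"
_DOUBLE_QUOTED = r'"[^"]*"'
_ATTR_VALUE = "(?:" + _UNQUOTED + "|" + _SINGLE_QUOTED + "|" + _DOUBLE_QUOTED + ")"
# optional whitespace, "=", optional whitespace, attribute value
_ATTR_VALUE_SPEC = "(?:" + _WS + "*=" + _WS + "*" + _ATTR_VALUE + ")"
# whitespace, attribute name, optional attribute value specification
_ATTRIBUTE = "(?:" + _WS + "+" + _ATTR_NAME + _ATTR_VALUE_SPEC + "?)"

# "<", tag name, zero or more attributes, optional whitespace, optional "/", ">"
OPEN_TAG = re.compile("<" + _TAG_NAME + _ATTRIBUTE + "*" + _WS + "*/?>")
# "</", tag name, optional whitespace, ">"
CLOSING_TAG = re.compile("</" + _TAG_NAME + _WS + "*>")
```

With `fullmatch`, `OPEN_TAG` accepts strings such as `<b2/>` and `<responsive-image src="foo.jpg" />`, and rejects `<33>`, `<a h*#ref="hi">`, and `<bar/ >`. `CLOSING_TAG` accepts `</foo >` but not `</a href="foo">`.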
8926 8927 An [HTML comment](@) consists of `<!--` + *text* + `-->`, 8928 where *text* does not start with `>` or `->`, does not end with `-`, 8929 and does not contain `--`. (See the 8930 [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).) 8931 8932 A [processing instruction](@) 8933 consists of the string `<?`, a string 8934 of characters not including the string `?>`, and the string 8935 `?>`. 8936 8937 A [declaration](@) consists of the 8938 string `<!`, a name consisting of one or more uppercase ASCII letters, 8939 [whitespace], a string of characters not including the 8940 character `>`, and the character `>`. 8941 8942 A [CDATA section](@) consists of 8943 the string `<![CDATA[`, a string of characters not including the string 8944 `]]>`, and the string `]]>`. 8945 8946 An [HTML tag](@) consists of an [open tag], a [closing tag], 8947 an [HTML comment], a [processing instruction], a [declaration], 8948 or a [CDATA section]. 8949 8950 Here are some simple open tags: 8951 8952 ```````````````````````````````` example 8953 <a><bab><c2c> 8954 . 8955 <p><a><bab><c2c></p> 8956 ```````````````````````````````` 8957 8958 8959 Empty elements: 8960 8961 ```````````````````````````````` example 8962 <a/><b2/> 8963 . 8964 <p><a/><b2/></p> 8965 ```````````````````````````````` 8966 8967 8968 [Whitespace] is allowed: 8969 8970 ```````````````````````````````` example 8971 <a /><b2 8972 data="foo" > 8973 . 8974 <p><a /><b2 8975 data="foo" ></p> 8976 ```````````````````````````````` 8977 8978 8979 With attributes: 8980 8981 ```````````````````````````````` example 8982 <a foo="bar" bam = 'baz <em>"</em>' 8983 _boolean zoop:33=zoop:33 /> 8984 . 8985 <p><a foo="bar" bam = 'baz <em>"</em>' 8986 _boolean zoop:33=zoop:33 /></p> 8987 ```````````````````````````````` 8988 8989 8990 Custom tag names can be used: 8991 8992 ```````````````````````````````` example 8993 Foo <responsive-image src="foo.jpg" /> 8994 . 
8995 <p>Foo <responsive-image src="foo.jpg" /></p> 8996 ```````````````````````````````` 8997 8998 8999 Illegal tag names, not parsed as HTML: 9000 9001 ```````````````````````````````` example 9002 <33> <__> 9003 . 9004 <p><33> <__></p> 9005 ```````````````````````````````` 9006 9007 9008 Illegal attribute names: 9009 9010 ```````````````````````````````` example 9011 <a h*#ref="hi"> 9012 . 9013 <p><a h*#ref="hi"></p> 9014 ```````````````````````````````` 9015 9016 9017 Illegal attribute values: 9018 9019 ```````````````````````````````` example 9020 <a href="hi'> <a href=hi'> 9021 . 9022 <p><a href="hi'> <a href=hi'></p> 9023 ```````````````````````````````` 9024 9025 9026 Illegal [whitespace]: 9027 9028 ```````````````````````````````` example 9029 < a>< 9030 foo><bar/ > 9031 <foo bar=baz 9032 bim!bop /> 9033 . 9034 <p>< a>< 9035 foo><bar/ > 9036 <foo bar=baz 9037 bim!bop /></p> 9038 ```````````````````````````````` 9039 9040 9041 Missing [whitespace]: 9042 9043 ```````````````````````````````` example 9044 <a href='bar'title=title> 9045 . 9046 <p><a href='bar'title=title></p> 9047 ```````````````````````````````` 9048 9049 9050 Closing tags: 9051 9052 ```````````````````````````````` example 9053 </a></foo > 9054 . 9055 <p></a></foo ></p> 9056 ```````````````````````````````` 9057 9058 9059 Illegal attributes in closing tag: 9060 9061 ```````````````````````````````` example 9062 </a href="foo"> 9063 . 9064 <p></a href="foo"></p> 9065 ```````````````````````````````` 9066 9067 9068 Comments: 9069 9070 ```````````````````````````````` example 9071 foo <!-- this is a 9072 comment - with hyphen --> 9073 . 9074 <p>foo <!-- this is a 9075 comment - with hyphen --></p> 9076 ```````````````````````````````` 9077 9078 9079 ```````````````````````````````` example 9080 foo <!-- not a comment -- two hyphens --> 9081 . 
9082 <p>foo <!-- not a comment -- two hyphens --></p> 9083 ```````````````````````````````` 9084 9085 9086 Not comments: 9087 9088 ```````````````````````````````` example 9089 foo <!--> foo --> 9090 9091 foo <!-- foo---> 9092 . 9093 <p>foo <!--> foo --></p> 9094 <p>foo <!-- foo---></p> 9095 ```````````````````````````````` 9096 9097 9098 Processing instructions: 9099 9100 ```````````````````````````````` example 9101 foo <?php echo $a; ?> 9102 . 9103 <p>foo <?php echo $a; ?></p> 9104 ```````````````````````````````` 9105 9106 9107 Declarations: 9108 9109 ```````````````````````````````` example 9110 foo <!ELEMENT br EMPTY> 9111 . 9112 <p>foo <!ELEMENT br EMPTY></p> 9113 ```````````````````````````````` 9114 9115 9116 CDATA sections: 9117 9118 ```````````````````````````````` example 9119 foo <![CDATA[>&<]]> 9120 . 9121 <p>foo <![CDATA[>&<]]></p> 9122 ```````````````````````````````` 9123 9124 9125 Entity and numeric character references are preserved in HTML 9126 attributes: 9127 9128 ```````````````````````````````` example 9129 foo <a href="ö"> 9130 . 9131 <p>foo <a href="ö"></p> 9132 ```````````````````````````````` 9133 9134 9135 Backslash escapes do not work in HTML attributes: 9136 9137 ```````````````````````````````` example 9138 foo <a href="\*"> 9139 . 9140 <p>foo <a href="\*"></p> 9141 ```````````````````````````````` 9142 9143 9144 ```````````````````````````````` example 9145 <a href="\""> 9146 . 9147 <p><a href="""></p> 9148 ```````````````````````````````` 9149 9150 9151 ## Hard line breaks 9152 9153 A line break (not in a code span or HTML tag) that is preceded 9154 by two or more spaces and does not occur at the end of a block 9155 is parsed as a [hard line break](@) (rendered 9156 in HTML as a `<br />` tag): 9157 9158 ```````````````````````````````` example 9159 foo 9160 baz 9161 . 
9162 <p>foo<br /> 9163 baz</p> 9164 ```````````````````````````````` 9165 9166 9167 For a more visible alternative, a backslash before the 9168 [line ending] may be used instead of two spaces: 9169 9170 ```````````````````````````````` example 9171 foo\ 9172 baz 9173 . 9174 <p>foo<br /> 9175 baz</p> 9176 ```````````````````````````````` 9177 9178 9179 More than two spaces can be used: 9180 9181 ```````````````````````````````` example 9182 foo 9183 baz 9184 . 9185 <p>foo<br /> 9186 baz</p> 9187 ```````````````````````````````` 9188 9189 9190 Leading spaces at the beginning of the next line are ignored: 9191 9192 ```````````````````````````````` example 9193 foo 9194 bar 9195 . 9196 <p>foo<br /> 9197 bar</p> 9198 ```````````````````````````````` 9199 9200 9201 ```````````````````````````````` example 9202 foo\ 9203 bar 9204 . 9205 <p>foo<br /> 9206 bar</p> 9207 ```````````````````````````````` 9208 9209 9210 Line breaks can occur inside emphasis, links, and other constructs 9211 that allow inline content: 9212 9213 ```````````````````````````````` example 9214 *foo 9215 bar* 9216 . 9217 <p><em>foo<br /> 9218 bar</em></p> 9219 ```````````````````````````````` 9220 9221 9222 ```````````````````````````````` example 9223 *foo\ 9224 bar* 9225 . 9226 <p><em>foo<br /> 9227 bar</em></p> 9228 ```````````````````````````````` 9229 9230 9231 Line breaks do not occur inside code spans 9232 9233 ```````````````````````````````` example 9234 `code 9235 span` 9236 . 9237 <p><code>code span</code></p> 9238 ```````````````````````````````` 9239 9240 9241 ```````````````````````````````` example 9242 `code\ 9243 span` 9244 . 9245 <p><code>code\ span</code></p> 9246 ```````````````````````````````` 9247 9248 9249 or HTML tags: 9250 9251 ```````````````````````````````` example 9252 <a href="foo 9253 bar"> 9254 . 9255 <p><a href="foo 9256 bar"></p> 9257 ```````````````````````````````` 9258 9259 9260 ```````````````````````````````` example 9261 <a href="foo\ 9262 bar"> 9263 . 
9264 <p><a href="foo\ 9265 bar"></p> 9266 ```````````````````````````````` 9267 9268 9269 Hard line breaks are for separating inline content within a block. 9270 Neither syntax for hard line breaks works at the end of a paragraph or 9271 other block element: 9272 9273 ```````````````````````````````` example 9274 foo\ 9275 . 9276 <p>foo\</p> 9277 ```````````````````````````````` 9278 9279 9280 ```````````````````````````````` example 9281 foo 9282 . 9283 <p>foo</p> 9284 ```````````````````````````````` 9285 9286 9287 ```````````````````````````````` example 9288 ### foo\ 9289 . 9290 <h3>foo\</h3> 9291 ```````````````````````````````` 9292 9293 9294 ```````````````````````````````` example 9295 ### foo 9296 . 9297 <h3>foo</h3> 9298 ```````````````````````````````` 9299 9300 9301 ## Soft line breaks 9302 9303 A regular line break (not in a code span or HTML tag) that is not 9304 preceded by two or more spaces or a backslash is parsed as a 9305 [softbreak](@). (A softbreak may be rendered in HTML either as a 9306 [line ending] or as a space. The result will be the same in 9307 browsers. In the examples here, a [line ending] will be used.) 9308 9309 ```````````````````````````````` example 9310 foo 9311 baz 9312 . 9313 <p>foo 9314 baz</p> 9315 ```````````````````````````````` 9316 9317 9318 Spaces at the end of the line and beginning of the next line are 9319 removed: 9320 9321 ```````````````````````````````` example 9322 foo 9323 baz 9324 . 9325 <p>foo 9326 baz</p> 9327 ```````````````````````````````` 9328 9329 9330 A conforming parser may render a soft line break in HTML either as a 9331 line break or as a space. 9332 9333 A renderer may also provide an option to render soft line breaks 9334 as hard line breaks. 9335 9336 ## Textual content 9337 9338 Any characters not given an interpretation by the above rules will 9339 be parsed as plain textual content. 9340 9341 ```````````````````````````````` example 9342 hello $.;'there 9343 . 
9344 <p>hello $.;'there</p> 9345 ```````````````````````````````` 9346 9347 9348 ```````````````````````````````` example 9349 Foo χρῆν 9350 . 9351 <p>Foo χρῆν</p> 9352 ```````````````````````````````` 9353 9354 9355 Internal spaces are preserved verbatim: 9356 9357 ```````````````````````````````` example 9358 Multiple spaces 9359 . 9360 <p>Multiple spaces</p> 9361 ```````````````````````````````` 9362 9363 9364 <!-- END TESTS --> 9365 9366 # Appendix: A parsing strategy 9367 9368 In this appendix we describe some features of the parsing strategy 9369 used in the CommonMark reference implementations. 9370 9371 ## Overview 9372 9373 Parsing has two phases: 9374 9375 1. In the first phase, lines of input are consumed and the block 9376 structure of the document---its division into paragraphs, block quotes, 9377 list items, and so on---is constructed. Text is assigned to these 9378 blocks but not parsed. Link reference definitions are parsed and a 9379 map of links is constructed. 9380 9381 2. In the second phase, the raw text contents of paragraphs and headings 9382 are parsed into sequences of Markdown inline elements (strings, 9383 code spans, links, emphasis, and so on), using the map of link 9384 references constructed in phase 1. 9385 9386 At each point in processing, the document is represented as a tree of 9387 **blocks**. The root of the tree is a `document` block. The `document` 9388 may have any number of other blocks as **children**. These children 9389 may, in turn, have other blocks as children. The last child of a block 9390 is normally considered **open**, meaning that subsequent lines of input 9391 can alter its contents. (Blocks that are not open are **closed**.) 9392 Here, for example, is a possible document tree, with the open blocks 9393 marked by arrows: 9394 9395 ``` tree 9396 -> document 9397 -> block_quote 9398 paragraph 9399 "Lorem ipsum dolor\nsit amet." 
9400 -> list (type=bullet tight=true bullet_char=-) 9401 list_item 9402 paragraph 9403 "Qui *quodsi iracundia*" 9404 -> list_item 9405 -> paragraph 9406 "aliquando id" 9407 ``` 9408 9409 ## Phase 1: block structure 9410 9411 Each line that is processed has an effect on this tree. The line is 9412 analyzed and, depending on its contents, the document may be altered 9413 in one or more of the following ways: 9414 9415 1. One or more open blocks may be closed. 9416 2. One or more new blocks may be created as children of the 9417 last open block. 9418 3. Text may be added to the last (deepest) open block remaining 9419 on the tree. 9420 9421 Once a line has been incorporated into the tree in this way, 9422 it can be discarded, so input can be read in a stream. 9423 9424 For each line, we follow this procedure: 9425 9426 1. First we iterate through the open blocks, starting with the 9427 root document, and descending through last children down to the last 9428 open block. Each block imposes a condition that the line must satisfy 9429 if the block is to remain open. For example, a block quote requires a 9430 `>` character. A paragraph requires a non-blank line. 9431 In this phase we may match all or just some of the open 9432 blocks. But we cannot close unmatched blocks yet, because we may have a 9433 [lazy continuation line]. 9434 9435 2. Next, after consuming the continuation markers for existing 9436 blocks, we look for new block starts (e.g. `>` for a block quote). 9437 If we encounter a new block start, we close any blocks unmatched 9438 in step 1 before creating the new block as a child of the last 9439 matched block. 9440 9441 3. Finally, we look at the remainder of the line (after block 9442 markers like `>`, list markers, and indentation have been consumed). 9443 This is text that can be incorporated into the last open 9444 block (a paragraph, code block, heading, or raw HTML). 
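The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the reference implementation: it handles just block quotes and paragraphs, it accepts any number of leading spaces before `>`, and the names `Block`, `process_line`, `strip_quote_marker`, and `parse_blocks` are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    kind: str                                   # "document", "block_quote", "paragraph"
    children: list = field(default_factory=list)
    lines: list = field(default_factory=list)   # raw text (paragraphs only)
    open: bool = True


def last_open_child(block):
    """Return the last child if it is still open, else None."""
    if block.children and block.children[-1].open:
        return block.children[-1]
    return None


def close(block):
    """Close a block together with its open descendants."""
    while block is not None:
        block.open = False
        block = last_open_child(block)


def strip_quote_marker(text):
    """Return the text after a `>` marker, or None if there is no marker."""
    stripped = text.lstrip(" ")
    if not stripped.startswith(">"):
        return None
    stripped = stripped[1:]
    return stripped[1:] if stripped.startswith(" ") else stripped


def process_line(doc, line):
    # Step 1: descend through open containers, consuming continuation markers.
    container, rest = doc, line
    while True:
        child = last_open_child(container)
        if child is None or child.kind != "block_quote":
            break
        after = strip_quote_marker(rest)
        if after is None:
            break                       # unmatched, but not closed yet
        container, rest = child, after
    unmatched = last_open_child(container)

    # Step 2: a new block start closes the unmatched blocks first.
    while (after := strip_quote_marker(rest)) is not None:
        if unmatched is not None:
            close(unmatched)
            unmatched = None
        quote = Block("block_quote")
        container.children.append(quote)
        container, rest = quote, after

    # Step 3: add the remaining text to the deepest open block.
    tip = doc
    while (child := last_open_child(tip)) is not None:
        tip = child
    if not rest.strip():                # blank line ends what was not matched
        if unmatched is not None:
            close(unmatched)
    elif tip.kind == "paragraph":       # ordinary or lazy continuation
        tip.lines.append(rest.strip())
    else:
        if unmatched is not None:
            close(unmatched)
        container.children.append(Block("paragraph", lines=[rest.strip()]))


def parse_blocks(lines):
    doc = Block("document")
    for line in lines:
        process_line(doc, line)
    close(doc)                          # end of input closes all open blocks
    return doc
```

Feeding it `["> Lorem ipsum dolor", "sit amet.", "", "plain"]` yields a block quote whose paragraph holds both of the first two lines (the second is a lazy continuation), followed by a separate top-level paragraph. Unmatched blocks are closed only in step 2 or step 3, which is what lets the lazy continuation line through.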
9445 9446 Setext headings are formed when we see a line of a paragraph 9447 that is a [setext heading underline]. 9448 9449 Link reference definitions are detected when a paragraph is closed; 9450 the accumulated text lines are parsed to see if they begin with 9451 one or more link reference definitions. Any remainder becomes a 9452 normal paragraph. 9453 9454 We can see how this works by considering how the tree above is 9455 generated by four lines of Markdown: 9456 9457 ``` markdown 9458 > Lorem ipsum dolor 9459 sit amet. 9460 > - Qui *quodsi iracundia* 9461 > - aliquando id 9462 ``` 9463 9464 At the outset, our document model is just 9465 9466 ``` tree 9467 -> document 9468 ``` 9469 9470 The first line of our text, 9471 9472 ``` markdown 9473 > Lorem ipsum dolor 9474 ``` 9475 9476 causes a `block_quote` block to be created as a child of our 9477 open `document` block, and a `paragraph` block as a child of 9478 the `block_quote`. Then the text is added to the last open 9479 block, the `paragraph`: 9480 9481 ``` tree 9482 -> document 9483 -> block_quote 9484 -> paragraph 9485 "Lorem ipsum dolor" 9486 ``` 9487 9488 The next line, 9489 9490 ``` markdown 9491 sit amet. 9492 ``` 9493 9494 is a "lazy continuation" of the open `paragraph`, so it gets added 9495 to the paragraph's text: 9496 9497 ``` tree 9498 -> document 9499 -> block_quote 9500 -> paragraph 9501 "Lorem ipsum dolor\nsit amet." 9502 ``` 9503 9504 The third line, 9505 9506 ``` markdown 9507 > - Qui *quodsi iracundia* 9508 ``` 9509 9510 causes the `paragraph` block to be closed, and a new `list` block 9511 opened as a child of the `block_quote`. A `list_item` is also 9512 added as a child of the `list`, and a `paragraph` as a child of 9513 the `list_item`. The text is then added to the new `paragraph`: 9514 9515 ``` tree 9516 -> document 9517 -> block_quote 9518 paragraph 9519 "Lorem ipsum dolor\nsit amet."
9520 -> list (type=bullet tight=true bullet_char=-) 9521 -> list_item 9522 -> paragraph 9523 "Qui *quodsi iracundia*" 9524 ``` 9525 9526 The fourth line, 9527 9528 ``` markdown 9529 > - aliquando id 9530 ``` 9531 9532 causes the `list_item` (and its child the `paragraph`) to be closed, 9533 and a new `list_item` opened as a child of the `list`. A `paragraph` 9534 is added as a child of the new `list_item`, to contain the text. 9535 We thus obtain the final tree: 9536 9537 ``` tree 9538 -> document 9539 -> block_quote 9540 paragraph 9541 "Lorem ipsum dolor\nsit amet." 9542 -> list (type=bullet tight=true bullet_char=-) 9543 list_item 9544 paragraph 9545 "Qui *quodsi iracundia*" 9546 -> list_item 9547 -> paragraph 9548 "aliquando id" 9549 ``` 9550 9551 ## Phase 2: inline structure 9552 9553 Once all of the input has been parsed, all open blocks are closed. 9554 9555 We then "walk the tree," visiting every node, and parse the raw 9556 string contents of paragraphs and headings as inlines. At this 9557 point we have seen all the link reference definitions, so we can 9558 resolve reference links as we go. 9559 9560 ``` tree 9561 document 9562 block_quote 9563 paragraph 9564 str "Lorem ipsum dolor" 9565 softbreak 9566 str "sit amet." 9567 list (type=bullet tight=true bullet_char=-) 9568 list_item 9569 paragraph 9570 str "Qui " 9571 emph 9572 str "quodsi iracundia" 9573 list_item 9574 paragraph 9575 str "aliquando id" 9576 ``` 9577 9578 Notice how the [line ending] in the first paragraph has 9579 been parsed as a `softbreak`, and the asterisks in the first list item 9580 have become an `emph`. 9581 9582 ### An algorithm for parsing nested emphasis and links 9583 9584 By far the trickiest part of inline parsing is handling emphasis, 9585 strong emphasis, links, and images. This is done using the following 9586 algorithm.
9587 9588 When we're parsing inlines and we hit either 9589 9590 - a run of `*` or `_` characters, or 9591 - a `[` or `![` 9592 9593 we insert a text node with these symbols as its literal content, and we 9594 add a pointer to this text node to the [delimiter stack](@). 9595 9596 The [delimiter stack] is a doubly linked list. Each 9597 element contains a pointer to a text node, plus information about 9598 9599 - the type of delimiter (`[`, `![`, `*`, `_`) 9600 - the number of delimiters, 9601 - whether the delimiter is "active" (all are active to start), and 9602 - whether the delimiter is a potential opener, a potential closer, 9603 or both (which depends on what sort of characters precede 9604 and follow the delimiters). 9605 9606 When we hit a `]` character, we call the *look for link or image* 9607 procedure (see below). 9608 9609 When we hit the end of the input, we call the *process emphasis* 9610 procedure (see below), with `stack_bottom` = NULL. 9611 9612 #### *look for link or image* 9613 9614 Starting at the top of the delimiter stack, we look backwards 9615 through the stack for an opening `[` or `![` delimiter. 9616 9617 - If we don't find one, we return a literal text node `]`. 9618 9619 - If we do find one, but it's not *active*, we remove the inactive 9620 delimiter from the stack, and return a literal text node `]`. 9621 9622 - If we find one and it's active, then we parse ahead to see if 9623 we have an inline link/image, reference link/image, compact reference 9624 link/image, or shortcut reference link/image. 9625 9626 + If we don't, then we remove the opening delimiter from the 9627 delimiter stack and return a literal text node `]`. 9628 9629 + If we do, then 9630 9631 * We return a link or image node whose children are the inlines 9632 after the text node pointed to by the opening delimiter. 9633 9634 * We run *process emphasis* on these inlines, with the `[` opener 9635 as `stack_bottom`. 9636 9637 * We remove the opening delimiter. 
9638 9639 * If we have a link (and not an image), we also set all 9640 `[` delimiters before the opening delimiter to *inactive*. (This 9641 will prevent us from getting links within links.) 9642 9643 #### *process emphasis* 9644 9645 Parameter `stack_bottom` sets a lower bound to how far we 9646 descend in the [delimiter stack]. If it is NULL, we can 9647 go all the way to the bottom. Otherwise, we stop before 9648 visiting `stack_bottom`. 9649 9650 Let `current_position` point to the element on the [delimiter stack] 9651 just above `stack_bottom` (or the first element if `stack_bottom` 9652 is NULL). 9653 9654 We keep track of the `openers_bottom` for each delimiter 9655 type (`*`, `_`) and each length of the closing delimiter run 9656 (modulo 3). Initialize this to `stack_bottom`. 9657 9658 Then we repeat the following until we run out of potential 9659 closers: 9660 9661 - Move `current_position` forward in the delimiter stack (if needed) 9662 until we find the first potential closer with delimiter `*` or `_`. 9663 (This will be the potential closer closest 9664 to the beginning of the input -- the first one in parse order.) 9665 9666 - Now, look back in the stack (staying above `stack_bottom` and 9667 the `openers_bottom` for this delimiter type) for the 9668 first matching potential opener ("matching" means same delimiter). 9669 9670 - If one is found: 9671 9672 + Figure out whether we have emphasis or strong emphasis: 9673 if both closer and opener spans have length >= 2, we have 9674 strong, otherwise regular. 9675 9676 + Insert an emph or strong emph node accordingly, after 9677 the text node corresponding to the opener. 9678 9679 + Remove any delimiters between the opener and closer from 9680 the delimiter stack. 9681 9682 + Remove 1 (for regular emph) or 2 (for strong emph) delimiters 9683 from the opening and closing text nodes. If they become empty 9684 as a result, remove them and remove the corresponding element 9685 of the delimiter stack. 
If the closing node is removed, reset 9686 `current_position` to the next element in the stack. 9687 9688 - If none is found: 9689 9690 + Set `openers_bottom` to the element before `current_position`. 9691 (We know that there are no openers for this kind of closer up to and 9692 including this point, so this puts a lower bound on future searches.) 9693 9694 + If the closer at `current_position` is not a potential opener, 9695 remove it from the delimiter stack (since we know it can't 9696 be a closer either). 9697 9698 + Advance `current_position` to the next element in the stack. 9699 9700 After we're done, we remove all delimiters above `stack_bottom` from the 9701 delimiter stack. 9702
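As a rough illustration of *process emphasis*, here is a minimal Python sketch under several simplifying assumptions that are not part of the description above. There are no links or images, so `stack_bottom` is always NULL. A delimiter run counts as a potential opener when it is followed by non-whitespace and as a potential closer when it is preceded by non-whitespace, ignoring the punctuation and intraword rules. `openers_bottom` is tracked per delimiter character only, not per run length modulo 3. The delimiter stack is a plain list in which removed entries are set to `None`, rather than a doubly linked list. All names (`Delim`, `tokenize`, `process_emphasis`, `render`, `to_html`) are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Delim:
    node: list          # the text node ["text", literal] holding the run
    char: str           # "*" or "_"
    count: int          # delimiters left in the run
    can_open: bool
    can_close: bool


def tokenize(text):
    """Split text into text nodes plus a delimiter stack (simplified flanking)."""
    nodes, stack, i = [], [], 0
    while i < len(text):
        ch, j = text[i], i + 1
        if ch in "*_":
            while j < len(text) and text[j] == ch:
                j += 1
            before = text[i - 1] if i > 0 else " "
            after = text[j] if j < len(text) else " "
            node = ["text", text[i:j]]
            stack.append(Delim(node, ch, j - i,
                               not after.isspace(), not before.isspace()))
        else:
            while j < len(text) and text[j] not in "*_":
                j += 1
            node = ["text", text[i:j]]
        nodes.append(node)
        i = j
    return nodes, stack


def index_of(nodes, target):
    """Position of a node, compared by identity rather than equality."""
    return next(i for i, n in enumerate(nodes) if n is target)


def process_emphasis(nodes, stack):
    openers_bottom = {"*": 0, "_": 0}   # lowest stack index still worth searching
    pos = 0
    while pos < len(stack):
        closer = stack[pos]
        if closer is None or not closer.can_close:
            pos += 1
            continue
        # Look back for the first matching potential opener.
        opener_idx = None
        for k in range(pos - 1, openers_bottom[closer.char] - 1, -1):
            cand = stack[k]
            if cand is not None and cand.can_open and cand.char == closer.char:
                opener_idx = k
                break
        if opener_idx is None:
            openers_bottom[closer.char] = pos
            if not closer.can_open:
                stack[pos] = None       # cannot open either, so drop it
            pos += 1
            continue
        opener = stack[opener_idx]
        strong = opener.count >= 2 and closer.count >= 2
        used = 2 if strong else 1
        a, b = index_of(nodes, opener.node), index_of(nodes, closer.node)
        nodes[a + 1:b] = [["strong" if strong else "em", nodes[a + 1:b]]]
        for k in range(opener_idx + 1, pos):
            stack[k] = None             # delimiters between opener and closer
        for d in (opener, closer):
            d.count -= used
            d.node[1] = d.char * d.count
        if opener.count == 0:
            del nodes[index_of(nodes, opener.node)]
            stack[opener_idx] = None
        if closer.count == 0:
            del nodes[index_of(nodes, closer.node)]
            stack[pos] = None
            pos += 1                    # otherwise retry with the same closer
    stack.clear()                       # nothing is left above stack_bottom


def render(nodes):
    return "".join(payload if kind == "text"
                   else f"<{kind}>{render(payload)}</{kind}>"
                   for kind, payload in nodes)


def to_html(text):
    nodes, stack = tokenize(text)
    process_emphasis(nodes, stack)
    return render(nodes)
```

With these assumptions, `to_html("*foo **bar** baz*")` gives `<em>foo <strong>bar</strong> baz</em>`, and `to_html("**foo*")` gives `*<em>foo</em>`: the opener keeps the one delimiter that was not consumed. Marking removed entries as `None` keeps the indices in `openers_bottom` stable; a linked list, as described above, would achieve the same by unlinking elements.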