1This is ../../doc/sed.info, produced by makeinfo version 4.12 from
2../../doc//config.texi.
3
4INFO-DIR-SECTION Text creation and manipulation
5START-INFO-DIR-ENTRY
6* sed: (sed).                   Stream EDitor.
7
8END-INFO-DIR-ENTRY
9
10   This file documents version 4.2.1 of GNU `sed', a stream editor.
11
12   Copyright (C) 1998, 1999, 2001, 2002, 2003, 2004 Free Software
13Foundation, Inc.
14
15   This document is released under the terms of the GNU Free
16Documentation License as published by the Free Software Foundation;
17either version 1.1, or (at your option) any later version.
18
19   You should have received a copy of the GNU Free Documentation
20License along with GNU `sed'; see the file `COPYING.DOC'.  If not,
21write to the Free Software Foundation, 59 Temple Place - Suite 330,
22Boston, MA 02110-1301, USA.
23
24   There are no Cover Texts and no Invariant Sections; this text, along
25with its equivalent in the printed manual, constitutes the Title Page.
26
27
28File: sed.info,  Node: Top,  Next: Introduction,  Up: (dir)
29
30sed, a stream editor
31********************
32
33This file documents version 4.2.1 of GNU `sed', a stream editor.
34
50* Menu:
51
52* Introduction::               Introduction
53* Invoking sed::               Invocation
54* sed Programs::               `sed' programs
55* Examples::                   Some sample scripts
56* Limitations::                Limitations and (non-)limitations of GNU `sed'
57* Other Resources::            Other resources for learning about `sed'
58* Reporting Bugs::             Reporting bugs
59
60* Extended regexps::           `egrep'-style regular expressions
61
62* Concept Index::              A menu with all the topics in this manual.
63* Command and Option Index::   A menu with all `sed' commands and
64                               command-line options.
65
66--- The detailed node listing ---
67
68sed Programs:
69* Execution Cycle::                 How `sed' works
70* Addresses::                       Selecting lines with `sed'
71* Regular Expressions::             Overview of regular expression syntax
72* Common Commands::                 Often used commands
73* The "s" Command::                 `sed''s Swiss Army Knife
74* Other Commands::                  Less frequently used commands
75* Programming Commands::            Commands for `sed' gurus
76* Extended Commands::               Commands specific to GNU `sed'
77* Escapes::                         Specifying special characters
78
79Examples:
80* Centering lines::
81* Increment a number::
82* Rename files to lower case::
83* Print bash environment::
84* Reverse chars of lines::
85* tac::                             Reverse lines of files
86* cat -n::                          Numbering lines
87* cat -b::                          Numbering non-blank lines
88* wc -c::                           Counting chars
89* wc -w::                           Counting words
90* wc -l::                           Counting lines
91* head::                            Printing the first lines
92* tail::                            Printing the last lines
93* uniq::                            Make duplicate lines unique
94* uniq -d::                         Print duplicated lines of input
95* uniq -u::                         Remove all duplicated lines
96* cat -s::                          Squeezing blank lines
97
98
99File: sed.info,  Node: Introduction,  Next: Invoking sed,  Prev: Top,  Up: Top
100
1011 Introduction
102**************
103
104`sed' is a stream editor.  A stream editor is used to perform basic text
105transformations on an input stream (a file or input from a pipeline).
106While in some ways similar to an editor which permits scripted edits
107(such as `ed'), `sed' works by making only one pass over the input(s),
108and is consequently more efficient.  But it is `sed''s ability to
109filter text in a pipeline which particularly distinguishes it from
110other types of editors.
111
112
113File: sed.info,  Node: Invoking sed,  Next: sed Programs,  Prev: Introduction,  Up: Top
114
1152 Invocation
116************
117
118Normally `sed' is invoked like this:
119
120     sed SCRIPT INPUTFILE...
121
122   The full format for invoking `sed' is:
123
124     sed OPTIONS... [SCRIPT] [INPUTFILE...]
125
126   If you do not specify INPUTFILE, or if INPUTFILE is `-', `sed'
127filters the contents of the standard input.  The SCRIPT is actually the
128first non-option parameter, which `sed' specially considers a script
129and not an input file if (and only if) none of the other OPTIONS
130specifies a script to be executed, that is if neither of the `-e' and
131`-f' options is specified.
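
The rule above can be sketched as follows.  This is only an
illustration, assuming a POSIX shell; the sample input and the variable
names `first' and `second' are made up for the example.

```shell
# With no -e or -f option, the first non-option argument is the script.
first=$(printf 'hello world\n' | sed 's/hello/goodbye/')

# With -e, the script comes from the option(s), so every non-option
# argument is an input file; here `-' names the standard input.
second=$(printf 'hello world\n' |
  sed -e 's/hello/goodbye/' -e 's/world/moon/' -)

printf '%s\n%s\n' "$first" "$second"
```

Both `-e' scripts are applied in order to each input line, so the
second command changes both words.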
132
133   `sed' may be invoked with the following command-line options:
134
135`--version'
136     Print out the version of `sed' that is being run and a copyright
137     notice, then exit.
138
139`--help'
140     Print a usage message briefly summarizing these command-line
141     options and the bug-reporting address, then exit.
142
143`-n'
144`--quiet'
145`--silent'
146     By default, `sed' prints out the pattern space at the end of each
147     cycle through the script (*note How `sed' works: Execution Cycle.).
148     These options disable this automatic printing, and `sed' only
149     produces output when explicitly told to via the `p' command.
150
151`-e SCRIPT'
152`--expression=SCRIPT'
153     Add the commands in SCRIPT to the set of commands to be run while
154     processing the input.
155
156`-f SCRIPT-FILE'
157`--file=SCRIPT-FILE'
158     Add the commands contained in the file SCRIPT-FILE to the set of
159     commands to be run while processing the input.
160
161`-i[SUFFIX]'
162`--in-place[=SUFFIX]'
163     This option specifies that files are to be edited in-place.  GNU
164     `sed' does this by creating a temporary file and sending output to
165     this file rather than to the standard output.(1)
166
167     This option implies `-s'.
168
169     When the end of the file is reached, the temporary file is renamed
170     to the output file's original name.  The extension, if supplied,
171     is used to modify the name of the old file before renaming the
172     temporary file, thereby making a backup copy.(2)
173
174     This rule is followed: if the extension doesn't contain a `*',
175     then it is appended to the end of the current filename as a
176     suffix; if the extension does contain one or more `*' characters,
177     then _each_ asterisk is replaced with the current filename.  This
178     allows you to add a prefix to the backup file, instead of (or in
179     addition to) a suffix, or even to place backup copies of the
180     original files into another directory (provided the directory
181     already exists).
182
183     If no extension is supplied, the original file is overwritten
184     without making a backup.
185
186`-l N'
187`--line-length=N'
188     Specify the default line-wrap length for the `l' command.  A
189     length of 0 (zero) means to never wrap long lines.  If not
190     specified, it is taken to be 70.
191
192`--posix'
193     GNU `sed' includes several extensions to POSIX sed.  In order to
194     simplify writing portable scripts, this option disables all the
195     extensions that this manual documents, including additional
196     commands.  Most of the extensions accept `sed' programs that are
197     outside the syntax mandated by POSIX, but some of them (such as
198     the behavior of the `N' command described in *note Reporting
199     Bugs::) actually violate the standard.  If you want to disable
200     only the latter kind of extension, you can set the
201     `POSIXLY_CORRECT' variable to a non-empty value.
202
203`-b'
204`--binary'
205     This option is available on every platform, but is only effective
206     where the operating system makes a distinction between text files
207     and binary files.  When such a distinction is made--as is the case
208     for MS-DOS, Windows, Cygwin--text files are composed of lines
209     separated by a carriage return _and_ a line feed character, and
210     `sed' does not see the ending CR.  When this option is specified,
211     `sed' will open input files in binary mode, thus not requesting
212     this special processing and considering lines to end at a line
213     feed.
214
215`--follow-symlinks'
216     This option is available only on platforms that support symbolic
217     links and has an effect only if option `-i' is specified.  In this
218     case, if the file that is specified on the command line is a
219     symbolic link, `sed' will follow the link and edit the ultimate
220     destination of the link.  The default behavior is to break the
221     symbolic link, so that the link destination will not be modified.
222
223`-r'
224`--regexp-extended'
225     Use extended regular expressions rather than basic regular
226     expressions.  Extended regexps are those that `egrep' accepts;
227     they can be clearer because they usually have fewer backslashes,
228     but are a GNU extension and hence scripts that use them are not
229     portable.  *Note Extended regular expressions: Extended regexps.
230
231`-s'
232`--separate'
233     By default, `sed' will consider the files specified on the command
234     line as a single continuous long stream.  This GNU `sed' extension
235     allows the user to consider them as separate files: range
236     addresses (such as `/abc/,/def/') are not allowed to span several
237     files, line numbers are relative to the start of each file, `$'
238     refers to the last line of each file, and files invoked from the
239     `R' commands are rewound at the start of each file.
240
241`-u'
242`--unbuffered'
243     Buffer both input and output as minimally as practical.  (This is
244     particularly useful if the input is coming from the likes of `tail
245     -f', and you wish to see the transformed output as soon as
246     possible.)
247
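As a hedged illustration of the `-i' option described above, the
following sketch edits a throwaway file in a scratch directory.  It
assumes GNU `sed' (other implementations parse `-i' differently) and
the `mktemp' utility; the file names and the `orig_' prefix are
arbitrary choices for the example.

```shell
# Work in a scratch directory so that no real file is touched.
old_dir=$PWD
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'color\n' > a.txt

# Suffix form: the original is kept as a.txt.bak.
sed -i.bak 's/color/colour/' a.txt

# A `*' in the extension is replaced by the file name, which
# yields a prefix instead of a suffix: the backup is orig_a.txt.
sed -i'orig_*' 's/colour/COLOUR/' a.txt

edited=$(cat a.txt)
backup1=$(cat a.txt.bak)
backup2=$(cat orig_a.txt)

cd "$old_dir" && rm -r "$dir"
printf '%s %s %s\n' "$edited" "$backup1" "$backup2"
```

Each backup holds the file as it was just before the corresponding
`sed' run.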
248
249   If no `-e', `-f', `--expression', or `--file' options are given on
250the command-line, then the first non-option argument on the command
251line is taken to be the SCRIPT to be executed.
252
253   If any command-line parameters remain after processing the above,
254these parameters are interpreted as the names of input files to be
255processed.  A file name of `-' refers to the standard input stream.
256The standard input will be processed if no file names are specified.
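
The difference between the default single stream and the `-s' option
can be seen with two small files.  This is a sketch assuming GNU `sed'
and `mktemp'; the file names `a' and `b' are placeholders.

```shell
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/a"
printf 'b1\nb2\n' > "$dir/b"

# -n suppresses automatic printing; only `p' produces output.
# By default the files form one stream, so `$' is the last line of b.
joined=$(sed -n '$p' "$dir/a" "$dir/b")

# With -s each file is treated separately, so `$' matches in both.
separate=$(sed -n -s '$p' "$dir/a" "$dir/b")

rm -r "$dir"
printf '%s\n---\n%s\n' "$joined" "$separate"
```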
257
258   ---------- Footnotes ----------
259
260   (1) This applies to commands such as `=', `a', `c', `i', `l', `p'.
261You can still write to the standard output by using the `w' or `W'
262commands together with the `/dev/stdout' special file.
263
264   (2) Note that GNU `sed' creates the backup file whether or not any
265output is actually changed.
266
267
268File: sed.info,  Node: sed Programs,  Next: Examples,  Prev: Invoking sed,  Up: Top
269
2703 `sed' Programs
271****************
272
273A `sed' program consists of one or more `sed' commands, passed in by
274one or more of the `-e', `-f', `--expression', and `--file' options, or
275the first non-option argument if zero of these options are used.  This
276document will refer to "the" `sed' script; this is understood to mean
277the in-order catenation of all of the SCRIPTs and SCRIPT-FILEs passed
278in.
279
280   Each `sed' command consists of an optional address or address range,
281followed by a one-character command name and any additional
282command-specific code.
283
284* Menu:
285
286* Execution Cycle::          How `sed' works
287* Addresses::                Selecting lines with `sed'
288* Regular Expressions::      Overview of regular expression syntax
289* Common Commands::          Often used commands
290* The "s" Command::          `sed''s Swiss Army Knife
291* Other Commands::           Less frequently used commands
292* Programming Commands::     Commands for `sed' gurus
293* Extended Commands::        Commands specific to GNU `sed'
294* Escapes::                  Specifying special characters
295
296
297File: sed.info,  Node: Execution Cycle,  Next: Addresses,  Up: sed Programs
298
2993.1 How `sed' Works
300===================
301
302`sed' maintains two data buffers: the active _pattern_ space, and the
303auxiliary _hold_ space. Both are initially empty.
304
305   `sed' operates by performing the following cycle on each line of
306input: first, `sed' reads one line from the input stream, removes any
307trailing newline, and places it in the pattern space.  Then commands
308are executed; each command can have an address associated with it:
309addresses are a kind of condition code, and a command is only executed
310if the condition is verified before the command is to be executed.
311
312   When the end of the script is reached, unless the `-n' option is in
313use, the contents of pattern space are printed out to the output
314stream, adding back the trailing newline if it was removed.(1) Then the
315next cycle starts for the next input line.
316
317   Unless special commands (like `D') are used, the pattern space is
318deleted between two cycles. The hold space, on the other hand, keeps
319its data between cycles (see commands `h', `H', `x', `g', `G' to move
320data between both buffers).
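
To make the interplay of the two buffers concrete, here is a small
sketch that collects every input line in the hold space and prints
them joined by commas at the end.  The input and the variable name
`joined' are illustrative only.

```shell
# h copies the first line into the hold space; H appends each later
# line to it, preceded by a newline.  On the last line, x swaps the
# two buffers, so the pattern space now holds every line, and the
# s command turns the embedded newlines into commas.
joined=$(printf 'a\nb\nc\n' | sed -n '1h;1!H;${x;s/\n/,/g;p;}')

printf '%s\n' "$joined"
```

Because `-n' is given, nothing is printed during the earlier cycles;
the only output comes from the explicit `p' on the last line.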
321
322   ---------- Footnotes ----------
323
324   (1) Actually, if `sed' prints a line without the terminating
325newline, it will nevertheless print the missing newline as soon as more
326text is sent to the same output stream, which gives the "least expected
327surprise" even though it does not make commands like `sed -n p' exactly
328identical to `cat'.
329
330
331File: sed.info,  Node: Addresses,  Next: Regular Expressions,  Prev: Execution Cycle,  Up: sed Programs
332
3333.2 Selecting lines with `sed'
334==============================
335
336Addresses in a `sed' script can be in any of the following forms:
337`NUMBER'
338     Specifying a line number will match only that line in the input.
339     (Note that `sed' counts lines continuously across all input files
340     unless `-i' or `-s' options are specified.)
341
342`FIRST~STEP'
343     This GNU extension matches every STEPth line starting with line
344     FIRST.  In particular, lines will be selected when there exists a
345     non-negative N such that the current line-number equals FIRST + (N
346     * STEP).  Thus, to select the odd-numbered lines, one would use
347     `1~2'; to pick every third line starting with the second, `2~3'
348     would be used; to pick every fifth line starting with the tenth,
349     use `10~5'; and `50~0' is just an obscure way of saying `50'.
350
351`$'
352     This address matches the last line of the last file of input, or
353     the last line of each file when the `-i' or `-s' options are
354     specified.
355
356`/REGEXP/'
357     This will select any line which matches the regular expression
358     REGEXP.  If REGEXP itself includes any `/' characters, each must
359     be escaped by a backslash (`\').
360
361     The empty regular expression `//' repeats the last regular
362     expression match (the same holds if the empty regular expression is
363     passed to the `s' command).  Note that modifiers to regular
364     expressions are evaluated when the regular expression is compiled,
365     thus it is invalid to specify them together with the empty regular
366     expression.
367
368`\%REGEXP%'
369     (The `%' may be replaced by any other single character.)
370
371     This also matches the regular expression REGEXP, but allows one to
372     use a different delimiter than `/'.  This is particularly useful
373     if the REGEXP itself contains a lot of slashes, since it avoids
374     the tedious escaping of every `/'.  If REGEXP itself includes any
375     delimiter characters, each must be escaped by a backslash (`\').
376
377`/REGEXP/I'
378`\%REGEXP%I'
379     The `I' modifier to regular-expression matching is a GNU extension
380     which causes the REGEXP to be matched in a case-insensitive manner.
381
382`/REGEXP/M'
383`\%REGEXP%M'
384     The `M' modifier to regular-expression matching is a GNU `sed'
385     extension which causes `^' and `$' to match respectively (in
386     addition to the normal behavior) the empty string after a newline,
387     and the empty string before a newline.  There are special character
388     sequences (`\`' and `\'') which always match the beginning or the
389     end of the buffer.  `M' stands for `multi-line'.
390
391
392   If no addresses are given, then all lines are matched; if one
393address is given, then only lines matching that address are matched.
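
The single-address forms can be tried on a five-line input.  This is
a sketch only; the `FIRST~STEP' form and the `I' modifier assume GNU
`sed', and the variable names are arbitrary.

```shell
input=$(printf '%s\n' one two three four five)

# A line number selects exactly that line.
third=$(printf '%s\n' "$input" | sed -n '3p')

# FIRST~STEP (GNU): every second line starting with line 1.
odd=$(printf '%s\n' "$input" | sed -n '1~2p')

# `$' selects the last line of input.
last=$(printf '%s\n' "$input" | sed -n '$p')

# A regexp selects every line that matches it.
with_t=$(printf '%s\n' "$input" | sed -n '/^t/p')

# Custom delimiter `%' plus the case-insensitive `I' modifier (GNU).
with_f=$(printf '%s\n' "$input" | sed -n '\%^F%Ip')

printf '%s\n' "$third" "$odd" "$last" "$with_t" "$with_f"
```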
394
395   An address range can be specified by specifying two addresses
396separated by a comma (`,').  An address range matches lines starting
397from where the first address matches, and continues until the second
398address matches (inclusively).
399
400   If the second address is a REGEXP, then checking for the ending
401match will start with the line _following_ the line which matched the
402first address: a range will always span at least two lines (except of
403course if the input stream ends).
404
405   If the second address is a NUMBER less than (or equal to) the line
406matching the first address, then only the one line is matched.
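
The three range rules above can be checked with short inputs.  The
following is a minimal sketch; the inputs and the names `r1', `r2' and
`r3' are invented for the example.

```shell
# Two line numbers: lines 2 through 3, inclusive.
r1=$(printf '%s\n' a b c d | sed -n '2,3p')

# Regexp as second address: the search for the end of the range
# starts on the line after the one that opened it, so two lines are
# selected even though line 1 itself matches /x/.
r2=$(printf '%s\n' x x y | sed -n '/x/,/x/p')

# Second address is a number not greater than the line matched by
# the first address: only that one line is selected.
r3=$(printf '%s\n' a b c d | sed -n '3,1p')

printf '%s\n' "$r1" "$r2" "$r3"
```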
407
408   GNU `sed' also supports some special two-address forms; all these
409are GNU extensions:
410`0,/REGEXP/'
411     A line number of `0' can be used in an address specification like
412     `0,/REGEXP/' so that `sed' will try to match REGEXP in the first
413     input line too.  In other words, `0,/REGEXP/' is similar to
414     `1,/REGEXP/', except that if ADDR2 matches the very first line of
415     input the `0,/REGEXP/' form will consider it to end the range,
416     whereas the `1,/REGEXP/' form will match the beginning of its
417     range and hence make the range span up to the _second_ occurrence
418     of the regular expression.
419
420     Note that this is the only place where the `0' address makes
421     sense; there is no 0-th line and commands which are given the `0'
422     address in any other way will give an error.
423
424`ADDR1,+N'
425     Matches ADDR1 and the N lines following ADDR1.
426
427`ADDR1,~N'
428     Matches ADDR1 and the lines following ADDR1 until the next line
429     whose input line number is a multiple of N.
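
A sketch of the special two-address forms, assuming GNU `sed' since
all of them are extensions.  The five-line input and the variable
names are illustrative.

```shell
input=$(printf '%s\n' foo bar foo baz qux)

# 0,/foo/ may end the range on the very first line.
zero=$(printf '%s\n' "$input" | sed -n '0,/foo/p')

# 1,/foo/ starts at line 1 and looks for the end from line 2 on,
# so it runs up to the second occurrence of foo.
one=$(printf '%s\n' "$input" | sed -n '1,/foo/p')

# ADDR1,+N: line 2 and the one line that follows it.
plus=$(printf '%s\n' "$input" | sed -n '2,+1p')

# ADDR1,~N: from line 2 up to the next multiple of 4.
mult=$(printf '%s\n' "$input" | sed -n '2,~4p')

printf '%s\n' "$zero" "$one" "$plus" "$mult"
```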
430
431   Appending the `!' character to the end of an address specification
432negates the sense of the match.  That is, if the `!' character follows
433an address range, then only lines which do _not_ match the address range
434will be selected.  This also works for singleton addresses, and,
435perhaps perversely, for the null address.
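
Negation can be illustrated with a single address and with a range.
This is a small sketch; the input and variable names are arbitrary.

```shell
# Print every line except line 2.
not_second=$(printf '%s\n' a b c d | sed -n '2!p')

# Delete every line outside the range 2,3, keeping only the range.
only_range=$(printf '%s\n' a b c d | sed '2,3!d')

printf '%s\n' "$not_second" "$only_range"
```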
436
437
438File: sed.info,  Node: Regular Expressions,  Next: Common Commands,  Prev: Addresses,  Up: sed Programs
439
4403.3 Overview of Regular Expression Syntax
441=========================================
442
443To know how to use `sed', people should understand regular expressions
444("regexp" for short).  A regular expression is a pattern that is
445matched against a subject string from left to right.  Most characters
446are "ordinary": they stand for themselves in a pattern, and match the
447corresponding characters in the subject.  As a trivial example, the
448pattern
449
450     The quick brown fox
451
452matches a portion of a subject string that is identical to itself.  The
453power of regular expressions comes from the ability to include
454alternatives and repetitions in the pattern.  These are encoded in the
455pattern by the use of "special characters", which do not stand for
456themselves but instead are interpreted in some special way.  Here is a
457brief description of regular expression syntax as used in `sed'.
458
459`CHAR'
460     A single ordinary character matches itself.
461
462`*'
463     Matches a sequence of zero or more instances of matches for the
464     preceding regular expression, which must be an ordinary character,
465     a special character preceded by `\', a `.', a grouped regexp (see
466     below), or a bracket expression.  As a GNU extension, a postfixed
467     regular expression can also be followed by `*'; for example, `a**'
468     is equivalent to `a*'.  POSIX 1003.1-2001 says that `*' stands for
469     itself when it appears at the start of a regular expression or
470     subexpression, but many non-GNU implementations do not support this
471     and portable scripts should instead use `\*' in these contexts.
472
473`\+'
474     As `*', but matches one or more.  It is a GNU extension.
475
476`\?'
477     As `*', but only matches zero or one.  It is a GNU extension.
478
479`\{I\}'
480     As `*', but matches exactly I sequences (I is a decimal integer;
481     for portability, keep it between 0 and 255 inclusive).
482
483`\{I,J\}'
484     Matches between I and J, inclusive, sequences.
485
486`\{I,\}'
487     Matches I or more sequences.
488
489`\(REGEXP\)'
490     Groups the inner REGEXP as a whole; this is used to:
491
492        * Apply postfix operators, like `\(abcd\)*': this will search
493          for zero or more whole sequences of `abcd', while `abcd*'
494          would search for `abc' followed by zero or more occurrences
495          of `d'.  Note that support for `\(abcd\)*' is required by
496          POSIX 1003.1-2001, but many non-GNU implementations do not
497          support it and hence it is not universally portable.
498
499        * Use back references (see below).
500
501`.'
502     Matches any character, including newline.
503
504`^'
505     Matches the null string at beginning of the pattern space, i.e.
506     what appears after the circumflex must appear at the beginning of
507     the pattern space.
508
509     In most scripts, pattern space is initialized to the content of
510     each line (*note How `sed' works: Execution Cycle.).  So, it is a
511     useful simplification to think of `^#include' as matching only
512     lines where `#include' is the first thing on line--if there are
513     spaces before, for example, the match fails.  This simplification
514     is valid as long as the original content of pattern space is not
515     modified, for example with an `s' command.
516
517     `^' acts as a special character only at the beginning of the
518     regular expression or subexpression (that is, after `\(' or `\|').
519     Portable scripts should avoid `^' at the beginning of a
520     subexpression, though, as POSIX allows implementations that treat
521     `^' as an ordinary character in that context.
522
523`$'
524     It is the same as `^', but refers to end of pattern space.  `$'
525     also acts as a special character only at the end of the regular
526     expression or subexpression (that is, before `\)' or `\|'), and
527     its use at the end of a subexpression is not portable.
528
529`[LIST]'
530`[^LIST]'
531     Matches any single character in LIST: for example, `[aeiou]'
532     matches all vowels.  A list may include sequences like
533     `CHAR1-CHAR2', which matches any character between (inclusive)
534     CHAR1 and CHAR2.
535
536     A leading `^' reverses the meaning of LIST, so that it matches any
537     single character _not_ in LIST.  To include `]' in the list, make
538     it the first character (after the `^' if needed), to include `-'
539     in the list, make it the first or last; to include `^' put it
540     after the first character.
541
542     The characters `$', `*', `.', `[', and `\' are normally not
543     special within LIST.  For example, `[\*]' matches either `\' or
544     `*', because the `\' is not special here.  However, strings like
545     `[.ch.]', `[=a=]', and `[:space:]' are special within LIST and
546     represent collating symbols, equivalence classes, and character
547     classes, respectively, and `[' is therefore special within LIST
548     when it is followed by `.', `=', or `:'.  Also, when not in
549     `POSIXLY_CORRECT' mode, special escapes like `\n' and `\t' are
550     recognized within LIST.  *Note Escapes::.
551
552`REGEXP1\|REGEXP2'
553     Matches either REGEXP1 or REGEXP2.  Use parentheses to use complex
554     alternative regular expressions.  The matching process tries each
555     alternative in turn, from left to right, and the first one that
556     succeeds is used.  It is a GNU extension.
557
558`REGEXP1REGEXP2'
559     Matches the concatenation of REGEXP1 and REGEXP2.  Concatenation
560     binds more tightly than `\|', `^', and `$', but less tightly than
561     the other regular expression operators.
562
563`\DIGIT'
564     Matches the DIGIT-th `\(...\)' parenthesized subexpression in the
565     regular expression.  This is called a "back reference".
566     Subexpressions are implicitly numbered by counting occurrences of
567     `\(' left-to-right.
568
569`\n'
570     Matches the newline character.
571
572`\CHAR'
573     Matches CHAR, where CHAR is one of `$', `*', `.', `[', `\', or `^'.
574     Note that the only C-like backslash sequences that you can
575     portably assume to be interpreted are `\n' and `\\'; in particular
576     `\t' is not portable, and matches a `t' under most implementations
577     of `sed', rather than a tab character.
578
579
580   Note that the regular expression matcher is greedy, i.e., matches
581are attempted from left to right and, if two or more matches are
582possible starting at the same character, it selects the longest.
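
The leftmost-longest rule can be observed with the `s' command.  The
sketch below uses made-up inputs and variable names.

```shell
# Longest: `.*' runs to the last X, not the first one.
greedy=$(printf 'aXbXc\n' | sed 's/a.*X/-/')

# Excluding X from the repeated part stops at the first X.
bounded=$(printf 'aXbXc\n' | sed 's/a[^X]*X/-/')

# Leftmost: the first run of a's is chosen although a later run
# is longer.
leftmost=$(printf 'xaaxaaa\n' | sed 's/aa*/-/')

printf '%s\n' "$greedy" "$bounded" "$leftmost"
```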
583
584Examples:
585`abcdef'
586     Matches `abcdef'.
587
588`a*b'
589     Matches zero or more `a's followed by a single `b'.  For example,
590     `b' or `aaaaab'.
591
592`a\?b'
593     Matches `b' or `ab'.
594
595`a\+b\+'
596     Matches one or more `a's followed by one or more `b's: `ab' is the
597     shortest possible match, but other examples are `aaaab' or
598     `abbbbb' or `aaaaaabbbbbbb'.
599
600`.*'
601`.\+'
602     These two both match all the characters in a string; however, the
603     first matches every string (including the empty string), while the
604     second matches only strings containing at least one character.
605
606`^main.*(.*)'
607     This matches a string starting with `main', followed by an opening
608     and closing parenthesis.  The `n', `(' and `)' need not be
609     adjacent.
610
611`^#'
612     This matches a string beginning with `#'.
613
614`\\$'
615     This matches a string ending with a single backslash.  The regexp
616     contains two backslashes for escaping.
617
618`\$'
619     Instead, this matches a string containing a literal dollar sign,
620     because it is escaped.
621
622`[a-zA-Z0-9]'
623     In the C locale, this matches any ASCII letters or digits.
624
625`[^ tab]\+'
626     (Here `tab' stands for a single tab character.)  This matches a
627     string of one or more characters, none of which is a space or a
628     tab.  Usually this means a word.
629
630`^\(.*\)\n\1$'
631     This matches a string consisting of two equal substrings separated
632     by a newline.
633
634`.\{9\}A$'
635     This matches nine characters followed by an `A' at the end of the
635     string.
636
637`^.\{15\}A'
638     This matches the start of a string that contains 16 characters,
639     the last of which is an `A'.
640
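A few of the constructs above can be exercised directly.  This is a
sketch with invented inputs and variable names; the back-reference
case relies on the `N' command, mentioned earlier in this manual, to
bring two lines into the pattern space.

```shell
# \(abcd\)* repeats the whole group; abcd* repeats only the `d'.
group=$(printf 'abcdabcd!\n' | sed 's/^\(abcd\)*/-/')
single=$(printf 'abcdabcd!\n' | sed 's/^abcd*/-/')

# Back reference: N joins two input lines with a newline, then
# the regexp checks that both halves are equal.
dup=$(printf 'ab\nab\n' | sed -n 'N;/^\(.*\)\n\1$/p')

# Interval: 15 characters from the start, then an `A'.
sixteenth=$(printf '%s\n' '0123456789abcdeA' 'A' |
  sed -n '/^.\{15\}A/p')

printf '%s\n' "$group" "$single" "$dup" "$sixteenth"
```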
641
642
643File: sed.info,  Node: Common Commands,  Next: The "s" Command,  Prev: Regular Expressions,  Up: sed Programs
644
6453.4 Often-Used Commands
646=======================
647
648If you use `sed' at all, you will quite likely want to know these
649commands.
650
651`#'
652     [No addresses allowed.]
653
654     The `#' character begins a comment; the comment continues until
655     the next newline.
656
657     If you are concerned about portability, be aware that some
658     implementations of `sed' (which are not POSIX conformant) may only
659     support a single one-line comment, and then only when the very
660     first character of the script is a `#'.
661
662     Warning: if the first two characters of the `sed' script are `#n',
663     then the `-n' (no-autoprint) option is forced.  If you want to put
664     a comment in the first line of your script and that comment begins
665     with the letter `n' and you do not want this behavior, then be
666     sure to either use a capital `N', or place at least one space
667     before the `n'.
668
669`q [EXIT-CODE]'
670     This command only accepts a single address.
671
672     Exit `sed' without processing any more commands or input.  Note
673     that the current pattern space is printed if auto-print is not
674     disabled with the `-n' option.  The ability to return an exit
675     code from the `sed' script is a GNU `sed' extension.
676
677`d'
678     Delete the pattern space; immediately start next cycle.
679
680`p'
681     Print out the pattern space (to the standard output).  This
682     command is usually only used in conjunction with the `-n'
683     command-line option.
684
685`n'
686     If auto-print is not disabled, print the pattern space, then,
687     regardless, replace the pattern space with the next line of input.
688     If there is no more input then `sed' exits without processing any
689     more commands.
690
691`{ COMMANDS }'
692     A group of commands may be enclosed between `{' and `}' characters.
693     This is particularly useful when you want a group of commands to
694     be triggered by a single address (or address-range) match.
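
The commands above can be tried on a short input.  The following is a
sketch under the assumption of GNU `sed'; the input and the variable
names are illustrative.

```shell
input=$(printf '%s\n' '# comment' one two three four)

# d: delete lines starting with `#' and start the next cycle.
no_comments=$(printf '%s\n' "$input" | sed '/^#/d')

# q: line 2 is still auto-printed before sed exits.
first_two=$(printf '%s\n' "$input" | sed '2q')

# n then p under -n: print every second line.  On the last line
# there is no next line to read, so sed exits.
even=$(printf '%s\n' "$input" | sed -n 'n;p')

# { }: run two commands on each line of one address range.
grouped=$(printf '%s\n' "$input" | sed -n '/two/,/three/{s/t/T/;p;}')

printf '%s\n' "$no_comments" "$first_two" "$even" "$grouped"
```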
695
696
697
698File: sed.info,  Node: The "s" Command,  Next: Other Commands,  Prev: Common Commands,  Up: sed Programs
699
7003.5 The `s' Command
701===================
702
703The syntax of the `s' (as in substitute) command is
704`s/REGEXP/REPLACEMENT/FLAGS'.  The `/' characters may be uniformly
705replaced by any other single character within any given `s' command.
706The `/' character (or whatever other character is used in its stead)
707can appear in the REGEXP or REPLACEMENT only if it is preceded by a `\'
708character.
709
710   The `s' command is probably the most important in `sed' and has a
711lot of different options.  Its basic concept is simple: the `s' command
712attempts to match the pattern space against the supplied REGEXP; if the
713match is successful, then that portion of the pattern space which was
714matched is replaced with REPLACEMENT.
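
A minimal sketch of the basic substitution and of the alternative
delimiter.  The inputs, paths and variable names are examples only.

```shell
# Without flags, only the first match on the line is replaced.
basic=$(printf 'one two two\n' | sed 's/two/2/')

# With `|' as the delimiter, slashes need no escaping.
piped=$(printf '/usr/bin/sed\n' | sed 's|/usr/bin|/opt/bin|')

# With `/' as the delimiter, each slash must be preceded by `\'.
escaped=$(printf '/usr/bin/sed\n' | sed 's/\/usr\/bin/\/opt\/bin/')

printf '%s\n' "$basic" "$piped" "$escaped"
```

The last two commands are equivalent; the delimiter is a matter of
readability.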
715
716   The REPLACEMENT can contain `\N' (N being a number from 1 to 9,
717inclusive) references, which refer to the portion of the match which is
718contained between the Nth `\(' and its matching `\)'.  Also, the
719REPLACEMENT can contain unescaped `&' characters which reference the
720whole matched portion of the pattern space.  Finally, as a GNU `sed'
721extension, you can include a special sequence made of a backslash and
722one of the letters `L', `l', `U', `u', or `E'.  The meaning is as
723follows:
724
725`\L'
726     Turn the replacement to lowercase until a `\U' or `\E' is found,
727
728`\l'
729     Turn the next character to lowercase,
730
731`\U'
732     Turn the replacement to uppercase until a `\L' or `\E' is found,
733
734`\u'
735     Turn the next character to uppercase,
736
737`\E'
738     Stop case conversion started by `\L' or `\U'.
739
740   To include a literal `\', `&', or newline in the final replacement,
741be sure to precede the desired `\', `&', or newline in the REPLACEMENT
742with a `\'.
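
   The following sketch exercises the replacement features described
above.  It assumes GNU `sed' (the case-conversion sequences are GNU
extensions); the input strings and variable names are only
illustrative.

```shell
# \1 and \2 refer to the first and second \( ... \) groups.
swapped=$(printf 'hello world\n' | sed 's/\(hello\) \(world\)/\2 \1/')
# An unescaped & stands for the whole match; \& is a literal ampersand.
wrapped=$(printf 'a b c\n' | sed 's/b/[&]/')
literal=$(printf 'a b c\n' | sed 's/b/\&/')
# \u affects one character; \U affects everything up to \E.
cased=$(printf 'hello world\n' | sed 's/\(hello\) \(world\)/\u\1 \U\2\E!/')
printf '%s\n' "$swapped" "$wrapped" "$literal" "$cased"
```

   With GNU `sed' this should print `world hello', `a [b] c', `a & c'
and `Hello WORLD!', one per line.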
743
744   The `s' command can be followed by zero or more of the following
745FLAGS:
746
747`g'
748     Apply the replacement to _all_ matches to the REGEXP, not just the
749     first.
750
751`NUMBER'
752     Only replace the NUMBERth match of the REGEXP.
753
754     Note: the POSIX standard does not specify what should happen when
755     you mix the `g' and NUMBER modifiers, and currently there is no
756     widely agreed upon meaning across `sed' implementations.  For GNU
757     `sed', the interaction is defined to be: ignore matches before the
758     NUMBERth, and then match and replace all matches from the NUMBERth
759     on.
760
761`p'
762     If the substitution was made, then print the new pattern space.
763
764     Note: when both the `p' and `e' options are specified, the
765     relative ordering of the two produces very different results.  In
766     general, `ep' (evaluate then print) is what you want, but
767     operating the other way round can be useful for debugging.  For
768     this reason, the current version of GNU `sed' interprets specially
769     the presence of `p' options both before and after `e', printing
770     the pattern space before and after evaluation, while in general
771     flags for the `s' command show their effect just once.  This
772     behavior, although documented, might change in future versions.
773
774`w FILE-NAME'
775     If the substitution was made, then write out the result to the
776     named file.  As a GNU `sed' extension, two special values of
777     FILE-NAME are supported: `/dev/stderr', which writes the result to
778     the standard error, and `/dev/stdout', which writes to the standard
779     output.(1)
780
781`e'
782     This command allows one to pipe input from a shell command into
783     pattern space.  If a substitution was made, the command that is
784     found in pattern space is executed and pattern space is replaced
785     with its output.  A trailing newline is suppressed; results are
786     undefined if the command to be executed contains a NUL character.
787     This is a GNU `sed' extension.
788
789`I'
790`i'
791     The `I' modifier to regular-expression matching is a GNU extension
792     which makes `sed' match REGEXP in a case-insensitive manner.
793
794`M'
795`m'
796     The `M' modifier to regular-expression matching is a GNU `sed'
797     extension which causes `^' and `$' to match respectively (in
798     addition to the normal behavior) the empty string after a newline,
799     and the empty string before a newline.  There are special character
800     sequences (`\`' and `\'') which always match the beginning or the
801     end of the buffer.  `M' stands for `multi-line'.
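
   The flags above can be compared side by side with a short sketch.
It assumes GNU `sed', since `2g', `I' and `M' are not portable; the
sample input and variable names are invented for the example.

```shell
first=$(printf 'a a a a\n' | sed 's/a/b/')     # first match only
all=$(printf 'a a a a\n' | sed 's/a/b/g')      # every match
third=$(printf 'a a a a\n' | sed 's/a/b/3')    # third match only
from2=$(printf 'a a a a\n' | sed 's/a/b/2g')   # second match onwards
nocase=$(printf 'Foo foo FOO\n' | sed 's/foo/bar/Ig')
# With -n, the p flag prints only lines where a substitution was made.
changed=$(printf 'one\ntwo\n' | sed -n 's/two/2/p')
# N puts two lines in pattern space; M lets ^ match after the newline.
multi=$(printf 'a\nb\n' | sed 'N; s/^b/B/M')
printf '%s\n' "$first" "$all" "$third" "$from2" "$nocase" "$changed" "$multi"
```

   Without the `M' flag the last substitution would not match, because
`^' would only match at the very start of the pattern space.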
802
803
804   ---------- Footnotes ----------
805
806   (1) This is equivalent to `p' unless the `-i' option is being used.
807
808
809File: sed.info,  Node: Other Commands,  Next: Programming Commands,  Prev: The "s" Command,  Up: sed Programs
810
8113.6 Less Frequently-Used Commands
812=================================
813
814Though perhaps less frequently used than those in the previous section,
815some very small yet useful `sed' scripts can be built with these
816commands.
817
818`y/SOURCE-CHARS/DEST-CHARS/'
819     (The `/' characters may be uniformly replaced by any other single
820     character within any given `y' command.)
821
822     Transliterate any characters in the pattern space which match any
823     of the SOURCE-CHARS with the corresponding character in DEST-CHARS.
824
825     Instances of the `/' (or whatever other character is used in its
826     stead), `\', or newlines can appear in the SOURCE-CHARS or
     DEST-CHARS lists, provided that each instance is escaped by a `\'.
828     The SOURCE-CHARS and DEST-CHARS lists _must_ contain the same
829     number of characters (after de-escaping).
830
831`a\'
832`TEXT'
833     As a GNU extension, this command accepts two addresses.
834
835     Queue the lines of text which follow this command (each but the
836     last ending with a `\', which are removed from the output) to be
837     output at the end of the current cycle, or when the next input
838     line is read.
839
840     Escape sequences in TEXT are processed, so you should use `\\' in
841     TEXT to print a single backslash.
842
843     As a GNU extension, if between the `a' and the newline there is
844     other than a whitespace-`\' sequence, then the text of this line,
845     starting at the first non-whitespace character after the `a', is
846     taken as the first line of the TEXT block.  (This enables a
847     simplification in scripting a one-line add.)  This extension also
848     works with the `i' and `c' commands.
849
850`i\'
851`TEXT'
852     As a GNU extension, this command accepts two addresses.
853
854     Immediately output the lines of text which follow this command
855     (each but the last ending with a `\', which are removed from the
856     output).
857
858`c\'
859`TEXT'
860     Delete the lines matching the address or address-range, and output
861     the lines of text which follow this command (each but the last
862     ending with a `\', which are removed from the output) in place of
863     the last line (or in place of each line, if no addresses were
864     specified).  A new cycle is started after this command is done,
865     since the pattern space will have been deleted.
866
867`='
868     As a GNU extension, this command accepts two addresses.
869
870     Print out the current input line number (with a trailing newline).
871
872`l N'
873     Print the pattern space in an unambiguous form: non-printable
874     characters (and the `\' character) are printed in C-style escaped
875     form; long lines are split, with a trailing `\' character to
876     indicate the split; the end of each line is marked with a `$'.
877
878     N specifies the desired line-wrap length; a length of 0 (zero)
879     means to never wrap long lines.  If omitted, the default as
880     specified on the command line is used.  The N parameter is a GNU
881     `sed' extension.
882
883`r FILENAME'
884     As a GNU extension, this command accepts two addresses.
885
886     Queue the contents of FILENAME to be read and inserted into the
887     output stream at the end of the current cycle, or when the next
888     input line is read.  Note that if FILENAME cannot be read, it is
889     treated as if it were an empty file, without any error indication.
890
891     As a GNU `sed' extension, the special value `/dev/stdin' is
892     supported for the file name, which reads the contents of the
893     standard input.
894
895`w FILENAME'
896     Write the pattern space to FILENAME.  As a GNU `sed' extension,
     two special values of FILENAME are supported: `/dev/stderr',
898     which writes the result to the standard error, and `/dev/stdout',
899     which writes to the standard output.(1)
900
901     The file will be created (or truncated) before the first input
902     line is read; all `w' commands (including instances of `w' flag on
903     successful `s' commands) which refer to the same FILENAME are
904     output without closing and reopening the file.
905
906`D'
907     Delete text in the pattern space up to the first newline.  If any
908     text is left, restart cycle with the resultant pattern space
909     (without reading a new line of input), otherwise start a normal
910     new cycle.
911
912`N'
913     Add a newline to the pattern space, then append the next line of
914     input to the pattern space.  If there is no more input then `sed'
915     exits without processing any more commands.
916
917`P'
918     Print out the portion of the pattern space up to the first newline.
919
920`h'
921     Replace the contents of the hold space with the contents of the
922     pattern space.
923
924`H'
925     Append a newline to the contents of the hold space, and then
926     append the contents of the pattern space to that of the hold space.
927
928`g'
929     Replace the contents of the pattern space with the contents of the
930     hold space.
931
932`G'
933     Append a newline to the contents of the pattern space, and then
934     append the contents of the hold space to that of the pattern space.
935
936`x'
937     Exchange the contents of the hold and pattern spaces.
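
   A few of the commands in this section can be tried with the
one-liners below.  This is only a sketch, assuming GNU `sed' (the
one-line forms of `a' and `i' are GNU extensions); inputs and variable
names are illustrative.

```shell
# y maps each source character to the character at the same position.
upper=$(printf 'hello\n' |
  sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')
# One-line i and a: text before the first line, text after the last.
framed=$(printf 'body\n' | sed -e '1i top' -e '$a bottom')
# = prints the line number on a line of its own.
numbered=$(printf 'x\ny\n' | sed '=')
# N appends the next line; the embedded newline can then be edited.
paired=$(printf '1\n2\n3\n4\n' | sed 'N; s/\n/,/')
# G appends a newline and the (still empty) hold space: double spacing.
spaced=$(printf 'one\ntwo\n' | sed 'G')
printf '%s\n' "$upper" "$framed" "$numbered" "$paired" "$spaced"
```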
938
939
940   ---------- Footnotes ----------
941
942   (1) This is equivalent to `p' unless the `-i' option is being used.
943
944
945File: sed.info,  Node: Programming Commands,  Next: Extended Commands,  Prev: Other Commands,  Up: sed Programs
946
9473.7 Commands for `sed' gurus
948============================
949
950In most cases, use of these commands indicates that you are probably
951better off programming in something like `awk' or Perl.  But
952occasionally one is committed to sticking with `sed', and these
953commands can enable one to write quite convoluted scripts.
954
955`: LABEL'
956     [No addresses allowed.]
957
958     Specify the location of LABEL for branch commands.  In all other
959     respects, a no-op.
960
961`b LABEL'
962     Unconditionally branch to LABEL.  The LABEL may be omitted, in
963     which case the next cycle is started.
964
965`t LABEL'
966     Branch to LABEL only if there has been a successful `s'ubstitution
967     since the last input line was read or conditional branch was taken.
968     The LABEL may be omitted, in which case the next cycle is started.
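
   As a sketch of how these three commands combine, the first command
below loops until no double space is left, and the second uses `b'
without a label to skip the rest of the script for comment lines.  The
label name `again' and the sample text are invented for the example;
each piece is passed with its own `-e' so that the label ends at the
end of its chunk.

```shell
# Replace one double space per pass; t loops while a substitution succeeds.
squeezed=$(printf 'a  b     c\n' |
  sed -e ':again' -e 's/  / /' -e 't again')
# b with no label jumps to the end of the script, so lines starting
# with # are printed unchanged.
skipped=$(printf '# foo\nfoo\n' | sed -e '/^#/b' -e 's/o/0/g')
printf '%s\n' "$squeezed" "$skipped"
```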
969
970
971
972File: sed.info,  Node: Extended Commands,  Next: Escapes,  Prev: Programming Commands,  Up: sed Programs
973
9743.8 Commands Specific to GNU `sed'
975==================================
976
977These commands are specific to GNU `sed', so you must use them with
978care and only when you are sure that hindering portability is not evil.
979They allow you to check for GNU `sed' extensions or to do tasks that
980are required quite often, yet are unsupported by standard `sed's.
981
982`e [COMMAND]'
983     This command allows one to pipe input from a shell command into
984     pattern space.  Without parameters, the `e' command executes the
985     command that is found in pattern space and replaces the pattern
986     space with the output; a trailing newline is suppressed.
987
988     If a parameter is specified, instead, the `e' command interprets
989     it as a command and sends its output to the output stream (like
990     `r' does).  The command can run across multiple lines, all but the
991     last ending with a back-slash.
992
993     In both cases, the results are undefined if the command to be
994     executed contains a NUL character.
995
996`L N'
997     This GNU `sed' extension fills and joins lines in pattern space to
998     produce output lines of (at most) N characters, like `fmt' does;
999     if N is omitted, the default as specified on the command line is
     used.  This command is considered a failed experiment and, unless
     there is enough demand (which seems unlikely), will be removed in
     future versions.
1003
1004`Q [EXIT-CODE]'
1005     This command only accepts a single address.
1006
1007     This command is the same as `q', but will not print the contents
1008     of pattern space.  Like `q', it provides the ability to return an
1009     exit code to the caller.
1010
1011     This command can be useful because the only alternative ways to
1012     accomplish this apparently trivial function are to use the `-n'
1013     option (which can unnecessarily complicate your script) or
1014     resorting to the following snippet, which wastes time by reading
1015     the whole file without any visible effect:
1016
1017          :eat
1018          $d       Quit silently on the last line
1019          N        Read another line, silently
1020          g        Overwrite pattern space each time to save memory
1021          b eat
1022
1023`R FILENAME'
1024     Queue a line of FILENAME to be read and inserted into the output
1025     stream at the end of the current cycle, or when the next input
1026     line is read.  Note that if FILENAME cannot be read, or if its end
1027     is reached, no line is appended, without any error indication.
1028
1029     As with the `r' command, the special value `/dev/stdin' is
1030     supported for the file name, which reads a line from the standard
1031     input.
1032
1033`T LABEL'
1034     Branch to LABEL only if there have been no successful
1035     `s'ubstitutions since the last input line was read or conditional
1036     branch was taken. The LABEL may be omitted, in which case the next
1037     cycle is started.
1038
1039`v VERSION'
1040     This command does nothing, but makes `sed' fail if GNU `sed'
1041     extensions are not supported, simply because other versions of
1042     `sed' do not implement it.  In addition, you can specify the
1043     version of `sed' that your script requires, such as `4.0.5'.  The
1044     default is `4.0' because that is the first version that
1045     implemented this command.
1046
1047     This command enables all GNU extensions even if `POSIXLY_CORRECT'
1048     is set in the environment.
1049
1050`W FILENAME'
1051     Write to the given filename the portion of the pattern space up to
1052     the first newline.  Everything said under the `w' command about
1053     file handling holds here too.
1054
1055`z'
1056     This command empties the content of pattern space.  It is usually
1057     the same as `s/.*//', but is more efficient and works in the
1058     presence of invalid multibyte sequences in the input stream.
1059     POSIX mandates that such sequences are _not_ matched by `.', so
1060     that there is no portable way to clear `sed''s buffers in the
1061     middle of the script in most multibyte locales (including UTF-8
1062     locales).
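
   The sketch below tries `Q', `T' and `z' on small inputs.  It
assumes a GNU `sed' recent enough to provide all three; the inputs and
variable names are made up for illustration.

```shell
with_q=$(printf '1\n2\n3\n4\n' | sed '3q')   # line 3 is still printed
with_Q=$(printf '1\n2\n3\n4\n' | sed '3Q')   # line 3 is not printed
# Q can also set the exit status of sed.
status=0
printf 'x\n' | sed 'Q5' || status=$?
# T skips the rest of the script when no substitution has been made.
tagged=$(printf 'foo\nbar\n' |
  sed -e 's/foo/FOO/' -e 'T' -e 's/$/ (changed)/')
# z empties the pattern space, so the empty regexp ^$ then matches.
zapped=$(printf 'abc\n' | sed 'z; s/^$/(empty)/')
printf '%s\n' "$with_q" "$with_Q" "$status" "$tagged" "$zapped"
```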
1063
1064
1065File: sed.info,  Node: Escapes,  Prev: Extended Commands,  Up: sed Programs
1066
10673.9 GNU Extensions for Escapes in Regular Expressions
1068=====================================================
1069
1070Until this chapter, we have only encountered escapes of the form `\^',
1071which tell `sed' not to interpret the circumflex as a special
1072character, but rather to take it literally.  For example, `\*' matches
1073a single asterisk rather than zero or more backslashes.
1074
1075   This chapter introduces another kind of escape(1)--that is, escapes
1076that are applied to a character or sequence of characters that
1077ordinarily are taken literally, and that `sed' replaces with a special
1078character.  This provides a way of encoding non-printable characters in
1079patterns in a visible manner.  There is no restriction on the
1080appearance of non-printing characters in a `sed' script but when a
1081script is being prepared in the shell or by text editing, it is usually
1082easier to use one of the following escape sequences than the binary
character it represents.
1084
1085   The list of these escapes is:
1086
1087`\a'
1088     Produces or matches a BEL character, that is an "alert" (ASCII 7).
1089
1090`\f'
1091     Produces or matches a form feed (ASCII 12).
1092
1093`\n'
1094     Produces or matches a newline (ASCII 10).
1095
1096`\r'
1097     Produces or matches a carriage return (ASCII 13).
1098
1099`\t'
1100     Produces or matches a horizontal tab (ASCII 9).
1101
1102`\v'
1103     Produces or matches a so called "vertical tab" (ASCII 11).
1104
1105`\cX'
1106     Produces or matches `CONTROL-X', where X is any character.  The
1107     precise effect of `\cX' is as follows: if X is a lower case
1108     letter, it is converted to upper case.  Then bit 6 of the
1109     character (hex 40) is inverted.  Thus `\cz' becomes hex 1A, but
1110     `\c{' becomes hex 3B, while `\c;' becomes hex 7B.
1111
1112`\dXXX'
1113     Produces or matches a character whose decimal ASCII value is XXX.
1114
1115`\oXXX'
1116     Produces or matches a character whose octal ASCII value is XXX.
1117
1118`\xXX'
1119     Produces or matches a character whose hexadecimal ASCII value is
1120     XX.
1121
1122   `\b' (backspace) was omitted because of the conflict with the
1123existing "word boundary" meaning.
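
   A short sketch of some of these escapes in use, assuming GNU `sed'.
The sample strings and variable names are illustrative only.

```shell
# \t matches a tab character.
tabs=$(printf 'a\tb\tc\n' | sed 's/\t/ /g')
# Hexadecimal 2d and octal 055 both denote the `-' character.
hex=$(printf 'A-B\n' | sed 's/\x2d/+/')
octal=$(printf 'A-B\n' | sed 's/\o055/+/')
# In the replacement, \n produces a newline.
split=$(printf 'a,b\n' | sed 's/,/\n/')
printf '%s\n' "$tabs" "$hex" "$octal" "$split"
```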
1124
1125   Other escapes match a particular character class and are valid only
1126in regular expressions:
1127
1128`\w'
1129     Matches any "word" character.  A "word" character is any letter or
1130     digit or the underscore character.
1131
1132`\W'
1133     Matches any "non-word" character.
1134
1135`\b'
     Matches a word boundary; that is, it matches if the character to
1137     the left is a "word" character and the character to the right is a
1138     "non-word" character, or vice-versa.
1139
1140`\B'
     Matches everywhere but on a word boundary; that is, it matches if
1142     the character to the left and the character to the right are
1143     either both "word" characters or both "non-word" characters.
1144
1145`\`'
1146     Matches only at the start of pattern space.  This is different
1147     from `^' in multi-line mode.
1148
1149`\''
1150     Matches only at the end of pattern space.  This is different from
1151     `$' in multi-line mode.
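
   The character-class escapes can be tried as follows.  This is a
sketch assuming GNU `sed' (`\+' is itself a GNU extension in basic
regular expressions); the inputs and variable names are invented.

```shell
# \b keeps `cat' from matching inside `concat'.
whole=$(printf 'cat concat cat\n' | sed 's/\bcat\b/dog/g')
# \w matches letters, digits and underscore; \W matches everything else.
words=$(printf 'foo_bar, baz42!\n' | sed 's/\w\+/W/g')
gaps=$(printf 'foo_bar, baz42!\n' | sed 's/\W\+/-/g')
printf '%s\n' "$whole" "$words" "$gaps"
```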
1152
1153
1154   ---------- Footnotes ----------
1155
1156   (1) All the escapes introduced here are GNU extensions, with the
1157exception of `\n'.  In basic regular expression mode, setting
1158`POSIXLY_CORRECT' disables them inside bracket expressions.
1159
1160
1161File: sed.info,  Node: Examples,  Next: Limitations,  Prev: sed Programs,  Up: Top
1162
11634 Some Sample Scripts
1164*********************
1165
1166Here are some `sed' scripts to guide you in the art of mastering `sed'.
1167
1168* Menu:
1169
1170Some exotic examples:
1171* Centering lines::
1172* Increment a number::
1173* Rename files to lower case::
1174* Print bash environment::
1175* Reverse chars of lines::
1176
1177Emulating standard utilities:
1178* tac::                             Reverse lines of files
1179* cat -n::                          Numbering lines
1180* cat -b::                          Numbering non-blank lines
1181* wc -c::                           Counting chars
1182* wc -w::                           Counting words
1183* wc -l::                           Counting lines
1184* head::                            Printing the first lines
1185* tail::                            Printing the last lines
1186* uniq::                            Make duplicate lines unique
1187* uniq -d::                         Print duplicated lines of input
1188* uniq -u::                         Remove all duplicated lines
1189* cat -s::                          Squeezing blank lines
1190
1191
1192File: sed.info,  Node: Centering lines,  Next: Increment a number,  Up: Examples
1193
11944.1 Centering Lines
1195===================
1196
This script centers all lines of a file within an 80-column width.  To
change that width, replace the number in `\{...\}' and change the
number of added spaces to match.
1200
1201   Note how the buffer commands are used to separate parts in the
1202regular expressions to be matched--this is a common technique.
1203
1204     #!/usr/bin/sed -f
1205
1206     # Put 80 spaces in the buffer
1207     1 {
1208       x
1209       s/^$/          /
1210       s/^.*$/&&&&&&&&/
1211       x
1212     }
1213
     # delete leading and trailing spaces; `tab' below stands for
     # a literal tab character
     y/tab/ /
1216     s/^ *//
1217     s/ *$//
1218
1219     # add a newline and 80 spaces to end of line
1220     G
1221
1222     # keep first 81 chars (80 + a newline)
1223     s/^\(.\{81\}\).*$/\1/
1224
1225     # \2 matches half of the spaces, which are moved to the beginning
1226     s/^\(.*\)\n\(.*\)\2/\2\1/
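
   To see the effect of changing the width, here is a sketch of the
same technique for a 10-column width: the hold space receives 10
spaces and `\{81\}' becomes `\{11\}'.  The function name `center10' is
hypothetical, the tab handling is left out for brevity, and GNU `sed'
is assumed.

```shell
# center10: hypothetical 10-column variant of the script above.
center10() {
  sed -e '1 { x; s/^$/     /; s/^.*$/&&/; x; }' \
      -e 's/^ *//; s/ *$//' \
      -e 'G' \
      -e 's/^\(.\{11\}\).*$/\1/' \
      -e 's/^\(.*\)\n\(.*\)\2/\2\1/'
}
short=$(printf 'ab\n' | center10)
longer=$(printf '  abcd  \n' | center10)
printf '[%s]\n[%s]\n' "$short" "$longer"
```

   The two-character line should come out with four leading spaces and
the four-character line with three.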
1227
1228
1229File: sed.info,  Node: Increment a number,  Next: Rename files to lower case,  Prev: Centering lines,  Up: Examples
1230
12314.2 Increment a Number
1232======================
1233
1234This script is one of a few that demonstrate how to do arithmetic in
1235`sed'.  This is indeed possible,(1) but must be done manually.
1236
   To increment a number you add 1 to its last digit, replacing it
with the next digit.  There is one exception: when the digit is a
nine, it becomes a zero and the digit before it must be incremented
as well, and so on for as long as that digit is itself a nine.
1241
   This solution by Bruno Haible is very clever because it
1243uses a single buffer; if you don't have this limitation, the algorithm
1244used in *note Numbering lines: cat -n, is faster.  It works by
1245replacing trailing nines with an underscore, then using multiple `s'
1246commands to increment the last digit, and then again substituting
1247underscores with zeros.
1248
1249     #!/usr/bin/sed -f
1250
1251     /[^0-9]/ d
1252
     # replace all trailing 9s by _ (any other character except digits
     # could be used)
1255     :d
1256     s/9\(_*\)$/_\1/
1257     td
1258
1259     # incr last digit only.  The first line adds a most-significant
1260     # digit of 1 if we have to add a digit.
1261     #
1262     # The `tn' commands are not necessary, but make the thing
1263     # faster
1264
1265     s/^\(_*\)$/1\1/; tn
1266     s/8\(_*\)$/9\1/; tn
1267     s/7\(_*\)$/8\1/; tn
1268     s/6\(_*\)$/7\1/; tn
1269     s/5\(_*\)$/6\1/; tn
1270     s/4\(_*\)$/5\1/; tn
1271     s/3\(_*\)$/4\1/; tn
1272     s/2\(_*\)$/3\1/; tn
1273     s/1\(_*\)$/2\1/; tn
1274     s/0\(_*\)$/1\1/; tn
1275
1276     :n
1277     y/_/0/
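
   The phases of the script can be followed on a single number.  The
sketch below runs the first loop on `1299' and then applies the one
substitution that matters for this input, followed by the final `y'.
The variable names are invented for the example.

```shell
# Phase 1: trailing nines become underscores, one per pass.
marked=$(printf '1299\n' | sed -e ':d' -e 's/9\(_*\)$/_\1/' -e 'td')
# Phases 2 and 3: bump the last real digit, then turn _ into 0.
result=$(printf '%s\n' "$marked" | sed -e 's/2\(_*\)$/3\1/' -e 'y/_/0/')
printf '%s\n%s\n' "$marked" "$result"
```

   The intermediate value should be `12__' and the final one `1300'.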
1278
1279   ---------- Footnotes ----------
1280
1281   (1) `sed' guru Greg Ubben wrote an implementation of the `dc' RPN
1282calculator!  It is distributed together with sed.
1283
1284
1285File: sed.info,  Node: Rename files to lower case,  Next: Print bash environment,  Prev: Increment a number,  Up: Examples
1286
12874.3 Rename Files to Lower Case
1288==============================
1289
This is a pretty strange use of `sed'.  We transform text into shell
commands, then just feed them to the shell.  Don't
1292worry, even worse hacks are done when using `sed'; I have seen a script
1293converting the output of `date' into a `bc' program!
1294
1295   The main body of this is the `sed' script, which remaps the name
from lower to upper (or vice-versa) and even checks whether the remapped
1297name is the same as the original name.  Note how the script is
1298parameterized using shell variables and proper quoting.
1299
1300     #! /bin/sh
1301     # rename files to lower/upper case...
1302     #
1303     # usage:
1304     #    move-to-lower *
1305     #    move-to-upper *
1306     # or
1307     #    move-to-lower -R .
1308     #    move-to-upper -R .
1309     #
1310
1311     help()
1312     {
1313             cat << eof
     Usage: $0 [-n] [-R] [-h] files...
1315
1316     -n      do nothing, only see what would be done
1317     -R      recursive (use find)
1318     -h      this message
1319     files   files to remap to lower case
1320
1321     Examples:
1322            $0 -n *        (see if everything is ok, then...)
1323            $0 *
1324
1325            $0 -R .
1326
1327     eof
1328     }
1329
1330     apply_cmd='sh'
1331     finder='echo "$@" | tr " " "\n"'
1332     files_only=
1333
1334     while :
1335     do
1336         case "$1" in
1337             -n) apply_cmd='cat' ;;
1338             -R) finder='find "$@" -type f';;
1339             -h) help ; exit 1 ;;
1340             *) break ;;
1341         esac
1342         shift
1343     done
1344
1345     if [ -z "$1" ]; then
             echo "Usage: $0 [-h] [-n] [-R] files..."
1347             exit 1
1348     fi
1349
1350     LOWER='abcdefghijklmnopqrstuvwxyz'
1351     UPPER='ABCDEFGHIJKLMNOPQRSTUVWXYZ'
1352
     case `basename "$0"` in
1354             *upper*) TO=$UPPER; FROM=$LOWER ;;
1355             *)       FROM=$UPPER; TO=$LOWER ;;
1356     esac
1357
1358     eval $finder | sed -n '
1359
1360     # remove all trailing slashes
1361     s/\/*$//
1362
1363     # add ./ if there is no path, only a filename
1364     /\//! s/^/.\//
1365
1366     # save path+filename
1367     h
1368
1369     # remove path
1370     s/.*\///
1371
1372     # do conversion only on filename
1373     y/'$FROM'/'$TO'/
1374
1375     # now line contains original path+file, while
1376     # hold space contains the new filename
1377     x
1378
1379     # add converted file name to line, which now contains
1380     # path/file-name\nconverted-file-name
1381     G
1382
1383     # check if converted file name is equal to original file name,
     # if it is, do not print anything
1385     /^.*\/\(.*\)\n\1/b
1386
     # now, transform path/fromfile\ntofile into
1388     # mv path/fromfile path/tofile and print it
1389     s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p
1390
1391     ' | $apply_cmd
1392
1393
1394File: sed.info,  Node: Print bash environment,  Next: Reverse chars of lines,  Prev: Rename files to lower case,  Up: Examples
1395
13964.4 Print `bash' Environment
1397============================
1398
1399This script strips the definition of the shell functions from the
1400output of the `set' Bourne-shell command.
1401
1402     #!/bin/sh
1403
1404     set | sed -n '
1405     :x
1406
1407     # if no occurrence of "=()" print and load next line
1408     /=()/! { p; b; }
1409     / () $/! { p; b; }
1410
1411     # possible start of functions section
1412     # save the line in case this is a var like FOO="() "
1413     h
1414
1415     # if the next line has a brace, we quit because
1416     # nothing comes after functions
1417     n
1418     /^{/ q
1419
1420     # print the old line
1421     x; p
1422
1423     # work on the new line now
1424     x; bx
1425     '
1426
1427
1428File: sed.info,  Node: Reverse chars of lines,  Next: tac,  Prev: Print bash environment,  Up: Examples
1429
14304.5 Reverse Characters of Lines
1431===============================
1432
1433This script can be used to reverse the position of characters in lines.
1434The technique moves two characters at a time, hence it is faster than
1435more intuitive implementations.
1436
1437   Note the `tx' command before the definition of the label.  This is
1438often needed to reset the flag that is tested by the `t' command.
1439
1440   Imaginative readers will find uses for this script.  An example is
1441reversing the output of `banner'.(1)
1442
1443     #!/usr/bin/sed -f
1444
1445     /../! b
1446
     # Reverse a line.  Begin by embedding the line between two newlines
1448     s/^.*$/\
1449     &\
1450     /
1451
     # Move the first character to the end.  The regexp matches until
1453     # there are zero or one characters between the markers
1454     tx
1455     :x
1456     s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/
1457     tx
1458
1459     # Remove the newline markers
1460     s/\n//g
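
   The swapping step can be watched in isolation.  The sketch below
wraps `hello' in newline markers, applies the central substitution once
and then twice, and finally shows each marker as `|' so that the result
fits on one line.  It assumes GNU `sed', using `\n' in the replacement
instead of a backslash-newline pair; the variable names are invented.

```shell
swap='s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/'
# One swap: the outermost characters trade places across the markers.
once=$(printf 'hello\n' |
  sed -e 's/^.*$/\n&\n/' -e "$swap" -e 's/\n/|/g')
# A second swap moves the next pair.
twice=$(printf 'hello\n' |
  sed -e 's/^.*$/\n&\n/' -e "$swap" -e "$swap" -e 's/\n/|/g')
printf '%s\n%s\n' "$once" "$twice"
```

   The two results should be `o|ell|h' and `ol|l|eh'; at that point
only one character remains between the markers, so the loop in the
script stops.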
1461
1462   ---------- Footnotes ----------
1463
1464   (1) This requires another script to pad the output of banner; for
1465example
1466
1467     #! /bin/sh
1468
1469     banner -w $1 $2 $3 $4 |
1470       sed -e :a -e '/^.\{0,'$1'\}$/ { s/$/ /; ba; }' |
1471       ~/sedscripts/reverseline.sed
1472
1473
1474File: sed.info,  Node: tac,  Next: cat -n,  Prev: Reverse chars of lines,  Up: Examples
1475
14764.6 Reverse Lines of Files
1477==========================
1478
1479This one begins a series of totally useless (yet interesting) scripts
1480emulating various Unix commands.  This, in particular, is a `tac'
1481workalike.
1482
1483   Note that on implementations other than GNU `sed' this script might
1484easily overflow internal buffers.
1485
1486     #!/usr/bin/sed -nf
1487
     # reverse all lines of input, i.e. first line becomes last, ...
1489
1490     # from the second line, the buffer (which contains all previous lines)
1491     # is *appended* to current line, so, the order will be reversed
1492     1! G
1493
1494     # on the last line we're done -- print everything
1495     $ p
1496
1497     # store everything on the buffer again
1498     h
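
   The same three commands can be tried from the command line.  This
is only a usage sketch with made-up input; the variable name is
illustrative.

```shell
# 1!G prepends the current line to what is held, $p prints at the end,
# h saves the accumulated text for the next cycle.
reversed=$(printf 'one\ntwo\nthree\n' | sed -n -e '1! G' -e '$ p' -e 'h')
printf '%s\n' "$reversed"
```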
1499
1500
1501File: sed.info,  Node: cat -n,  Next: cat -b,  Prev: tac,  Up: Examples
1502
15034.7 Numbering Lines
1504===================
1505
1506This script replaces `cat -n'; in fact it formats its output exactly
1507like GNU `cat' does.
1508
   Of course this is completely useless, for two reasons: first,
because somebody else did it in C; second, because the following
Bourne-shell script could be used for the same purpose and would be
much faster:
1513
1514     #! /bin/sh
     sed -e "=" "$@" | sed -e '
1516       s/^/      /
1517       N
1518       s/^ *\(......\)\n/\1  /
1519     '
1520
1521   It uses `sed' to print the line number, then groups lines two by two
1522using `N'.  Of course, this script does not teach as much as the one
1523presented below.
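
   The two-stage pipeline can be checked on a small input.  This is a
sketch with invented sample lines; the first `sed' emits each line
number on its own line, and the second joins it with the line that
follows.

```shell
numbered=$(printf 'alpha\nbeta\n' | sed -e '=' |
  sed -e 's/^/      /' -e 'N' -e 's/^ *\(......\)\n/\1  /')
printf '%s\n' "$numbered"
```

   Each number ends up right-aligned in a six-character field,
followed by two spaces and the text of the line.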
1524
1525   The algorithm used for incrementing uses both buffers, so the line
1526is printed as soon as possible and then discarded.  The number is split
1527so that changing digits go in a buffer and unchanged ones go in the
1528other; the changed digits are modified in a single step (using a `y'
1529command).  The line number for the next line is then composed and
1530stored in the hold space, to be used in the next iteration.
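
   The incrementing step described above can be isolated as a sketch.
The function name `next_number' is hypothetical; it takes one number
per invocation and applies the same split, `y' and recombination steps
that the script uses.

```shell
# next_number: hypothetical wrapper around the increment steps.
next_number() {
  sed -e '/^9*$/ s/^/0/' \
      -e 's/.9*$/x&/' \
      -e 'h' \
      -e 's/^.*x//' \
      -e 'y/0123456789/1234567890/' \
      -e 'x' \
      -e 's/x.*$//' \
      -e 'G' \
      -e 's/\n//'
}
a=$(printf '7\n' | next_number)
b=$(printf '1299\n' | next_number)
c=$(printf '99\n' | next_number)
printf '%s\n' "$a" "$b" "$c"
```

   For `1299' the changing digits are `299', which `y' turns into
`300' in a single step, while the unchanged `1' is kept aside and
prepended again.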
1531
1532     #!/usr/bin/sed -nf
1533
1534     # Prime the pump on the first line
1535     x
1536     /^$/ s/^.*$/1/
1537
1538     # Add the correct line number before the pattern
1539     G
1540     h
1541
1542     # Format it and print it
1543     s/^/      /
1544     s/^ *\(......\)\n/\1  /p
1545
1546     # Get the line number from hold space; add a zero
1547     # if we're going to add a digit on the next line
1548     g
1549     s/\n.*$//
1550     /^9*$/ s/^/0/
1551
1552     # separate changing/unchanged digits with an x
1553     s/.9*$/x&/
1554
1555     # keep changing digits in hold space
1556     h
1557     s/^.*x//
1558     y/0123456789/1234567890/
1559     x
1560
1561     # keep unchanged digits in pattern space
1562     s/x.*$//
1563
1564     # compose the new number, remove the newline implicitly added by G
1565     G
1566     s/\n//
1567     h
1568
1569
1570File: sed.info,  Node: cat -b,  Next: wc -c,  Prev: cat -n,  Up: Examples
1571
15724.8 Numbering Non-blank Lines
1573=============================
1574
1575Emulating `cat -b' is almost the same as `cat -n'--we only have to
1576select which lines are to be numbered and which are not.
1577
1578   The part that is common to this script and the previous one is not
1579commented to show how important it is to comment `sed' scripts
1580properly...
1581
1582     #!/usr/bin/sed -nf
1583
1584     /^$/ {
1585       p
1586       b
1587     }
1588
1589     # Same as cat -n from now
1590     x
1591     /^$/ s/^.*$/1/
1592     G
1593     h
1594     s/^/      /
1595     s/^ *\(......\)\n/\1  /p
1596     x
1597     s/\n.*$//
1598     /^9*$/ s/^/0/
1599     s/.9*$/x&/
1600     h
1601     s/^.*x//
1602     y/0123456789/1234567890/
1603     x
1604     s/x.*$//
1605     G
1606     s/\n//
1607     h
1608
1609
1610File: sed.info,  Node: wc -c,  Next: wc -w,  Prev: cat -b,  Up: Examples
1611
16124.9 Counting Characters
1613=======================
1614
1615This script shows another way to do arithmetic with `sed'.  In this
1616case we have to add possibly large numbers, so implementing this by
1617successive increments would not be feasible (and possibly even more
1618complicated to contrive than this script).
1619
1620   The approach is to map numbers to letters, kind of an abacus
1621implemented with `sed'.  `a's are units, `b's are tens and so on: we
1622simply add the number of characters on the current line as units, and
1623then propagate the carry to tens, hundreds, and so on.
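
   A sketch of the carry step on its own: the input below stands for
nine tens and twelve units, that is 102.  The variable names are
invented for the example.

```shell
tens=bbbbbbbbb          # nine b's
units=aaaaaaaaaaaa      # twelve a's
# Ten a's carry into one b, then ten b's carry into one c.
carried=$(printf '%s%s\n' "$tens" "$units" |
  sed -e 's/aaaaaaaaaa/b/g' -e 's/bbbbbbbbbb/c/g')
printf '%s\n' "$carried"
```

   The result `caa' reads as one hundred, no tens and two units.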
1624
1625   As usual, running totals are kept in hold space.
1626
1627   On the last line, we convert the abacus form back to decimal.  For
1628the sake of variety, this is done with a loop rather than with some 80
1629`s' commands(1): first we convert units, removing `a's from the number;
1630then we rotate letters so that tens become `a's, and so on until no
1631more letters remain.
1632
1633     #!/usr/bin/sed -nf
1634
1635     # Add n+1 a's to hold space (+1 is for the newline)
1636     s/./a/g
1637     H
1638     x
1639     s/\n/a/
1640
1641     # Do the carry.  The t's and b's are not necessary,
1642     # but they do speed up the thing
1643     t a
1644     : a;  s/aaaaaaaaaa/b/g; t b; b done
1645     : b;  s/bbbbbbbbbb/c/g; t c; b done
1646     : c;  s/cccccccccc/d/g; t d; b done
1647     : d;  s/dddddddddd/e/g; t e; b done
1648     : e;  s/eeeeeeeeee/f/g; t f; b done
1649     : f;  s/ffffffffff/g/g; t g; b done
1650     : g;  s/gggggggggg/h/g; t h; b done
1651     : h;  s/hhhhhhhhhh//g
1652
1653     : done
1654     $! {
1655       h
1656       b
1657     }
1658
1659     # On the last line, convert back to decimal
1660
1661     : loop
1662     /a/! s/[b-h]*/&0/
1663     s/aaaaaaaaa/9/
1664     s/aaaaaaaa/8/
1665     s/aaaaaaa/7/
1666     s/aaaaaa/6/
1667     s/aaaaa/5/
1668     s/aaaa/4/
1669     s/aaa/3/
1670     s/aa/2/
1671     s/a/1/
1672
1673     : next
1674     y/bcdefgh/abcdefg/
1675     /[a-h]/ b loop
1676     p
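
   To make the abacus encoding concrete, the sketch below isolates
two pieces of the script: a single carry step, and the final loop that
turns letters back into digits.  The function names `carry_units' and
`abacus_to_decimal' are hypothetical, and GNU `sed' is assumed.

```shell
# One carry step: every ten a's (units) become one b (a ten).
carry_units() {
  sed 's/aaaaaaaaaa/b/g'
}

# Convert an abacus string such as "cbbaaa" back to decimal, using the
# same loop as the last part of the script above.
abacus_to_decimal() {
  sed '
    :loop
    # No units at this position: write a 0 digit after the letters
    /a/! s/[b-h]*/&0/
    s/aaaaaaaaa/9/
    s/aaaaaaaa/8/
    s/aaaaaaa/7/
    s/aaaaaa/6/
    s/aaaaa/5/
    s/aaaa/4/
    s/aaa/3/
    s/aa/2/
    s/a/1/
    # Shift every letter down one place: tens become the new units
    y/bcdefgh/abcdefg/
    /[a-h]/ b loop
  '
}

printf 'aaaaaaaaaaaa\n' | carry_units
printf 'aaaaaaaaaaaa\n' | carry_units | abacus_to_decimal
printf 'cbbaaa\n' | abacus_to_decimal
printf 'caaa\n' | abacus_to_decimal
```

   Twelve units carry to `baa', which decodes to 12; `cbbaaa' decodes
to 123, and `caaa', which has no tens, decodes to 103.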

   ---------- Footnotes ----------

   (1) Some implementations have a limit of 199 commands per script


File: sed.info,  Node: wc -w,  Next: wc -l,  Prev: wc -c,  Up: Examples

4.10 Counting Words
===================

This script is almost the same as the previous one, once each of the
words on the line is converted to a single `a' (in the previous script
each letter was changed to an `a').

   Interestingly, real `wc' programs have optimized loops for `wc -c',
so they count words much more slowly than characters.  This script's
bottleneck is instead the arithmetic, so the word-counting script is
the faster one (it has to manage smaller numbers).

   Again, the common parts are not commented to show the importance of
commenting `sed' scripts.

     #!/usr/bin/sed -nf

     # Convert words to a's
     # (`tab' stands for a literal tab character)
     s/[ tab][ tab]*/ /g
     s/^/ /
     s/ [^ ][^ ]*/a /g
     s/ //g

     # Append them to hold space
     H
     x
     s/\n//

     # From here on it is the same as in wc -c.
     /aaaaaaaaaa/! bx;   s/aaaaaaaaaa/b/g
     /bbbbbbbbbb/! bx;   s/bbbbbbbbbb/c/g
     /cccccccccc/! bx;   s/cccccccccc/d/g
     /dddddddddd/! bx;   s/dddddddddd/e/g
     /eeeeeeeeee/! bx;   s/eeeeeeeeee/f/g
     /ffffffffff/! bx;   s/ffffffffff/g/g
     /gggggggggg/! bx;   s/gggggggggg/h/g
     s/hhhhhhhhhh//g
     :x
     $! { h; b; }
     :y
     /a/! s/[b-h]*/&0/
     s/aaaaaaaaa/9/
     s/aaaaaaaa/8/
     s/aaaaaaa/7/
     s/aaaaaa/6/
     s/aaaaa/5/
     s/aaaa/4/
     s/aaa/3/
     s/aa/2/
     s/a/1/
     y/bcdefgh/abcdefg/
     /[a-h]/ by
     p
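
   The only new part of this script is the word-to-`a' conversion.
The sketch below runs just those four substitutions.  It assumes GNU
`sed', uses the POSIX class `[[:blank:]]' (space or tab) where the
script needs a literal tab, and wraps the commands in a hypothetical
`words_to_a' function.

```shell
# Hypothetical helper: reduce every word on a line to a single "a".
words_to_a() {
  sed '
    # Collapse each run of spaces and tabs into one space
    s/[[:blank:]][[:blank:]]*/ /g
    # Make sure the first word is preceded by a space too
    s/^/ /
    # Replace each space-plus-word with "a "
    s/ [^ ][^ ]*/a /g
    # Drop the remaining spaces, leaving one a per word
    s/ //g
  '
}

printf 'the quick  brown\tfox\n' | words_to_a
printf '   leading and trailing   \n' | words_to_a
```

   The first line has four words and yields `aaaa'; the second has
three and yields `aaa', regardless of the surrounding blanks.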


File: sed.info,  Node: wc -l,  Next: head,  Prev: wc -w,  Up: Examples

4.11 Counting Lines
===================

Nothing strange is needed this time, because `sed' gives us `wc -l'
functionality for free:

     #!/usr/bin/sed -nf
     $=
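
   A short usage sketch follows; `count_lines' is a hypothetical
wrapper name.  It also shows one difference from `wc -l': on empty
input there is no last line, so `$=' prints nothing rather than 0.

```shell
# Hypothetical helper: print the number of the last input line.
count_lines() {
  sed -n '$='
}

printf 'one\ntwo\nthree\n' | count_lines   # prints 3
printf '' | count_lines                    # prints nothing at all
```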


File: sed.info,  Node: head,  Next: tail,  Prev: wc -l,  Up: Examples

4.12 Printing the First Lines
=============================

This script is probably the simplest useful `sed' script.  It displays
the first 10 lines of input; the number of displayed lines is right
before the `q' command.

     #!/usr/bin/sed -f
     10q
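
   Since the line count is simply the address in front of `q', it can
be supplied from the shell.  In this sketch `first_lines' is a
hypothetical function name and its argument is assumed to be a
positive integer.

```shell
# Hypothetical helper: print the first N lines of standard input.
first_lines() {
  sed "${1}q"
}

printf '1\n2\n3\n4\n5\n' | first_lines 3   # prints 1, 2 and 3
printf '1\n2\n' | first_lines 3            # shorter input is printed whole
```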


File: sed.info,  Node: tail,  Next: uniq,  Prev: head,  Up: Examples

4.13 Printing the Last Lines
============================

Printing the last N lines rather than the first is more complex but
indeed possible.  N is encoded in the second line of the script, before
the bang character.

   This script is similar to the `tac' script in that it keeps the
final output in the hold space and prints it at the end:

     #!/usr/bin/sed -nf

     1! {; H; g; }
     1,10 !s/[^\n]*\n//
     $p
     h

   Mainly, the script keeps a window of 10 lines and slides it by
adding a line and deleting the oldest (the substitution command on the
second line works like a `D' command but does not restart the loop).
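
   The window is easier to follow with a smaller size.  The sketch
below is the same script with the window reduced from 10 lines to 3;
`last_three' is a hypothetical function name, and GNU `sed' is assumed
(it is what makes `\n' usable inside the bracket expression).

```shell
# Hypothetical helper: print the last 3 lines, keeping the window
# in hold space as the script above does.
last_three() {
  sed -n '
    # From line 2 on, append the line to the window and load the window
    1!{H;g;}
    # Once the window is full, drop its oldest line
    1,3!s/[^\n]*\n//
    # On the last line, print whatever the window holds
    $p
    # Save the window for the next cycle
    h
  '
}

printf '1\n2\n3\n4\n5\n' | last_three   # prints 3, 4 and 5
```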

   The "sliding window" technique is a very powerful way to write
efficient and complex `sed' scripts, because commands like `P' would
require a lot of work if implemented manually.

   To introduce the technique, which is fully demonstrated in the rest
of this chapter and is based on the `N', `P' and `D' commands, here is
an implementation of `tail' using a simple "sliding window."

   This looks complicated, but in fact it works the same way as the
previous script: once the appropriate number of lines has been read,
however, we stop using the hold space to keep inter-line state, and
instead use `N' and `D' to slide pattern space by one line:

     #!/usr/bin/sed -f

     1h
     2,10 {; H; g; }
     $q
     1,9d
     N
     D

   Note how the first, second and fourth lines are inactive after the
first ten lines of input.  After that, all the script does is exit on
the last line of input, append the next input line to pattern space,
and remove the first line.
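
   As a sketch of the same idea on a smaller scale, here is the `N'
and `D' version with a window of 3 lines instead of 10.  The function
name `last_three_nd' is hypothetical and GNU `sed' is assumed.

```shell
# Hypothetical helper: print the last 3 lines by sliding pattern space.
last_three_nd() {
  sed '
    # Lines 1-3: build the initial window through hold space
    1h
    2,3{H;g;}
    # On the last line, print the window and stop
    $q
    # While the window is still filling, print nothing
    1,2d
    # Window full: append the next line, then drop the oldest one
    N
    D
  '
}

printf '1\n2\n3\n4\n5\n' | last_three_nd   # prints 3, 4 and 5
printf '1\n2\n' | last_three_nd            # prints 1 and 2
```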


File: sed.info,  Node: uniq,  Next: uniq -d,  Prev: tail,  Up: Examples

4.14 Make Duplicate Lines Unique
================================

This is an example of the art of using the `N', `P' and `D' commands,
probably the most difficult to master.

     #!/usr/bin/sed -f
     h

     :b
     # On the last line, print and exit
     $b
     N
     /^\(.*\)\n\1$/ {
         # The two lines are identical.  Undo the effect of
         # the `N' command.
         g
         bb
     }

     # If the `N' command had added the last line, print and exit
     $b

     # The lines are different; print the first and go
     # back working on the second.
     P
     D

   As you can see, we maintain a 2-line window using `P' and `D'.  This
technique is often used in advanced `sed' scripts.
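
   A small run makes the window visible.  The sketch below wraps the
script in a hypothetical `dedupe_adjacent' function (GNU `sed'
assumed) and feeds it runs of repeated lines.

```shell
# Hypothetical helper: collapse runs of identical adjacent lines.
dedupe_adjacent() {
  sed '
    h
    :b
    $b
    N
    # Two equal lines in the window: restore the single saved copy
    /^\(.*\)\n\1$/ {
      g
      bb
    }
    $b
    # Different lines: print the first, keep working on the second
    P
    D
  '
}

printf 'a\nb\nb\nc\nc\nc\n' | dedupe_adjacent   # prints a, b and c
```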


File: sed.info,  Node: uniq -d,  Next: uniq -u,  Prev: uniq,  Up: Examples

4.15 Print Duplicated Lines of Input
====================================

This script prints only duplicated lines, like `uniq -d'.

     #!/usr/bin/sed -nf

     $b
     N
     /^\(.*\)\n\1$/ {
         # Print the first of the duplicated lines
         s/.*\n//
         p

         # Loop until we get a different line
         :b
         $b
         N
         /^\(.*\)\n\1$/ {
             s/.*\n//
             bb
         }
     }

     # The last line cannot be followed by duplicates
     $b

     # Found a different one.  Leave it alone in the pattern space
     # and go back to the top, hunting its duplicates
     D
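
   As a usage sketch, the script can be wrapped in a shell function
and tried on a short input.  `only_duplicated' is a hypothetical name
and GNU `sed' is assumed.

```shell
# Hypothetical helper: print one copy of each run of repeated lines.
only_duplicated() {
  sed -n '
    $b
    N
    /^\(.*\)\n\1$/ {
      # Print the first of the duplicated lines
      s/.*\n//
      p
      # Swallow further copies of the same line
      :b
      $b
      N
      /^\(.*\)\n\1$/ {
        s/.*\n//
        bb
      }
    }
    $b
    # Keep the new, different line and start over with it
    D
  '
}

printf 'a\na\nb\nc\nc\nc\n' | only_duplicated   # prints a and c
```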


File: sed.info,  Node: uniq -u,  Next: cat -s,  Prev: uniq -d,  Up: Examples

4.16 Remove All Duplicated Lines
================================

This script prints only unique lines, like `uniq -u'.

     #!/usr/bin/sed -f

     # Search for a duplicate line --- until that, print what you find.
     $b
     N
     /^\(.*\)\n\1$/ ! {
         P
         D
     }

     :c
     # Got two equal lines in pattern space.  At the
     # end of the file we simply exit
     $d

     # Else, we keep reading lines with `N' until we
     # find a different one
     s/.*\n//
     N
     /^\(.*\)\n\1$/ {
         bc
     }

     # Remove the last instance of the duplicate line
     # and go back to the top
     D
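
   The following sketch runs the script on a few short inputs.  The
wrapper name `only_unique' is hypothetical and GNU `sed' is assumed.

```shell
# Hypothetical helper: print only the lines that are not repeated.
only_unique() {
  sed '
    $b
    N
    # Two different lines: print the first, keep the second
    /^\(.*\)\n\1$/!{
      P
      D
    }
    :c
    # Two equal lines at the end of input: print nothing
    $d
    # Otherwise read on until a different line shows up
    s/.*\n//
    N
    /^\(.*\)\n\1$/ {
      bc
    }
    # Drop the last copy of the duplicate and start over
    D
  '
}

printf 'a\nb\nb\nc\n' | only_unique   # prints a and c
printf 'x\ny\ny\n' | only_unique      # prints x
```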


File: sed.info,  Node: cat -s,  Prev: uniq -u,  Up: Examples

4.17 Squeezing Blank Lines
==========================

As a final example, here are three scripts, of increasing complexity
and speed, that implement the same function as `cat -s', that is,
squeezing blank lines.

   The first leaves a blank line at the beginning and end if there are
some already.

     #!/usr/bin/sed -f

     # on empty lines, join with next
     # Note there is a star in the regexp
     :x
     /^\n*$/ {
     N
     bx
     }

     # now, squeeze all '\n', this can be also done by:
     # s/^\(\n\)*/\1/
     # (at least one '\n' is required, otherwise a blank line
     # would be added before every non-empty line)
     s/\n\n*/\
     /
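
   A squeezing script is easy to check on a tiny input.  The sketch
below is a variant of the first script, wrapped in a hypothetical
`squeeze_blank' function.  It assumes GNU `sed', writes the
replacement as `\n' (a GNU extension) instead of a backslash followed
by a line break, and uses `\n\n*' so that the substitution only fires
when the pattern space really contains a newline.

```shell
# Hypothetical helper: squeeze each run of blank lines into one.
squeeze_blank() {
  sed '
    :x
    # Pattern space holds only empty lines: join the next line to it
    /^\n*$/ {
      N
      bx
    }
    # Replace the collected run of newlines with a single one
    s/\n\n*/\n/
  '
}

# Three blank lines between a and b are reduced to one.
printf 'a\n\n\n\nb\nc\n' | squeeze_blank
```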

   This one is a bit more complex and removes all empty lines at the
beginning.  It does leave a single blank line at the end if one was
there.

     #!/usr/bin/sed -f

     # delete all leading empty lines
     # (with the GNU `0,/RE/' form the range can end on line 1; with
     # `1,/^./' blank lines right after a non-empty first line
     # would be deleted as well)
     0,/^./{
     /./!d
     }

     # on an empty line we remove it and all the following
     # empty lines, but one
     :x
     /./!{
     N
     s/^\n$//
     tx
     }

   This removes leading and trailing blank lines.  It is also the
fastest.  Note that loops are completely done with `n' and `b', without
relying on `sed' to restart the script automatically at the end of
a line.

     #!/usr/bin/sed -nf

     # delete all (leading) blanks
     /./!d

     # get here: so there is a non empty
     :x
     # print it
     p
     # get next
     n
     # got chars? print it again, etc...
     /./bx

     # no, don't have chars: got an empty line
     :z
     # get next, if last line we finish here so no trailing
     # empty lines are written
     n
     # also empty? then ignore it, and get next... this will
     # remove ALL empty lines
     /./!bz

     # all empty lines were deleted/ignored, but we have a non empty.  As
     # what we want to do is to squeeze, insert a blank line artificially
     i\

     bx


File: sed.info,  Node: Limitations,  Next: Other Resources,  Prev: Examples,  Up: Top

5 GNU `sed''s Limitations and Non-limitations
*********************************************

For those who want to write portable `sed' scripts, be aware that some
implementations have been known to limit line lengths (for the pattern
and hold spaces) to be no more than 4000 bytes.  The POSIX standard
specifies that conforming `sed' implementations shall support at least
8192 byte line lengths.  GNU `sed' has no built-in limit on line length;
as long as it can `malloc()' more (virtual) memory, you can feed or
construct lines as long as you like.

   However, recursion is used to handle subpatterns and indefinite
repetition.  This means that the available stack space may limit the
size of the buffer that can be processed by certain patterns.


File: sed.info,  Node: Other Resources,  Next: Reporting Bugs,  Prev: Limitations,  Up: Top

6 Other Resources for Learning About `sed'
******************************************

In addition to several books that have been written about `sed' (either
specifically or as chapters in books which discuss shell programming),
one can find out more about `sed' (including suggestions of a few
books) from the FAQ for the `sed-users' mailing list, available from:
     `http://sed.sourceforge.net/sedfaq.html'

   Also of interest are
`http://www.student.northpark.edu/pemente/sed/index.htm' and
`http://sed.sf.net/grabbag', which include `sed' tutorials and other
`sed'-related goodies.

   The `sed-users' mailing list itself is maintained by Sven Guckes.  To
subscribe, visit `http://groups.yahoo.com' and search for the
`sed-users' mailing list.


File: sed.info,  Node: Reporting Bugs,  Next: Extended regexps,  Prev: Other Resources,  Up: Top

7 Reporting Bugs
****************

Email bug reports to <bonzini@gnu.org>.  Be sure to include the word
"sed" somewhere in the `Subject:' field.  Also, please include the
output of `sed --version' in the body of your report if at all possible.

   Please do not send a bug report like this:

     while building frobme-1.3.4
     $ configure
     error--> sed: file sedscr line 1: Unknown option to 's'

   If GNU `sed' doesn't configure your favorite package, take a few
extra minutes to identify the specific problem and make a stand-alone
test case.  Unlike for other programs such as C compilers, making such
test cases for `sed' is quite simple.

   A stand-alone test case includes all the data necessary to perform
the test, and the specific invocation of `sed' that causes the problem.
The smaller a stand-alone test case is, the better.  A test case should
not involve something as far removed from `sed' as "try to configure
frobme-1.3.4".  Yes, that is in principle enough information to look
for the bug, but that is not a very practical prospect.

   Here are a few commonly reported bugs that are not bugs.

`N' command on the last line
     Most versions of `sed' exit without printing anything when the `N'
     command is issued on the last line of a file.  GNU `sed' prints
     pattern space before exiting unless of course the `-n' command
     switch has been specified.  This choice is by design.

     For example, with the traditional behavior the result of
          sed N foo bar
     would depend on whether foo has an even or an odd number of
     lines(1).  Or, when writing a script to read the next few lines
     following a pattern match, traditional implementations of `sed'
     would force you to write something like
          /foo/{ $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N }
     instead of just
          /foo/{ N;N;N;N;N;N;N;N;N; }

     In any case, the simplest workaround is to use `$d;N' in scripts
     that rely on the traditional behavior, or to set the
     `POSIXLY_CORRECT' variable to a non-empty value.

Regex syntax clashes (problems with backslashes)
     `sed' uses the POSIX basic regular expression syntax.  According to
     the standard, the meaning of some escape sequences is undefined in
     this syntax;  notable in the case of `sed' are `\|', `\+', `\?',
     `\`', `\'', `\<', `\>', `\b', `\B', `\w', and `\W'.

     As in all GNU programs that use POSIX basic regular expressions,
     `sed' interprets these escape sequences as special characters.
     So, `x\+' matches one or more occurrences of `x'.  `abc\|def'
     matches either `abc' or `def'.

     This syntax may cause problems when running scripts written for
     other `sed's.  Some `sed' programs have been written with the
     assumption that `\|' and `\+' match the literal characters `|' and
     `+'.  Such scripts must be modified by removing the spurious
     backslashes if they are to be used with modern implementations of
     `sed', like GNU `sed'.

     On the other hand, some scripts use `s|abc\|def||g' to remove
     occurrences of _either_ `abc' or `def'.  While this worked until
     `sed' 4.0.x, newer versions interpret this as removing the string
     `abc|def'.  This is again undefined behavior according to POSIX,
     and this interpretation is arguably more robust: older `sed's, for
     example, required that the regex matcher parsed `\/' as `/' in the
     common case of escaping a slash, which is again undefined
     behavior; the new behavior avoids this, and this is good because
     the regex matcher is only partially under our control.

     In addition, this version of `sed' supports several escape
     characters (some of which are multi-character) to insert
     non-printable characters in scripts (`\a', `\c', `\d', `\o', `\r',
     `\t', `\v', `\x').  These can cause similar problems with scripts
     written for other `sed's.

`-i' clobbers read-only files
     In short, `sed -i' will let you delete the contents of a read-only
     file, and in general the `-i' option (*note Invocation: Invoking
     sed.) lets you clobber protected files.  This is not a bug, but
     rather a consequence of how the Unix filesystem works.

     The permissions on a file say what can happen to the data in that
     file, while the permissions on a directory say what can happen to
     the list of files in that directory.  `sed -i' will not ever open
     for writing a file that is already on disk.  Rather, it will work
     on a temporary file that is finally renamed to the original name:
     if you rename or delete files, you're actually modifying the
     contents of the directory, so the operation depends on the
     permissions of the directory, not of the file.  For this same
     reason, `sed' does not let you use `-i' on a writeable file in a
     read-only directory, and will break hard or symbolic links when
     `-i' is used on such a file.

`0a' does not work (gives an error)
     There is no line 0.  0 is a special address that is only used to
     treat addresses like `0,/RE/' as active when the script starts: if
     you write `1,/abc/d' and the first line includes the word `abc',
     then that match would be ignored because address ranges must span
     at least two lines (barring the end of the file); but what you
     probably wanted is to delete every line up to the first one
     including `abc', and this is obtained with `0,/abc/d'.

`[a-z]' is case insensitive
     You are encountering problems with locales.  POSIX mandates that
     `[a-z]' uses the current locale's collation order - in C parlance,
     that means using `strcoll(3)' instead of `strcmp(3)'.  Some
     locales have a case-insensitive collation order, others don't.

     Another problem is that `[a-z]' tries to use collation symbols.
     This only happens if you are on the GNU system, using GNU libc's
     regular expression matcher instead of compiling the one supplied
     with GNU sed.  In a Danish locale, for example, the regular
     expression `^[a-z]$' matches the string `aa', because this is a
     single collating symbol that comes after `a' and before `b'; `ll'
     behaves similarly in Spanish locales, or `ij' in Dutch locales.

     To work around these problems, which may cause bugs in shell
     scripts, set the `LC_COLLATE' and `LC_CTYPE' environment variables
     to `C'.

`s/.*//' does not clear pattern space
     This happens if your input stream includes invalid multibyte
     sequences.  POSIX mandates that such sequences are _not_ matched
     by `.', so that `s/.*//' will not clear pattern space as you would
     expect.  In fact, there is no way to clear sed's buffers in the
     middle of the script in most multibyte locales (including UTF-8
     locales).  For this reason, GNU `sed' provides a `z' command (for
     `zap') as an extension.

     To work around these problems, which may cause bugs in shell
     scripts, set the `LC_COLLATE' and `LC_CTYPE' environment variables
     to `C'.
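
   Three of the non-bugs above can be reproduced with one-line test
cases, in the stand-alone style this chapter asks for.  The sketch
below assumes GNU `sed'; the function names are hypothetical and exist
only to label each case.

```shell
# N guarded with $!: lines are joined in pairs, and an odd last line
# is printed on its own whether or not N prints at end of input.
join_pairs() { sed '$!N;s/\n/ /'; }

# 1,/RE/ cannot end on line 1, so the range runs to the second match.
# 0,/RE/ (a GNU extension) can end on line 1.
delete_from_1() { sed '1,/abc/d'; }
delete_from_0() { sed '0,/abc/d'; }

# In GNU basic regular expressions \+ and \| are operators.
squash_x()   { sed 's/x\+/Y/'; }
drop_words() { sed 's/abc\|def//g'; }

printf '1\n2\n3\n' | join_pairs               # prints "1 2" then "3"
printf 'abc\nx\nabc\ny\n' | delete_from_1     # prints y
printf 'abc\nx\nabc\ny\n' | delete_from_0     # prints x, abc and y
printf 'xxx\n' | squash_x                     # prints Y
printf 'abc-def\n' | drop_words               # prints -
```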

   ---------- Footnotes ----------

   (1) which is the actual "bug" that prompted the change in behavior


File: sed.info,  Node: Extended regexps,  Next: Concept Index,  Prev: Reporting Bugs,  Up: Top

Appendix A Extended regular expressions
***************************************

The only difference between basic and extended regular expressions is in
the behavior of a few characters: `?', `+', parentheses, braces
(`{}'), and `|'.  While basic regular expressions require these to be
escaped if you want them to behave as special characters, when using
extended regular expressions you must escape them if you want them _to
match a literal character_.

Examples:
`abc?'
     becomes `abc\?' when using extended regular expressions.  It
     matches the literal string `abc?'.

`c\+'
     becomes `c+' when using extended regular expressions.  It matches
     one or more `c's.

`a\{3,\}'
     becomes `a{3,}' when using extended regular expressions.  It
     matches three or more `a's.

`\(abc\)\{2,3\}'
     becomes `(abc){2,3}' when using extended regular expressions.  It
     matches either `abcabc' or `abcabcabc'.

`\(abc*\)\1'
     becomes `(abc*)\1' when using extended regular expressions.
     Backreferences must still be escaped when using extended regular
     expressions.
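
   The pairs above can be tried side by side.  This sketch assumes GNU
`sed' and selects extended regular expressions with the `-r' option;
each pair of commands is expected to print the same result.

```shell
# One or more c's: \+ in basic syntax, + in extended syntax.
printf 'ccc\n' | sed 's/c\+/X/'                        # prints X
printf 'ccc\n' | sed -r 's/c+/X/'                      # prints X

# Two or three repetitions of abc; the match is greedy and takes three.
printf 'abcabcabcabc\n' | sed 's/\(abc\)\{2,3\}/X/'    # prints Xabc
printf 'abcabcabcabc\n' | sed -r 's/(abc){2,3}/X/'     # prints Xabc

# A literal question mark: plain in basic syntax, escaped in extended.
printf 'abc?\n' | sed 's/abc?/X/'                      # prints X
printf 'abc?\n' | sed -r 's/abc\?/X/'                  # prints X
```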


File: sed.info,  Node: Concept Index,  Next: Command and Option Index,  Prev: Extended regexps,  Up: Top

Concept Index
*************

This is a general index of all issues discussed in this manual, with the
exception of the `sed' commands and command-line options.

[index]
* Menu:

* 0 address:                             Reporting Bugs.      (line 103)
* Additional reading about sed:          Other Resources.     (line   6)
* ADDR1,+N:                              Addresses.           (line  78)
* ADDR1,~N:                              Addresses.           (line  78)
* Address, as a regular expression:      Addresses.           (line  27)
* Address, last line:                    Addresses.           (line  22)
* Address, numeric:                      Addresses.           (line   8)
* Addresses, in sed scripts:             Addresses.           (line   6)
* Append hold space to pattern space:    Other Commands.      (line 125)
* Append next input line to pattern space: Other Commands.    (line 105)
* Append pattern space to hold space:    Other Commands.      (line 117)
* Appending text after a line:           Other Commands.      (line  27)
* Backreferences, in regular expressions: The "s" Command.    (line  19)
* Branch to a label, if s/// failed:     Extended Commands.   (line  63)
* Branch to a label, if s/// succeeded:  Programming Commands.
                                                              (line  22)
* Branch to a label, unconditionally:    Programming Commands.
                                                              (line  18)
* Buffer spaces, pattern and hold:       Execution Cycle.     (line   6)
* Bugs, reporting:                       Reporting Bugs.      (line   6)
* Case-insensitive matching:             The "s" Command.     (line  94)
* Caveat -- #n on first line:            Common Commands.     (line  20)
* Command groups:                        Common Commands.     (line  50)
* Comments, in scripts:                  Common Commands.     (line  12)
* Conditional branch <1>:                Extended Commands.   (line  63)
* Conditional branch:                    Programming Commands.
                                                              (line  22)
* Copy hold space into pattern space:    Other Commands.      (line 121)
* Copy pattern space into hold space:    Other Commands.      (line 113)
* Delete first line from pattern space:  Other Commands.      (line  99)
* Disabling autoprint, from command line: Invoking sed.       (line  34)
* empty regular expression:              Addresses.           (line  31)
* Emptying pattern space <1>:            Reporting Bugs.      (line 130)
* Emptying pattern space:                Extended Commands.   (line  85)
* Evaluate Bourne-shell commands:        Extended Commands.   (line  12)
* Evaluate Bourne-shell commands, after substitution: The "s" Command.
                                                              (line  85)
* Exchange hold space with pattern space: Other Commands.     (line 129)
* Excluding lines:                       Addresses.           (line 101)
* Extended regular expressions, choosing: Invoking sed.       (line 113)
* Extended regular expressions, syntax:  Extended regexps.    (line   6)
* Files to be processed as input:        Invoking sed.        (line 141)
* Flow of control in scripts:            Programming Commands.
                                                              (line  11)
* Global substitution:                   The "s" Command.     (line  51)
* GNU extensions, /dev/stderr file <1>:  Other Commands.      (line  88)
* GNU extensions, /dev/stderr file:      The "s" Command.     (line  78)
* GNU extensions, /dev/stdin file <1>:   Extended Commands.   (line  53)
* GNU extensions, /dev/stdin file:       Other Commands.      (line  78)
* GNU extensions, /dev/stdout file <1>:  Other Commands.      (line  88)
* GNU extensions, /dev/stdout file <2>:  The "s" Command.     (line  78)
* GNU extensions, /dev/stdout file:      Invoking sed.        (line 149)
* GNU extensions, 0 address <1>:         Reporting Bugs.      (line 103)
* GNU extensions, 0 address:             Addresses.           (line  78)
* GNU extensions, 0,ADDR2 addressing:    Addresses.           (line  78)
* GNU extensions, ADDR1,+N addressing:   Addresses.           (line  78)
* GNU extensions, ADDR1,~N addressing:   Addresses.           (line  78)
* GNU extensions, branch if s/// failed: Extended Commands.   (line  63)
* GNU extensions, case modifiers in s commands: The "s" Command.
                                                              (line  23)
* GNU extensions, checking for their presence: Extended Commands.
                                                              (line  69)
* GNU extensions, disabling:             Invoking sed.        (line  81)
* GNU extensions, emptying pattern space <1>: Reporting Bugs. (line 130)
* GNU extensions, emptying pattern space: Extended Commands.  (line  85)
* GNU extensions, evaluating Bourne-shell commands <1>: Extended Commands.
                                                              (line  12)
* GNU extensions, evaluating Bourne-shell commands: The "s" Command.
                                                              (line  85)
* GNU extensions, extended regular expressions: Invoking sed. (line 113)
* GNU extensions, g and NUMBER modifier interaction in s command: The "s" Command.
                                                              (line  57)
* GNU extensions, I modifier <1>:        The "s" Command.     (line  94)
* GNU extensions, I modifier:            Addresses.           (line  49)
* GNU extensions, in-place editing <1>:  Reporting Bugs.      (line  85)
* GNU extensions, in-place editing:      Invoking sed.        (line  51)
* GNU extensions, L command:             Extended Commands.   (line  26)
* GNU extensions, M modifier:            The "s" Command.     (line  99)
* GNU extensions, modifiers and the empty regular expression: Addresses.
                                                              (line  31)
* GNU extensions, N~M addresses:         Addresses.           (line  13)
* GNU extensions, quitting silently:     Extended Commands.   (line  36)
* GNU extensions, R command:             Extended Commands.   (line  53)
* GNU extensions, reading a file a line at a time: Extended Commands.
                                                              (line  53)
* GNU extensions, reformatting paragraphs: Extended Commands. (line  26)
* GNU extensions, returning an exit code <1>: Extended Commands.
                                                              (line  36)
* GNU extensions, returning an exit code: Common Commands.    (line  30)
* GNU extensions, setting line length:   Other Commands.      (line  65)
* GNU extensions, special escapes <1>:   Reporting Bugs.      (line  78)
* GNU extensions, special escapes:       Escapes.             (line   6)
* GNU extensions, special two-address forms: Addresses.       (line  78)
* GNU extensions, subprocesses <1>:      Extended Commands.   (line  12)
* GNU extensions, subprocesses:          The "s" Command.     (line  85)
* GNU extensions, to basic regular expressions <1>: Reporting Bugs.
                                                              (line  51)
* GNU extensions, to basic regular expressions: Regular Expressions.
                                                              (line  26)
* GNU extensions, two addresses supported by most commands: Other Commands.
                                                              (line  25)
* GNU extensions, unlimited line length: Limitations.         (line   6)
* GNU extensions, writing first line to a file: Extended Commands.
                                                              (line  80)
* Goto, in scripts:                      Programming Commands.
                                                              (line  18)
* Greedy regular expression matching:    Regular Expressions. (line 143)
* Grouping commands:                     Common Commands.     (line  50)
* Hold space, appending from pattern space: Other Commands.   (line 117)
* Hold space, appending to pattern space: Other Commands.     (line 125)
* Hold space, copy into pattern space:   Other Commands.      (line 121)
* Hold space, copying pattern space into: Other Commands.     (line 113)
* Hold space, definition:                Execution Cycle.     (line   6)
* Hold space, exchange with pattern space: Other Commands.    (line 129)
* In-place editing:                      Reporting Bugs.      (line  85)
* In-place editing, activating:          Invoking sed.        (line  51)
* In-place editing, Perl-style backup file names: Invoking sed.
                                                              (line  62)
* Inserting text before a line:          Other Commands.      (line  46)
* Labels, in scripts:                    Programming Commands.
                                                              (line  14)
* Last line, selecting:                  Addresses.           (line  22)
* Line length, setting <1>:              Other Commands.      (line  65)
* Line length, setting:                  Invoking sed.        (line  76)
* Line number, printing:                 Other Commands.      (line  62)
* Line selection:                        Addresses.           (line   6)
* Line, selecting by number:             Addresses.           (line   8)
* Line, selecting by regular expression match: Addresses.     (line  27)
* Line, selecting last:                  Addresses.           (line  22)
* List pattern space:                    Other Commands.      (line  65)
* Mixing g and NUMBER modifiers in the s command: The "s" Command.
                                                              (line  57)
* Next input line, append to pattern space: Other Commands.   (line 105)
* Next input line, replace pattern space with: Common Commands.
                                                              (line  44)
* Non-bugs, 0 address:                   Reporting Bugs.      (line 103)
* Non-bugs, in-place editing:            Reporting Bugs.      (line  85)
* Non-bugs, localization-related:        Reporting Bugs.      (line 112)
* Non-bugs, N command on the last line:  Reporting Bugs.      (line  31)
* Non-bugs, regex syntax clashes:        Reporting Bugs.      (line  51)
* Parenthesized substrings:              The "s" Command.     (line  19)
* Pattern space, definition:             Execution Cycle.     (line   6)
* Perl-style regular expressions, multiline: Addresses.       (line  54)
* Portability, comments:                 Common Commands.     (line  15)
* Portability, line length limitations:  Limitations.         (line   6)
* Portability, N command on the last line: Reporting Bugs.    (line  31)
* POSIXLY_CORRECT behavior, bracket expressions: Regular Expressions.
                                                              (line 105)
* POSIXLY_CORRECT behavior, enabling:    Invoking sed.        (line  84)
* POSIXLY_CORRECT behavior, escapes:     Escapes.             (line  11)
* POSIXLY_CORRECT behavior, N command:   Reporting Bugs.      (line  46)
* Print first line from pattern space:   Other Commands.      (line 110)
* Printing line number:                  Other Commands.      (line  62)
* Printing text unambiguously:           Other Commands.      (line  65)
* Quitting <1>:                          Extended Commands.   (line  36)
* Quitting:                              Common Commands.     (line  30)
* Range of lines:                        Addresses.           (line  65)
* Range with start address of zero:      Addresses.           (line  78)
* Read next input line:                  Common Commands.     (line  44)
* Read text from a file <1>:             Extended Commands.   (line  53)
* Read text from a file:                 Other Commands.      (line  78)
* Reformat pattern space:                Extended Commands.   (line  26)
* Reformatting paragraphs:               Extended Commands.   (line  26)
* Replace hold space with copy of pattern space: Other Commands.
                                                              (line 113)
* Replace pattern space with copy of hold space: Other Commands.
2398                                                              (line 121)
2399* Replacing all text matching regexp in a line: The "s" Command.
2400                                                              (line  51)
2401* Replacing only Nth match of regexp in a line: The "s" Command.
2402                                                              (line  55)
2403* Replacing selected lines with other text: Other Commands.   (line  52)
2404* Requiring GNU sed:                     Extended Commands.   (line  69)
2405* Script structure:                      sed Programs.        (line   6)
2406* Script, from a file:                   Invoking sed.        (line  46)
2407* Script, from command line:             Invoking sed.        (line  41)
2408* sed program structure:                 sed Programs.        (line   6)
2409* Selecting lines to process:            Addresses.           (line   6)
2410* Selecting non-matching lines:          Addresses.           (line 101)
2411* Several lines, selecting:              Addresses.           (line  65)
2412* Slash character, in regular expressions: Addresses.         (line  41)
2413* Spaces, pattern and hold:              Execution Cycle.     (line   6)
2414* Special addressing forms:              Addresses.           (line  78)
2415* Standard input, processing as input:   Invoking sed.        (line 143)
2416* Stream editor:                         Introduction.        (line   6)
2417* Subprocesses <1>:                      Extended Commands.   (line  12)
2418* Subprocesses:                          The "s" Command.     (line  85)
2419* Substitution of text, options:         The "s" Command.     (line  47)
2420* Text, appending:                       Other Commands.      (line  27)
2421* Text, deleting:                        Common Commands.     (line  36)
2422* Text, insertion:                       Other Commands.      (line  46)
2423* Text, printing:                        Common Commands.     (line  39)
2424* Text, printing after substitution:     The "s" Command.     (line  65)
2425* Text, writing to a file after substitution: The "s" Command.
2426                                                              (line  78)
2427* Transliteration:                       Other Commands.      (line  14)
2428* Unbuffered I/O, choosing:              Invoking sed.        (line 131)
2429* Usage summary, printing:               Invoking sed.        (line  28)
2430* Version, printing:                     Invoking sed.        (line  24)
2431* Working on separate files:             Invoking sed.        (line 121)
2432* Write first line to a file:            Extended Commands.   (line  80)
2433* Write to a file:                       Other Commands.      (line  88)
2434* Zero, as range start address:          Addresses.           (line  78)
2435
2436
2437File: sed.info,  Node: Command and Option Index,  Prev: Concept Index,  Up: Top
2438
2439Command and Option Index
************************

This is an alphabetical list of all `sed' commands and command-line
options.

[index]
* Menu:

* # (comments):                          Common Commands.     (line  12)
* --binary:                              Invoking sed.        (line  93)
* --expression:                          Invoking sed.        (line  41)
* --file:                                Invoking sed.        (line  46)
* --follow-symlinks:                     Invoking sed.        (line 104)
* --help:                                Invoking sed.        (line  28)
* --in-place:                            Invoking sed.        (line  51)
* --line-length:                         Invoking sed.        (line  76)
* --quiet:                               Invoking sed.        (line  34)
* --regexp-extended:                     Invoking sed.        (line 113)
* --silent:                              Invoking sed.        (line  34)
* --unbuffered:                          Invoking sed.        (line 131)
* --version:                             Invoking sed.        (line  24)
* -b:                                    Invoking sed.        (line  93)
* -e:                                    Invoking sed.        (line  41)
* -f:                                    Invoking sed.        (line  46)
* -i:                                    Invoking sed.        (line  51)
* -l:                                    Invoking sed.        (line  76)
* -n:                                    Invoking sed.        (line  34)
* -n, forcing from within a script:      Common Commands.     (line  20)
* -r:                                    Invoking sed.        (line 113)
* -u:                                    Invoking sed.        (line 131)
* : (label) command:                     Programming Commands.
                                                              (line  14)
* = (print line number) command:         Other Commands.      (line  62)
* a (append text lines) command:         Other Commands.      (line  27)
* b (branch) command:                    Programming Commands.
                                                              (line  18)
* c (change to text lines) command:      Other Commands.      (line  52)
* D (delete first line) command:         Other Commands.      (line  99)
* d (delete) command:                    Common Commands.     (line  36)
* e (evaluate) command:                  Extended Commands.   (line  12)
* G (appending Get) command:             Other Commands.      (line 125)
* g (get) command:                       Other Commands.      (line 121)
* H (append Hold) command:               Other Commands.      (line 117)
* h (hold) command:                      Other Commands.      (line 113)
* i (insert text lines) command:         Other Commands.      (line  46)
* L (fLow paragraphs) command:           Extended Commands.   (line  26)
* l (list unambiguously) command:        Other Commands.      (line  65)
* N (append Next line) command:          Other Commands.      (line 105)
* n (next-line) command:                 Common Commands.     (line  44)
* P (print first line) command:          Other Commands.      (line 110)
* p (print) command:                     Common Commands.     (line  39)
* q (quit) command:                      Common Commands.     (line  30)
* Q (silent Quit) command:               Extended Commands.   (line  36)
* r (read file) command:                 Other Commands.      (line  78)
* R (read line) command:                 Extended Commands.   (line  53)
* s command, option flags:               The "s" Command.     (line  47)
* T (test and branch if failed) command: Extended Commands.   (line  63)
* t (test and branch if successful) command: Programming Commands.
                                                              (line  22)
* v (version) command:                   Extended Commands.   (line  69)
* w (write file) command:                Other Commands.      (line  88)
* W (write first line) command:          Extended Commands.   (line  80)
* x (eXchange) command:                  Other Commands.      (line 129)
* y (transliterate) command:             Other Commands.      (line  14)
* z (Zap) command:                       Extended Commands.   (line  85)
* {} command grouping:                   Common Commands.     (line  50)



Tag Table:
Node: Top944
Node: Introduction3867
Node: Invoking sed4421
Ref: Invoking sed-Footnote-110512
Ref: Invoking sed-Footnote-210704
Node: sed Programs10803
Node: Execution Cycle11951
Ref: Execution Cycle-Footnote-113129
Node: Addresses13430
Node: Regular Expressions18174
Node: Common Commands26082
Node: The "s" Command28085
Ref: The "s" Command-Footnote-132422
Node: Other Commands32494
Ref: Other Commands-Footnote-137636
Node: Programming Commands37708
Node: Extended Commands38622
Node: Escapes42630
Ref: Escapes-Footnote-145641
Node: Examples45832
Node: Centering lines46928
Node: Increment a number47820
Ref: Increment a number-Footnote-149380
Node: Rename files to lower case49500
Node: Print bash environment52203
Node: Reverse chars of lines52958
Ref: Reverse chars of lines-Footnote-153959
Node: tac54176
Node: cat -n54943
Node: cat -b56765
Node: wc -c57512
Ref: wc -c-Footnote-159420
Node: wc -w59489
Node: wc -l60953
Node: head61197
Node: tail61528
Node: uniq63209
Node: uniq -d63997
Node: uniq -u64708
Node: cat -s65419
Node: Limitations67270
Node: Other Resources68111
Node: Reporting Bugs68956
Ref: Reporting Bugs-Footnote-176092
Node: Extended regexps76163
Node: Concept Index77349
Node: Command and Option Index92298

End Tag Table