
Archive-Name: editor-faq/sed
Posting-Frequency: irregular
Last-modified: 10 March 2003
Version: 015
URL: http://sed.sourceforge.net/sedfaq.html
Maintainer: Eric Pement (pemente@northpark.edu)

                            THE SED FAQ

                  Frequently Asked Questions about
                       sed, the stream editor

CONTENTS

1. GENERAL INFORMATION
1.1. Introduction - How this FAQ is organized
1.2. Latest version of the sed FAQ
1.3. FAQ revision information
1.4. How do I add a question/answer to the sed FAQ?
1.5. FAQ abbreviations
1.6. Credits and acknowledgements
1.7. Standard disclaimers

2. BASIC SED
2.1. What is sed?
2.2. What versions of sed are there, and where can I get them?

2.2.1. Free versions

2.2.1.1. Unix platforms
2.2.1.2. OS/2
2.2.1.3. Microsoft Windows (Win3x, Win9x, WinNT, Win2K)
2.2.1.4. MS-DOS
2.2.1.5. CP/M
2.2.1.6. Macintosh v8 or v9

2.2.2. Shareware and Commercial versions

2.2.2.1. Unix platforms
2.2.2.2. OS/2
2.2.2.3. Windows 95/98, Windows NT, Windows 2000
2.2.2.4. MS-DOS

2.3. Where can I learn to use sed?

2.3.1. Books
2.3.2. Mailing list
2.3.3. Tutorials, electronic text
2.3.4. General web and ftp sites

3. TECHNICAL
3.1. More detailed explanation of basic sed
3.1.1.  Regular expressions on the left side of "s///"
3.1.2.  Escape characters on the right side of "s///"
3.1.3.  Substitution switches
3.2. Common one-line sed scripts. How do I . . . ?

      - double/triple-space a file?
      - convert DOS/Unix newlines?
      - delete leading/trailing spaces?
      - do substitutions on all/certain lines?
      - delete consecutive blank lines?
      - delete blank lines at the top/end of the file?

3.3. Addressing and address ranges
3.4. Address ranges in GNU sed and HHsed
3.5. Debugging sed scripts
3.6. Notes about s2p, the sed-to-perl translator
3.7. GNU/POSIX extensions to regular expressions

4. EXAMPLES
   ONE-CHARACTER QUESTIONS
4.1.  How do I insert a newline into the RHS of a substitution?
4.2.  How do I represent control-codes or non-printable characters?
4.3.  How do I convert files with toggle characters, like +this+,
      to look like [i]this[/i]?

   CHANGING STRINGS
4.10. How do I perform a case-insensitive search?
4.11. How do I match only the first occurrence of a pattern?
4.12. How do I parse a comma-delimited (CSV) data file?
4.13. How do I handle fixed-length, columnar data?
4.14. How do I commify a string of numbers?
4.15. How do I prevent regex expansion on substitutions?
4.16. How do I convert a string to all lowercase or capital letters?

   CHANGING BLOCKS (consecutive lines)
4.20. How do I change only one section of a file?
4.21. How do I delete or change a block of text if the block contains
      a certain regular expression?
4.22. How do I locate a paragraph of text if the paragraph contains a
      certain regular expression?
4.23. How do I match a block of specific consecutive lines?
4.23.1.  Try to use a "/range/, /expression/"
4.23.2.  Try to use a "multi-line\nexpression"
4.23.3.  Try to use a block of "literal strings"
4.24. How do I address all the lines between RE1 and RE2, excluding the lines themselves?
4.25. How do I join two lines if line #1 ends in a [certain string]?
4.26. How do I join two lines if line #2 begins in a [certain string]?
4.27. How do I change all paragraphs to long lines?

   SHELL AND ENVIRONMENT
4.30.   How do I read environment variables with sed ...
4.31.1.   ... on Unix platforms?
4.31.2.   ... on MS-DOS or 4DOS platforms?
4.32.   How do I export or pass variables back into the environment ...
4.32.1.   ... on Unix platforms?
4.32.2.   ... on MS-DOS or 4DOS platforms?
4.33.   How do I handle shell quoting in sed?

   FILES, DIRECTORIES, AND PATHS
4.40.  How do I read (insert/add) a file at the top of a textfile?
4.41.  How do I make substitutions in every file in a directory, or
        in a complete directory tree?
4.41.1.   ... ssed solution
4.41.2.   ... Unix solution
4.41.3.   ... DOS solution
4.42.  How do I replace "/some/UNIX/path" in a substitution?
4.43.  How do I replace "C:\SOME\DOS\PATH" in a substitution?
4.44.  How do I emulate file-includes, using sed?

5. WHY ISN'T THIS WORKING?
5.1.  Why don't my variables like $var get expanded in my sed script?
5.2.  I'm using 'p' to print, but I have duplicate lines sometimes.
5.3.  Why does my DOS version of sed process a file part-way through
      and then quit?
5.4.  My RE isn't matching/deleting what I want it to. (Or, "Greedy vs.
      stingy pattern matching")
5.5.  What is CSDPMI*B.ZIP and why do I need it?
5.6.  Where are the man pages for GNU sed?
5.7.  How do I tell what version of sed I am using?
5.8.  Does sed issue an exit code?
5.9.  The 'r' command isn't inserting the file into the text.
5.10. Why can't I match or delete a newline using the \n escape
      sequence? Why can't I match 2 or more lines using \n?
5.11. My script aborts with an error message, "event not found".

6. OTHER ISSUES
6.1.  I have a problem that stumps me. Where can I get help?
6.2.  How does sed compare with awk, perl, and other utilities?
6.3.  When should I use sed?
6.4.  When should I NOT use sed?
6.5.  When should I ignore sed and use Awk or Perl instead?
6.6.  Known limitations among sed versions
6.7.  Known incompatibilities between sed versions

6.7.1. Issuing commands from the command line
6.7.2. Using comments (prefixed by the '#' sign)
6.7.3. Special syntax in REs
6.7.4. Word boundaries
6.7.5. Commands which operate differently

7. KNOWN BUGS AMONG SED VERSIONS
7.1. ssed v3.59
7.2. GNU sed v4.0 - v4.0.5
7.3. GNU sed v3.02.80
7.4. GNU sed v3.02
7.5. GNU sed v2.05
7.6. GNU sed v1.18
7.7. GNU sed v1.03
7.8. sed v1.6 (Briscoe)
7.9. sed v1.5 (Helman)
7.10. sedmod v1.0 (Chen)
7.11. HP-UX sed
7.12. SunOS sed v4.1
7.13. SunOS sed v5.6
7.14. Ultrix sed v4.3
7.15. Digital Unix sed


------------------------------

1. GENERAL INFORMATION

1.1. Introduction - How this FAQ is organized

   This FAQ is organized to answer common (and some uncommon)
   questions about sed, quickly. If you see a term or abbreviation in
   the examples that seems unclear, see if the term is defined in
   section 1.5. If not, send your comment to pemente[at]northpark.edu.

1.2. Latest version of the sed FAQ

   The newest version of the sed FAQ is usually here:

       http://sed.sourceforge.net/sedfaq.html (HTML version)
       http://sed.sourceforge.net/sedfaq.txt  (plain text)
       http://www.student.northpark.edu/pemente/sed/sedfaq.html
       http://www.student.northpark.edu/pemente/sed/sedfaq.txt
       http://www.faqs.org/faqs/editor-faq/sed
       ftp://rtfm.mit.edu/pub/faqs/editor-faq/sed

   Another FAQ file on sed by a different author can be found here:

       http://www.dreamwvr.com/sed-info/sed-faq.html

1.3. FAQ revision information

   In the plaintext version, changes are shown by a vertical bar (|)
   placed in column 78 of the affected lines. To remove the vertical
   bars (use double quotes for MS-DOS):

     sed 's/  *|$//' sedfaq.txt > sedfaq2.txt
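
   As a small illustration of what that command does, the sketch
   below runs it on two made-up lines (the sample text is an
   assumption, not taken from the FAQ). The expression matches one or
   more spaces followed by a bar at the end of a line, so lines
   without a trailing bar pass through unchanged.

```shell
# Hypothetical two-line sample; only the second line carries a change bar.
sample=$(printf 'unchanged line\nchanged line          |\n')

# Same expression as above, fed from a pipe instead of sedfaq.txt.
stripped=$(printf '%s\n' "$sample" | sed 's/  *|$//')

printf '%s\n' "$stripped"
```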

   In the HTML version, vertical bars do not appear. New or altered
   portions of the FAQ are indicated by printing in dark blue type.

   In the text version, words needing emphasis may be surrounded by
   the underscore '_' or the asterisk '*'. In the HTML version, these
   are changed to italics and boldface, respectively.

1.4. How do I add a question/answer to the sed FAQ?

   Word your question briefly and send it to pemente[at]northpark.edu,
   indicating your proposed change. We'll post it on the sed-users
   mailing list (see section 2.3.2) and discuss it there. If it's
   good, your contribution will be added to the next edition.

1.5. FAQ abbreviations

       files = one or more filenames, separated by whitespace
       gsed  = GNU sed
       ssed  = super-sed
       RE    = Regular Expressions supported by sed
       LHS   = the left-hand side ("find" part) of "s/find/repl/" command
       RHS   = the right-hand side ("replace" part) of "s/find/repl/" cmd
       nn+   = version _nn_ or higher (e.g., "15+" = version 1.5 and above)

   files: "files" stands for one or more filenames entered on the
   command line. The names may include any wildcards your shell
   understands (such as ``zork*'' or ``Aug[4-9].let''). Sed will
   process each filename passed to it by the shell.

   RE: For details on regular expressions, see section 3.1.1., below.

1.6. Credits and acknowledgements

   Many of the ideas for this FAQ were taken from the Awk FAQ:
       http://www.faqs.org/faqs/computer-lang/awk/faq/
       ftp://rtfm.mit.edu/pub/usenet/comp.lang.awk/faq

   and from the old Perl FAQ:
       http://www.perl.com/doc/FAQs/FAQ/oldfaq-html/index.html

   The following individuals have contributed significantly to this
   document, and have provided input and wording suggestions for
   questions, answers, and script examples. Credit goes to these
   contributors (in alphabetical order by last name):

   Al Aab, Yiorgos Adamopoulos, Paolo Bonzini, Walter Briscoe, Jim
   Dennis, Carlos Duarte, Otavio Exel, Sven Guckes, Aurelio Jargas,
   Mark Katz, Toby Kelsey, Eric Pement, Greg Pfeiffer, Ken Pizzini,
   Niall Smart, Simon Taylor, Peter Tillier, Greg Ubben, Laurent
   Vogel.

1.7. Standard disclaimers

   While a serious attempt has been made to ensure the accuracy of the
   information presented herein, the contributors and maintainers of
   this document do not claim the absence of errors and make no
   warranties on the information provided. If you notice any mistakes,
   please let us know so we can fix them.

------------------------------

2. BASIC SED

2.1. What is sed?

   "sed" stands for Stream EDitor. Sed is a non-interactive editor,
   written by the late Lee E. McMahon in 1973 or 1974. A brief history
   of sed's origins may be found in an early history of the Unix
   tools, at <http://www.columbia.edu/~rh120/ch106.x09>.

   Instead of altering a file interactively by moving the cursor on
   the screen (as with a word processor), the user sends a script of
   editing instructions to sed, plus the name of the file to edit (or
   the text to be edited may come as output from a pipe). In this
   sense, sed works like a filter -- deleting, inserting and changing
   characters, words, and lines of text. Its range of activity goes
   from small, simple changes to very complex ones.

   Sed reads its input from stdin (Unix shorthand for "standard
   input," normally the keyboard) or from files (or both), and sends
   the results to stdout ("standard output," normally the console or
   screen). Most people use sed first for its substitution features.
   Sed is often used as a find-and-replace tool. The command

     sed 's/Glenn/Harold/g' oldfile >newfile

   will replace every occurrence of "Glenn" with the word "Harold",
   wherever it occurs in the file. The "find" portion is a regular
   expression ("RE"), which can be a simple word or may contain
   special characters to allow greater flexibility (for example, to
   prevent "Glenn" from also matching "Glennon").
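
   One way to keep "Glennon" untouched, sketched here with plain
   basic regular expressions only: replace "Glenn" when it is
   followed by a non-letter, and again when it ends the line. The
   sample sentence is made up, and this sketch guards only the right
   edge of the name, so a word such as "McGlenn" would still be
   changed.

```shell
# Hypothetical sample text; "Glennon" must survive untouched.
input='Glenn met Glennon. Call Glenn'

# 1st expression: "Glenn" followed by a non-letter, which \1 puts back.
# 2nd expression: "Glenn" at the very end of the line.
result=$(printf '%s\n' "$input" |
  sed -e 's/Glenn\([^A-Za-z]\)/Harold\1/g' -e 's/Glenn$/Harold/')

printf '%s\n' "$result"
```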

   My very first use of sed was to add 8 spaces to the left side of a
   file, so when I printed it, the printing wouldn't begin at the
   absolute left edge of a piece of paper.

     sed 's/^/        /' myfile >newfile   # my first sed script
     sed 's/^/        /' myfile | lp       # my next sed script

   Then I learned that sed could display only one paragraph of a file,
   beginning at the phrase "and where it came" and ending at the
   phrase "for all people". My script looked like this:

     sed -n '/and where it came/,/for all people/p' myfile

   Sed's normal behavior is to print (i.e., display or show on screen)
   the entire file, including the parts that haven't been altered,
   unless you use the -n switch. The "-n" stands for "no output". This
   switch is almost always used in conjunction with a 'p' command
   somewhere, which says to print only the sections of the file that
   have been specified. The -n switch with the 'p' command allows for
   parts of a file to be printed (i.e., sent to the console).
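
   The difference is easy to see on a three-line sample (the text
   and variable names here are made up for illustration). Without
   -n, every line is printed once automatically and 'p' prints the
   selected line a second time; with -n, only the selected line
   appears.

```shell
# Hypothetical three-line input.
text=$(printf 'one\ntwo\nthree\n')

# No -n: automatic printing stays on, so "two" shows up twice.
with_autoprint=$(printf '%s\n' "$text" | sed '/two/p')

# With -n: automatic printing is off, so only the 'p' output remains.
only_match=$(printf '%s\n' "$text" | sed -n '/two/p')

printf '%s\n' "$with_autoprint"
printf '%s\n' "$only_match"
```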

   Next, I found that sed could show me only (say) lines 12-18 of a
   file and not show me the rest. This was very handy when I needed to
   review only part of a long file and I didn't want to alter it.

     # the 'p' stands for print
     sed -n 12,18p myfile

   Likewise, sed could show me everything else BUT those particular
   lines, without physically changing the file on the disk:

     # the 'd' stands for delete
     sed 12,18d myfile
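
   The two commands are complements of each other. A quick check on
   20 generated lines (the input is produced with awk purely for
   illustration) shows 7 lines printed by the first and the other 13
   by the second:

```shell
# Generate "line 1" through "line 20" as hypothetical input.
numbered=$(awk 'BEGIN { for (i = 1; i <= 20; i++) print "line " i }')

# Lines 12-18 only.
shown=$(printf '%s\n' "$numbered" | sed -n 12,18p)

# Everything except lines 12-18; the input itself is never modified.
rest=$(printf '%s\n' "$numbered" | sed 12,18d)

printf '%s\n' "$shown" | wc -l
printf '%s\n' "$rest" | wc -l
```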

   Sed could also double-space my single-spaced file when it came time
   to print it:

     sed G myfile >newfile
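
   The 'G' command appends a newline plus the contents of the hold
   space to each line; since the hold space starts out empty, the
   effect is a blank line after every input line. A minimal sketch
   on two made-up lines:

```shell
# Two input lines become four output lines: each is followed by a blank.
line_count=$(printf 'alpha\nbeta\n' | sed G | wc -l | tr -d ' ')

# Command substitution drops the final blank line, leaving "alpha", "", "beta".
doubled=$(printf 'alpha\nbeta\n' | sed G)

printf '%s\n' "$line_count"
printf '%s\n' "$doubled"
```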

   If you have many editing commands (for deleting, adding,
   substituting, etc.) which might take up several lines, those
   commands can be put into a separate file and all of the commands in
   the file applied to the file being edited:

     # 'script.sed' is the file of commands
     # 'myfile' is the file being changed
     sed -f script.sed myfile
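
   As a self-contained sketch of the same idea, the lines below
   create a two-command script file and a small input file in a
   temporary directory, then apply one to the other. The file
   contents and the temporary directory are assumptions made for the
   example; 'mktemp -d' is widely available but not universal.

```shell
# Scratch directory so the example leaves nothing behind.
workdir=$(mktemp -d)

# script.sed: delete empty lines, then change every "Glenn" to "Harold".
cat > "$workdir/script.sed" <<'EOF'
/^$/d
s/Glenn/Harold/g
EOF

# myfile: hypothetical input with one empty line in the middle.
printf 'Glenn\n\nGlenn and Glenn\n' > "$workdir/myfile"

# Apply every command in script.sed to myfile.
output=$(sed -f "$workdir/script.sed" "$workdir/myfile")
printf '%s\n' "$output"

rm -r "$workdir"
```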

   It is not our intention to convert this FAQ file into a full-blown
   sed tutorial (for good tutorials, see section 2.3). Rather, we hope
   this gives the complete novice a few ideas of how sed can be used.

2.2. What versions of sed are there, and where can I get them?

2.2.1. Free versions

   Note: "Free" does not mean "public domain" nor does it necessarily
   mean you will never be charged for it. All versions of sed in this
   section except the CP/M versions are distributed under the GNU
   General Public License and are "free software" by that standard (for
   details, see http://www.gnu.org/philosophy/free-sw.html). This
   means you can get the source code and develop it further.

   At the URLs listed in this category, sed binaries or source code
   can be downloaded and used without fees or license payments.

2.2.1.1. Unix platforms

   ssed v3.60
   ssed is the version recommended by the FAQ maintainers, since it
   shares the same codebase with GNU sed, has the most options, and is
   free software (you can get the source). Earlier versions of ssed
   were distributed, but sites for them are not listed here.

       http://sed.sourceforge.net/grabbag/ssed
       http://freshmeat.net/project/sed/

   GNU sed v4.0.5
   This is the latest official version of GNU sed. It offers in-place
   text replacement as an option switch.

       ftp://ftp.gnu.org/pub/gnu/sed/sed-4.0.5.tar.gz
       http://freshmeat.net/project/sed

   BSD multi-byte sed (Japanese)
   Based on the latest version of GNU sed, which supports multi-byte
   characters.

       ftp://ftp1.freebsd.org/pub/FreeBSD/FreeBSD-stable/packages/Latest/ja-sed.tgz

   GNU sed v3.02.80
   An alpha test release which was the base for the development of
   ssed and GNU sed v4.0.

       ftp://alpha.gnu.org/pub/gnu/sed/sed-3.02.80.tar.gz

   GNU sed v3.02a
   Interim version with most features of GNU sed v3.02.80.

   GNU sed v3.02
       ftp://ftp.gnu.org/pub/gnu/sed/sed-3.02.tar.gz

   Precompiled versions:

   GNU sed v3.02-8
   source code and binaries for Debian GNU/Linux

       http://www.debian.org/Packages/stable/base/sed.html

   For some time, the GNU project <http://www.gnu.org> used Eric S.
   Raymond's version of sed (ESR sed v1.1), but eventually dropped it
   because it had too many built-in limits. In 1991 Howard Helman
   modified the GNU/ESR sed and produced a flexible version of sed
   v1.5 available at several sites (Helman's version permitted things
   like \<...\> to delimit word boundaries, \xHH to enter hex code and
   \n to indicate newlines in the replace string). This version did
   not catch on with the GNU project and their version of sed has
   moved in a similar but different direction.

   sed v1.3, by Eric Steven Raymond (released 4 June 1998)
       http://catb.org/~esr/sed-1.3.tar.gz

   Eric Raymond <esr@snark.thyrsus.com> wrote one of the earliest
   versions of sed. On his website <http://www.catb.org/~esr/> which
   also distributes many freeware utilities he has written or worked
   on, he describes sed v1.1 this way:

   "This is the fast, small sed originally distributed in the GNU
   toolkit and still distributed with Minix. The GNU people ditched it
   when they built their own sed around an enhanced regex package --
   but it's still better for some uses (in particular, faster and less
   memory-intensive)." (Version 1.3 fixes an unidentified bug and adds
   the L command to hexdump the current pattern space.)

2.2.1.2. OS/2

   GNU sed v3.02.80
       http://www2s.biglobe.ne.jp/~vtgf3mpr/gnu/sed.htm

   GNU sed v3.02
       http://hobbes.nmsu.edu/pub/os2/util/file/sed-3_02-r2-bin.zip # binaries
       http://hobbes.nmsu.edu/pub/os2/util/file/sed-3_02-r2.zip     # source

2.2.1.3. Microsoft Windows (Win3x, Win9x, WinNT, Win2K)

   GNU sed v4.0.5
   32-bit binaries and docs. Precompiled versions not available (yet).

   GNU sed v3.02.80
   32-bit binaries and docs, using DJGPP compiler. For details on new
   features, see Unix section, above.

       http://www.student.northpark.edu/pemente/sed/sed3028a.zip # DOS binaries
       ftp://alpha.gnu.org/pub/gnu/sed/sed-3.02.80.tar.gz        # source
       ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/sed3028b.zip # binaries
       ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/sed3028d.zip # docs
       ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/sed3028s.zip # source

   GNU sed v2.05
   32-bit binaries, no docs. Requires 80386 DX (SX will not run) and
   must be run in a DOS window or in a full screen DOS session under
   Microsoft Windows. Will not run in MS-DOS mode (outside Win/Win95).
   We recommend using the latest version of GNU sed.
       http://www.simtel.net/pub/win95/prog/gsed205b.zip
       ftp://ftp.cdrom.com/pub/simtelnet/win95/prog/gsed205b.zip

   GNU sed v1.03
   modified by Frank Whaley.

   This version was part of the "Virtually UN*X" toolset, hosted by
   itribe.net; that website is now closed. Gsed v1.03 supported Win9x
   long filenames, as well as hex, decimal, binary, and octal
   character representations.

   The Cygwin toolkit:
       http://www.cygwin.com

   Formerly known as "GNU-Win32 tools." According to their home page,
   "The Cygwin tools are Win32 ports of the popular GNU development
   tools for Windows NT, 95 and 98. They function through the use of
   the Cygwin library which provides a UNIX-like API on top of the
   Win32 API." The version of sed used is GNU sed v3.02.

   Minimalist GNU for Windows (MinGW):
       http://www.mingw.org
       http://mingw.sourceforge.net

   According to their home page, "MinGW ('Minimalist GNU for Windows')
   refers to a set of runtime headers, used in building a compiler
   system based on the GNU GCC and binutils projects. It compiles and
   links code to be run on Win32 platforms ... MinGW uses Microsoft
   runtime libraries, distributed with the Windows operating system."
   The version of sed used is GNU sed v3.02.

   sed v1.5 (a/k/a HHsed), by Howard Helman
   Compiled with Mingw32 for 32-bit environments described above. This
   version should support Win95 long filenames.
       http://www.dbnet.ece.ntua.gr/~george/sed/OLD/sed15.exe
       http://www.student.northpark.edu/pemente/sed/sed15exe.zip

2.2.1.4. MS-DOS

   sed v1.6 (from HHsed), by Walter Briscoe

   This is a forthcoming version, now in beta testing, but with many
   new features. It corrects all the bugs in sed v1.5, and adds the
   best features of sedmod v1.0 (below). It is available in 16-bit and
   32-bit compiled versions for MS-DOS. Sorry, no URLs available yet.

   sed v1.5 (a/k/a HHsed), by Howard Helman
   uncompiled source code (Turbo C)
       ftp://ftp.simtel.net/pub/simtelnet/msdos/txtutl/sed15.zip
       ftp://ftp.cdrom.com/pub/simtelnet/msdos/txtutl/sed15.zip

   DOS executable and documentation
       ftp://ftp.simtel.net/pub/simtelnet/msdos/txtutl/sed15x.zip
       ftp://ftp.cdrom.com/pub/simtelnet/msdos/txtutl/sed15x.zip

   sedmod v1.0, by Hern Chen
       http://www.ptug.org/sed/SEDMOD10.ZIP
       http://www.student.northpark.edu/pemente/sed/sedmod10.zip
       ftp://garbo.uwasa.fi/pc/unix/sedmod10.zip

   GNU sed v3.02.80
   See section 2.2.1.3 ("Microsoft Windows"), above.

   GNU sed v2.05
   Does not run under MS-DOS.

   GNU sed v1.18
   32-bit binaries and source, using DJGPP compiler. Requires 80386 SX
   or better. Also requires 3 CWS*.EXE extenders on the path. See
   section 5.5 ("What is CSDPMI*B.ZIP and why do I need it?"), below.
   We recommend using a newer version of GNU sed.
       http://www.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/sed118b.zip
       ftp://ftp.cdrom.com/pub/simtelnet/gnu/djgpp/v2gnu/sed118b.zip
       http://www.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/sed118s.zip
       ftp://ftp.cdrom.com/pub/simtelnet/gnu/djgpp/v2gnu/sed118s.zip

   GNU sed v1.06
   16-bit binaries and source. Should run under any MS-DOS system.
       http://www.simtel.net/pub/gnu/gnuish/sed106.zip
       ftp://ftp.cdrom.com/pub/simtelnet/gnu/gnuish/sed106.zip

2.2.1.5. CP/M

   ssed v2.2, by Chuck A. Forsberg

   Written for CP/M, ssed (for "small/stupid stream editor") supports
   only the a(ppend), c(hange), d(elete) and i(nsert) options, and
   apparently doesn't support regular expressions. A -u switch will
   "unsqueeze" compressed files and was used mainly in conjunction
   with DIF.COM for source code maintenance. (file: ssed22.lbr)

   change, by Michael M. Rubenstein

   Rubenstein released a version of sed called CHANGE.COM (the
   TTOOLS.LBR archive member CHANGE.CZM is a "crunched" file).
   CHANGE.COM supports full RE's except grouping and backreferences,
   and its only function is global substitution. (file: ttools.lbr)

2.2.1.6. Macintosh v8 or v9

   Since sed is a command-line utility, it is not customary to think
   of sed being used on a Mac. Nonetheless, the following instructions
   from Aurelio Jargas describe the process for running sed on MacOS
   version 8 or 9.

   (1) Download and install the Apple DiskCopy application

       ftp://ftp.apple.com/developer/Development_Kits

   (2) Download and install Apple MPW

       ftp://ftp.apple.com/developer/Tool_Chest/Core_Mac_OS_Tools/MPW_etc./

   (3) Download and expand Matthias Neeracher's GNU sed for MPW. (They
   seem to have misnumbered the sed filename.)

       ftp://sunsite.cnlab-switch.ch/software/platform/macos/src/mpw_c/sed-2.03.sit.bin

   (4) Enter the sed-3.02 directory and doubleclick the 'sed' file

   (5) MPW Shell will open up. It will be a command window instead of
   a command line, but sed should work as expected. For example:

       echo aa | sed 's/a/Z/g'<ENTER>

   Note that ENTER is different from RETURN on an iMac. Apple *also*
   has its own version of sed on MPW, called "StreamEdit", with a
   syntax fairly similar to that of normal sed.

2.2.2. Shareware and Commercial versions

2.2.2.1. Unix platforms

       [ Additional information needed. ]

2.2.2.2. OS/2

   Hamilton Labs:
       http://www.hamiltonlabs.com/cshell.htm

   A sizable set of Unix/C shell utilities designed for OS/2. Price is
   $350 in the US, $395 elsewhere, with FedEx shipping, unconditional
   guarantee, unlimited support and free updates. A demo version of
   the suite can be downloaded from this site, but a stand-alone copy
   of sed is not available.

2.2.2.3. Windows 95/98, Windows NT, Windows 2000

   Hamilton Labs:
       http://www.hamiltonlabs.com/cshell.htm

   A sizable set of Unix/C shell utilities designed for Win9x, WinNT,
   and Win2K. Price is $350 in the US, $395 elsewhere, with FedEx
   shipping, unconditional guarantee, unlimited support and free
   updates. A demo version of the suite can be downloaded from this
   site, but a stand-alone copy of sed is not available.

   Interix:
       http://www.interix.com

   Interix (formerly known as OpenNT) is advertised as "a complete
   UNIX system environment running natively on Microsoft Windows NT",
   and is licensed and supported by Softway Systems. It offers over
   200 Unix utilities, and supports Unix shells, sockets, networking,
   and more. A single-user edition runs about $200. A free demo or
   evaluation copy will run for 31 days and then quit; to continue
   using it, you must purchase the commercial version.

   MKS NuTCRACKER Professional
       http://www.datafocus.com/products/nutc/

   A different, yet related product line offered by MKS (Mortice Kern
   Systems, below); the awkward spelling "NuTCRACKER" is intentional.
   Various packages offer hundreds of Unix utilities for Win32
   environments. Sed is not available as a separate product.

   UnixDos:
       http://www.unixdos.com

   UnixDos is a suite of 82 Unix utilities ported over to the Windows
   environments. There are 16-bit versions for Win3.x and 32-bit
   versions for WinNT/Win95. It is distributed as uncrippled shareware
   for the first 30 days. After the test period, the utilities will
   not run and you must pay the registration fee of $50.

   Their version of sed supports "\n" in the RHS of expressions, and
   increases the length of input lines to 10,000 characters. By
   special arrangement with the owners, persons who want a licensed
   version of sed *only* (without the other utilities) may pay a
   license fee of $10.

   U/WIN:
       http://www.research.att.com/sw/tools/uwin/

   U/WIN is a suite of Unix utilities created for WinNT and Win95
   systems. It is owned by AT&T, created by David Korn (author of the
   Unix Korn shell), and is freely distributed only to educational
   institutions, AT&T employees, or certain researchers; all others
   must pay a fee after a 90-day evaluation period expires. U/WIN
   operates best with the NTFS (WinNT file system) but will run in
   degraded mode with the FAT file system and in further degraded mode
   under Win95. A minimal installation takes about 25 to 30 megs of
   disk space. Sed is not available as a separate file for download,
   but comes with the suite.

2.2.2.4. MS-DOS

   Mix C/Utilities Toolchest
       http://www.mixsoftware.com/product/utility.htm

   According to their web page, "The C/Utilities Toolchest adds over
   40 powerful UNIX utilities to your MS-DOS operating system. The
   result is an environment very similar to UNIX operating systems,
   yet 100% compatible with MS-DOS programs and commands." The
   toolchest costs $19.95, with source code available for an
   additional fee. Mix C's version of sed is not available separately.

   MKS (Mortice Kern Systems) Toolkit
       http://www.mks.com

   Sed comes bundled with the MKS Toolkit, which is distributed only
   as commercial software; it is not available separately.

   Thompson Automation Software
       http://www.tasoft.com

   The Thompson Toolkit contains over 100 familiar Unix utilities,
   including a version of the Unix Korn shell. It runs under MS-DOS,
   OS/2, Win3.x, Win9x, and WinNT. Sed is one of the utilities, though
   Thompson is better known for its version of awk for DOS, TAWK. The
   toolkit runs about $150; sed is not available separately.
fb58af

2.3. Where can I learn to use sed?

2.3.1. Books

       _Sed & Awk, 2d edition_, by Dale Dougherty & Arnold Robbins
       (Sebastopol, Calif: O'Reilly and Associates, 1997)
       ISBN 1-56592-225-5
       http://www.oreilly.com/catalog/sed2/noframes.html

   About 40 percent of this book is devoted to sed, and maybe 50
   percent is devoted to awk. The other 10 percent covers regexes and
   concepts common to both tools. If you prefer hard copy, this is
   definitely the best single place to learn to use sed, including its
   advanced features.

   The first edition is also very useful. Several typos crept into the
   first printing of the first edition (though if you follow the
   tutorials closely, you'll recognize them right away). A list of
   errors from the first printing of _sed & awk_ is available at
   <http://www.cs.colostate.edu/~dzubera/sedawk.txt>, and errors in
   the second edition are listed at
   <http://www.cs.colostate.edu/~dzubera/sedawk2.txt>, though most of
   these were corrected in later printings. The second edition tells
   how POSIX standards have affected these tools and covers the
   popular GNU versions of sed and awk. The price is about (US) $30.00.

   -----

       _Mastering Regular Expressions, 2d ed.,_ by Jeffrey E. F. Friedl
       (Sebastopol, Calif: O'Reilly and Associates, 2002)
       ISBN 0-596-00289-0
       http://regex.info
       http://www.oreilly.com/catalog/regex2/
       http://public.yahoo.com/~jfriedl/regex/ (for the first edition)

   Knowing how to use "regular expressions" is essential to effective
   use of most Unix tools. This book focuses on how regular
   expressions can be best implemented in utilities such as perl, vi,
   emacs, and awk, and also touches on sed. Friedl's home page
   (above) gives links to other sites which help students learn to
   master regular expressions. His site also gives a Perl script for
   determining a syntactically valid e-mail address, using regexes:

       http://public.yahoo.com/~jfriedl/regex/code.html

   -----

       _Awk und Sed_, by Helmut Herold.
       (Bonn: Addison-Wesley, 1994; 288 pages)
       2nd edition to be released in March 2003
       ISBN 3-8273-2094-1
       http://www.addison-wesley.de/main/main.asp?page=home/bookdetails&ProductID=37214

2.3.2. Mailing list

   If you are interested in learning more about sed (its syntax, using
   regular expressions, etc.) you are welcome to subscribe to a
   sed-oriented mailing list. In fact, there are two mailing lists
   about sed: one in English named "sed-users", moderated by Sven
   Guckes; and one in Portuguese named "sed-BR" (for sed-Brazil),
   moderated by Aurelio Marinho Jargas. The average volume of mail for
   "sed-users" is about 35 messages a week; the average volume of mail
   for "sed-BR" is about 15 messages a week.

       sed-BR mailing list:    http://br.groups.yahoo.com/group/sed-br/
       sed-users mailing list: http://groups.yahoo.com/group/sed-users/

   To subscribe to sed-users, send a blank message to:

       sed-users-subscribe@yahoogroups.com

   To unsubscribe from sed-users, send a blank message to:

       sed-users-unsubscribe@yahoogroups.com

2.3.3. Tutorials, electronic text

   The original users manual for sed, by Lee E. McMahon, from the
   7th edition UNIX Manual (1978), with the classic "Kubla Khan"
   example and tutorial, as formatted plain text:
       http://sed.sourceforge.net/grabbag/tutorials/sed_mcmahon.txt

   The source code to the preceding manual. Use "troff -ms sed" to
   print this file properly:
       http://plan9.bell-labs.com/7thEdMan/vol2/sed
       http://cm.bell-labs.com/7thEdMan/vol2/sed

   "Do It With Sed", by Carlos Duarte
       http://www.dbnet.ece.ntua.gr/~george/sed/OLD/sedtut_1.html

   "Sed: How to use sed, a special editor for modifying files
   automatically", by Bruce Barnett and General Electric Company
       http://www.grymoire.com/Unix/Sed.html

   U-SEDIT2.ZIP, by Mike Arst (16 June 1990)
       ftp://ftp.cs.umu.se/pub/pc/u-sedit2.zip
       ftp://ftp.uni-stuttgart.de/pub/systems/msdos/util/unixlike/u-sedit2.zip
       ftp://sunsite.icm.edu.pl/vol/wojsyl/garbo/pc/editor/u-sedit2.zip
       ftp://ftp.sogang.ac.kr/pub/msdos/garbo_pc/editor/u-sedit2.zip

   U-SEDIT3.ZIP, by Mike Arst (24 Jan. 1992)
       http://www.student.northpark.edu/pemente/sed/u-sedit3.zip
       CompuServe DTPFORUM, "PC DTP Utilities" library, file SEDDOC.ZIP

   Another sed FAQ
       http://www.dreamwvr.com/sed-info/sed-faq.html

   sed-tutorial, by Felix von Leitner
       http://www.math.fu-berlin.de/~leitner/sed/tutorial.html

   "Manipulating text with sed," chapter 14 of the SCO OpenServer
   "Operating System Users Guide"
       http://ou800doc.caldera.com/SHL_automate/CTOC-Manipulating_text_with_sed.html

   "Combining the Bourne-shell, sed and awk in the UNIX environment
   for language analysis," by Lothar Schmitt and Kiel Christianson.
   This basic tutorial on the Bourne shell, sed and awk downloads as a
   71-page PostScript file (compressed to 290K with gzip). You may
   need to navigate down from the root to get the file.
       ftp://ftp.u-aizu.ac.jp/u-aizu/doc/Tech-Report/1997/97-2-007.tar.gz
       available upon request from Lothar Schmitt <lothar@u-aizu.ac.jp>

2.3.4. General web and ftp sites

       http://sed.sourceforge.net/grabbag             # Collected scripts
       http://main.rtfiber.com.tw/~changyj/sed/       # Yao-Jen Chang
       http://www.math.fu-berlin.de/~guckes/sed/      # Sven Guckes
       http://www.math.fu-berlin.de/~leitner/sed/     # Felix von Leitner
       http://www.dbnet.ece.ntua.gr/~george/sed/      # Yiorgos Adamopoulos
       http://www.student.northpark.edu/pemente/sed/  # Eric Pement

       http://spacsun.rice.edu/FAQ/sed.html
       ftp://algos.inesc.pt/pub/users/cdua/scripts.tar.gz (sed and shell scripts)

   "Handy One-Liners For Sed", compiled by Eric Pement. A large list
   of 1-line sed commands which can be executed from the command line.
       http://sed.sourceforge.net/sed1line.txt
       http://www.student.northpark.edu/pemente/sed/sed1line.txt

   "Handy One-Liners For Sed", translated to Portuguese
       http://wmaker.lrv.ufsc.br/sed_ptBR.html

   The Single UNIX Specification, Version 3 (technical man page)
       http://www.opengroup.org/onlinepubs/007904975/utilities/sed.html

   Getting started with sed
       http://www.cs.hmc.edu/tech_docs/qref/sed.html

   masm to gas converter
       http://www.delorie.com/djgpp/faq/converting/asm2s-sed.html

   mail2html.zip
       http://www.crispen.org/src/#mail2html

   sample uses of sed in batch files and scripts (Benny Pederson)
       http://users.cybercity.dk/~bse26236/batutil/help/SED.HTM

   dc.sed - the most complex and impressive sed script ever written.
   This sed script by Greg Ubben emulates the Unix dc (desk
   calculator), including base conversion, exponentiation, square
   roots, and much more.
       http://sed.sourceforge.net/grabbag/scripts/dc_overview.htm

   If you should find other tutorials or scripts that should be added
   to this document, please forward the URLs to the FAQ maintainer.

------------------------------

3. TECHNICAL

3.1. More detailed explanation of basic sed

   Sed takes a script of editing commands and applies each command, in
   order, to each line of input. After all the commands have been
   applied to the first line of input, that line is output. A second
   input line is taken for processing, and the cycle repeats. Sed
   scripts can address a single line by line number or by matching a
   /RE pattern/ on the line. An exclamation mark '!' after a regex
   ('/RE/!') or line number will select all lines that do NOT match
   that address. Sed can also address a range of lines in the same
   manner, using a comma to separate the 2 addresses.

     $d               # delete the last line of the file
     /[0-9]\{3\}/p    # print lines with 3 consecutive digits
     5!s/ham/cheese/  # except on line 5, replace 'ham' with 'cheese'
     /awk/!s/aaa/bb/  # unless 'awk' is found, replace 'aaa' with 'bb'
     17,/foo/d        # delete from line 17 through the next 'foo' line
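
   The address forms above can be tried on a few lines of made-up
   input. This is only a sketch: the sample text and the shell
   variable names are invented for illustration, and line 3 and line 2
   stand in for the "5" and "17" used above.

```shell
# Hypothetical sample input: five short lines.
input=$(printf '%s\n' 'ham 1' 'ham 22' 'ham 333' 'awk aaa' 'ham aaa')

# $d deletes the last line.
last_deleted=$(printf '%s\n' "$input" | sed '$d')

# With -n, p prints only the lines holding 3 consecutive digits.
three_digits=$(printf '%s\n' "$input" | sed -n '/[0-9]\{3\}/p')

# 3! applies the substitution to every line except line 3.
not_line3=$(printf '%s\n' "$input" | sed '3!s/ham/cheese/')

# /awk/! applies the substitution only to lines without 'awk'.
no_awk=$(printf '%s\n' "$input" | sed '/awk/!s/aaa/bb/')

# 2,/333/ deletes from line 2 through the next line matching '333'.
range_deleted=$(printf '%s\n' "$input" | sed '2,/333/d')

printf '%s\n' "$range_deleted"    # prints 'ham 1', 'awk aaa', 'ham aaa'
```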

   Following an address or address range, sed accepts curly braces
   '{...}' so several commands may be applied to that line or to the
   lines matched by the address range. On the command line, semicolons
   ';' separate each instruction and must precede the closing brace.

     sed '/Owner:/{s/yours/mine/g;s/your/my/g;s/you/me/g;}' file
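
   A quick way to see the grouped commands at work is to feed them two
   made-up lines, only one of which matches the address. The input
   text and the variable name are assumptions for this sketch. Note
   that the longest word is replaced first, so "yours" is not partly
   rewritten by the shorter patterns.

```shell
# Only the "Owner:" line is edited; the "Guest:" line passes through.
grouped=$(printf '%s\n' 'Owner: you and yours' 'Guest: you and yours' |
  sed '/Owner:/{s/yours/mine/g;s/your/my/g;s/you/me/g;}')

printf '%s\n' "$grouped"
# Owner: me and mine
# Guest: you and yours
```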

   Range addresses operate differently depending on which version of
   sed is used (see section 3.4, below). For further information on
   using sed, consult the references in section 2.3, above.

3.1.1. Regular expressions on the left side of "s///"

   All versions of sed support Basic Regular Expressions (BREs). For
   the syntax of BREs, enter "man ed" at a Unix shell prompt. A
   technical description of BREs from IEEE POSIX 1003.1-2001 and the
   Single UNIX Specification Version 3 is available online at:
   http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap09.html#tag_09_03

   Sed normally supports BREs plus '\n' to match a newline in the
   pattern space, plus '\xREx' as equivalent to '/RE/', where 'x' is any
   character other than a newline or another backslash.
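
   The '\xREx' form is handy when the regex itself contains slashes.
   The sketch below assumes a sed that accepts this form (not all do)
   and uses two invented path names; both commands should select the
   same line.

```shell
paths=$(printf '%s\n' '/usr/bin/sed' '/opt/sed')

# Standard delimiters: every slash in the regex must be escaped.
with_slashes=$(printf '%s\n' "$paths" | sed -n '/\/usr\/bin/p')

# '%' as the delimiter: the slashes need no escaping.
with_percent=$(printf '%s\n' "$paths" | sed -n '\%/usr/bin%p')

printf '%s\n' "$with_percent"    # /usr/bin/sed
```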

   Some versions of sed support supersets of BREs, or "extended
   regular expressions", which offer additional metacharacters for
   increased flexibility. For additional information on extended REs
   in GNU sed, see sections 3.7 ("GNU/POSIX extensions to regular
   expressions") and 6.7.3 ("Special syntax in REs"), below.

   Though not required by BREs, some versions of sed support \t to
   represent a TAB, \r for carriage return, \xHH for direct entry of
   hex codes, and so forth. Other versions of sed do not.

   ssed (super-sed) introduced many new features for LHS pattern
   matching, too many to give here. The complete list is found in
   section 6.7.3.H ("ssed"), below.

3.1.2. Escape characters on the right side of "s///"

   The right-hand side (the replacement part) in "s/find/replace/" is
   almost always a string literal, with no interpolation of these
   metacharacters:

       .   ^   $   [   ]   {   }   (   )  ?   +   *   |

   Three things *are* interpolated: ampersand (&), backreferences, and
   options for special seds. An ampersand on the RHS is replaced by
   the entire expression matched on the LHS. There is _never_ any
   reason to use grouping like this:

       s/\(some-complex-regex\)/one two \1 three/

   since you can do this instead:

       s/some-complex-regex/one two & three/

   To enter a literal ampersand on the RHS, type '\&'.
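
   As a small illustration of the two cases just described (the sample
   strings and variable names are made up for this sketch):

```shell
# '&' stands for whatever the LHS matched, here the run of digits.
wrapped=$(echo 'order 1234 shipped' | sed 's/[0-9][0-9]*/<&>/')

# '\&' inserts a literal ampersand instead.
literal=$(echo 'salt and pepper' | sed 's/ and / \& /')

printf '%s\n' "$wrapped" "$literal"
# order <1234> shipped
# salt & pepper
```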

   Grouping and backreferences: All versions of sed support grouping
   and backreferences on the LHS and backreferences only on the RHS.
   Grouping allows a series of characters to be collected in a set,
   indicating the boundaries of the set with \( and \). Then the set
   can be designated to be repeated a certain number of times

       \(like this\)*   or   \(like this\)\{5,7\}.

   Groups can also be nested "\(like \(this\) is here\)" and may
   contain any valid RE. Backreferences repeat the contents of a
   particular group, using a backslash and a digit (1-9) for each
   corresponding group. In other words, "/\(pom\)\1/" is another way
   of writing "/pompom/". If groups are nested, backreference numbers
   are counted by matching \( in strict left to right order. Thus, in
   /..\(the \(word\)\) \("foo"\)../ the backreference \1 holds
   'the word', \2 holds 'word', and \3 holds '"foo"'. Backreferences
   can be used in the LHS, the RHS, and in normal RE addressing (see
   section 3.3).  Thus,

       /\(.\)\1\(.\)\2\(.\)\3/;      # matches "bookkeeper"
       /^\(.\)\(.\)\(.\)\3\2\1$/;    # finds 6-letter palindromes
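
   These patterns can be checked against a short word list. The words
   and variable names below are invented for the sketch; the last
   command shows how nested groups are numbered when they are used on
   the RHS.

```shell
words=$(printf '%s\n' 'bookkeeper' 'redder' 'sedsed' 'editor')

# Three doubled letters in a row: only "bookkeeper" qualifies.
doubled=$(printf '%s\n' "$words" | sed -n '/\(.\)\1\(.\)\2\(.\)\3/p')

# Six-letter palindromes: only "redder" qualifies.
palindromes=$(printf '%s\n' "$words" |
  sed -n '/^\(.\)\(.\)\(.\)\3\2\1$/p')

# Nested groups are numbered by their opening \( from left to right.
nested=$(echo 'the word "foo"' |
  sed 's/\(the \(word\)\) \("foo"\)/[\1][\2][\3]/')

printf '%s\n' "$nested"    # [the word][word]["foo"]
```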

   Seds differ in how they treat invalid backreferences where no
   corresponding group occurs. To insert a literal ampersand or
   backslash into the RHS, prefix it with a backslash: \& or \\.

   ssed, sed16, and sedmod permit additional options on the RHS. They
   all support changing part of the replacement string to upper case
   (\u or \U), lower case (\l or \L), or to end case conversion (\E).
   Both sed16 and sedmod support awk-style word references ($1, $2,
   $3, ...) and $0 to insert the entire line before conversion.

     echo ab ghi | sed16 "s/.*/$0 - \U$2/"   # prints "ab ghi - GHI"

   *Note:* This feature of sed16 and sedmod will break sed scripts which
   put a dollar sign and digit into the RHS. Though this is an unlikely
   combination, it's worth remembering if you use other people's scripts.

3.1.3.  Substitution switches

   Standard versions of sed support 4 main flags or switches which may
   be added to the end of an "s///" command. They are:

       N      - Replace the Nth match of the pattern on the LHS, where
                N is an integer between 1 and 512. If N is omitted,
                the default is to replace the first match only.
       g      - Global replace of all matches to the pattern.
       p      - Print the pattern space to stdout if a replacement was
                made, even if the -n switch is used.
       w file - Write the pattern space to 'file' if a replacement was
                done. If the file already exists when the script is
                executed, it is overwritten. During script execution,
                w appends to the file for each match.
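
   The four standard flags can be compared side by side. This sketch
   uses an invented test line and variable names, and it assumes the
   'mktemp' utility is available to create a scratch file for the 'w'
   flag.

```shell
line='foo foo foo'

first=$(echo "$line" | sed 's/foo/bar/')     # bar foo foo
second=$(echo "$line" | sed 's/foo/bar/2')   # foo bar foo
all=$(echo "$line" | sed 's/foo/bar/g')      # bar bar bar

# With -n, the p flag prints only lines where a replacement was made.
printed=$(printf '%s\n' 'foo' 'baz' | sed -n 's/foo/bar/p')   # bar

# The w flag writes each changed line to the named file.
tmp=$(mktemp)
printf '%s\n' 'foo' 'baz' | sed -n "s/foo/bar/w $tmp"
written=$(cat "$tmp")                        # bar
rm -f "$tmp"
```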

   GNU sed 3.02 and ssed also offer the /I switch for doing a
   case-insensitive match. For example,

     echo ONE TWO | gsed "s/one/unos/I"      # prints "unos TWO"

   GNU sed 4.x and ssed add the /M switch, to simplify working with
   multi-line patterns: when it is used, ^ and $ match at the
   beginning and end of each line within the pattern space. \` and \'
   remain available to match the start and end of the whole pattern
   space, respectively.
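
   The effect of /M shows up only when the pattern space holds more
   than one line. The following sketch assumes GNU sed 4.x is
   installed as 'sed' (other seds may reject the M flag); the input
   and variable names are made up.

```shell
# N joins the two input lines, so the pattern space is "one\ntwo".
# Without M, ^ matches only at the very start of the pattern space.
without_m=$(printf '%s\n' 'one' 'two' | sed 'N;s/^two/2/')

# With M, ^ also matches just after the embedded newline.
with_m=$(printf '%s\n' 'one' 'two' | sed 'N;s/^two/2/M')

printf '%s\n' "$with_m"
# one
# 2
```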

   ssed supports two more switches, /S and /X, when its Perl mode is
   used. They are described in detail in section 6.7.3.H, below.

3.1.4. Command-line switches

   All versions of sed support two switches, -e and -n. Though sed
   usually separates multiple commands with semicolons (e.g., "H;d;"),
   certain commands cannot accept a semicolon as a command separator
   in some versions of sed. These include :labels, 't', and 'b'. Such
   a command must come last in its part of the script, with the parts
   separated by -e option switches. For example:

     # The 'ta' means jump to label :a if last s/// returns true
     sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D' file
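
   To watch this script work, it can be run on three made-up lines,
   one of which begins with an equal sign (the input and the variable
   name are assumptions for this sketch):

```shell
# The "=value" line is appended to the line before it, with the
# newline and the "=" replaced by a single space.
joined=$(printf '%s\n' 'name' '=value' 'next' |
  sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D')

printf '%s\n' "$joined"
# name value
# next
```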

   The -n switch turns off sed's default behavior of printing every
   line. With -n, lines are printed only if explicitly told to. In
   addition, for certain versions of sed, if an external script begins
   with "#n" as its first two characters, the output is suppressed
   (exactly as if -n had been entered on the command line). A list of
   which versions appears in section 6.7.2., below.
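
   The difference -n makes is easiest to see with the 'p' command. In
   this sketch (sample input and variable names invented), the same
   script is run with and without the switch:

```shell
# Without -n, every line is printed once by default, and line 2 is
# printed a second time by the p command.
default_out=$(printf '%s\n' 'a' 'b' 'c' | sed '2p')

# With -n, only the explicit p produces output.
quiet_out=$(printf '%s\n' 'a' 'b' 'c' | sed -n '2p')

printf '%s\n' "$quiet_out"    # b
```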

   GNU sed 4.x and ssed support additional switches. -l (lowercase L),
   followed by a number, lets you adjust the default length of the 'l'
   and 'L' commands (note that these implementations of sed also
   support an argument to these commands, to tailor the length
   separately for each occurrence of the command).

   -i activates in-place editing (see section 4.41.1, below). -s
   treats each file as a separate stream: sed by default joins all the
   files, so $ represents the last line of the last file; 15 means the
   15th line in the joined stream; and /abc/,/def/ might match across
   files.

   When -s is used, however, all addresses refer to single files. For
   example, $ represents the last line of each input file; 15 means
   the 15th line of each input file; and /abc/,/def/ will be "reset"
   (in other words, sed will not execute the commands and start
   looking for /abc/ again) if a file ends before /def/ has been
   matched. Note that -i automatically activates this interpretation
   of addresses.
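
   The two interpretations of '$' can be compared with a pair of
   scratch files. This sketch assumes GNU sed 4.x (for -s) and the
   'mktemp' utility; the file names and contents are invented.

```shell
dir=$(mktemp -d)
printf '%s\n' 'a1' 'a2' > "$dir/a.txt"
printf '%s\n' 'b1' 'b2' > "$dir/b.txt"

# Default: the files form one stream, so $ is the last line of b.txt.
joined_last=$(sed -n '$p' "$dir/a.txt" "$dir/b.txt")

# With -s: $ matches the last line of each file in turn.
separate_last=$(sed -s -n '$p' "$dir/a.txt" "$dir/b.txt")

rm -rf "$dir"
printf '%s\n' "$separate_last"
# a2
# b2
```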

3.2. Common one-line sed scripts

   A separate document of over 70 handy "one-line" sed commands is
   available at
       http://sed.sourceforge.net/sed1line.txt

   Here are several common sed commands for one-line use. MS-DOS users
   should replace single quotes ('...') with double quotes ("...") in
   these examples. A specific filename usually follows the script,
   though the input may also come via piping or redirection.

   # Double space a file
   sed G file

   # Triple space a file
   sed 'G;G' file

   # Under UNIX: convert DOS newlines (CR/LF) to Unix format
   sed 's/.$//' file    # assumes that all lines end with CR/LF
   sed 's/^M$//' file   # in bash/tcsh, press Ctrl-V then Ctrl-M

   # Under DOS: convert Unix newlines (LF) to DOS format
   sed 's/$//' file                     # method 1
   sed -n p file                        # method 2

   # Delete leading whitespace (spaces/tabs) from front of each line
   # (this aligns all text flush left). '^t' represents a true tab
   # character. Under bash or tcsh, press Ctrl-V then Ctrl-I.
   sed 's/^[ ^t]*//' file

   # Delete trailing whitespace (spaces/tabs) from end of each line
   sed 's/[ ^t]*$//' file               # see note on '^t', above

   # Delete BOTH leading and trailing whitespace from each line
   sed 's/^[ ^t]*//;s/[ ^t]*$//' file   # see note on '^t', above

   # Substitute "foo" with "bar" on each line
   sed 's/foo/bar/' file        # replaces only 1st instance in a line
   sed 's/foo/bar/4' file       # replaces only 4th instance in a line
   sed 's/foo/bar/g' file       # replaces ALL instances within a line

   # Substitute "foo" with "bar" ONLY for lines which contain "baz"
   sed '/baz/s/foo/bar/g' file

   # Delete all CONSECUTIVE blank lines from file except the first.
   # This method also deletes all blank lines from top and end of file.
   # (emulates "cat -s")
   sed '/./,/^$/!d' file       # this allows 0 blanks at top, 1 at EOF
   sed '/^$/N;/\n$/D' file     # this allows 1 blank at top, 0 at EOF

   # Delete all leading blank lines at top of file (only).
   sed '/./,$!d' file

   # Delete all trailing blank lines at end of file (only).
   sed -e :a -e '/^\n*$/{$d;N;};/\n$/ba' file

   # If a line ends with a backslash, join the next line to it.
   sed -e :a -e '/\\$/N; s/\\\n//; ta' file

   # If a line begins with an equal sign, append it to the previous
   # line (and replace the "=" with a single space).
   sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D' file

3.3. Addressing and address ranges

   Sed commands may have an optional "address" or "address range"
   prefix. If there is no address or address range given, then the
   command is applied to all the lines of the input file or text
   stream. Three commands cannot take an address prefix:

      - labels, used to branch or jump within the script
      - the close brace, '}', which ends the '{' "command"
      - the '#' comment character, also technically a "command"

   An address can be a line number (such as 1, 5, 37, etc.), a regular
   expression (written in the form /RE/ or \xREx where 'x' is any
   character other than '\' and RE is the regular expression), or the
   dollar sign ($), representing the last line of the file. An
   exclamation mark (!) after an address or address range will apply
   the command to every line EXCEPT the ones named by the address. A
   null regex ("//") will be replaced by the last regex which was
   used. Also, some seds do not support \xREx as regex delimiters.

     5d               # delete line 5 only
     5!d              # delete every line except line 5
     /RE/s/LHS/RHS/g  # substitute only if RE occurs on the line
     /^$/b label      # if the line is blank, branch to ':label'
     /./!b label      # ... another way to write the same command
     \%.%!b label     # ... yet another way to write this command
     $!N              # on all lines but the last, get the Next line

   Note that an embedded newline can be represented in an address by
   the symbol \n, but this syntax is needed only if the script puts 2
   or more lines into the pattern space via the N, G, or other
   commands. The \n symbol does *not* match the newline at an
   end-of-line because when sed reads each line into the pattern space
   for processing, it strips off the trailing newline, processes the
   line, and adds a newline back when printing the line to standard
   output. To match the end-of-line, use the '$' metacharacter, as
   follows:

     /tape$/       # matches the word 'tape' at the end of a line
     /tape$deck/   # matches the word 'tape$deck' with a literal '$'
     /tape\ndeck/  # matches 'tape' and 'deck' with a newline between
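
   The point about \n can be confirmed with a two-line input: the
   address matches only after N has pulled the second line into the
   pattern space. The sample lines and variable names below are
   assumptions for this sketch.

```shell
# '$' anchors the match at the end of a line.
at_end=$(printf '%s\n' 'tape deck' 'red tape' | sed -n '/tape$/p')

# After N, the pattern space is "tape\ndeck", so \n can match.
joined=$(printf '%s\n' 'tape' 'deck' | sed -n 'N;/tape\ndeck/p')

# Without N, each line is examined alone and \n never matches.
not_joined=$(printf '%s\n' 'tape' 'deck' | sed -n '/tape\ndeck/p')

printf '%s\n' "$at_end"    # red tape
```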

   The following sed commands usually accept *only* a single address.
   All other commands (except labels, '}', and '#') accept both single
   addresses and address ranges.

     =       print to stdout the line number of the current line
     a       after printing the current line, append "text" to stdout
     i       before printing the current line, insert "text" to stdout
     q       quit after the current line is matched
     r file  prints contents of "file" to stdout after line is matched

   Note that we said "usually." If you need to apply the '=', 'a',
   'i', or 'r' commands to each and every line within an address
   range, this behavior can be coerced by the use of braces. Thus,
   "1,9=" is an invalid command, but "1,9{=;}" will print each line
   number followed by its line for the first 9 lines (and then print
   the rest of the file normally).
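
   A shortened version of the brace trick, using the range 1,2 on
   three invented lines, might look like this:

```shell
# Lines 1 and 2 are preceded by their line numbers; line 3 is not.
numbered=$(printf '%s\n' 'a' 'b' 'c' | sed '1,2{=;}')

printf '%s\n' "$numbered"
# 1
# a
# 2
# b
# c
```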

   Address ranges occur in the form

       <address1>,<address2>    or    <address1>,<address2>!

   where the address can be a line number or a standard /regex/.
   <address2> can also be a dollar sign, indicating the end of file.
   Under GNU sed 3.02+, ssed, and sed15+, <address2> may also be a
   notation of the form +num, indicating the next _num_ lines after
   <address1> is matched.

   Address ranges are:

   (1) Inclusive. The range "/From here/,/eternity/" matches all the
   lines containing "From here" up to and including the line
   containing "eternity". It will not stop on the line just prior to
   "eternity". (If you don't like this, see section 4.24.)

   (2) Plenary. They always match full lines, not just parts of lines.
   In other words, a command to change or delete an address range will
   change or delete whole lines; it won't stop in the middle of a
   line.

   (3) Multi-linear. Address ranges normally match 2 lines or more.
   The second address will never match the same line the first address
   did; therefore a valid address range always spans at least two
   lines, with these exceptions which match only one line:

      - if the first address matches the last line of the file
      - if using the syntax "/RE/,3" and /RE/ occurs only once in the
        file at line 3 or below
      - if using HHsed v1.5. See section 3.4.

   (4) Minimalist. In address ranges with /regex/ as <address2>, the
   range "/foo/,/bar/" will stop at the first "bar" it finds, provided
   that "bar" occurs on a line below "foo". If the word "bar" occurs
   on several lines below the word "foo", the range will match all the
   lines from the first "foo" up to the first "bar". It will not
   continue hopping ahead to find more "bar"s. In other words, unlike
   regular expressions, address ranges are not "greedy."

   (5) Repeating. An address range will try to match more than one
   block of lines in a file. However, the blocks cannot nest. In
   addition, a second match will not "take" the last line of the
   previous block.  For example, given the following text,

       start
       stop  start
       stop

   the sed command '/start/,/stop/d' will only delete the first two
   lines. It will not delete all 3 lines.
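
   The three-line example can be run directly. In this sketch the
   variable name is invented; only the final "stop" line should
   survive, because the "start" on line 2 was already consumed as the
   end of the first block.

```shell
remaining=$(printf '%s\n' 'start' 'stop  start' 'stop' |
  sed '/start/,/stop/d')

printf '%s\n' "$remaining"    # stop
```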

   (6) Relentless. If the address range finds a "start" match but
   doesn't find a "stop", it will match every line from "start" to the
   end of the file. Thus, beware of the following behaviors:

     /RE1/,/RE2/  # If /RE2/ is not found, matches from /RE1/ to the
                  # end-of-file.

     20,/RE/      # If /RE/ is not found, matches from line 20 to the
                  # end-of-file.

     /RE/,30      # If /RE/ occurs any time after line 30, each
                  # occurrence will be matched in sed15+, sedmod, and
                  # GNU sed v3.02+. GNU sed v2.05 and 1.18 will match
                  # from the 2nd occurrence of /RE/ to the end-of-file.
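
   These behaviors are easy to reproduce on a small scale. The sketch
   below uses invented input and variable names, with line 2 standing
   in for line 30; the last case shows the behavior described for the
   newer seds, which is what a current GNU sed is expected to do.

```shell
# No END anywhere: the range runs from BEGIN to the end of input.
no_stop=$(printf '%s\n' 'a' 'BEGIN' 'b' 'c' | sed '/BEGIN/,/END/d')

# With END present, the range closes and later lines survive.
with_stop=$(printf '%s\n' 'a' 'BEGIN' 'b' 'END' 'c' |
  sed '/BEGIN/,/END/d')

# /x/,2: a match after line 2 selects just that one line.
after_limit=$(printf '%s\n' '1 x' '2' '3' '4 x' '5' | sed -n '/x/,2p')

printf '%s\n' "$after_limit"
# 1 x
# 2
# 4 x
```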

   If these behaviors seem strange, remember that they occur because
   sed does not look "ahead" in the file. Doing so would stop sed from
   being a stream editor and have adverse effects on its efficiency.
   If these behaviors are undesirable, they can be circumvented or
   corrected by the use of nested testing within braces. The following
   scripts work under GNU sed 3.02:

     # Execute your_commands on range "/RE1/,/RE2/", but if /RE2/ is
     # not found, do nothing.
     /RE1/{:a;N;/RE2/!ba;your_commands;}

     # Execute your_commands on range "20,/RE/", but if /RE/ is not
     # found, do nothing.
     20{:a;N;/RE/!ba;your_commands;}
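
   Here is the first of these scripts filled in with a trivial
   command, so the two outcomes can be compared. This sketch assumes
   GNU sed running in its default (non-POSIXLY_CORRECT) mode, where a
   failed N at end of input prints the pattern space and quits; the
   markers BEGIN and END, the replacement text, and the variable names
   are all invented.

```shell
# "your_commands" is s/.*/GONE/, which replaces the gathered block.
script='/BEGIN/{:a;N;/END/!ba;s/.*/GONE/;}'

# END is found: the whole BEGIN..END block becomes one line.
found=$(printf '%s\n' 'x' 'BEGIN' 'y' 'END' 'z' | sed "$script")

# END is missing: the substitution never runs; input is unchanged.
missing=$(printf '%s\n' 'x' 'BEGIN' 'y' | sed "$script")

printf '%s\n' "$found"
# x
# GONE
# z
```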

   As a side note, once we've used N to "slurp" lines together to test
   for the ending expression, the pattern space will have gathered
   many lines (possibly thousands) together and concatenated them as a
   single expression, with the \n sequence marking line breaks. The
   REs *within* the pattern space may have to be modified (e.g., you
   must write '/\nStart/' instead of '/^Start/' and '/[^\n]*/' instead
   of '/.*/') and other standard sed commands will be unavailable or
   difficult to use.

     # Execute your_commands on range "/RE/,30", but if /RE/ occurs
     # on line 31 or later, do not match it.
     1,30{/RE/,$ your_commands;}

   For related suggestions on using address ranges, see sections 4.2,
   4.15, and 4.19 of this FAQ. Also, note the following section.

3.4. Address ranges in GNU sed and HHsed

   (1) GNU sed 3.02+, ssed, and sed15+ all support address ranges like:

       /regex/,+5

   which match /regex/ plus the next 5 lines (or EOF, whichever comes
   first).
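
   For instance, assuming GNU sed and some invented input, the +num
   form behaves as follows (the variable names are also made up):

```shell
# The matching line plus the next 2 lines are printed.
plus_two=$(printf '%s\n' 'a' 'hit' 'b' 'c' 'd' | sed -n '/hit/,+2p')

# If the input ends first, the range simply stops at EOF.
near_eof=$(printf '%s\n' 'a' 'hit' 'b' | sed -n '/hit/,+5p')

printf '%s\n' "$plus_two"
# hit
# b
# c
```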

   (2) GNU sed v3.02.80 (and above) and ssed support address ranges of:

       0,/regex/

   as a special case to permit matching /regex/ if it occurs on the
   first line. This syntax permits a range expression that matches
   every line from the top of the file to the first instance of
   /regex/, even if /regex/ is on the first line.
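
   The difference between "1,/regex/" and "0,/regex/" appears when the
   regex matches on line 1. This sketch assumes a GNU sed new enough
   to accept the 0 address; the input and variable names are invented.

```shell
# 1,/foo/: the end address is first tested on line 2, so the range
# runs on to the second "foo" and both are replaced.
from_one=$(printf '%s\n' 'foo' 'bar' 'foo' | sed '1,/foo/s/foo/X/')

# 0,/foo/: the end address may match on line 1, so the range ends
# there and only the first "foo" is replaced.
from_zero=$(printf '%s\n' 'foo' 'bar' 'foo' | sed '0,/foo/s/foo/X/')

printf '%s\n' "$from_zero"
# X
# bar
# foo
```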

fb58af
   (3) HHsed (sed15) has an exceptional way of implementing
fb58af

fb58af
       /regex1/,/regex2/
fb58af

fb58af
   If /RE1/ and /RE2/ both occur on the *same* line, HHsed will match
fb58af
   that single line. In other words, an address range block can
fb58af
   consist of just one line. HHsed will then look for the next
fb58af
   occurrence of /regex1/ to begin the block again.
fb58af

fb58af
   Every other version of sed (including sed16) requires 2 lines to
fb58af
   match an address range, and thus /regex1/ and /regex2/ cannot
fb58af
   successfully match just one line. See also the comments at
fb58af
   section 7.9.4, below.
fb58af

fb58af
   (4) BEGIN~STEP selection: ssed and GNU sed (v2.05 and above) offer
fb58af
   a form of addressing called "BEGIN~STEP selection". This is *not* a
fb58af
   range address, which selects an inclusive block of consecutive
fb58af
   lines from /start/ to /finish/. But I think it seems to belong here.
fb58af

fb58af
   Given an expression of the form "M~N", where M and N are integers,
fb58af
   GNU sed and ssed will select every Nth line, beginning at line M.
fb58af
   (With gsed v2.05, M had to be less than N, but this restriction is
fb58af
   no longer necessary). Both M and N may equal 0 ("0~0" selects every
fb58af
   line). These examples illustrate the syntax:
fb58af

fb58af
     sed '1~3d' file      # delete every 3d line, starting with line 1
fb58af
                          # deletes lines 1, 4, 7, 10, 13, 16, ...
fb58af

fb58af
     sed '0~3d' file      # deletes lines 3, 6, 9, 12, 15, 18, ...
fb58af

fb58af
     sed -n '2~5p' file   # print every 5th line, starting with line 2
fb58af
                          # prints lines 2, 7, 12, 17, 22, 27, ...
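
   A quick way to check these selections on your own system (a sketch
   that assumes GNU sed, plus the "seq" utility to generate numbered
   lines; the variable names are only for illustration):

```shell
# seq N prints the numbers 1 through N, one per line;
# tr joins the surviving lines so each result fits on one line.
kept_after_1_3d=$(seq 10 | sed '1~3d' | tr '\n' ' ')
kept_after_0_3d=$(seq 10 | sed '0~3d' | tr '\n' ' ')
printed_2_5p=$(seq 20 | sed -n '2~5p' | tr '\n' ' ')

printf '%s\n' "$kept_after_1_3d" "$kept_after_0_3d" "$printed_2_5p"
```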
fb58af

fb58af
   (5) Finally, GNU sed v2.05 has a bug in range addressing (see
fb58af
   section 7.5), which was fixed in later versions.
fb58af

fb58af

fb58af
3.5. Debugging sed scripts
fb58af

fb58af
   The following two debuggers should make it easier to understand how
fb58af
   sed scripts operate. They can save hours of grief when trying to
fb58af
   determine the problems with a sed script.
fb58af

fb58af
   (1) sd (sed debugger), by Brian Hiles
fb58af

fb58af
   This debugger runs under a Unix shell, is powerful, and is easy to
fb58af
   use. sd has conditional breakpoints and spypoints of the pattern
fb58af
   space and hold space, on any scope defined by regex match and/or
fb58af
   script line number. It can be semi-automated, can save diagnostic
fb58af
   reports, and shows potential problems with a sed script before it
fb58af
   tries to execute it. The script is robust and requires the Unix
fb58af
   shell utilities plus the Bourne shell or Korn shell to execute.
fb58af

fb58af
       http://sed.sourceforge.net/grabbag/scripts/sd.ksh.txt (2003)
fb58af
       http://sed.sourceforge.net/grabbag/scripts/sd.sh.txt  (1998)
fb58af

fb58af
   (2) sedsed, by Aurelio Jargas
fb58af

fb58af
   This debugger requires Python to run it, and it uses your own
fb58af
   version of sed, whatever that may be. It displays the current input
fb58af
   line, the pattern space, and the hold space, before and after each
fb58af
   sed command is executed.
fb58af

fb58af
       http://sedsed.sourceforge.net
fb58af

fb58af

fb58af
3.6. Notes about s2p, the sed-to-perl translator
fb58af

fb58af
   s2p (sed to perl) is a Perl program to convert sed scripts into the
fb58af
   Perl programming language; it is included with many versions of
fb58af
   Perl. These problems have been found when using s2p:
fb58af

fb58af
   (1) Doesn't recognize the semicolon properly after s/// commands.
fb58af

fb58af
       s/foo/bar/g;
fb58af

fb58af
   (2) Doesn't trim trailing whitespace after s/// commands. Even lone
fb58af
   trailing spaces, without comments, produce an error.
fb58af

fb58af
   (3) Doesn't handle multiple commands within braces. E.g.,
fb58af

fb58af
       1,4{=;G;}
fb58af

fb58af
   will produce perl code that omits the second command ("G"). In
   fact, any commands after the first one are missed in the perl
   output script, which will also contain mismatched braces.
fb58af

fb58af
3.7. GNU/POSIX extensions to regular expressions
fb58af

fb58af
   GNU sed supports "character classes" in addition to regular
fb58af
   character sets, such as [0-9A-F]. Like regular character sets,
fb58af
   character classes represent any single character within a set.
fb58af

fb58af
   "Character classes are a new feature introduced in the POSIX
fb58af
   standard. A character class is a special notation for describing
fb58af
   lists of characters that have a specific attribute, but where the
fb58af
   actual characters themselves can vary from country to country
fb58af
   and/or from character set to character set. For example, the notion
fb58af
   of what is an alphabetic character differs in the USA and in
fb58af
   France." [quoted from the docs for GNU awk v3.1.0.]
fb58af

fb58af
   Though character classes don't generally conserve space on the
fb58af
   line, they help make scripts portable for international use. The
fb58af
   equivalent character sets _for U.S. users_ follow:
fb58af

fb58af
     [[:alnum:]]  - [A-Za-z0-9]     Alphanumeric characters
fb58af
     [[:alpha:]]  - [A-Za-z]        Alphabetic characters
fb58af
     [[:blank:]]  - [ \x09]         Space or tab characters only
fb58af
     [[:cntrl:]]  - [\x00-\x1F\x7F] Control characters
fb58af
     [[:digit:]]  - [0-9]           Numeric characters
fb58af
     [[:graph:]]  - [!-~]           Printable and visible characters
fb58af
     [[:lower:]]  - [a-z]           Lower-case alphabetic characters
fb58af
     [[:print:]]  - [ -~]           Printable (non-Control) characters
fb58af
     [[:punct:]]  - [!-/:-@[-`{-~]  Punctuation characters
fb58af
     [[:space:]]  - [ \t\n\v\f\r]   All whitespace chars
fb58af
     [[:upper:]]  - [A-Z]           Upper-case alphabetic characters
fb58af
     [[:xdigit:]] - [0-9a-fA-F]     Hexadecimal digit characters
fb58af

fb58af
   Note that [[:graph:]] does not match the space " ", but [[:print:]]
fb58af
   does. Some character classes may (or may not) match characters in
fb58af
   the high ASCII range (ASCII 128-255 or 0x80-0xFF), depending on
fb58af
   which C library was used to compile sed. For non-English languages,
fb58af
   [[:alpha:]] and other classes may also match high ASCII characters.
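
   A small sketch of character classes in use. It assumes a sed with
   POSIX class support and sets LC_ALL=C so that the classes behave
   like the U.S. sets in the table; the sample line is invented:

```shell
# Squeeze each run of whitespace to one space, then mask every digit.
out=$(printf 'Item  42:\t\tok 7\n' |
  LC_ALL=C sed 's/[[:space:]][[:space:]]*/ /g; s/[[:digit:]]/#/g')
printf '%s\n' "$out"
```

   The doubled space and the two tabs each collapse to a single
   space, and the digits 4, 2, and 7 are replaced by "#".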
fb58af

fb58af
------------------------------
fb58af

fb58af
4. EXAMPLES
fb58af

fb58af
   ONE-CHARACTER QUESTIONS
fb58af

fb58af
4.1. How do I insert a newline into the RHS of a substitution?
fb58af

fb58af
   Several versions of sed permit '\n' to be typed directly into the
fb58af
   RHS, which is then converted to a newline on output: ssed,
fb58af
   gsed302a+, gsed103 (with the -x switch), sed15+, sedmod, and
fb58af
   UnixDOS sed. The _easiest_ solution is to use one of these
fb58af
   versions.
fb58af

fb58af
   For other versions of sed, try one of the following:
fb58af

fb58af
   (a) If typing the sed script from a Bourne shell, use one backslash
fb58af
   "\" if the script uses 'single quotes' or two backslashes "\\" if
fb58af
   the script requires "double quotes". In the example below, note
fb58af
   that the leading '>' on the 2nd line is generated by the shell to
fb58af
   prompt the user for more input. The user types in slash,
fb58af
   single-quote, and then ENTER to terminate the command:
fb58af

fb58af
     [sh-prompt]$ echo twolines | sed 's/two/& new\
fb58af
     >/'
fb58af
     two new
fb58af
     lines
fb58af
     [sh-prompt]$
fb58af

fb58af
   (b) Use a script file with one backslash '\' in the script,
fb58af
   immediately followed by a newline. This will embed a newline into
fb58af
   the "replace" portion. Example:
fb58af

fb58af
     sed -f newline.sed files
fb58af

fb58af
     # newline.sed
fb58af
     s/twolines/two new\
fb58af
     lines/g
fb58af

fb58af
   Some versions of sed may not need the trailing backslash. If so,
fb58af
   remove it.
fb58af

fb58af
   (c) Insert an unused character and pipe the output through tr:
fb58af

fb58af
     echo twolines | sed 's/two/& new=/' | tr "=" "\n"   # produces
fb58af
     two new
fb58af
     lines
fb58af

fb58af
   (d) Use the "G" command:
fb58af

fb58af
   G appends a newline, plus the contents of the hold space to the end
fb58af
   of the pattern space. If the hold space is empty, a newline is
fb58af
   appended anyway. The newline is stored in the pattern space as "\n"
fb58af
   where it can be addressed by grouping "\(...\)" and moved in the
fb58af
   RHS. Thus, to change the "twolines" example used earlier, the
fb58af
   following script will work:
fb58af

fb58af
     sed '/twolines/{G;s/\(two\)\(lines\)\(\n\)/\1\3\2/;}'
fb58af

fb58af
   (e) Inserting full lines, not breaking lines up:
fb58af

fb58af
   If one is not *changing* lines but only inserting complete lines
fb58af
   before or after a pattern, the procedure is much easier. Use the
fb58af
   "i" (insert) or "a" (append) command, making the alterations by an
fb58af
   external script. To insert "This line is new" BEFORE each line
fb58af
   matching a regex:
fb58af

fb58af
     /RE/i This line is new               # HHsed, sedmod, gsed 3.02a
fb58af
     /RE/{x;s/$/This line is new/;G;}     # other seds
fb58af

fb58af
   The two examples above are intended as "one-line" commands entered
fb58af
   from the console. If using a sed script, "i\" immediately followed
fb58af
   by a literal newline will work on all versions of sed. Furthermore,
fb58af
   the command "s/$/This line is new/" will only work if the hold
fb58af
   space is already empty (which it is by default).
fb58af

fb58af
   To append "This line is new" AFTER each line matching a regex:
fb58af

fb58af
     /RE/a This line is new               # HHsed, sedmod, gsed 3.02a
fb58af
     /RE/{G;s/$/This line is new/;}       # other seds
fb58af

fb58af
   To append 2 blank lines after each line matching a regex:
fb58af

fb58af
     /RE/{G;G;}                    # assumes the hold space is empty
fb58af

fb58af
   To replace each line matching a regex with 5 blank lines:
fb58af

fb58af
     /RE/{s/.*//;G;G;G;G;}         # assumes the hold space is empty
fb58af

fb58af
   (f) Use the "y///" command if possible:
fb58af

fb58af
   On some Unix versions of sed (not GNU sed!), though the s///
fb58af
   command won't accept '\n' in the RHS, the y/// command does. If
fb58af
   your Unix sed supports it, a newline after "aaa" can be inserted
fb58af
   this way (which is not portable to GNU sed or other seds):
fb58af

fb58af
     s/aaa/&~/; y/~/\n/;    # assuming no other '~' is on the line!
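
   The portable forms in (a), (b), and (e) can be tried directly from
   a Bourne-type shell. This is only a sketch with invented sample
   text: the first command embeds a newline in the RHS with
   backslash-newline, and the second uses "i\" followed by a literal
   newline:

```shell
# (a)/(b): a backslash followed by a real newline inside the RHS.
split=$(echo twolines | sed 's/two/& new\
/')

# (e): "i\" plus a literal newline inserts a full line before a match.
inserted=$(printf 'alpha\nbeta\n' | sed '/beta/i\
This line is new')

printf '%s\n' "$split" "$inserted"
```

   The first result is "two new" and "lines" on separate lines; the
   second places "This line is new" between "alpha" and "beta".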
fb58af

fb58af
4.2. How do I represent control-codes or nonprintable characters?
fb58af

fb58af
   Several versions of sed support the notation \xHH, where "HH" are
fb58af
   two hex digits, 00-FF: ssed, GNU sed v3.02.80 and above, GNU sed
fb58af
   v1.03, sed16 and sed15 (HHsed). Try to use one of those versions.
fb58af

fb58af
   Sed is not intended to process binary or object code, and files
fb58af
   which contain nulls (0x00) will usually generate errors in most
fb58af
   versions of sed. The latest versions of GNU sed and ssed are an
fb58af
   exception; they permit nulls in the input files and also in
fb58af
   regexes.
fb58af

fb58af
   On Unix platforms, the 'echo' command may allow insertion of octal
fb58af
   or hex values, e.g., `echo "\0nnn"` or `echo -n "\0nnn"`. The echo
fb58af
   command may also support syntax like '\\b' or '\\t' for backspace
fb58af
   or tab characters. Check the man pages to see what syntax your
fb58af
   version of echo supports. Some versions support the following:
fb58af

fb58af
     # replace 0x1A (32 octal) with ASCII letters
fb58af
     sed 's/'`echo "\032"`'/Ctrl-Z/g'
fb58af

fb58af
     # note the 3 backslashes in the command below
fb58af
     sed "s/.`echo \\\b`//g"
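
   Where echo's escape handling is uncertain, printf is one
   alternative (an assumption of this sketch, not something the echo
   examples depend on): POSIX printf accepts \ddd octal escapes in
   its format string, and the result can be spliced into a
   double-quoted sed script:

```shell
# Store the control character 0x1A (octal 032) in a shell variable.
ctrl_z=$(printf '\032')

# Build sample input containing that byte, then replace it with text.
out=$(printf 'before\032after\n' | sed "s/$ctrl_z/Ctrl-Z/g")
printf '%s\n' "$out"
```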
fb58af

fb58af
4.3. How do I convert files with toggle characters, like +this+, to
fb58af
look like [i]this[/i]?
fb58af

fb58af
   Input files, especially message-oriented text files, often contain
fb58af
   toggle characters for emphasis, like ~this~, *this*, or =this=. Sed
fb58af
   can make the same input pattern produce alternating output each
fb58af
   time it is encountered. Typical needs might be to generate HTML
   codes or print codes for boldface, italic, or underscore. This
   script accommodates multiple occurrences of the toggle pattern on
fb58af
   the same line, as well as cases where the pattern starts on one
fb58af
   line and finishes several lines later, even at the end of the file:
fb58af

fb58af
     # sed script to convert +this+ to [i]this[/i]
fb58af
     :a
fb58af
     /+/{ x;        # If "+" is found, switch hold and pattern space
fb58af
       /^ON/{       # If "ON" is in the (former) hold space, then ..
fb58af
         s///;      # .. delete it
fb58af
         x;         # .. switch hold space and pattern space back
fb58af
         s|+|[/i]|; # .. turn the next "+" into "[/i]"
fb58af
         ba;        # .. jump back to label :a and start over
fb58af
       }
fb58af
     s/^/ON/;       # Else, "ON" was not in the hold space; create it
fb58af
     x;             # Switch hold space and pattern space
fb58af
     s|+|[i]|;      # Turn the first "+" into "[i]"
fb58af
     ba;            # Branch to label :a to find another pattern
fb58af
     }
fb58af
     #---end of script---
fb58af

fb58af
   This script uses the hold space to create a "flag" to indicate
fb58af
   whether the toggle is ON or not. We have added remarks to
fb58af
   illustrate the script logic, but in most versions of sed remarks
fb58af
   are not permitted after 'b'ranch commands or labels.
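
   For seds that reject remarks after labels or branch commands, here
   is the same logic with the comments stripped, run against an
   invented two-line sample in which a toggle opens on one line and
   closes on the next. Holding the script in a shell variable is just
   a convenience for this sketch; a script file works equally well:

```shell
# The toggle script without inline remarks, one command per line.
toggle_script=':a
/+/{
x
/^ON/{
s///
x
s|+|[/i]|
ba
}
s/^/ON/
x
s|+|[i]|
ba
}'

out=$(printf 'a +b\nc+ d +e+\n' | sed "$toggle_script")
printf '%s\n' "$out"
```

   The output should be "a [i]b" followed by "c[/i] d [i]e[/i]".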
fb58af

fb58af
   If you are sure that the +toggle+ characters never cross line
fb58af
   boundaries (i.e., never begin on one line and end on another), this
fb58af
   script can be reduced to one line:
fb58af

fb58af
     s|+\([^+][^+]*\)+|[i]\1[/i]|g
fb58af

fb58af
   If your toggle pattern contains regex metacharacters (such as '*'
fb58af
   or perhaps '+' or '?'), remember to quote them with backslashes.
fb58af

fb58af
   CHANGING STRINGS
fb58af

fb58af
4.10. How do I perform a case-insensitive search?
fb58af

fb58af
   Several versions of sed support case-insensitive matching: ssed and
fb58af
   GNU sed v3.02+ (with I flag after s/// or /regex/); sedmod with the
fb58af
   -i switch; and sed16 (which supports both types of switches).
fb58af

fb58af
   With other versions of sed, case-insensitive searching is awkward,
fb58af
   so people may use awk or perl instead, since these programs have
fb58af
   options for case-insensitive searches. In gawk, use "BEGIN
fb58af
   {IGNORECASE=1}" and in perl, "/regex/i". For other seds, here are
fb58af
   three solutions:
fb58af

fb58af
   Solution 1: convert everything to upper case and search normally
fb58af

fb58af
     # sed script, solution 1
fb58af
     h;          # copy the original line to the hold space
fb58af
                 # convert the pattern space to solid caps
fb58af
     y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
fb58af
                 # now we can search for the word "CARLOS"
fb58af
     /CARLOS/ {
fb58af
          # add or insert lines. Note: "s/.../.../" will not work
fb58af
          # here because we are searching a modified pattern
fb58af
          # space and are not printing the pattern space.
fb58af
     }
fb58af
     x;          # get back the original pattern space
fb58af
                 # the original pattern space will be printed
fb58af
     #---end of sed script---
fb58af

fb58af
   Solution 2: search for both cases
fb58af

fb58af
   Often, proper names will be written in all lower-case ("unix"),
   with an initial capital letter ("Unix"), or in solid caps
   ("UNIX"). There may be no need to search for every possibility.
fb58af

fb58af
     /UNIX/b match
fb58af
     /[Uu]nix/b match
fb58af

fb58af
   Solution 3: search for all possible cases
fb58af

fb58af
     # If you must, search for any possible combination
fb58af
     /[Cc][Aa][Rr][Ll][Oo][Ss]/ { ... }
fb58af

fb58af
   Bear in mind that as the pattern length increases, this solution
fb58af
   becomes an order of magnitude slower than Solution 1, at
fb58af
   least with some implementations of sed.
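
   Solution 1 can be condensed into a case-insensitive "grep" that
   should run on most seds. This is a sketch with invented sample
   lines; the original line is swapped back in with "x" before
   printing, so the output keeps its original case:

```shell
# h copies the line; y/// upper-cases the working copy; on a match,
# x swaps the untouched original back in and p prints it.
out=$(printf 'hello Carlos\nno match here\nCARLOS again\ncarlos too\n' |
  sed -n -e 'h' \
         -e 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/' \
         -e '/CARLOS/{x;p;}')
printf '%s\n' "$out"
```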
fb58af

fb58af
4.11. How do I match only the first occurrence of a pattern?
fb58af

fb58af
   (1) The general solution is to use GNU sed or ssed, with one of
fb58af
   these range expressions. The first script ("print only the first
fb58af
   match") works with any version of sed:
fb58af

fb58af
     sed -n '/RE/{p;q;}' file       # print only the first match
fb58af
     sed '0,/RE/{//d;}' file        # delete only the first match
fb58af
     sed '0,/RE/s//to_that/' file   # change only the first match
fb58af

fb58af
   (2) If you cannot use GNU sed and if you *know* the pattern will
fb58af
   not occur on the first line, this will work:
fb58af

fb58af
     sed '1,/RE/{//d;}' file        # delete only the first match
fb58af
     sed '1,/RE/s//to_that/' file   # change only the first match
fb58af

fb58af
   (3) If you cannot use GNU sed and the pattern *might* occur on the
fb58af
   first line, use one of the following commands (credit for short GNU
fb58af
   script goes to Donald Bruce Stewart):
fb58af

fb58af
     sed '/RE/{x;/Y/!{s/^/Y/;h;d;};x;}' file       # delete (one way)
fb58af
     sed -e '/RE/{$d;N;s/^.*\n//;:a' -e '$!N;$!ba' -e '}' file  # delete
     sed '/RE/{$d;N;s/^.*\n//;:a;$!N;$!ba;}' file  # same script, GNU sed
fb58af
     sed -e '/RE/{s//to_that/;:a' -e '$!N;$!ba' -e '}' file  # change
fb58af

fb58af
   Still another solution, using a flag in the hold space. This is
fb58af
   portable to all seds and works if the pattern is on the first line:
fb58af

fb58af
     # sed script to change "foo" to "bar" only on the first occurrence
fb58af
     1{x;s/^/first/;x;}
fb58af
     /foo/{x;/first/{s///;x;s/foo/bar/;x;};x;}
fb58af
     #---end of script---
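
   To check that a script really touches only the first match, even
   when that match is on line 1, a test along these lines may help.
   It is a sketch using invented input, with "RE" standing in for a
   real regex, and it exercises the hold-space-flag delete and the
   "change" command from (3):

```shell
# "RE" appears on lines 1 and 3 of this made-up input.
input='RE one
plain
RE two'

# Delete only the first matching line (hold-space flag "Y").
deleted=$(printf '%s\n' "$input" | sed '/RE/{x;/Y/!{s/^/Y/;h;d;};x;}')

# Change only the first match, then read the rest of the file
# into the pattern space so no later line is examined.
changed=$(printf '%s\n' "$input" |
  sed -e '/RE/{s//to_that/;:a' -e '$!N;$!ba' -e '}')

printf '%s\n' "$deleted" "$changed"
```

   In both results the line "RE two" should survive unchanged.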
fb58af

fb58af
4.12. How do I parse a comma-delimited (CSV) data file?
fb58af

fb58af
   Comma-delimited data files can come in several forms, requiring
fb58af
   increasing levels of complexity in parsing and handling. They are
fb58af
   often referred to as CSV files (for "comma separated values") and
fb58af
   occasionally as SDF files (for "standard data format"). Note that
fb58af
   some vendors use "SDF" to refer to variable-length records with
fb58af
   comma-separated fields which are "double-quoted" if they contain
fb58af
   character values, while other vendors use "SDF" to designate
fb58af
   fixed-length records with fixed-length, nonquoted fields! (For help
fb58af
   with fixed-length fields, see question 4.13.)
fb58af

fb58af
   The term "CSV" became a de-facto standard when Microsoft Excel used
fb58af
   it as an optional output file format.
fb58af

fb58af
   Here are 4 different forms you may encounter in comma-delimited data:
fb58af

fb58af
   (a) No quotes, no internal commas
fb58af

fb58af
       1001,John Smith,PO Box 123,Chicago,IL,60699
fb58af
       1002,Mary Jones,320 Main,Denver,CO,84100
fb58af

fb58af
   (b) Like (a), with quotes around each field
fb58af

fb58af
       "1003","John Smith","PO Box 123","Chicago","IL","60699"
fb58af
       "1004","Mary Jones","320 Main","Denver","CO","84100"
fb58af

fb58af
   (c) Like (b), with embedded commas
fb58af

fb58af
       "1005","Tom Hall, Jr.","61 Ash Ct.","Niles","OH","44446"
fb58af
       "1006","Bob Davis","429 Pine, Apt. 5","Boston","MA","02128"
fb58af

fb58af
   (d) Like (c), with embedded commas and quotes
fb58af

fb58af
       "1007","Sue "Red" Smith","19 Main","Troy","MI","48055"
fb58af
       "1008","Joe "Hey, guy!" Hall","POB 44","Reno","NV","89504"
fb58af

fb58af
   In each example above, we have 6 fields and 5 commas which function
fb58af
   as field separators. Case (c) is a very typical form of these data
fb58af
   files, with double quotes used to enclose each field and to protect
fb58af
   internal commas (such as "Tom Hall, Jr.") from interpretation as
fb58af
   field separators. However, many times the data may include both
fb58af
   embedded quotation marks as well as embedded commas, as seen by
fb58af
   case (d), above.
fb58af

fb58af
   Case (d) is the closest to Microsoft CSV format. *However*, the
fb58af
   Microsoft CSV format allows embedded newlines within a
fb58af
   double-quoted field. If embedded newlines within fields are a
fb58af
   possibility for your data, you should consider using something
fb58af
   other than sed to work with the data file.
fb58af

fb58af
   Before handling a comma-delimited data file, make sure that you
fb58af
   fully understand its format and check the integrity of the data.
fb58af
   Does each line contain the same number of fields? Should certain
fb58af
   fields be composed only of numbers or of two-letter state
fb58af
   abbreviations in all caps? Sed (or awk or perl) should be used to
fb58af
   validate the integrity of the data file before you attempt to alter
fb58af
   it or extract particular fields from the file.
fb58af

fb58af
   After ensuring that each line has a valid number of fields, use sed
fb58af
   to locate and modify individual fields, using the \(...\) grouping
fb58af
   command where needed.
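
   As one possible integrity check for case (a), the sketch below
   prints every record that does not have exactly 6 comma-separated
   fields. The second sample record is deliberately one field short,
   and the field count is taken from the sample data above:

```shell
# One field, then exactly five more ",field" groups, anchored at both
# ends. Lines that fail this test are printed.
bad=$(printf '%s\n' \
  '1001,John Smith,PO Box 123,Chicago,IL,60699' \
  '1002,Mary Jones,320 Main,Denver,CO' |
  sed -n '/^[^,]*\(,[^,]*\)\{5\}$/!p')
printf '%s\n' "$bad"
```

   Only the short record is printed. An empty result would mean
   every line has the expected number of fields.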
fb58af

fb58af
   In case (a):
fb58af

fb58af
     sed 's/^[^,]*,[^,]*,[^,]*,[^,]*,/.../'
fb58af
             ^     ^     ^
fb58af
             |     |     |_ 3rd field
fb58af
             |     |_______ 2nd field
fb58af
             |_____________ 1st field
fb58af

fb58af
     # Unix script to delete the second field for case (a)
fb58af
     sed 's/^\([^,]*\),[^,]*,/\1,,/' file
fb58af

fb58af
     # Unix script to change field 1 to 9999 for case (a)
fb58af
     sed 's/^[^,]*,/9999,/' file
fb58af

fb58af
   In cases (b) and (c):
fb58af

fb58af
     sed 's/^"[^"]*","[^"]*","[^"]*","[^"]*",/.../'
fb58af
              1st--   2nd--   3rd--   4th--
fb58af

fb58af
     # Unix script to delete the second field for case (c)
fb58af
     sed 's/^\("[^"]*"\),"[^"]*",/\1,"",/' file
fb58af

fb58af
     # Unix script to change field 1 to 9999 for case (c)
fb58af
     sed 's/^"[^"]*",/"9999",/' file
fb58af

fb58af

fb58af
   In case (d):
fb58af

fb58af
   One way to parse such files is to replace the 3-character field
fb58af
   separator "," with an unused character like the tab or vertical
fb58af
   bar. (Technically, the field separator is only the comma while the
fb58af
   fields are surrounded by "double quotes", but the net _effect_ is
fb58af
   that fields are separated by quote-comma-quote, with quote
fb58af
   characters added to the beginning and end of each record.) Search
fb58af
   your datafile _first_ to make sure that your character appears
fb58af
   nowhere in it!
fb58af

fb58af
     sed -n '/|/p' file        # search for any instance of '|'
fb58af
     # if it's not found, we can use the '|' to separate fields
fb58af

fb58af
   Then replace the 3-character field separator and parse as before:
fb58af

fb58af
     # sed script to delete the second field for case (d)
fb58af
     s/","/|/g;                  # global change of "," to bar
fb58af
     s/^\([^|]*\)|[^|]*|/\1||/;  # delete 2nd field
fb58af
     s/|/","/g;                  # global change of bar back to ","
fb58af
     #---end of script---
fb58af

fb58af
     # sed script to change field 1 to 9999 for case (d)
fb58af
     # Remember to accommodate leading and trailing quote marks
fb58af
     s/","/|/g;
fb58af
     s/^[^|]*|/"9999|/;
fb58af
     s/|/","/g;
fb58af
     #---end of script---
fb58af

fb58af
   Note that this technique works only if _each_ and _every_ field is
fb58af
   surrounded with double quotes, including empty fields.
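
   Run against one of the case (d) sample records, the
   delete-second-field steps look like this (a sketch; it assumes
   that every field is double-quoted and that "|" does not occur
   anywhere in the data):

```shell
record='"1007","Sue "Red" Smith","19 Main","Troy","MI","48055"'

out=$(printf '%s\n' "$record" |
  sed -e 's/","/|/g' \
      -e 's/^\([^|]*\)|[^|]*|/\1||/' \
      -e 's/|/","/g')
printf '%s\n' "$out"
```

   The second field comes back as an empty pair of quotes, and the
   embedded quotes never reach the pattern because the whole field
   is matched as "anything except a bar".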
fb58af

fb58af
   The following solution is for more complex examples of (d), such
fb58af
   as: not all fields contain "double-quote" marks, or the presence of
fb58af
   embedded "double-quote" marks within fields, or extraneous
fb58af
   whitespace around field delimiters. (Thanks to Greg Ubben for this
fb58af
   script!)
fb58af

fb58af
     # sed script to convert case (d) to bar-delimited records
fb58af
     s/^ *\(.*[^ ]\) *$/|\1|/;
fb58af
     s/" *, */"|/g;
fb58af
     : loop
fb58af
     s/| *\([^",|][^,|]*\) *, */|\1|/g;
fb58af
     s/| *, */||/g;
fb58af
     t loop
fb58af
     s/  *|/|/g;
fb58af
     s/|  */|/g;
fb58af
     s/^|\(.*\)|$/\1/;
fb58af
     #---end of script---
fb58af

fb58af
   For example, it turns this (which is badly-formed but legal):
fb58af

fb58af
   first,"",unquoted ,""this" is, quoted " ,, sub "quote" inside, f", lone  "   empty:
fb58af

fb58af
   into this:
fb58af

fb58af
   first|""|unquoted|""this" is, quoted "||sub "quote" inside|f"|lone  "   empty:
fb58af

fb58af
   Note that the script preserves the "double-quote" marks, but
fb58af
   changes only the commas where they are used as field separators. I
fb58af
   have used the vertical bar "|" because it's easier to read, but you
fb58af
   may change this to another field separator if you wish.
fb58af

fb58af
   If your CSV datafile is more complex, it would probably not be
fb58af
   worth the effort to write it in sed. For such a case, you should
fb58af
   use Perl with a dedicated CSV module (there are at least two recent
fb58af
   CSV parsers available from CPAN).
fb58af

fb58af
4.13. How do I handle fixed-length, columnar data?
fb58af

fb58af
   Sed handles fixed-length fields via \(grouping\) and backreferences
fb58af
   (\1, \2, \3 ...). If we have 3 fields of 10, 25, and 9 characters
fb58af
   per field, our sed script might look like so:
fb58af

fb58af
     s/^\(.\{10\}\)\(.\{25\}\)\(.\{9\}\)/\3\2\1/;  # Change the fields
fb58af
        ^^^^^^^^^^^~~~~~~~~~~~==========           #   from 1,2,3 to 3,2,1
fb58af
         field #1   field #2   field #3
fb58af

fb58af
   This is a bit hard to read. By using GNU sed or ssed with the -r
fb58af
   switch active, it can look like this:
fb58af

fb58af
     s/^(.{10})(.{25})(.{9})/\3\2\1/;          # Using the -r switch
fb58af

fb58af
   To delete a field in sed, use grouping and omit the backreference
fb58af
   from the field to be deleted. If the data is long or difficult to
fb58af
   work with, use ssed with the -R switch and the /x flag after an s///
fb58af
   command, to insert comments and remarks about the fields.
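
   A scaled-down sketch of both operations, using invented field
   widths of 3, 5, and 2 characters so the result is easy to check by
   eye:

```shell
line='abc12345XY'    # fields: "abc", "12345", "XY"

# Reorder the fields from 1,2,3 to 3,2,1.
swapped=$(echo "$line" | sed 's/^\(.\{3\}\)\(.\{5\}\)\(.\{2\}\)/\3\2\1/')

# Delete field 2 by matching it without capturing it.
dropped=$(echo "$line" | sed 's/^\(.\{3\}\).\{5\}\(.\{2\}\)/\1\2/')

printf '%s\n' "$swapped" "$dropped"
```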
fb58af

fb58af
   For records with many fields, use GNU awk with the FIELDWIDTHS
fb58af
   variable set in the top of the script. For example:
fb58af

fb58af
     awk 'BEGIN{FIELDWIDTHS = "10 25 9"}; {print $3 $2 $1}' file
fb58af

fb58af
   This is much easier to read than a similar sed script, especially
fb58af
   if there are more than 5 or 6 fields to manipulate.
fb58af

fb58af
4.14. How do I commify a string of numbers?
fb58af

fb58af
   Use the simplest script necessary to accomplish your task. As
fb58af
   variations of the line increase, the sed script must become more
fb58af
   complex to handle additional conditions. Whole numbers are
fb58af
   simplest, followed by decimal formats, followed by embedded words.
fb58af

fb58af
   Case 1: simple strings of whole numbers separated by spaces or
fb58af
   commas, with an optional negative sign. To convert this:
fb58af

fb58af
       4381, -1222333, and 70000: - 44555666 1234567890 words
fb58af
       56890  -234567, and 89222  -999777  345888777666 chars
fb58af

fb58af
   to this:
fb58af

fb58af
       4,381, -1,222,333, and 70,000: - 44,555,666 1,234,567,890 words
fb58af
       56,890  -234,567, and 89,222  -999,777  345,888,777,666 chars
fb58af

fb58af
   use one of these one-liners:
fb58af

fb58af
     sed ':a;s/\B[0-9]\{3\}\>/,&/;ta'                      # GNU sed
fb58af
     sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'  # other seds
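
   For example, the second ("other seds") one-liner can be tried on a
   short invented line. Each pass of the loop inserts one comma,
   working from right to left, until no run of four digits remains:

```shell
out=$(echo '1234567 and 89 or -4381' |
  sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta')
printf '%s\n' "$out"
```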
fb58af

fb58af
   Case 2: strings of numbers which may have an embedded decimal
fb58af
   point, separated by spaces or commas, with an optional negative
fb58af
   sign. To change this:
fb58af

fb58af
       4381,  -6555.1212 and 70000,  7.18281828  44906982.071902
fb58af
       56890   -2345.7778 and 8.0000:  -49000000 -1234567.89012
fb58af

fb58af
   to this:
fb58af

fb58af
       4,381,  -6,555.1212 and 70,000,  7.18281828  44,906,982.071902
fb58af
       56,890   -2,345.7778 and 8.0000:  -49,000,000 -1,234,567.89012
fb58af

fb58af
   use the following command for GNU sed:
fb58af

fb58af
     sed ':a;s/\(^\|[^0-9.]\)\([0-9]\+\)\([0-9]\{3\}\)/\1\2,\3/g;ta'
fb58af

fb58af
   and for other versions of sed:
fb58af

fb58af
     sed -f case2.sed files
fb58af

fb58af
     # case2.sed
fb58af
     s/^/ /;                 # add space to start of line
fb58af
     :a
fb58af
     s/\( [-0-9]\{1,\}\)\([0-9]\{3\}\)/\1,\2/g
fb58af
     ta
fb58af
     s/ //;                  # remove space from start of line
fb58af
     #---end of script---
fb58af

fb58af
4.15. How do I prevent regex expansion on substitutions?
fb58af

fb58af
   Sometimes you want to *match* regular expression metacharacters as
fb58af
   literals (e.g., you want to match "[0-9]" or "\n"), to be replaced
fb58af
   with something else. The ordinary way to prevent expanding
fb58af
   metacharacters is to prefix them with a backslash. Thus, if "\n"
fb58af
   matches a newline, "\\n" will match the two-character string of
fb58af
   'backslash' followed by 'n'.
fb58af

fb58af
   But doing this repeatedly can become tedious if there are many
fb58af
   regexes. The following script will replace alternating strings of
fb58af
   literals, where no character is interpreted as a regex
fb58af
   metacharacter:
fb58af

fb58af
     # filename: sub_quote.sed
fb58af
     #   author: Paolo Bonzini
fb58af
     # sed script to add backslash to find/replace metacharacters
fb58af
     N;                  # add even numbered line to pattern space
fb58af
     s,[]/\\$*[],\\&,g;  # quote all of [, ], /, \, $, or *
fb58af
     s,^,s/,;            # prepend "s/" to front of pattern space
fb58af
     s,$,/,;             # append "/" to end of pattern space
fb58af
     s,\n,/,;            # change "\n" to "/", making s/from/to/
fb58af
     #---end of script---
fb58af

fb58af
   Here's a sample of how sub_quote.sed might be used. This example
fb58af
   converts typical sed regexes to perl-style regexes. The input file
fb58af
   consists of 10 lines:
fb58af

fb58af
       [0-9]
fb58af
       \d
fb58af
       [^0-9]
fb58af
       \D
fb58af
       \+
fb58af
       +
fb58af
       \?
fb58af
       ?
fb58af
       \|
fb58af
       |
fb58af

fb58af
   Run the command "sed -f sub_quote.sed input", to transform the
fb58af
   input file (above) to 5 lines of output:
fb58af

fb58af
       s/\[0-9\]/\\d/
fb58af
       s/\[^0-9\]/\\D/
fb58af
       s/\\+/+/
fb58af
       s/\\?/?/
fb58af
       s/\\|/|/
fb58af

fb58af
   The above file is itself a sed script, which can then be used to
fb58af
   modify other files.
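
   The whole round trip can be sketched in a few lines of shell. Here
   the script is held in a variable instead of a file, purely for
   convenience, and the sample text "x[0-9]y and 5" is invented:

```shell
# sub_quote.sed, without remarks, held in a shell variable.
quote_script='N
s,[]/\\$*[],\\&,g
s,^,s/,
s,$,/,
s,\n,/,'

# Two input lines (find, replace) become one s/// command.
generated=$(printf '%s\n' '[0-9]' '\d' | sed "$quote_script")
printf '%s\n' "$generated"

# The generated command now treats "[0-9]" as literal text.
result=$(printf '%s\n' 'x[0-9]y and 5' | sed "$generated")
printf '%s\n' "$result"
```

   The digit "5" is left alone, because the generated command
   matches only the literal string "[0-9]".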
fb58af

fb58af
4.16. How do I convert a string to all lowercase or capital letters?
fb58af

fb58af
   The easiest method is to use a new version of GNU sed, ssed, sedmod
fb58af
   or sed16 and employ the \U, \L, or other switches on the right side
fb58af
   of an s/// command. For example, to convert any word which begins
fb58af
   with "reg" or "exp" into solid capital letters:
fb58af

fb58af
       sed -r "s/\<(reg|exp)[a-z]+/\U&/g"              # gsed4.+ or ssed
fb58af
       sed "s/\
fb58af

fb58af
   As you can see, sedmod and sed16 do not support alternation (|),
fb58af
   but they do support case conversion. If none of these versions of
fb58af
   sed are available to you, some sample scripts for this task are
fb58af
   available from the Seder's Grab Bag:
fb58af

fb58af
       http://sed.sourceforge.net/grabbag/scripts
fb58af

fb58af
   Note that some case conversion scripts are listed under "Filename
fb58af
   manipulation" and others are under "Text formatting."
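
   With GNU sed 4.x, the \U switch behaves as follows on an invented
   sample line. For seds without \U, a whole line can still be
   upper-cased with y///, shown second:

```shell
# GNU sed: upper-case any word starting with "reg" or "exp".
words=$(echo 'regular expressions are expressive' |
  sed -r 's/\<(reg|exp)[a-z]+/\U&/g')

# Any sed: upper-case every letter on the line.
whole=$(echo 'regular expressions' |
  sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')

printf '%s\n' "$words" "$whole"
```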
fb58af

fb58af
   CHANGING BLOCKS (consecutive lines)
fb58af

fb58af
4.20. How do I change only one section of a file?
fb58af

fb58af
   You can match a range of lines by line number, by regexes (say, all
fb58af
   lines between the words "from" and "until"), or by a combination of
fb58af
   the two. For multiple substitutions on the same range, put the
fb58af
   command(s) between braces {...}. For example:
fb58af

fb58af
     # replace only between lines 1 and 20
fb58af
     1,20 s/Johnson/White/g
fb58af

fb58af
     # replace everywhere EXCEPT between lines 1 and 20
fb58af
     1,20 !s/Johnson/White/g
fb58af

fb58af
     # replace only between words "from" and "until". Note the
fb58af
     # use of \<....\> as word boundary markers in GNU sed.
fb58af
     /from/,/until/ { s/\<red\>/magenta/g; s/\<blue\>/cyan/g; }
fb58af

fb58af
     # replace only from the words "ENDNOTES:" to the end of file
fb58af
     /ENDNOTES:/,$ { s/Schaff/Herzog/g; s/Kraft/Ebbing/g; }
fb58af

fb58af
   For technical details on using address ranges, see section 3.3
fb58af
   ("Addressing and Address ranges").
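
   A runnable sketch of the regex-delimited form, with five invented
   input lines; only the lines from "from" through "until" are
   eligible for the substitution:

```shell
out=$(printf 'red\nfrom here\nred\nuntil here\nred\n' |
  sed '/from/,/until/s/red/magenta/')
printf '%s\n' "$out"
```

   The first and last "red" lie outside the range and are left
   untouched.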
fb58af

fb58af
4.21. How do I delete or change a block of text if the block contains
fb58af
      a certain regular expression?
fb58af

fb58af
   The following deletes the block between 'start' and 'end'
fb58af
   inclusively, if and only if the block contains the string
fb58af
   'regex'. Written by Russell Davies, with additional comments:
fb58af

fb58af
     # sed script to delete a block if /regex/ matches inside it
fb58af
     :t
fb58af
     /start/,/end/ {    # For each line between these block markers..
fb58af
        /end/!{         #   If we are not at the /end/ marker
fb58af
           $!{          #     nor the last line of the file,
fb58af
              N;        #     add the Next line to the pattern space
fb58af
              bt
fb58af
           }            #   and branch (loop back) to the :t label.
fb58af
        }               # This line matches the /end/ marker.
fb58af
        /regex/d;       # If /regex/ matches, delete the block.
fb58af
     }                  # Otherwise, the block will be printed.
fb58af
     #---end of script---
fb58af

fb58af
   Note: When the script above reaches /regex/, the entire multi-line
fb58af
   block is in the pattern space. To replace items inside the block,
fb58af
   use "s///". To change the entire block, use the 'c' (change)
fb58af
   command:
fb58af

fb58af
     /regex/c\
fb58af
     1: This will replace the entire block\
fb58af
     2: with these two lines of text.
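
   To try the block-deleting script from the start of this section,
   it can be run over a small input containing two start/end blocks,
   only one of which contains "regex". This is a sketch: the sample
   data and the shell variable names are invented, and the script is
   the same one with its comments removed.

```shell
# Made-up input: the first block lacks "regex", the second has it.
input='before
start
keep me
end
middle
start
has regex inside
end
after'

# Same logic as the commented script: gather the whole block into
# the pattern space, then delete it if /regex/ matches.
script=':t
/start/,/end/ {
   /end/!{
      $!{
         N
         bt
      }
   }
   /regex/d
}'

result=$(printf '%s\n' "$input" | sed "$script")
printf '%s\n' "$result"
```

   The first block is printed intact and the second disappears,
   markers included.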
fb58af

fb58af
4.22. How do I locate a paragraph of text if the paragraph contains a
fb58af
      certain regular expression?
fb58af

fb58af
   Assume that paragraphs are separated by blank lines. For regexes
fb58af
   that are single terms, use one of the following scripts:
fb58af

fb58af
     sed -e '/./{H;$!d;}' -e 'x;/regex/!d'      # most seds
fb58af
     sed '/./{H;$!d;};x;/regex/!d'              # GNU sed
fb58af

fb58af
   To print paragraphs only if they contain 3 specific regular
fb58af
   expressions (RE1, RE2, and RE3), in any order in the paragraph:
fb58af

fb58af
     sed -e '/./{H;$!d;}' -e 'x;/RE1/!d;/RE2/!d;/RE3/!d'
fb58af

fb58af
   With this solution and the preceding one, if the paragraphs are
fb58af
   excessively long (more than 4k in length), you may overflow sed's
fb58af
   internal buffers. If using HHsed, you must add a "G;" command
fb58af
   immediately after the "x;" in the scripts above to defeat a bug
fb58af
   in HHsed (see section 7.9(5), below, for a description).
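
   The single-regex script can be checked against three short
   paragraphs, as in the sketch below. The input and the variable
   names are made up for the demonstration. Note that each paragraph
   printed is preceded by a blank line, because 'H' always adds a
   newline in front of the text it appends to the hold space.

```shell
# Made-up input: three paragraphs separated by blank lines.
input='apple pie
cherry tart

plain bread
rye bread

apple strudel'

# Print only the paragraphs that contain "apple".
result=$(printf '%s\n' "$input" | sed -e '/./{H;$!d;}' -e 'x;/apple/!d')
printf '%s\n' "$result"
```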
fb58af

fb58af
4.23. How do I match a block of _specific_ consecutive lines?
fb58af

fb58af
   There are three ways to approach this problem:
fb58af

fb58af
       (1) Try to use a "/range/, /expression/"
fb58af
       (2) Try to use a "/multi-line\nexpression/"
fb58af
       (3) Try to use a block of "literal strings"
fb58af

fb58af
   We describe each approach in the following sections.
fb58af

fb58af
4.23.1.  Try to use a "/range/, /expression/"
fb58af

fb58af
   If the block consists of strings that *never change their order*
fb58af
   and if the top line never occurs outside the block, like this:
fb58af

fb58af
       Abel
fb58af
       Baker
fb58af
       Charlie
fb58af
       Delta
fb58af

fb58af
   then these solutions will work for deleting the block:
fb58af

fb58af
     sed '/^Abel$/{N;N;N;d;}' files     # for blocks with few lines
fb58af
     sed '/^Abel$/, /^Zebra$/d' files   # for blocks with many lines
fb58af
     sed '/^Abel$/,+25d' files          # HHsed, sedmod, ssed, gsed 3.02.80
fb58af

fb58af
   To change the block, use the 'c' (change) command instead of 'd'.
fb58af
   To print that block only, use the -n switch and 'p' (print) instead
fb58af
   of 'd'. To change some things inside the block, try this:
fb58af

fb58af
     /^Abel$/,/^Delta$/ {
fb58af
         :ack
fb58af
         N;
fb58af
         /\nDelta$/! b ack
fb58af
         # At this point, all the lines in the block are collected
fb58af
         s/ubstitute /somethin/g;
fb58af
     }
fb58af

fb58af
4.23.2.  Try to use a "/multi-line\nexpression/"
fb58af

fb58af
   If the top line of the block sometimes appears alone or is
fb58af
   sometimes followed by other lines, or if a partial block may occur
fb58af
   somewhere in the file, a multi-line expression may be required.
fb58af

fb58af
   In these examples, we give solutions for matching an N-line block.
fb58af
   The expression "/^RE1\nRE2\nRE3...$/" represents a properly formed
fb58af
   regular expression where \n indicates a newline between lines. Note
fb58af
   that the 'N' followed by the 'P;D;' commands forms a "sliding
fb58af
   window" technique. A window of N lines is formed. If the multi-line
fb58af
   pattern matches, the block is handled. If not, the top line is
fb58af
   printed and then deleted from the pattern space, and we try to
fb58af
   match at the next line.
fb58af

fb58af
     # sed script to delete 2 consecutive lines: /^RE1\nRE2$/
fb58af
     $b
fb58af
     /^RE1$/ {
fb58af
       $!N
fb58af
       /^RE1\nRE2$/d
fb58af
       P;D
fb58af
     }
fb58af
     #---end of script---
fb58af

fb58af
     # sed script to delete 3 consecutive lines. (This script
fb58af
     # fails under GNU sed v2.05 and earlier because of the 't'
fb58af
     # bug when s///n is used; see section 7.5(1) of the FAQ.)
fb58af
     : more
fb58af
     $!N
fb58af
     s/\n/&/2;
fb58af
     t enough
fb58af
     $!b more
fb58af
     : enough
fb58af
     /^RE1\nRE2\nRE3$/d
fb58af
     P;D
fb58af
     #---end of script---
fb58af

fb58af
   For example, to delete a block of 5 consecutive lines, the previous
fb58af
   script must be altered in only two places:
fb58af

fb58af
   (1) Change the 2 in "s/\n/&/2;" to a 4 (the trailing semicolon is
fb58af
   needed to work around a bug in HHsed v1.5).
fb58af

fb58af
   (2) Change the regex line to "/^RE1\nRE2\nRE3\nRE4\nRE5$/d",
fb58af
   modifying the expression as needed.
fb58af

fb58af
   Suppose we want to delete a block of two blank lines followed by
fb58af
   the word "foo" followed by another blank line (4 lines in all).
fb58af
   Other blank lines and other instances of "foo" should be left
fb58af
   alone. After changing the '2' to a '3' (always one number less than
fb58af
   the total number of lines), the regex line would look like this:
fb58af
   "/^\n\nfoo\n$/d". (Thanks to Greg Ubben for this script.)
fb58af

fb58af
   As an alternative to work around the 't' bug in older versions of
fb58af
   GNU sed, the following script will delete 4 consecutive lines:
fb58af

fb58af
     # sed script to delete 4 consecutive lines. Use this if you
fb58af
     # are restricted to GNU sed 2.05 or earlier.
fb58af
     /^RE1$/!b
fb58af
     $!N
fb58af
     $!N
fb58af
     :a
fb58af
     $b
fb58af
     N
fb58af
     /^RE1\nRE2\nRE3\nRE4$/d
fb58af
     P
fb58af
     s/^.*\n\(.*\n.*\n.*\)$/\1/
fb58af
     ba
fb58af
     #---end of script---
fb58af

fb58af
   Its drawback is that it must be modified in 3 places instead of 2
fb58af
   to adapt it for more lines, and as additional lines are added, the
fb58af
   's' command is forced to work harder to match the regexes. On the
fb58af
   other hand, it avoids a bug with gsed-2.05 and illustrates another
fb58af
   way to solve the problem of deleting consecutive lines.
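
   For the four-line "foo" block described earlier in this section
   (two blank lines, the word "foo", one more blank line), the
   't'-based script with its count changed to 3 can be tried as
   sketched here. The input is made up; it contains one complete
   block, plus a later blank line and a later "foo" that should both
   survive.

```shell
# Made-up input: a, blank, blank, foo, blank, b, blank, foo
input='a


foo

b

foo'

# Sliding window of 4 lines (3 embedded newlines).
script=':more
$!N
s/\n/&/3;
t enough
$!b more
:enough
/^\n\nfoo\n$/d
P;D'

result=$(printf '%s\n' "$input" | sed "$script")
printf '%s\n' "$result"
```

   Only the complete four-line block is removed; the isolated blank
   line and the final "foo" are printed unchanged.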
fb58af

fb58af
4.23.3.  Try to use a block of "literal strings"
fb58af

fb58af
   If you need to match a static block of text (which may occur any
fb58af
   number of times throughout a file), where the contents of the block
fb58af
   are known in advance, then this script is easy to use. It requires
fb58af
   an intermediate file, which we will call "findrep.txt" (below):
fb58af

fb58af
       A block of several consecutive lines to
fb58af
       be matched literally should be placed on
fb58af
       top. Regular expressions like .*  or [a-z]
fb58af
       will lose their special meaning and be
fb58af
       interpreted literally in this block.
fb58af
       ----
fb58af
       Four hyphens separate the two sections. Put
fb58af
       the replacement text in the lower section.
fb58af
       As above, sed symbols like &, \n, or \1 will
fb58af
       lose their special meaning.
fb58af

fb58af
   This is a 3-step process. A generic script called "blockrep.sed"
fb58af
   will read "findrep.txt" (above) and generate a custom script, which
fb58af
   is then used on the actual input file. In other words,
fb58af
   "findrep.txt" is a simplified description of the editing that you
fb58af
   want to do on the block, and "blockrep.sed" turns it into actual
fb58af
   sed commands.
fb58af

fb58af
   Use this process from a Unix shell or from a DOS prompt:
fb58af

fb58af
     sed -nf blockrep.sed findrep.txt >custom.sed
fb58af
     sed -f custom.sed input.file >output.file
fb58af
     erase custom.sed                    # or "rm custom.sed" under Unix
fb58af

fb58af
   The generic script "blockrep.sed" follows below. It's fairly long.
fb58af
   Examining its output might help you understand how to use the
fb58af
   _sliding window_ technique.
fb58af

fb58af
     # filename: blockrep.sed
fb58af
     #   author: Paolo Bonzini
fb58af
     # Requires:
fb58af
     #    (1) blocks to find and replace, e.g., findrep.txt
fb58af
     #    (2) an input file to be changed, input.file
fb58af
     #
fb58af
     # blockrep.sed creates a second sed script, custom.sed,
fb58af
     # to find the lines above the row of 4 hyphens, globally
fb58af
     # replacing them with the lower block of text. GNU sed
fb58af
     # is recommended but not required for this script.
fb58af
     #
fb58af
     # Loop on the first part, accumulating the `from' text
fb58af
     # into the hold space.
fb58af
     :a
fb58af
     /^----$/! {
fb58af
        # Escape slashes, backslashes, the final newline and
fb58af
        # regular expression metacharacters.
fb58af
        s,[/\[.*],\\&,g
fb58af
        s/$/\\/
fb58af
        H
fb58af
        #
fb58af
        # Append N cmds needed to maintain the sliding window.
fb58af
        x
fb58af
        1 s,^.,s/,
fb58af
        1! s/^/N\
fb58af
     /
fb58af
        x
fb58af
        n
fb58af
        ba
fb58af
     }
fb58af
     #
fb58af
     # Change the final backslash to a slash to separate the
fb58af
     # two sides of the s command.
fb58af
     x
fb58af
     s,\\$,/,
fb58af
     x
fb58af
     #
fb58af
     # Until EOF, gather the substitution into hold space.
fb58af
     :b
fb58af
     n
fb58af
     s,[/\],\\&,g
fb58af
     $! s/$/\\/
fb58af
     H
fb58af
     $! bb
fb58af
     #
fb58af
     # Start the RHS of the s command without a leading
fb58af
     # newline, add the P/D pair for the sliding window, and
fb58af
     # print the script.
fb58af
     g
fb58af
     s,/\n,/,
fb58af
     s,$,/\
fb58af
     P\
fb58af
     D,p
fb58af
     #---end of script---
fb58af

fb58af
4.24. How do I address all the lines between RE1 and RE2, excluding the
fb58af
      lines themselves?
fb58af

fb58af
   Normally, to address the lines between two regular expressions, RE1
fb58af
   and RE2, one would do this: '/RE1/,/RE2/{commands;}'. Excluding
fb58af
   those lines takes an extra step. To put 2 arrows before each line
fb58af
   between RE1 and RE2, except for those lines:
fb58af

fb58af
     sed '1,/RE1/!{ /RE2/,/RE1/!s/^/>>/; }' input.fil
fb58af

fb58af
   The preceding script, though short, may be difficult to follow. It
fb58af
   also requires that /RE1/ cannot occur on the first line of the
fb58af
   input file. The following script, though it's not a one-liner, is
fb58af
   easier to read and it permits /RE1/ to appear on the first line:
fb58af

fb58af
     # sed script to prefix ">>" to all lines between /RE1/ and /RE2/,
fb58af
     # without matching /RE1/ or /RE2/
fb58af
     /RE1/,/RE2/{
fb58af
       /RE1/b
fb58af
       /RE2/b
fb58af
       s/^/>>/
fb58af
     }
fb58af
     #---end of script---
fb58af

fb58af
   Contents of input.fil:         Output of sed script:
fb58af
      aaa                           aaa
fb58af
      bbb                           bbb
fb58af
      RE1                           RE1
fb58af
      aaa                           >>aaa
fb58af
      bbb                           >>bbb
fb58af
      ccc                           >>ccc
fb58af
      RE2                           RE2
fb58af
      end                           end
fb58af

fb58af
4.25. How do I join two lines if line #1 ends in a [certain string]?
fb58af

fb58af
   This question appears in the section on one-line sed scripts, but
fb58af
   it comes up so many times that it needs a place here also. Suppose
fb58af
   a line ends with a particular string (often, a line ends with a
fb58af
   backslash). How do you bring up the second line after it, even in
fb58af
   cases where several consecutive lines all end in a backslash?
fb58af

fb58af
     sed -e :a -e '/\\$/N; s/\\\n//; ta' file   # all seds
fb58af
     sed ':a; /\\$/N; s/\\\n//; ta' file        # GNU sed, ssed, HHsed
fb58af

fb58af
   Note that this replaces the backslash-newline with nothing. You may
fb58af
   want to replace the backslash-newline with a single space instead.
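
   A sketch of the single-space variant, run over made-up input in
   which two consecutive lines end in a backslash (the variable
   names are illustrative only):

```shell
# Made-up input. Inside single quotes the shell leaves each
# backslash-newline pair untouched.
input='one\
two\
three
four'

# Same loop as above, but the RHS of s/// is a single space.
result=$(printf '%s\n' "$input" | sed -e :a -e '/\\$/N; s/\\\n/ /; ta')
printf '%s\n' "$result"
```

   The first three lines are joined into "one two three"; the line
   "four" is printed as it was.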
fb58af

fb58af
4.26. How do I join two lines if line #2 begins with a [certain string]?
fb58af

fb58af
   The inverse situation is another FAQ. Suppose a line begins with a
fb58af
   particular string. How do you bring that line up to follow the
fb58af
   previous line? In this example, we want to match the string "<<="
fb58af
   at the beginning of one line, bring that line up to the end of the
fb58af
   line before it, and replace the string with a single space:
fb58af

fb58af
     sed -e :a -e '$!N;s/\n<<=/ /;ta' -e 'P;D' file   # all seds
fb58af
     sed ':a; $!N;s/\n<<=/ /;ta;P;D' file             # GNU, ssed, sed15+
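
   To see the effect, the first form can be run over a few made-up
   lines, two of which begin with "<<=" (a sketch; the input and the
   variable names are assumptions):

```shell
# Made-up input: lines 2 and 3 should be pulled up onto line 1.
input='alpha
<<=beta
<<=gamma
delta'

result=$(printf '%s\n' "$input" | sed -e :a -e '$!N;s/\n<<=/ /;ta' -e 'P;D')
printf '%s\n' "$result"
```

   Both continuation lines are appended to "alpha", each with the
   "<<=" marker turned into one space, and "delta" stays on its own
   line.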
fb58af

fb58af
4.27. How do I change all paragraphs to long lines?
fb58af

fb58af
   A frequent request is how to convert DOS-style textfiles, in which
fb58af
   each line ends with a "paragraph marker", to Microsoft-style
fb58af
   textfiles, in which the "paragraph" marker only appears at the end
fb58af
   of real paragraphs. Sometimes this question is framed as, "How do I
fb58af
   remove the hard returns at the end of each line in a paragraph?"
fb58af

fb58af
   The problem occurs because newer word processors don't work the
fb58af
   same way older text editors did. Older text editors used a newline
fb58af
   (CR/LF in DOS; LF alone in Unix) to end each line on screen or on
fb58af
   disk, and used two newlines to separate paragraphs. Certain word
fb58af
   processors wanted to make paragraph reformatting and reflowing work
fb58af
   easily, so they use one newline to end a paragraph and never allow
fb58af
   newlines _within_ a paragraph. This means that textfiles created
fb58af
   with standard editors (Emacs, vi, Vedit, Boxer, etc.) appear to
fb58af
   have "hard returns" at inappropriate places. The following sed
fb58af
   script finds blocks of consecutive nonblank lines (i.e., paragraphs
fb58af
   of text), and converts each block into one long line with one "hard
fb58af
   return" at the end.
fb58af

fb58af
     # sed script to change all paragraphs to long lines
fb58af
     /./{H; $!d;}             # Put each paragraph into hold space
fb58af
     x;                       # Swap hold space and pattern space
fb58af
     s/^\(\n\)\(..*\)$/\2\1/; # Move leading \n to end of PatSpace
fb58af
     s/\n\(.\)/ \1/g;         # Replace all other \n with 1 space
fb58af
     # Uncomment the following line to remove excess blank lines:
fb58af
     # /./!d;
fb58af
     #---end of sed script---
fb58af

fb58af
   If the input files have formatting or indentation that conveys
fb58af
   special meaning (like program source code), this script will remove
fb58af
   it. But if the text still needs to be extended, try 'par'
fb58af
   (paragraph reformatter) or the 'fmt' utility with the -t or -c
fb58af
   switches and the width option (-w) set to a number like 9999.
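
   The paragraph-joining script can be tried on two short made-up
   paragraphs, as sketched below. Here the four commands are passed
   as separate -e options instead of being read from a script file;
   the input and variable names are illustrative.

```shell
# Made-up input: two paragraphs of two lines each.
input='line one
line two

line three
line four'

result=$(printf '%s\n' "$input" |
  sed -e '/./{H;$!d;}' \
      -e 'x' \
      -e 's/^\(\n\)\(..*\)$/\2\1/' \
      -e 's/\n\(.\)/ \1/g')
printf '%s\n' "$result"
```

   Each paragraph comes out as one long line followed by a blank
   line. (The command substitution used to capture the output drops
   the final blank line.)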
fb58af

fb58af
   SHELL AND ENVIRONMENT
fb58af

fb58af
4.30. How do I read environment variables with sed?
fb58af

fb58af
4.30.1. - on Unix platforms
fb58af

fb58af
   In Unix shells, variable references begin with a dollar sign, such as
fb58af
   $TERM, $PATH, $var or $i. In sed, the dollar sign is used to
fb58af
   indicate the last line of the input file, the end of a line (in the
fb58af
   LHS), or a literal symbol (in the RHS). Sed cannot access variables
fb58af
   directly, so one must pay attention to shell quoting requirements
fb58af
   to expand the variables properly.
fb58af

fb58af
   To ALLOW the Unix shell to interpret the dollar sign, put the
fb58af
   script in double quotes:
fb58af

fb58af
     sed "s/_terminal-type_/$TERM/g" input.file >output.file
fb58af

fb58af
   To PREVENT the Unix shell from interpreting the dollar sign as a
fb58af
   shell variable, put the script in single quotes:
fb58af

fb58af
     sed 's/.$//' infile >outfile
fb58af

fb58af
   To use BOTH Unix $environment_vars and sed /end-of-line$/ pattern
fb58af
   matching, there are two solutions. (1) The easiest is to enclose
fb58af
   the script in "double quotes" so the shell can see the $variables,
fb58af
   and to prefix the sed metacharacter ($) with a backslash. Thus, in
fb58af

fb58af
     sed "s/$user\$/root/" file
fb58af

fb58af
   the shell interpolates $user and sed interprets \$ as the symbol
fb58af
   for end-of-line.
fb58af

fb58af
   (2) Another method--somewhat less readable--is to concatenate the
fb58af
   script with 'single quotes' where the $ should not be interpolated
fb58af
   and "double quotes" where variable interpolation should occur. To
fb58af
   demonstrate using the preceding script:
fb58af

fb58af
     sed "s/$user"'$/root/' file
fb58af

fb58af
   Solution #1 seems easier to remember. In either case, we search for
fb58af
   the user's name (stored in a variable called $user) when it occurs
fb58af
   at the end of the line ($), and substitute the word "root" in all
fb58af
   matches.
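
   Both quoting styles hand sed the same script, which the following
   sketch confirms. The value given to $user and the sample lines
   are made up.

```shell
user='alice'          # hypothetical value for the demonstration
input='alice
owner: alice
alice is here'

# Solution 1: double quotes, with the sed metacharacter escaped.
r1=$(printf '%s\n' "$input" | sed "s/$user\$/root/")

# Solution 2: double quotes for the variable, single for the rest.
r2=$(printf '%s\n' "$input" | sed "s/$user"'$/root/')

printf '%s\n' "$r1"
```

   In both cases only an "alice" at the end of a line is replaced,
   so the third line is unchanged.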
fb58af

fb58af
   For longer shell scripts, it is sometimes useful to begin with
fb58af
   single quote marks ('), close them upon encountering the variable,
fb58af
   enclose the variable name in double quotes ("), and resume with
fb58af
   single quotes, closing them at the end of the sed script.  Example:
fb58af

fb58af
     #! /bin/sh
fb58af
     # shell script to illustrate 'quote'"matching"'usage'
fb58af
     FROM='abcdefgh'
fb58af
     TO='ABCDEFGH'
fb58af
     sed -e '
fb58af
     y/'"$FROM"'/'"$TO"'/;    # note the quote pairing
fb58af
     # some more commands go here . . .
fb58af
     # last line is a single quote mark
fb58af
     '
fb58af

fb58af
   Thus, each character found in $FROM is changed to the corresponding
   character in $TO, and the single quotes are used to glue the
   multiple lines together in the script. (See also section 4.32,
   "How do I handle Unix shell quoting in sed?")
fb58af

fb58af
4.30.2. - on MS-DOS and 4DOS platforms
fb58af

fb58af
   Under 4DOS and MS-DOS version 7.0 (Win95) or 7.10 (Win95 OSR2),
fb58af
   environment variables can be accessed from the command prompt.
fb58af
   Under MS-DOS v6.22 and below, environment variables can only be
fb58af
   accessed from within batch files. Environment variables should be
fb58af
   enclosed between percent signs and are case-insensitive; i.e.,
fb58af
   %USER% or %user% will display the USER variable. To generate a true
fb58af
   percent sign, just enter it twice.
fb58af

fb58af
   DOS versions of sed require that sed scripts be enclosed by double
fb58af
   quote marks "..." (not single quotes!) if the script contains
fb58af
   embedded tabs, spaces, redirection arrows or the vertical bar. In
fb58af
   fact, if the input for sed comes from piping, a sed script should
fb58af
   not contain a vertical bar, even if it is protected by double
fb58af
   quotes (this seems to be a bug in the normal MS-DOS syntax). Thus,
fb58af

fb58af
       echo blurk | sed "s/^/ |foo /"     # will cause an error
fb58af
       sed "s/^/ |foo /" blurk.txt        # will work as expected
fb58af

fb58af
   Using DOS environment variables which contain DOS path statements
fb58af
   (such as a TMP variable set to "C:\TEMP") within sed scripts is
fb58af
   discouraged because sed will interpret the backslash '\' as a
fb58af
   metacharacter to "quote" the next character, not as a normal
fb58af
   symbol. Thus,
fb58af

fb58af
       sed "s/^/%TMP% /" somefile.txt
fb58af

fb58af
   will not prefix each line with (say) "C:\TEMP ", but will prefix
fb58af
   each line with "C:TEMP "; sed will discard the backslash, which is
fb58af
   probably not what you want. Other variables such as %PATH% and
fb58af
   %COMSPEC% will also lose the backslash within sed scripts.
fb58af

fb58af
   Environment variables which do not use backslashes are usually
fb58af
   workable. Thus, all the following should work without difficulty,
fb58af
   if they are invoked from within DOS batch files:
fb58af

fb58af
       sed "s/=username=/%USER%/g" somefile.txt
fb58af
       echo %FILENAME% | sed "s/\.TXT/.BAK/"
fb58af
       grep -Ei "%string%" somefile.txt | sed "s/^/  /"
fb58af

fb58af
   while from either the DOS prompt or from within a batch file,
fb58af

fb58af
       sed "s/%%/ percent/g" input.fil >output.fil
fb58af

fb58af
   will replace each percent symbol in a file with " percent" (adding
fb58af
   the leading space for readability).
fb58af

fb58af
4.31. How do I export or pass variables back into the environment?
fb58af

fb58af
4.31.1. - on Unix platforms
fb58af

fb58af
   Suppose that line #1, word #2 of the file 'terminals' contains a
fb58af
   value to be put in your TERM environment variable. Sed cannot
fb58af
   export variables directly to the shell, but it can pass strings to
fb58af
   shell commands. To set a variable in the Bourne shell:
fb58af

fb58af
       TERM=`sed 's/^[^ ][^ ]* \([^ ][^ ]*\).*/\1/;q' terminals`;
fb58af
       export TERM
fb58af

fb58af
   If the second word were "Wyse50", this would send the shell command
fb58af
   "TERM=Wyse50".
fb58af

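   The same extraction can be tried without disturbing the real TERM
   variable. In this sketch the file "terminals.sample", its
   contents, and the variable name term_type are all made up, and
   $( ... ) is used as the POSIX equivalent of the backquotes above.

```shell
# Throwaway sample file, invented for this demonstration.
tmpdir=$(mktemp -d)
printf '%s\n' 'console Wyse50 9600' 'modem vt100 2400' \
    >"$tmpdir/terminals.sample"

# Keep word #2 of line #1, then quit before reading line #2.
term_type=$(sed 's/^[^ ][^ ]* \([^ ][^ ]*\).*/\1/;q' "$tmpdir/terminals.sample")
export term_type
printf '%s\n' "$term_type"

rm -r "$tmpdir"
```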
fb58af
4.31.2. - on MS-DOS or 4DOS platforms
fb58af

fb58af
   Sed cannot directly manipulate the environment. Under DOS, only
fb58af
   batch files (.BAT) can do this, using the SET instruction, since
fb58af
   they are run directly by the command shell. Under 4DOS, special
fb58af
   4DOS commands (such as ESET) can also alter the environment.
fb58af

fb58af
   Under DOS or 4DOS, sed can select a word and pass it to the SET
fb58af
   command. Suppose you want the 1st word of the 2nd line of MY.DAT
fb58af
   put into an environment variable named %PHONE%. You might do this:
fb58af

fb58af
       @echo off
fb58af
       sed -n "2 s/^\([^ ][^ ]*\) .*/SET PHONE=\1/p;3q" MY.DAT > GO_.BAT
fb58af
       call GO_.BAT
fb58af
       echo The environment variable for PHONE is %PHONE%
fb58af
       :: cleanup
fb58af
       del GO_.BAT
fb58af

fb58af
   The sed script assumes that the first character on the 2nd line is
fb58af
   not a space and uses grouping \(...\) to save the first string of
fb58af
   non-space characters as \1 for the RHS. In writing any batch files,
fb58af
   make sure that output filenames such as GO_.BAT don't overwrite
fb58af
   preexisting files of the same name.
fb58af

fb58af
4.32. How do I handle Unix shell quoting in sed?
fb58af

fb58af
   To embed a literal single quote (') in a script, use (a) or (b):
fb58af

fb58af
   (a) If possible, put the script in double quotes:
fb58af

fb58af
     sed "s/cannot/can't/g" file
fb58af

fb58af
   (b) If the script must use single quotes, then close-single-quote
fb58af
   the script just before the SPECIAL single quote, prefix the single
fb58af
   quote with a backslash, and use a 2nd pair of single quotes to
fb58af
   finish marking the script. Thus:
fb58af

fb58af
     sed 's/cannot$/can'\''t/g' file
fb58af

fb58af
   Though this looks hard to read, it breaks down to 3 parts:
fb58af

fb58af
      's/cannot$/can'   \'   't/g'
fb58af
      ---------------   --   -----
fb58af

fb58af
   To embed a literal double quote (") in a script, use (a) or (b):
fb58af

fb58af
   (a) If possible, put the script in single quotes. You don't need to
fb58af
   prefix the double quotes with anything. Thus:
fb58af

fb58af
     sed 's/14"/fourteen inches/g' file
fb58af

fb58af
   (b) If the script must use double quotes, then prefix the SPECIAL
fb58af
   double quote with a backslash (\). Thus,
fb58af

fb58af
     sed "s/$length\"/$length inches/g" file
fb58af

fb58af
   To embed a literal backslash (\) into a script, enter it twice:
fb58af

fb58af
     sed 's/C:\\DOS/D:\\DOS/g' config.sys
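
   The three quoting rules can be checked from a Bourne-type shell
   with a few one-liners. This is only a sketch; the input strings
   and the variable names a, b and c are made up.

```shell
# (a) double-quoted script: the apostrophe needs no special treatment
a=$(printf '%s\n' 'I cannot go' | sed "s/cannot/can't/g")

# (b) single-quoted script: close the quotes, add \', reopen them
b=$(printf '%s\n' 'I cannot go' | sed 's/cannot/can'\''t/g')

# doubled backslashes inside single quotes reach sed unchanged,
# and sed then treats each pair as one literal backslash
c=$(printf '%s\n' 'C:\DOS' | sed 's/C:\\DOS/D:\\DOS/g')

printf '%s\n' "$a" "$b" "$c"
```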
fb58af

fb58af
   FILES, DIRECTORIES, AND PATHS
fb58af

fb58af
4.40. How do I read (insert/add) a file at the top of a textfile?
fb58af

fb58af
   Normally, adding a "header" file to the top of a "body" file is
fb58af
   done from the command prompt before passing the file on to sed.
fb58af
   (MS-DOS below version 6.0 must use COPY and DEL instead of MOVE in
fb58af
   the following example.)
fb58af

fb58af
       copy header.txt+body temp                  # MS-DOS command 1
fb58af
       echo Y | move temp body                    # MS-DOS command 2
fb58af
                                                    #
fb58af
       cat header.txt body >temp; mv temp body    # Unix commands
fb58af

fb58af
   However, if inserting the file must occur within sed, there is a
fb58af
   way. The sed command "1 r header.txt" will not work; it will print
fb58af
   line 1 and then insert "header.txt" between lines 1 and 2. The
fb58af
   following script solves this problem; however, there must be at
fb58af
   least 2 lines in the target file for the script to work properly.
fb58af

fb58af
     # sed script to insert "header.txt" above the first line
fb58af
     1{h; r header.txt
fb58af
       D; }
fb58af
     2{x; G; }
fb58af
     #---end of sed script---
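
   A sketch for trying the header script from the command line. The
   file names and contents are made up, and the script is split
   across -e options because the 'r' command treats the rest of its
   line as the filename.

```shell
# Throwaway directory holding a one-line header and a 3-line body.
tmpdir=$(mktemp -d)
printf '%s\n' 'HEADER' >"$tmpdir/header.txt"
printf '%s\n' 'body 1' 'body 2' 'body 3' >"$tmpdir/body"

# Line 1 is held back while header.txt is queued for output;
# line 2 brings line 1 back in front of itself.
result=$(cd "$tmpdir" &&
         sed -e '1{h; r header.txt' -e 'D; }' -e '2{x; G; }' body)
printf '%s\n' "$result"

rm -r "$tmpdir"
```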
fb58af

fb58af
4.41. How do I make substitutions in every file in a directory, or in
fb58af
      a complete directory tree?
fb58af

fb58af
4.41.1. - ssed and Perl solution
fb58af

fb58af
   The best solution for multiple files in a single directory is to
fb58af
   use ssed or gsed v4.0 or higher:
fb58af

fb58af
     sed -i.BAK 's|foo|bar|g' files       # -i does in-place replacement
fb58af

fb58af
   If you don't have ssed, there is a similar solution in Perl. (Yes,
fb58af
   we know this is a FAQ file for sed, not perl, but perl is more
fb58af
   common than ssed for many users.)
fb58af

fb58af
     perl -pi.bak -e 's|foo|bar|g' files                # or
fb58af
     perl -pi.bak -e 's|foo|bar|g' `find /pathname -name "filespec"`
fb58af

fb58af
   For each file in the filelist, sed (or Perl) renames the source
fb58af
   file to "filename.bak"; the modified file gets the original
fb58af
   filename. Remove '.bak' if you don't need backup copies. (Note the
fb58af
   use of "s|||" instead of "s///" here, and in the scripts below. The
fb58af
   vertical bars in the 's' command let you replace '/some/path' with
fb58af
   '/another/path', accommodating slashes in the LHS and RHS.)
fb58af

fb58af
   To recurse directories in Unix or GNU/Linux:
fb58af

fb58af
     # We use xargs to prevent passing too many filenames to sed, but
fb58af
     # this command will fail if filenames contain spaces or newlines.
fb58af
     find /my/path -name '*.htm' -print | xargs sed -i.BAK 's|foo|bar|g'
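
   One way around the whitespace limitation noted in the comment
   above is to let find run sed itself with "-exec ... {} +", which
   POSIX find provides and which batches filenames much as xargs
   does. A sketch, assuming a sed that supports -i (ssed or GNU sed
   4.x); the directory and file names are made up:

```shell
# Throwaway tree with spaces in both a directory and a file name.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/sub dir"
printf 'foo\n' >"$tmpdir/sub dir/my page.htm"

# find passes each name to sed as one argument, spaces and all.
find "$tmpdir" -type f -name '*.htm' -exec sed -i.BAK 's|foo|bar|g' {} +

result=$(cat "$tmpdir/sub dir/my page.htm")
backup=$(cat "$tmpdir/sub dir/my page.htm.BAK")
printf '%s\n' "$result" "$backup"

rm -r "$tmpdir"
```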
fb58af

fb58af
   To recurse directories under Windows 2000 (CMD.EXE or COMMAND.COM):
fb58af

fb58af
     # This syntax isn't supported under Windows 9x COMMAND.COM
fb58af
     for /R c:\my\path %f in (*.htm) do sed -i.BAK "s|foo|bar|g" %f
fb58af

fb58af
4.41.2. - Unix solution
fb58af

fb58af
   For all files in a single directory, assuming they end with *.txt
fb58af
   and you have no files named "[anything].txt.bak" already, use a
fb58af
   shell script:
fb58af

fb58af
     #! /bin/sh
fb58af
     # Source files are saved as "filename.txt.bak" in case of error
fb58af
     # The '&&' after cp is an additional safety feature
fb58af
     for file in *.txt
fb58af
     do
fb58af
        cp "$file" "$file.bak" &&
        sed 's|foo|bar|g' "$file.bak" >"$file"
fb58af
     done
fb58af

fb58af
   To do an entire directory tree, use the Unix utility find, like so
fb58af
   (thanks to Jim Dennis <jadestar@rahul.net> for this script):
fb58af

fb58af
     #! /bin/sh
fb58af
     # filename: replaceall
fb58af
     # Backup files are NOT saved in this script.
fb58af
     find . -type f -name '*.txt' -print | while IFS= read -r i
     do
        sed 's|foo|bar|g' "$i" > "$i.tmp" && mv "$i.tmp" "$i"
fb58af
     done
fb58af

fb58af
   This previous shell script recurses through the directory tree,
fb58af
   finding only files in the directory (not symbolic links, which will
fb58af
   be encountered by the shell command "for file in *.txt", above). To
fb58af
   preserve file permissions and make backup copies, use the 2-line cp
fb58af
   routine of the earlier script instead of "sed ... && mv ...". By
fb58af
   replacing the sed command 's|foo|bar|g' with something like
fb58af

fb58af
     sed "s|$1|$2|g" "${i}.bak" > "$i"
fb58af

fb58af
   using double quotes instead of single quotes, the user can also
fb58af
   employ positional parameters on the shell script command tail, thus
fb58af
   reusing the script from time to time. For example,
fb58af

fb58af
       replaceall East West
fb58af

fb58af
   would change "East" to "West" in every *.txt file in the current
   directory and in all directories beneath it.
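
   Putting these pieces together, a parameterized version might look
   like the sketch below. It is written as a shell function so it
   can be tried in one go. The optional third argument (a starting
   directory) and the sample file are additions made for this
   example, not part of the original script. Note that the two
   strings are still interpreted by sed, so a '|', '&' or backslash
   in them would need escaping.

```shell
# Hypothetical helper: replaceall OLD NEW [DIR]
replaceall() {
    old=$1 new=$2 dir=${3:-.}
    find "$dir" -type f -name '*.txt' -print | while IFS= read -r i
    do
        # Keep a backup; run sed only if the copy succeeded.
        cp "$i" "$i.bak" &&
        sed "s|$old|$new|g" "$i.bak" >"$i"
    done
}

# Try it on a throwaway tree (made-up names and contents).
tmpdir=$(mktemp -d)
mkdir "$tmpdir/sub"
printf 'East is East\n' >"$tmpdir/sub/a file.txt"
replaceall East West "$tmpdir"
result=$(cat "$tmpdir/sub/a file.txt")
backup=$(cat "$tmpdir/sub/a file.txt.bak")
printf '%s\n' "$result" "$backup"
rm -r "$tmpdir"
```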
fb58af

fb58af
4.41.3. - DOS solution:
fb58af

fb58af
   MS-DOS users should use two batch files like this:
fb58af

fb58af
      @echo off
fb58af
      :: MS-DOS filename: REPLACE.BAT
fb58af
      ::
fb58af
      :: Create a destination directory to put the new files.
fb58af
      :: Note: The next command will fail under Novell NetWare
fb58af
      :: below version 4.10 unless "SHOW DOTS=ON" is active.
fb58af
      if not exist .\NEWFILES\NUL mkdir NEWFILES
fb58af
      for %%f in (*.txt) do CALL REPL_2.BAT %%f
fb58af
      echo Done!!
fb58af
      :: ---End of first batch file---
fb58af

fb58af
      @echo off
fb58af
      :: MS-DOS filename: REPL_2.BAT
fb58af
      ::
fb58af
      sed "s/foo/bar/g" %1 > NEWFILES\%1
fb58af
      :: ---End of the second batch file---
fb58af

fb58af
   When finished, the current directory contains all the original
fb58af
   files, and the newly-created NEWFILES subdirectory contains the
fb58af
   modified *.TXT files. Do not attempt a command like
fb58af

fb58af
       for %%f in (*.txt) do sed "s/foo/bar/g" %%f >NEWFILES\%%f
fb58af

fb58af
   under any version of MS-DOS because the output filename will be
fb58af
   created as a literal '%f' in the NEWFILES directory before the
fb58af
   %%f is expanded to become each filename in (*.txt). This occurs
fb58af
   because MS-DOS creates output filenames via redirection commands
fb58af
   before it expands "for..in..do" variables.
fb58af

fb58af
   To recurse through an entire directory tree in MS-DOS requires a
fb58af
   batch file more complex than we have room to describe. Examine the
fb58af
   file SWEEP.BAT in Timo Salmi's great archive of batch tricks,
fb58af
   located at <ftp://garbo.uwasa.fi/pc/link/tsbat.zip> (this file is
fb58af
   regularly updated). Another alternative is to get an external
fb58af
   program designed for directory recursion. Here are some recommended
fb58af
   programs for directory recursion. The first one, FORALL, runs under
fb58af
   either OS/2 or DOS. Unfortunately, none of these supports Win9x
fb58af
   long filenames.
fb58af

fb58af
       http://hobbes.nmsu.edu/pub/os2/util/disk/forall72.zip
fb58af
       ftp://garbo.uwasa.fi/pc/filefind/target15.zip
fb58af

fb58af
4.42. How do I replace "/some/UNIX/path" in a substitution?
fb58af

fb58af
   Technically, the normal meaning of the slash can be disabled by
fb58af
   prefixing it with a backslash. Thus,
fb58af

fb58af
     sed 's/\/some\/UNIX\/path/\/a\/new\/path/g' files
fb58af

fb58af
   But this is hard to read and write. There is a better solution.
fb58af
   The s/// substitution command allows '/' to be replaced by any
fb58af
   other character (including spaces or alphanumerics). Thus,
fb58af

fb58af
     sed 's|/some/UNIX/path|/a/new/path|g' files
fb58af

fb58af
   and if you are using variable names in a Unix shell script,
fb58af

fb58af
     sed "s|$OLDPATH|$NEWPATH|g" oldfile >newfile
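
   A short sketch of the shell-variable form at work. The two path
   values and the sample line are made up; note that the values are
   still read by sed as a regex and a replacement, so characters
   such as '.', '&' or '|' in them would need escaping.

```shell
OLDPATH='/usr/local/bin'      # hypothetical values
NEWPATH='/opt/bin'

# The slashes inside the variables are harmless because '|' is
# the delimiter of the s command.
result=$(printf '%s\n' 'PATH=/usr/local/bin:/bin' |
         sed "s|$OLDPATH|$NEWPATH|g")
printf '%s\n' "$result"
```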
fb58af

fb58af
4.43. How do I replace "C:\SOME\DOS\PATH" in a substitution?
fb58af

fb58af
   For MS-DOS users, every backslash must be doubled. Thus, to replace
fb58af
   "C:\SOME\DOS\PATH" with "D:\MY\NEW\PATH":
fb58af

fb58af
     sed "s|C:\\SOME\\DOS\\PATH|D:\\MY\\NEW\\PATH|g" infile >outfile
fb58af

fb58af
   Remember that DOS pathnames are not case sensitive and can appear
fb58af
   in upper or lower case in the input file. If this concerns you, use
fb58af
   a version of sed which can ignore case when matching (gsed, ssed,
fb58af
   sedmod, sed16).
fb58af

fb58af
       @echo off
fb58af
       :: sample MS-DOS batch file to alter path statements
fb58af
       :: requires GNU sed with the /i flag for s///
fb58af
       set old=C:\\SOME\\DOS\\PATH
fb58af
       set new=D:\\MY\\NEW\\PATH
fb58af
       gsed "s|%old%|%new%|gi" infile >outfile
fb58af
       :: or
fb58af
       ::     sedmod -i "s|%old%|%new%|g" infile >outfile
fb58af
       set old=
fb58af
       set new=
fb58af

fb58af
   Also, remember that under Windows long filenames may be stored in
fb58af
   two formats: e.g., as "C:\Program Files" or as "C:\PROGRA~1".
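
   For comparison, the same substitution can be tried under a Unix
   shell. This is only a sketch: the sample input line is invented,
   and single quotes are assumed so that the shell leaves the
   backslashes alone.

```shell
# Single quotes keep the shell from touching the backslashes, so sed
# sees each '\\' and treats it as one literal backslash.
result=$(printf '%s\n' 'C:\SOME\DOS\PATH\file.txt' |
  sed 's|C:\\SOME\\DOS\\PATH|D:\\MY\\NEW\\PATH|g')
printf '%s\n' "$result"
```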
fb58af

fb58af
4.44.  How do I emulate file-includes, using sed?
fb58af

fb58af
   Given an input file with file-include statements, similar to
fb58af
   C-style includes or "server-side includes" (SSI) of this format:
fb58af

fb58af
       This is the source file. It's short.
fb58af
       Its name is simply 'source'. See the script below.
fb58af
       
fb58af
              And this is any amount of text between
fb58af
       
fb58af
       This is the last line of the file.
fb58af

fb58af
   How do we direct sed to import/insert whichever files are at the
fb58af
   point of the 'file="filename"' token? First, use this file:
fb58af

fb58af
     #n
fb58af
     # filename: incl.sed
fb58af
     # Comments supported by GNU sed or ssed. Leading '#n' should
fb58af
     # be on line 1, columns 1-2 of the line.
fb58af
     /
fb58af
       =;                     #   print the line number
fb58af
       s/^[^"]*"/{r /;        #   change pattern to '{r '
fb58af
       s/".*//p;              #   delete rest to EOL, print
fb58af
                              #   and a(ppend) a delete command
fb58af
       a\
fb58af
       d;}
fb58af
     }
fb58af
     #---end of sed script---
fb58af

fb58af
   Second, use the following shell script or DOS batch file (if
fb58af
   running a DOS batch file, use "double quotes" instead of 'single
fb58af
   quotes', and use "del" instead of "rm" to remove the temp file):
fb58af

fb58af
     sed -nf incl.sed source | sed 'N;N;s/\n//' >temp.sed
fb58af
     sed -f temp.sed source >target
fb58af
     rm temp.sed
fb58af

fb58af
   If you have GNU sed or ssed, you can reduce the script even further
fb58af
   (thanks to Michael Carmack for the reminder):
fb58af

fb58af
     sed -nf incl.sed source | sed 'N;N;s/\n//' | sed -f - source >target
fb58af

fb58af
   In brief, the script replaces each filename with a 'r filename'
fb58af
   command to insert the file at that point, while omitting the
fb58af
   extraneous material. Two important things to note with this script:
fb58af
   (1) There should be only one '#include file' directive per line, and
fb58af
   (2) each '#include file' directive must be the *only* thing on that
fb58af
   line, because everything else on the line will be deleted.
fb58af

fb58af
   Though the script uses GNU sed or ssed because of the great support
fb58af
   for embedded script comments, it should run on any version of sed.
fb58af
   If not, write me and let me know.
fb58af

fb58af
------------------------------
fb58af

fb58af
5. WHY ISN'T THIS WORKING?
fb58af

fb58af
5.1. Why don't my variables like $var get expanded in my sed script?
fb58af

fb58af
   Because your sed script uses 'single quotes' instead of "double
fb58af
   quotes." Unix shells never expand $variables in single quotes.
fb58af

fb58af
   This is probably the most frequently-asked sed question. For more
fb58af
   info on using variables, see section 4.30.
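
   The difference is easy to see side by side. A minimal sketch for a
   POSIX shell; the variable name and the sample text are made up for
   illustration.

```shell
var=world
# single quotes: the shell passes $var to sed untouched
single=$(echo 'hello NAME' | sed 's/NAME/$var/')
# double quotes: the shell expands $var before sed runs
double=$(echo 'hello NAME' | sed "s/NAME/$var/")
printf '%s\n' "$single" "$double"
```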
fb58af

fb58af
5.2. I'm using 'p' to print, but I have duplicate lines sometimes.
fb58af

fb58af
   Sed prints the entire file by default, so the 'p' command might
fb58af
   cause the duplicate lines. If you want the whole file printed,
fb58af
   try removing the 'p' from commands like 's/foo/bar/p'. If you want
fb58af
   part of the file printed, run your sed script with the -n flag to
   suppress normal output, and rewrite the script to get all output
   from the 'p' command.
fb58af

fb58af
   If you're still getting duplicate lines, you are probably finding
fb58af
   several matches for the same line. Suppose you want to print lines
fb58af
   with the words "Peter" or "James" or "John", but not the same line
fb58af
   twice. The following command will fail:
fb58af

fb58af
     sed -n '/Peter/p; /James/p; /John/p' files
fb58af

fb58af
   Since all 3 commands of the script are executed for each line,
fb58af
   you'll get extra lines. A better way is to use the 'd' (delete) or
fb58af
   'b' (branch) commands, like so (with GNU sed):
fb58af

fb58af
     sed '/Peter/b; /James/b; /John/b; d' files          # one way
fb58af
     sed -n '/Peter/{p;d;};/James/{p;d;};/John/p' files  # a 2nd way
fb58af
     sed -n '/Peter/{p;b;};/James/{p;b;};/John/p' files  # a 3rd way
fb58af
     sed '/Peter\|James\|John/!d' files                  # shortest way
fb58af

fb58af
   On standard seds, these must be broken down with -e commands:
fb58af

fb58af
     sed -e '/Peter/b' -e '/James/b' -e '/John/b' -e d files
fb58af
     sed -n -e '/Peter/{p;d;}' -e '/James/{p;d;}' -e '/John/p' files
fb58af

fb58af
   The 3rd line would require too many -e commands to fit on one line,
fb58af
   since standard versions of sed require an -e command after each 'b'
fb58af
   and also after each closing brace '}'.
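
   To see the duplicate and the fix on a small sample, here is a
   sketch using only the portable -e forms shown above. The three
   input lines are invented for the test.

```shell
input='Peter and James
John
Andrew'
# Three independent 'p' commands: a line matching two patterns prints twice.
dup=$(printf '%s\n' "$input" | sed -n -e '/Peter/p' -e '/James/p' -e '/John/p')
# Branch to the end of the script on the first match; delete everything else.
once=$(printf '%s\n' "$input" | sed -e '/Peter/b' -e '/James/b' -e '/John/b' -e d)
printf '%s\n' "$dup" '----' "$once"
```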
fb58af

fb58af
5.3. Why does my DOS version of sed process a file part-way through
fb58af
     and then quit?
fb58af

fb58af
   First, look for errors in the script. Have you used the -n switch
fb58af
   without telling sed to print anything to the console? Have you read
fb58af
   the docs to your version of sed to see if it has a syntax you may
fb58af
   have misused? (Look for an N or H command that gathers too much.)
fb58af

fb58af
   Next, if you are sure your sed script is valid, a probable cause is
fb58af
   an end-of-file marker embedded in the file. An EOF marker (SUB) is
fb58af
   a Control-Z character, with the value of 1A hex (26 decimal). As
fb58af
   soon as any DOS version of sed encounters a Ctrl-Z character, sed
fb58af
   stops processing.
fb58af

fb58af
   To locate the EOF character, use Vern Buerg's shareware file viewer
fb58af
   LIST.COM <http://www.buerg.com/list.html>. In text mode, look for a
fb58af
   right-arrow symbol; in hex mode (Alt-H), look for a 1A code. With
fb58af
   Unix utilities ported to DOS, use 'od' (octal dump) to display
fb58af
   hexcodes in your file, and then use sed to locate the offending
fb58af
   character:
fb58af

fb58af
       od -txC badfile.txt | sed -n "/ 1a /p; / 1a$/p"
fb58af

fb58af
   Then edit the input file to remove the offending character(s).
fb58af

fb58af
   If you would rather NOT edit the input file, there is still a fix.
fb58af
   It requires the DJGPP 32-bit port of 'tr', the Unix translate
fb58af
   program (v1.22 or higher). GNU od and tr are currently at v2.0 (for
fb58af
   DOS); they are packaged with the GNU text utilities, available at
fb58af

fb58af
       ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/txt20b.zip
fb58af
       http://www.simtel.net/gnudlpage.php?product=/gnu/djgpp/v2gnu/txt20b.zip&name=txt20b.zip
fb58af

fb58af
   It is important to get the DJGPP version of 'tr' because other
fb58af
   versions ported to DOS will stop processing when they encounter the
fb58af
   EOF character. Use the -d (delete) command:
fb58af

fb58af
       tr -d \32 < badfile.txt | sed -f myscript.sed
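
   On a Unix system the same two steps (locate the character, then
   strip it) can be rehearsed as follows. This is a sketch, not a DOS
   recipe: it assumes a POSIX shell, where the octal escape must be
   quoted, and it builds its own sample input with printf.

```shell
# Build a one-line sample with an embedded Ctrl-Z (octal 032, hex 1a)
# and count the dump lines that show the 1a byte.
found=$(printf 'one\032two\n' | od -A n -t x1 | grep -c '1a')
# Delete every Ctrl-Z before the text reaches sed.
clean=$(printf 'one\032two\n' | tr -d '\032' | sed 's/one/1/')
printf '%s\n' "$found" "$clean"
```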
fb58af

fb58af
5.4. My RE isn't matching/deleting what I want it to. (Or, "Greedy vs.
fb58af
     stingy pattern matching")
fb58af

fb58af
   The two most common causes for this problem are: (1) misusing the
fb58af
   '.' metacharacter, and (2) misusing the '*' metacharacter. The RE
fb58af
   '.*' is designed to be "greedy" (i.e., matching as many characters
fb58af
   as possible). However, sometimes users need an expression which is
fb58af
   "stingy," matching the shortest possible string.
fb58af

fb58af
   (1) On single-line patterns, the '.' metacharacter matches any
fb58af
   single character on the line. ('.' cannot match the newline at the
fb58af
   end of the line because the newline is removed when the line is put
fb58af
   into the pattern space; sed adds a newline automatically when the
fb58af
   pattern space is printed.) On multi-line patterns obtained with the
fb58af
   'N' or 'G' commands, '.' _will_ match a newline in the middle of the
fb58af
   pattern space. If there are 3 lines in the pattern space, "s/.*//"
fb58af
   will delete all 3 lines, not just the first one (leaving 1 blank
fb58af
   line, since the trailing newline is added to the output).
fb58af

fb58af
   Normal misuse of '.' occurs in trying to match a word or bounded
fb58af
   field, and forgetting that '.' will also cross the field limits.
fb58af
   Suppose you want to delete the first word in braces:
fb58af

fb58af
       echo {one} {two} {three} | sed 's/{.*}/{}/'       # fails
fb58af
       echo {one} {two} {three} | sed 's/{[^}]*}/{}/'    # succeeds
fb58af

fb58af
   's/{.*}/{}/' is not the solution, since the regex '.' will match
fb58af
   any character, including the close braces. Replace the '.' with
fb58af
   '[^}]', which signifies a negated character set '[^...]' containing
fb58af
   anything other than a right brace. FWIW, we know that 's/{one}/{}/'
fb58af
   would also solve our question, but we're trying to illustrate the
fb58af
   use of the negated character set: [^anything-but-this].
fb58af

fb58af
   A negated character set should be used for matching words between
fb58af
   quote marks, for fields separated by commas, and so on. See also
fb58af
   section 4.12 ("How do I parse a comma-delimited data file?").
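
   The same idea applied to quoted words and comma-separated fields,
   as a small sketch with invented sample lines:

```shell
line='a "one" b "two" c'
# '.*' runs from the first quote to the LAST quote on the line
greedy=$(printf '%s\n' "$line" | sed 's/".*"/""/')
# '[^"]*' stops at the next quote, so only the first word is emptied
stingy=$(printf '%s\n' "$line" | sed 's/"[^"]*"/""/')
# drop the first comma-delimited field, and only that field
rest=$(echo 'aa,bb,cc' | sed 's/^[^,]*,//')
printf '%s\n' "$greedy" "$stingy" "$rest"
```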
fb58af

fb58af
   (2) The '*' metacharacter represents zero or more instances of the
fb58af
   previous expression. The '*' metacharacter looks for the leftmost
fb58af
   possible match first and will match zero characters. Thus,
fb58af

fb58af
       echo foo | sed 's/o*/EEE/'
fb58af

fb58af
   will generate 'EEEfoo', not 'fEEE' as one might expect. This is
fb58af
   because /o*/ matches the null string at the beginning of the word.
fb58af

fb58af
   After finding the leftmost possible match, the '*' is GREEDY; it
fb58af
   always tries to match the longest possible string. When two or
fb58af
   three instances of '.*' occur in the same RE, the leftmost instance
fb58af
   will grab the most characters. Consider this example, which uses
fb58af
   grouping '\(...\)' to save patterns:
fb58af

fb58af
       echo bar bat bay bet bit | sed 's/^.*\(b.*\)/\1/'
fb58af

fb58af
   What will be displayed is 'bit', never anything longer, because the
fb58af
   leftmost '.*' took the longest possible match. Remember this rule:
fb58af
   "leftmost match, longest possible string, zero also matches."
fb58af
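
   The rule can be verified with the two examples above, plus two
   possible workarounds. The workarounds are only one way of doing
   it: 'oo*' demands at least one 'o', and '[^ ]*' keeps the match
   inside the first word.

```shell
first=$(echo foo | sed 's/o*/EEE/')       # null match at the start: EEEfoo
fixed=$(echo foo | sed 's/oo*/EEE/')      # needs at least one 'o': fEEE
last=$(echo 'bar bat bay bet bit' | sed 's/^.*\(b.*\)/\1/')
word=$(echo 'bar bat bay bet bit' | sed 's/^\(b[^ ]*\).*/\1/')
printf '%s\n' "$first" "$fixed" "$last" "$word"
```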

fb58af
5.5. What is CSDPMI*B.ZIP and why do I need it?
fb58af

fb58af
   If you use MS-DOS outside of Windows and try to use GNU sed v1.18
fb58af
   or 3.02, you may encounter the following error message:
fb58af

fb58af
       no DPMI - Get csdpmi*b.zip
fb58af

fb58af
   "DPMI" stands for DOS Protected Mode Interface; it's basically a
fb58af
   means of running DOS in Protected Mode (as opposed to Real Mode),
fb58af
   which allows programs to share resources in extended memory without
fb58af
   conflicting with one another. Running HIMEM.SYS and EMM386.EXE is
fb58af
   not enough. The "CSDPMI*B.ZIP" refers to files written by Charles
fb58af
   Sandmann to provide DPMI services for 32-bit computers (i.e.,
fb58af
   386SX, 386DX, 486SX, etc.). Download the binary file (the source
fb58af
   code is also available):
fb58af

fb58af
       http://www.delorie.com/djgpp/dl/ofc/simtel/v2misc/csdpmi5b.zip  # binaries
fb58af
       http://www.delorie.com/djgpp/dl/ofc/simtel/v2misc/csdpmi5s.zip  # source
fb58af
       ftp://ftp.cdrom.com/pub/simtelnet/gnu/djgpp/v2misc/csdpmi5b.zip # binaries
fb58af
       ftp://ftp.cdrom.com/pub/simtelnet/gnu/djgpp/v2misc/csdpmi5s.zip # source
fb58af

fb58af
   and extract CWSDPMI.EXE, CWSDPR0.EXE and CWSPARAM.EXE from the ZIP
fb58af
   file. Put all 3 CWS*.EXE files in the same directory as GSED.EXE
fb58af
   and you're all set. There are DOC files enclosed, but they're
fb58af
   nearly incomprehensible for the average computer user. (Another
fb58af
   case of user-vicious documentation.)
fb58af

fb58af
   If you're running Windows and you normally use a DOS session to run
fb58af
   GNU sed (i.e., you get to a DOS prompt with a resizable window or
fb58af
   you press Alt-Enter to switch to full-screen mode), you don't need
fb58af
   the CWS*.EXE files at all, since Windows uses DPMI already.
fb58af

fb58af
5.6. Where are the man pages for GNU sed?
fb58af

fb58af
   Prior to GNU sed v3.02, there weren't any. Until recently, man
fb58af
   pages distributed with gsed were borrowed from old sources or from
fb58af
   other compilations. None of them were "official." GNU sed v3.02 had
fb58af
   the first real set of official man pages, and the documentation has
fb58af
   greatly improved with GNU sed version 4.0, which now includes both
   man pages and texinfo pages.
fb58af

fb58af
5.7. How do I tell what version of sed I am using?
fb58af

fb58af
   Try entering "sed" all by itself on the command line, followed by
fb58af
   no arguments or parameters.  Also, try "sed --version".  In a
fb58af
   pinch, you can also try this:
fb58af

fb58af
       strings sed | grep -i ver
fb58af

fb58af
   Your version of 'strings' must be a version of the Unix utility of
fb58af
   this name. It should not be the DOS utility STRINGS.COM by Douglas
fb58af
   Boling.
fb58af

fb58af
5.8. Does sed issue an exit code?
fb58af

fb58af
   Most versions of sed do not, but check the documentation that came
fb58af
   with whichever version you are using. GNU sed issues an exit code
fb58af
   of 0 if the program terminated normally, 1 if there were errors in
fb58af
   the script, and 2 if there were errors during script execution.
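
   A shell script can test the exit code like this. The sketch
   assumes a sed that reports script errors through its exit status,
   such as GNU sed; the exact nonzero value is not relied upon.

```shell
# A valid script: an exit status of 0 is expected.
if echo hello | sed 's/hello/bye/' >/dev/null 2>&1; then ok=0; else ok=$?; fi
# A deliberately broken script (unterminated s command).
if echo hello | sed 's/hello' >/dev/null 2>&1; then bad=0; else bad=$?; fi
echo "valid script: $ok, broken script: $bad"
```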
fb58af

fb58af
5.9. The 'r' command isn't inserting the file into the text.
fb58af

fb58af
   On most versions of sed (but not all), the 'r' (read) and 'w'
fb58af
   (write) commands must be followed by exactly one space, then the
fb58af
   filename, and then terminated by a newline. Any additional
fb58af
   characters before or after the filename are interpreted as *part*
fb58af
   of the filename. Thus
fb58af

fb58af
       /RE/r  insert.me
fb58af

fb58af
   would try to locate a file called ' insert.me' (note the
   leading space!). If the file is not found, most versions of sed
   say nothing, not even an error message.
fb58af

fb58af
   When sed scripts are used on the command line, every 'r' and 'w'
fb58af
   must be the last command in that part of the script. Thus,
fb58af

fb58af
       sed -e '/regex/{r insert.file;d;}' source         # will fail
fb58af
       sed -e '/regex/{r insert.file' -e 'd;}' source    # will succeed
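
   Here is a self-contained version of the working form, as a sketch.
   The file names, the MARK line, and the use of mktemp for a scratch
   directory are assumptions made for the demonstration.

```shell
tmp=$(mktemp -d)
printf 'top\nMARK\nbottom\n' > "$tmp/source"
printf 'inserted\n' > "$tmp/insert.file"
# The 'r' filename runs to the end of its -e expression, so the
# rest of the block goes into a second -e.
result=$(cd "$tmp" && sed -e '/MARK/{r insert.file' -e 'd;}' source)
rm -rf "$tmp"
printf '%s\n' "$result"
```

   The MARK line is deleted and the contents of insert.file appear
   in its place.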
fb58af

fb58af
5.10. Why can't I match or delete a newline using the \n escape sequence?
fb58af
      Why can't I match 2 or more lines using \n?
fb58af

fb58af
   The \n will never match the newline at the end-of-line because the
fb58af
   newline is always stripped off before the line is placed into the
fb58af
   pattern space. To get 2 or more lines into the pattern space, use
fb58af
   the 'N' command or something similar (such as 'H;...;g;').
fb58af

fb58af
   Sed works like this: sed reads one line at a time, chops off the
fb58af
   terminating newline, puts what is left into the pattern space where
fb58af
   the sed script can address or change it, and when the pattern space
fb58af
   is printed, appends a newline to stdout (or to a file). If the
fb58af
   pattern space is entirely or partially deleted with 'd' or 'D', the
fb58af
   newline is *not* added in such cases. Thus, scripts like
fb58af

fb58af
       sed 's/\n//' file       # to delete newlines from each line
fb58af
       sed 's/\n/foo\n/' file  # to add a word to the end of each line
fb58af

fb58af
   will _never_ work, because the trailing newline is removed _before_
fb58af
   the line is put into the pattern space. To perform the above tasks,
fb58af
   use one of these scripts instead:
fb58af

fb58af
       tr -d '\n' < file              # use tr to delete newlines
fb58af
       sed ':a;N;$!ba;s/\n//g' file   # GNU sed to delete newlines
fb58af
       sed 's/$/ foo/' file           # add "foo" to end of each line
fb58af

fb58af
   Since versions of sed other than GNU sed have limits to the size of
fb58af
   the pattern buffer, the Unix 'tr' utility is to be preferred here.
fb58af
   If the last line of the file contains a newline, GNU sed will add
fb58af
   that newline to the output but delete all others, whereas tr will
fb58af
   delete all newlines.
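
   A short sketch that shows the failing script next to two of the
   working ones, using a two-line sample generated by printf:

```shell
# s/\n// finds nothing to match: the input comes back unchanged
same=$(printf 'a\nb\n' | sed 's/\n//')
# tr sees the raw byte stream, newlines included
joined=$(printf 'a\nb\n' | tr -d '\n')
# '$' anchors at the end of the pattern space; no \n is needed
tagged=$(printf 'a\nb\n' | sed 's/$/ foo/')
printf '%s\n' "$same" "$joined" "$tagged"
```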
fb58af

fb58af
   To match a block of two or more lines, there are 3 basic choices:
fb58af
   (1) use the 'N' command to add the Next line to the pattern space;
fb58af
   (2) use the 'H' command at least twice to append the current line
fb58af
   to the Hold space, and then retrieve the lines from the hold space
fb58af
   with x, g, or G; or (3) use address ranges (see section 3.3, above)
fb58af
   to match lines between two specified addresses.
fb58af

fb58af
   Choices (1) and (2) will put an \n into the pattern space, where it
fb58af
   can be addressed as desired ('s/ABC\nXYZ/alphabet/g'). One example
fb58af
   of using 'N' to delete a block of lines appears in section 4.13
fb58af
   ("How do I delete a block of _specific_ consecutive lines?"). This
fb58af
   example can be modified by changing the delete command to something
fb58af
   else, like 'p' (print), 'i' (insert), 'c' (change), 'a' (append),
fb58af
   or 's' (substitute).
fb58af

fb58af
   Choice (3) will not put an \n into the pattern space, but it _does_
fb58af
   match a block of consecutive lines, so it may be that you don't
fb58af
   even need the \n to find what you're looking for. Since several
fb58af
   versions of sed support this syntax:
fb58af

fb58af
       sed '/start/,+4d'  # to delete "start" plus the next 4 lines,
fb58af

fb58af
   in addition to the traditional '/from here/,/to there/{...}' range
fb58af
   addresses, it may be possible to avoid the use of \n entirely.
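
   Choices (1) and (3) can be compared on small samples. This is a
   sketch with invented input; the '$!N' guard is one common way to
   avoid running 'N' on the last line.

```shell
# (1) 'N' joins the next line, so \n can appear in the regex
joined=$(printf 'ABC\nXYZ\nend\n' | sed -e '$!N' -e 's/ABC\nXYZ/alphabet/')
# (3) a range address handles the block line by line, without \n
ranged=$(printf 'x\nfrom here\nmiddle\nto there\ny\n' |
  sed '/from here/,/to there/d')
printf '%s\n' "$joined" '----' "$ranged"
```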
fb58af

fb58af
5.11. My script aborts with an error message, "event not found".
fb58af

fb58af
   This error is generated by the csh or tcsh shells, not by sed. The
fb58af
   exclamation mark (!) is special to csh/tcsh, and if you use it in
fb58af
   command-line or shell scripts--even within single quotes--it must
fb58af
   be preceded by a backslash. Thus, under the csh/tcsh shell:
fb58af

fb58af
       sed '/regex/!d'      # will fail
fb58af
       sed '/regex/\!d'     # will succeed
fb58af

fb58af
   The exclamation mark should not be prefixed with a backslash when
fb58af
   the script is called from a file, as "-f script.file".
fb58af

fb58af
------------------------------
fb58af

fb58af
6. OTHER ISSUES
fb58af

fb58af
6.1. I have a certain problem that stumps me. Where can I get help?
fb58af

fb58af
   Post your question on the "sed-users" mailing list (section 2.3.2),
fb58af
   where many sed users will be able to see your question. You will have
fb58af
   to subscribe to have posting privileges.
fb58af

fb58af
   Your other alternative is one of these newsgroups:
fb58af

fb58af
      - alt.comp.editors.batch
fb58af
      - comp.editors
fb58af
      - comp.unix.questions
fb58af
      - comp.unix.shell
fb58af

fb58af
6.2. How does sed compare with awk, perl, and other utilities?
fb58af

fb58af
   Awk is a much richer language with many features of a programming
fb58af
   language, including variable names, math functions, arrays, system
fb58af
   calls, etc. Its command structure is similar to sed:
fb58af

fb58af
      address { command(s) }
fb58af

fb58af
   which means that for each line or range of lines that matches the
fb58af
   address, execute the command(s). In both sed and awk, an address
fb58af
   can be a line number or a RE somewhere on the line, or both.
fb58af

fb58af
   In program size, awk is 3-10 times larger than sed. Awk has most of
fb58af
   the functions of sed, but not all. Notably, sed supports
fb58af
   backreferences (\1, \2, ...) to previous expressions, and awk does
fb58af
   not have any comparable syntax. (One exception: GNU awk v3.0
fb58af
   introduced gensub(), which supports backreferences only on
fb58af
   substitutions.)
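
   As an illustration, here is a word swap done with sed
   backreferences, next to an awk version that gets the same result
   from field splitting instead. Both are sketches on an invented
   sample line.

```shell
# sed: swap two words with backreferences \1 and \2
swapped=$(echo 'hello world' | sed 's/\(hello\) \(world\)/\2 \1/')
# awk: no backreferences needed, the fields are already split
fields=$(echo 'hello world' | awk '{ print $2, $1 }')
printf '%s\n' "$swapped" "$fields"
```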
fb58af

fb58af
   Perl is a general-purpose programming language, with many features
fb58af
   beyond text processing and interprocess communication, taking it
fb58af
   well past awk or other scripting languages. Perl supports every
fb58af
   feature sed does and has its own set of extended regular
fb58af
   expressions, which give it extensive power in pattern matching and
fb58af
   processing. (Note: the standard perl distribution comes with 's2p',
fb58af
   a sed-to-perl conversion script. See section 3.6 for more info.)
fb58af
   Like sed and awk, perl scripts do not need to be compiled into
fb58af
   binary code. Like sed, perl can also run many useful "one-liners"
fb58af
   from the command line, though with greater flexibility; see
fb58af
   question 4.41 ("How do I make substitutions in every file in a
fb58af
   directory, or in a complete directory tree?").
fb58af

fb58af
   On the other hand, the current version of perl is from 8 to 35
fb58af
   times larger than sed in its executables alone (perl's library
fb58af
   modules and allied files not included!). Further, for most simple
fb58af
   tasks such as substitution, sed executes more quickly than either
fb58af
   perl or awk. All these utilities serve to process input text,
fb58af
   transforming it to meet our needs . . . or our arbitrary whims.
fb58af

fb58af
6.3. When should I use sed?
fb58af

fb58af
   When you need a small, fast program to modify words, lines, or
fb58af
   blocks of lines in a textfile.
fb58af

fb58af
6.4. When should I NOT use sed?
fb58af

fb58af
   You should not use sed when you have "dedicated" tools which can do
fb58af
   the job faster or with an easier syntax. Do not use sed when you
fb58af
   only want to:
fb58af

fb58af
   - print individual lines, based on patterns within the line itself.
fb58af
     Instead, use "grep".
fb58af

fb58af
   - print blocks of lines, with 1 or more lines of context above or
fb58af
     below a specific regular expression. Instead, use the GNU version
fb58af
     of grep as follows:
fb58af

fb58af
        grep -A{number} -B{number} "regex"
fb58af

fb58af
   - remove individual lines, based on patterns within the line
fb58af
     itself. Instead, use "grep -v".
fb58af

fb58af
   - print line numbers.  Instead, use "nl" or "cat -n".
fb58af

fb58af
   - reformat lines or paragraphs. Instead, use "fold", "fmt" or "par".
fb58af

fb58af
   The tr utility is also more suited than sed to some simple tasks. For
fb58af
   example, to:
fb58af

fb58af
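   - delete individual characters. Instead of "s/[a-d]//g", use

        tr -d "[a-d]"

     (System V syntax. POSIX and GNU versions of tr treat the brackets
     as literal characters and delete them as well, so with those
     versions write the range as "a-d".)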
fb58af

fb58af
   - squeeze sequential characters. Instead of "s/ee*/e/g", use
fb58af

fb58af
        tr -s "{character-set}"
fb58af

fb58af
   - change individual characters. Instead of "y/abcdef/ABCDEF/", use
fb58af

fb58af
        tr "[a-f]" "[A-F]"
fb58af

fb58af
   Note, however, that tr does not support giving input files on the
fb58af
   command line, so the syntax is:
fb58af

fb58af
     tr {options-and-patterns} < input-file
fb58af

fb58af
   or, to process multiple files:
fb58af

fb58af
     cat input-file1 input-file2 | tr {options-and-patterns}
fb58af

fb58af
   If you have multiple files, using tr instead of sed is often more of
fb58af
   an exercise than a useful thing. Although sed can perfectly emulate
fb58af
   certain functions of cat, grep, nl, rev, sort, tac, tail, tr, uniq,
fb58af
   and other utilities, producing identical output, the native utilities
fb58af
   are usually optimized to do the job more quickly than sed.
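
   The sed and tr forms can be checked against each other. This
   sketch assumes a POSIX or GNU tr, where ranges are written without
   brackets; the sample words are invented.

```shell
# delete characters
sed_del=$(echo 'abcdefg' | sed 's/[a-d]//g')
tr_del=$(echo 'abcdefg' | tr -d 'a-d')
# squeeze repeated characters
sed_sq=$(echo 'beeeet' | sed 's/ee*/e/g')
tr_sq=$(echo 'beeeet' | tr -s 'e')
# change characters
sed_up=$(echo 'abcdef' | sed 'y/abcdef/ABCDEF/')
tr_up=$(echo 'abcdef' | tr 'a-f' 'A-F')
printf '%s\n' "$tr_del" "$tr_sq" "$tr_up"
```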
fb58af

fb58af
6.5. When should I ignore sed and use awk or Perl instead?
fb58af

fb58af
   If you can write the same script in awk or Perl and do it in less
fb58af
   time, then use Perl or awk. There's no reason to spend an hour
fb58af
   writing and debugging a sed script if you can do it in Perl in 10
fb58af
   minutes (assuming that you know Perl already) and if the processing
fb58af
   time or memory use is not a factor. Don't hunt pheasants with a .22
fb58af
   if you have a shotgun at your side . . . unless you simply enjoy
fb58af
   the challenge!
fb58af

fb58af
   Specifically, use awk or perl if you need to:
fb58af

fb58af
      - count fields or words on a line. (awk)
fb58af
      - count lines in a block or objects in a file.
fb58af
      - check lengths of strings or do math operations.
fb58af
      - handle very long lines or need very large buffers. (or gsed)
fb58af
      - handle binary data (control characters). (perl: binmode)
fb58af
      - loop through an array or list.
fb58af
      - test for file existence, filesize, or fileage.
fb58af
      - treat each paragraph as a line. (well, not always)
fb58af

fb58af
6.6. Known limitations among sed versions
fb58af

fb58af
   The limits below apply to the distributed binaries; the source code
   for most free versions of sed can be modified and recompiled. As
fb58af
   used below, "no limit" means there is no "fixed" limit. Limits are
fb58af
   actually determined by one's hardware, memory, operating system,
fb58af
   and which C library is used to compile sed.
fb58af

fb58af
6.6.1. Maximum line length
fb58af

fb58af
      GNU sed:        no limit
fb58af
      ssed:           no limit
fb58af
      sedmod v1.0:    4096 bytes
fb58af
      HHsed v1.5:     4000 bytes
fb58af
      sed v1.6:       [pending]
fb58af

fb58af
6.6.2. Maximum size for all buffers (pattern space + hold space)
fb58af

fb58af
      GNU sed:        no limit
fb58af
      ssed:           no limit
fb58af
      sedmod v1.0:    4096 bytes
fb58af
      HHsed v1.5:     4000 bytes
fb58af
      sed v1.6:       [pending]
fb58af

fb58af
6.6.3. Maximum number of files that can be read with read command
fb58af

fb58af
      GNU sed v3+:    no limit
fb58af
      ssed:           no limit
fb58af
      GNU sed v2.05:  total no. of r and w commands may not exceed 32
fb58af
      sedmod v1.0:    total no. of r and w commands may not exceed 20
fb58af
      sed v1.6:       [pending]
fb58af

fb58af
6.6.4. Maximum number of files that can be written with 'w' command
fb58af

fb58af
      GNU sed v3+:    no limit (but typical Unix is 253)
fb58af
      ssed:           no limit (but typical Unix is 253)
fb58af
      GNU sed v2.05:  total no. of r and w commands may not exceed 32
fb58af
      sedmod v1.0:    10
fb58af
      HHsed v1.5:     10
fb58af
      sed v1.6:       [pending]
fb58af

fb58af
6.6.5. Limits on length of label names
fb58af

fb58af
      GNU sed:        no limit
fb58af
      ssed:           no limit
fb58af
      HHsed v1.5:     no limit
fb58af
      sed v1.6:       [pending]
fb58af
      BSD sed:        8 characters
fb58af

fb58af
   Note that GNU sed and ssed both consider a semicolon to terminate a
fb58af
   label name.
fb58af

fb58af
6.6.6. Limits on length of write-file names
fb58af

fb58af
      GNU sed:        no limit
fb58af
      ssed:           no limit
fb58af
      HHsed v1.5:     no limit
fb58af
      sed v1.6:       [pending]
fb58af
      BSD sed:        40 characters
fb58af

fb58af
6.6.7. Limits on branch/jump commands
fb58af

fb58af
      GNU sed:        no limit
fb58af
      ssed:           no limit
fb58af
      HHsed v1.5:     50
fb58af
      sed v1.6:       [pending]
fb58af

fb58af
   As a practical consequence, this means that HHsed will not read
fb58af
   more than 50 lines into the pattern space via an N command, even if
fb58af
   the pattern space is only a few hundred bytes in size. HHsed exits
fb58af
   with an error message, "infinite branch loop at line {nn}".
fb58af

fb58af
6.7. Known incompatibilities between sed versions
fb58af

fb58af
6.7.1. Issuing commands from the command line
fb58af

fb58af
   Most versions of sed permit multiple commands to be issued on the
fb58af
   command line, separated by a semicolon (;). Thus,
fb58af

fb58af
       sed 'G;G' file
fb58af

fb58af
   should triple-space a file. However, for non-GNU sed, some commands
fb58af
   *require* separate expressions on the command line. These include:
fb58af

fb58af
      - all labels (':a', ':more', etc.)
fb58af
      - all branching instructions ('b', 't')
fb58af
      - commands to read and write files ('r' and 'w')
fb58af
      - any closing brace, '}'
fb58af

fb58af
   If these commands are used, they must be the LAST commands of an
fb58af
   expression. Subsequent commands must use another expression
fb58af
   (another -e switch plus arguments).  E.g.,
fb58af

fb58af
     sed  -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/\( *\)\1/\1/' files
fb58af

fb58af
   GNU sed, ssed, sed15 and sed16 all permit these commands to be
fb58af
   followed by a semicolon, so the previous script can be written:
fb58af

fb58af
     sed  ':a;s/^.\{1,77\}$/ &/;ta;s/\( *\)\1/\1/' files
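
   A scaled-down version of the portable form can be tried on any
   sed. The sketch below keeps only the padding loop and uses a width
   of 10 instead of 78, both chosen here to keep the output short.

```shell
# Portable form: the label and the branch each end an -e expression.
# The loop prepends one space at a time until the line is 10 wide.
padded=$(echo hi | sed -e :a -e 's/^.\{1,9\}$/ &/;ta')
printf '[%s]\n' "$padded"
```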
fb58af

fb58af
   Versions differ in implementing the 'a' (append), 'c' (change), and
fb58af
   'i' (insert) commands:
fb58af

fb58af
      sed "/foo/i New text here"              # HHsed/sedmod/gsed-30280
fb58af
      gsed -e "/foo/i\\" -e "New text here"   # GNU sed
fb58af
      sed1 -e "/foo/i" -e "New text here"     # one version of sed
fb58af
      sed2 "/foo/i\ New text here"            # another version
fb58af

fb58af
6.7.2. Using comments (prefixed by the '#' sign)
fb58af

fb58af
   Most versions of sed permit comments to appear in sed scripts only
fb58af
   on the first line of the script. Comments on line 2 or thereafter
fb58af
   are not recognized and will generate an error like "unrecognized
fb58af
   command" or "command [bad-line-here] has trailing garbage".
fb58af

fb58af
   GNU sed, HHsed, sedmod, and HP-UX sed permit comments to appear on
fb58af
   any line of the script, except after labels and branching commands
fb58af
   (b,t), *provided* that a semicolon (;) occurs after the command
fb58af
   itself. This syntax makes sed similar to awk and perl, which use a
fb58af
   similar commenting structure in their scripts.  Thus,
fb58af

fb58af
      # GNU style sed script
fb58af
      $!N;                        # except for last line, get next line
fb58af
      s/^\([0-9]\{5\}\).*\n\1.*//;    # if first 5 digits of each line
fb58af
                                      # match, delete BOTH lines.
fb58af
      t skip
fb58af
      P;                              # print 1st line only if no match
fb58af
      :skip
fb58af
      D;                    # delete 1st line of pattern space and loop
fb58af
      #---end of script---
fb58af

fb58af
   is a valid script for GNU-based versions of sed, but is
fb58af
   unrecognized for most other versions of sed.
fb58af

fb58af
   Finally, if the first two characters in a disk file script are
fb58af
   "#n", the output is suppressed, exactly as if -n were entered on
fb58af
   the command line. This is true for the following versions of sed:
fb58af

fb58af
      - ssed v3.57 and above
fb58af
      - gsed
fb58af
      - HHsed v1.5
fb58af
      - sed v1.6
fb58af

fb58af
   This syntax is not recognized by these versions of sed:
fb58af

fb58af
      - ssed v3.45 to v3.50 (other versions untested)
fb58af
      - sedmod v1.0
fb58af

fb58af
6.7.3. Special syntax in REs
fb58af

fb58af
A. HHsed v1.5 (by Howard Helman)
fb58af

fb58af
   The following expressions can be used for /RE/ addresses or in the
fb58af
   LHS of a substitution:
fb58af

fb58af
      +    - 1 or more occurrences of previous RE: same as \{1,\}
fb58af
      \<   - boundary between nonword and word character
fb58af
      \>   - boundary between word and nonword character
fb58af

fb58af
   The following expressions can be used for /RE/ addresses or on
fb58af
   either side of a substitution:
fb58af

fb58af
      \a   - bell         (ASCII 07, 0x07)
fb58af
      \b   - backspace    (ASCII 08, 0x08)
fb58af
      \e   - escape       (ASCII 27, 0x1B)
fb58af
      \f   - formfeed     (ASCII 12, 0x0C)
fb58af
      \n   - newline      (printed as 2 bytes, 0D 0A or ^M^J, in DOS)
fb58af
      \r   - return       (ASCII 13, 0x0D)
fb58af
      \t   - tab          (ASCII 09, 0x09)
fb58af
      \v   - vertical tab (ASCII 11, 0x0B)
fb58af
      \xHH - the ASCII character corresponding to 2 hex digits HH.
fb58af

fb58af
B. sed v1.6 (by Walter Briscoe)
fb58af

fb58af
   sed v1.6 accepts every expression supported by sed v1.5 (above),
fb58af
   plus the following elements, which can also be used in the RHS of a
fb58af
   substitution (in addition to those listed above):
fb58af

fb58af
      \\~  - insert replacement pattern defined in last s/// command
fb58af
             (must be used alone in the RHS)
fb58af
      \l   - change next element to lower case
fb58af
      \L   - change remaining elements to lower case
fb58af
      \u   - change next element to upper case
fb58af
      \U   - change remaining elements to upper case
fb58af
      \e   - end case conversion of next element
fb58af
      \E   - end case conversion of remaining elements
fb58af
      $0   - insert pattern space BEFORE the substitution
fb58af
      $1-$9 - match Nth word on the pattern space
fb58af

fb58af

fb58af
C. sedmod v1.0 (by Hern Chen)
fb58af

fb58af
   The following expressions can be used for /RE/ addresses or in the
   LHS of a substitution:
fb58af

fb58af
      +    - 1 or more occurrences of previous RE: same as \{1,\}
fb58af
      \a   - any alphanumeric: same as [a-zA-Z0-9]
fb58af
      \A   - 1 or more alphas: same as \a+
fb58af
      \d   - any digit: same as [0-9]
fb58af
      \D   - 1 or more digits: same as \d+
fb58af
      \h   - any hex digit: same as [0-9a-fA-F]
fb58af
      \H   - 1 or more hexdigits: same as \h+
fb58af
      \l   - any letter: same as [A-Za-z]
fb58af
      \L   - 1 or more letters: same as \l+
fb58af
      \n   - newline      (read as 2 bytes, 0D 0A or ^M^J, in DOS)
fb58af
      \s   - any whitespace character: space, tab, or vertical tab
fb58af
      \S   - 1 or more whitespace chars: same as \s+
fb58af
      \t   - tab          (ASCII 09, 0x09)
fb58af
      \<   - boundary between nonword and word character
fb58af
      \>   - boundary between word and nonword character
fb58af

fb58af
   The following expressions can be used in the RHS of a substitution.
fb58af
   "Elements" refer to \1 .. \9, &, $0, or $1 .. $9:
fb58af

fb58af
      &    - insert regexp defined on LHS
fb58af
      \e   - end case conversion of next element
fb58af
      \E   - end case conversion of remaining elements
fb58af
      \l   - change next element to lower case
fb58af
      \L   - change remaining elements to lower case
fb58af
      \n   - newline      (printed as 2 bytes, 0D 0A or ^M^J, in DOS)
fb58af
      \t   - tab          (ASCII 09, 0x09)
fb58af
      \u   - change next element to upper case
fb58af
      \U   - change remaining elements to upper case
fb58af
      $0   - insert the original pattern space
fb58af
      $1-$9 - match Nth word on the pattern space
fb58af

fb58af
D. UnixDos sed
fb58af

fb58af
   The following expressions can be used in text, LHS, and RHS:
fb58af

fb58af
      \n   - newline      (printed as 2 bytes, 0D 0A or ^M^J, in DOS)
fb58af

fb58af
E. GNU sed v1.03 (by Frank Whaley)
fb58af

fb58af
   When used with the -x (extended) switch on the command line, or
fb58af
   when '#x' occurs as the first line of a script, Whaley's gsed103
fb58af
   supports the following expressions in both the LHS and RHS of a
fb58af
   substitution:
fb58af

fb58af
      \|      matches the expression on either side
fb58af
      ?       0 or 1 occurrences of previous RE: same as \{0,1\}
fb58af
      +       1 or more occurrences of previous RE: same as \{1,\}
fb58af
      \a      "alert" beep     (BEL, Ctrl-G, 0x07)
fb58af
      \b      backspace        (BS, Ctrl-H, 0x08)
fb58af
      \f      formfeed         (FF, Ctrl-L, 0x0C)
fb58af
      \n      newline          (LF, Ctrl-J, 0x0A)
fb58af
      \r      carriage-return  (CR, Ctrl-M, 0x0D)
fb58af
      \t      horizontal tab   (HT, Ctrl-I, 0x09)
fb58af
      \v      vertical tab     (VT, Ctrl-K, 0x0B)
fb58af
      \bBBB   binary char, where BBB are 1-8 binary digits, [0-1]
fb58af
      \dDDD   decimal char, where DDD are 1-3 decimal digits, [0-9]
fb58af
      \oOOO   octal char, where OOO are 1-3 octal digits, [0-7]
fb58af
      \xHH    hex char, where HH are 1-2 hex digits, [0-9A-F]
fb58af

fb58af
   In normal mode, with or without the -x switch, the following escape
fb58af
   sequences are also supported in regex addressing or in the LHS of a
fb58af
   substitution:
fb58af

fb58af
      \`      matches beginning of pattern space: same as /^/
fb58af
      \'      matches end of pattern space: same as /$/
fb58af
      \B      boundary between 2 word or 2 nonword characters
fb58af
      \w      any nonword character [*BUG!* should be a word char]
fb58af
      \W      any nonword character: same as /[^A-Za-z0-9]/
fb58af
      \<      boundary between nonword and word char
fb58af
      \>      boundary between word and nonword char
fb58af

fb58af
F. GNU sed v2.05 and higher versions
fb58af

fb58af
   The following expressions can be used for /RE/ addresses or in the
fb58af
   LHS of a substitution:
fb58af

fb58af
      \`  - matches the beginning of the pattern space (same as "^")
fb58af
      \'  - matches the end of the pattern space (same as "$")
fb58af
      \?  - 0 or 1 occurrence of previous character: same as \{0,1\}
fb58af
      \+  - 1 or more occurrences of previous character: same as \{1,\}
fb58af
      \|  - matches the string on either side, e.g., foo\|bar
fb58af
      \b  - boundary between word and nonword chars (reversible)
fb58af
      \B  - boundary between 2 word or between 2 nonword chars
fb58af
      \n  - embedded newline (usable after N, G, or similar commands)
fb58af
      \w  - any word character: [A-Za-z0-9_]
fb58af
      \W  - any nonword char: [^A-Za-z0-9_]
fb58af
      \<  - boundary between nonword and word character
fb58af
      \>  - boundary between word and nonword character
fb58af

fb58af
   On \b, \B, \<, and \>, see section 6.7.4 ("Word boundaries"),
fb58af
   below.
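
   A few of the expressions listed above, tried on throwaway strings.
   This is a sketch assuming a reasonably recent GNU sed; the variable
   names and sample text are invented for illustration.

```shell
# \+  one or more of the previous item: squeeze runs of spaces
plus_demo=$(echo 'a  b   c' | sed 's/ \+/ /g')            # a b c

# \?  zero or one of the previous item
opt_demo=$(echo 'color colour' | sed 's/colou\?r/X/g')    # X X

# \|  alternation
alt_demo=$(echo 'foo or bar' | sed 's/foo\|bar/X/g')      # X or X

# \W  any nonword character (note the underscore is a word character)
word_demo=$(echo 'a_1-b.2' | sed 's/\W/ /g')              # a_1 b 2

printf '%s\n' "$plus_demo" "$opt_demo" "$alt_demo" "$word_demo"
```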
fb58af

fb58af
   Undocumented -r switch:
fb58af

fb58af
   Beginning with version 3.02, GNU sed has a -r switch (undocumented
   until version 4.0), which activates Extended Regular Expressions in
   the following manner:
fb58af

fb58af
       ?      -  0 or 1 occurrence of previous character
fb58af
       +      -  1 or more occurrences of previous character
fb58af
       |      -  matches the string on either side, e.g., foo|bar
fb58af
       (...)  -  enable grouping without backslash
fb58af
       {...}  -  enable interval expression without backslash
fb58af

fb58af
   When the -r switch (mnemonic: "regular expression") is used, prefix
fb58af
   these symbols with a backslash to disable the special meaning.
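
   The effect of -r is easiest to see side by side. The sketch below
   assumes GNU sed; the sample strings and variable names are made up.

```shell
# Default (basic) syntax: grouping and "+" need backslashes.
bre_demo=$(echo 'aaa bbb' | sed 's/\(a\+\) \(b\+\)/\2 \1/')   # bbb aaa

# With -r the same expression is written without backslashes.
ere_demo=$(echo 'aaa bbb' | sed -r 's/(a+) (b+)/\2 \1/')      # bbb aaa

# With -r, a backslash turns "+" back into a literal plus sign.
lit_demo=$(echo '1+1=2' | sed -r 's/1\+1/two/')               # two=2

printf '%s\n' "$bre_demo" "$ere_demo" "$lit_demo"
```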
fb58af

fb58af
   Escape sequences:
fb58af

fb58af
   Beginning with version 3.02.80, the following escape sequences can
fb58af
   now be used on both sides of a "s///" substitution:
fb58af

fb58af
      \a      "alert" beep     (BEL, Ctrl-G, 0x07)
fb58af
      \f      formfeed         (FF, Ctrl-L, 0x0C)
fb58af
      \n      newline          (LF, Ctrl-J, 0x0A)
fb58af
      \r      carriage-return  (CR, Ctrl-M, 0x0D)
fb58af
      \t      horizontal tab   (HT, Ctrl-I, 0x09)
fb58af
      \v      vertical tab     (VT, Ctrl-K, 0x0B)
fb58af
      \oNNN   a character with the octal value NNN
fb58af
      \dNNN   a character with the decimal value NNN
fb58af
      \xHH    a character with the hexadecimal value HH
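
   Some of these escapes in use, on either side of "s///". A sketch
   only, assuming a GNU sed of version 4.x or later; the variable names
   are hypothetical.

```shell
# \t on the LHS matches a tab
tab_demo=$(printf 'a\tb\n' | sed 's/\t/,/')      # a,b

# \n on the RHS inserts a newline
nl_demo=$(echo 'a,b' | sed 's/,/\n/')            # "a" and "b" on two lines

# \xHH on the LHS: 0x20 is a space
hex_demo=$(echo 'a b' | sed 's/\x20/_/')         # a_b

# \oNNN on the RHS: octal 055 is a hyphen
oct_demo=$(echo 'a b' | sed 's/ /\o055/')        # a-b

printf '%s\n' "$tab_demo" "$nl_demo" "$hex_demo" "$oct_demo"
```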
fb58af

fb58af
   Note that GNU sed also supports "character classes", a POSIX
fb58af
   extension to regexes, described in section 3.7, above.
fb58af

fb58af
G. sed 4.0 and higher versions
fb58af

fb58af
   The following expressions can be used in the RHS of a substitution.
fb58af

fb58af
      \e   - end case conversion
fb58af
      \l   - change next character to lower case
fb58af
      \L   - change remaining text to lower case
fb58af
      \n   - newline      (printed as 2 bytes, 0D 0A or ^M^J, in DOS)
fb58af
      \t   - tab          (ASCII 09, 0x09)
fb58af
      \u   - change next character to upper case
fb58af
      \U   - change remaining text to upper case
fb58af

fb58af
   In addition, GNU sed 4.0 can modify the way ^ and $ are interpreted,
fb58af
   so that ^ can also match an empty string after a newline character,
fb58af
   and $ can also match an empty string before a newline character (to
fb58af
   do this, add an "M" after the regular expression terminator, like
fb58af
   /^>/M -- see section 3.1.1). Even if you use this feature, \` and \'
fb58af
   still match the beginning and the end of the pattern space,
fb58af
   respectively.
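
   The case-conversion escapes and the "M" modifier can be tried like
   this. The sketch assumes GNU sed 4.0 or later; the sample text and
   variable names are invented.

```shell
# \u upper-cases only the next character of the replacement
cap_demo=$(echo 'hello world' | sed 's/\w\+/\u&/g')     # Hello World

# \U upper-cases the rest of the replacement text
up_demo=$(echo 'hello world' | sed 's/o w/\U&/')        # hellO World

# \L lower-cases the rest of the replacement text
low_demo=$(echo 'HELLO' | sed 's/.*/\L&/')              # hello

# With M, ^ also matches after the newline that N embedded,
# so both lines in the pattern space get the "> " prefix.
multi_demo=$(printf 'a\nb\n' | sed 'N;s/^/> /Mg')

printf '%s\n' "$cap_demo" "$up_demo" "$low_demo" "$multi_demo"
```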
fb58af

fb58af
H. ssed
fb58af

fb58af
   Everything that was said for GNU sed applies to ssed as well. In
fb58af
   addition, in Perl-mode (-R switch), these become active or inactive:
fb58af

fb58af
      .     - no longer matches new-line characters
fb58af
      \A    - matches beginning of pattern space
fb58af
      \Z    - matches end of pattern space or last newline in the PS
fb58af
      \z    - matches end of pattern space
fb58af
      \d    - matches any digit: same as [0-9]
fb58af
      \D    - matches any non-digit: same as [^0-9]
fb58af
      \`    - no longer matches beginning of pattern space
fb58af
      \'    - no longer matches end of pattern space
fb58af
      \<    - no longer matches boundary between nonword & word char
fb58af
      \>    - no longer matches boundary between word & nonword char
fb58af
      \oNNN - no longer matches char with octal value NNN
fb58af
      \dNNN - no longer matches char with decimal value NNN
fb58af
      \NNN  - matches char with octal value NNN
fb58af

fb58af
   Perl mode supports lookahead (?=match) and lookbehind (?<=match)
   pattern matching.  The text matched by the assertion itself is NOT
   captured in "&" for s/// replacements!
fb58af

fb58af
      foo(?=bar)   - match "foo" only if "bar" follows it
fb58af
      foo(?!bar)   - match "foo" only if "bar" does NOT follow it
fb58af
      (?<=foo)bar  - match "bar" only if "foo" precedes it
fb58af
      (?<!foo)bar  - match "bar" only if "foo" does NOT precede it

      (?<!in|on|at)foo
                  - match "foo" only if NOT preceded by "in", "on" or "at"
      (?<=\d{3})(?<!999)foo
                  - match "foo" only if preceded by 3 digits other than "999"
fb58af

fb58af
   In Perl mode, there are two new switches in /addressing/ or s///
   commands. Switches may be lowercase in s/// commands, but must be
   uppercase in /addressing/:
fb58af

fb58af
       /S  - lets "." match a newline also
fb58af
       /X  - extra whitespace is ignored. See below, for sample usage.
fb58af

fb58af
   Here are some examples of Perl-style regular expressions. Use the -R
fb58af
   switch.
fb58af

fb58af
     (?i)abc    - case-insensitive match of abc, ABC, aBc, ABc, etc.
fb58af
     ab(?i)c    - same as above; the (?i) applies throughout the pattern
fb58af
     (ab(?i)c)  - matches abc or abC; the outer parens make the difference!
fb58af
     (?m)       - multi-line pattern space: same as "s/FIND/REPL/M"
fb58af
     (?s)       - set "." to match newline also: same as "s/FIND/REPL/S"
fb58af
     (?x)       - ignore whitespace and #comments; see section (9) below.
fb58af

fb58af
     (?:abc)foo    - match "abcfoo", but do not capture 'abc' in \1
fb58af
     (?:ab|cd)ef   - match "abef" or "cdef"; nothing is captured in \1
fb58af
     (?#remark)xy  - match "xy"; remarks after "#" are ignored.
fb58af

fb58af
   And here are some sample uses of /X switch to add comments to complex
fb58af
   expressions. To embed literal spaces, precede with \ or put inside
fb58af
   [brackets].
fb58af

fb58af
     # ssed script to change "(123) 456-7890" into "[ac123] 456-7890"
fb58af
     #
fb58af
     s/ # BACKSLASH IS NEEDED AT END OF EACH LINE!   \
fb58af
     \(                   # literal left paren, (    \
fb58af
     (\d{3})              # 3 digits                 \
fb58af
     \)                   # literal right paren, )   \
fb58af
     [ \t]*               # zero or more spaces or tabs  \
fb58af
     (\d{3}-\d{4})        # 3 digits, hyphen, 4 digits   \
fb58af
     /[ac\1] \2/gx;       # replace g(lobally), with e(x)tended spacing
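
   For comparison, here is roughly the same substitution for a sed
   without Perl mode. This is a sketch assuming GNU sed with the -r
   switch; as far as I know, plain GNU sed has neither "\d" nor the /x
   switch, so the expression goes on one line with [0-9] in place of
   \d. The variable name is hypothetical.

```shell
# Change "(123) 456-7890" into "[ac123] 456-7890".
# \( and \) are literal parens under -r; bare ( ) do the grouping.
phone_demo=$(echo '(123) 456-7890' |
  sed -r 's/\(([0-9]{3})\)[[:blank:]]*([0-9]{3}-[0-9]{4})/[ac\1] \2/g')
printf '%s\n' "$phone_demo"
```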
fb58af

fb58af
6.7.4. Word boundaries
fb58af

fb58af
   GNU sed, ssed, sed16, sed15 and sedmod use certain symbols to define
fb58af
   the boundary between a "word character" and a nonword character. A
fb58af
   word character fits the regex "[A-Za-z0-9_]". Note: a word character
fb58af
   includes the underscore "_" but not the hyphen, probably because the
fb58af
   underscore is permissible as a label in sed and in other scripting
fb58af
   languages. (In gsed103, a word character did NOT include the
fb58af
   underscore; it included alphanumerics only.)
fb58af

fb58af
   These symbols include '\<' and '\>' (gsed, ssed, sed15, sed16,
fb58af
   sedmod) and '\b' and '\B' (gsed only). Note that the boundary
fb58af
   symbols do not represent a character, but a position on the line.
fb58af
   Word boundaries are used with literal characters or character sets
fb58af
   to let you match (and delete or alter) whole words without
fb58af
   affecting the spaces or punctuation marks outside of those words.
fb58af
   They can only be used in a "/pattern/" address or in the LHS of a
fb58af
   's/LHS/RHS/' command. The following table shows how these symbols
fb58af
   may be used in HHsed and GNU sed. Sedmod matches the syntax of
fb58af
   HHsed.
fb58af

fb58af
      Match position      Possible word boundaries   HHsed   GNU sed
fb58af
      ---------------------------------------------------------------
fb58af
      start of word    [nonword char]^[word char]      \<    \< or \b
fb58af
      end of word         [word char]^[nonword char]   \>    \> or \b
fb58af
      middle of word      [word char]^[word char]     none      \B
fb58af
      outside of word  [nonword char]^[nonword char]  none      \B
fb58af
      ---------------------------------------------------------------
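
   The GNU sed column of the table can be exercised as shown here. A
   sketch only, assuming GNU sed; the sample words and variable names
   are made up.

```shell
# \< and \> : match "the" only as a whole word
whole_demo=$(echo 'the other then' | sed 's/\<the\>/X/g')    # X other then

# \b works at either edge of a word
b_demo=$(echo 'cat concat' | sed 's/\bcat\b/dog/g')          # dog concat

# \B : "bar" only when it is NOT at the start of a word
B_demo=$(echo 'foobar bar' | sed 's/\Bbar/X/g')              # fooX bar

# The underscore is a word character; the hyphen is not.
under_demo=$(echo 'foo_bar foo-bar' | sed 's/\<bar\>/X/g')   # foo_bar foo-X

printf '%s\n' "$whole_demo" "$b_demo" "$B_demo" "$under_demo"
```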
fb58af

fb58af
   In ssed, the symbols '\<' and '\>' lose their special meaning when
fb58af
   the -R switch is used to invoke Perl-style expressions. However,
fb58af
   the identical meaning of '\<' and '\>' can be obtained through
fb58af
   these nonmatching, zero-width assertions:
fb58af

fb58af
       (?<!\w)(?=\w)   # same as "\<"
       (?<=\w)(?!\w)   # same as "\>"
fb58af

fb58af
6.7.5. Commands which operate differently
fb58af

fb58af
A. GNU sed version 3.02 and 3.02.80
fb58af

fb58af
   The N command no longer discards the contents of the pattern space
fb58af
   upon reaching the end of file. This is not a bug, it's a feature.
fb58af
   However, it breaks certain scripts which relied on the older
fb58af
   behavior of N.
fb58af

fb58af
   'N' adds the Next line to the pattern space, enabling multiple
fb58af
   lines to be stored and acted upon. Upon reaching the last line of
fb58af
   the file, if the N command was issued again, the contents of the
fb58af
   pattern space would be silently deleted and the script would abort
fb58af
   (this has been the traditional behavior). For this reason, sed
fb58af
   users generally wrote:
fb58af

fb58af
       $!N;   # to add the Next line to every line but the last one.
fb58af

fb58af
   However, certain sed scripts relied on this behavior, such as the
fb58af
   script to delete trailing blank lines at the end of a file (see
fb58af
   script #12 in section 3.2, "Common one-line sed scripts", above).
fb58af
   Also, classic textbooks such as Dale Dougherty and Arnold Robbins'
fb58af
   _sed & awk_ documented the older behavior.
fb58af

fb58af
   The GNU sed maintainer felt that despite the portability problems
fb58af
   this would cause, changing the N command to print (rather than
fb58af
   delete) the pattern space was more consistent with one's intuitions
fb58af
   about how a command to "append the Next line" _ought_ to behave.
fb58af
   Another fact favoring the change was that "{N;command;}" will
fb58af
   delete the last line if the file has an odd number of lines, but
fb58af
   print the last line if the file has an even number of lines.
fb58af

fb58af
   To convert scripts which used the former behavior of N (deleting
fb58af
   the pattern space upon reaching the EOF) to scripts compatible with
fb58af
   all versions of sed, change a lone "N;" to "$d;N;".
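
   The two portable forms can be compared on a file with an odd number
   of lines. A minimal sketch; the three-line input and the variable
   names are invented for illustration.

```shell
# "$!N" never issues N on the last line, so the unpaired
# last line is kept by every version of sed.
join_demo=$(printf '1\n2\n3\n' | sed '$!N;s/\n/ /')
printf '%s\n' "$join_demo"      # "1 2" then "3"

# "$d;N" deletes the unpaired last line explicitly, which
# reproduces the older behavior of a lone N everywhere.
drop_demo=$(printf '1\n2\n3\n' | sed '$d;N;s/\n/ /')
printf '%s\n' "$drop_demo"      # "1 2"
```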
fb58af

fb58af
------------------------------
fb58af

fb58af
7. KNOWN BUGS AMONG SED VERSIONS
fb58af

fb58af
   Most versions of GNU sed and ssed contain a "buglist" in the
fb58af
   archive source code of known errors or reported behaviors that may
fb58af
   be misconstrued as bugs. This portion of the sed FAQ does _not_
fb58af
   attempt to fully reproduce those buglist files. However, we do
fb58af
   seek to do some substantial reporting, particularly where certain
fb58af
   programs have no "buglist" of their own or are not being actively
fb58af
   maintained.
fb58af

fb58af
   As a rule of thumb, if the bug "bites" someone on the sed-users
fb58af
   mailing list, I tend to report it.
fb58af

fb58af
7.1. ssed v3.59 (by Paolo Bonzini)
fb58af

fb58af
   (1) N does not discard the contents of the pattern space upon
fb58af
   reaching the end of file; not a bug. See section 6.7.5.A, above.
fb58af

fb58af
   (2) If \x26 is entered into the RHS of a substitution, it is
fb58af
   interpreted as an ampersand metacharacter, and the entire pattern
fb58af
   matched in the "find" portion is inserted at that point. A literal
fb58af
   ampersand should be inserted instead.
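
   Where a literal ampersand is wanted in the RHS, the standard escape
   "\&" sidesteps the question of how \x26 is handled. A small sketch,
   with hypothetical variable names:

```shell
# A bare & stands for the matched text ...
amp_meta=$(echo 'a' | sed 's/a/[&]/')      # [a]

# ... while \& inserts a literal ampersand.
amp_lit=$(echo 'a' | sed 's/a/[\&]/')      # [&]

printf '%s\n' "$amp_meta" "$amp_lit"
```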
fb58af

fb58af
   (3) Under Windows 2000, the -i switch doesn't create backup files
fb58af
   properly. When passed one or more files to process, the source
fb58af
   file(s) are unchanged, and the output changed files are given
fb58af
   filenames like sedDOSxyz with no way to correspond them with the
fb58af
   names of the source files.
fb58af

fb58af
7.2. GNU sed v4.0 - v4.0.5
fb58af

fb58af
   (1) N does not discard the contents of the pattern space upon
fb58af
   reaching the end of file; not a bug. See section 6.7.5.A, above.
fb58af

fb58af
   (2) If \x26 is entered into the RHS of a substitution, it is
fb58af
   interpreted as an ampersand metacharacter, and the entire pattern
fb58af
   matched in the "find" portion is inserted at that point. A literal
fb58af
   ampersand should be inserted instead.
fb58af

fb58af
7.3. GNU sed v3.02.80
fb58af

fb58af
   (1) N does not discard the contents of the pattern space upon
fb58af
   reaching the end of file; not a bug. See section 6.7.5.A, above.
fb58af

fb58af
   (2) Same as #2 for GNU sed v4.0, above.
fb58af

fb58af
7.4. GNU sed v3.02
fb58af

fb58af
   (1) Affects only v3.02 binaries compiled with DJGPP for MS-DOS and
fb58af
   MS-Windows: 'l' (list) command does not display a lone carriage
fb58af
   return (0x0D, ^M) embedded in a line.
fb58af

fb58af
   (2) The expression "\<" causes problems when attempting the
fb58af
   following types of substitutions, which should print "+aaa +bbb":
fb58af

fb58af
       echo aaa bbb | sed 's/\</+/g'    # prints "+a+a+a +b+b+b"
fb58af
       echo aaa bbb | sed 's/\<./+&/g'  # prints "+a+a+a +b+b+b"
fb58af

fb58af
   (3) The N command no longer discards the contents of the pattern
fb58af
   space upon reaching the end of file. This is not a bug, it's a
fb58af
   feature. See section 6.7.5, "Commands which operate differently".
fb58af

fb58af
7.5. GNU sed v2.05
fb58af

fb58af
   (1) If a number follows the substitute command (e.g., s/f/F/10) and
fb58af
   the number exceeds the possible matches on the pattern space, the
fb58af
   command 't label' _always_ jumps to the specified label. 't' should
fb58af
   jump only if the substitution was successful (or returned "true").
fb58af

fb58af
   (2) 'l' (list) command does not convert the following characters to
fb58af
   hex values, but passes them through unchanged: 0xF7, 0xFB, 0xFC,
fb58af
   0xFD, 0xFE.
fb58af

fb58af
   (3) A range address like "/foo/,14" is supposed to match every line
fb58af
   from the first occurrence of "foo" until line 14, inclusive, and
fb58af
   then match only those lines containing "foo" thereafter. In gsed
fb58af
   v2.05, if "foo" occurs later in the file, every line from there to
fb58af
   the end of file will be matched (since gsed is looking for line 14
fb58af
   to occur again!).
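
   The intended behavior of such a range can be seen with a current
   sed. This sketch assumes GNU sed and the "seq" utility; the address
   /[37]$/,5 and the variable name are chosen only for illustration.

```shell
# Lines 3, 7, 13 and 17 match the first address. The range opened at
# line 3 runs through line 5; each later match is past line 5, so it
# selects that single line only.
range_demo=$(seq 1 20 | sed -n '/[37]$/,5p')
printf '%s\n' "$range_demo"     # 3 4 5 7 13 17, one per line
```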
fb58af

fb58af
   (4) The regexes /\`/ and /\'/ are not interpreted as a backquote
fb58af
   and apostrophe, as might be expected. Instead, they are used to
fb58af
   represent the beginning-of-line and end-of-line (respectively), to
fb58af
   conform with similar regexes in the GNU versions of Emacs and awk.
fb58af
   As a consequence, there is no clear way to indicate an apostrophe,
fb58af
   since a bare apostrophe (') has special meaning to the Unix shell
fb58af
   and the quoted apostrophe (\') is interpreted as the EOL. A
fb58af
   double-quote apostrophe (\\') was interpreted as a backslash to sed
fb58af
   and a quote mark to the shell--again, not providing the expected
fb58af
   results. This syntax changed in the next version of gsed.
fb58af

fb58af
   (5) Multiple occurrences of the 'w' command fail, as shown here,
fb58af
   given that both "aaa" and "bbb" occur within the file:
fb58af

fb58af
       gsed -e "/aaa/w FILE" -e "/bbb/w FILE" input.txt
fb58af

fb58af
   (6) The expression "\<" causes problems when attempting the
fb58af
   following type of substitution, which should print "+aaa +bbb":
fb58af

fb58af
       echo aaa bbb | sed 's/\</+/g'    # sed hangs up with no output
fb58af

fb58af
   The syntax 's/\<./+&/g' issues the proper output.
fb58af

fb58af
7.6. GNU sed v1.18
fb58af

fb58af
   (1) Same as #1 for GNU sed v2.05, above.
fb58af

fb58af
   (2) The following command will lock the computer under Win95. Echos
fb58af
   is an echo command that does not issue a trailing newline:
fb58af

fb58af
       echos any_word | gsed "s/[ ]*$//"
fb58af

fb58af
   (3) Same as #3 for GNU sed v2.05, above.
fb58af

fb58af
7.7. GNU sed v1.03 (by Frank Whaley)
fb58af

fb58af
   (1) The \w and \W escape sequences both match only nonword
fb58af
   characters. \w is misdefined and should match word characters.
fb58af

fb58af
   (2) The underscore is defined as a nonword character; it should be
fb58af
   defined as a word character.
fb58af

fb58af
   (3) Same as #3 for GNU sed v2.05, above.
fb58af

fb58af
7.8. sed v1.6 (by Walter Briscoe) - still in beta version
fb58af

fb58af
   (1) Duplicated subexpressions (still) do not match an empty set as
fb58af
   they should. This problem was inherited from HHsed15.
fb58af

fb58af
       echo 123 | sed "s/\([a-z][a-z]\)*/=\1/"  # does not return '='
fb58af

fb58af
   (2) If grouping is followed by a + operator, nothing is matched.
   This problem was inherited from HHsed; sed v1.6 fixed the
   corresponding bug with the * operator, but the problem with the +
   operator persists.
fb58af

fb58af
       echo aaa | sed "/\(a\)+/d"          # nothing is deleted.
fb58af

fb58af
   (3) With the interval expressions \{1,\} and +, there is a bug
fb58af
   related to the & replacement character. This affected the BETA
fb58af
   release, and it's not known if it affects the final release.
fb58af

fb58af
       echo ab | sed "s/a[^a]*/&c/"        # returns 'abc'. Okay.
fb58af
       echo ab | sed "s/a[^a]+/&c/"        # returns 'ab'. Bug!
fb58af
       echo ab | sed "s/a[^a]\{1,\}/&c/"   # returns 'ab'. Bug!
fb58af

fb58af
7.9. HHsed v1.5 (by Howard Helman)
fb58af

fb58af
   (1) If a number follows the substitute command (e.g., s/foo/bar/2),
fb58af
   in a sed script entered from the command line, two semicolons must
fb58af
   follow the number, or they must be separated by an -e switch.
fb58af
   Normally, only 1 semicolon is needed to separate commands.
fb58af

fb58af
       echo bit bet | HHsed "s/b/n/2;;s/b/B/"          # solution 1
fb58af
       echo bit bet | HHsed -e "s/b/n/2" -e "s/b/B/"   # solution 2
fb58af

fb58af
   (2) If the substitute command is followed by a number and a "p"
fb58af
   flag, when the -n switch is used, the "p" flag must occur first.
fb58af

fb58af
       echo aaa | HHsed -n "s/./B/3p"    # bug! nothing prints
fb58af
       echo aaa | HHsed -n "s/./B/p3"    # prints "aaB" as expected
fb58af

fb58af
   (3) The following commands will cause HHsed to lock the computer
fb58af
   under MS-DOS or Win95. Note that both use regular expressions
   which can match zero characters.
fb58af

fb58af
       sed -n "p;s/\<//g;" file
fb58af
       sed -n "p;s/[char-set]*//g;" file
fb58af

fb58af
   (4) The range command '/RE1/,/RE2/' in HHsed will match one line if
fb58af
   both regexes occur on the same line (see section 3.4(3), above).
fb58af
   Though this could be construed as a feature, it should probably be
fb58af
   considered a bug since its operation differs from every other
fb58af
   version of sed. For example, '/----/,/----/{s/^/>>/;}' should put
fb58af
   two angle brackets ">>" before every line which is sandwiched
fb58af
   between a row of 4 or more hyphens. With HHsed, this command will
fb58af
   only prefix the hyphens themselves with the angle brackets.
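
   For reference, this is what the range does in other versions of
   sed. A sketch run with GNU sed; the sample input and variable name
   are made up.

```shell
# The end address is only looked for on lines AFTER the one that
# opened the range, so the whole sandwiched block is prefixed.
quote_demo=$(printf 'a\n----\nb\nc\n----\nd\n' |
  sed '/----/,/----/{s/^/>>/;}')
printf '%s\n' "$quote_demo"
```

   The output should be "a", ">>----", ">>b", ">>c", ">>----" and "d",
   each on its own line.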
fb58af

fb58af
   (5) If the hold space is empty, the H command copies the pattern
fb58af
   space to the hold space but fails to prepend a leading newline. The
fb58af
   H command is supposed to add a newline, followed by the contents of
fb58af
   the pattern space, to the hold space at all times. A workaround is
fb58af
   "{G;s/^\(.*\)\(\n\)$/\2\1/;H;s/\n$//;}", but it requires knowing
fb58af
   that the hold space is empty and using the command only once.
fb58af
   Another alternative is to use the G or the h command alone at key
fb58af
   points in the script.
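
   The normal behavior of H on an empty hold space can be observed
   with another sed. A sketch using GNU sed; the variable name is
   hypothetical.

```shell
# H appends a newline plus the pattern space to the hold space, even
# when the hold space starts out empty. Turning the newlines into
# commas on the last line makes the leading newline visible.
hold_demo=$(printf 'a\nb\n' | sed -n 'H;${x;s/\n/,/g;p;}')
printf '%s\n' "$hold_demo"      # ,a,b
```

   Going by the description above, HHsed would be expected to give
   "a,b" here, without the leading comma.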
fb58af

fb58af
   (6) If grouping is followed by an '*' or '+' operator, HHsed does
fb58af
   not match the pattern, but issues no warning. See below:
fb58af

fb58af
       echo aaa | HHsed "/\(a\)*/d"      # nothing is deleted
fb58af
       echo aaa | HHsed "/\(a\)+/d"      # nothing is deleted
fb58af
       echo aaa | HHsed "s/\(a\)*/\1B/"  # nothing is changed
fb58af
       echo aaa | HHsed "s/\(a\)+/\1B/"  # nothing is changed
fb58af

fb58af
   (7) If grouping is followed by an interval expression, HHsed halts
fb58af
   with the error message "garbled command", in all of the following
fb58af
   examples:
fb58af

fb58af
       echo aaa | HHsed "/\(a\)\{3\}/d"
fb58af
       echo aaa | HHsed "/\(a\)\{1,5\}/d"
fb58af
       echo aaa | HHsed "s/\(a\)\{3\}/\1B/"
fb58af

fb58af
   (8) In interval expressions, 0 is not supported. E.g., \{0,3\}
fb58af

fb58af
7.10. sedmod v1.0 (by Hern Chen)
fb58af

fb58af
   Technically, the following are limits (or features?) of sedmod, not
fb58af
   bugs, since the docs for sedmod do not claim to support these
fb58af
   missing features.
fb58af

fb58af
   (1) sedmod does not support standard interval expressions  \{...\}
fb58af
   present in nearly all versions of sed.
fb58af

fb58af
   (2) If grouping is followed by an '*' or '+' operator, sedmod gives
fb58af
   a "garbled command" message. However, if the grouped expressions
fb58af
   are string literals with no metacharacters, a partial workaround
fb58af
   can be done like so:
fb58af

fb58af
       \(string\)\1*    # matches 1 or more instances of 'string'
fb58af
       \(string\)\1+    # matches 2 or more instances of 'string'
fb58af

fb58af
   (3) sedmod does not support a numeric argument after the s///
fb58af
   command, as in 's/a/b/3', present in nearly all versions of sed.
fb58af

fb58af
   The following are bugs in sedmod v1.0:
fb58af

fb58af
   (4) When the -i (ignore case) switch is used, the '/regex/d'
fb58af
   command is not properly obeyed. Sedmod may miss one or more lines
fb58af
   matching the expression, regardless of where they occur in the
fb58af
   script. Workaround: use "/regex/{d;}" instead.
fb58af

fb58af
7.11. HP-UX sed
fb58af

fb58af
   (1) Versions of HP-UX sed up to and including version 10.20 are
fb58af
   buggy. According to the README file, which comes with the GNU cc
fb58af
   at <ftp://ftp.ntua.gr/pub/gnu/sed/sed-2.05.bin.README>:
fb58af

fb58af
   "When building gcc on a hppa*-*-hpux10 platform, the `fixincludes'
fb58af
   step (which involves running a sed script) fails because of a bug
fb58af
   in the vendor's implementation of sed.  Currently the only known
fb58af
   workaround is to install GNU sed before building gcc.  The file
fb58af
   sed-2.05.bin.hpux10 is a precompiled binary for that platform."
fb58af

fb58af
7.12. SunOS sed v4.1
fb58af

fb58af
   (1) Bug occurs in RE pattern matching when a non-null '[char-set]*'
fb58af
   is followed by a null '\NUM' pattern recall, illustrated here and
fb58af
   reported by Greg Ubben:
fb58af

fb58af
       s/\(a\)\(b*\)cd\1[0-9]*\2foo/bar/  # between '[0-9]*' and '\2'
fb58af
       s/\(a\{0,1\}\).\{0,1\}\1/bar/      # between '.\{0,1\}' and '\1'
fb58af

fb58af
   Workaround: add a do-nothing 'X*' expression which will not match
fb58af
   any characters on the line between the two components. E.g.,
fb58af

fb58af
       s/\(a\)\(b*\)cd\1[0-9]*X*\2foo/bar/
fb58af
       s/\(a\{0,1\}\).\{0,1\}X*\1/bar/
fb58af

fb58af
7.13. SunOS sed v5.6
fb58af

fb58af
   (1) If grouping is followed by an asterisk, SunOS sed does not match
fb58af
   the null string, which it should do. The following command:
fb58af

fb58af
       echo foo | sed 's/f\(NO-MATCH\)*/g\1/'
fb58af

fb58af
   should transform "foo" to "goo" under normal versions of sed.
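
   The same test makes a handy check of any sed for this class of bug.
   A sketch, run here with GNU sed; the second command is borrowed from
   the sed v1.6 entry above, and the variable names are hypothetical.

```shell
# A starred group may match zero times; \1 is then empty.
star_demo=$(echo foo | sed 's/f\(NO-MATCH\)*/g\1/')          # goo

# Same idea: the group matches the empty string at the
# start of the line, so only "=" is inserted.
empty_demo=$(echo 123 | sed 's/\([a-z][a-z]\)*/=\1/')        # =123

printf '%s\n' "$star_demo" "$empty_demo"
```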
fb58af

fb58af
7.14. Ultrix sed v4.3
fb58af

fb58af
   (1) If grouping is followed by an asterisk, Ultrix sed replies with
fb58af
   "command garbled", as shown in the following example:
fb58af

fb58af
       echo foo | sed 's/f\(NO-MATCH\)*/g\1/'
fb58af

fb58af
   (2) If grouping is followed by a numeric operator such as \{0,9\},
fb58af
   Ultrix sed does not find the match.
fb58af

fb58af
7.15. Digital Unix sed
fb58af

fb58af
   (1) The following comes from the man pages for sed distributed with
fb58af
   new, 1998 versions of Digital Unix (reformatted to fit our
fb58af
   margins):
fb58af

fb58af
   [Digital]  The h subcommand for sed does not work properly.  When
fb58af
   you use the  h subcommand to place text into the hold area, only
fb58af
   the last line of the specified text is saved.  You can use the H
fb58af
   subcommand to append text to the hold area. The H subcommand and
fb58af
   all others dealing with the hold area work correctly.
fb58af

fb58af
   (2) "$d" command issues an error message, "cannot parse".  Reported
fb58af
   by Carlos Duarte on 8 June 1998.
fb58af

fb58af
[end-of-file]