Advanced sed Commands: Building a Deeper Understanding
The Unix utility sed (stream editor) is a powerful tool for parsing and transforming text. While many users rely on a handful of simple substitutions, mastering advanced sed commands can transform your text processing capabilities. This guide goes beyond memorizing syntax—by understanding the underlying pattern of sed operations, you’ll be better equipped to adapt and combine commands for your unique tasks.
Below is a comprehensive walkthrough that incorporates various markdown elements such as headers, code blocks, bullet lists, and inline code to create an engaging and organized article.
Table of Contents
- Multi-line Pattern Matching
- Hold Space Manipulation
- Advanced Substitution with Back-references
- Conditional Execution and Branching
- Extended Regex and Character Classes
- Reading from Files and Writing to Pattern Space
- Advanced Deletion Operations
- Using Multiple Address Ranges
- Inserting and Appending Text with Labels
- Using the Quit Command
- The G Command and Combining Hold/Get Operations
- In-place Editing with the -i Flag
- Conclusion
Multi-line Pattern Matching
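sed processes text line by line by default. However, many real-world scenarios require working with multiple lines simultaneously. Two essential commands for this are N, which appends the next input line to the pattern space, and P, which prints the pattern space up to its first newline.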
- Joining Lines: To join a line containing a specific pattern with the next line, you can use:
    sed '/pattern1/{N;s/\n/ /;}'
  Here, N appends the next line to the pattern space, and the substitution command (s/\n/ /) replaces the embedded newline with a space. The ; before the closing brace keeps the command acceptable to BSD sed as well as GNU sed.
- Printing a Range of Lines: To print only the lines between two markers (e.g., “start” and “end”), markers included, use:
    sed -n '/start/,/end/p'
  The -n flag suppresses automatic printing, so only the lines in the specified range are output.
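As a minimal sketch of the two commands above, the following runs them against a small made-up input. The marker words (key, start, end) and the variable names are assumptions chosen for illustration only.

```shell
# Sample input; the marker words are invented for this sketch.
input='alpha
key: one
value
start
inside
end
omega'

# Join any line containing "key" with the line that follows it.
joined=$(printf '%s\n' "$input" | sed '/key/{N;s/\n/ /;}')
printf '%s\n' "$joined"

# Print only the block from "start" through "end", markers included.
block=$(printf '%s\n' "$input" | sed -n '/start/,/end/p')
printf '%s\n' "$block"
```

The first command turns "key: one" and "value" into a single line and leaves everything else alone; the second prints just the three lines of the marked block.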
Hold Space Manipulation
sed’s hold space is a secondary buffer that can store text temporarily. Mastering commands like h (copy to hold space), g (get from hold space), and x (swap pattern and hold spaces) lets you perform complex text manipulations.
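- Swapping Lines: A simple swap of the current line with the hold space:
    sed 'x'
  Because the hold space starts out empty, this delays the output by one line: the first line printed is blank, and the last input line stays in the hold space and is never printed.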
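- Advanced Hold Operations: Copy the current line to hold space, append the next line, and then retrieve the combined result:
    sed 'h;n;H;g'
  Note that n prints the current line before reading the next one (unless -n is given), so the first line of each pair appears on its own and then again as part of the combined pair.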
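- Appending to Each Line: To save the first line and append it to every subsequent line:
    sed '1h;1!{G;s/\n/ /;}'
  Line 1 is copied to the hold space with h. On every later line, G appends the saved line after a newline, and the substitution turns that newline into a space.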
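To get a feel for how the two buffers interact, here is a small sketch on a three-line input. The reversal one-liner is a widely used idiom rather than something taken from this article, and the variable names are illustrative.

```shell
# x alone: each line is swapped with whatever the hold space contains,
# so output lags one line behind and the last line is never printed.
delayed=$(printf '%s\n' a b c | sed 'x')

# Reverse the line order: on every line but the first, G appends the
# accumulated hold space; h saves the result; $p prints it at the end.
reversed=$(printf '%s\n' a b c | sed -n '1!G;h;$p')

printf '[%s]\n' "$delayed" "$reversed"
```

The first result starts with an empty line followed by a and b; the second is c, b, a.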
Advanced Substitution with Back-references
Substitution is at the heart of sed. Incorporating capture groups and back-references makes your editing highly dynamic.
- Using Capture Groups: Capture portions of a line and rearrange them using back-references:
    sed 's/\(foo\)\(.*\)\(bar\)/\3\2\1/'
  This command swaps the positions of foo and bar while keeping the text between them. With only two groups, as in s/\(foo\).*\(bar\)/\2 \1/, the text between the two words is discarded.
- Multiple Substitutions: Chain substitutions in a single sed command:
    sed 's/old/new/g;s/this/that/g'
- Case Conversion: GNU sed supports case conversion in the replacement text through the \L and \U escapes (an extension that other implementations may lack):
    sed 's/.*/\L&/'    # Convert matched text to lowercase
    sed 's/.*/\U&/'    # Convert matched text to uppercase
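The sketch below shows back-references on two made-up inputs, using only basic regular expressions. The sample strings and variable names are assumptions for illustration.

```shell
# Reorder "Last, First" into "First Last" with two capture groups.
name=$(printf '%s\n' 'Doe, Jane' | sed 's/^\([^,]*\), *\(.*\)$/\2 \1/')

# & stands for the whole match: wrap every run of digits in angle brackets.
tagged=$(printf '%s\n' 'room 12 floor 3' | sed 's/[0-9][0-9]*/<&>/g')

printf '%s\n' "$name" "$tagged"
```

The first substitution yields "Jane Doe"; the second yields "room <12> floor <3>".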
Conditional Execution and Branching
sed allows conditional execution of commands based on pattern matches. This helps tailor your commands to only run when certain conditions are met.
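- Skipping Commands: To skip the rest of the script for lines that match a pattern, branch to the end of the script with a bare b:
    sed -e '/pattern/b' -e 's/old/new/'
  Note that n does something different: it prints the current line and reads the next one, so sed '/pattern/{n;n;}' passes the matched line and the line after it through unchanged and continues the script on the line after that.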
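- Branching with Labels: Use labels to jump within your script. A label is defined with :name, b name jumps to it unconditionally, and t name jumps only if a substitution has succeeded since the last input line was read or the last t was taken. A loop needs an exit condition: sed ':label /pattern/b label' never terminates once a line matches, because nothing inside the loop changes the pattern space. The following loop collapses runs of hyphens and stops when no double hyphen remains:
    sed -e ':again' -e 's/--/-/' -e 't again'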
This control flow is especially useful for processing complex, multi-step transformations.
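As a rough illustration of both kinds of branch, the sketch below uses b to skip a substitution on comment lines and t to repeat a substitution until it stops matching. The label name again and the sample inputs are made up for this example.

```shell
# Lines starting with "#" branch to the end of the script, so the
# substitution is never applied to them.
skipped=$(printf '%s\n' '# old comment' 'old value' |
  sed -e '/^#/b' -e 's/old/new/')

# Loop: replace "--" with "-" and jump back to :again while the
# substitution keeps succeeding; the loop ends on the first failure.
collapsed=$(printf '%s\n' 'a----b' |
  sed -e ':again' -e 's/--/-/' -e 't again')

printf '%s\n' "$skipped" "$collapsed"
```

Giving each piece its own -e keeps the label definition unambiguous across sed implementations, since some treat everything after the colon up to the end of the line as the label name.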
Extended Regex and Character Classes
Extended regular expressions (regex) and character classes allow you to define patterns more succinctly and powerfully.
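- Using the Extended Regex Flag: Activate extended regex with the -E flag:
    sed -E 's/[0-9]+\s[a-z]+/replacement/'
  With -E, operators such as + and ( ) need no backslashes. The \s shorthand is a GNU extension; [[:space:]] is the portable equivalent.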
- Character Classes: For broader matching, use POSIX character classes:
    sed 's/[[:alpha:]]/X/g'
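A short sketch of both features on made-up input follows; it sticks to POSIX character classes so that it does not depend on GNU-only shorthands. The sample strings and variable names are illustrative.

```shell
# Extended regex: unescaped groups and +, with POSIX classes.
swapped=$(printf '%s\n' '42 apples' |
  sed -E 's/([0-9]+)[[:space:]]+([[:alpha:]]+)/\2=\1/')

# Basic regex with a character class: mask every digit.
masked=$(printf '%s\n' 'tel 555-0100' | sed 's/[[:digit:]]/#/g')

printf '%s\n' "$swapped" "$masked"
```

The first command produces "apples=42" and the second "tel ###-####".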
Reading from Files and Writing to Pattern Space
Beyond in-line text manipulation, sed can read from and write to external files.
- Reading File Content: To insert the contents of another file after each line that matches a pattern:
    sed '/pattern/r input.txt'
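- Writing to a File: Write each matching line (the current pattern space) to an output file:
    sed -n '/pattern/w output.txt'
  The file name runs to the end of the line, so w should be the last command in its -e expression.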
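The following sketch exercises r and w in a throwaway directory. The file names (snippet.txt, main.txt, matches.txt) and the MARK marker are hypothetical and exist only for this example.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'inserted line' > "$workdir/snippet.txt"
printf '%s\n' 'top' 'MARK' 'bottom' > "$workdir/main.txt"

# r: queue snippet.txt for output after each line matching MARK.
merged=$(sed "/MARK/r $workdir/snippet.txt" "$workdir/main.txt")

# w: copy lines containing "o" to matches.txt; -n keeps stdout quiet.
sed -n "/o/w $workdir/matches.txt" "$workdir/main.txt"
matches=$(cat "$workdir/matches.txt")

printf '%s\n' "$merged" "$matches"
rm -rf "$workdir"
```

Double quotes are used around the sed scripts so the shell expands $workdir; the scripts themselves contain nothing else the shell would interpret.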
Advanced Deletion Operations
Selective deletion can streamline your text processing, whether you need to remove ranges of lines or every nth line.
- Deleting a Range: Remove all lines between two patterns (inclusive):
    sed '/start/,/end/d'
- Deleting Every nth Line: Delete every other line or every third line:
    sed 'n;d'      # Delete every other line
    sed 'n;n;d'    # Delete every third line
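To check which lines actually survive, the sketch below runs the deletion commands on numbered sample lines. The input and variable names are made up for illustration.

```shell
numbers=$(printf '%s\n' 1 2 3 4 5 6)

# n prints the current line and loads the next; d then deletes that one.
every_other=$(printf '%s\n' "$numbers" | sed 'n;d')
every_third=$(printf '%s\n' "$numbers" | sed 'n;n;d')

# Delete everything from the "start" line through the "end" line.
trimmed=$(printf '%s\n' keep start drop end keep | sed '/start/,/end/d')

printf '%s\n' "$every_other" "$every_third" "$trimmed"
```

So 'n;d' keeps the odd-numbered lines (1, 3, 5), and 'n;n;d' removes lines 3 and 6.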
Using Multiple Address Ranges
Apply commands only to specified portions of your file with multiple address ranges.
- Combining Operations: For example, delete lines matching a pattern only within lines 2 to 5, and substitute text between lines 7 and 10:
    sed -e '2,5{/pattern/d}' \
        -e '7,10{s/old/new/}'
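The same idea can be sketched on a six-line sample with smaller ranges, so the effect of each address is easy to see. The line contents and ranges here are assumptions for illustration, not the ranges used above.

```shell
input=$(printf '%s\n' 'x1 old' 'a2 old' 'x3 old' 'a4 old' 'a5 old' 'a6 old')

# Lines 1-2: delete lines containing "x".
# Lines 4-5: replace "old" with "new".
result=$(printf '%s\n' "$input" |
  sed -e '1,2{/x/d;}' -e '4,5{s/old/new/;}')

printf '%s\n' "$result"
```

Line 3 also contains "x" but lies outside the first range, so it is kept; line 6 lies outside the second range, so its "old" is untouched.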
Inserting and Appending Text with Labels
sed is not just about deletion and substitution—it can also insert or append text based on pattern matching.
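- Inserting Text: Insert text before a matching line. In the portable form, the text starts on the line after i\:
    sed '/pattern/i\
    New text here'
  GNU sed also accepts the one-line form sed '/pattern/i New text here'.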
- Appending Text: Append text after a matching line. As with i, the portable form puts the text on the line after a\:
    sed '/pattern/a\
    New text here'
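Here is a small sketch that frames a marker line with inserted and appended text, written in the two-line form so that it should run on both GNU and BSD sed. The MARK marker and the inserted strings are invented for this example.

```shell
# Each -e argument holds one command; the text to add starts on the
# line after the trailing backslash and ends with the argument.
framed=$(printf '%s\n' 'top' 'MARK' 'bottom' |
  sed -e '/MARK/i\
-- before --' -e '/MARK/a\
-- after --')

printf '%s\n' "$framed"
```

The inserted text is written out just before the matching line, while the appended text is queued and written once the matching line has been printed.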
Using the Quit Command
Sometimes you only need to process part of a file. The q (quit) command allows for early termination based on a line count or pattern.
- Quit After a Set Number of Lines:
    sed '10q'
- Quit Upon a Pattern Match:
    sed '/pattern/q'
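A brief sketch of q on made-up input follows. The third command is a variation not shown above: combined with -n, it stops before the marker line instead of printing it. The STOP marker and variable names are illustrative.

```shell
# Stop after three lines (similar in effect to head -n 3).
first_three=$(printf '%s\n' a b c d e | sed '3q')

# Stop at the first line matching STOP; that line is still printed.
through_stop=$(printf '%s\n' a b STOP d | sed '/STOP/q')

# With -n, q exits without printing, so the STOP line is left out.
before_stop=$(printf '%s\n' a b STOP d | sed -n -e '/STOP/q' -e 'p')

printf '%s\n' "$first_three" "$through_stop" "$before_stop"
```

Because sed exits as soon as q runs, the rest of the input is never read, which can save time on large files.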
The G Command and Combining Hold/Get Operations
The G command appends the hold space to the pattern space, separated by a newline. This is useful for combining original and modified lines.
- Appending a Blank Line: When the hold space is empty, G appends just a newline, which shows up as a blank line:
    sed 'G' file.txt            # Double-space the file
    sed '/pattern/G' file.txt   # Add a blank line after matching lines
    sed 'n;G' file.txt          # Add a blank line after every second line
- Combining Modifications: For example, replace a string and then print both versions of the line, the modified one first and the original below it:
    sed 'h;s/foo/bar/;G' file.txt
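The sketch below shows G with an empty hold space and with a saved copy of the line, reading from a pipe instead of file.txt. The sample text and variable names are assumptions for illustration.

```shell
# Empty hold space: G appends a bare newline, giving a blank line.
doubled=$(printf '%s\n' a b | sed 'G')

# h saves the original, s modifies the pattern space, G appends the
# saved original underneath the modified line.
both=$(printf '%s\n' 'foo one' | sed 'h;s/foo/bar/;G')

printf '%s\n' "$doubled" "$both"
```

Note that the shell's command substitution strips trailing newlines, so the blank line after the final b is not visible in the captured value.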
In-place Editing with the -i Flag
When you’re confident in your sed script, the -i flag lets you apply changes directly to your files. Use this option with caution.
- Basic In-place Substitution:
    sed -i 's/old/new/g' file.txt
- Creating Backups: Always create a backup when performing in-place edits:
    sed -i.bak 's/old/new/g' file.txt
  On macOS/BSD systems, -i always takes a backup-suffix argument; pass an empty string to skip the backup:
    sed -i '' 's/old/new/g' file.txt      # No backup
    sed -i '.bak' 's/old/new/g' file.txt  # With backup
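- Advanced In-place Modification: A powerful combination is keeping a commented-out copy of each line directly above the line itself:
    sed -i.bak 'h;s/^/# /;G' file.txt
  As written, the line below the comment is unchanged. To place the commented original above a modified version, apply the substitution first and swap the buffers before commenting:
    sed -i.bak 'h;s/old/new/;x;s/^/# /;G' file.txt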
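In-place edits are easiest to try in a scratch directory first. The sketch below uses the attached-suffix form -i.bak, which GNU sed accepts and which BSD sed should accept as well; the file name and contents are made up for this example.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'old one' 'old two' > "$workdir/file.txt"

# Edit the file in place and keep the original as file.txt.bak.
sed -i.bak 's/old/new/' "$workdir/file.txt"

edited=$(cat "$workdir/file.txt")
backup=$(cat "$workdir/file.txt.bak")

printf '%s\n' "$edited" "$backup"
rm -rf "$workdir"
```

Comparing the edited file with the .bak copy (for instance with diff) before deleting the backup is a cheap way to confirm the script did only what was intended.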
Conclusion
Advanced sed commands open up a world of text manipulation possibilities that go far beyond simple search-and-replace. By understanding the patterns behind sed’s operations—whether it’s handling multi-line patterns, manipulating hold space, or executing conditional logic—you can create scripts that are both efficient and adaptable.
Remember, the goal is not just to memorize commands but to build an intuition for how sed processes text. Experiment with these techniques on sample files, and soon you’ll be combining commands seamlessly to meet your unique text-processing challenges.
Embrace the power of sed and transform your approach to text editing and automation. Happy scripting!