Advanced Piping Techniques in Unix/Linux

Named Pipes (FIFOs)

Named pipes (FIFOs) let unrelated processes communicate through a node in the file system: one process writes to the pipe while another reads from it.

Creating Named Pipes

mkfifo my_pipe
# Or with specific permissions
mkfifo -m 644 my_pipe

Using Named Pipes

Terminal 1:

cat > my_pipe

Terminal 2:

cat < my_pipe
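
Opening a FIFO blocks until the other end is attached, which is why two terminals are used above. In a single script, start the writer in the background; a minimal sketch (the pipe name and message are placeholders):

# Writer runs in the background so the blocking open doesn't deadlock the script
mkfifo /tmp/demo_pipe
printf 'hello through a fifo\n' > /tmp/demo_pipe &
cat < /tmp/demo_pipe
rm /tmp/demo_pipe  # FIFOs persist on disk until removed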

Process Substitution

Process substitution lets a command that expects a file name read from, or write to, another process instead.

Syntax

<(command)  # Substitutes a file from which the command's output can be read
>(command)  # Substitutes a file that feeds anything written to it into the command's input

Examples

  1. Compare outputs of two commands:
diff <(ls dir1) <(ls dir2)
  2. Multiple input sources:
cat <(ls) <(echo "---") <(date)
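
The >(command) form works in the other direction: the substituted file feeds whatever is written to it into the command. A sketch that filters stderr separately (some_command is a placeholder):

# stderr goes through its own pipeline while stdout flows normally
some_command 2> >(grep -i "error" > errors.log)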

Tee with Pipes

tee copies its standard input to one or more files while also passing it to standard output, letting a single pipeline write to multiple destinations simultaneously.

Basic Usage

command | tee file.txt | grep "pattern"

Advanced Tee Usage

  1. Append to files:
command | tee -a file1.txt file2.txt | less
  2. Write to privileged files (sudo applies to tee, not to a shell redirection):
command | sudo tee /etc/config > /dev/null
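
tee also combines with process substitution to feed several pipelines at once; a small sketch:

# One input stream, two consumers: a compressed copy and a running total
seq 1 100 | tee >(gzip > numbers.gz) | awk '{s+=$1} END{print s}'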

Pipeline Control

Using xargs

  1. Basic usage with confirmation:
find . -name "*.tmp" | xargs -p rm
  2. Custom delimiter (printf avoids echo's trailing newline, which would otherwise end up in the last filename):
printf "file1;file2;file3" | xargs -d ";" touch
  3. Parallel execution:
find . -type f | xargs -P 4 -I {} gzip {}
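
The examples above assume well-behaved filenames; names containing spaces or newlines will be split incorrectly. A safer NUL-delimited sketch:

# -print0 and -0 pass NUL-separated names, so whitespace in filenames is harmless
find . -name "*.tmp" -print0 | xargs -0 -p rm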

Using parallel

find . -name "*.jpg" | parallel convert {} {.}.png

Advanced Redirection

File Descriptors

# Redirect stderr to stdout
command 2>&1
 
# Redirect both stdout and stderr to a file (bash shorthand for > file.txt 2>&1)
command &> file.txt
 
# Redirect stderr to file, stdout to another file
command 2>error.log 1>output.log
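
exec can open additional file descriptors for the life of a script; a minimal sketch (audit.log is a placeholder):

# Open FD 3 for writing, reuse it across commands, then close it
exec 3> audit.log
echo "step one done" >&3
echo "step two done" >&3
exec 3>&-  # close FD 3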

Here Documents (heredoc)

cat << EOF > script.sh
#!/bin/bash
echo "Generated script"
date
EOF
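
Quoting the delimiter (<< 'EOF') suppresses variable and command expansion inside the document, which is usually what you want when generating a script:

cat << 'EOF' > script.sh
#!/bin/bash
echo "Running on $HOSTNAME"  # $HOSTNAME expands when script.sh runs, not here
EOF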

Here Strings

grep "pattern" <<< "input string"

Complex Pipeline Examples

  1. Monitor a log file and send an email for each error:
tail -f log.txt | grep --line-buffered "ERROR" | \
while read -r line; do
    echo "$line" | mail -s "Error Alert" admin@example.com
done
  2. Top five commands by total CPU usage:
ps aux | \
awk 'NR>1{proc[$11]+=$3} END{for (p in proc) print p,proc[p]}' | \
sort -k2 -rn | head -5
  3. Parallel data processing (one job per log file):
find . -type f -name "*.log" | \
parallel 'grep "ERROR" {} | \
cut -d" " -f1 | sort | uniq -c'

Best Practices

  1. Error Handling
  • Always consider both stdout and stderr
  • Use appropriate redirections based on needs
  • Implement proper error checking in scripts (see the pipefail sketch after this list)
  2. Performance Considerations
  • Use parallel processing when appropriate
  • Consider buffer sizes for large data
  • Monitor system resources
  3. Security
  • Be careful with privileged operations
  • Sanitize input when using xargs
  • Consider file permissions with named pipes
  4. Debugging
# Debug pipeline stages
command1 | tee debug1.log | \
command2 | tee debug2.log | \
command3 > final_output.log
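
Putting the error-handling advice into practice, a minimal bash sketch (file names are placeholders):

#!/bin/bash
set -o pipefail  # a pipeline now fails if any stage fails, not just the last
if ! grep "ERROR" app.log | sort | uniq -c > summary.txt; then
    # note: grep exits 1 when nothing matches, which pipefail counts as failure
    echo "log summary pipeline failed" >&2
    exit 1
fi

# PIPESTATUS records each stage's exit code after a pipeline runs
false | true
echo "${PIPESTATUS[@]}"  # prints: 1 0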

Common Pitfalls to Avoid

  1. Subshell Variables Pipeline operations run in subshells; variables set inside won’t be available outside:
# Won't work as expected
echo "data" | read var
echo $var  # Empty
 
# Correct approach
var=$(echo "data")
# or
read var <<< "data"
  2. Buffering Issues Some commands buffer their output differently when piped:
# Force line buffering
stdbuf -oL command | grep "pattern"
# or
command | grep --line-buffered "pattern"
  3. Race Conditions When using named pipes, be careful about reading/writing timing:
# Start the writer in the background so opening the pipe doesn't block the script
command1 > my_pipe &
command2 < my_pipe