GLM 4.5 Air

POSIX ex: How to Ignore Unmatched Patterns in Substitutes

Learn how to make POSIX ex portably ignore unmatched patterns in substitute commands without using Vim's 'e' flag. Complete guide with find command examples and best practices.

Question

Is it possible to make POSIX ex portably ignore unmatched patterns in its substitute command, given the following restrictions?

  1. Vim’s ‘e’ flag is an extension and not required by POSIX
  2. POSIX states that implementations may support more than a single -c option, so implementations providing two or more -c options cannot be relied upon
  3. sed -i should never be used to edit files

Background: POSIX.1-2017 specifies two line-oriented file editing utilities: ed and ex. ex is more efficient for noninteractive batch-editing because POSIX requires it to accept one or more file operands, while ed is only required to accept one. This means editing with ex avoids forking when used in find’s exec primary.

For example, compare:
find . -type f -exec sh -c 'ed -s "$1" <<EOF
s/RE/replacement
w
q
EOF' sh {} \;

with:
find . -type f -exec ex -sc 's/pattern/repl|x' {} +

Both ed and ex implementations behave such that if a search for a BRE fails to find a matching line, it’s considered an error. In a command like ex -sc s/pattern1/repl1|s/pattern2/repl2|x file, if pattern1 is not matched in file, the second substitute command is never attempted.

While Unix Power Tools suggests a workaround for ed using global commands, this approach doesn’t work for ex. Is there a portable solution that works within POSIX constraints?

GLM 4.5 Air

Brief Answer

Yes. The global (g) command makes POSIX ex skip unmatched patterns: g/pattern/s//replacement/ attempts the substitution only on lines that match, so the substitute itself can never fail. The commands must be separated by newlines rather than by |, because a | that follows a global command belongs to that global’s command list; the simplest way to do that is to send the script to ex -s on standard input. This uses only features specified by POSIX and no implementation-specific extensions.


Understanding the Problem with POSIX ex and Unmatched Patterns

POSIX ex and ed treat unmatched patterns as errors, which causes them to stop processing further commands when a substitute operation fails to find a match. This behavior creates challenges when you want to apply multiple substitute operations where some patterns might not exist in all files.

In a command like:

ex -sc 's/pattern1/repl1|s/pattern2/repl2|x' file

If pattern1 is not found in file, the entire command sequence fails, and pattern2 is never attempted. This is problematic for batch processing where files may have varying content.

The constraints add further limitations:

  • Cannot use Vim’s ‘e’ flag (which silently ignores pattern errors)
  • Cannot rely on implementations with multiple -c options
  • Should avoid sed -i for file editing

The Global Command Solution

The most portable solution within POSIX constraints is to use the global command (g) syntax:

g/pattern1/s//repl1/
g/pattern2/s//repl2/
x

This approach works because:

  1. g/pattern/ marks every line that matches the pattern
  2. s//repl/ then runs once on each marked line; the empty pattern reuses the regular expression given to g
  3. If no line matches, nothing is marked and the substitute command is never run, so it cannot fail
  4. Later commands in the script run whether or not any match was found

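The commands must stay on separate lines. POSIX makes a vertical bar (|) that follows a global command part of that global’s command list, so in g/pattern1/s//repl1/|g/pattern2/s//repl2/|x the second g and the x would run once for each line matching pattern1, and not at all when nothing matches. Sending the script to ex -s on standard input keeps each command on its own line:

printf '%s\n' 'g/pattern1/s//repl1/' 'g/pattern2/s//repl2/' x | ex -s file

This performs multiple substitutions while ignoring unmatched patterns, using only features that POSIX specifies.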
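The behaviour described above can be tried on a scratch file. This is a minimal sketch, assuming an ex implementation is on PATH; the file name demo.txt, the sample text, and the variable names are invented for illustration, and mktemp is assumed to be available although POSIX.1-2017 does not require it. The commands are fed one per line on standard input so that each g command ends at a newline.

```shell
#!/bin/sh
# Illustrative demo only: the scratch directory and file name are hypothetical.
# mktemp is widely available but is not required by POSIX.1-2017.
tmp=$(mktemp -d) || exit 1
printf '%s\n' 'alpha pattern2 beta' 'gamma' > "$tmp/demo.txt"

if command -v ex >/dev/null 2>&1; then
    # pattern1 matches no line, so its g command marks nothing and runs nothing.
    # pattern2 matches line 1, so s//repl2/ runs there.
    # x writes the buffer if it was modified, then exits.
    printf '%s\n' 'g/pattern1/s//repl1/' 'g/pattern2/s//repl2/' 'x' |
        ex -s "$tmp/demo.txt"
    result=$(cat "$tmp/demo.txt")
    printf '%s\n' "$result"
else
    echo 'ex not found; skipping demo' >&2
fi
rm -rf "$tmp"
```

Swapping the order of the two g lines should make no difference to the result, since neither command depends on the other having matched.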


Implementing the Solution in find Commands

Applying this solution to the find command example transforms it from:

find . -type f -exec sh -c 'ed -s "$1" <<EOF
s/RE/replacement
w
q
EOF' sh {} \;

To an ex version that feeds the script to ex -s on standard input:

find . -type f -exec sh -c 'for f do
  printf "%s\n" "g/pattern1/s//repl1/" "g/pattern2/s//repl2/" x | ex -s "$f"
done' sh {} +

This version:

  • Uses the global command so that unmatched patterns are skipped instead of raising an error
  • Keeps each command on its own line, so no | ends up inside a global command list
  • Uses only POSIX-standard features
  • Starts one shell per batch of files instead of one per file

The + terminator at the end of the -exec action is what makes the last point possible: it passes as many file names as fit to a single sh invocation. ex is still started once per file, so the saving over the ed version is the per-file shell, not the per-file editor.
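To see the batch form at work on a small tree, the following sketch builds three scratch files and runs the find command over them. It assumes ex and mktemp are available; the directory layout, file names, and variables a, b, and c are hypothetical, and the script is passed to ex -s on standard input with one command per line.

```shell
#!/bin/sh
# Illustrative demo only: directory and file names are hypothetical.
tmp=$(mktemp -d) || exit 1
mkdir "$tmp/sub"
printf '%s\n' 'pattern1 here'     > "$tmp/a.txt"
printf '%s\n' 'pattern2 here'     > "$tmp/sub/b.txt"
printf '%s\n' 'nothing to change' > "$tmp/c.txt"

if command -v ex >/dev/null 2>&1; then
    # One sh per batch of files; inside it, one ex per file.
    find "$tmp" -type f -exec sh -c 'for f do
        printf "%s\n" "g/pattern1/s//repl1/" "g/pattern2/s//repl2/" x |
            ex -s "$f"
    done' sh {} +

    a=$(cat "$tmp/a.txt")
    b=$(cat "$tmp/sub/b.txt")
    c=$(cat "$tmp/c.txt")
    printf '%s\n' "$a" "$b" "$c"
else
    echo 'ex not found; skipping demo' >&2
fi
rm -rf "$tmp"
```

Each file contains at most one of the two patterns, so every run of the script hits at least one unmatched g command and still reaches x.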

Alternative Approaches and Their Limitations

Multiple ex Invocations

While possible, running a separate ex invocation for each substitution is less efficient:

find . -type f -exec sh -c 'for f do
  printf "%s\n" "g/pattern1/s//repl1/" x | ex -s "$f"
  printf "%s\n" "g/pattern2/s//repl2/" x | ex -s "$f"
done' sh {} +

This approach:

  • Uses only POSIX-standard features
  • Works across implementations
  • But starts ex, and reads and possibly rewrites each file, once per substitution
  • Gains nothing over a single script, since the global command already keeps one unmatched pattern from blocking the next

Search Addresses

ex has no conditional constructs, so it can be tempting to locate a match with a search address before substituting:

?pattern1?
s//repl1/
?pattern2?
s//repl2/

However, this approach:

  • Fails the same way as a bare substitute, because a search address that matches no line is itself an error
  • Changes only the one line the search lands on, not every matching line
  • Doesn’t provide the same level of reliability as the global command approach

Using the ‘v’ Command (Inverse Global)

The ‘v’ command marks the lines that don’t match a pattern:

v/pattern1/s//repl1/
v/pattern2/s//repl2/

This does not help here:

  • The substitute runs only on lines that lack the pattern, so s//repl1/ fails on every one of them
  • It is as portable as g, since POSIX specifies both, but it selects the wrong lines for this task
  • It doesn’t solve the original problem of handling unmatched patterns

Practical Examples and Best Practices

Basic Pattern Replacement

To replace all occurrences of “old” with “new” in multiple files:

find . -type f -exec sh -c 'for f do
  printf "%s\n" "g/old/s//new/g" x | ex -s "$f"
done' sh {} +

This uses:

  • g/old/ to select lines containing “old”
  • s//new/g to replace all occurrences of “old” with “new” on selected lines
  • x to write and quit

Multiple Pattern Replacements

To handle multiple patterns where some might not match:

find . -type f -exec sh -c 'for f do
  printf "%s\n" "g/error/s//ERROR/g" "g/warning/s//WARNING/g" "g/info/s//INFO/g" x | ex -s "$f"
done' sh {} +

This will:

  • Replace “error” with “ERROR” (only if found)
  • Replace “warning” with “WARNING” (only if found)
  • Replace “info” with “INFO” (only if found)
  • Continue processing regardless of which patterns were matched
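When the list of substitutions grows, one option is to keep the ex commands in a file and redirect that file into ex -s for each target. The sketch below assumes ex and mktemp are available; edits.ex, the logs directory, app.log, and the sample log lines are all hypothetical names chosen for the demo. The script path is passed to sh as its first argument so that it stays outside the tree find searches.

```shell
#!/bin/sh
# Illustrative demo only: all paths and sample data are hypothetical.
tmp=$(mktemp -d) || exit 1
mkdir "$tmp/logs"

# One ex command per line; each g command skips silently if nothing matches.
cat > "$tmp/edits.ex" <<'EOF'
g/error/s//ERROR/g
g/warning/s//WARNING/g
g/info/s//INFO/g
x
EOF

printf '%s\n' 'error: disk full' 'warning: low memory, warning repeated' \
    > "$tmp/logs/app.log"

if command -v ex >/dev/null 2>&1; then
    # $1 is the script file; the remaining arguments are the files to edit.
    find "$tmp/logs" -type f -exec sh -c 'script=$1; shift
        for f do ex -s "$f" < "$script"; done' sh "$tmp/edits.ex" {} +
    result=$(cat "$tmp/logs/app.log")
    printf '%s\n' "$result"
else
    echo 'ex not found; skipping demo' >&2
fi
rm -rf "$tmp"
```

The sample log has no line containing "info", so the third g command is the unmatched case here.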

Conditional Pattern Replacement

To replace patterns only if they appear at the beginning of a line:

find . -type f -exec sh -c 'for f do
  printf "%s\n" "g/^pattern1/s//repl1/" "g/^pattern2/s//repl2/" x | ex -s "$f"
done' sh {} +

This uses:

  • ^ in the pattern to match only at line beginnings
  • The global command approach to handle unmatched patterns

Best Practice: Always place more specific patterns before more general ones when using this approach, as ex processes commands sequentially and you might want to ensure more specific replacements happen before more general ones.

Combining with Other Commands

The global command approach works with any substitute feature, such as & for the matched text:

find . -type f -exec sh -c 'for f do
  printf "%s\n" "g/^#include/s//\/* & *\//" "g/^\/\/ /s//\/* & *\//" x | ex -s "$f"
done' sh {} +

This:

  • Wraps a leading #include in a comment, giving /* #include */ followed by the rest of the line
  • Wraps a leading // and the space after it in the same way
  • Only processes lines that match the patterns

Conclusion: The Most Portable Approach

The global command syntax (g/pattern/s//replacement/) provides the most reliable and portable solution for making POSIX ex ignore unmatched patterns in substitute commands. This approach:

  1. Uses only POSIX-standard features without relying on implementation-specific extensions
  2. Handles unmatched patterns gracefully by silently skipping them
  3. Works efficiently with multiple files through find’s + terminator
  4. Maintains compatibility across different implementations

For practical use in batch processing, the recommended pattern is:

find . -type f -exec sh -c 'for f do
  printf "%s\n" "g/pattern1/s//repl1/" "g/pattern2/s//repl2/" ... x | ex -s "$f"
done' sh {} +

This solution addresses all the constraints mentioned in the question while providing a robust way to perform multiple pattern substitutions across files, regardless of whether all patterns are matched in every file.