NeuroAgent

Complete Guide: Check String Contains Substring in Bash

Learn the most efficient methods to check if a string contains a substring in Bash. Compare built-in operators, case statements, grep, and regex approaches with performance benchmarks and best practices.

Question

How to check if a string contains a substring in Bash

I have a string in Bash:

bash
string="My string"

How can I test if it contains another string? I tried using:

bash
if [ $string ?? 'foo' ]; then
  echo "It's there!"
fi

Where ?? is my unknown operator. Is using echo and grep the correct approach?

bash
if echo "$string" | grep 'foo'; then
  echo "It's there!"
fi

That seems a bit clumsy. What’s the proper way to check for substrings in Bash?

NeuroAgent

The most efficient and common way to check if a string contains a substring in Bash is using the [[ operator with wildcard patterns. For your example, you would write:

bash
string="My string"
if [[ $string == *"foo"* ]]; then
  echo "It's there!"
fi

This method is built into Bash, doesn’t require external commands, and is more performant than using grep.



Built-in Bash Methods

Using the [[ Operator with Wildcards

The most straightforward approach uses Bash’s built-in pattern matching with the [[ operator:

bash
string="My string"
substring="foo"

if [[ $string == *"$substring"* ]]; then
    echo "Substring found!"
else
    echo "Substring not found"
fi

Key points:

  • The * characters act as wildcards matching any sequence of characters
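  • Quote the substring variable so that any glob characters inside it (*, ?, [) are matched literally
  • [[ ... ]] is a Bash conditional construct, not a POSIX feature; the variable on its left-hand side does not undergo word splitting or pathname expansion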
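To illustrate why the quoting in the key points matters, here is a minimal sketch. The helper name contains and the sample values are assumptions made for illustration, not part of the original question.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if $1 contains $2 as a literal substring.
contains() {
    [[ $1 == *"$2"* ]]
}

string="My string"

contains "$string" "str" && echo "literal 'str' found"
contains "$string" "s*g" || echo "literal 's*g' not found"

# Unquoted, the same text is treated as a glob pattern and matches "string".
substring="s*g"
[[ $string == *$substring* ]] && echo "pattern s*g matched"
```

Wrapping the test in a function keeps the quoting rule in one place, so callers cannot forget it.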

Using case Statement

The case statement provides an efficient alternative:

bash
string="My string"
substring="foo"

case $string in
    *"$substring"*)
        echo "Substring found!"
        ;;
    *)
        echo "Substring not found"
        ;;
esac

As noted in the Stack Overflow discussion, the case method can be more efficient than the [[ construct for substring matching. Unlike [[, it also works in plain POSIX sh.

Using Regular Expressions

Bash supports regular expression matching with the =~ operator:

bash
string="My string"
substring="foo"

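# =~ matches anywhere in the string, so no leading or trailing .* is needed.
# Quoting "$substring" makes it match literally instead of as a regex.
if [[ $string =~ "$substring" ]]; then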
    echo "Substring found!"
fi
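The difference between a quoted and an unquoted right-hand side of =~ is easy to miss, so here is a small sketch. The sample string and the pattern variable are illustrative assumptions.

```bash
#!/usr/bin/env bash
string="version 1.2.3"

# Quoted right-hand side: every character is literal, so "1.2" must appear as-is.
[[ $string =~ "1.2" ]] && echo "literal 1.2 found"

# Unquoted pattern held in a variable: interpreted as an extended regex.
pattern='[0-9]+\.[0-9]+'
if [[ $string =~ $pattern ]]; then
    # BASH_REMATCH[0] holds the text matched by the whole expression.
    echo "first match: ${BASH_REMATCH[0]}"
fi

# The dot is a regex wildcard only when unquoted.
[[ "1x2" =~ 1.2 ]]   && echo "regex 1.2 matches 1x2"
[[ "1x2" =~ "1.2" ]] || echo "literal 1.2 does not match 1x2"
```

Storing the regex in a variable and expanding it unquoted avoids most escaping surprises inside [[ ... ]].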

Using External Commands

grep Method

While your grep approach works, it can be optimized:

bash
string="My string"
substring="foo"

# -F matches the substring literally rather than as a regex; -- ends option parsing
if printf '%s\n' "$string" | grep -qF -- "$substring"; then
    echo "Substring found!"
fi

The -q option suppresses grep's output, so only its exit status is used. However, as Linuxize notes, this method is less efficient because it requires spawning a new process.

Here-String Alternative

You can use a here-string (<<<) instead of piping from echo:

bash
# -F matches the substring literally; -- ends option parsing
if grep -qF -- "$substring" <<< "$string"; then
    echo "Substring found!"
fi
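One pitfall with grep is that it treats the substring as a regular expression by default. The sketch below contrasts that default with the literal -F mode. The sample values are assumptions chosen to expose the difference.

```bash
#!/usr/bin/env bash
string="abc"
substring="a.c"   # contains a regex metacharacter

# Without -F the dot matches any character, so "abc" is reported as a match.
if grep -q -- "$substring" <<< "$string"; then
    echo "regex match"
fi

# With -F the substring is compared literally, so there is no match.
if ! grep -qF -- "$substring" <<< "$string"; then
    echo "no literal match"
fi
```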

Advanced Pattern Matching

Multiple Substrings

You can check for multiple substrings using logical operators:

bash
string="Hello world"
if [[ $string == *"Hello"* && $string == *"world"* ]]; then
    echo "Both substrings found!"
fi
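When the list of required substrings grows, chaining && becomes unwieldy. One possible approach is a loop over the candidates. The helper name contains_all and the required array are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if $1 contains every remaining argument.
contains_all() {
    local haystack=$1
    shift
    local needle
    for needle in "$@"; do
        [[ $haystack == *"$needle"* ]] || return 1
    done
    return 0
}

string="Hello world"
required=("Hello" "world")

if contains_all "$string" "${required[@]}"; then
    echo "All substrings found!"
fi
```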

Pattern Variations

Different wildcard patterns for various matching needs:

bash
# Starts with substring
if [[ $string == "foo"* ]]; then
    echo "String starts with 'foo'"
fi

# Ends with substring
if [[ $string == *"foo" ]]; then
    echo "String ends with 'foo'"
fi

# Anywhere in the string (=~ is unanchored, so no wildcards are needed)
if [[ $string =~ foo ]]; then
    echo "Contains 'foo'"
fi
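If you also need to know where the substring occurs, parameter expansion can recover the zero-based index of the first occurrence. This is a sketch with assumed sample values; the variable name prefix is illustrative.

```bash
#!/usr/bin/env bash
string="My string"
substring="str"

if [[ $string == *"$substring"* ]]; then
    # Strip the first occurrence of the substring and everything after it;
    # what remains is the text before the match.
    prefix=${string%%"$substring"*}
    echo "Found at index ${#prefix}"
else
    echo "Not found"
fi
```

The containment check comes first because, without a match, the expansion leaves the string unchanged and its length would be misread as an index.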

Case-Insensitive Matching

Using grep with Case-Insensitive Flag

bash
string="Hello World"
substring="hello"

# -i ignores case; -F matches the substring literally; -- ends option parsing
if printf '%s\n' "$string" | grep -qiF -- "$substring"; then
    echo "Substring found (case-insensitive)!"
fi

Using shopt for Case-Insensitive Bash Matching

bash
shopt -s nocasematch
string="Hello World"
substring="hello"

if [[ $string == *"$substring"* ]]; then
    echo "Substring found (case-insensitive)!"
fi
shopt -u nocasematch  # Turn off case-insensitive matching
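An alternative that avoids toggling a shell option is to lowercase both operands before comparing. This sketch assumes Bash 4.0 or newer, since the ${var,,} expansion is not available in older versions such as the Bash 3.2 shipped with macOS.

```bash
#!/usr/bin/env bash
# Requires Bash 4.0 or newer for the ${var,,} lowercase expansion.
string="Hello World"
substring="hello"

if [[ ${string,,} == *"${substring,,}"* ]]; then
    echo "Substring found (case-insensitive)!"
fi
```

Because no shell option is changed, later comparisons in the script keep their normal case-sensitive behavior.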

Performance Comparison

Based on the sources cited in this answer, the methods rank roughly as follows, from fastest to slowest:

  1. case statement - Most efficient, especially for large strings
  2. [[ operator with wildcards - Good performance, very readable
  3. Regular expressions (=~) - More powerful but slower than pattern matching
  4. grep method - Least efficient due to process creation overhead

As mentioned in the Stack Overflow answer, a 2023 update confirmed that the case method is significantly more efficient than parameter expansion approaches.
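Rankings like these depend on the Bash version, the string length, and the machine, so it is worth measuring on your own system. The following is a rough timing sketch, not a rigorous benchmark. The function names and the default iteration count are assumptions.

```bash
#!/usr/bin/env bash
string="My string"
substring="str"
iterations=${1:-500}   # assumed default; raise it for steadier numbers

via_bracket() { [[ $string == *"$substring"* ]]; }
via_case()    { case $string in *"$substring"*) return 0 ;; *) return 1 ;; esac; }
via_regex()   { [[ $string =~ "$substring" ]]; }
via_grep()    { grep -qF -- "$substring" <<< "$string"; }

TIMEFORMAT='%R s'   # print only elapsed wall-clock seconds
for method in via_bracket via_case via_regex via_grep; do
    printf '%-12s ' "$method"
    time for ((i = 0; i < iterations; i++)); do "$method"; done
done
```

The timings are written to standard error by the time keyword. Expect the grep variant to dominate because every iteration starts a new process.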


Best Practices

Always Quote Variables

bash
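# Good: the quoted substring is matched literally
if [[ $string == *"$substring"* ]]; then
    echo "Substring found!"
fi

# Risky: unquoted, glob characters in $substring (*, ?, [) act as wildcards
if [[ $string == *$substring* ]]; then
    echo "Substring found!"
fi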

Handle Empty Strings

bash
if [[ -n "$string" && "$string" == *"foo"* ]]; then
    echo "Non-empty string contains 'foo'"
fi
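The empty case that tends to cause surprises is an empty substring rather than an empty string, because an empty substring matches every string. A minimal sketch, with assumed sample values:

```bash
#!/usr/bin/env bash
string="My string"
substring=""

# An empty substring makes the pattern "**", which matches any string.
if [[ $string == *"$substring"* ]]; then
    echo "Empty substring matches everything"
fi

# Guarded version: require a non-empty substring before testing.
if [[ -n $substring && $string == *"$substring"* ]]; then
    echo "Substring found!"
else
    echo "Substring empty or not found"
fi
```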

Use [[ Instead of [

The [[ operator is more powerful and safer than the older [ test command:

bash
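# Better: [[ performs pattern matching on the right-hand side
if [[ $string == *"foo"* ]]; then
    echo "Substring found!"
fi

# Older style, more limited: [ compares strings literally, and the shell may
# expand the unquoted *foo* to matching file names before [ even runs
if [ "$string" = *foo* ]; then  # This won't work with wildcards!
    echo "Substring found!"
fi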

Consider Portability

If your script needs to work in environments without Bash (like minimal Docker containers), consider POSIX-compatible methods such as the case statement or grep.
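For such environments, the case-based check can be wrapped in a function that uses only POSIX sh features. The helper name contains is hypothetical.

```bash
#!/bin/sh
# Hypothetical helper: succeeds if $1 contains $2; uses only POSIX sh features.
contains() {
    case $1 in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

string="My string"
if contains "$string" "str"; then
    echo "Substring found!"
fi
```

This runs unchanged under Bash, dash, and BusyBox ash, and needs no external process.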

Error Handling

For production scripts, add error handling:

bash
#!/bin/bash

string="My string"
substring="foo"

if [[ -z "$string" ]]; then
    echo "Error: String is empty" >&2
    exit 1
fi

if [[ "$string" == *"$substring"* ]]; then
    echo "Substring '$substring' found in string"
else
    echo "Substring '$substring' not found"
fi
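The same checks can be packaged as a function that reports problems through its exit status, which is easier to reuse than a top-level script. This is one possible sketch. The name check_substring and the status convention (0 found, 1 not found, 2 invalid input) are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper. Exit status: 0 = found, 1 = not found, 2 = invalid input.
check_substring() {
    if (( $# != 2 )); then
        echo "Usage: check_substring STRING SUBSTRING" >&2
        return 2
    fi
    if [[ -z $2 ]]; then
        echo "Error: substring is empty" >&2
        return 2
    fi
    [[ $1 == *"$2"* ]]
}

status=0
check_substring "My string" "foo" || status=$?
case $status in
    0) echo "found" ;;
    1) echo "not found" ;;
    *) echo "invalid input" ;;
esac
```

Separating "not found" from "invalid input" lets callers distinguish a negative result from a programming mistake.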

Conclusion

  1. For most cases, use the [[ $string == *"substring"* ]] method - it’s readable, efficient, and built into Bash.

  2. For performance-critical applications, consider the case statement approach which has been shown to be more efficient.

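  3. Avoid the grep method for plain substring checks, since it spawns an extra process. Reserve it for cases where you need grep's own case-insensitive or regular-expression matching, for example in shells without [[.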

  4. Always quote your variables to prevent word splitting and globbing issues.

  5. Remember that not all environments have Bash - if writing portable scripts, test your chosen method in the target environment.

The proper way to check for substrings in Bash depends on your specific needs, but the [[ operator with wildcards provides the best balance of readability, performance, and compatibility for most use cases.