NeuroAgent

How to Convert Bytes to String in Python 3

Learn multiple methods to convert bytes to string in Python 3, including subprocess output. Discover .decode(), text=True parameter, and encoding handling techniques with practical examples.

#python-3-tutorial #python-bytes-conversion #python-encoding #string-decoding #subprocess-output #text-processing

10/21/2025, 04:54 AM

Question

How to convert bytes to string in Python 3

I’ve captured the standard output of an external program into a bytes object:

python

>>> from subprocess import *
>>> stdout = Popen(['ls', '-l'], stdout=PIPE).communicate()[0]
>>> stdout
b'total 0\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2\n'

I want to convert that to a normal Python string, so that I can print it like this:

python

>>> print(stdout)
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2

How do I convert the bytes object to a str with Python 3?

NeuroAgent

The most straightforward way to convert bytes to string in Python 3 is to use the .decode() method on your bytes object. For subprocess output specifically, you can either decode the bytes manually or use the text=True parameter (available in Python 3.7+) to get strings directly from the start.

Basic Bytes to String Conversion
Converting Subprocess Output
Advanced Decoding Techniques
Handling Encoding Issues
Best Practices
Complete Examples

Basic Bytes to String Conversion

In Python 3, the fundamental method to convert bytes to string is using the .decode() method. When you have a bytes object like the one from your subprocess output, you can convert it to a regular string by specifying an encoding:

python

>>> stdout = b'total 0\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1\n'
>>> string_output = stdout.decode('utf-8')
>>> print(string_output)
total 0
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1

The decode() method takes an optional encoding argument, which defaults to 'utf-8' in Python 3, so stdout.decode() and stdout.decode('utf-8') are equivalent. According to the Python 3.14.0 documentation, subprocess functions return data as encoded bytes by default, and the application needs to handle the decoding.

For your specific case:

python

>>> from subprocess import *
>>> stdout = Popen(['ls', '-l'], stdout=PIPE).communicate()[0]
>>> string_output = stdout.decode('utf-8')
>>> print(string_output)
total 0
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2
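A common stumbling block is reaching for str() instead of .decode(). The sketch below is a minimal illustration using a hand-built bytes object (the sample text is an assumption, not real ls output), showing that str(data) returns the printable representation of the bytes while str(data, encoding) and data.decode(encoding) actually decode them.

```python
# Build some bytes ourselves so the example does not depend on `ls`.
data = "total 0\ncaf\u00e9\n".encode("utf-8")

# decode() defaults to UTF-8, so these two calls are equivalent.
text_default = data.decode()
text_explicit = data.decode("utf-8")

# str(data) does NOT decode: it returns the repr of the bytes object,
# including the leading b and the escaped \n sequences.
as_repr = str(data)

# str(data, encoding) does decode, exactly like data.decode(encoding).
text_via_str = str(data, "utf-8")

print(as_repr)
```

If printed output shows a leading b'...' and literal \n sequences, the value was most likely passed through str() or printed as bytes rather than decoded.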

Converting Subprocess Output

When working with subprocess output, you have several approaches to get strings instead of bytes:

Method 1: Using `text=True` (Recommended for Python 3.7+)

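The modern approach is to pass text=True, an alias added in Python 3.7 for the older universal_newlines=True. The pipes are then opened in text mode, so the output is decoded for you, using the default encoding of io.TextIOWrapper (normally the locale's preferred encoding) unless you also pass encoding: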

python

>>> from subprocess import Popen, PIPE
>>> process = Popen(['ls', '-l'], stdout=PIPE, text=True)
>>> stdout = process.communicate()[0]
>>> print(stdout)  # stdout is already a string
total 0
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2

As explained on Stack Overflow, this approach is cleaner and avoids manual decoding.
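To see the difference side by side, here is a small sketch that runs the same command with and without text=True. The child command (the current Python interpreter printing one line) is a stand-in chosen so the example runs on any platform; it is not part of the original question.

```python
import subprocess
import sys

# Hypothetical child command: the current Python interpreter printing one line.
cmd = [sys.executable, "-c", "print('hello')"]

# Default: stdout is captured as bytes.
raw = subprocess.run(cmd, stdout=subprocess.PIPE).stdout

# text=True: stdout is decoded to str and line endings are normalized to '\n'.
decoded = subprocess.run(cmd, stdout=subprocess.PIPE, text=True).stdout

print(type(raw).__name__, type(decoded).__name__)
```

The bytes version keeps whatever line endings the platform produced, which is why the text version is usually easier to work with.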

Method 2: Using `encoding` Parameter

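You can also specify the encoding directly. Passing encoding (available since Python 3.6) switches the pipes to text mode on its own, so text=True is not needed: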

python

>>> process = Popen(['ls', '-l'], stdout=PIPE, encoding='utf-8')
>>> stdout = process.communicate()[0]
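The encoding you pass has to match what the program actually wrote. As a rough illustration, the sketch below uses a hypothetical child process that emits the UTF-8 bytes for "café" and captures it twice; the capture helper is an illustrative name, not a subprocess API.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child: writes the UTF-8 bytes for 'café' straight to stdout.
child = "import sys; sys.stdout.buffer.write('caf\\u00e9'.encode('utf-8'))"
cmd = [sys.executable, "-c", child]

def capture(encoding):
    """Run the child and decode its stdout with the given encoding."""
    with Popen(cmd, stdout=PIPE, encoding=encoding) as process:
        return process.communicate()[0]

right = capture("utf-8")    # matches what the child actually wrote
wrong = capture("latin-1")  # decodes without error, but yields mojibake
```

Note that the wrong encoding raises no exception here: latin-1 maps every byte to some character, so the result is silently garbled.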

Method 3: Using `subprocess.run()` (Modern API)

subprocess.run(), added in Python 3.5, is the recommended high-level API. The capture_output argument used below was added in Python 3.7:

python

>>> import subprocess
>>> result = subprocess.run(['ls', '-l'], capture_output=True, text=True)
>>> print(result.stdout)
total 0
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2
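subprocess.run() returns a CompletedProcess object, so the decoded output, the error stream, and the exit status are all available in one place. A minimal sketch, assuming a made-up child script that writes to both streams and exits with status 3:

```python
import subprocess
import sys

# Hypothetical child: writes to both streams, then exits with status 3.
child = (
    "import sys; "
    "print('to stdout'); "
    "print('to stderr', file=sys.stderr); "
    "sys.exit(3)"
)

result = subprocess.run(
    [sys.executable, "-c", child],
    capture_output=True,  # shorthand for stdout=PIPE, stderr=PIPE (3.7+)
    encoding="utf-8",     # decode both streams as UTF-8
)

print(result.returncode)
print(result.stdout, end="")
print(result.stderr, end="")
```

Without check=True, a non-zero exit status does not raise, so it is worth inspecting returncode before trusting stdout.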

Advanced Decoding Techniques

Handling Different Encodings

Sometimes subprocess output might not be in UTF-8. You can specify different encodings:

python

import subprocess

# For Windows command output (often cp437 or cp1252); 'dir' is a cmd.exe built-in, hence shell=True
output = subprocess.check_output('dir', shell=True, encoding='cp437')

As noted in the Stack Overflow discussion, you might need to use platform-specific encodings for certain system commands.
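To make the code page issue concrete, the sketch below decodes one hand-written byte string under both Windows code pages mentioned above. The input bytes are an assumed example, not captured command output.

```python
# One byte string, as a Windows console program might emit it.
raw = b"caf\x82"

as_cp437 = raw.decode("cp437")    # OEM code page: 0x82 is 'é'
as_cp1252 = raw.decode("cp1252")  # ANSI code page: 0x82 is U+201A, a low quotation mark

print(ascii(as_cp437), ascii(as_cp1252))
```

Both calls succeed, but only one gives the intended text, so the right choice depends on which code page the command actually used.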

Splitting Lines Directly

You can decode and split lines in one operation:

python

>>> lines = stdout.decode('utf-8').splitlines()
>>> for line in lines:
...     print(line)
total 0
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2
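splitlines() is generally a safer choice than split('\n') for command output, because it recognizes several line-ending conventions and does not leave an empty final element. A short sketch with assumed sample bytes that mix Windows and Unix line endings:

```python
# Sample output mixing '\r\n' and '\n' line endings.
raw = b"total 0\r\nfile1\r\nfile2\n"
text = raw.decode("utf-8")

by_splitlines = text.splitlines()  # handles \r\n and \n, no trailing empty item
by_split = text.split("\n")        # leaves '\r' behind and a final empty string

print(by_splitlines)
print(by_split)
```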

Using Context Managers

Popen objects also work as context managers. On exit from the with block, the pipes are closed and the process is waited for:

python

import subprocess

with subprocess.Popen(['ls', '-l'], stdout=subprocess.PIPE, text=True) as p:
    stdout, stderr = p.communicate()
    print(stdout)
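The context manager form is also handy when you want to process output line by line instead of collecting it all with communicate(). The following is a sketch under the assumption of a small stand-in child process that prints three lines:

```python
import subprocess
import sys

# Hypothetical child: prints three short lines.
child = "for i in range(3): print('line', i)"

lines = []
with subprocess.Popen(
    [sys.executable, "-c", child],
    stdout=subprocess.PIPE,
    text=True,
) as process:
    # In text mode, iterating over stdout yields decoded str lines.
    for line in process.stdout:
        lines.append(line.rstrip("\n"))

# Leaving the with block has already waited for the process.
print(lines, process.returncode)
```

Reading a single pipe this way is fine; if stderr is piped as well, communicate() is the safer option because it avoids deadlocks when one pipe fills up.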

Handling Encoding Issues

Error Handling in Decoding

When decoding bytes, you might encounter encoding errors. You can handle these with the errors parameter:

python

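# Example input: the byte 0xff is never valid in UTF-8
corrupted_bytes = b'file\xff1'

# Ignore errors: invalid bytes are silently dropped
clean_string = corrupted_bytes.decode('utf-8', errors='ignore')   # 'file1'

# Replace errors with the placeholder character U+FFFD
clean_string = corrupted_bytes.decode('utf-8', errors='replace')  # 'file\ufffd1'

# Strict mode (default) - raises UnicodeDecodeError on errors
try:
    clean_string = corrupted_bytes.decode('utf-8', errors='strict')
except UnicodeDecodeError as exc:
    print(exc)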

According to sqlpey, the errors argument is particularly useful when dealing with corrupted or mixed-encoding data.
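Two other standard error handlers are worth knowing when you cannot afford to lose information: backslashreplace keeps the bad bytes visible as escape sequences, and surrogateescape lets you round-trip them back to the original bytes. A minimal sketch with an assumed input value:

```python
# The byte 0xff is never valid in UTF-8.
raw = b"file\xff1"

# Keep undecodable bytes visible as \xNN escape sequences.
visible = raw.decode("utf-8", errors="backslashreplace")

# Smuggle undecodable bytes through as lone surrogates...
smuggled = raw.decode("utf-8", errors="surrogateescape")

# ...so that encoding with the same handler restores the original bytes.
restored = smuggled.encode("utf-8", errors="surrogateescape")

print(visible)
```

Strings produced with surrogateescape cannot be encoded with the default strict handler, so this option suits pass-through data more than text meant for display.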

Detecting Encoding

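Bytes carry no record of their encoding, so it cannot be detected with certainty. When you are unsure, a reasonable first guess is the system's preferred encoding, combined with a lenient error handler: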

python

import locale

# Use system's preferred encoding
encoding = locale.getpreferredencoding(False)
output = stdout.decode(encoding, errors='replace')

The Stack Overflow discussion explains how encoding detection can help with subprocess output.
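One pragmatic approach, sketched below, is to try a short list of candidate encodings in order and keep the first that decodes cleanly. The helper name decode_with_fallback and its candidate list are assumptions made for illustration; this is a heuristic, not real detection.

```python
import locale

def decode_with_fallback(data, encodings=None):
    """Return (text, encoding) using the first candidate that decodes cleanly."""
    if encodings is None:
        encodings = ["utf-8", locale.getpreferredencoding(False)]
    for name in encodings:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    # Last resort: never fails, marks undecodable bytes with U+FFFD.
    return data.decode("utf-8", errors="replace"), "utf-8 (lossy)"

# 0xe9 alone is not valid UTF-8, so the second candidate is used.
text, used = decode_with_fallback(b"caf\xe9", ["utf-8", "cp1252"])
print(ascii(text), used)
```

Put strict encodings such as UTF-8 first. Single-byte encodings like latin-1 accept any input, so a clean decode with them does not prove the guess was right.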

Best Practices

Prefer text=True (Python 3.7+) or an explicit encoding argument over manual decoding - it’s cleaner and less error-prone.
Handle encoding explicitly - don’t rely on system defaults when working with external command output.
Use error handling - always consider what happens when encoding fails.
Choose the right subprocess function - subprocess.run() is preferred for modern Python.
Consider context managers - they ensure proper resource cleanup.

As the Python documentation states, “the actual encoding of the output data may depend on the command being invoked, so the decoding to text will often need to be handled at the application level.”
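The practices above can be combined into one small wrapper. This is a sketch only: run_text is a hypothetical helper name, and UTF-8 with errors="replace" is an assumed policy that may not suit every command.

```python
import subprocess
import sys

def run_text(cmd, encoding="utf-8"):
    """Run cmd and return its stdout as str; raise if the command fails."""
    result = subprocess.run(
        cmd,
        capture_output=True,
        encoding=encoding,  # explicit, not the system default
        errors="replace",   # undecodable bytes become U+FFFD
        check=True,         # non-zero exit raises CalledProcessError
    )
    return result.stdout

# Stand-in command: the current interpreter printing one line.
out = run_text([sys.executable, "-c", "print('ok')"])
print(out, end="")
```

With check=True, a failing command raises subprocess.CalledProcessError, whose stdout and stderr attributes hold the decoded output captured up to that point.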

Complete Examples

Example 1: Basic Conversion

python

from subprocess import Popen, PIPE

# Get bytes output
process = Popen(['ls', '-l'], stdout=PIPE)
stdout_bytes, stderr_bytes = process.communicate()

# Convert to string
stdout_str = stdout_bytes.decode('utf-8')

print(stdout_str)

Example 2: Modern Approach (Recommended)

python

import subprocess

# Get string output directly
result = subprocess.run(['ls', '-l'], capture_output=True, text=True)

print(result.stdout)

Example 3: Robust Error Handling

python

import locale
import subprocess

try:
    # Try with UTF-8 first
    result = subprocess.run(['ls', '-l'], capture_output=True, text=True, encoding='utf-8')
    print(result.stdout)
except UnicodeDecodeError:
    # Fall back to system encoding
    result = subprocess.run(['ls', '-l'], capture_output=True, text=True, encoding=locale.getpreferredencoding())
    print(result.stdout)

Example 4: Working with Multiple Commands

python

import subprocess

commands = [
    ['ls', '-l'],
    ['date'],
    ['whoami']
]

for cmd in commands:
    result = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8')
    print(f"Command: {' '.join(cmd)}")
    print("Output:")
    print(result.stdout)
    print("-" * 40)

Conclusion

Converting bytes to string in Python 3 is straightforward once you understand the available methods. For your subprocess output, the simplest solution is either calling .decode('utf-8') on the bytes object or passing text=True to get strings directly.

Key takeaways:

Use .decode('utf-8') for manual conversion
Prefer text=True parameter (Python 3.7+) for cleaner code
Handle encoding errors with the errors parameter when needed
Consider using subprocess.run() instead of Popen for modern Python code
Be aware that subprocess output encoding may vary between systems

Choose the method that best fits your Python version and specific use case, and always consider potential encoding issues when working with external command output.
