How can I sort a list of dictionaries by a specific key’s value in Python?
Given the following list of dictionaries:
[{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
When sorted by the ‘name’ key, it should become:
[{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]
What is the most efficient way to sort a list of dictionaries by a specific key’s value in Python?
To sort a list of dictionaries by a specific key’s value in Python, pass a key function to sorted(): either a lambda or operator.itemgetter(). In CPython, itemgetter() is usually the faster of the two for plain key lookups, while a lambda offers more flexibility for complex sorting logic.
Contents
- Basic Sorting Methods
- Performance Comparison
- Advanced Sorting Techniques
- Error Handling
- Practical Examples
- Conclusion
Basic Sorting Methods
Using Lambda Function
The most straightforward approach is using sorted() with a lambda function:
data = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
# Sort by 'name' key
sorted_data = sorted(data, key=lambda x: x['name'])
print(sorted_data)
# Output: [{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]
Using operator.itemgetter()
For better performance, especially with large datasets:
from operator import itemgetter
data = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
# Sort by 'name' key
sorted_data = sorted(data, key=itemgetter('name'))
print(sorted_data)
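itemgetter() also accepts several keys at once, in which case it returns a tuple of the looked-up values. The sketch below uses a small made-up list (the 'Abe' record is added purely for illustration) to show the tie-breaking behaviour this gives:

```python
from operator import itemgetter

people = [
    {'name': 'Homer', 'age': 39},
    {'name': 'Bart', 'age': 10},
    {'name': 'Abe', 'age': 39},
]

# With several keys, itemgetter returns a tuple, so the sort
# compares 'age' first and falls back to 'name' on ties.
by_age_then_name = sorted(people, key=itemgetter('age', 'name'))
names = [p['name'] for p in by_age_then_name]
print(names)  # ['Bart', 'Abe', 'Homer']
```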
In-Place Sorting
If you want to modify the original list:
data = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
# Sort in-place by 'name'
data.sort(key=lambda x: x['name'])
print(data)
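The difference between the two calls is easy to miss: sorted() returns a new list, while list.sort() reorders the existing list and returns None. A minimal illustration, with variable names chosen only for this example:

```python
data = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

# sorted() builds a new list; the original order is untouched.
new_list = sorted(data, key=lambda x: x['name'])
original_first = data[0]['name']   # still 'Homer'

# list.sort() reorders the list itself and returns None.
result = data.sort(key=lambda x: x['name'])
sorted_first = data[0]['name']     # now 'Bart'

print(original_first, sorted_first, result)  # Homer Bart None
```

Assigning the result of data.sort(...) to a variable is therefore a common mistake, since that variable ends up holding None.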
Performance Comparison
The key function is called once per element, so its overhead adds up on large lists. The following benchmark compares the two approaches:
import timeit
from operator import itemgetter
# Generate test data
large_data = [{'id': i, 'name': f'User_{i}'} for i in range(10000)]
# Test lambda approach
lambda_time = timeit.timeit(
    'sorted(large_data, key=lambda x: x["id"])',
    globals=globals(),
    number=100
)
# Test itemgetter approach
itemgetter_time = timeit.timeit(
    'sorted(large_data, key=itemgetter("id"))',
    globals=globals(),
    number=100
)
print(f"Lambda approach: {lambda_time:.4f} seconds")
print(f"Itemgetter approach: {itemgetter_time:.4f} seconds")
print(f"Itemgetter is {lambda_time/itemgetter_time:.1f}x faster")
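In CPython, itemgetter() is usually faster than an equivalent lambda because it avoids a Python-level function call for each element. The size of the gain depends on the data and the Python version, and because the comparisons themselves also take time, the overall speedup is often modest. Run the benchmark above on your own data before relying on a specific figure.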
Advanced Sorting Techniques
Sorting by Multiple Keys
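You can sort by multiple keys by returning a tuple from the key function. Tuples are compared element by element, so later entries act as tie-breakers, and negating a numeric value reverses the order for that entry only: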
data = [
    {'name': 'Homer', 'age': 39, 'department': 'Safety'},
    {'name': 'Bart', 'age': 10, 'department': 'Elementary'},
    {'name': 'Marge', 'age': 36, 'department': 'Home'}
]
# Sort by department (ascending), then by age (descending)
sorted_data = sorted(
    data,
    key=lambda x: (x['department'], -x['age'])
)
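Negating a value only works for numbers. When the descending key is a string, one alternative is to rely on the fact that Python's sort is stable and sort in two passes, secondary key first. The sketch below uses made-up staff records to show the idea:

```python
from operator import itemgetter

staff = [
    {'name': 'Homer', 'department': 'Safety'},
    {'name': 'Lenny', 'department': 'Safety'},
    {'name': 'Edna', 'department': 'Elementary'},
    {'name': 'Seymour', 'department': 'Elementary'},
]

# Sort by the secondary key first (name, descending) ...
staff_sorted = sorted(staff, key=itemgetter('name'), reverse=True)
# ... then by the primary key (department, ascending). Because the sort
# is stable, rows with equal departments keep their name-descending order.
staff_sorted = sorted(staff_sorted, key=itemgetter('department'))

ordered_names = [s['name'] for s in staff_sorted]
print(ordered_names)  # ['Seymour', 'Edna', 'Lenny', 'Homer']
```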
Reverse Sorting
To sort in descending order:
data = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
# Sort by age in descending order
sorted_data = sorted(data, key=lambda x: x['age'], reverse=True)
Custom Sorting Functions
For complex sorting logic:
data = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
def custom_sort(item):
    # Key function: the sort compares the length of each name
    return len(item['name'])
sorted_data = sorted(data, key=custom_sort)
Error Handling
Handling Missing Keys
When dictionaries might not have the sorting key:
data = [
    {'name': 'Homer', 'age': 39},
    {'name': 'Bart'},  # Missing 'age' key
    {'name': 'Marge', 'age': 36}
]
# Safe sorting with default values
sorted_data = sorted(
    data,
    key=lambda x: x.get('age', 0)  # Default age of 0
)
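A default of 0 places records without an age at the front of the result. If they should go to the end instead, one option is a tuple key whose first element flags the missing value. This is a sketch; age_or_last is a hypothetical helper name, not part of any library:

```python
data = [
    {'name': 'Homer', 'age': 39},
    {'name': 'Bart'},              # Missing 'age' key
    {'name': 'Marge', 'age': 36},
]

def age_or_last(item):
    # Hypothetical helper: the first tuple element is False for records
    # that have an age and True for those that do not, so missing ages
    # sort after every real value.
    age = item.get('age')
    return (age is None, age if age is not None else 0)

missing_last = sorted(data, key=age_or_last)
names = [d['name'] for d in missing_last]
print(names)  # ['Marge', 'Homer', 'Bart']
```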
Type-Agnostic Sorting
To handle mixed data types:
data = [
    {'name': 'Alice', 'score': 85},
    {'name': 'Bob', 'score': '90'},
    {'name': 'Charlie', 'score': 78}
]
# Convert to comparable types
sorted_data = sorted(
    data,
    key=lambda x: float(x['score']) if isinstance(x['score'], str) else x['score']
)
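The conversion above raises ValueError if a string cannot be parsed as a number. A possible way to tolerate such values, assuming they should simply sort last, is a named key function with a try/except. The helper score_key and the 'n/a' record are illustrative only:

```python
data = [
    {'name': 'Alice', 'score': 85},
    {'name': 'Bob', 'score': '90'},
    {'name': 'Charlie', 'score': 'n/a'},   # cannot be converted
    {'name': 'Dana', 'score': 78},
]

def score_key(item):
    # Hypothetical helper: unparseable scores sort last.
    try:
        return (False, float(item['score']))
    except (TypeError, ValueError):
        return (True, 0.0)

ranked = sorted(data, key=score_key)
names = [d['name'] for d in ranked]
print(names)  # ['Dana', 'Alice', 'Bob', 'Charlie']
```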
Practical Examples
Sorting JSON Data from API
import json
# Example JSON response
json_response = '''
[
{"id": 1, "title": "Python Basics", "price": 29.99},
{"id": 3, "title": "Advanced Python", "price": 49.99},
{"id": 2, "title": "Python for Data Science", "price": 39.99}
]
'''
# Parse and sort by price
products = json.loads(json_response)
sorted_products = sorted(products, key=lambda x: x['price'])
print("Cheapest to most expensive:")
for product in sorted_products:
    print(f"{product['title']}: ${product['price']}")
Sorting Student Records
students = [
    {'name': 'Alice', 'grades': [85, 90, 78]},
    {'name': 'Bob', 'grades': [92, 88, 95]},
    {'name': 'Charlie', 'grades': [76, 82, 79]}
]
# Sort by average grade, highest first
sorted_students = sorted(
    students,
    key=lambda x: sum(x['grades']) / len(x['grades']),
    reverse=True
)
Case-Insensitive Sorting
data = [
    {'name': 'homer', 'age': 39},
    {'name': 'Bart', 'age': 10},
    {'name': 'Marge', 'age': 36}
]
# Case-insensitive sorting
sorted_data = sorted(data, key=lambda x: x['name'].lower())
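To see why the key matters, the sketch below compares the default ordering with a caseless one. It uses str.casefold(), which is a more aggressive variant of lower() intended for caseless comparison; for plain ASCII names like these the two give the same result:

```python
data = [
    {'name': 'homer', 'age': 39},
    {'name': 'Bart', 'age': 10},
    {'name': 'Marge', 'age': 36},
]

# Default comparison is by code point, so uppercase letters sort first.
case_sensitive = [d['name'] for d in sorted(data, key=lambda x: x['name'])]
# casefold() normalizes case before comparing.
caseless = [d['name'] for d in sorted(data, key=lambda x: x['name'].casefold())]

print(case_sensitive)  # ['Bart', 'Marge', 'homer']
print(caseless)        # ['Bart', 'homer', 'Marge']
```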
Conclusion
Sorting lists of dictionaries in Python offers several effective approaches:
- For most cases, use sorted() with a lambda function for its simplicity and readability
- For performance-critical applications, prefer operator.itemgetter(), which is usually faster for plain key lookups
- When sorting by multiple criteria, use tuple-based key functions
- Handle potentially missing keys using .get() with appropriate defaults
- Consider in-place sorting with .sort() when memory efficiency is important
The choice between lambda and itemgetter depends on your specific needs - lambda offers more flexibility for complex sorting logic, while itemgetter provides better performance for straightforward key-based sorting. Both methods are essential tools in Python’s data manipulation toolkit.