How to Retrieve the Last Record in Each Group in MySQL
I have a table messages with the following structure and data:
Id Name Other_Columns
-------------------------
1 A A_data_1
2 A A_data_2
3 A A_data_3
4 B B_data_1
5 B B_data_2
6 C C_data_1
When I run SELECT * FROM messages GROUP BY name, I get the first record for each group:
1 A A_data_1
4 B B_data_1
6 C C_data_1
However, I need to retrieve the last record for each group instead:
3 A A_data_3
5 B B_data_2
6 C C_data_1
Currently, I’m using this query which works but seems inefficient:
SELECT *
FROM (SELECT * FROM messages ORDER BY id DESC) AS x
GROUP BY name
What are more efficient ways to retrieve the last record in each group in MySQL?
To retrieve the last record in each group in MySQL efficiently, you should use window functions in MySQL 8.0+ or alternative methods like self-JOINs for older versions. The traditional GROUP BY approach returns arbitrary rows, while window functions with ROW_NUMBER() OVER (PARTITION BY group_column ORDER BY sort_column DESC) provide a clean, performant solution that ranks records within groups and returns only the highest-ranked (latest) ones.
Contents
- Understanding the Problem
- Window Functions Solution (MySQL 8.0+)
- Self-JOIN Method (All MySQL Versions)
- Subquery with MAX() Approach
- Performance Comparison
- Best Practices and Recommendations
- Complete Example with Sample Data
Understanding the Problem
The issue you’re experiencing is that MySQL’s GROUP BY clause doesn’t guarantee which row from each group is returned. When you run SELECT * FROM messages GROUP BY name, MySQL picks an arbitrary row from each group for the non-grouped columns, not necessarily the first or last one. The MySQL documentation describes this behavior: the server is free to choose any value from each group. This only applies when the ONLY_FULL_GROUP_BY SQL mode is disabled; with it enabled (the default since MySQL 5.7.5), the query is rejected with an error instead.
Your current approach of ordering by id DESC in a derived table and then grouping is also not reliable. It requires sorting the entire dataset before grouping, which is inefficient on large tables, and since MySQL 5.7 the optimizer may ignore an ORDER BY inside a derived table when the outer query is grouped, so the query can silently return the wrong row.
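As a small illustration of the GROUP BY behavior described above, the statements below check the active SQL mode and show the statement in question. This is only a sketch that assumes the messages table from the question; the exact error text can differ between MySQL versions.

```sql
-- Show the SQL modes active for the current session
SELECT @@SESSION.sql_mode;

-- If ONLY_FULL_GROUP_BY appears in that list, MySQL rejects this statement
-- (error 1055) because id and other_columns are neither grouped nor aggregated.
-- If it does not appear, MySQL returns one arbitrary row per name.
SELECT * FROM messages GROUP BY name;
```

Either outcome shows that this statement cannot be relied on to pick a specific row per group.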
Window Functions Solution (MySQL 8.0+)
For MySQL 8.0 and newer, window functions are usually the most readable solution and perform well when a suitable index exists. A window function computes a value for each row from a set of rows related to it (its partition) without collapsing those rows the way GROUP BY does.
Using ROW_NUMBER()
WITH numbered_messages AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) as rn
FROM messages
)
SELECT * FROM numbered_messages
WHERE rn = 1;
This query works by:
- Using a Common Table Expression (CTE) to assign a row number to each record
- Partitioning by name (creating groups by name)
- Ordering within each partition by id DESC (so the highest ID gets rank 1)
- Filtering for rows where rn = 1 (the latest record in each group)
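The same ranking idea extends to the latest N records per group by relaxing the filter on rn. The following is a minimal sketch against the sample messages table, assuming N = 2; the explicit column list and the final ORDER BY are additions for illustration.

```sql
-- Rank rows within each name, newest id first
WITH numbered_messages AS (
  SELECT id, name, other_columns,
         ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
  FROM messages
)
-- Keep the two newest rows per name (groups with one row return just that row)
SELECT id, name, other_columns
FROM numbered_messages
WHERE rn <= 2
ORDER BY name, id DESC;
```

With the sample data this returns ids 3 and 2 for A, 5 and 4 for B, and 6 for C.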
Using LAST_VALUE()
WITH latest_messages AS (
  SELECT *,
         -- Order ascending so the last row of the full frame holds the highest id
         LAST_VALUE(id) OVER (PARTITION BY name ORDER BY id
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS latest_id
  FROM messages
)
-- Keep only the row whose id equals the latest id of its group
SELECT id, name, other_columns
FROM latest_messages
WHERE id = latest_id;
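According to Oracle’s documentation, LAST_VALUE() returns the value from the last row of the window frame, which is why the explicit frame clause matters: with ORDER BY and no frame clause, the default frame ends at the current row, so LAST_VALUE(id) would return each row’s own id.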
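A variant that avoids the frame clause is FIRST_VALUE() with a descending sort, since the default frame always starts at the first row of the partition. This is a sketch under the same assumptions as above (the sample messages table, with id unique), not a replacement for the ROW_NUMBER() query.

```sql
WITH latest_messages AS (
  SELECT id, name, other_columns,
         -- Newest id first, so the first value in each partition is the latest id
         FIRST_VALUE(id) OVER (PARTITION BY name ORDER BY id DESC) AS latest_id
  FROM messages
)
SELECT id, name, other_columns
FROM latest_messages
WHERE id = latest_id;
```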
Self-JOIN Method (All MySQL Versions)
For MySQL versions before 8.0 that don’t support window functions, the self-JOIN approach is a reliable alternative:
SELECT m1.*
FROM messages m1
LEFT JOIN messages m2 ON m1.name = m2.name AND m1.id < m2.id
WHERE m2.id IS NULL;
This query works by:
- Joining the table to itself on the name column
- Finding records where m1.id < m2.id (meaning there’s a newer record with the same name)
- Keeping only records where no newer record exists (m2.id IS NULL)
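The same "no newer row exists" condition can also be written as an anti-join with NOT EXISTS, which some readers find easier to follow. This is an equivalent sketch for the sample messages table; whether it runs faster or slower than the LEFT JOIN form depends on the optimizer and the available indexes.

```sql
SELECT m1.*
FROM messages AS m1
WHERE NOT EXISTS (
  -- A row with the same name and a higher id would make m1 "not the latest"
  SELECT 1
  FROM messages AS m2
  WHERE m2.name = m1.name
    AND m2.id > m1.id
);
```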
Subquery with MAX() Approach
Another approach that works in all MySQL versions is using subqueries with MAX():
SELECT m.*
FROM messages m
JOIN (
SELECT name, MAX(id) as max_id
FROM messages
GROUP BY name
) latest ON m.name = latest.name AND m.id = latest.max_id;
This method:
- First finds the maximum (latest) ID for each name using a subquery
- Then joins back to the original table to get the complete records for those IDs
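When the ordering column is unique across the whole table, as id is in the sample schema (it is the primary key), the join back can be reduced to an IN filter. This is a sketch under that assumption; with a non-unique ordering column, the join on both name and the maximum value shown above is the safer form.

```sql
SELECT m.*
FROM messages AS m
WHERE m.id IN (
  -- One id per name: the highest, i.e. the latest
  SELECT MAX(id)
  FROM messages
  GROUP BY name
);
```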
Performance Comparison
The approaches differ in performance characteristics. The ratings below are qualitative rather than measured, so confirm them on your own data:
| Method | MySQL Version | Performance | Readability | Index Usage |
|---|---|---|---|---|
| Window Functions (ROW_NUMBER) | 8.0+ | Excellent | High | Good with proper indexes |
| Self-JOIN | All | Moderate | Medium | Depends on join optimization |
| Subquery with MAX | All | Good | High | Excellent on indexed columns |
| ORDER BY DESC + GROUP BY | All (only with ONLY_FULL_GROUP_BY disabled) | Poor | Low | Full table scan required; result not guaranteed |
As noted in the Virtueinfo blog post, window functions are generally the most performant approach in MySQL 8.0+ when properly indexed.
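One way to check such claims on your own data is EXPLAIN ANALYZE, available from MySQL 8.0.18, which executes the statement and reports actual row counts and timings per plan step. The sketch below compares two of the methods on the sample messages table; the derived-table alias numbered is chosen here for illustration, and on a six-row table the timings will not be meaningful.

```sql
-- Window function method, written with a derived table
EXPLAIN ANALYZE
SELECT id, name, other_columns
FROM (
  SELECT id, name, other_columns,
         ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
  FROM messages
) AS numbered
WHERE rn = 1;

-- MAX() subquery method
EXPLAIN ANALYZE
SELECT m.*
FROM messages AS m
JOIN (
  SELECT name, MAX(id) AS max_id
  FROM messages
  GROUP BY name
) AS latest ON m.name = latest.name AND m.id = latest.max_id;
```

Because EXPLAIN ANALYZE really runs the query, use it with care on very large production tables.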
Best Practices and Recommendations
- Use Window Functions for MySQL 8.0+: They are the most efficient and readable solution
- Add Proper Indexes: Index the grouping column together with the ordering column, ideally as one composite index on (name, id), so each group’s rows can be read in order
- Consider Data Volume: For very large datasets, test different approaches as performance can vary
- Use EXPLAIN: Always analyze query execution plans to understand performance bottlenecks
The GeeksforGeeks tutorial emphasizes that window functions provide the most modern and maintainable solution when available.
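The indexing and EXPLAIN recommendations can be tried out as follows. This is a sketch for the sample messages table; the index name idx_messages_name_id is made up for illustration, and the plan you see depends on your MySQL version and data.

```sql
-- Composite index: grouping column first, ordering column second
CREATE INDEX idx_messages_name_id ON messages (name, id);

-- Inspect how MySQL resolves the per-group maximum with the index in place
EXPLAIN
SELECT name, MAX(id) AS max_id
FROM messages
GROUP BY name;
```

If the plan's Extra column mentions "Using index for group-by", MySQL is reading only one index entry per group instead of scanning every row.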
Complete Example with Sample Data
Here’s a complete working example using the sample data you provided:
-- Create sample table
CREATE TABLE messages (
id INT PRIMARY KEY,
name VARCHAR(10),
other_columns VARCHAR(20)
);
-- Insert sample data
INSERT INTO messages VALUES
(1, 'A', 'A_data_1'),
(2, 'A', 'A_data_2'),
(3, 'A', 'A_data_3'),
(4, 'B', 'B_data_1'),
(5, 'B', 'B_data_2'),
(6, 'C', 'C_data_1');
-- Method 1: Window Functions (MySQL 8.0+)
WITH numbered_messages AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) as rn
FROM messages
)
SELECT * FROM numbered_messages
WHERE rn = 1;
-- Method 2: Self-JOIN (All versions)
SELECT m1.*
FROM messages m1
LEFT JOIN messages m2 ON m1.name = m2.name AND m1.id < m2.id
WHERE m2.id IS NULL;
-- Method 3: Subquery with MAX (All versions)
SELECT m.*
FROM messages m
JOIN (
SELECT name, MAX(id) as max_id
FROM messages
GROUP BY name
) latest ON m.name = latest.name AND m.id = latest.max_id;
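All three methods return the same three rows (Method 1 additionally includes the helper column rn, and the row order is only guaranteed if you add ORDER BY id):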
3 A A_data_3
5 B B_data_2
6 C C_data_1
Conclusion
The key takeaways for retrieving the last record in each group in MySQL are:
- For MySQL 8.0+: Use window functions with ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) for the most efficient and readable solution.
- For older MySQL versions: The self-JOIN approach or subquery with MAX() methods provide reliable alternatives, though they may have different performance characteristics.
- Performance optimization: Always ensure proper indexing on both the grouping column and the ordering column for best results.
- Query testing: Use EXPLAIN to analyze query execution plans and test different approaches on your specific dataset size and structure.
The window function approach, which SQLpey also recommends, is generally the most readable choice for modern MySQL installations: it uses standard SQL and benefits from the optimizer’s support for window functions.
Sources
- SQL - Retrieving the last record in each group - MySQL - Stack Overflow
- How To Get Last Record In Each Group In MySQL? - GeeksforGeeks
- Solved: How to retrieve the latest record per group in MySQL - SQLpey
- MySQL - Select Latest Record for Each Group - Codelabs365
- Get Last Record in Each MySQL Group - Virtueinfo
- Window Functions in MySQL - MySQL Documentation
- LAST_VALUE Function Documentation
- How to get latest 2 records of each group - Database Administrators Stack Exchange