NeuroAgent

How to Get Last Record in Each Group in MySQL

Learn efficient methods to retrieve the last record in each group in MySQL. Explore window functions, self-JOINs, and subquery approaches with performance comparisons for optimal database queries.

Question

How to Retrieve the Last Record in Each Group in MySQL

I have a table messages with the following structure and data:

Id   Name   Other_Columns
-------------------------
1    A       A_data_1
2    A       A_data_2
3    A       A_data_3
4    B       B_data_1
5    B       B_data_2
6    C       C_data_1

When I run SELECT * FROM messages GROUP BY name, I get the first record for each group:

1    A       A_data_1
4    B       B_data_1
6    C       C_data_1

However, I need to retrieve the last record for each group instead:

3    A       A_data_3
5    B       B_data_2
6    C       C_data_1

Currently, I’m using this query, which works but seems inefficient:

sql
SELECT
  *
FROM (SELECT
  *
FROM messages
ORDER BY id DESC) AS x
GROUP BY name

What are more efficient ways to retrieve the last record in each group in MySQL?

NeuroAgent

To retrieve the last record in each group in MySQL, use a window function on MySQL 8.0+ or a self-JOIN or MAX() subquery on older versions. A bare GROUP BY returns an arbitrary row from each group, whereas ROW_NUMBER() OVER (PARTITION BY group_column ORDER BY sort_column DESC) numbers the rows within each group so that you can keep only row 1, the latest one.

Understanding the Problem

The issue you’re experiencing is that MySQL’s GROUP BY clause doesn’t guarantee which row from each group will be returned. When you use SELECT * FROM messages GROUP BY name with the ONLY_FULL_GROUP_BY SQL mode disabled, MySQL is free to return values from any row in each group, not necessarily the first or last one. This behavior is described in the MySQL documentation on GROUP BY handling. With ONLY_FULL_GROUP_BY enabled, which is the default since MySQL 5.7.5, the same query is rejected with an error instead.

Your current approach of ordering by id DESC in a derived table and then grouping may appear to work, but it is not reliable: the optimizer is allowed to ignore an ORDER BY inside a derived table, in which case the outer GROUP BY again picks arbitrary rows. Where the sort is performed, it also has to order the entire dataset before grouping, which can be inefficient with large tables.
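
As a quick check of the behavior described above, the following sketch inspects the active SQL mode and shows a deterministic alternative for a single column. It assumes the messages table from the question; the alias last_id is only illustrative.

```sql
-- Check whether ONLY_FULL_GROUP_BY is part of the session's SQL mode.
SELECT @@SESSION.sql_mode;

-- With ONLY_FULL_GROUP_BY enabled, the statement below is rejected
-- (error 1055) because id and other_columns are neither grouped
-- nor aggregated, so it is left commented out here:
-- SELECT * FROM messages GROUP BY name;

-- Deterministic on any SQL mode: aggregate the column explicitly.
SELECT name, MAX(id) AS last_id
FROM messages
GROUP BY name
ORDER BY name;
-- With the sample data this returns (A, 3), (B, 5), (C, 6).
```

This only yields the highest id per name, not the full row, which is why the approaches in the following sections join or filter back to the complete record.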

Window Functions Solution (MySQL 8.0+)

For MySQL 8.0 and newer, window functions are usually the most readable solution and generally perform well. A window function performs a calculation across a set of rows that are related to the current row, without collapsing those rows into one as GROUP BY does.

Using ROW_NUMBER()

sql
WITH numbered_messages AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) as rn
    FROM messages
)
SELECT * FROM numbered_messages
WHERE rn = 1;

This query works by:

  1. Using a Common Table Expression (CTE) to assign a row number to each record
  2. Partitioning by name (creating groups by name)
  3. Ordering within each partition by id DESC (so the highest id gets row number 1)
  4. Filtering for rows where rn = 1 (the latest record in each group)
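
The same steps can also be written with a derived table instead of a CTE. The sketch below lists the columns explicitly so that the helper column rn does not appear in the output; the alias ranked and the column name other_columns are assumptions based on the sample table.

```sql
SELECT id, name, other_columns
FROM (
    -- Number the rows of each name group, newest (highest id) first.
    SELECT id, name, other_columns,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
    FROM messages
) AS ranked
WHERE rn = 1      -- keep only the newest row of each group
ORDER BY id;      -- make the output order explicit
```

Because id is unique, ROW_NUMBER() never has to break ties here. If you order by a non-unique column such as a timestamp, adding id as a second sort key keeps the result deterministic.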

Using LAST_VALUE()

sql
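WITH latest_messages AS (
    SELECT *,
           -- Ascending order, so the last value in the full frame is the highest id
           LAST_VALUE(id) OVER (PARTITION BY name ORDER BY id
                                ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS latest_id
    FROM messages
)
SELECT id, name, other_columns
FROM latest_messages
WHERE id = latest_id;

Here the window is ordered by id ascending, so LAST_VALUE() returns the highest id in each group, and the outer query keeps only the row whose id matches it. (Ordering by id DESC with this frame would return the lowest id instead.) As Oracle’s documentation notes, LAST_VALUE() depends on proper window framing: the default frame for a window with ORDER BY ends at the current row, so without ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, LAST_VALUE() would simply return each row’s own id.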
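
To see why the frame clause matters for LAST_VALUE(), the following sketch compares the default frame with an explicit full-partition frame. It assumes the sample messages table and an ascending ORDER BY id; the aliases default_frame and full_frame are illustrative.

```sql
SELECT id,
       name,
       -- Default frame with ORDER BY: partition start through the current row.
       LAST_VALUE(id) OVER (PARTITION BY name ORDER BY id) AS default_frame,
       -- Explicit frame: the whole partition.
       LAST_VALUE(id) OVER (PARTITION BY name ORDER BY id
                            ROWS BETWEEN UNBOUNDED PRECEDING
                                     AND UNBOUNDED FOLLOWING) AS full_frame
FROM messages
ORDER BY id;
-- id  name  default_frame  full_frame
-- 1   A     1              3
-- 2   A     2              3
-- 3   A     3              3
-- 4   B     4              5
-- 5   B     5              5
-- 6   C     6              6
```

Only full_frame carries the highest id of each group, so only that column is usable for picking the last record.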

Self-JOIN Method (All MySQL Versions)

For MySQL versions before 8.0 that don’t support window functions, the self-JOIN approach is a reliable alternative:

sql
SELECT m1.*
FROM messages m1
LEFT JOIN messages m2 ON m1.name = m2.name AND m1.id < m2.id
WHERE m2.id IS NULL;

This query works by:

  1. Joining the table to itself on the name column
  2. Finding records where m1.id < m2.id (meaning there’s a newer record with the same name)
  3. Keeping only records where no newer record exists (m2.id IS NULL)
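
The same "no newer row exists" logic can be expressed with NOT EXISTS instead of a LEFT JOIN. This is an alternative formulation of the anti-join above rather than a different method, sketched here against the sample messages table.

```sql
SELECT m1.*
FROM messages AS m1
WHERE NOT EXISTS (
    -- Look for any row in the same group with a higher id.
    SELECT 1
    FROM messages AS m2
    WHERE m2.name = m1.name
      AND m2.id > m1.id
);
```

Which form the optimizer handles better can vary by MySQL version and data distribution, so it is worth comparing both with EXPLAIN on your own data.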

Subquery with MAX() Approach

Another approach that works in all MySQL versions is using subqueries with MAX():

sql
SELECT m.*
FROM messages m
JOIN (
    SELECT name, MAX(id) as max_id
    FROM messages
    GROUP BY name
) latest ON m.name = latest.name AND m.id = latest.max_id;

This method:

  1. First finds the maximum (latest) ID for each name using a subquery
  2. Then joins back to the original table to get the complete records for those IDs

Performance Comparison

The approaches differ in performance characteristics. The ratings below are rough qualitative guidance rather than benchmark results:

Method                          MySQL Version   Performance   Readability   Index Usage
---------------------------------------------------------------------------------------------------
Window Functions (ROW_NUMBER)   8.0+            Excellent     High          Good with proper indexes
Self-JOIN                       All             Moderate      Medium        Depends on join optimization
Subquery with MAX               All             Good          High          Excellent on indexed columns
ORDER BY DESC GROUP BY          All             Poor          Low           Full table scan required

As noted in the Virtueinfo blog post, window functions are generally the most performant approach in MySQL 8.0+ when properly indexed.

Best Practices and Recommendations

  1. Use Window Functions for MySQL 8.0+: They are usually the most readable solution and generally perform well
  2. Add Proper Indexes: Index the grouping column and the ordering column together, for example with a composite index on (name, id)
  3. Consider Data Volume: For very large datasets, test different approaches as performance can vary
  4. Use EXPLAIN: Always analyze query execution plans to understand performance bottlenecks
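
As a sketch of the indexing and EXPLAIN advice above, the statements below add a composite index and inspect the plan of the MAX() subquery method. The index name idx_messages_name_id is hypothetical, and the plan you see depends on your MySQL version, storage engine, and data.

```sql
-- Hypothetical index name. The (name, id) column order matches
-- GROUP BY name with MAX(id) and PARTITION BY name ... ORDER BY id.
CREATE INDEX idx_messages_name_id ON messages (name, id);

-- Inspect how MySQL plans the MAX() subquery method.
EXPLAIN
SELECT m.*
FROM messages AS m
JOIN (
    SELECT name, MAX(id) AS max_id
    FROM messages
    GROUP BY name
) AS latest ON m.name = latest.name AND m.id = latest.max_id;
```

In the EXPLAIN output, check the key column to confirm the index is actually chosen, and compare the estimated rows against the table size. On a tiny table like the sample data the optimizer may reasonably prefer a full scan, so test against realistic volumes.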

The GeeksforGeeks tutorial emphasizes that window functions provide the most modern and maintainable solution when available.

Complete Example with Sample Data

Here’s a complete working example using the sample data you provided:

sql
-- Create sample table
CREATE TABLE messages (
    id INT PRIMARY KEY,
    name VARCHAR(10),
    other_columns VARCHAR(20)
);

-- Insert sample data
INSERT INTO messages VALUES
(1, 'A', 'A_data_1'),
(2, 'A', 'A_data_2'),
(3, 'A', 'A_data_3'),
(4, 'B', 'B_data_1'),
(5, 'B', 'B_data_2'),
(6, 'C', 'C_data_1');

-- Method 1: Window Functions (MySQL 8.0+)
WITH numbered_messages AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) as rn
    FROM messages
)
SELECT * FROM numbered_messages
WHERE rn = 1;

-- Method 2: Self-JOIN (All versions)
SELECT m1.*
FROM messages m1
LEFT JOIN messages m2 ON m1.name = m2.name AND m1.id < m2.id
WHERE m2.id IS NULL;

-- Method 3: Subquery with MAX (All versions)
SELECT m.*
FROM messages m
JOIN (
    SELECT name, MAX(id) as max_id
    FROM messages
    GROUP BY name
) latest ON m.name = latest.name AND m.id = latest.max_id;

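All three methods return the same three rows, although the row order is not guaranteed without an ORDER BY and Method 1 also includes the extra rn column: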

3    A       A_data_3
5    B       B_data_2
6    C       C_data_1

Conclusion

The key takeaways for retrieving the last record in each group in MySQL are:

  1. For MySQL 8.0+: Use window functions with ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) for a readable solution that generally performs well.

  2. For older MySQL versions: The self-JOIN approach or subquery with MAX() methods provide reliable alternatives, though they may have different performance characteristics.

  3. Performance optimization: Always ensure proper indexing on both the grouping column and the ordering column for best results.

  4. Query testing: Use EXPLAIN to analyze query execution plans and test different approaches on your specific dataset size and structure.

The window function approach, which SQLpey also recommends, is generally the most readable choice for modern MySQL installations, and it uses standard SQL.

Sources

  1. SQL - Retrieving the last record in each group - MySQL - Stack Overflow
  2. How To Get Last Record In Each Group In MySQL? - GeeksforGeeks
  3. Solved: How to retrieve the latest record per group in MySQL - SQLpey
  4. MySQL - Select Latest Record for Each Group - Codelabs365
  5. Get Last Record in Each MySQL Group - Virtueinfo
  6. Window Functions in MySQL - MySQL Documentation
  7. LAST_VALUE Function Documentation
  8. How to get latest 2 records of each group - Database Administrators Stack Exchange