NeuroAgent

Complete Guide: Extracting Records with PowerQuery in Excel

How to extract specific records from a column to separate columns using PowerQuery in Excel?

I’m trying to extract specific information from a body column and populate it into separate columns using PowerQuery in Excel.

Scenario:
I’m importing data from Microsoft Exchange into an Excel file with columns including “DateTimeReceived”, “Subject”, and “Body”. The “Body” column contains multiple records including first name, last name, email, and cost center.

Requirement:
I need to extract these individual records from the “Body” column into four separate columns: “First Name”, “Last Name”, “Email”, and “Cost Center” using Excel PowerQuery.

Steps I’ve attempted:

  1. Split by delimiter “:”
  2. Split by delimiter " "
  3. Added a custom column using “Text.Contains”

Can someone provide guidance on the most effective way to extract and organize this data into separate columns?

NeuroAgent

PowerQuery offers several powerful methods to extract specific records from a column into separate columns, including using the Text.BetweenDelimiters function, splitting by delimiters, and creating custom columns with extraction formulas. The most effective approach depends on your data structure and delimiter patterns, with the Text.BetweenDelimiters function being particularly useful when you have consistent start and end delimiters marking the specific information you want to extract.

Understanding Your Data Structure

Before extracting data, analyze the structure of your “Body” column. Successful extraction depends on identifying the consistent labels and delimiters that mark each field.

“The most similar function to Microsoft Excel’s text to columns button is Splitting columns by delimiter. It allows you define a specific character and separate out portions of a single column into new columns based on it.” Source: PopAutomation

Look for consistent patterns in your Exchange data. For example:

  • First names might be preceded by “First Name:” or “FN:”
  • Email addresses typically contain “@” symbols
  • Cost centers might be indicated by specific codes or patterns
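
A quick way to check these patterns before writing any extraction formula is to flag the rows where an expected label is missing. The sketch below assumes the four labels named in the scenario; the sample rows and the "Missing Labels" column name are hypothetical stand-ins for your imported Exchange table.

```powerquery
let
    // Hypothetical sample standing in for the imported Exchange table
    Source = #table(
        type table [Subject = text, Body = text],
        {
            {"New user", "First Name: Ana Last Name: Diaz Email: ana.diaz@example.com Cost Center: 4100"},
            {"New user", "FN: Ben LN: Okoro Email: ben.okoro@example.com"}
        }
    ),
    // Labels the extraction formulas will rely on (assumed from the scenario)
    ExpectedLabels = {"First Name:", "Last Name:", "Email:", "Cost Center:"},
    // For each row, list the expected labels that do not appear in Body
    WithMissing = Table.AddColumn(
        Source,
        "Missing Labels",
        each Text.Combine(
            List.Select(ExpectedLabels, (label) => not Text.Contains([Body], label)),
            ", "
        ),
        type text
    )
in
    WithMissing
```

Rows with an empty "Missing Labels" value follow the expected pattern; the others show which labels need an alternative rule.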

Method 1: Using Text.BetweenDelimiters Function

The Text.BetweenDelimiters function is specifically designed for extracting text between specific start and end delimiters. This is ideal when your data follows a consistent pattern.

Basic Syntax:

powerquery
Text.BetweenDelimiters(text as nullable text, startDelimiter as text, endDelimiter as text, optional startIndex as any, optional endIndex as any) as any

Step-by-Step Implementation:

  1. Open Power Query Editor in Excel
  2. Select your table with the “Body” column
  3. Go to Add Column → Custom Column
  4. Use formulas like these for each field:
powerquery
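// Extract First Name (Text.Trim removes the spaces around the value)
= Text.Trim(Text.BetweenDelimiters([Body], "First Name:", "Last Name:"))

// Extract Last Name
= Text.Trim(Text.BetweenDelimiters([Body], "Last Name:", "Email:"))

// Extract Email
= Text.Trim(Text.BetweenDelimiters([Body], "Email:", "Cost Center:"))

// Extract Cost Center ("End of Record:" is a placeholder for whatever text follows
// the cost center in your data; if nothing follows it, use Text.AfterDelimiter instead)
= Text.Trim(Text.BetweenDelimiters([Body], "Cost Center:", "End of Record:"))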

Handling Edge Cases:

powerquery
= if Text.Contains([Body], "First Name:") then 
    Text.BetweenDelimiters([Body], "First Name:", "Last Name:") 
  else null

As RADACAD explains, this function is perfect for extracting specific parts of text values using delimiters.
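
The four formulas above can also be combined into a single query in the Advanced Editor. This is a minimal sketch, assuming the labels appear in the order shown and that Cost Center is the last field in the body; the one-row sample table is hypothetical and should be replaced with your Exchange query.

```powerquery
let
    // Hypothetical one-row sample; replace with your Exchange query
    Source = #table(
        type table [Body = text],
        {{"First Name: Ana Last Name: Diaz Email: ana.diaz@example.com Cost Center: 4100"}}
    ),
    AddFirst = Table.AddColumn(Source, "First Name",
        each Text.Trim(Text.BetweenDelimiters([Body], "First Name:", "Last Name:")), type text),
    AddLast = Table.AddColumn(AddFirst, "Last Name",
        each Text.Trim(Text.BetweenDelimiters([Body], "Last Name:", "Email:")), type text),
    AddEmail = Table.AddColumn(AddLast, "Email",
        each Text.Trim(Text.BetweenDelimiters([Body], "Email:", "Cost Center:")), type text),
    // Cost Center is assumed to be the last field, so take everything after its label
    AddCostCenter = Table.AddColumn(AddEmail, "Cost Center",
        each Text.Trim(Text.AfterDelimiter([Body], "Cost Center:")), type text)
in
    AddCostCenter
```

Each step adds one column and leaves "Body" intact, so you can compare the extracted values against the original text before removing it.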

Method 2: Split Column by Delimiter

This method is useful when your data is structured with consistent separators.

Step-by-Step Implementation:

  1. Select the “Body” column in Power Query Editor
  2. Go to Transform → Split Column → By Delimiter
  3. Choose your delimiter (e.g., “:”, “;”, or a space)
  4. Choose Split into Rows or Split into Columns based on your needs

For extracting multiple fields, you might need multiple split operations:

  1. First split by a major delimiter (like “Record:” or “END”)
  2. Then split the resulting columns by smaller delimiters

According to Microsoft Support, you can split a column with text data type into two or more columns by using the number of characters or delimiters within a text value.
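
The two-stage split can be sketched in M as follows. It assumes the fields are separated by ";" and always appear in the same order, which may not match your Exchange bodies; the sample row is hypothetical.

```powerquery
let
    // Hypothetical sample in which ";" separates the fields
    Source = #table(
        type table [Body = text],
        {{"First Name: Ana; Last Name: Diaz; Email: ana.diaz@example.com; Cost Center: 4100"}}
    ),
    // Step 1: split on the major delimiter into one column per field
    SplitFields = Table.SplitColumn(
        Source,
        "Body",
        Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv),
        {"First Name", "Last Name", "Email", "Cost Center"}
    ),
    // Step 2: drop the "Label:" prefix from every column and trim what is left
    StripLabels = Table.TransformColumns(
        SplitFields,
        {},
        each Text.Trim(Text.AfterDelimiter(_, ":"))
    )
in
    StripLabels
```

Because the columns are named by position, this approach only works when no field is missing or reordered; the label-based methods are safer when that cannot be guaranteed.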

Method 3: Custom Column with Position Functions

For more complex extraction scenarios, you can use position-based functions.

Step-by-Step Implementation:

  1. Go to Add Column → Custom Column
  2. Use functions like Text.PositionOf and Text.Range:
powerquery
// Extract text between specific positions
// "First Name:" is 11 characters long, so the value starts 11 characters after the label
= Text.Range(
    [Body],
    Text.PositionOf([Body], "First Name:") + 11,
    Text.PositionOf([Body], "Last Name:") - Text.PositionOf([Body], "First Name:") - 11
)

// Alternative approach that returns null when either label is missing
// (Text.PositionOf returns -1 when the text is not found)
= let
      marker = "First Name:",
      markerPos = Text.PositionOf([Body], marker),
      startPos = markerPos + Text.Length(marker),
      endPos = Text.PositionOf([Body], "Last Name:", Occurrence.First)
  in
      if markerPos >= 0 and endPos >= startPos then
          Text.Trim(Text.Range([Body], startPos, endPos - startPos))
      else null

The Stack Overflow discussion provides detailed examples of using these functions for complex extraction scenarios.

Method 4: Extract Commands in Power Query

Power Query provides built-in extract commands for common scenarios.

Step-by-Step Implementation:

  1. Select the “Body” column
  2. Go to Add Column → Extract
  3. Choose from these options:
    • Text Before Delimiter - Extracts text before a specific character
    • Text After Delimiter - Extracts text after a specific character
    • Text Between Delimiters - Extracts text between two delimiters

For your scenario:

  1. Create “First Name” column using Text Between Delimiters with “First Name:” and “Last Name:” as delimiters (Text Before Delimiter would also keep the “First Name:” label itself)
  2. Create “Last Name” column using Text Between Delimiters with “Last Name:” and “Email:” as delimiters
  3. Create “Email” column using Text Between Delimiters with “Email:” and “Cost Center:” as delimiters
  4. Create “Cost Center” column using Text After Delimiter with “Cost Center:” as the delimiter

As Excel Campus suggests, instead of creating duplicate columns, you can go to “Add Column” → “Extract” and perform text before, after, and between delimiters.
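
Behind the scenes, each Extract command adds a Table.AddColumn step to the query. The sketch below shows roughly what two of those steps look like, followed by a rename and trim; the step names, default column names, and sample row are assumptions and may differ from what your version of Excel generates.

```powerquery
let
    // Hypothetical one-row sample; replace with your Exchange query
    Source = #table(
        type table [Body = text],
        {{"First Name: Ana Last Name: Diaz Email: ana.diaz@example.com Cost Center: 4100"}}
    ),
    // Add Column → Extract → Text Between Delimiters
    InsertedBetween = Table.AddColumn(Source, "Text Between Delimiters",
        each Text.BetweenDelimiters([Body], "Last Name:", "Email:"), type text),
    // Add Column → Extract → Text After Delimiter
    InsertedAfter = Table.AddColumn(InsertedBetween, "Text After Delimiter",
        each Text.AfterDelimiter([Body], "Cost Center:"), type text),
    // Give the new columns meaningful names
    Renamed = Table.RenameColumns(InsertedAfter, {
        {"Text Between Delimiters", "Last Name"},
        {"Text After Delimiter", "Cost Center"}
    }),
    // Remove the spaces left around each value
    Trimmed = Table.TransformColumns(Renamed, {
        {"Last Name", Text.Trim, type text},
        {"Cost Center", Text.Trim, type text}
    })
in
    Trimmed
```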

Advanced Techniques for Complex Scenarios

Handling Multiple Occurrences

If your data contains multiple records in the same cell, you may need to split into rows first:

powerquery
let
    Source = YourTable,
    // "||" is an assumed record separator; replace it with the one in your data
    SplitRows = Table.ExpandListColumn(
        Table.TransformColumns(Source, {{"Body", Splitter.SplitTextByDelimiter("||", QuoteStyle.Csv)}}),
        "Body"
    ),
    ExtractFields = Table.AddColumn(SplitRows, "First Name", each Text.BetweenDelimiters([Body], "First Name:", "Last Name:"))
    // Add more Table.AddColumn steps here as needed
in
    ExtractFields

Creating Reusable Functions

For repetitive extraction tasks, you can create custom functions:

powerquery
= (text as text, startDelim as text, endDelim as text) as nullable text =>
    if Text.Contains(text, startDelim) and Text.Contains(text, endDelim) then
        Text.BetweenDelimiters(text, startDelim, endDelim)
    else null

The My Online Training Hub resource provides insights into creating reusable functions for text extraction.
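
One possible way to apply such a function to several fields at once is to describe each field as a small list and fold over those lists with List.Accumulate. The names ExtractBetween and Fields, and the sample rows, are hypothetical; the function is a variant of the one above that also accepts a null body and trims the result.

```powerquery
let
    // Hypothetical sample: one well-formed body and one without any labels
    Source = #table(
        type table [Body = text],
        {
            {"First Name: Ana Last Name: Diaz Email: ana.diaz@example.com Cost Center: 4100"},
            {"Automatic reply: out of office"}
        }
    ),
    // Returns the trimmed text between two markers, or null if either marker is missing
    ExtractBetween = (body as nullable text, startDelim as text, endDelim as text) as nullable text =>
        if body <> null and Text.Contains(body, startDelim) and Text.Contains(body, endDelim) then
            Text.Trim(Text.BetweenDelimiters(body, startDelim, endDelim))
        else null,
    // Each entry is {new column name, start marker, end marker}
    Fields = {
        {"First Name", "First Name:", "Last Name:"},
        {"Last Name", "Last Name:", "Email:"},
        {"Email", "Email:", "Cost Center:"}
    },
    // Add one column per entry in Fields, starting from Source
    Result = List.Accumulate(
        Fields,
        Source,
        (tbl, f) => Table.AddColumn(tbl, f{0},
            each ExtractBetween([Body], f{1}, f{2}), type nullable text)
    )
in
    Result
```

Adding another field then only requires a new entry in Fields, provided it has both a start and an end marker.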

Handling Inconsistent Data

For data with inconsistent patterns, use conditional logic:

powerquery
= if Text.Contains([Body], "First Name:") then
    Text.BetweenDelimiters([Body], "First Name:", "Last Name:")
  else if Text.Contains([Body], "FN:") then
    Text.BetweenDelimiters([Body], "FN:", "LN:")
  else null
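
When the number of label variants grows, the nested if chain can be replaced by a list of marker pairs that are tried in order. This is a sketch under the assumption that every variant has both a start and an end marker; ExtractFirstMatch and the sample rows are hypothetical.

```powerquery
let
    // Tries each {start, end} marker pair in order and returns the first match, or null
    ExtractFirstMatch = (body as nullable text, markerPairs as list) as nullable text =>
        let
            matches = List.Transform(
                markerPairs,
                (p) =>
                    if body <> null and Text.Contains(body, p{0}) and Text.Contains(body, p{1}) then
                        Text.Trim(Text.BetweenDelimiters(body, p{0}, p{1}))
                    else null
            )
        in
            List.First(List.RemoveNulls(matches), null),
    // Hypothetical sample covering both label styles and a body with no labels
    Source = #table(
        type table [Body = text],
        {{"First Name: Ana Last Name: Diaz"}, {"FN: Ben LN: Okoro"}, {"no labels here"}}
    ),
    Result = Table.AddColumn(Source, "First Name",
        each ExtractFirstMatch([Body], {{"First Name:", "Last Name:"}, {"FN:", "LN:"}}),
        type nullable text)
in
    Result
```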

Troubleshooting Common Issues

Issue 1: Text Not Found

Problem: Extraction returns null or empty values because a label is missing from the text
Solution: Check for the label first and return a fallback value:

powerquery
= if Text.Contains([Body], "First Name:") then
    Text.BetweenDelimiters([Body], "First Name:", "Last Name:")
  else "Not Found"

Issue 2: Multiple Records in One Cell

Problem: Multiple records are combined in one cell
Solution: Split into rows first, then extract:

powerquery
= Table.ExpandListColumn(
    Table.TransformColumns(YourTable, {{"Body", Splitter.SplitTextByDelimiter("||", QuoteStyle.Csv)}}),
    "Body"
)

Issue 3: Inconsistent Delimiters

Problem: Delimiters vary between records
Solution: Use multiple extraction methods or create a comprehensive function:

powerquery
= if Text.Contains([Body], "First Name:") then
    Text.BetweenDelimiters([Body], "First Name:", "Last Name:")
  else if Text.Contains([Body], "Name:") then
    Text.BetweenDelimiters([Body], "Name:", "Email:")
  else null

The Reddit discussions show practical examples of handling different types of delimiters and edge cases.

Conclusion

Extracting specific records from a column to separate columns using PowerQuery can be accomplished through several effective methods:

  1. Text.BetweenDelimiters is the most straightforward method when you have consistent start and end delimiters
  2. Split Column by Delimiter works well for data with clear separators
  3. Custom columns with position functions provide maximum flexibility for complex scenarios
  4. Built-in Extract commands offer a user-friendly interface for common extraction tasks

For your Exchange data scenario, I recommend starting with the Text.BetweenDelimiters method as it’s designed specifically for extracting text between delimiters. If your data has inconsistent patterns, combine it with conditional logic or create reusable functions.

The key to successful data extraction is understanding your data structure and choosing the method that best matches your specific delimiter patterns. Start with a simple approach, test it on a small sample of your data, and then scale up to your full dataset.

Sources

  1. Microsoft Support - Split a column of text (Power Query)
  2. Microsoft Learn - Text.BetweenDelimiters
  3. RADACAD - Extract Parts of a Text Value in Power BI using a Delimiter
  4. Excel Campus - How to Split Cells and Text in Excel with Power Query
  5. Stack Overflow - Power BI Query - Extract text between delimiters to a new column
  6. My Online Training Hub - Remove Text Between Delimiters - Power Query
  7. PopAutomation - Methods to Split Columns in Power Query