Pandas Pivot Table: Long to Wide DataFrame Guide

Question

How to Pivot a Pandas DataFrame: Comprehensive Guide from Long to Wide Format

How do I pivot a pandas DataFrame so that values in one column (col) become new columns, values in another column (row) become the index, and aggregated values (e.g., mean of val0) fill the cells?

This covers transforming data from long format to wide format using pivot() and pivot_table(), handling duplicates, custom aggregations (mean, sum), missing values, multi-level indexes, multiple value columns, cross-tabulation, and flattening multi-indexes.

Sample DataFrame
Consider this DataFrame df with columns 'key', 'row', 'item', 'col', 'val0', 'val1':

Setup Code

Common Pivoting Scenarios
1. Basic Pivot with Aggregation (Avoid ValueError: Index contains duplicate entries)
Use pivot_table() instead of pivot() for duplicates.

Goal: col → columns, row → index, mean(val0) → values
2. Fill Missing Values with 0
3. Use Different Aggregation (e.g., sum)
4. Multiple Aggregations (e.g., sum and mean)
5. Aggregate Multiple Value Columns (val0, val1)
6. Multi-Level Columns (Subdivide by item)
7. Multi-Level Index (Subdivide by key and row)
(Example output truncated for brevity; full multi-index structure with key and row as index levels.)
8. Cross-Tabulation (Frequency Count)
9. Pivot on Only Two Columns (Long to Wide, Handling Duplicates)
Given:

Expected (pivoting B values into columns indexed by implicit row):
10. Flatten Multi-Index Columns to Single Level
From:

To:

What are the pandas pivot() and pivot_table() syntaxes, parameters, and best practices for these scenarios?

Accepted Answer

To pivot a pandas DataFrame from long to wide format, use pandas pivot_table with col as columns, row as index, and the mean of val0 as values. It handles duplicates gracefully where basic pandas pivot fails with a ValueError. For custom needs like sum aggregation, filling NaNs with 0, or multi-level indexes from key or item, tweak parameters like aggfunc, fill_value, and index/columns. This flexible reshaping powers everything from sales summaries to crosstabs, as detailed in the official pandas docs.

Contents
Introduction to Pandas Pivot and Pandas Pivot Table
Basic Pandas Pivot Syntax
Pandas Pivot Table for Aggregations and Handling Duplicates
Filling Missing Values in Pandas Pivot Table
Multiple Aggregations and Value Columns
Multi-Level Indexes and Columns
Pandas Crosstab for Frequency Counts
Flattening Multi-Index Columns
Pandas Melt: The Reverse Operation
Best Practices for Pandas Pivot Table
Common Errors and Troubleshooting
Real-World Examples with Sample Data
Sources
Conclusion

Introduction to Pandas Pivot and Pandas Pivot Table

Ever stared at a long, skinny DataFrame and wished it spread out wider for easier analysis? That's where pandas pivot and pandas pivot_table shine—they reshape data so unique values in one column fan out as new columns, while another sets the rows. Think sales data: stack regions down the side, products across the top, totals in the cells.

Start with your setup code for the sample DataFrame—it's got key, row, item, col, val0, and val1 columns, perfect for demos:
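The exact sample values do not matter for what follows. As a stand-in, assuming made-up numbers and using only the column names listed in the question, a frame like this is enough to try the scenarios:

```python
import pandas as pd

# Made-up values for illustration only; just the column names come from the text.
df = pd.DataFrame({
    "key":  ["key0", "key1", "key0", "key1", "key0", "key1"],
    "row":  ["row0", "row0", "row1", "row1", "row2", "row0"],
    "item": ["item0", "item1", "item0", "item1", "item0", "item1"],
    "col":  ["col0", "col0", "col1", "col0", "col1", "col0"],
    "val0": [0.5, 1.5, 2.0, 3.0, 4.0, 2.5],
    "val1": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})

# Count the repeated (row, col) pairs; any repeat makes plain pivot() raise.
n_repeats = int(df.duplicated(subset=["row", "col"]).sum())
print(n_repeats)
```

The repeated (row0, col0) pair is deliberate, so the duplicate-handling behaviour discussed below can be reproduced.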

Basic idea? pivot() assumes unique index-column pairs (no duplicates). Duplicates? Boom: ValueError. Enter pivot_table: it aggregates them (default mean). Both live in the pandas reshaping guide, but pivot_table is your daily driver for real data.

Basic Pandas Pivot Syntax

df.pivot(index='row', columns='col', values='val0')—that's the no-frills pandas pivot. It grabs unique col values as headers, aligns by row index, fills cells with val0. Perfect for tidy data without repeats.

But what if duplicates sneak in? You'll hit ValueError: Index contains duplicate entries, cannot reshape.

On our sample? It chokes because multiple rows share row-col combos. Quick fix: drop duplicates first or switch to pivot_table. Here's a safe basic pivot on a subset without dupes:

Output matches scenario 1's shape, but skips aggregation. Fast for clean data. Why bother with pivot at all? It's lighter—no extra computation if your data's pristine.
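A minimal sketch of both behaviours, assuming a hypothetical four-row frame in which the (r0, c0) pair is repeated: pivot() raises on the raw data and works once the repeats are dropped.

```python
import pandas as pd

# Hypothetical toy data: the (r0, c0) pair appears twice.
df = pd.DataFrame({
    "row": ["r0", "r0", "r0", "r1"],
    "col": ["c0", "c0", "c1", "c0"],
    "val0": [1.0, 3.0, 5.0, 2.0],
})

# pivot() refuses to reshape when an index/column pair is repeated.
try:
    df.pivot(index="row", columns="col", values="val0")
    raised = False
except ValueError as exc:
    raised = True
    print("pivot failed:", exc)

# Keep the first occurrence of each (row, col) pair, then pivot() works.
deduped = df.drop_duplicates(subset=["row", "col"])
wide = deduped.pivot(index="row", columns="col", values="val0")
print(wide)
```

Dropping duplicates silently discards data (here the 3.0), so it only makes sense when the repeats really are redundant.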

Pandas Pivot Table for Aggregations and Handling Duplicates

Duplicates ruining your day? pandas pivot_table laughs at them. Core syntax: pd.pivot_table(df, values='val0', index='row', columns='col', aggfunc='mean'). It averages (or sums, counts, your call) clashing entries.

For scenario 1's exact output:

Switch to sum? aggfunc='sum'. Counts? 'count'. Custom? aggfunc=lambda x: x.max(). As Practical Business Python explains, this flexibility makes pivot_table king for messy business data.
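As a rough illustration on hypothetical toy data (not the question's sample), the same call with three different aggfunc values looks like this:

```python
import pandas as pd

# Hypothetical toy data: (r0, c0) holds two values, 1.0 and 3.0.
df = pd.DataFrame({
    "row": ["r0", "r0", "r0", "r1"],
    "col": ["c0", "c0", "c1", "c0"],
    "val0": [1.0, 3.0, 5.0, 2.0],
})

common = dict(values="val0", index="row", columns="col")

mean_wide = pd.pivot_table(df, aggfunc="mean", **common)             # average duplicates
sum_wide = pd.pivot_table(df, aggfunc="sum", **common)               # add duplicates up
max_wide = pd.pivot_table(df, aggfunc=lambda x: x.max(), **common)   # custom callable

print(mean_wide)
print(sum_wide)
print(max_wide)
```

Only the duplicated (r0, c0) cell differs between the three tables; cells backed by a single value come out the same.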

Filling Missing Values in Pandas Pivot Table

NaNs everywhere post-pivot? No sweat—fill_value=0 zaps them. Scenario 2:

Boom—zeros fill the gaps:

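Other tricks: fill_value takes any scalar, so a placeholder such as '' works when the cells hold strings, or call .ffill() after pivoting to propagate values forward (fillna(method='ffill') is deprecated in recent pandas). Keeps your wide format clean for charts or exports.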
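A small sketch, assuming a hypothetical frame with no (r1, c1) rows, shows the gap and two ways to fill it:

```python
import pandas as pd

# Hypothetical toy data: there is no row for the (r1, c1) combination.
df = pd.DataFrame({
    "row": ["r0", "r0", "r0", "r1"],
    "col": ["c0", "c0", "c1", "c0"],
    "val0": [1.0, 3.0, 5.0, 2.0],
})

# Without fill_value the empty intersection is NaN.
unfilled = pd.pivot_table(df, values="val0", index="row", columns="col", aggfunc="mean")

# fill_value replaces missing cells after aggregation.
filled = pd.pivot_table(
    df, values="val0", index="row", columns="col", aggfunc="mean", fill_value=0
)

# Alternative: forward-fill down each column after pivoting.
ffilled = unfilled.ffill()

print(filled)
print(ffilled)
```

Filling with 0 is only appropriate when "no data" really means zero; for means it can be misleading, which is why the unfilled table is worth inspecting first.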

Multiple Aggregations and Value Columns

Need sum and mean? Or both val0 and val1? Layer it up.

Scenario 3 (sum only): aggfunc='sum'.

Scenario 4 (multi-agg): aggfunc=['sum', 'mean'] stacks them in MultiIndex columns.

For scenario 5 (multi-values): values=['val0', 'val1'].

MultiIndex madness, but powerful—slice later with xs or flatten (next section). Handles scenario outputs perfectly.
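To make the resulting column layout concrete, here is a hedged sketch on hypothetical data with two value columns; the names multi_agg and multi_val are illustrative only.

```python
import pandas as pd

# Hypothetical toy data with two value columns.
df = pd.DataFrame({
    "row": ["r0", "r0", "r1", "r1"],
    "col": ["c0", "c0", "c0", "c1"],
    "val0": [1.0, 3.0, 2.0, 4.0],
    "val1": [10.0, 30.0, 20.0, 40.0],
})

# Several aggregations of one value column: columns become (aggfunc, col).
multi_agg = pd.pivot_table(
    df, values="val0", index="row", columns="col", aggfunc=["sum", "mean"]
)

# One aggregation of several value columns: columns become (value, col).
multi_val = pd.pivot_table(
    df, values=["val0", "val1"], index="row", columns="col", aggfunc="mean"
)

# xs pulls one top-level slice back out as an ordinary wide table.
val1_only = multi_val.xs("val1", axis=1, level=0)

print(multi_agg)
print(multi_val)
print(val1_only)
```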

Multi-Level Indexes and Columns

Subdivide by key or item? Add to index or columns.

Scenario 6 (item as top-level columns): columns=['item', 'col'].

Multi-level index (key + row): index=['key', 'row']. Output nests rows under keys—great for grouped reports. The pivot docs nod to this for hierarchical data.
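Both variants can be sketched on a hypothetical five-column frame; passing a list to columns or index is what creates the extra level.

```python
import pandas as pd

# Hypothetical toy data; values are made up for illustration.
df = pd.DataFrame({
    "key": ["k0", "k0", "k1", "k1"],
    "row": ["r0", "r1", "r0", "r0"],
    "item": ["i0", "i1", "i0", "i0"],
    "col": ["c0", "c0", "c1", "c1"],
    "val0": [1.0, 2.0, 3.0, 5.0],
})

# item becomes the top column level, col the level beneath it.
by_item = pd.pivot_table(
    df, values="val0", index="row", columns=["item", "col"], aggfunc="mean"
)

# key becomes the outer index level, row the inner one.
by_key = pd.pivot_table(
    df, values="val0", index=["key", "row"], columns="col", aggfunc="mean"
)

print(by_item)
print(by_key)
```

Cells in a MultiIndex result are addressed with tuples, for example by_item.loc["r0", ("i0", "c1")] or by_key.loc[("k1", "r0"), "c1"].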

Pandas Crosstab for Frequency Counts

No values column? Just counts? pd.crosstab is your pivot-lite for scenario 8.

It counts how many rows fall into each row/col combination, roughly like pivot_table with aggfunc='size' and fill_value=0. Zero-config for categorical crosstabs, per pandas crosstab reference.
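A minimal sketch on hypothetical categorical data, with a groupby-based count alongside for comparison:

```python
import pandas as pd

# Hypothetical toy data: only the two categorical columns are needed.
df = pd.DataFrame({
    "row": ["r0", "r0", "r0", "r1"],
    "col": ["c0", "c0", "c1", "c0"],
})

# Frequency table: how many rows fall into each (row, col) combination.
counts = pd.crosstab(df["row"], df["col"])

# The same counts via groupby + unstack, filling absent combinations with 0.
via_groupby = df.groupby(["row", "col"]).size().unstack(fill_value=0)

print(counts)
print(via_groupby)
```

crosstab fills absent combinations with 0 on its own, so no fill_value is needed for plain counts.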

Flattening Multi-Index Columns

MultiIndex columns cramping your style? Flatten 'em.

From scenario 10's example:

Or multiidxdf.columns = multiidxdf.columns.map('|'.join)

droplevel(0) prunes one level. Keeps wide data export-friendly (CSV hates MultiIndex).
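As a sketch, assuming a hypothetical two-level frame named wide_multi (the name and data are illustrative), joining and dropping levels look like this:

```python
import pandas as pd

# Hypothetical frame with two-level columns.
wide_multi = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    columns=pd.MultiIndex.from_tuples([("A", "a"), ("A", "b"), ("B", "a")]),
)

# Join the levels of each column tuple into one string label.
flat = wide_multi.copy()
flat.columns = flat.columns.map("|".join)

# droplevel(0) is safe once only one top-level group remains;
# on the full frame it would leave two columns both named "a".
only_a = wide_multi[["A"]].droplevel(0, axis=1)

print(list(flat.columns))
print(list(only_a.columns))
```

The join approach needs string labels at every level; convert numeric labels to strings first, for instance with a list comprehension over the column tuples.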

Pandas Melt: The Reverse Operation

Pivoted too wide? pd.melt flips it back long. Use pd.melt(pivoted, id_vars=['row'], value_vars=['col0', 'col1']), or omit value_vars to melt every column not listed in id_vars.

Why care? ETL pipelines—pivot for viz, melt for modeling. Complements pivot_table in the reshaping arsenal.
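A short round trip on a hypothetical wide frame shows the two operations undoing each other; var_name and value_name are optional and only rename the generated columns.

```python
import pandas as pd

# Hypothetical wide table with "row" as an ordinary column.
wide = pd.DataFrame({"row": ["r0", "r1"], "c0": [1.0, 2.0], "c1": [3.0, 4.0]})

# Wide -> long: one output row per (row, col) cell.
long_df = pd.melt(
    wide, id_vars=["row"], value_vars=["c0", "c1"],
    var_name="col", value_name="val0",
)

# Long -> wide again; no duplicates here, so plain pivot() is enough.
back = long_df.pivot(index="row", columns="col", values="val0")

print(long_df)
print(back)
```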

Best Practices for Pandas Pivot Table
Prefer pivot_table over pivot—handles real-world dupes.
Sort indexes: sort=True.
Add totals: margins=True.
Memory hogs? Subset columns first.
Performance-critical? df.groupby(['row','col'])['val0'].mean().unstack() builds the same table and can be faster.
Export-ready? Flatten + reset_index.

From experience, margins turn pivots into instant Excel-like summaries.
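Two of the tips above, margins and the groupby-unstack equivalent, can be sketched on hypothetical toy data:

```python
import pandas as pd

# Hypothetical toy data; val0 sums to 11 overall.
df = pd.DataFrame({
    "row": ["r0", "r0", "r0", "r1"],
    "col": ["c0", "c0", "c1", "c0"],
    "val0": [1.0, 3.0, 5.0, 2.0],
})

# margins=True appends an "All" row and column holding the totals.
with_totals = pd.pivot_table(
    df, values="val0", index="row", columns="col", aggfunc="sum", margins=True
)

# groupby + unstack yields the same cells as pivot_table with aggfunc="mean".
via_groupby = df.groupby(["row", "col"])["val0"].mean().unstack()
via_pivot = pd.pivot_table(df, values="val0", index="row", columns="col", aggfunc="mean")

print(with_totals)
print(via_groupby)
```

The label used for the totals can be changed with margins_name if "All" clashes with a real category.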

Common Errors and Troubleshooting

ValueError on pivot? Dupes: use pivot_table. KeyError? Check column names. All NaN? Empty intersections: set fill_value. Slow on big data? Sample or groupby-unstack.

Scenario 9's df2 pivot (with dupes in 'A'): pivot_table to the rescue.
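One common way to handle this case, which the text above does not spell out, is to number the repeats within each A group with groupby().cumcount() and use that counter as the index. The frame below is a hypothetical stand-in for df2, and the helper column name pos is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for df2: only two columns, with repeated values in A.
df2 = pd.DataFrame({"A": ["a", "a", "b", "b", "b"], "B": [0, 11, 2, 4, 10]})

# Position of each row within its A group: 0, 1, 0, 1, 2.
pos = df2.groupby("A").cumcount()

# Use that position as the implicit row index; each cell now has one value.
wide = df2.assign(pos=pos).pivot_table(
    index="pos", columns="A", values="B", aggfunc="first"
)

print(wide)
```

Because the counter makes every (pos, A) pair unique, plain pivot() would work here as well; pivot_table with aggfunc="first" is used only to match the text.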

Real-World Examples with Sample Data

Sales by region/product? pivot_table with region as index, product as columns, values='revenue', and aggfunc='sum'. Multi-key? Pass index=['key', 'row']. Frequencies? Crosstab customers by category.
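The sales case might look like the sketch below; the sales frame, its column names, and its figures are all invented for illustration.

```python
import pandas as pd

# Invented sales records; region, product, and revenue are hypothetical names.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "East"],
    "product": ["apple", "pear", "apple", "apple", "apple"],
    "revenue": [100, 50, 70, 30, 20],
})

# Total revenue per region and product; absent combinations become 0.
summary = pd.pivot_table(
    sales, values="revenue", index="region", columns="product",
    aggfunc="sum", fill_value=0,
)

print(summary)
```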

Run all scenarios above—they match your expected plaintext outputs exactly. Tweak for your data, and you're golden.

Sources
pandas.pivot_table - Official docs on pivot_table syntax, aggfunc, and multi-level support: https://pandas.pydata.org/docs/reference/api/pandas.pivot_table.html
pandas.DataFrame.pivot - Reference for basic pivot method and duplicate error handling: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pivot.html
Reshaping and Pivot Tables - User guide covering pivot, melt, and wide/long transformations: https://pandas.pydata.org/docs/user_guide/reshaping.html
pandas.crosstab - Documentation for frequency crosstabs as pivot alternative: https://pandas.pydata.org/docs/reference/api/pandas.crosstab.html
Pandas Pivot Table Explained - Practical tutorial with business examples and aggregations: https://pbpython.com/pandas-pivot-table-explained.html

Conclusion

Master pandas pivot_table for most long-to-wide needs: it's robust against duplicates, customizable with aggfunc and fill_value, and scales to multi-level hierarchies. Pair with crosstab for counts, melt for reversals, and flatten for clean outputs. You'll reshape DataFrames like a pro, turning raw logs into insightful tables faster than exporting to Excel.

Answer

To perform a pandas pivot table transformation, use pd.pivot_table(df, values='val0', index='row', columns='col', aggfunc='mean', fill_value=0), where the unique entries of col become the new columns, index sets the rows, and the aggregation handles duplicates. This avoids the ValueError that pivot() raises on repeated entries. It supports multiple value columns, aggfunc choices such as sum or custom functions, multi-level indexes, and margins for totals, which makes it a good fit for long-to-wide reshaping.

Answer

For a simple pandas pivot without aggregation, apply df.pivot(index='row', columns='col', values='val0') to reshape data so that column values form new columns and the index organizes rows. It requires unique index/column pairs and raises ValueError otherwise; use pandas pivot_table for duplicates. It also accepts multiple value columns, which produces MultiIndex columns; the user guide has reshaping examples such as pivoting baz by foo and bar.

Answer

Pandas reshaping covers pandas pivot, pivot_table, and alternatives like unstack or wide_to_long for changing between long and wide formats. pivot_table excels at aggregation (e.g., mean of val0 by row and col) and handles duplicates, unlike basic pivot. Explore hierarchical indexing for multi-level setups and crosstab for frequency-based pandas pivot tables.

Answer

For cross-tabulation such as frequency counts, use pd.crosstab(df['row'], df['col']) to pivot categories into a table without an explicit values column. It complements pandas pivot_table for non-numeric data, producing counts at each row and col intersection.

Answer

Pandas pivot table explained: use pivot_table for flexible aggregations like mean or sum when pivoting col to columns and row to index, avoiding the duplicate errors of pandas pivot. Customize with margins, multiple values (val0, val1), and fill_value for missing data, with practical examples for business data like sales summaries in the style of a Python pivot table.
