Next.js Excel Upload to NestJS: Parse & Import Guide

Implement Excel (.xlsx) bank statement upload from Next.js to NestJS. Parse rows to JSON with exceljs or xlsx, validate data, map columns to DB fields, and persist using Prisma or TypeORM. Best practices for large files.


How can I implement an upload-and-import flow for Excel bank statements (.xlsx) from a Next.js frontend to a NestJS backend, parse rows into JSON, validate and map columns to database fields, and persist records using Prisma or TypeORM?

Requirements:

  • Next.js frontend uploads an Excel file using multipart/form-data
  • NestJS backend receives the file (using @nestjs/platform-express and FileInterceptor)
  • Backend responsibilities:
      • Read and parse the .xlsx file
      • Convert rows into JSON
      • Optionally map Excel column names to DB fields
      • Validate data (types, required fields, date/number parsing)
      • Save records to the database (Prisma / TypeORM)

Questions I need answered:

  1. Which library is recommended for reading Excel files in NestJS (e.g., exceljs, xlsx, or others)? What are the pros and cons for server-side processing?
  2. What is a reliable approach to parse Excel rows and convert them to JSON (handling headers, empty cells, date and number parsing, and type coercion)?
  3. How to safely map Excel columns to database fields and validate data before insertion (mapping config, schema validation, handling missing/extra columns, duplicate rows)?
  4. What are the best practices for handling large Excel files (streaming vs in-memory parsing, chunked inserts, background jobs/queues, upload size limits and timeouts)?
  5. Could you provide example snippets or an outline for:
      • Next.js multipart/form-data upload
      • NestJS controller using FileInterceptor to accept and forward the uploaded file
      • Parsing an .xlsx into JSON and persisting it with Prisma or TypeORM

Example input format (simple bank-statement rows):

Date | Debit | Credit | Description
01-01-2025 | 500 | | ATM Withdrawal
02-01-2025 | | 2000 | Salary Credit

Any recommended libraries, code examples, validation strategies, and performance/architecture considerations would be very helpful.

Use a multipart/form-data POST from Next.js to send the .xlsx to a NestJS endpoint that uses @nestjs/platform-express FileInterceptor (memoryStorage or disk/S3). Parse the buffer to JSON with either SheetJS (xlsx) for fast row-based parsing or exceljs for streaming/large-file support, normalize headers, validate and coerce types with Zod or class-validator, then persist in chunks using Prisma (createMany / upsert) or TypeORM (bulk insert / upsert). For large imports prefer streaming or upload-to-S3 + a background worker (Bull/BullMQ) and chunked inserts to avoid OOMs and timeouts.


Recommended library for server-side Excel import (exceljs vs xlsx)

Short answer: choose the tool that matches your scale and feature needs. The npm-compare roundup shows the common options and trade-offs, and community posts echo the same conclusions (SheetJS/xlsx for raw parsing speed, exceljs for streaming and Excel feature support) [https://npm-compare.com/excel4node,exceljs,xlsx,xlsx-populate]. For NestJS you’ll usually pick one of these two:

  • SheetJS (package name: xlsx)
      • Pros: very fast for sheet_to_json-style parsing, simple API, handles many file variants; good for synchronous in-memory parsing of small-to-medium files. See simple upload+parse examples in community tutorials [https://medium.com/@yctatli/excel-import-with-nest-js-88ef360500ac], [https://www.kindacode.com/article/node-js-reading-and-parsing-excel-xlsx-files].
      • Cons: memory-heavy on large files; advanced styling/write features may require the paid Pro builds, so check licensing if you need them.
  • ExcelJS (package name: exceljs)
      • Pros: supports streaming read/write (useful for very large files), richer support for cell metadata and styles, and more flexible row-by-row iteration, which helps with incremental processing.
      • Cons: slower than SheetJS for pure in-memory sheet_to_json scenarios; the streaming API takes more code to implement.

Other options (xlsx-populate, node-xlsx, etc.) exist but are less popular for server-side imports. Community threads confirm that exceljs and xlsx are the usual choices for server work [https://www.reddit.com/r/node/comments/1g21jcv/suggest_a_library_that_can_be_used_to_read_write/]. Pick xlsx for quick import/transform pipelines; pick exceljs if you must stream very large files or need to keep memory usage bounded.


Parsing Excel rows and converting to JSON

Goal: reliable rows → JSON with header alignment, empty-cell handling, date and number parsing, and predictable types.

Common rules/steps

  • Read workbook from buffer/stream.
  • Pick a sheet (by name or index).
  • Build normalized headers (trim, lowercase, remove special chars).
  • For each row produce an object keyed by normalized headers.
  • Use explicit coercion for dates and numbers (don’t trust raw strings).
  • Replace empty strings with null (so validators can catch missing required fields).

Option A — SheetJS (xlsx) (in-memory, quick):

  • Use XLSX.read(buffer, { type: 'buffer', cellDates: true })
  • Use XLSX.utils.sheet_to_json(sheet, { defval: null, raw: false }) so empty cells yield null and dates are parsed when possible.

Example (Node / Nest service):

ts
import * as XLSX from 'xlsx';

function normalizeHeader(h: unknown) {
  return String(h ?? '').trim().toLowerCase().replace(/\s+/g, '_').replace(/[^\w]/g, '');
}

export function parseXlsxBufferToJson(buffer: Buffer) {
  const workbook = XLSX.read(buffer, { type: 'buffer', cellDates: true });
  const sheet = workbook.Sheets[workbook.SheetNames[0]];
  const rawRows: any[] = XLSX.utils.sheet_to_json(sheet, { defval: null, raw: false });

  if (rawRows.length === 0) return [];

  // normalize column names per row (defval: null guarantees every header key is present)
  return rawRows.map((r: any) => {
    const out: Record<string, any> = {};
    Object.keys(r).forEach(k => (out[normalizeHeader(k)] = r[k] === '' ? null : r[k]));
    return out;
  });
}

Option B — ExcelJS (streaming friendly):

  • Use workbook.xlsx.load(buffer) for small files, or workbook.xlsx.read(stream) for streams.
  • Iterate rows, build headers on the first row, then map subsequent rows using row.getCell(i).value.
  • For streaming large files, use the streaming reader so the whole file is never buffered in memory (a streaming sketch follows the buffer example below).

Example (ExcelJS):

ts
import ExcelJS from 'exceljs';

// normalizeHeader is the same helper used in the SheetJS example above
export async function parseWithExcelJSBuffer(buffer: Buffer) {
  const workbook = new ExcelJS.Workbook();
  await workbook.xlsx.load(buffer); // for streams, see the streaming reader below
  const worksheet = workbook.worksheets[0];
  const rows: Record<string, any>[] = [];
  let headers: string[] = [];

  worksheet.eachRow((row, rowNumber) => {
    const values = row.values as any[]; // note: 1-based, values[0] is undefined
    if (rowNumber === 1) {
      headers = values.slice(1).map((h: any) => normalizeHeader(String(h ?? '')));
      return;
    }
    const obj: Record<string, any> = {};
    values.slice(1).forEach((cellVal, idx) => {
      obj[headers[idx]] = cellVal === '' ? null : cellVal;
    });
    rows.push(obj);
  });

  return rows;
}
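
For very large files, a streaming variant avoids loading the workbook at all. The sketch below assumes exceljs v4+; the exact reader option names and values can vary slightly by version, so treat it as an outline rather than a drop-in implementation.

ts
import ExcelJS from 'exceljs';
import { Readable } from 'stream';

// Streaming read: rows are handled as they arrive, so the workbook is never
// fully buffered in memory. Reuses normalizeHeader from the examples above.
export async function parseXlsxStream(
  stream: Readable,
  onRow: (row: Record<string, any>) => Promise<void>,
) {
  const reader = new ExcelJS.stream.xlsx.WorkbookReader(stream, {
    sharedStrings: 'cache', // resolve shared strings to their text
    styles: 'ignore',
    hyperlinks: 'ignore',
    worksheets: 'emit',
  });

  let headers: string[] = [];
  for await (const worksheet of reader) {
    for await (const row of worksheet) {
      const values = row.values as any[]; // 1-based; values[0] is undefined
      if (row.number === 1) {
        headers = values.slice(1).map(h => normalizeHeader(String(h ?? '')));
        continue;
      }
      const obj: Record<string, any> = {};
      values.slice(1).forEach((v, i) => (obj[headers[i]] = v === '' ? null : v));
      await onRow(obj); // e.g. buffer rows and bulk-insert per chunk
    }
    break; // only the first worksheet is needed here
  }
}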

Date and number specifics

  • Excel date cells may already be JS Date objects (with cellDates: true in SheetJS, or an ExcelJS cell.value that is a Date). Strings like “01-01-2025” should be parsed with a robust parser (date-fns, dayjs) using the formats you expect.
  • Normalize numbers: strip currency symbols and thousands separators before Number/parseFloat; for money, store cents (integers) or use a Decimal type in the DB. A helper sketch follows this list.
  • For blank numeric cells use null. For cells that contain formulas make sure to read the computed value (SheetJS raw=false or ExcelJS cell.value.result).
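
Coercion helpers along these lines might look like the sketch below. It assumes date-fns for string date parsing; the accepted formats and the currency-stripping regex are assumptions to adapt to the statements you actually receive.

ts
import { parse, isValid } from 'date-fns';

const DATE_FORMATS = ['dd-MM-yyyy', 'dd/MM/yyyy', 'yyyy-MM-dd'];

export function toDate(value: unknown): Date | null {
  if (value instanceof Date) return value; // already parsed by the Excel reader
  if (typeof value !== 'string' || !value.trim()) return null;
  for (const fmt of DATE_FORMATS) {
    const d = parse(value.trim(), fmt, new Date());
    if (isValid(d)) return d;
  }
  return null;
}

export function toCents(value: unknown): number | null {
  if (value == null || value === '') return null;
  if (typeof value === 'number') return Math.round(value * 100);
  const cleaned = String(value).replace(/[^\d.-]/g, ''); // drop currency symbols and thousands separators
  const n = parseFloat(cleaned);
  return Number.isFinite(n) ? Math.round(n * 100) : null;
}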

Edge cases

  • Multiple header rows: let the user pick the header row in the import UI.
  • Empty rows: skip rows where every field is null/empty (see the sketch below).
  • Differing locale date formats: try a set of likely formats, or require the user to pick the locale/format in the import UI.
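
For the empty-row case, a tiny filter helper is enough (a sketch that pairs with the parsers above):

ts
// Skip rows where every cell is empty.
export function isEmptyRow(row: Record<string, any>): boolean {
  return Object.values(row).every(
    v => v == null || (typeof v === 'string' && v.trim() === ''),
  );
}

// usage: const rows = parsedRows.filter(r => !isEmptyRow(r));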

Mapping columns to DB fields and validating data

You’ll want a deterministic, user-friendly mapping layer and a strong validation layer.

Mapping strategy

  • Build a default mapping by normalizing spreadsheet headers to candidate DB fields (e.g., “date” → date, “debit” → debit, “credit” → credit, “description” → description).
  • Allow the user to confirm or edit the mapping in the UI before import (very helpful when column names differ).
  • Store mapping config for repeatable imports.

Example mapping config:

ts
const defaultMap = {
  date: 'date',
  debit: 'debit',
  credit: 'credit',
  description: 'description',
};

Then transform each parsed row into the DB DTO using the mapping.
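
A minimal transform applying that mapping might look like this sketch (`defaultMap` is the config above; unmapped columns are dropped and missing ones become null so the validator can flag them):

ts
export function applyMapping(
  row: Record<string, any>,
  map: Record<string, string>,
): Record<string, any> {
  const dto: Record<string, any> = {};
  for (const [header, field] of Object.entries(map)) {
    dto[field] = row[header] ?? null;
  }
  return dto;
}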

Validation approach

  • Validate each row after mapping but before DB insertion.
  • Use Zod (recommended for parsing/transform workflows) or Nest’s class-validator with DTOs.

Example Zod schema (rows parsed to JS types first):

ts
import { z } from 'zod';

const txnSchema = z.object({
  date: z.preprocess(val => {
    if (val instanceof Date) return val;
    if (typeof val === 'string') return new Date(val);
    return null;
  }, z.date()),
  debit: z.number().nullable().optional(),
  credit: z.number().nullable().optional(),
  description: z.string().min(1),
}).superRefine((obj, ctx) => {
  if ((obj.debit ?? null) === null && (obj.credit ?? null) === null) {
    ctx.addIssue({ code: z.ZodIssueCode.custom, message: 'Either debit or credit is required', path: ['debit'] });
  }
});

Use txnSchema.safeParse(row) to gather per-row errors and report back to the importer UI with row numbers and messages.

Handling missing/extra columns

  • Missing required columns: reject import or ask the user to map them.
  • Extra columns: either ignore or store in a JSON metadata column if you want to preserve them.

Duplicate rows and idempotency

  • Compute a fingerprint per row (e.g., sha256(date + amountCents + description)) and either:
      • add a unique constraint on fingerprint and use createMany(..., skipDuplicates: true) with Prisma (when supported), or
      • upsert by the unique key, or
      • insert into a staging table and dedupe with SQL.
  • Example fingerprint:
ts
import crypto from 'crypto';

function fingerprint(row: { date: Date; amountCents: number; description?: string }) {
  return crypto
    .createHash('sha256')
    .update(`${row.date.toISOString()}|${row.amountCents}|${(row.description || '').trim()}`)
    .digest('hex');
}

Transaction & error strategy

  • Prefer idempotent operations (upsert or unique fingerprint).
  • Use DB transactions for multi-step imports if you must rollback entire batches on failure.
  • For very large imports prefer batch inserts and track failed rows to retry individually.

Prisma-specific tips

  • prisma.model.createMany({ data: batch, skipDuplicates: true }) is the fastest path for many rows if your DB supports the skipDuplicates behavior.
  • If you need per-row validation before insertion, validate beforehand and only send clean data to createMany.
  • See validation+Prisma patterns for Nest in this Prisma blog post [https://www.prisma.io/blog/nestjs-prisma-validation-7D056s1kOla1].
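
If skipDuplicates isn't supported by your database, or you want to update existing rows instead of skipping them, a per-row upsert keyed on the fingerprint is a workable (slower) alternative. This is a sketch and assumes a transaction model with a unique fingerprint column, as in the service example later in this answer:

ts
import { PrismaClient } from '@prisma/client';

export async function upsertRows(
  prisma: PrismaClient,
  rows: Array<{ date: Date; amountCents: number; description: string; fingerprint: string }>,
) {
  for (const row of rows) {
    await prisma.transaction.upsert({
      where: { fingerprint: row.fingerprint },
      create: row,
      update: {}, // no-op on conflict; list updatable fields here if needed
    });
  }
}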

Handling large Excel files (streaming, chunked inserts, background jobs)

What counts as “large”? It depends on memory and runtime limits, but holding tens of MB in memory can be risky on constrained servers or serverless functions. Ask whether users will upload hundreds of MB; if so, stream or offload the work.

Strategies

  • Small files (< ~10MB): memory uploads (Multer memoryStorage) and in-request parsing is fine.
  • Medium files (10–100MB): prefer streaming parsing or temporary disk storage to avoid big memory spikes.
  • Large files (>100MB): use direct-to-storage (S3/Spaces) and background workers.

Recommended architecture for reliability

  1. Frontend uploads the file directly to S3 with a presigned URL, which avoids routing large uploads through the app server (see the sketch below).
  2. Frontend calls NestJS to enqueue a job (pointing to S3 key).
  3. A worker (separate Nest worker process or BullMQ consumer) downloads/streams the file and parses it with streaming Excel reader (exceljs) or converts to CSV and parses row-by-row.
  4. Worker inserts in DB in chunks (e.g., 500–2000 rows per transaction), using bulk insert methods.
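
For step 1, generating a presigned upload URL from NestJS might look like the sketch below. It assumes the AWS SDK v3 packages (@aws-sdk/client-s3 and @aws-sdk/s3-request-presigner); the bucket, region, and key naming are placeholders.

ts
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';

const s3 = new S3Client({ region: 'eu-west-1' });

export async function createUploadUrl(key: string): Promise<string> {
  const command = new PutObjectCommand({
    Bucket: 'statement-imports',
    Key: key,
    ContentType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  });
  // The URL is valid for 10 minutes; the frontend PUTs the file to it directly.
  return getSignedUrl(s3, command, { expiresIn: 600 });
}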

Queues and workers

  • Use Bull/BullMQ (Redis-backed) for queued tasks and retries (sketched below).
  • Limit concurrency so DB and worker memory don’t get overwhelmed.
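
A minimal BullMQ sketch for the producer and consumer sides (the queue name, Redis connection, and job payload are assumptions):

ts
import { Queue, Worker } from 'bullmq';

const connection = { host: 'localhost', port: 6379 }; // Redis connection (placeholder)

// Producer side, e.g. called from the NestJS controller once the file is in S3.
export const importQueue = new Queue('statement-imports', { connection });

export function enqueueImport(s3Key: string) {
  return importQueue.add('import', { s3Key }, {
    attempts: 3,
    backoff: { type: 'exponential', delay: 5000 },
  });
}

// Consumer side, run as a separate worker process. Low concurrency keeps
// memory and DB load bounded.
export const importWorker = new Worker(
  'statement-imports',
  async job => {
    const { s3Key } = job.data;
    // download/stream the file from S3, parse, validate, insert in chunks...
  },
  { connection, concurrency: 2 },
);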

Chunked inserts & batching

  • Batch size depends on DB and row size; start with 500 and tune.
  • For Prisma use createMany on each chunk; for TypeORM use QueryBuilder.insert().values(chunk).execute() or raw insert SQL.
  • Wrap chunk inserts per-batch in transactions only if necessary.

Timeouts, size limits and safety

  • Set Multer limits (e.g., 10–50MB for direct request handling): limits: { fileSize: 10 * 1024 * 1024 }.
  • Configure Nest/HTTP server timeouts to match expected processing durations (see the sketch after this list).
  • Consider resumable uploads (tus) for unstable connections.
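
If you do parse in-request, a sketch for raising the HTTP timeout in main.ts might look like this (the module path and port are assumptions; prefer background jobs for anything long-running):

ts
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module'; // path is an assumption

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  await app.listen(3000);
  // Underlying Node http.Server: allow up to 2 minutes per request.
  app.getHttpServer().setTimeout(120_000);
}
bootstrap();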

Resources on streaming and in-memory parsing patterns: see a practical NestJS in-memory parsing example [https://dev.to/damir_maham/streamline-file-uploads-in-nestjs-efficient-in-memory-parsing-for-csv-xlsx-without-disk-storage-145g] and streaming guidance [https://www.telerik.com/blogs/how-stream-large-files-handle-data-efficiently-nodejs-streams-nestjs].


Code examples: Next.js upload, NestJS FileInterceptor, parse to JSON, persist with Prisma/TypeORM

Below are condensed examples — adapt to your app structure, error handling, and auth.

  1. Next.js (React) simple upload component
jsx
// components/UploadForm.jsx
import { useState } from 'react';

export default function UploadForm() {
  const [file, setFile] = useState(null);
  const [status, setStatus] = useState('');

  async function submit(e) {
    e.preventDefault();
    if (!file) return setStatus('Choose a file first');

    const fd = new FormData();
    fd.append('file', file);

    setStatus('Uploading...');
    const res = await fetch('https://api.example.com/import', {
      method: 'POST',
      body: fd,
    });

    const json = await res.json();
    setStatus(json.message ?? 'Done');
  }

  return (
    <form onSubmit={submit}>
      <input type="file" accept=".xlsx,.xls" onChange={e => setFile(e.target.files?.[0])} />
      <button type="submit">Upload</button>
      <div>{status}</div>
    </form>
  );
}
  2. NestJS controller with FileInterceptor (memory storage)
ts
// import.controller.ts
import { Controller, Post, UploadedFile, UseInterceptors } from '@nestjs/common';
import { FileInterceptor } from '@nestjs/platform-express';
import { memoryStorage } from 'multer';
import { ImportService } from './import.service';

@Controller('import')
export class ImportController {
  constructor(private svc: ImportService) {}

  @Post()
  @UseInterceptors(FileInterceptor('file', {
    storage: memoryStorage(),
    limits: { fileSize: 25 * 1024 * 1024 }, // 25MB
  }))
  async upload(@UploadedFile() file: Express.Multer.File) {
    const result = await this.svc.handleBuffer(file.buffer);
    return { message: 'import queued', stats: result };
  }
}

See NestJS file-upload docs for more config options [https://docs.nestjs.com/techniques/file-upload].

  3. Parsing + validating + persisting (service outline) — using SheetJS + Zod + Prisma
ts
// import.service.ts
import { Injectable } from '@nestjs/common';
import * as XLSX from 'xlsx';
import { z } from 'zod';
import { PrismaClient } from '@prisma/client';
import crypto from 'crypto';

const prisma = new PrismaClient();

const txnSchema = z.object({
  date: z.preprocess(v => (v instanceof Date ? v : new Date(String(v))), z.date()),
  debit: z.preprocess(v => (v == null ? null : Number(String(v).replace(/[^0-9.-]/g, ''))), z.number().nullable()),
  credit: z.preprocess(v => (v == null ? null : Number(String(v).replace(/[^0-9.-]/g, ''))), z.number().nullable()),
  description: z.string().min(1),
}).superRefine((obj, ctx) => {
  if ((obj.debit ?? null) == null && (obj.credit ?? null) == null) {
    ctx.addIssue({ code: z.ZodIssueCode.custom, message: 'Either debit or credit required' });
  }
});

function fingerprint(row: { date: Date; amountCents: number; description: string }) {
  return crypto.createHash('sha256').update(`${row.date.toISOString()}|${row.amountCents}|${row.description}`).digest('hex');
}

@Injectable()
export class ImportService {
  async handleBuffer(buffer: Buffer) {
    // parse
    const workbook = XLSX.read(buffer, { type: 'buffer', cellDates: true });
    const sheet = workbook.Sheets[workbook.SheetNames[0]];
    const rawRows: any[] = XLSX.utils.sheet_to_json(sheet, { defval: null, raw: false });

    // normalize, map and validate
    const mapped = rawRows.map((r, idx) => {
      // normalize column keys in the sheet to expected fields;
      // this example assumes header keys 'Date', 'Debit', 'Credit', 'Description'
      return {
        date: r['Date'] ?? r['date'] ?? r['Date '],
        debit: r['Debit'] ?? r['debit'] ?? null,
        credit: r['Credit'] ?? r['credit'] ?? null,
        description: r['Description'] ?? r['description'] ?? '',
        _row: idx + 2, // spreadsheet row number for user-facing error messages
      };
    });

    const validRows: { date: Date; amountCents: number; description: string; fingerprint: string }[] = [];
    const errors: { row: number; errors: z.ZodIssue[] }[] = [];
    for (const r of mapped) {
      const parsed = txnSchema.safeParse(r);
      if (!parsed.success) {
        errors.push({ row: r._row, errors: parsed.error.errors });
        continue;
      }
      const data = parsed.data;
      const amount = (data.credit ?? 0) - (data.debit ?? 0);
      const amountCents = Math.round(amount * 100);
      const fp = fingerprint({ date: data.date, amountCents, description: data.description });
      validRows.push({
        date: data.date,
        amountCents,
        description: data.description,
        fingerprint: fp,
      });
    }

    // persist in chunks; skipDuplicates relies on a unique fingerprint column
    const chunkSize = 500;
    for (let i = 0; i < validRows.length; i += chunkSize) {
      const chunk = validRows.slice(i, i + chunkSize);
      await prisma.transaction.createMany({ data: chunk, skipDuplicates: true });
    }

    return { imported: validRows.length, errors: errors.length, errorDetails: errors.slice(0, 10) };
  }
}

Notes:

  • The Prisma model should include a unique fingerprint column (or a composite unique constraint) so skipDuplicates works predictably; a minimal model is sketched below.
  • If your DB/drivers don’t support skipDuplicates, use upsert or dedupe in SQL.
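
A minimal Prisma model matching the service above might look like this (a sketch; the field names other than fingerprint are assumptions):

prisma
// schema.prisma (sketch)
model Transaction {
  id          Int      @id @default(autoincrement())
  date        DateTime
  amountCents Int
  description String
  fingerprint String   @unique // lets createMany skipDuplicates / upsert behave predictably
  createdAt   DateTime @default(now())
}
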
  4. TypeORM example (bulk insert pattern)
ts
// using DataSource
await this.dataSource
  .createQueryBuilder()
  .insert()
  .into(TransactionEntity)
  .values(batch)
  .execute();
  • For dedupe or conflict handling use DB-specific SQL (Postgres ON CONFLICT DO NOTHING) via raw queries or query builder extensions.
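
As a sketch, TypeORM's insert builder can express "skip conflicts" via .orIgnore() (ON CONFLICT DO NOTHING on Postgres, INSERT IGNORE on MySQL); .orUpdate() covers upserts, though its signature varies between TypeORM versions. This assumes a unique index on fingerprint:

ts
// Skip conflicting rows instead of failing the whole batch.
await this.dataSource
  .createQueryBuilder()
  .insert()
  .into(TransactionEntity)
  .values(batch)
  .orIgnore() // Postgres: ON CONFLICT DO NOTHING; MySQL: INSERT IGNORE
  .execute();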


Conclusion

Importing Excel bank statements is straightforward if you pick tools that match your scale: use xlsx (SheetJS) for quick in-memory Excel-to-JSON flows and exceljs for streaming/large-file needs. Normalize headers, map them explicitly (or let users map them), validate rows with Zod or class-validator, then persist in batches with Prisma (createMany/upsert) or TypeORM (bulk insert/upsert). For reliability and scale, prefer direct-to-storage uploads plus a background worker and chunked DB writes; that pattern avoids timeouts and keeps imports idempotent.
