How can I convert HTML and CSS to PDF on a Linux server using PHP? I have an HTML document that renders correctly in browsers but needs to be converted to PDF. I’ve tried several solutions:
- DOMPDF: Had issues with tables and images, consuming excessive memory
- HTML2PDF/HTML2PS: Better table rendering but encountered unknown node_type() errors
- Htmldoc: Works on basic HTML but has poor CSS support
I need a solution that runs on Linux and ideally works on-demand via PHP on a webserver. What are the best approaches or alternatives for this conversion?
Several approaches work for converting HTML and CSS to PDF from PHP on a Linux server. Headless browsers (Chrome/Chromium) give the most accurate rendering. PHP libraries such as Mpdf, or Snappy (a PHP wrapper around the wkhtmltopdf binary), are easier to integrate and address some of the limitations you’ve encountered with DOMPDF and the other tools.
Contents
- Overview of HTML to PDF Conversion Methods
- Modern Headless Browser Solutions
- PHP-Native Libraries and Alternatives
- Installation and Configuration Guide
- Performance Optimization Techniques
- Troubleshooting Common Issues
- Advanced Features and Customization
Overview of HTML to PDF Conversion Methods
HTML to PDF conversion on Linux servers typically falls into three main categories:
- PHP-native libraries that parse HTML directly and generate PDF output
- Command-line tools that convert HTML to PDF via system calls
- Headless browser solutions that render HTML in a browser environment and convert to PDF
Each approach has its strengths and weaknesses. PHP-native libraries like Mpdf are lightweight and easy to integrate but may struggle with complex CSS. Command-line tools like Htmldoc offer good basic support but limited CSS capabilities. Headless browsers provide the most accurate rendering but require more system resources.
Key Consideration: The best approach depends on your specific requirements for CSS support, performance, system resources, and output quality.
Modern Headless Browser Solutions
Puppeteer with Node.js Bridge
Puppeteer, a Node.js library, offers excellent HTML to PDF conversion through headless Chrome/Chromium. While not PHP-native, you can create a bridge between PHP and Node.js:
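// Example PHP-to-Node.js bridge for Puppeteer.
// puppeteer-pdf.js is your own Node script (not shown here) that loads the
// HTML file given as its first argument and writes a PDF to the second.
function generatePdfWithPuppeteer($html, $outputPath) {
    $tempHtmlFile = tempnam(sys_get_temp_dir(), 'html_');
    file_put_contents($tempHtmlFile, $html);
    // Use an absolute script path: the web server's working directory may differ
    $script = __DIR__ . '/puppeteer-pdf.js';
    $command = "node " . escapeshellarg($script) . " " . escapeshellarg($tempHtmlFile) . " " . escapeshellarg($outputPath);
    // 2>&1 captures error messages in $output for logging
    exec($command . ' 2>&1', $output, $returnCode);
    unlink($tempHtmlFile);
    return ($returnCode === 0);
}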
Pros:
- Excellent CSS support including modern features
- Accurate rendering matching browsers
- Supports complex layouts, tables, and images
Cons:
- Requires Node.js on the server
- Higher memory consumption
- More complex setup
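If installing Node.js is the main obstacle, headless Chromium can also be driven directly from PHP through its `--headless` and `--print-to-pdf` command-line flags. The following is a minimal sketch, not a tested production setup: the function names are hypothetical, and the binary name `chromium` is an assumption that varies by distribution (`chromium-browser`, `google-chrome-stable`).

```php
<?php
/**
 * Build the shell command that asks headless Chromium to print a local
 * HTML file to PDF. The default binary name is an assumption.
 */
function buildChromePdfCommand(string $htmlPath, string $pdfPath, string $binary = 'chromium'): string
{
    return sprintf(
        '%s --headless --disable-gpu --print-to-pdf=%s %s',
        escapeshellarg($binary),
        escapeshellarg($pdfPath),
        escapeshellarg('file://' . $htmlPath)
    );
}

/**
 * Write the HTML to a temporary file, run Chromium, and report success.
 * Returns false if Chromium exits non-zero or no PDF was produced.
 */
function generatePdfWithChromium(string $html, string $pdfPath, string $binary = 'chromium'): bool
{
    // A ".html" extension leaves no doubt about how the file URL is interpreted.
    $htmlPath = sys_get_temp_dir() . '/pdf_' . bin2hex(random_bytes(8)) . '.html';
    file_put_contents($htmlPath, $html);

    exec(buildChromePdfCommand($htmlPath, $pdfPath, $binary) . ' 2>&1', $output, $exitCode);
    unlink($htmlPath);

    return $exitCode === 0 && is_file($pdfPath) && filesize($pdfPath) > 0;
}
```

This route gives less control than Puppeteer (no waiting for specific elements, fewer page options), so it suits static HTML best.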
WeasyPrint
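WeasyPrint is a Python library and command-line tool that converts HTML/CSS to PDF with strong standards compliance. It is not a browser: it uses its own layout engine and does not execute JavaScript: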
// PHP wrapper for WeasyPrint
function generatePdfWithWeasyPrint($html, $outputPath) {
$tempHtmlFile = tempnam(sys_get_temp_dir(), 'html_');
file_put_contents($tempHtmlFile, $html);
$command = "weasyprint " . escapeshellarg($tempHtmlFile) . " " . escapeshellarg($outputPath);
exec($command, $output, $returnCode);
unlink($tempHtmlFile);
return ($returnCode === 0);
}
Pros:
- Excellent CSS3 support
- High-quality output
- Good performance
Cons:
- Requires Python dependencies
- No JavaScript execution, so the HTML must already contain all content
- Steeper learning curve
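The wrapper above writes the HTML to a temporary file and discards any error output. As an alternative sketch, the `weasyprint` CLI accepts `-` as its input argument to read the document from stdin, which avoids the temporary file, and stderr can be kept for logging. The helper names here are hypothetical, and the array form of `proc_open` assumes PHP 7.4 or later.

```php
<?php
/**
 * Run a command, feed $input to its stdin, and return the exit code and stderr.
 * The command is an array, so PHP starts it without going through a shell.
 */
function runWithStdin(array $command, string $input): array
{
    // stderr goes to a temp file so a chatty tool cannot block on a full pipe
    $errFile = tmpfile();
    $spec = [
        0 => ['pipe', 'r'],
        1 => ['file', '/dev/null', 'w'],
        2 => $errFile,
    ];
    $process = proc_open($command, $spec, $pipes);
    if (!is_resource($process)) {
        fclose($errFile);
        return ['exitCode' => -1, 'stderr' => 'could not start process'];
    }
    fwrite($pipes[0], $input);
    fclose($pipes[0]); // signals end of input to the child
    $exitCode = proc_close($process);

    rewind($errFile);
    $stderr = stream_get_contents($errFile);
    fclose($errFile);

    return ['exitCode' => $exitCode, 'stderr' => $stderr];
}

function generatePdfWithWeasyPrintStdin(string $html, string $outputPath): bool
{
    // "-" tells the weasyprint CLI to read the HTML document from stdin
    $result = runWithStdin(['weasyprint', '-', $outputPath], $html);
    if ($result['exitCode'] !== 0) {
        error_log('weasyprint failed: ' . $result['stderr']);
    }
    return $result['exitCode'] === 0;
}
```

Note that relative URLs for images and stylesheets have no file location to resolve against when the document arrives on stdin, so absolute URLs are the safer choice with this variant.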
PHP-Native Libraries and Alternatives
Mpdf (Improved Alternative to DOMPDF)
Mpdf is an independent library that is often recommended as an alternative to DOMPDF, with different trade-offs in table handling and CSS support:
require_once __DIR__ . '/vendor/autoload.php';
$mpdf = new \Mpdf\Mpdf([
'mode' => 'utf-8',
'format' => 'A4',
'margin_left' => 10,
'margin_right' => 10,
'margin_top' => 10,
'margin_bottom' => 10
]);
$mpdf->WriteHTML($htmlContent);
$mpdf->Output('document.pdf', 'D');
Commonly cited advantages over DOMPDF (worth verifying with your own documents):
- Better handling of complex tables
- Broader CSS support in some areas
- Options for reducing memory use on large tables
- Better image handling
Installation:
composer require mpdf/mpdf
TCPDF
TCPDF is another robust PHP library with extensive features:
require_once('tcpdf/tcpdf.php');
$pdf = new TCPDF(PDF_PAGE_ORIENTATION, PDF_UNIT, PDF_PAGE_FORMAT, true, 'UTF-8', false);
$pdf->AddPage();
$pdf->writeHTML($htmlContent, true, false, true, false, '');
$pdf->Output('document.pdf', 'D');
Features:
- Supports complex layouts
- Good for forms and tables
- Extensive documentation
Snappy (wkhtmltopdf wrapper)
Snappy provides a PHP wrapper for the wkhtmltopdf command-line tool, which renders pages with an older Qt WebKit engine and is no longer actively developed:
require_once __DIR__ . '/vendor/autoload.php';
use Knp\Snappy\Pdf;
// Adjust the path to wherever the binary is installed (check with `which wkhtmltopdf`)
$snappy = new Pdf('/usr/bin/wkhtmltopdf');
$snappy->generateFromHtml(
'<h1>Bill</h1><p>Dear Client</p>',
'/path/to/bill.pdf'
);
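Advantages:
- Uses a real browser engine (an older Qt WebKit build)
- Good support for the CSS of that era; newer features such as flexbox and grid are poorly supported
- Handles complex table-based layouts well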
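Tools that write the PDF to a file, as Snappy does above, leave the delivery step to you. Since the goal is on-demand conversion on a web server, the sketch below shows one way to send a finished file to the browser. The function names are hypothetical, and the cache header is only one reasonable choice.

```php
<?php
/**
 * Build the HTTP headers for delivering a PDF. $inline = true asks the
 * browser to display it; false asks it to offer a download.
 */
function pdfResponseHeaders(string $filename, int $length, bool $inline = true): array
{
    // Keep the header value safe: allow only a conservative character set
    $safeName = preg_replace('/[^A-Za-z0-9._-]/', '_', basename($filename));

    return [
        'Content-Type: application/pdf',
        'Content-Length: ' . $length,
        sprintf('Content-Disposition: %s; filename="%s"', $inline ? 'inline' : 'attachment', $safeName),
        'Cache-Control: private, max-age=0, must-revalidate',
    ];
}

/** Send a PDF that a converter has already written to disk. */
function sendPdfFile(string $path, string $downloadName, bool $inline = true): void
{
    if (!is_file($path)) {
        http_response_code(404);
        return;
    }
    foreach (pdfResponseHeaders($downloadName, filesize($path), $inline) as $header) {
        header($header);
    }
    readfile($path);
}
```

Mpdf and TCPDF do this themselves when `Output()` is called with the `'D'` or `'I'` destination, so the helper is mainly useful for the command-line converters.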
Installation and Configuration Guide
System Requirements
Before installing any HTML to PDF solution, install the base packages. The commands below assume a Debian or Ubuntu server with apt:
# Basic system requirements
sudo apt update
sudo apt install -y php php-cli php-gd php-xml php-mbstring
sudo apt install -y libpng-dev libjpeg-dev libfreetype6-dev
sudo apt install -y fontconfig
Installing Mpdf
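# Install Mpdf via Composer (it relies on the mbstring and gd extensions installed above)
composer require mpdf/mpdf
# Optional: Imagick is not required by Mpdf; install it only if your own image handling needs it
sudo apt install -y php-pear php-dev libmagickwand-dev
sudo pecl install imagick
# This enables the extension for the CLI only; repeat for the fpm or apache2 directory your web server uses
echo "extension=imagick.so" | sudo tee /etc/php/$(php -r 'echo PHP_MAJOR_VERSION.".".PHP_MINOR_VERSION;')/cli/conf.d/imagick.ini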
Installing wkhtmltopdf for Snappy
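# Download and install wkhtmltopdf.
# This .deb targets Ubuntu 18.04 (bionic); pick the build that matches your release.
# Optional: xvfb is only needed by distribution builds without patched Qt
sudo apt install -y xvfb
wget https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6-1/wkhtmltox_0.12.6-1.bionic_amd64.deb
sudo dpkg -i wkhtmltox_0.12.6-1.bionic_amd64.deb
# Pull in any dependencies dpkg reported as missing
sudo apt install -f -y
# Verify installation
wkhtmltopdf --version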
Setting Up Headless Chrome
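# Option A: install Chromium from the distribution (on recent Ubuntu releases this is a snap)
sudo apt install -y chromium-browser
# Option B: install Google Chrome from Google's repository (apt-key is deprecated, so use a keyring file)
wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | sudo gpg --dearmor -o /usr/share/keyrings/google-chrome.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome.gpg] https://dl.google.com/linux/chrome/deb/ stable main" | sudo tee /etc/apt/sources.list.d/google-chrome.list
sudo apt update
sudo apt install -y google-chrome-stable
# Install Node.js (current LTS) and Puppeteer
curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash -
sudo apt install -y nodejs
# Install Puppeteer inside the project directory so the bridge script can require() it;
# by default it also downloads its own compatible browser build
npm install puppeteer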
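After installation, it can help to confirm from PHP itself which converters are visible, because the web server user often runs with a different PATH than your login shell. The following is a small sketch with a hypothetical helper, `findBinary`, built on the POSIX `command -v` builtin. The candidate names are assumptions that differ between distributions.

```php
<?php
/**
 * Return the location reported by "command -v" for the first candidate
 * that the shell can find, or null if none is available.
 */
function findBinary(array $candidates): ?string
{
    foreach ($candidates as $name) {
        $output = [];
        exec('command -v ' . escapeshellarg($name) . ' 2>/dev/null', $output, $exitCode);
        if ($exitCode === 0 && !empty($output[0])) {
            return $output[0];
        }
    }
    return null;
}

// Report which of the tools discussed above are installed
$tools = [
    'chromium'    => findBinary(['chromium', 'chromium-browser', 'google-chrome-stable']),
    'wkhtmltopdf' => findBinary(['wkhtmltopdf']),
    'weasyprint'  => findBinary(['weasyprint']),
    'node'        => findBinary(['node', 'nodejs']),
];
foreach ($tools as $tool => $path) {
    printf("%-12s %s\n", $tool, $path ?? 'not found');
}
```

Running this once through the web server (not only from the command line) shows what the PHP process can actually execute.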
Performance Optimization Techniques
Memory Management
For PHP-native libraries like Mpdf:
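; In php.ini
memory_limit = 512M
// Or at runtime in PHP code
ini_set('memory_limit', '512M');
// Give Mpdf a writable temp directory for its font and image caches
$mpdf = new \Mpdf\Mpdf([
    'mode' => 'utf-8',
    'format' => 'A4',
    'tempDir' => '/tmp/mpdf',
    'setAutoTopMargin' => 'pad',
    'setAutoBottomMargin' => 'pad'
]);
// For large documents, pass the HTML in several smaller WriteHTML() calls.
// $htmlChunks is an array of complete blocks, for example one section each.
foreach ($htmlChunks as $chunk) {
    $mpdf->WriteHTML($chunk);
}
// 'F' writes the PDF to a file on the server instead of sending it to the browser
$mpdf->Output('document.pdf', 'F');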
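Before tuning limits, it is worth measuring what a conversion actually costs with your own documents. The sketch below wraps any converter in a hypothetical `measureConversion` helper. It only sees memory used by the PHP process itself, so external tools such as Chromium or wkhtmltopdf need to be measured separately.

```php
<?php
/**
 * Run a conversion callable and report its result, the elapsed time,
 * and how far PHP's peak memory rose while it ran.
 */
function measureConversion(callable $convert): array
{
    $startTime = microtime(true);
    $startPeak = memory_get_peak_usage(true);

    $result = $convert();

    return [
        'result'      => $result,
        'seconds'     => microtime(true) - $startTime,
        'peakDeltaMb' => (memory_get_peak_usage(true) - $startPeak) / 1048576,
    ];
}

// Usage with any converter, for example a closure around an Mpdf call:
// $stats = measureConversion(function () use ($mpdf, $htmlContent) {
//     $mpdf->WriteHTML($htmlContent);
//     return true;
// });
// error_log(sprintf('%.2fs, +%.1f MB', $stats['seconds'], $stats['peakDeltaMb']));
```

Because the peak is a high-water mark for the whole request, a delta of zero means the conversion stayed below an earlier peak, not that it used no memory.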
Caching and Optimization
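// Implement caching for frequently generated PDFs
function getCachedPdf($key, $html, $ttl = 3600) {
    $cacheDir = '/tmp/pdf_cache';
    if (!is_dir($cacheDir)) {
        mkdir($cacheDir, 0700, true);
    }
    // Hash the key so it cannot contain path separators
    $cacheFile = $cacheDir . '/' . sha1($key) . '.pdf';
    if (file_exists($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        return $cacheFile;
    }
    // generatePdfWithLibrary() stands for whichever converter you chose above
    generatePdfWithLibrary($html, $cacheFile);
    return $cacheFile;
}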
// Optimize HTML for PDF generation
function optimizeHtmlForPdf($html) {
    // Remove scripts: the PHP-native libraries do not execute them
    $html = preg_replace('#<script\b[^>]*>.*?</script>#is', '', $html) ?? $html;
    // Keep <style> blocks: removing them would discard the CSS you want rendered
    // Drop lazy-loading hints so images are fetched during conversion
    $html = preg_replace('/\sloading=("lazy"|\'lazy\')/i', '', $html) ?? $html;
    return $html;
}
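A cache keyed by a caller-supplied name can serve a stale PDF when the HTML changes within the TTL. One alternative, sketched here with hypothetical helper names, derives the key from the content and rendering options, so identical input maps to the same file and any change produces a new one.

```php
<?php
/**
 * Derive a cache key from the HTML and the rendering options.
 */
function pdfCacheKey(string $html, array $options = []): string
{
    ksort($options); // option order must not affect the key
    return hash('sha256', $html . "\0" . json_encode($options));
}

/** Map the key to a file inside the cache directory. */
function pdfCachePath(string $cacheDir, string $html, array $options = []): string
{
    return rtrim($cacheDir, '/') . '/' . pdfCacheKey($html, $options) . '.pdf';
}

// Usage: regenerate only when no file exists for this exact input
// $path = pdfCachePath('/tmp/pdf_cache', $html, ['format' => 'A4']);
// if (!is_file($path)) { generatePdfWithLibrary($html, $path); }
```

With content-based keys the TTL check becomes optional, although old files still need periodic cleanup.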
Background Processing
For resource-intensive conversions:
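// Start the conversion in a detached process so the web request returns immediately
function generatePdfInBackground($html, $outputPath, $callback = null) {
    $tempHtmlFile = tempnam(sys_get_temp_dir(), 'html_');
    file_put_contents($tempHtmlFile, $html);
    // generate_pdf.php is the worker script shown below; use an absolute path in production
    $command = "nohup php generate_pdf.php " . escapeshellarg($tempHtmlFile) . " " . escapeshellarg($outputPath) . " > /dev/null 2>&1 &";
    exec($command);
    // The callback runs right after the job is started, not when the PDF is ready
    if ($callback) {
        $callback($tempHtmlFile);
    }
}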
// Worker script (generate_pdf.php)
$htmlFile = $argv[1];
$outputFile = $argv[2];
$html = file_get_contents($htmlFile);
generatePdfWithLibrary($html, $outputFile);
unlink($htmlFile);
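With a background worker, a web request may look for the output file while it is still being written. A common way around this, sketched below with a hypothetical `writePdfAtomically` helper, is to write to a temporary name in the same directory and rename it into place once complete. On Linux, `rename()` within one filesystem replaces the target atomically.

```php
<?php
/**
 * Let $producer write to a temporary path next to the target, then move
 * the finished file into place. $producer receives the temporary path
 * and returns true on success.
 */
function writePdfAtomically(string $finalPath, callable $producer): bool
{
    $tmpPath = $finalPath . '.' . bin2hex(random_bytes(6)) . '.tmp';
    try {
        $ok = (bool) $producer($tmpPath);
        clearstatcache(true, $tmpPath);
        if ($ok && is_file($tmpPath) && filesize($tmpPath) > 0) {
            return rename($tmpPath, $finalPath);
        }
        return false;
    } finally {
        // Clean up after a failed or partial run
        clearstatcache(true, $tmpPath);
        if (is_file($tmpPath)) {
            unlink($tmpPath);
        }
    }
}

// Usage inside the worker script:
// writePdfAtomically($outputFile, function ($tmp) use ($html) {
//     return generatePdfWithLibrary($html, $tmp);
// });
```

The requesting page can then treat the mere existence of the final file as the signal that the PDF is ready.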
Troubleshooting Common Issues
Memory Issues with DOMPDF/Mpdf
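// Solution: raise the PHP memory limit and reduce per-table overhead
ini_set('memory_limit', '512M');
$mpdf = new \Mpdf\Mpdf([
    'mode' => 'utf-8',
    'format' => 'A4',
    'tempDir' => '/tmp/mpdf',
    'packTableData' => true, // pack table data to reduce memory use
    'debug' => true          // surface errors while troubleshooting
]);
// For very large documents, split the HTML at a marker of your own choosing
// (here the literal string <page-break>) and write each part on a new page
$htmlParts = explode('<page-break>', $htmlContent);
foreach ($htmlParts as $index => $part) {
    if ($index > 0) {
        $mpdf->AddPage();
    }
    $mpdf->WriteHTML($part);
}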
CSS Compatibility Issues
// Solution: Use CSS resets and PDF-specific styles
$pdfStyles = '
@page {
size: A4;
margin: 1cm;
}
body {
font-family: Arial, sans-serif;
font-size: 12pt;
line-height: 1.4;
}
table {
border-collapse: collapse;
width: 100%;
}
table td, table th {
border: 1px solid #ddd;
padding: 8px;
text-align: left;
}
img {
max-width: 100%;
height: auto;
}
';
// Apply PDF-specific styles
$html = '<style>' . $pdfStyles . '</style>' . $html;
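Prepending the `<style>` block works for fragments, but with a full HTML document it places the styles before the doctype and ahead of the page's own CSS. The sketch below (the helper name is hypothetical) inserts the block at the end of `<head>` when one exists, so the PDF-specific rules come last and win over earlier rules of equal specificity.

```php
<?php
/**
 * Insert a <style> block just before </head>. If the document has no
 * head element, fall back to prepending the block.
 */
function injectPdfStyles(string $html, string $css): string
{
    $styleTag = '<style>' . $css . '</style>';
    $pos = stripos($html, '</head>');
    if ($pos === false) {
        return $styleTag . $html;
    }
    return substr($html, 0, $pos) . $styleTag . substr($html, $pos);
}

// Usage with the styles defined above:
// $html = injectPdfStyles($html, $pdfStyles);
```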
Font Loading Issues
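// Solution: register custom fonts in addition to Mpdf's defaults
$defaultConfig = (new \Mpdf\Config\ConfigVariables())->getDefaults();
$defaultFontConfig = (new \Mpdf\Config\FontVariables())->getDefaults();
$mpdf = new \Mpdf\Mpdf([
    'mode' => 'utf-8',
    'format' => 'A4',
    'default_font_size' => 0,
    'default_font' => 'helvetica',
    // Merge with the defaults; replacing them would hide the bundled fonts
    'fontDir' => array_merge($defaultConfig['fontDir'], [
        __DIR__ . '/fonts/',
        '/usr/share/fonts/truetype/dejavu/'
    ]),
    'fontdata' => $defaultFontConfig['fontdata'] + [
        // Apply it in CSS with font-family: customfont
        'customfont' => [
            'R' => 'Custom-Regular.ttf',
            'B' => 'Custom-Bold.ttf',
            'I' => 'Custom-Italic.ttf',
            'BI' => 'Custom-BoldItalic.ttf'
        ]
    ]
]);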
Table Rendering Problems
// Solution: Use table-friendly HTML structure
$html = '
<table>
<thead>
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
</thead>
<tbody>
<tr>
<td>Data 1</td>
<td>Data 2</td>
</tr>
</tbody>
</table>
';
// CSS for tables
$css = '
table {
border-collapse: collapse;
width: 100%;
page-break-inside: avoid;
}
table td, table th {
border: 1px solid #000;
padding: 6px;
text-align: left;
}
thead {
background-color: #f2f2f2;
}
';
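When the table comes from data and not from hand-written markup, generating it through one function keeps the `thead`/`tbody` structure consistent and escapes every cell. This is a minimal sketch with a hypothetical `renderTable` helper; it assumes PHP 7.4 or later for the arrow function.

```php
<?php
/**
 * Render a table with explicit thead and tbody sections.
 * Every header and cell value is HTML-escaped.
 */
function renderTable(array $headers, array $rows): string
{
    $esc = fn ($value): string => htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');

    $html = "<table>\n<thead>\n<tr>";
    foreach ($headers as $header) {
        $html .= '<th>' . $esc($header) . '</th>';
    }
    $html .= "</tr>\n</thead>\n<tbody>\n";

    foreach ($rows as $row) {
        $html .= '<tr>';
        foreach ($row as $cell) {
            $html .= '<td>' . $esc($cell) . '</td>';
        }
        $html .= "</tr>\n";
    }

    return $html . "</tbody>\n</table>";
}

// Usage:
// $html = renderTable(['Header 1', 'Header 2'], [['Data 1', 'Data 2']]);
```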
Advanced Features and Customization
Adding Headers and Footers
// Mpdf example with headers and footers
$mpdf = new \Mpdf\Mpdf();
// Set up headers and footers
$mpdf->SetHTMLHeader('
<div style="text-align: right; font-size: 10pt;">
Page {PAGENO} of {nbpg}
</div>
');
$mpdf->SetHTMLFooter('
<div style="text-align: center; font-size: 8pt;">
Generated on ' . date('Y-m-d H:i:s') . '
</div>
');
$mpdf->WriteHTML($html);
$mpdf->Output('document.pdf', 'D');
Watermarks and Security
// Add watermark to PDF
$mpdf = new \Mpdf\Mpdf();
// Add watermark
$mpdf->SetWatermarkText('DRAFT');
$mpdf->showWatermarkText = true;
// Set PDF security
$mpdf->SetProtection(
['copy', 'print'], // Allow printing and copying
'user_password', // User password
'owner_password' // Owner password
);
$mpdf->WriteHTML($html);
$mpdf->Output('document.pdf', 'D');
Multi-language Support
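// Support for different languages and fonts
$defaultConfig = (new \Mpdf\Config\ConfigVariables())->getDefaults();
$defaultFontConfig = (new \Mpdf\Config\FontVariables())->getDefaults();
$mpdf = new \Mpdf\Mpdf([
    'mode' => 'utf-8',
    'format' => 'A4',
    // Merge with the defaults; replacing them would hide the bundled fonts
    'fontDir' => array_merge($defaultConfig['fontDir'], [
        __DIR__ . '/fonts/',
        '/usr/share/fonts/truetype/'
    ]),
    'fontdata' => $defaultFontConfig['fontdata'] + [
        'cjk' => [
            // The file name is illustrative; the font must exist in one of the font directories
            'R' => 'NotoSansCJK-Regular.ttc',
            'useOTL' => 0x100
        ]
    ]
]);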
// Set language-specific settings
$mpdf->autoScriptToLang = true;
$mpdf->autoLangToFont = true;
$mpdf->WriteHTML('<p>Hello 世界!</p>');
$mpdf->WriteHTML('<p>Bonjour le monde!</p>');
$mpdf->Output('multilingual.pdf', 'D');
Dynamic Content Generation
// Generate PDF with dynamic data from database
function generateInvoicePdf($invoiceId) {
// Get invoice data
$invoice = getInvoiceData($invoiceId);
$items = getInvoiceItems($invoiceId);
// Generate HTML template
$html = '
<h1>Invoice #' . htmlspecialchars((string) $invoice['id']) . '</h1>
<p>Date: ' . htmlspecialchars((string) $invoice['date']) . '</p>
<table>
<thead>
<tr>
<th>Item</th>
<th>Quantity</th>
<th>Price</th>
<th>Total</th>
</tr>
</thead>
<tbody>';
foreach ($items as $item) {
$html .= '
<tr>
<td>' . htmlspecialchars((string) $item['description']) . '</td>
<td>' . htmlspecialchars((string) $item['quantity']) . '</td>
<td>$' . number_format($item['price'], 2) . '</td>
<td>$' . number_format($item['quantity'] * $item['price'], 2) . '</td>
</tr>';
}
$html .= '
</tbody>
</table>
<p><strong>Total: $' . number_format($invoice['total'], 2) . '</strong></p>
';
// Generate PDF
$mpdf = new \Mpdf\Mpdf();
$mpdf->WriteHTML($html);
$mpdf->Output('invoice_' . $invoiceId . '.pdf', 'D');
}
Conclusion
HTML to PDF conversion on Linux servers using PHP can be implemented in several ways, depending on your requirements:
- For high-quality CSS support: Use a headless browser via Puppeteer, or WeasyPrint if your pages do not rely on JavaScript. Both render CSS more faithfully than the PHP libraries but require additional system dependencies.
- For PHP-native solutions: Mpdf offers a good balance of features and performance and addresses many of the limitations you encountered with DOMPDF, particularly for tables.
- For complex layouts: Consider wkhtmltopdf via Snappy, which renders with an older WebKit engine while remaining accessible through PHP.
- For performance optimization: Implement caching, background processing, and memory management techniques to handle large documents efficiently.
Recommended next steps: Start with Mpdf for its ease of integration, then move to a headless browser if you need more advanced CSS support. Always test with your own HTML to confirm that rendering and performance meet your requirements.