Turning PDFs Into PNG/APNG With Bun, Ghostscript, and UPNG
A light, reliable pipeline to convert single-page PDFs to PNG and multi-page PDFs to APNG using Bun, Ghostscript, and UPNG — plus an interactive browser viewer.
If you work with multi‑page PDFs, a lightweight, universally viewable preview format is invaluable. For single pages, PNG is perfect. For multi‑page documents, APNG gives you a compact, animated image that works in modern browsers without video players or heavyweight viewers.
This post covers a small Bun CLI that converts PDFs into PNG/APNG, plus a browser viewer to pause on any frame for comfortable reading.
Why APNG (and not GIF or MP4)
- Built‑in browser support across Chrome, Firefox, Safari.
- Truecolor and alpha; much better quality than GIF.
- Stays “image-like” for easy sharing, embedding, and CDN caching.
- Faster iteration than generating video for simple “page flipping.”
Approach
- Rasterize PDF pages to PNG via Ghostscript (gs) for speed and reliability with Bun (no native Node rebuilds).
- Decode those PNGs to RGBA pixels and assemble an APNG using upng-js.
- Normalize frame sizes, flatten transparency onto white, and encode as truecolor (no palette) to avoid rendering quirks.
The CLI
- Script path: scripts/pdf-to-images.ts
- Package script: pdf:images
- Usage:
# Single file (auto-detects 1 vs many pages)
bun run pdf:images path/to/file.pdf
# Options
--dpi=300 # rasterization quality (Ghostscript path)
--delay=400 # per-frame delay in ms for APNG
--scale=2 # only used by the pdf.js fallback path
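The flags above could be parsed with something like the following. This is a minimal sketch, not the script's actual parser: the `Options` shape, the `parseArgs` name, and the default values are assumptions made for illustration.

```typescript
// Hypothetical option shape; the defaults are illustrative assumptions,
// not necessarily what scripts/pdf-to-images.ts uses.
interface Options {
  input: string | null;
  dpi: number;   // Ghostscript rasterization resolution
  delay: number; // per-frame APNG delay in ms
  scale: number; // only meaningful for the pdf.js fallback path
}

function parseArgs(argv: string[]): Options {
  const opts: Options = { input: null, dpi: 200, delay: 400, scale: 2 };
  for (const arg of argv) {
    const m = /^--(dpi|delay|scale)=(\d+(?:\.\d+)?)$/.exec(arg);
    if (m) {
      const key = m[1] as "dpi" | "delay" | "scale";
      opts[key] = Number(m[2]);
    } else if (arg.startsWith("--")) {
      // Reject typos early instead of silently ignoring them.
      throw new Error(`Unknown or malformed option: ${arg}`);
    } else if (opts.input === null) {
      opts.input = arg; // first positional argument is the PDF path
    }
  }
  return opts;
}
```

In a script this would typically be called as `parseArgs(process.argv.slice(2))`.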
Behavior
- 1 page → writes <name>.png
- 2+ pages → writes <name>.apng
- Output is adjacent to the input PDF
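The naming rule can be sketched as a small pure function. The name `outputPathFor` is hypothetical, and the page count is assumed to be known already (for example, from the number of PNGs Ghostscript produced).

```typescript
// Derive the output path next to the input PDF:
// one page -> .png, two or more pages -> .apng.
function outputPathFor(pdfPath: string, pageCount: number): string {
  if (!Number.isInteger(pageCount) || pageCount < 1) {
    throw new Error(`Invalid page count: ${pageCount}`);
  }
  const ext = pageCount === 1 ? ".png" : ".apng";
  // Strip a trailing ".pdf" (any case); if absent, the extension is simply appended.
  const base = pdfPath.replace(/\.pdf$/i, "");
  return base + ext;
}
```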
Under the hood
- Detect gs. If present, run it once to produce page PNGs (fast, and avoids the Node‑canvas ABI mismatch with Bun):
gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=pngalpha -r300 -o /tmp/out-%03d.png input.pdf
- Decode each PNG with upng-js, flatten its alpha onto white, and pad to the max width/height across frames.
- Build a truecolor APNG (disable palette) with consistent frame sizes and your chosen delay.
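The Ghostscript invocation above can be assembled from the CLI options. The sketch below only builds the argument list; the `ghostscriptArgs` name and its parameters are assumptions, and the flags are exactly those in the command shown.

```typescript
// The two Ghostscript PNG devices mentioned in this post.
type GsDevice = "pngalpha" | "png16m";

// Build the argument list for `gs`; spawning the process is left to the caller.
function ghostscriptArgs(
  inputPdf: string,
  outPattern: string, // e.g. "/tmp/out-%03d.png" (one file per page)
  dpi = 300,
  device: GsDevice = "pngalpha",
): string[] {
  if (!Number.isFinite(dpi) || dpi <= 0) {
    throw new Error(`Invalid DPI: ${dpi}`);
  }
  return [
    "-dSAFER", "-dBATCH", "-dNOPAUSE",
    `-sDEVICE=${device}`,
    `-r${dpi}`,
    "-o", outPattern,
    inputPdf,
  ];
}
```

Passing the arguments as an array (for example to `Bun.spawnSync(["gs", ...args])`) avoids shell quoting problems with paths that contain spaces.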
The “White Rectangle” Problem (and Fix)
During testing, one multi‑page PDF produced an APNG that appeared blank in the browser. Root cause:
- The PDF had transparency/soft masks. Ghostscript’s pngalpha emitted valid RGB but with alpha 0 (fully transparent).
- Browsers correctly showed the page background through the APNG (white rectangle).
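One way to spot this condition on a decoded frame is to tally its alpha values. This is a diagnostic sketch rather than part of the original script; `alphaStats` is a hypothetical helper that expects non-premultiplied RGBA bytes.

```typescript
interface AlphaStats {
  transparent: number; // alpha === 0
  partial: number;     // 0 < alpha < 255
  opaque: number;      // alpha === 255
}

// Count pixels by alpha class in an RGBA buffer (4 bytes per pixel).
function alphaStats(rgba: Uint8Array): AlphaStats {
  if (rgba.length % 4 !== 0) {
    throw new Error("Expected RGBA data (length divisible by 4)");
  }
  const stats: AlphaStats = { transparent: 0, partial: 0, opaque: 0 };
  for (let i = 3; i < rgba.length; i += 4) {
    const a = rgba[i];
    if (a === 0) stats.transparent++;
    else if (a === 255) stats.opaque++;
    else stats.partial++;
  }
  return stats;
}
```

A frame where nearly every pixel lands in `transparent` is a strong hint that the page will render as a blank rectangle.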
Fixes applied:
- Flatten RGBA onto white before encoding (alpha → 255).
- Encode as truecolor (disable palette) to avoid lossy palette interactions with alpha.
- Optional alternative: switch Ghostscript to -sDEVICE=png16m (opaque) if you never want alpha.
Flattening snippet used before APNG encoding:
function flattenOnWhite(rgba: Uint8Array): Uint8Array {
  const out = new Uint8Array(rgba.length);
  for (let i = 0; i < rgba.length; i += 4) {
    const r = rgba[i], g = rgba[i + 1], b = rgba[i + 2], a = rgba[i + 3] / 255;
    // Composite each channel over a white (255) background.
    out[i] = Math.round(a * r + (1 - a) * 255);
    out[i + 1] = Math.round(a * g + (1 - a) * 255);
    out[i + 2] = Math.round(a * b + (1 - a) * 255);
    out[i + 3] = 255; // result is fully opaque
  }
  return out;
}
And forcing truecolor in UPNG’s encoder:
const apng = UPNG.encode(frames, width, height, /*ps*/ 0, delays, /*forbidPlte*/ true);
Result: both documents now convert to readable APNGs consistently.
Interactive APNG Viewer (Pause on Any Frame)
- HTML path: test/data/apng-viewer.html
- Features:
  - Play/Pause, Prev/Next, frame slider
  - Zoom (Fit/100/150/200%) and speed control
  - Works locally via file:// (loads UPNG.js + pako from node_modules)
  - Auto-tries to load a generated APNG; or use “Open APNG” to pick any file
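The playback controls reduce to two small index calculations. The sketch below is not taken from the viewer's source: `frameAt` and `stepFrame` are hypothetical names, and looping playback with wrap-around Prev/Next is an assumption about the behavior.

```typescript
// Which frame is visible after `elapsedMs` of looping playback,
// given per-frame delays in milliseconds.
function frameAt(delaysMs: number[], elapsedMs: number): number {
  const total = delaysMs.reduce((sum, d) => sum + d, 0);
  if (delaysMs.length === 0 || total <= 0) return 0;
  let t = ((elapsedMs % total) + total) % total; // wrap into [0, total)
  for (let i = 0; i < delaysMs.length; i++) {
    const d = delaysMs[i] ?? 0;
    if (t < d) return i;
    t -= d;
  }
  return delaysMs.length - 1;
}

// Prev/Next: move by `step` frames and wrap around at either end.
function stepFrame(current: number, step: number, frameCount: number): number {
  if (frameCount <= 0) return 0;
  return (((current + step) % frameCount) + frameCount) % frameCount;
}
```

A speed control can be layered on top by scaling `elapsedMs` before calling `frameAt`, and pausing amounts to no longer advancing the elapsed time.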
Setup and Examples
bun install
# macOS: install Ghostscript once
brew install ghostscript
# Convert multi-page PDF at higher quality
bun run pdf:images test/data/building-an-ai-native-engineering.pdf --dpi=300 --delay=400
# Convert another file
bun run pdf:images test/data/building-an-ai-native-engineering-team.pdf --dpi=300 --delay=400
Open test/data/apng-viewer.html to browse frames, pause, and zoom for comfortable reading.
Quality and Performance Tips
- DPI vs. size: 200–300 DPI is a good balance for readability and file size.
- Delay: 300–600 ms per page often feels right for “reading pace”; or just pause on specific frames in the viewer.
- APNG vs. video: APNG is ideal for static pages; if you need long, high‑FPS captures or audio, use video instead.
Lessons Learned
- Bun + native Node modules: Node‑canvas often needs a local rebuild for ABI alignment; Ghostscript avoids this entirely.
- Transparency matters: flatten alpha for reliable cross‑browser rendering; be explicit about palette vs. truecolor.
- Keep frames consistent: equal dimensions across frames eliminate APNG rendering surprises and encoder workarounds.
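Padding every frame to the largest width and height is the step that keeps dimensions equal. Below is a minimal sketch of that idea. The `Frame` shape and `padFrames` name are hypothetical, and aligning each page to the top-left of a white canvas is an assumption; the original script may position pages differently.

```typescript
interface Frame {
  width: number;
  height: number;
  rgba: Uint8Array; // width * height * 4 bytes
}

// Pad all frames to the max width/height, filling new area with opaque white.
function padFrames(frames: Frame[]): { width: number; height: number; frames: Uint8Array[] } {
  const width = Math.max(0, ...frames.map((f) => f.width));
  const height = Math.max(0, ...frames.map((f) => f.height));
  const padded = frames.map((f) => {
    if (f.rgba.length !== f.width * f.height * 4) {
      throw new Error("Frame dimensions do not match its RGBA buffer");
    }
    const out = new Uint8Array(width * height * 4).fill(255); // white, alpha 255
    for (let y = 0; y < f.height; y++) {
      // Copy one source row into the wider destination row (top-left aligned).
      const row = f.rgba.subarray(y * f.width * 4, (y + 1) * f.width * 4);
      out.set(row, y * width * 4);
    }
    return out;
  });
  return { width, height, frames: padded };
}
```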
What’s Next
- Add --opaque to force Ghostscript png16m (no alpha) when desired.
- Optional resize/crop flags (e.g., --max-width=1200) for predictable output sizes.
- Package as a standalone binary or npx-style tool for broader reuse.
Additional code excerpts (e.g., the full Ghostscript invocation, padding logic, or error handling) can be copied directly from scripts/pdf-to-images.ts.
Here’s why the APNG file sizes look the way they do:
- APNG stores raster frames; PDF stores vector/text
  - PDF often compresses text, fonts, and vector shapes very efficiently. Images inside PDFs are typically JPEG/Flate compressed, and fonts are embedded once then reused across pages.
  - APNG is 20 full raster frames (one per page). At 200 DPI, letter-size pages are about 1700×2200 px ≈ 3.74M pixels. Raw RGBA is ~15 MB per frame; 20 frames is ~300 MB raw before DEFLATE compression. Getting to 8–9 MB is already a 30–40× compression ratio, which is typical for mostly-text pages.
- The APNGs differ because of page complexity/compressibility
  - 9.3 MB (engineering) vs 8.4 MB (engineering-team) suggests the “engineering” pages have more photos/gradients or noise, which are less compressible than crisp text and flat
- Alpha channel and truecolor add overhead
  - We flatten visuals onto white but the encoder still emits APNG in truecolor with alpha (RGBA). A fully opaque alpha channel compresses well (many 0xFF bytes), but it’s still extra data in the stream.
  - If we forced RGB-only frames (no alpha) we could shave some size, sometimes noticeably.
- DPI scales size roughly with the square
  - If one file was rendered at higher DPI, its APNG would balloon fast. Going 200 → 300 DPI increases pixels by ~2.25×. In these numbers, both APNGs are in the same ballpark, so DPI was likely the same; the size gap is mainly compressibility differences.
- Why APNG > PDF in both cases
  - PDF retains structure (text, fonts, vectors), while APNG is pixels for every page. APNG pays a fixed per-pixel cost across all pages, even for simple text, whereas PDF reuses fonts and draws text/paths compactly.
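The raw-size arithmetic above is easy to reproduce. The helper below is an illustrative sketch (`estimateRawBytes` is a hypothetical name) that assumes uniform page sizes given in inches.

```typescript
// Uncompressed size of all frames: pixels per page * channels * pages.
function estimateRawBytes(
  widthIn: number,
  heightIn: number,
  dpi: number,
  pages: number,
  channels = 4, // RGBA; use 3 for RGB
): number {
  const widthPx = Math.round(widthIn * dpi);
  const heightPx = Math.round(heightIn * dpi);
  return widthPx * heightPx * channels * pages;
}

// US Letter (8.5 x 11 in), 20 pages.
const rawAt200 = estimateRawBytes(8.5, 11, 200, 20); // 1700 x 2200 px per page
const rawAt300 = estimateRawBytes(8.5, 11, 300, 20); // 2550 x 3300 px per page
console.log(rawAt200, rawAt300 / rawAt200);
```

For these inputs the 200 DPI total is 299,200,000 bytes (about 300 MB), and the 300 DPI total is 2.25 times that, matching the figures in the list.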
Quick ways to shrink the APNGs if needed
- Lower DPI: --dpi=150 or --dpi=200 (biggest impact; quality vs size tradeoff).
- Drop alpha: switch Ghostscript to RGB (png16m) or force RGB in the encoder; saves alpha overhead.
- Color quantization: allow palette (e.g., 256 colors) for mostly-text docs; can reduce size a lot but may introduce dithering on images.
- Re-compress: run a PNG optimizer (oxipng/advpng/zopfli) on frames before APNG assembly, or a final APNG optimizer.
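To make the "drop alpha" option concrete, the sketch below strips the alpha byte from frames that are already flattened and fully opaque, cutting raw data by a quarter. `dropAlpha` is a hypothetical helper; feeding RGB data to an encoder that accepts it is assumed and not shown.

```typescript
// Convert RGBA (4 bytes/pixel) to RGB (3 bytes/pixel) by discarding alpha.
// Only appropriate after flattening, when every alpha value is 255.
function dropAlpha(rgba: Uint8Array): Uint8Array {
  if (rgba.length % 4 !== 0) {
    throw new Error("Expected RGBA data (length divisible by 4)");
  }
  const rgb = new Uint8Array((rgba.length / 4) * 3);
  for (let src = 0, dst = 0; src < rgba.length; src += 4, dst += 3) {
    rgb.set(rgba.subarray(src, src + 3), dst);
  }
  return rgb;
}
```

The saving after DEFLATE is smaller than 25%, since a constant alpha channel already compresses well.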
Possible additions:
- --opaque to use png16m (no alpha)
- --quantize= to enable palette quantization (e.g., 256) for size-focused outputs
- A --dpi preset toggle to generate “web”, “readable”, “print-ish” tiers