PPTX Skill
Quick Reference
| Task | Guide |
|------|-------|
| Read/analyze content | python -m markitdown presentation.pptx |
| Edit or create from template | Read editing.md |
| Create from scratch | Read pptxgenjs.md |
Reading Content
# Text extraction
python -m markitdown presentation.pptx
# Visual overview
python scripts/thumbnail.py presentation.pptx
# Raw XML
python scripts/office/unpack.py presentation.pptx unpacked/
Editing Workflow
Read editing.md for full details.
Analyze template with thumbnail.py
Unpack → manipulate slides → edit content → clean → pack
Creating from Scratch
Read pptxgenjs.md for full details.
Use when no template or reference presentation is available.
Design Ideas
Don’t create boring slides. Plain bullets on a white background won’t impress anyone. Consider ideas from this list for each slide.
Before Starting
Pick a bold, content-informed color palette: The palette should feel designed for THIS topic. If swapping your colors into a completely different presentation would still “work,” you haven’t made specific enough choices.
Dominance over equality: One color should dominate (60-70% visual weight), with 1-2 supporting tones and one sharp accent. Never give all colors equal weight.
Dark/light contrast: Dark backgrounds for title + conclusion slides, light for content (“sandwich” structure). Or commit to dark throughout for a premium feel.
Commit to a visual motif: Pick ONE distinctive element and repeat it: rounded image frames, icons in colored circles, thick single-side borders. Carry it across every slide.
Color Palettes
Choose colors that match your topic — don’t default to generic blue. Use these palettes as inspiration:
| Theme | Primary | Secondary | Accent |
|-------|---------|-----------|--------|
| Midnight Executive | 1E2761 (navy) | CADCFC (ice blue) | FFFFFF (white) |
| Forest & Moss | 2C5F2D (forest) | 97BC62 (moss) | F5F5F5 (cream) |
| Coral Energy | F96167 (coral) | F9E795 (gold) | 2F3C7E (navy) |
| Warm Terracotta | B85042 (terracotta) | E7E8D1 (sand) | A7BEAE (sage) |
| Ocean Gradient | 065A82 (deep blue) | 1C7293 (teal) | 21295C (midnight) |
| Charcoal Minimal | 36454F (charcoal) | F2F2F2 (off-white) | 212121 (black) |
| Teal Trust | 028090 (teal) | 00A896 (seafoam) | 02C39A (mint) |
| Berry & Cream | 6D2E46 (berry) | A26769 (dusty rose) | ECE2D0 (cream) |
| Sage Calm | 84B59F (sage) | 69A297 (eucalyptus) | 50808E (slate) |
| Cherry Bold | 990011 (cherry) | FCF6F5 (off-white) | 2F3C7E (navy) |
For Each Slide
Every slide needs a visual element — image, chart, icon, or shape. Text-only slides are forgettable.
Layout options:
Two-column (text left, illustration on right)
Icon + text rows (icon in colored circle, bold header, description below)
2x2 or 2x3 grid (image on one side, grid of content blocks on other)
Half-bleed image (full left or right side) with content overlay
Data display:
Large stat callouts (big numbers 60-72pt with small labels below)
Comparison columns (before/after, pros/cons, side-by-side options)
Timeline or process flow (numbered steps, arrows)
Visual polish:
Icons in small colored circles next to section headers
Italic accent text for key stats or taglines
Typography
Choose an interesting font pairing — don’t default to Arial. Pick a header font with personality and pair it with a clean body font.
| Header Font | Body Font |
|-------------|-----------|
| Georgia | Calibri |
| Arial Black | Arial |
| Calibri | Calibri Light |
| Cambria | Calibri |
| Trebuchet MS | Calibri |
| Impact | Arial |
| Palatino | Garamond |
| Consolas | Calibri |

| Element | Size |
|---------|------|
| Slide title | 36-44pt bold |
| Section header | 20-24pt bold |
| Body text | 14-16pt |
| Captions | 10-12pt muted |
Spacing
0.5" minimum margins
0.3-0.5" between content blocks
Leave breathing room; don’t fill every inch
Avoid (Common Mistakes)
Don’t repeat the same layout — vary columns, cards, and callouts across slides
Don’t center body text — left-align paragraphs and lists; center only titles
Don’t skimp on size contrast — titles need 36pt+ to stand out from 14-16pt body
Don’t default to blue — pick colors that reflect the specific topic
Don’t mix spacing randomly — choose 0.3” or 0.5” gaps and use consistently
Don’t style one slide and leave the rest plain — commit fully or keep it simple throughout
Don’t create text-only slides — add images, icons, charts, or visual elements; avoid plain title + bullets
Don’t forget text box padding — when aligning lines or shapes with text edges, set margin: 0 on the text box or offset the shape to account for padding
Don’t use low-contrast elements — icons AND text need strong contrast against the background; avoid light text on light backgrounds or dark text on dark backgrounds
NEVER use accent lines under titles — these are a hallmark of AI-generated slides; use whitespace or background color instead
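The low-contrast rule above can be checked numerically before rendering. The following is a minimal sketch, assuming the WCAG 2.x contrast-ratio definition is an acceptable proxy for "strong contrast"; the function names are illustrative and not part of this skill's scripts. Colors are 6-character hex strings without a "#" prefix, matching the palette table.

```python
def _channel_to_linear(value: int) -> float:
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x definition)."""
    c = value / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a 6-char hex color such as "1E2761" (no "#" prefix)."""
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return (0.2126 * _channel_to_linear(r)
            + 0.7152 * _channel_to_linear(g)
            + 0.0722 * _channel_to_linear(b))


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio: 1.0 for identical colors, 21.0 for black on white."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    # Midnight Executive primary (navy) against its white accent.
    print(round(contrast_ratio("FFFFFF", "1E2761"), 2))
```

WCAG AA asks for at least 4.5:1 for body text and 3:1 for large text; treating those as the threshold for slides is a judgment call, not something this guide specifies.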
QA (Required)
Assume there are problems. Your job is to find them.
Your first render is almost never correct. Approach QA as a bug hunt, not a confirmation step. If you found zero issues on first inspection, you weren’t looking hard enough.
Content QA
python -m markitdown output.pptx
Check for missing content, typos, wrong order.
When using templates, check for leftover placeholder text:
python -m markitdown output.pptx | grep -iE "xxxx|lorem|ipsum|this.*(page|slide).*layout"
If grep returns results, fix them before declaring success.
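The same placeholder check can be run from Python when the markitdown output is already held as a string. This is a sketch only: `find_placeholder_lines` is a hypothetical helper, and the pattern is copied from the grep command above and compiled case-insensitively to mirror `-i`.

```python
import re

# Same pattern as the grep command above; IGNORECASE mirrors grep -i.
PLACEHOLDER_PATTERN = re.compile(
    r"xxxx|lorem|ipsum|this.*(page|slide).*layout", re.IGNORECASE
)


def find_placeholder_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines that look like template leftovers."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if PLACEHOLDER_PATTERN.search(line)
    ]


if __name__ == "__main__":
    extracted = "Quarterly results\nLorem ipsum dolor sit amet\nRevenue grew"
    for number, line in find_placeholder_lines(extracted):
        print(f"line {number}: {line}")
```

An empty result only means the known placeholder phrases are gone; it does not replace the visual check.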
Visual QA
⚠️ USE SUBAGENTS — even for 2-3 slides. You’ve been staring at the code and will see what you expect, not what’s there. Subagents have fresh eyes.
Convert slides to images (see Converting to Images), then use this prompt:
Visually inspect these slides. Assume there are issues — find them.
Look for:
- Overlapping elements (text through shapes, lines through words, stacked elements)
- Text overflow or cut off at edges/box boundaries
- Decorative lines positioned for single-line text but title wrapped to two lines
- Source citations or footers colliding with content above
- Elements too close (< 0.3" gaps) or cards/sections nearly touching
- Uneven gaps (large empty area in one place, cramped in another)
- Insufficient margin from slide edges (< 0.5")
- Columns or similar elements not aligned consistently
- Low-contrast text (e.g., light gray text on cream-colored background)
- Low-contrast icons (e.g., dark icons on dark backgrounds without a contrasting circle)
- Text boxes too narrow causing excessive wrapping
- Leftover placeholder content
For each slide, list issues or areas of concern, even if minor.
Read and analyze these images:
1. /path/to/slide-01.jpg (Expected: [brief description])
2. /path/to/slide-02.jpg (Expected: [brief description])
Report ALL issues found, including minor ones.
Verification Loop
Generate slides → Convert to images → Inspect
List issues found (if none found, look again more critically)
Fix issues
Re-verify affected slides — one fix often creates another problem
Repeat until a full pass reveals no new issues
Do not declare success until you’ve completed at least one fix-and-verify cycle.
Converting to Images
Convert presentations to individual slide images for visual inspection:
python scripts/office/soffice.py --headless --convert-to pdf output.pptx
pdftoppm -jpeg -r 150 output.pdf slide
This creates slide-01.jpg, slide-02.jpg, etc.
To re-render specific slides after fixes:
pdftoppm -jpeg -r 150 -f N -l N output.pdf slide-fixed
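When the re-render step is driven from a script, it helps to build the argument list separately from running it, so the command can be inspected first. The sketch below assumes `pdftoppm` is on the PATH; `pdftoppm_command` and `render_pages` are illustrative names, and only the flags already shown above (`-jpeg`, `-r`, `-f`, `-l`) are used.

```python
import subprocess


def pdftoppm_command(pdf_path, prefix, first_page=None, last_page=None, dpi=150):
    """Build the pdftoppm argument list; page bounds are optional."""
    cmd = ["pdftoppm", "-jpeg", "-r", str(dpi)]
    if first_page is not None:
        cmd += ["-f", str(first_page)]
    if last_page is not None:
        cmd += ["-l", str(last_page)]
    return cmd + [pdf_path, prefix]


def render_pages(pdf_path, prefix, first_page=None, last_page=None):
    """Run pdftoppm and raise CalledProcessError if it exits non-zero."""
    subprocess.run(
        pdftoppm_command(pdf_path, prefix, first_page, last_page), check=True
    )


if __name__ == "__main__":
    # Show the command that would re-render only slide 3 after a fix.
    print(" ".join(pdftoppm_command("output.pdf", "slide-fixed", 3, 3)))
```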
Dependencies
pip install "markitdown[pptx]" - text extraction
pip install Pillow - thumbnail grids
npm install -g pptxgenjs - creating from scratch
LibreOffice (soffice) - PDF conversion (auto-configured for sandboxed environments via scripts/office/soffice.py)
Poppler (pdftoppm) - PDF to images
© 2025 Anthropic, PBC. All rights reserved.
LICENSE: Use of these materials (including all code, prompts, assets, files,
and other components of this Skill) is governed by your agreement with
Anthropic regarding use of Anthropic's services. If no separate agreement
exists, use is governed by Anthropic's Consumer Terms of Service or
Commercial Terms of Service, as applicable:
https://www.anthropic.com/legal/consumer-terms
https://www.anthropic.com/legal/commercial-terms
Your applicable agreement is referred to as the "Agreement." "Services" are
as defined in the Agreement.
ADDITIONAL RESTRICTIONS: Notwithstanding anything in the Agreement to the
contrary, users may not:
- Extract these materials from the Services or retain copies of these
materials outside the Services
- Reproduce or copy these materials, except for temporary copies created
automatically during authorized use of the Services
- Create derivative works based on these materials
- Distribute, sublicense, or transfer these materials to any third party
- Make, offer to sell, sell, or import any inventions embodied in these
materials
- Reverse engineer, decompile, or disassemble these materials
The receipt, viewing, or possession of these materials does not convey or
imply any license or right beyond those expressly granted above.
Anthropic retains all right, title, and interest in these materials,
including all copyrights, patents, and other intellectual property rights.
Editing Presentations
Template-Based Workflow
When using an existing presentation as a template:
Analyze existing slides :
python scripts/thumbnail.py template.pptx
python -m markitdown template.pptx
Review thumbnails.jpg to see layouts, and markitdown output to see placeholder text.
Plan slide mapping : For each content section, choose a template slide.
⚠️ USE VARIED LAYOUTS — monotonous presentations are a common failure mode. Don't default to basic title + bullet slides. Actively seek out:
Multi-column layouts (2-column, 3-column)
Image + text combinations
Full-bleed images with text overlay
Quote or callout slides
Section dividers
Stat/number callouts
Icon grids or icon + text rows
Avoid: Repeating the same text-heavy layout for every slide.
Match content type to layout style (e.g., key points → bullet slide, team info → multi-column, testimonials → quote slide).
Unpack : python scripts/office/unpack.py template.pptx unpacked/
Build presentation (do this yourself, not with subagents):
Delete unwanted slides (remove from <p:sldIdLst>)
Duplicate slides you want to reuse (add_slide.py)
Reorder slides in <p:sldIdLst>
Complete all structural changes before step 5
Edit content : Update text in each slide{N}.xml.
Use subagents here if available — slides are separate XML files, so subagents can edit in parallel.
Clean : python scripts/clean.py unpacked/
Pack : python scripts/office/pack.py unpacked/ output.pptx --original template.pptx
Scripts
| Script | Purpose |
|--------|---------|
| unpack.py | Extract and pretty-print PPTX |
| add_slide.py | Duplicate slide or create from layout |
| clean.py | Remove orphaned files |
| pack.py | Repack with validation |
| thumbnail.py | Create visual grid of slides |
unpack.py
python scripts/office/unpack.py input.pptx unpacked/
Extracts PPTX, pretty-prints XML, escapes smart quotes.
add_slide.py
python scripts/add_slide.py unpacked/ slide2.xml # Duplicate slide
python scripts/add_slide.py unpacked/ slideLayout2.xml # From layout
Prints <p:sldId> to add to <p:sldIdLst> at desired position.
clean.py
python scripts/clean.py unpacked/
Removes slides not in <p:sldIdLst>, unreferenced media, orphaned rels.
pack.py
python scripts/office/pack.py unpacked/ output.pptx --original input.pptx
Validates, repairs, condenses XML, re-encodes smart quotes.
thumbnail.py
python scripts/thumbnail.py input.pptx [output_prefix] [--cols N]
Creates thumbnails.jpg with slide filenames as labels. Default 3 columns, max 12 per grid.
Use for template analysis only (choosing layouts). For visual QA, use soffice + pdftoppm to create full-resolution individual slide images—see SKILL.md.
Slide Operations
Slide order is in ppt/presentation.xml → <p:sldIdLst>.
Reorder: Rearrange <p:sldId> elements.
Delete: Remove <p:sldId>, then run clean.py.
Add: Use add_slide.py. Never manually copy slide files; the script handles notes references, Content_Types.xml, and relationship IDs that manual copying misses.
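Before reordering or deleting, it can help to see which slide file each <p:sldId> currently points at. The following read-only sketch resolves the order by matching r:id values in presentation.xml against ppt/_rels/presentation.xml.rels. `slide_order` is a hypothetical helper, not one of the bundled scripts, and it uses simple regular expressions on the assumption that the unpacked XML keeps each element on one tag.

```python
import re


def slide_order(presentation_xml: str, presentation_rels_xml: str) -> list[str]:
    """Return slide targets (e.g. "slides/slide2.xml") in presentation order."""
    # Map relationship Id -> Target, regardless of attribute order in the tag.
    targets = {}
    for tag in re.findall(r"<Relationship\b[^>]*>", presentation_rels_xml):
        rel_id = re.search(r'\bId="([^"]+)"', tag)
        target = re.search(r'\bTarget="([^"]+)"', tag)
        if rel_id and target:
            targets[rel_id.group(1)] = target.group(1)

    id_list = re.search(
        r"<p:sldIdLst>(.*?)</p:sldIdLst>", presentation_xml, re.DOTALL
    )
    if id_list is None:
        return []
    rids = re.findall(r'<p:sldId\b[^>]*\br:id="([^"]+)"', id_list.group(1))
    # Fall back to the raw r:id when no matching relationship is found.
    return [targets.get(rid, rid) for rid in rids]


if __name__ == "__main__":
    pres = ('<p:sldIdLst><p:sldId id="256" r:id="rId2"/>'
            '<p:sldId id="257" r:id="rId3"/></p:sldIdLst>')
    rels = ('<Relationships>'
            '<Relationship Id="rId2" Type="t" Target="slides/slide1.xml"/>'
            '<Relationship Id="rId3" Type="t" Target="slides/slide2.xml"/>'
            '</Relationships>')
    print(slide_order(pres, rels))
```

The helper only reports the order; the actual rearranging of <p:sldId> elements is still done by editing presentation.xml.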
Editing Content
Subagents: If available, use them here (after completing step 4). Each slide is a separate XML file, so subagents can edit in parallel. In your prompt to subagents, include:
The slide file path(s) to edit
"Use the Edit tool for all changes"
The formatting rules and common pitfalls below
For each slide:
Read the slide's XML
Identify ALL placeholder content—text, images, charts, icons, captions
Replace each placeholder with final content
Use the Edit tool, not sed or Python scripts. The Edit tool forces specificity about what to replace and where, yielding better reliability.
Formatting Rules
Bold all headers, subheadings, and inline labels : Use b="1" on <a:rPr>. This includes:
Slide titles
Section headers within a slide
Inline labels (e.g., "Status:", "Description:") at the start of a line
Never use unicode bullets (•) : Use proper list formatting with <a:buChar> or <a:buAutoNum>
Bullet consistency : Let bullets inherit from the layout. Only specify <a:buChar> or <a:buNone>.
Common Pitfalls
Template Adaptation
When source content has fewer items than the template:
Remove excess elements entirely (images, shapes, text boxes), don't just clear text
Check for orphaned visuals after clearing text content
Run visual QA to catch mismatched counts
When replacing text with different length content:
Shorter replacements : Usually safe
Longer replacements : May overflow or wrap unexpectedly
Test with visual QA after text changes
Consider truncating or splitting content to fit the template's design constraints
Template slots ≠ Source items : If template has 4 team members but source has 3 users, delete the 4th member's entire group (image + text boxes), not just the text.
Multi-Item Content
If source has multiple items (numbered lists, multiple sections), create separate <a:p> elements for each; never concatenate into one string.
❌ WRONG (all items in one paragraph):
<a:p>
  <a:r><a:rPr .../><a:t>Step 1: Do the first thing. Step 2: Do the second thing.</a:t></a:r>
</a:p>
✅ CORRECT (separate paragraphs with bold headers):
<a:p>
  <a:pPr algn="l"><a:lnSpc><a:spcPts val="3919"/></a:lnSpc></a:pPr>
  <a:r><a:rPr lang="en-US" sz="2799" b="1" .../><a:t>Step 1</a:t></a:r>
</a:p>
<a:p>
  <a:pPr algn="l"><a:lnSpc><a:spcPts val="3919"/></a:lnSpc></a:pPr>
  <a:r><a:rPr lang="en-US" sz="2799" .../><a:t>Do the first thing.</a:t></a:r>
</a:p>
<a:p>
  <a:pPr algn="l"><a:lnSpc><a:spcPts val="3919"/></a:lnSpc></a:pPr>
  <a:r><a:rPr lang="en-US" sz="2799" b="1" .../><a:t>Step 2</a:t></a:r>
</a:p>
<!-- continue pattern -->
Copy <a:pPr> from the original paragraph to preserve line spacing. Use b="1" on headers.
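The one-paragraph-per-item pattern can be drafted as text before it is pasted into the slide with the Edit tool. The sketch below only builds the XML string and never touches a slide file. `build_paragraphs` is a hypothetical helper, and the `ppr_xml` and `rpr_attrs` arguments stand in for the <a:pPr> element and <a:rPr> attributes that should be copied from the original paragraph.

```python
from xml.sax.saxutils import escape


def build_paragraphs(items, ppr_xml="", rpr_attrs='lang="en-US"'):
    """Return one <a:p> per header and one per body line; headers get b="1".

    items is a list of (header, body) pairs. Text is XML-escaped so that
    characters such as & and < stay valid inside <a:t>.
    """
    paragraphs = []
    for header, body in items:
        for text, bold in ((header, True), (body, False)):
            attrs = rpr_attrs + (' b="1"' if bold else "")
            paragraphs.append(
                f"<a:p>{ppr_xml}<a:r><a:rPr {attrs}/>"
                f"<a:t>{escape(text)}</a:t></a:r></a:p>"
            )
    return "\n".join(paragraphs)


if __name__ == "__main__":
    spacing = '<a:pPr algn="l"><a:lnSpc><a:spcPts val="3919"/></a:lnSpc></a:pPr>'
    print(build_paragraphs(
        [("Step 1", "Do the first thing."), ("Step 2", "Do the second thing.")],
        ppr_xml=spacing,
        rpr_attrs='lang="en-US" sz="2799"',
    ))
```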
Smart Quotes
Handled automatically by unpack/pack. But the Edit tool converts smart quotes to ASCII.
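When adding new text with quotes, use XML entities:
<a:t>the &#x201C;Agreement&#x201D;</a:t>

| Character | Name | Unicode | XML Entity |
|-----------|------|---------|------------|
| “ | Left double quote | U+201C | &#x201C; |
| ” | Right double quote | U+201D | &#x201D; |
| ‘ | Left single quote | U+2018 | &#x2018; |
| ’ | Right single quote | U+2019 | &#x2019; |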
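New text can be pre-encoded before it is pasted in, so the four curly quotes survive as numeric character references built from the Unicode code points listed above (for example U+201C becomes `&#x201C;`). This is a minimal sketch with a hypothetical helper name; it handles only those four characters.

```python
# Numeric character references for the four smart quotes listed above.
SMART_QUOTES = {
    "\u201c": "&#x201C;",  # left double quote
    "\u201d": "&#x201D;",  # right double quote
    "\u2018": "&#x2018;",  # left single quote
    "\u2019": "&#x2019;",  # right single quote
}


def encode_smart_quotes(text: str) -> str:
    """Replace each smart quote with its XML numeric character reference."""
    for char, entity in SMART_QUOTES.items():
        text = text.replace(char, entity)
    return text


if __name__ == "__main__":
    print(encode_smart_quotes("the \u201cAgreement\u201d"))
```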
Other
Whitespace : Use xml:space="preserve" on <a:t> with leading/trailing spaces
XML parsing : Use defusedxml.minidom, not xml.etree.ElementTree (corrupts namespaces)
PptxGenJS Tutorial
Setup & Basic Structure
const pptxgen = require("pptxgenjs");
let pres = new pptxgen();
pres.layout = 'LAYOUT_16x9'; // or 'LAYOUT_16x10', 'LAYOUT_4x3', 'LAYOUT_WIDE'
pres.author = 'Your Name';
pres.title = 'Presentation Title';
let slide = pres.addSlide();
slide.addText("Hello World!", { x: 0.5, y: 0.5, fontSize: 36, color: "363636" });
pres.writeFile({ fileName: "Presentation.pptx" });
Layout Dimensions
Slide dimensions (coordinates in inches):
LAYOUT_16x9: 10" × 5.625" (default)
LAYOUT_16x10: 10" × 6.25"
LAYOUT_4x3: 10" × 7.5"
LAYOUT_WIDE: 13.3" × 7.5"
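Because every coordinate is in inches, an element's box can be checked against the slide size before any slide is generated. The sketch below uses the four layout sizes listed above and assumes the 0.5" minimum margin from the spacing guidance as the default. `fits_within_margins` is an illustrative name, not a PptxGenJS function.

```python
# Slide sizes in inches (width, height), as listed above.
LAYOUT_SIZES_IN = {
    "LAYOUT_16x9": (10.0, 5.625),
    "LAYOUT_16x10": (10.0, 6.25),
    "LAYOUT_4x3": (10.0, 7.5),
    "LAYOUT_WIDE": (13.3, 7.5),
}


def fits_within_margins(x, y, w, h, layout="LAYOUT_16x9", margin=0.5):
    """True if the box (x, y, w, h) stays inside the slide minus the margin."""
    slide_w, slide_h = LAYOUT_SIZES_IN[layout]
    return (
        x >= margin
        and y >= margin
        and x + w <= slide_w - margin
        and y + h <= slide_h - margin
    )


if __name__ == "__main__":
    # A 5" wide element starting at x=7 runs off a 10" wide slide.
    print(fits_within_margins(7, 1, 5, 4))
    print(fits_within_margins(7, 1, 5, 4, layout="LAYOUT_WIDE"))
```

A check like this catches boxes that leave the slide, but not text that overflows inside its box, so the rendered images still need inspection.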
Text & Formatting
// Basic text
slide.addText("Simple Text", {
  x: 1, y: 1, w: 8, h: 2, fontSize: 24, fontFace: "Arial",
  color: "363636", bold: true, align: "center", valign: "middle"
});

// Character spacing (use charSpacing, not letterSpacing which is silently ignored)
slide.addText("SPACED TEXT", { x: 1, y: 1, w: 8, h: 1, charSpacing: 6 });

// Rich text arrays
slide.addText([
  { text: "Bold ", options: { bold: true } },
  { text: "Italic ", options: { italic: true } }
], { x: 1, y: 3, w: 8, h: 1 });

// Multi-line text (requires breakLine: true)
slide.addText([
  { text: "Line 1", options: { breakLine: true } },
  { text: "Line 2", options: { breakLine: true } },
  { text: "Line 3" } // Last item doesn't need breakLine
], { x: 0.5, y: 0.5, w: 8, h: 2 });

// Text box margin (internal padding)
slide.addText("Title", {
  x: 0.5, y: 0.3, w: 9, h: 0.6,
  margin: 0 // Use 0 when aligning text with other elements like shapes or icons
});
Tip: Text boxes have internal margin by default. Set margin: 0 when you need text to align precisely with shapes, lines, or icons at the same x-position.
Lists & Bullets
// ✅ CORRECT: Multiple bullets
slide.addText([
  { text: "First item", options: { bullet: true, breakLine: true } },
  { text: "Second item", options: { bullet: true, breakLine: true } },
  { text: "Third item", options: { bullet: true } }
], { x: 0.5, y: 0.5, w: 8, h: 3 });

// ❌ WRONG: Never use unicode bullets
slide.addText("• First item", { ... }); // Creates double bullets

// Sub-items and numbered lists
{ text: "Sub-item", options: { bullet: true, indentLevel: 1 } }
{ text: "First", options: { bullet: { type: "number" }, breakLine: true } }
Shapes
slide.addShape(pres.shapes.RECTANGLE, {
  x: 0.5, y: 0.8, w: 1.5, h: 3.0,
  fill: { color: "FF0000" }, line: { color: "000000", width: 2 }
});

slide.addShape(pres.shapes.OVAL, { x: 4, y: 1, w: 2, h: 2, fill: { color: "0000FF" } });

slide.addShape(pres.shapes.LINE, {
  x: 1, y: 3, w: 5, h: 0, line: { color: "FF0000", width: 3, dashType: "dash" }
});

// With transparency
slide.addShape(pres.shapes.RECTANGLE, {
  x: 1, y: 1, w: 3, h: 2,
  fill: { color: "0088CC", transparency: 50 }
});

// Rounded rectangle (rectRadius only works with ROUNDED_RECTANGLE, not RECTANGLE)
// ⚠️ Don't pair with rectangular accent overlays; they won't cover rounded corners. Use RECTANGLE instead.
slide.addShape(pres.shapes.ROUNDED_RECTANGLE, {
  x: 1, y: 1, w: 3, h: 2,
  fill: { color: "FFFFFF" }, rectRadius: 0.1
});

// With shadow
slide.addShape(pres.shapes.RECTANGLE, {
  x: 1, y: 1, w: 3, h: 2,
  fill: { color: "FFFFFF" },
  shadow: { type: "outer", color: "000000", blur: 6, offset: 2, angle: 135, opacity: 0.15 }
});
Shadow options:
| Property | Type | Range | Notes |
|----------|------|-------|-------|
| type | string | "outer", "inner" | |
| color | string | 6-char hex (e.g. "000000") | No # prefix, no 8-char hex (see Common Pitfalls) |
| blur | number | 0-100 pt | |
| offset | number | 0-200 pt | Must be non-negative; negative values corrupt the file |
| angle | number | 0-359 degrees | Direction the shadow falls (135 = bottom-right, 270 = upward) |
| opacity | number | 0.0-1.0 | Use this for transparency, never encode in color string |
To cast a shadow upward (e.g. on a footer bar), use angle: 270 with a positive offset — do not use a negative offset.
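The ranges in the shadow table can be turned into a quick pre-flight check on the option values before they are written into the generating script. This is a language-neutral sketch in Python rather than part of PptxGenJS. `validate_shadow` is a hypothetical helper that only checks keys that are present, since the table does not say which properties are required.

```python
import re

# Allowed numeric ranges, taken from the shadow options table.
NUMERIC_RANGES = {
    "blur": (0, 100),
    "offset": (0, 200),
    "angle": (0, 359),
    "opacity": (0.0, 1.0),
}


def validate_shadow(shadow: dict) -> list[str]:
    """Return a list of problems; an empty list means every present key is in range."""
    problems = []
    if "type" in shadow and shadow["type"] not in ("outer", "inner"):
        problems.append('type must be "outer" or "inner"')
    if "color" in shadow and not re.fullmatch(r"[0-9A-Fa-f]{6}", str(shadow["color"])):
        problems.append("color must be 6 hex characters with no # prefix")
    for key, (low, high) in NUMERIC_RANGES.items():
        if key in shadow and not (low <= shadow[key] <= high):
            problems.append(f"{key} must be between {low} and {high}")
    return problems


if __name__ == "__main__":
    # 8-char hex and a negative offset are both flagged.
    print(validate_shadow({"type": "outer", "color": "00000020", "offset": -2}))
```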
Note : Gradient fills are not natively supported. Use a gradient image as a background instead.
Images
Image Sources
// From file path
slide.addImage({ path: "images/chart.png", x: 1, y: 1, w: 5, h: 3 });

// From URL
slide.addImage({ path: "https://example.com/image.jpg", x: 1, y: 1, w: 5, h: 3 });

// From base64 (faster, no file I/O)
slide.addImage({ data: "image/png;base64,iVBORw0KGgo...", x: 1, y: 1, w: 5, h: 3 });
Image Options
slide.addImage({
  path: "image.png",
  x: 1, y: 1, w: 5, h: 3,
  rotate: 45,              // 0-359 degrees
  rounding: true,          // Circular crop
  transparency: 50,        // 0-100
  flipH: true,             // Horizontal flip
  flipV: false,            // Vertical flip
  altText: "Description",  // Accessibility
  hyperlink: { url: "https://example.com" }
});
Image Sizing Modes
// Contain - fit inside, preserve ratio
{ sizing: { type: 'contain', w: 4, h: 3 } }

// Cover - fill area, preserve ratio (may crop)
{ sizing: { type: 'cover', w: 4, h: 3 } }

// Crop - cut specific portion
{ sizing: { type: 'crop', x: 0.5, y: 0.5, w: 2, h: 2 } }
Calculate Dimensions (preserve aspect ratio)
const origWidth = 1978, origHeight = 923, maxHeight = 3.0;
const calcWidth = maxHeight * (origWidth / origHeight);
const centerX = (10 - calcWidth) / 2;

slide.addImage({ path: "image.png", x: centerX, y: 1.2, w: calcWidth, h: maxHeight });
Supported Formats
Standard : PNG, JPG, GIF (animated GIFs work in Microsoft 365)
SVG : Works in modern PowerPoint/Microsoft 365
Icons
Use react-icons to generate SVG icons, then rasterize to PNG for universal compatibility.
Setup
const React = require("react");
const ReactDOMServer = require("react-dom/server");
const sharp = require("sharp");
const { FaCheckCircle, FaChartLine } = require("react-icons/fa");

function renderIconSvg(IconComponent, color = "#000000", size = 256) {
  return ReactDOMServer.renderToStaticMarkup(
    React.createElement(IconComponent, { color, size: String(size) })
  );
}

async function iconToBase64Png(IconComponent, color, size = 256) {
  const svg = renderIconSvg(IconComponent, color, size);
  const pngBuffer = await sharp(Buffer.from(svg)).png().toBuffer();
  return "image/png;base64," + pngBuffer.toString("base64");
}
Add Icon to Slide
const iconData = await iconToBase64Png(FaCheckCircle, "#4472C4", 256);

slide.addImage({
  data: iconData,
  x: 1, y: 1, w: 0.5, h: 0.5 // Size in inches
});
Note: Use size 256 or higher for crisp icons. The size parameter controls the rasterization resolution, not the display size on the slide (which is set by w and h in inches).
Icon Libraries
Install: npm install -g react-icons react react-dom sharp
Popular icon sets in react-icons:
react-icons/fa - Font Awesome
react-icons/md - Material Design
react-icons/hi - Heroicons
react-icons/bi - Bootstrap Icons
Slide Backgrounds
// Solid color
slide.background = { color: "F1F1F1" };

// Color with transparency
slide.background = { color: "FF3399", transparency: 50 };

// Image from URL
slide.background = { path: "https://example.com/bg.jpg" };

// Image from base64
slide.background = { data: "image/png;base64,iVBORw0KGgo..." };
Tables
slide.addTable([
  ["Header 1", "Header 2"],
  ["Cell 1", "Cell 2"]
], {
  x: 1, y: 1, w: 8, h: 2,
  border: { pt: 1, color: "999999" }, fill: { color: "F1F1F1" }
});

// Advanced with merged cells
let tableData = [
  [{ text: "Header", options: { fill: { color: "6699CC" }, color: "FFFFFF", bold: true } }, "Cell"],
  [{ text: "Merged", options: { colspan: 2 } }]
];
slide.addTable(tableData, { x: 1, y: 3.5, w: 8, colW: [4, 4] });
Charts
// Bar chart
slide.addChart(pres.charts.BAR, [{
  name: "Sales", labels: ["Q1", "Q2", "Q3", "Q4"], values: [4500, 5500, 6200, 7100]
}], {
  x: 0.5, y: 0.6, w: 6, h: 3, barDir: 'col',
  showTitle: true, title: 'Quarterly Sales'
});

// Line chart
slide.addChart(pres.charts.LINE, [{
  name: "Temp", labels: ["Jan", "Feb", "Mar"], values: [32, 35, 42]
}], { x: 0.5, y: 4, w: 6, h: 3, lineSize: 3, lineSmooth: true });

// Pie chart
slide.addChart(pres.charts.PIE, [{
  name: "Share", labels: ["A", "B", "Other"], values: [35, 45, 20]
}], { x: 7, y: 1, w: 5, h: 4, showPercent: true });
Better-Looking Charts
Default charts look dated. Apply these options for a modern, clean appearance:
slide.addChart(pres.charts.BAR, chartData, {
  x: 0.5, y: 1, w: 9, h: 4, barDir: "col",

  // Custom colors (match your presentation palette)
  chartColors: ["0D9488", "14B8A6", "5EEAD4"],

  // Clean background
  chartArea: { fill: { color: "FFFFFF" }, roundedCorners: true },

  // Muted axis labels
  catAxisLabelColor: "64748B",
  valAxisLabelColor: "64748B",

  // Subtle grid (value axis only)
  valGridLine: { color: "E2E8F0", size: 0.5 },
  catGridLine: { style: "none" },

  // Data labels on bars
  showValue: true,
  dataLabelPosition: "outEnd",
  dataLabelColor: "1E293B",

  // Hide legend for single series
  showLegend: false,
});
Key styling options:
chartColors: [...] - hex colors for series/segments
chartArea: { fill, border, roundedCorners } - chart background
catGridLine/valGridLine: { color, style, size } - grid lines (style: "none" to hide)
lineSmooth: true - curved lines (line charts)
legendPos: "r" - legend position: "b", "t", "l", "r", "tr"
Slide Masters
pres.defineSlideMaster({
  title: 'TITLE_SLIDE', background: { color: '283A5E' },
  objects: [{
    placeholder: { options: { name: 'title', type: 'title', x: 1, y: 2, w: 8, h: 2 } }
  }]
});

let titleSlide = pres.addSlide({ masterName: "TITLE_SLIDE" });
titleSlide.addText("My Title", { placeholder: "title" });
Common Pitfalls
⚠️ These issues cause file corruption, visual bugs, or broken output. Avoid them.
NEVER use "#" with hex colors - causes file corruption
color: "FF0000"   // ✅ CORRECT
color: "#FF0000"  // ❌ WRONG
NEVER encode opacity in hex color strings - 8-char colors (e.g., "00000020") corrupt the file. Use the opacity property instead.
shadow: { type: "outer", blur: 6, offset: 2, color: "00000020" }               // ❌ CORRUPTS FILE
shadow: { type: "outer", blur: 6, offset: 2, color: "000000", opacity: 0.12 }  // ✅ CORRECT
Use bullet: true - NEVER unicode symbols like "•" (creates double bullets)
Use breakLine: true between array items - otherwise the text runs together
Avoid lineSpacing with bullets - causes excessive gaps; use paraSpaceAfter instead
Each presentation needs a fresh instance - don't reuse pptxgen() objects
NEVER reuse option objects across calls - PptxGenJS mutates objects in-place (e.g. converting shadow values to EMU). Sharing one object between multiple calls corrupts the second shape.
const shadow = { type: "outer", blur: 6, offset: 2, color: "000000", opacity: 0.15 };
slide.addShape(pres.shapes.RECTANGLE, { shadow, ... }); // ❌ second call gets already-converted values
slide.addShape(pres.shapes.RECTANGLE, { shadow, ... });

const makeShadow = () => ({ type: "outer", blur: 6, offset: 2, color: "000000", opacity: 0.15 });
slide.addShape(pres.shapes.RECTANGLE, { shadow: makeShadow(), ... }); // ✅ fresh object each time
slide.addShape(pres.shapes.RECTANGLE, { shadow: makeShadow(), ... });
Don't use ROUNDED_RECTANGLE with accent borders - rectangular overlay bars won't cover rounded corners. Use RECTANGLE instead.
// ❌ WRONG: Accent bar doesn't cover rounded corners
slide.addShape(pres.shapes.ROUNDED_RECTANGLE, { x: 1, y: 1, w: 3, h: 1.5, fill: { color: "FFFFFF" } });
slide.addShape(pres.shapes.RECTANGLE, { x: 1, y: 1, w: 0.08, h: 1.5, fill: { color: "0891B2" } });

// ✅ CORRECT: Use RECTANGLE for clean alignment
slide.addShape(pres.shapes.RECTANGLE, { x: 1, y: 1, w: 3, h: 1.5, fill: { color: "FFFFFF" } });
slide.addShape(pres.shapes.RECTANGLE, { x: 1, y: 1, w: 0.08, h: 1.5, fill: { color: "0891B2" } });
Quick Reference
Shapes : RECTANGLE, OVAL, LINE, ROUNDED_RECTANGLE
Charts : BAR, LINE, PIE, DOUGHNUT, SCATTER, BUBBLE, RADAR
Layouts : LAYOUT_16x9 (10"×5.625"), LAYOUT_16x10, LAYOUT_4x3, LAYOUT_WIDE
Alignment : "left", "center", "right"
Chart data labels : "outEnd", "inEnd", "center"
"""Add a new slide to an unpacked PPTX directory.

Usage: python add_slide.py <unpacked_dir> <source>

The source can be:
  - A slide file (e.g., slide2.xml) - duplicates the slide
  - A layout file (e.g., slideLayout2.xml) - creates from layout

Examples:
    python add_slide.py unpacked/ slide2.xml
    # Duplicates slide2, creates slide5.xml

    python add_slide.py unpacked/ slideLayout2.xml
    # Creates slide5.xml from slideLayout2.xml

To see available layouts: ls unpacked/ppt/slideLayouts/

Prints the <p:sldId> element to add to presentation.xml.
"""

import re
import shutil
import sys
from pathlib import Path


def get_next_slide_number(slides_dir: Path) -> int:
    # Highest existing slideN.xml number plus one, or 1 if there are no slides yet.
    existing = [int(m.group(1)) for f in slides_dir.glob("slide*.xml")
                if (m := re.match(r"slide(\d+)\.xml", f.name))]
    return max(existing) + 1 if existing else 1
def create_slide_from_layout(unpacked_dir: Path, layout_file: str) -> None:
    slides_dir = unpacked_dir / "ppt" / "slides"
    rels_dir = slides_dir / "_rels"
    layouts_dir = unpacked_dir / "ppt" / "slideLayouts"

    layout_path = layouts_dir / layout_file
    if not layout_path.exists():
        print(f"Error: {layout_path} not found", file=sys.stderr)
        sys.exit(1)

    next_num = get_next_slide_number(slides_dir)
    dest = f"slide{next_num}.xml"
    dest_slide = slides_dir / dest
    dest_rels = rels_dir / f"{dest}.rels"

    # Minimal slide: an empty shape tree that inherits the master color mapping.
    slide_xml = '''<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<p:sld xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main">
  <p:cSld>
    <p:spTree>
      <p:nvGrpSpPr>
        <p:cNvPr id="1" name=""/>
        <p:cNvGrpSpPr/>
        <p:nvPr/>
      </p:nvGrpSpPr>
      <p:grpSpPr>
        <a:xfrm>
          <a:off x="0" y="0"/>
          <a:ext cx="0" cy="0"/>
          <a:chOff x="0" y="0"/>
          <a:chExt cx="0" cy="0"/>
        </a:xfrm>
      </p:grpSpPr>
    </p:spTree>
  </p:cSld>
  <p:clrMapOvr>
    <a:masterClrMapping/>
  </p:clrMapOvr>
</p:sld>'''
    dest_slide.write_text(slide_xml, encoding="utf-8")

    # The new slide's only relationship points at the chosen layout.
    rels_dir.mkdir(exist_ok=True)
    rels_xml = f'''<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slideLayout" Target="../slideLayouts/{layout_file}"/>
</Relationships>'''
    dest_rels.write_text(rels_xml, encoding="utf-8")

    _add_to_content_types(unpacked_dir, dest)
    rid = _add_to_presentation_rels(unpacked_dir, dest)
    next_slide_id = _get_next_slide_id(unpacked_dir)

    print(f"Created {dest} from {layout_file}")
    print(f'Add to presentation.xml <p:sldIdLst>: <p:sldId id="{next_slide_id}" r:id="{rid}"/>')
def duplicate_slide(unpacked_dir: Path, source: str) -> None:
    slides_dir = unpacked_dir / "ppt" / "slides"
    rels_dir = slides_dir / "_rels"

    source_slide = slides_dir / source
    if not source_slide.exists():
        print(f"Error: {source_slide} not found", file=sys.stderr)
        sys.exit(1)

    next_num = get_next_slide_number(slides_dir)
    dest = f"slide{next_num}.xml"
    dest_slide = slides_dir / dest

    source_rels = rels_dir / f"{source}.rels"
    dest_rels = rels_dir / f"{dest}.rels"

    shutil.copy2(source_slide, dest_slide)

    if source_rels.exists():
        shutil.copy2(source_rels, dest_rels)

        # Drop the notesSlide relationship so the copy does not share the source's notes.
        rels_content = dest_rels.read_text(encoding="utf-8")
        rels_content = re.sub(
            r'\s*<Relationship[^>]*Type="[^"]*notesSlide"[^>]*/>\s*',
            "\n",
            rels_content,
        )
        dest_rels.write_text(rels_content, encoding="utf-8")

    _add_to_content_types(unpacked_dir, dest)
    rid = _add_to_presentation_rels(unpacked_dir, dest)
    next_slide_id = _get_next_slide_id(unpacked_dir)

    print(f"Created {dest} from {source}")
    print(f'Add to presentation.xml <p:sldIdLst>: <p:sldId id="{next_slide_id}" r:id="{rid}"/>')
def _add_to_content_types (unpacked_dir: Path, dest: str ) -> None :
content_types_path = unpacked_dir / "[Content_Types].xml"
content_types = content_types_path.read_text( encoding = "utf-8" )
new_override = f '<Override PartName="/ppt/slides/ { dest } " ContentType="application/vnd.openxmlformats-officedocument.presentationml.slide+xml"/>'
if f "/ppt/slides/ { dest } " not in content_types:
content_types = content_types.replace( "</Types>" , f " { new_override }\n </Types>" )
content_types_path.write_text(content_types, encoding = "utf-8" )
def _add_to_presentation_rels(unpacked_dir: Path, dest: str) -> str:
    pres_rels_path = unpacked_dir / "ppt" / "_rels" / "presentation.xml.rels"
    pres_rels = pres_rels_path.read_text(encoding="utf-8")

    rids = [int(m) for m in re.findall(r'Id="rId(\d+)"', pres_rels)]
    next_rid = max(rids) + 1 if rids else 1
    rid = f"rId{next_rid}"

    new_rel = f'<Relationship Id="{rid}" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide" Target="slides/{dest}"/>'

    if f"slides/{dest}" not in pres_rels:
        pres_rels = pres_rels.replace("</Relationships>", f"  {new_rel}\n</Relationships>")
        pres_rels_path.write_text(pres_rels, encoding="utf-8")

    return rid
def _get_next_slide_id(unpacked_dir: Path) -> int:
    pres_path = unpacked_dir / "ppt" / "presentation.xml"
    pres_content = pres_path.read_text(encoding="utf-8")
    slide_ids = [int(m) for m in re.findall(r'<p:sldId[^>]*id="(\d+)"', pres_content)]
    return max(slide_ids) + 1 if slide_ids else 256
def parse_source(source: str) -> tuple[str, str | None]:
    if source.startswith("slideLayout") and source.endswith(".xml"):
        return ("layout", source)

    return ("slide", None)
if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python add_slide.py <unpacked_dir> <source>", file=sys.stderr)
        print("", file=sys.stderr)
        print("Source can be:", file=sys.stderr)
        print("  slide2.xml        - duplicate an existing slide", file=sys.stderr)
        print("  slideLayout2.xml  - create from a layout template", file=sys.stderr)
        print("", file=sys.stderr)
        print("To see available layouts: ls <unpacked_dir>/ppt/slideLayouts/", file=sys.stderr)
        sys.exit(1)

    unpacked_dir = Path(sys.argv[1])
    source = sys.argv[2]

    if not unpacked_dir.exists():
        print(f"Error: {unpacked_dir} not found", file=sys.stderr)
        sys.exit(1)

    source_type, layout_file = parse_source(source)

    if source_type == "layout" and layout_file is not None:
        create_slide_from_layout(unpacked_dir, layout_file)
    else:
        duplicate_slide(unpacked_dir, source)
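The ID allocation used by `_add_to_presentation_rels` and `_get_next_slide_id` can be tried on in-memory strings without touching an unpacked deck. The following is a minimal sketch, not part of `add_slide.py`: the function names `next_rid` and `next_slide_id` and the sample XML fragments are hypothetical and exist only for illustration; the regular expressions and the fallback values (`rId1`, `256`) mirror the script above.

```python
import re

# Hypothetical sample fragments, trimmed to the attributes the regexes look at.
SAMPLE_RELS = (
    "<Relationships>"
    '<Relationship Id="rId1" Type="t" Target="slideMasters/slideMaster1.xml"/>'
    '<Relationship Id="rId7" Type="t" Target="slides/slide1.xml"/>'
    "</Relationships>"
)
SAMPLE_PRES = (
    "<p:sldIdLst>"
    '<p:sldId id="256" r:id="rId2"/>'
    '<p:sldId id="259" r:id="rId3"/>'
    "</p:sldIdLst>"
)


def next_rid(pres_rels: str) -> str:
    """Return the next free relationship ID: one above the highest rIdN found."""
    rids = [int(m) for m in re.findall(r'Id="rId(\d+)"', pres_rels)]
    return f"rId{max(rids) + 1 if rids else 1}"


def next_slide_id(pres_xml: str) -> int:
    """Return the next free slide ID, starting at 256 when no slide is listed."""
    ids = [int(m) for m in re.findall(r'<p:sldId[^>]*id="(\d+)"', pres_xml)]
    return max(ids) + 1 if ids else 256


if __name__ == "__main__":
    print(next_rid(SAMPLE_RELS))
    print(next_slide_id(SAMPLE_PRES))
```

Both helpers take the maximum existing value plus one rather than filling gaps, so IDs freed by deleted slides are not reused.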
"""Remove unreferenced files from an unpacked PPTX directory.

Usage: python clean.py <unpacked_dir>

Example:
    python clean.py unpacked/

This script removes:
- Orphaned slides (not in sldIdLst) and their relationships
- [trash] directory (unreferenced files)
- Orphaned .rels files for deleted resources
- Unreferenced media, embeddings, charts, diagrams, drawings, ink files
- Unreferenced theme files
- Unreferenced notes slides
- Content-Type overrides for deleted files
"""

import re
import sys
from pathlib import Path

import defusedxml.minidom
def get_slides_in_sldidlst(unpacked_dir: Path) -> set[str]:
    pres_path = unpacked_dir / "ppt" / "presentation.xml"
    pres_rels_path = unpacked_dir / "ppt" / "_rels" / "presentation.xml.rels"

    if not pres_path.exists() or not pres_rels_path.exists():
        return set()

    rels_dom = defusedxml.minidom.parse(str(pres_rels_path))
    rid_to_slide = {}
    for rel in rels_dom.getElementsByTagName("Relationship"):
        rid = rel.getAttribute("Id")
        target = rel.getAttribute("Target")
        rel_type = rel.getAttribute("Type")
        if "slide" in rel_type and target.startswith("slides/"):
            rid_to_slide[rid] = target.replace("slides/", "")

    pres_content = pres_path.read_text(encoding="utf-8")
    referenced_rids = set(re.findall(r'<p:sldId[^>]*r:id="([^"]+)"', pres_content))

    return {rid_to_slide[rid] for rid in referenced_rids if rid in rid_to_slide}
def remove_orphaned_slides(unpacked_dir: Path) -> list[str]:
    slides_dir = unpacked_dir / "ppt" / "slides"
    slides_rels_dir = slides_dir / "_rels"
    pres_rels_path = unpacked_dir / "ppt" / "_rels" / "presentation.xml.rels"

    if not slides_dir.exists():
        return []

    referenced_slides = get_slides_in_sldidlst(unpacked_dir)
    removed = []

    for slide_file in slides_dir.glob("slide*.xml"):
        if slide_file.name not in referenced_slides:
            rel_path = slide_file.relative_to(unpacked_dir)
            slide_file.unlink()
            removed.append(str(rel_path))

            rels_file = slides_rels_dir / f"{slide_file.name}.rels"
            if rels_file.exists():
                rels_file.unlink()
                removed.append(str(rels_file.relative_to(unpacked_dir)))

    if removed and pres_rels_path.exists():
        rels_dom = defusedxml.minidom.parse(str(pres_rels_path))
        changed = False

        for rel in list(rels_dom.getElementsByTagName("Relationship")):
            target = rel.getAttribute("Target")
            if target.startswith("slides/"):
                slide_name = target.replace("slides/", "")
                if slide_name not in referenced_slides:
                    if rel.parentNode:
                        rel.parentNode.removeChild(rel)
                        changed = True

        if changed:
            with open(pres_rels_path, "wb") as f:
                f.write(rels_dom.toxml(encoding="utf-8"))

    return removed
def remove_trash_directory(unpacked_dir: Path) -> list[str]:
    trash_dir = unpacked_dir / "[trash]"
    removed = []

    if trash_dir.exists() and trash_dir.is_dir():
        for file_path in trash_dir.iterdir():
            if file_path.is_file():
                rel_path = file_path.relative_to(unpacked_dir)
                removed.append(str(rel_path))
                file_path.unlink()
        trash_dir.rmdir()

    return removed
def get_slide_referenced_files(unpacked_dir: Path) -> set:
    referenced = set()
    slides_rels_dir = unpacked_dir / "ppt" / "slides" / "_rels"

    if not slides_rels_dir.exists():
        return referenced

    for rels_file in slides_rels_dir.glob("*.rels"):
        dom = defusedxml.minidom.parse(str(rels_file))
        for rel in dom.getElementsByTagName("Relationship"):
            target = rel.getAttribute("Target")
            if not target:
                continue
            target_path = (rels_file.parent.parent / target).resolve()
            try:
                referenced.add(target_path.relative_to(unpacked_dir.resolve()))
            except ValueError:
                pass

    return referenced
def remove_orphaned_rels_files(unpacked_dir: Path) -> list[str]:
    resource_dirs = ["charts", "diagrams", "drawings"]
    removed = []
    slide_referenced = get_slide_referenced_files(unpacked_dir)

    for dir_name in resource_dirs:
        rels_dir = unpacked_dir / "ppt" / dir_name / "_rels"
        if not rels_dir.exists():
            continue

        for rels_file in rels_dir.glob("*.rels"):
            resource_file = rels_dir.parent / rels_file.name.replace(".rels", "")
            try:
                resource_rel_path = resource_file.resolve().relative_to(unpacked_dir.resolve())
            except ValueError:
                continue

            if not resource_file.exists() or resource_rel_path not in slide_referenced:
                rels_file.unlink()
                rel_path = rels_file.relative_to(unpacked_dir)
                removed.append(str(rel_path))

    return removed
def get_referenced_files(unpacked_dir: Path) -> set:
    referenced = set()

    for rels_file in unpacked_dir.rglob("*.rels"):
        dom = defusedxml.minidom.parse(str(rels_file))
        for rel in dom.getElementsByTagName("Relationship"):
            target = rel.getAttribute("Target")
            if not target:
                continue
            target_path = (rels_file.parent.parent / target).resolve()
            try:
                referenced.add(target_path.relative_to(unpacked_dir.resolve()))
            except ValueError:
                pass

    return referenced
def remove_orphaned_files(unpacked_dir: Path, referenced: set) -> list[str]:
    resource_dirs = ["media", "embeddings", "charts", "diagrams", "tags", "drawings", "ink"]
    removed = []

    for dir_name in resource_dirs:
        dir_path = unpacked_dir / "ppt" / dir_name
        if not dir_path.exists():
            continue

        for file_path in dir_path.glob("*"):
            if not file_path.is_file():
                continue
            rel_path = file_path.relative_to(unpacked_dir)
            if rel_path not in referenced:
                file_path.unlink()
                removed.append(str(rel_path))

    theme_dir = unpacked_dir / "ppt" / "theme"
    if theme_dir.exists():
        for file_path in theme_dir.glob("theme*.xml"):
            rel_path = file_path.relative_to(unpacked_dir)
            if rel_path not in referenced:
                file_path.unlink()
                removed.append(str(rel_path))
                theme_rels = theme_dir / "_rels" / f"{file_path.name}.rels"
                if theme_rels.exists():
                    theme_rels.unlink()
                    removed.append(str(theme_rels.relative_to(unpacked_dir)))

    notes_dir = unpacked_dir / "ppt" / "notesSlides"
    if notes_dir.exists():
        for file_path in notes_dir.glob("*.xml"):
            if not file_path.is_file():
                continue
            rel_path = file_path.relative_to(unpacked_dir)
            if rel_path not in referenced:
                file_path.unlink()
                removed.append(str(rel_path))

        notes_rels_dir = notes_dir / "_rels"
        if notes_rels_dir.exists():
            for file_path in notes_rels_dir.glob("*.rels"):
                notes_file = notes_dir / file_path.name.replace(".rels", "")
                if not notes_file.exists():
                    file_path.unlink()
                    removed.append(str(file_path.relative_to(unpacked_dir)))

    return removed
def update_content_types(unpacked_dir: Path, removed_files: list[str]) -> None:
    ct_path = unpacked_dir / "[Content_Types].xml"
    if not ct_path.exists():
        return

    # PartName values always use forward slashes, so normalize the removed
    # paths before comparing (str(Path) yields backslashes on Windows).
    removed_parts = {Path(name).as_posix() for name in removed_files}

    dom = defusedxml.minidom.parse(str(ct_path))
    changed = False

    for override in list(dom.getElementsByTagName("Override")):
        part_name = override.getAttribute("PartName").lstrip("/")
        if part_name in removed_parts:
            if override.parentNode:
                override.parentNode.removeChild(override)
                changed = True

    if changed:
        with open(ct_path, "wb") as f:
            f.write(dom.toxml(encoding="utf-8"))
def clean_unused_files(unpacked_dir: Path) -> list[str]:
    all_removed = []

    slides_removed = remove_orphaned_slides(unpacked_dir)
    all_removed.extend(slides_removed)

    trash_removed = remove_trash_directory(unpacked_dir)
    all_removed.extend(trash_removed)

    # Deleting a file can orphan whatever only that file referenced, so
    # repeat the sweep until a full pass removes nothing.
    while True:
        removed_rels = remove_orphaned_rels_files(unpacked_dir)
        referenced = get_referenced_files(unpacked_dir)
        removed_files = remove_orphaned_files(unpacked_dir, referenced)

        total_removed = removed_rels + removed_files
        if not total_removed:
            break

        all_removed.extend(total_removed)

    if all_removed:
        update_content_types(unpacked_dir, all_removed)

    return all_removed
if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python clean.py <unpacked_dir>", file=sys.stderr)
        print("Example: python clean.py unpacked/", file=sys.stderr)
        sys.exit(1)

    unpacked_dir = Path(sys.argv[1])

    if not unpacked_dir.exists():
        print(f"Error: {unpacked_dir} not found", file=sys.stderr)
        sys.exit(1)

    removed = clean_unused_files(unpacked_dir)

    if removed:
        print(f"Removed {len(removed)} unreferenced files:")
        for f in removed:
            print(f"  {f}")
    else:
        print("No unreferenced files found")
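The `while True` loop in `clean_unused_files` is a fixed-point sweep: removing one unreferenced part can leave the parts it pointed to without any remaining reference, so a single pass is not enough. The following is a minimal sketch of that idea on an in-memory reference graph rather than on real `.rels` files. The function name `prune_unreferenced`, the `roots` parameter, and the sample part names are hypothetical and only illustrate the technique; `clean.py` itself works on the filesystem.

```python
def prune_unreferenced(roots: set[str], refs: dict[str, set[str]]) -> list[str]:
    """Repeatedly drop parts that nothing references until a pass removes none.

    `refs` maps every part name to the set of parts it references.
    Parts listed in `roots` are always kept.
    """
    parts = {name: set(targets) for name, targets in refs.items()}
    removed: list[str] = []

    while True:
        # Everything still pointed at by a surviving part, plus the roots.
        referenced = set(roots)
        for targets in parts.values():
            referenced |= targets

        orphans = sorted(name for name in parts if name not in referenced)
        if not orphans:
            break

        for name in orphans:
            del parts[name]
        removed.extend(orphans)

    return removed


if __name__ == "__main__":
    sample = {
        "slides/slide1.xml": {"media/image1.png"},
        "charts/chart1.xml": {"embeddings/workbook1.xlsx"},
        "media/image1.png": set(),
        "embeddings/workbook1.xlsx": set(),
    }
    print(prune_unreferenced({"slides/slide1.xml"}, sample))
```

In the sample, the chart is unreferenced and goes in the first pass; the workbook only becomes unreferenced once the chart is gone, so it is removed in the second pass.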
"""Create thumbnail grids from PowerPoint presentation slides.

Creates a grid layout of slide thumbnails for quick visual analysis.
Labels each thumbnail with its XML filename (e.g., slide1.xml).
Hidden slides are shown with a placeholder pattern.

Usage:
    python thumbnail.py input.pptx [output_prefix] [--cols N]

Examples:
    python thumbnail.py presentation.pptx
    # Creates: thumbnails.jpg

    python thumbnail.py template.pptx grid --cols 4
    # Creates: grid.jpg (or grid-1.jpg, grid-2.jpg for large decks)
"""

import argparse
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path

import defusedxml.minidom
from office.soffice import get_soffice_env
from PIL import Image, ImageDraw, ImageFont

THUMBNAIL_WIDTH = 300
CONVERSION_DPI = 100
MAX_COLS = 6
DEFAULT_COLS = 3
JPEG_QUALITY = 95
GRID_PADDING = 20
BORDER_WIDTH = 2
FONT_SIZE_RATIO = 0.10
LABEL_PADDING_RATIO = 0.4
def main():
    parser = argparse.ArgumentParser(
        description="Create thumbnail grids from PowerPoint slides."
    )
    parser.add_argument("input", help="Input PowerPoint file (.pptx)")
    parser.add_argument(
        "output_prefix",
        nargs="?",
        default="thumbnails",
        help="Output prefix for image files (default: thumbnails)",
    )
    parser.add_argument(
        "--cols",
        type=int,
        default=DEFAULT_COLS,
        help=f"Number of columns (default: {DEFAULT_COLS}, max: {MAX_COLS})",
    )

    args = parser.parse_args()

    # A column count below 1 would make the grid chunking step zero or negative.
    if args.cols < 1:
        print("Error: --cols must be at least 1", file=sys.stderr)
        sys.exit(1)

    cols = min(args.cols, MAX_COLS)
    if args.cols > MAX_COLS:
        print(f"Warning: Columns limited to {MAX_COLS}")

    input_path = Path(args.input)
    if not input_path.exists() or input_path.suffix.lower() != ".pptx":
        print(f"Error: Invalid PowerPoint file: {args.input}", file=sys.stderr)
        sys.exit(1)

    output_path = Path(f"{args.output_prefix}.jpg")

    try:
        slide_info = get_slide_info(input_path)

        with tempfile.TemporaryDirectory() as temp_dir:
            temp_path = Path(temp_dir)
            visible_images = convert_to_images(input_path, temp_path)

            if not visible_images and not any(s["hidden"] for s in slide_info):
                print("Error: No slides found", file=sys.stderr)
                sys.exit(1)

            slides = build_slide_list(slide_info, visible_images, temp_path)

            grid_files = create_grids(slides, cols, THUMBNAIL_WIDTH, output_path)

            print(f"Created {len(grid_files)} grid(s):")
            for grid_file in grid_files:
                print(f"  {grid_file}")

    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)
def get_slide_info(pptx_path: Path) -> list[dict]:
    with zipfile.ZipFile(pptx_path, "r") as zf:
        rels_content = zf.read("ppt/_rels/presentation.xml.rels").decode("utf-8")
        rels_dom = defusedxml.minidom.parseString(rels_content)

        rid_to_slide = {}
        for rel in rels_dom.getElementsByTagName("Relationship"):
            rid = rel.getAttribute("Id")
            target = rel.getAttribute("Target")
            rel_type = rel.getAttribute("Type")
            if "slide" in rel_type and target.startswith("slides/"):
                rid_to_slide[rid] = target.replace("slides/", "")

        pres_content = zf.read("ppt/presentation.xml").decode("utf-8")
        pres_dom = defusedxml.minidom.parseString(pres_content)

        slides = []
        for sld_id in pres_dom.getElementsByTagName("p:sldId"):
            rid = sld_id.getAttribute("r:id")
            if rid in rid_to_slide:
                hidden = sld_id.getAttribute("show") == "0"
                slides.append({"name": rid_to_slide[rid], "hidden": hidden})

        return slides
def build_slide_list(
    slide_info: list[dict],
    visible_images: list[Path],
    temp_dir: Path,
) -> list[tuple[Path, str]]:
    if visible_images:
        with Image.open(visible_images[0]) as img:
            placeholder_size = img.size
    else:
        placeholder_size = (1920, 1080)

    slides = []
    visible_idx = 0

    for info in slide_info:
        if info["hidden"]:
            placeholder_path = temp_dir / f"hidden-{info['name']}.jpg"
            placeholder_img = create_hidden_placeholder(placeholder_size)
            placeholder_img.save(placeholder_path, "JPEG")
            slides.append((placeholder_path, f"{info['name']} (hidden)"))
        else:
            if visible_idx < len(visible_images):
                slides.append((visible_images[visible_idx], info["name"]))
                visible_idx += 1

    return slides
def create_hidden_placeholder(size: tuple[int, int]) -> Image.Image:
    img = Image.new("RGB", size, color="#F0F0F0")
    draw = ImageDraw.Draw(img)
    line_width = max(5, min(size) // 100)
    draw.line([(0, 0), size], fill="#CCCCCC", width=line_width)
    draw.line([(size[0], 0), (0, size[1])], fill="#CCCCCC", width=line_width)
    return img
def convert_to_images(pptx_path: Path, temp_dir: Path) -> list[Path]:
    pdf_path = temp_dir / f"{pptx_path.stem}.pdf"

    result = subprocess.run(
        [
            "soffice",
            "--headless",
            "--convert-to",
            "pdf",
            "--outdir",
            str(temp_dir),
            str(pptx_path),
        ],
        capture_output=True,
        text=True,
        env=get_soffice_env(),
    )
    if result.returncode != 0 or not pdf_path.exists():
        raise RuntimeError("PDF conversion failed")

    result = subprocess.run(
        [
            "pdftoppm",
            "-jpeg",
            "-r",
            str(CONVERSION_DPI),
            str(pdf_path),
            str(temp_dir / "slide"),
        ],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError("Image conversion failed")

    return sorted(temp_dir.glob("slide-*.jpg"))
def create_grids(
    slides: list[tuple[Path, str]],
    cols: int,
    width: int,
    output_path: Path,
) -> list[str]:
    max_per_grid = cols * (cols + 1)
    grid_files = []

    for chunk_idx, start_idx in enumerate(range(0, len(slides), max_per_grid)):
        end_idx = min(start_idx + max_per_grid, len(slides))
        chunk_slides = slides[start_idx:end_idx]

        grid = create_grid(chunk_slides, cols, width)

        if len(slides) <= max_per_grid:
            grid_filename = output_path
        else:
            stem = output_path.stem
            suffix = output_path.suffix
            grid_filename = output_path.parent / f"{stem}-{chunk_idx + 1}{suffix}"

        grid_filename.parent.mkdir(parents=True, exist_ok=True)
        grid.save(str(grid_filename), quality=JPEG_QUALITY)
        grid_files.append(str(grid_filename))

    return grid_files
def create_grid(
    slides: list[tuple[Path, str]],
    cols: int,
    width: int,
) -> Image.Image:
    font_size = int(width * FONT_SIZE_RATIO)
    label_padding = int(font_size * LABEL_PADDING_RATIO)

    with Image.open(slides[0][0]) as img:
        aspect = img.height / img.width
    height = int(width * aspect)

    rows = (len(slides) + cols - 1) // cols
    grid_w = cols * width + (cols + 1) * GRID_PADDING
    grid_h = rows * (height + font_size + label_padding * 2) + (rows + 1) * GRID_PADDING

    grid = Image.new("RGB", (grid_w, grid_h), "white")
    draw = ImageDraw.Draw(grid)

    try:
        font = ImageFont.load_default(size=font_size)
    except Exception:
        font = ImageFont.load_default()

    for i, (img_path, slide_name) in enumerate(slides):
        row, col = i // cols, i % cols
        x = col * width + (col + 1) * GRID_PADDING
        y_base = (
            row * (height + font_size + label_padding * 2) + (row + 1) * GRID_PADDING
        )

        label = slide_name
        bbox = draw.textbbox((0, 0), label, font=font)
        text_w = bbox[2] - bbox[0]
        draw.text(
            (x + (width - text_w) // 2, y_base + label_padding),
            label,
            fill="black",
            font=font,
        )

        y_thumbnail = y_base + label_padding + font_size + label_padding

        with Image.open(img_path) as img:
            img.thumbnail((width, height), Image.Resampling.LANCZOS)
            w, h = img.size
            tx = x + (width - w) // 2
            ty = y_thumbnail + (height - h) // 2
            grid.paste(img, (tx, ty))

            if BORDER_WIDTH > 0:
                draw.rectangle(
                    [
                        (tx - BORDER_WIDTH, ty - BORDER_WIDTH),
                        (tx + w + BORDER_WIDTH - 1, ty + h + BORDER_WIDTH - 1),
                    ],
                    outline="gray",
                    width=BORDER_WIDTH,
                )

    return grid
if __name__ == "__main__":
    main()
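The chunking and sizing arithmetic in `create_grids` and `create_grid` can be checked without Pillow or LibreOffice. The following is a minimal sketch: the function name `grid_plan` and its `slide_size` parameter are hypothetical, and the 1600x900 slide size in the usage lines is an assumed example. The constants and formulas are copied from the script above.

```python
GRID_PADDING = 20
FONT_SIZE_RATIO = 0.10
LABEL_PADDING_RATIO = 0.4


def grid_plan(
    n_slides: int,
    cols: int,
    slide_size: tuple[int, int],
    width: int = 300,
) -> list[tuple[int, int, int, int]]:
    """Return (slide_count, rows, grid_width, grid_height) for each output grid."""
    max_per_grid = cols * (cols + 1)  # each grid holds at most cols + 1 rows
    font_size = int(width * FONT_SIZE_RATIO)
    label_padding = int(font_size * LABEL_PADDING_RATIO)

    slide_w, slide_h = slide_size
    height = int(width * (slide_h / slide_w))  # thumbnail height from aspect ratio
    cell_h = height + font_size + label_padding * 2  # thumbnail plus its label band

    plans = []
    for start in range(0, n_slides, max_per_grid):
        count = min(max_per_grid, n_slides - start)
        rows = (count + cols - 1) // cols  # ceiling division
        grid_w = cols * width + (cols + 1) * GRID_PADDING
        grid_h = rows * cell_h + (rows + 1) * GRID_PADDING
        plans.append((count, rows, grid_w, grid_h))
    return plans


if __name__ == "__main__":
    # 14 slides at 3 columns: one full grid of 12, then a second grid of 2.
    for plan in grid_plan(14, 3, (1600, 900)):
        print(plan)
```

With three columns a grid holds at most 12 slides, so a 14-slide deck is split into two files, which matches the `grid-1.jpg`, `grid-2.jpg` naming described in the module docstring.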