Agent Skills

Word Read Write

AIPOCH

Create and read Microsoft Word (.docx) documents. Use this skill when you need to generate reports/letters/templates as .docx or extract readable text from existing .docx files.

FILES
word-read-write/
  skill.md
  scripts/
    accept_changes.py
    comment.py
    office/
      helpers/
        merge_runs.py
        simplify_redlines.py
      pack.py
      schemas/
        ecma/
          fouth-edition/
            opc-contentTypes.xsd
            opc-coreProperties.xsd
            opc-digSig.xsd
            opc-relationships.xsd
        ISO-IEC29500-4_2016/
          dml-chart.xsd
          dml-chartDrawing.xsd
          dml-diagram.xsd
          dml-lockedCanvas.xsd
          dml-main.xsd
          dml-picture.xsd
          dml-spreadsheetDrawing.xsd
          dml-wordprocessingDrawing.xsd
          pml.xsd
          shared-additionalCharacteristics.xsd
          shared-bibliography.xsd
          shared-commonSimpleTypes.xsd
          shared-customXmlDataProperties.xsd
          shared-customXmlSchemaProperties.xsd
          shared-documentPropertiesCustom.xsd
          shared-documentPropertiesExtended.xsd
          shared-documentPropertiesVariantTypes.xsd
          shared-math.xsd
          shared-relationshipReference.xsd
          sml.xsd
          vml-main.xsd
          vml-officeDrawing.xsd
          vml-presentationDrawing.xsd
          vml-spreadsheetDrawing.xsd
          vml-wordprocessingDrawing.xsd
          wml.xsd
          xml.xsd
        mce/
          mc.xsd
        microsoft/
          wml-2010.xsd
          wml-2012.xsd
          wml-2018.xsd
          wml-cex-2018.xsd
          wml-cid-2016.xsd
          wml-sdtdatahash-2020.xsd
          wml-symex-2015.xsd
      soffice.py
      unpack.py
      validate.py
      validators/
        base.py
        docx.py
        pptx.py
        redlining.py
    templates/
      comments.xml
      commentsExtended.xml
      commentsExtensible.xml
      commentsIds.xml
      people.xml
85 / 100 Total Score

Core Capability: 78 / 100
  • Functional Suitability: 10 / 12
  • Reliability: 9 / 12
  • Performance & Context: 8 / 8
  • Agent Usability: 12 / 16
  • Human Usability: 7 / 8
  • Security: 8 / 12
  • Maintainability: 9 / 12
  • Agent-Specific: 15 / 20

Medical Task: 20 / 20 Passed
  • 94: Generate Word reports (e.g., weekly status, audit summaries) from structured data in Node.js (4/4)
  • 90: Produce standardized memos/letters/templates with consistent page size, margins, headings, and tables (4/4)
  • 88: Create .docx documents using the docx (docx-js) library (4/4)
  • 88: Explicit page setup (US Letter vs A4), margins, and landscape handling (4/4)
  • 88: End-to-end case for Create .docx documents using the docx (docx-js) library (4/4)

SKILL.md

When to Use

  • Generate Word reports (e.g., weekly status, audit summaries) from structured data in Node.js.
  • Produce standardized memos/letters/templates with consistent page size, margins, headings, and tables.
  • Export documents that must render reliably across Microsoft Word and Google Docs (tables, shading, widths).
  • Insert images, headers/footers, page numbers, page breaks, and a table of contents programmatically.
  • Extract text from existing .docx files for indexing, review, or conversion to Markdown.

Key Features

  • Create .docx documents using the docx (docx-js) library.
  • Explicit page setup (US Letter vs A4), margins, and landscape handling.
  • Style control (default font, overriding built-in Heading styles for TOC compatibility).
  • Proper list generation (bullets/numbering via numbering config, not Unicode characters).
  • Robust table rendering rules (DXA widths, column widths + cell widths, shading, borders, padding).
  • Images with an explicit type, alt text, and size (transformation).
  • Page breaks, headers/footers, and page numbering.
  • Table of contents generation based on Word heading levels.
  • Read/extract .docx content via Pandoc conversion.
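
The page-setup feature comes down to simple DXA arithmetic. The following is a minimal sketch in plain JavaScript, not part of docx itself: PAPER, inchesToDxa, and contentWidth are hypothetical helper names, and the A4 figures are the standard 210 x 297 mm page rounded to whole DXA.

```javascript
// Hypothetical helpers (not part of docx): page geometry in DXA, where 1440 DXA = 1 inch.
const PAPER = {
  usLetter: { width: 12240, height: 15840 }, // 8.5 x 11 in
  a4: { width: 11906, height: 16838 },       // 210 x 297 mm, rounded to whole DXA
};

// Convert inches to whole DXA units.
function inchesToDxa(inches) {
  return Math.round(inches * 1440);
}

// Usable width between the left and right margins.
// Paper sizes are stored in portrait form; for landscape the long edge becomes the width.
function contentWidth(paper, margin, landscape = false) {
  const pageWidth = landscape ? paper.height : paper.width;
  return pageWidth - margin.left - margin.right;
}

const oneInch = inchesToDxa(1);
const margins = { top: oneInch, right: oneInch, bottom: oneInch, left: oneInch };

console.log(contentWidth(PAPER.usLetter, margins));       // 9360
console.log(contentWidth(PAPER.a4, margins));             // 9026
console.log(contentWidth(PAPER.usLetter, margins, true)); // 12960
```

Keeping sizes in portrait form matches the convention used in the example script, where landscape is requested through the orientation setting rather than by swapping the numbers by hand.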

Dependencies

  • Node.js: >= 18
  • docx: ^9.0.0
  • pandoc: >= 2.0 (CLI tool for extracting/converting .docx content)
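
Before running the examples it can help to confirm that the tools listed above are present. This is a rough sketch using only Node's standard library; meetsMajor and pandocVersionLine are hypothetical helper names, and the check only reports what it finds rather than enforcing the pandoc minimum.

```javascript
const { spawnSync } = require("child_process");

// True when a version string such as "v18.19.0" has at least the given major version.
function meetsMajor(version, minMajor) {
  const major = Number.parseInt(version.replace(/^v/, "").split(".")[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Returns the first line printed by `pandoc --version`, or null if pandoc cannot be run.
function pandocVersionLine() {
  const result = spawnSync("pandoc", ["--version"], { encoding: "utf8" });
  if (result.error || result.status !== 0) return null;
  return result.stdout.split(/\r?\n/)[0];
}

console.log("Node >= 18:", meetsMajor(process.version, 18));
console.log("pandoc:", pandocVersionLine() ?? "not found on PATH");
```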

Example Usage

1) Create a .docx (Node.js)

Install

npm i docx

create-doc.js

const fs = require("fs");
const {
  Document,
  Packer,
  Paragraph,
  TextRun,
  Table,
  TableRow,
  TableCell,
  ImageRun,
  Header,
  Footer,
  AlignmentType,
  BorderStyle,
  WidthType,
  ShadingType,
  PageOrientation,
  PageNumber,
  PageBreak,
  TableOfContents,
  HeadingLevel,
  LevelFormat,
} = require("docx");

const US_LETTER = { width: 12240, height: 15840 }; // DXA (1440 = 1 inch)
const MARGINS_1IN = { top: 1440, right: 1440, bottom: 1440, left: 1440 };
const CONTENT_WIDTH = US_LETTER.width - MARGINS_1IN.left - MARGINS_1IN.right; // 9360

const border = { style: BorderStyle.SINGLE, size: 1, color: "CCCCCC" };
const borders = { top: border, bottom: border, left: border, right: border };

const doc = new Document({
  styles: {
    default: {
      document: {
        run: { font: "Arial", size: 24 }, // size is in half-points: 24 = 12pt
      },
    },
    paragraphStyles: [
      // Use exact IDs to override built-in Word heading styles
      {
        id: "Heading1",
        name: "Heading 1",
        basedOn: "Normal",
        next: "Normal",
        quickFormat: true,
        run: { size: 32, bold: true, font: "Arial" },
        paragraph: { spacing: { before: 240, after: 240 }, outlineLevel: 0 },
      },
      {
        id: "Heading2",
        name: "Heading 2",
        basedOn: "Normal",
        next: "Normal",
        quickFormat: true,
        run: { size: 28, bold: true, font: "Arial" },
        paragraph: { spacing: { before: 180, after: 180 }, outlineLevel: 1 },
      },
    ],
  },

  numbering: {
    config: [
      {
        reference: "bullets",
        levels: [
          {
            level: 0,
            format: LevelFormat.BULLET,
            text: "•",
            alignment: AlignmentType.LEFT,
            style: { paragraph: { indent: { left: 720, hanging: 360 } } },
          },
        ],
      },
      {
        reference: "numbers",
        levels: [
          {
            level: 0,
            format: LevelFormat.DECIMAL,
            text: "%1.",
            alignment: AlignmentType.LEFT,
            style: { paragraph: { indent: { left: 720, hanging: 360 } } },
          },
        ],
      },
    ],
  },

  sections: [
    {
      properties: {
        page: {
          size: {
            width: US_LETTER.width,
            height: US_LETTER.height,
            // For landscape: pass portrait dimensions and set orientation; docx swaps internally.
            // orientation: PageOrientation.LANDSCAPE,
          },
          margin: MARGINS_1IN,
        },
      },

      headers: {
        default: new Header({
          children: [new Paragraph({ children: [new TextRun("Sample Header")] })],
        }),
      },

      footers: {
        default: new Footer({
          children: [
            new Paragraph({
              children: [
                new TextRun("Page "),
                new TextRun({ children: [PageNumber.CURRENT] }),
              ],
            }),
          ],
        }),
      },

      children: [
        new Paragraph({
          heading: HeadingLevel.HEADING_1,
          children: [new TextRun("DOCX Creation Example")],
        }),

        new TableOfContents("Table of Contents", {
          hyperlink: true,
          headingStyleRange: "1-3",
        }),

        new Paragraph({
          heading: HeadingLevel.HEADING_2,
          children: [new TextRun("Lists")],
        }),

        new Paragraph({
          numbering: { reference: "bullets", level: 0 },
          children: [new TextRun("Bullet item")],
        }),
        new Paragraph({
          numbering: { reference: "numbers", level: 0 },
          children: [new TextRun("Numbered item")],
        }),

        new Paragraph({
          heading: HeadingLevel.HEADING_2,
          children: [new TextRun("Table")],
        }),

        new Table({
          width: { size: CONTENT_WIDTH, type: WidthType.DXA },
          columnWidths: [CONTENT_WIDTH / 2, CONTENT_WIDTH / 2],
          rows: [
            new TableRow({
              children: [
                new TableCell({
                  borders,
                  width: { size: CONTENT_WIDTH / 2, type: WidthType.DXA },
                  shading: { fill: "D5E8F0", type: ShadingType.CLEAR },
                  margins: { top: 80, bottom: 80, left: 120, right: 120 },
                  children: [new Paragraph("Left cell")],
                }),
                new TableCell({
                  borders,
                  width: { size: CONTENT_WIDTH / 2, type: WidthType.DXA },
                  shading: { fill: "FFFFFF", type: ShadingType.CLEAR },
                  margins: { top: 80, bottom: 80, left: 120, right: 120 },
                  children: [new Paragraph("Right cell")],
                }),
              ],
            }),
          ],
        }),

        new Paragraph({
          heading: HeadingLevel.HEADING_2,
          children: [new TextRun("Image")],
        }),

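        // image.png must exist next to this script; readFileSync throws if it is missing.
        new Paragraph({
          children: [
            new ImageRun({
              type: "png",
              data: fs.readFileSync("./image.png"),
              transformation: { width: 200, height: 150 }, // rendered size in pixels
              altText: { title: "Title", description: "Desc", name: "Name" },
            }),
          ],
        }),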

        new Paragraph({ children: [new PageBreak()] }),

        new Paragraph({
          heading: HeadingLevel.HEADING_2,
          children: [new TextRun("New Page Section")],
        }),

        // Do not use "\n" inside TextRun; create separate Paragraphs instead.
        new Paragraph({ children: [new TextRun("This is on a new page.")] }),
      ],
    },
  ],
});

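// Serialize the document to a .docx buffer and write it to disk.
(async () => {
  const buffer = await Packer.toBuffer(doc);
  fs.writeFileSync("output.docx", buffer);
  console.log("Wrote output.docx");
})().catch((err) => {
  console.error("Failed to write output.docx:", err);
  process.exitCode = 1;
});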

Run

node create-doc.js
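
After the script runs, a quick sanity check is to confirm that the output is a ZIP container, since a .docx file is a ZIP archive of XML parts. The sketch below is an assumption-laden shortcut: looksLikeZip and isDocxContainer are hypothetical helpers, and they only inspect the four-byte ZIP signature, not whether the XML inside is valid.

```javascript
const fs = require("fs");

// A ZIP archive with at least one entry starts with the local file header signature "PK\x03\x04".
function looksLikeZip(buffer) {
  return (
    buffer.length >= 4 &&
    buffer[0] === 0x50 && // "P"
    buffer[1] === 0x4b && // "K"
    buffer[2] === 0x03 &&
    buffer[3] === 0x04
  );
}

// Hypothetical check: false when the file is missing or does not start with the ZIP signature.
function isDocxContainer(path) {
  if (!fs.existsSync(path)) return false;
  return looksLikeZip(fs.readFileSync(path));
}

console.log("output.docx is a ZIP container:", isDocxContainer("output.docx"));
```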

2) Read/extract .docx text with Pandoc

pandoc document.docx -o output.md
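
The same conversion can be driven from Node.js. This is a minimal sketch that shells out to the pandoc CLI; pandocArgs and docxToMarkdown are hypothetical helper names, and the optional trackChanges setting (pandoc's --track-changes flag for the docx reader) is an addition not used in the command above.

```javascript
const { execFileSync } = require("child_process");

// Build the pandoc argument list; kept separate so it can be inspected without running pandoc.
function pandocArgs(inputPath, outputPath, { trackChanges } = {}) {
  const args = ["-f", "docx", "-t", "markdown", inputPath, "-o", outputPath];
  // --track-changes accepts "accept", "reject", or "all" when reading .docx input.
  if (trackChanges) args.push(`--track-changes=${trackChanges}`);
  return args;
}

// Hypothetical wrapper: throws if pandoc is not installed or exits with a non-zero status.
function docxToMarkdown(inputPath, outputPath, options) {
  execFileSync("pandoc", pandocArgs(inputPath, outputPath, options), { stdio: "inherit" });
}

console.log(pandocArgs("document.docx", "output.md").join(" "));
// -f docx -t markdown document.docx -o output.md
```

Using execFileSync with an argument array avoids shell quoting problems when file names contain spaces.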

Implementation Details

  • DOCX structure: .docx is a ZIP archive containing XML parts; libraries generate the required XML and package it.
  • Page sizing (DXA):
    • Word uses DXA units where 1440 DXA = 1 inch.
    • docx-js defaults to A4 (11906 x 16838 DXA); for US documents set US Letter explicitly: 12240 x 15840 DXA (8.5 x 11 in).
    • With 1" margins, content width is 12240 - 1440 - 1440 = 9360 DXA.
  • Landscape orientation:
    • docx-js swaps width/height internally for landscape.
    • Provide portrait dimensions and set orientation: PageOrientation.LANDSCAPE.
  • Headings and TOC:
    • TOC generation requires paragraphs using HeadingLevel.*.
    • To override built-in heading appearance, use exact style IDs: "Heading1", "Heading2", etc.
    • Include outlineLevel (H1 = 0, H2 = 1, ...), otherwise TOC may not pick up headings.
  • Paragraphs vs newlines:
    • Do not embed \n in runs to simulate line breaks; create separate Paragraph elements.
  • Lists:
    • Do not manually insert Unicode bullet characters as plain text.
    • Use numbering.config with LevelFormat.BULLET / LevelFormat.DECIMAL.
    • Numbering continuity depends on reference: same reference continues; different reference restarts.
  • Tables (critical rendering rules):
    • Always use WidthType.DXA (percent widths can break in Google Docs).
    • Set table width and columnWidths; the sum of columnWidths must equal table width.
    • Set each cell width to match its corresponding column width (required for consistent rendering).
    • Cell margins are internal padding and do not add to width; they reduce usable content area.
    • For shading, prefer ShadingType.CLEAR to avoid unexpected black backgrounds.
  • Images:
    • ImageRun requires a valid type (e.g., png, jpg) and altText fields.
  • Page breaks:
    • PageBreak must be inside a Paragraph; alternatively, set pageBreakBefore: true on the paragraph that should start the new page.
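
The table rules above require whole-DXA column widths that add up exactly to the table width, which is easy to get wrong when the width does not divide evenly. The sketch below shows one way to handle that; splitColumns and widthsMatch are hypothetical helpers, not part of docx, and the 9026 figure assumes an A4 page with 1" margins.

```javascript
// Hypothetical helper (not part of docx): split a table width into whole-DXA column widths.
// Any remainder from integer division is spread over the first columns so the sum stays exact.
function splitColumns(tableWidth, columnCount) {
  const base = Math.floor(tableWidth / columnCount);
  const remainder = tableWidth - base * columnCount;
  return Array.from({ length: columnCount }, (_, i) => base + (i < remainder ? 1 : 0));
}

// Check the rule that columnWidths must add up to the table width.
function widthsMatch(tableWidth, columnWidths) {
  return columnWidths.reduce((sum, w) => sum + w, 0) === tableWidth;
}

const letterColumns = splitColumns(9360, 2); // US Letter with 1" margins
const a4Columns = splitColumns(9026, 3);     // A4 with 1" margins (assumed page setup)

console.log(letterColumns, widthsMatch(9360, letterColumns)); // [ 4680, 4680 ] true
console.log(a4Columns, widthsMatch(9026, a4Columns));         // [ 3009, 3009, 3008 ] true
```

Each resulting value would be passed both in the table's columnWidths array and as the width of the matching cell in every row.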