Skip to content

Conversation

@tupizz
Copy link
Contributor

@tupizz tupizz commented Feb 10, 2026

Summary

Fixes autofit tables dropping columns when rows use rowspan/colspan patterns (e.g. the PCI compliance table) and improves page break splitting to match Word's behavior.

Root cause: The PCI table has a 4-column grid but some rows have only 2-3 physical cells due to rowspan/colspan. Two bugs caused the rightmost column to disappear:

  1. Column count was derived from cells.length instead of the sum of colSpan values — rows with rowspan continuations have fewer physical cells, making the engine think the table had 3 columns
  2. Cell colwidth values (scaled up from tcW during import) were preferred over grid values — grid values sum exactly to page width, but colwidth values get distorted by scaling math

Changes

Column rendering fixes (core bug fix)

  • Grid priority over colwidth (pm-adapter/converters/table.ts): Swap priority so w:tblGrid values (which sum to exact page width) are used before falling back to PM colwidth attributes. Cell colwidth values come from tcW (cell width hints) and get scaled during import, introducing proportion errors that make columns narrower than they should be.

  • Column count from colSpan sums (measuring/dom/index.ts): maxCellCount now sums colSpan values per row instead of counting physical cells. Rows with rowspan continuations have fewer physical cells than grid columns — using cells.length dropped the 4th column.

  • Colspan tcW distribution (legacy-handle-table-cell-node.js): When a colspan cell's tcW exceeds the grid span total, distribute proportionally across grid columns instead of using raw grid values. This prevents colspan cells from getting undersized colwidth arrays during DOCX import.

Page break improvements

  • Rowspan-aware split point selection (layout-table.ts): Track maxRowspanEnd (farthest row reached by any active rowspan) and lastCleanFitRow (last boundary with no rowspan crossing). When the standard break would split a rowspan group, prefer the clean break + force a page advance. This matches Word's behavior of keeping rowspan groups together on the same page.

  • Ghost cell rendering (renderTableFragment.ts): When a table continues from a previous fragment, render empty bordered cells ("ghost cells") for grid columns occupied by rowspan cells that started on the previous page. This maintains correct table structure and borders on continuation pages, matching Word's rendering.

  • Header-skip for cantSplit rows (layout-table.ts): When repeated headers would prevent a cantSplit row from fitting (but the row fits without headers), skip header repetition. Word does not split cantSplit rows just because repeated headers eat up space.

  • fullPageHeight fix (layout-table.ts): Use contentBottom - topMargin for over-tall row detection instead of just contentBottom, which includes the top margin and overstates available space.

Cell height refinement

  • Skip last paragraph spacing.after (measuring/dom/index.ts, renderTableCell.ts): In Word, the last paragraph's spacing.after is absorbed by the cell's bottom padding and doesn't add extra height. Skipping it saves ~4px per row, accumulating to shift page breaks closer to Word's behavior.

Test plan

  • pnpm --filter @superdoc/pm-adapter test — column width priority tests updated
  • pnpm --filter @superdoc/measuring-dom test — spacing.after + colSpan column count tests
  • pnpm --filter @superdoc/painter-dom test — spacing.after rendering tests updated
  • Regression test: table-autofit-colspan.test.js validates 4-column grid preserved when rows have <4 physical cells
  • Manual: upload PCI table DOCX → verify all 4 columns render, page break at rowspan boundary, ghost cells on page 2

Closes SD-1797

Fix autofit tables dropping columns when rows use rowspan/colspan patterns
and improve page break splitting to match Word behavior.

Column rendering fixes:
- Prefer grid values over colwidth for column widths (grid sums to page width)
- Use colSpan sums for column count instead of physical cell count
- Distribute colspan tcW proportionally across grid columns during DOCX import

Page break improvements:
- Rowspan-aware split point selection (prefer breaks at rowspan boundaries)
- Ghost cell rendering for rowspan cells spanning page breaks
- Skip header repeat when it would prevent a cantSplit row from fitting
- Fix fullPageHeight to use actual content area (contentBottom - topMargin)

Cell height refinement:
- Skip last paragraph spacing.after in cell measuring and rendering

SD-1797
@linear
Copy link

linear bot commented Feb 10, 2026

@github-actions
Copy link
Contributor

Based on my review of the code changes and my knowledge of ECMA-376 WordprocessingML specification, here is my assessment:

Status: PASS

The changes in legacy-handle-table-cell-node.js appear to be spec-compliant. Here's what the code is doing:

What Changed

The modification (lines 73-90, 97-98) adds logic to handle cells with w:gridSpan (colspan) where the cell's w:tcW (table cell width) exceeds the sum of the grid columns it spans.

OOXML Spec Compliance

The implementation correctly handles the ECMA-376 table model:

  1. w:tcW (Table Cell Width): Lines 63-69 correctly extract the cell width from w:tcPr/w:tcW/@w:w and convert from twips to pixels. The spec allows this attribute on table cells.

  2. w:gridSpan (Grid Span): Line 41 correctly reads w:tcPr/w:gridSpan/@w:val which specifies how many grid columns the cell spans. This is a valid ECMA-376 attribute.

  3. Width Resolution Logic: The code computes total grid width (lines 87-89) and only scales when tcW > gridSpanTotal + 1 (line 90). This matches Word's behavior where cell width hints can exceed grid totals for autofit tables.

  4. Proportional Distribution: Lines 97-98 distribute the excess width proportionally across spanned columns. While the spec doesn't mandate this exact algorithm, it's a reasonable interpretation of how to reconcile conflicting width values.

Minor Observations

  • The +1 tolerance in line 90 (cellOwnWidth > gridSpanTotal + 1) is a practical rounding buffer, not from the spec, but this is reasonable for handling floating-point precision.
  • The fallback defaultColwidth = 100 (line 94) is an implementation default when grid columns are missing, which is acceptable.

The code correctly interprets the relationship between w:tcW, w:gridSpan, and w:tblGrid elements as defined in ECMA-376 Part 1 §17.4.

For details on these elements, see https://ooxml.dev/spec?q=tcW and https://ooxml.dev/spec?q=gridSpan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant