omda777 commented Jan 13, 2026

Closes #635

This PR fixes incorrect handling of HTML <caption> elements in tables during HTML → CommonMark deserialization.
It ensures captions are properly parsed, normalized, and emitted as Markdown content preceding the table, while also fixing edge cases related to table headers and whitespace normalization.

Screenshots or Video

Input (HTML with a <caption> element):

<table>
  <caption>Employee Details</caption>
  <thead>
    <tr>
      <th>Name</th>
      <th>Role</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Alice</td>
      <td>Engineer</td>
    </tr>
    <tr>
      <td>Bob</td>
      <td>Designer</td>
    </tr>
  </tbody>
</table>

Output (Markdown):

**Employee Details**

| Name | Role |
|---------|---------|
| Alice | Engineer |
| Bob | Designer |
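
For reviewers who want to reason about the expected behavior outside the transform itself, the conversion above can be approximated with a short standalone sketch. This is not the code in this PR: it is an illustration written against Python's standard-library `html.parser`, and the names `normalize`, `TableExtractor`, and `html_table_to_markdown` are hypothetical. It assumes a single, non-nested table and emits a fixed-width `---` separator, so the divider row will not match the output above character for character.

```python
import re
from html.parser import HTMLParser


def normalize(text):
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


class TableExtractor(HTMLParser):
    """Collects the caption, header cells, and body rows of one HTML table."""

    def __init__(self):
        super().__init__()
        self.caption = ""
        self.header = []
        self.rows = []
        self._buffer = None          # text pieces of the open caption or cell
        self._row = []               # cells of the row currently being read
        self._row_is_header = False  # True once the current row contains a <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row, self._row_is_header = [], False
        elif tag in ("caption", "th", "td"):
            self._buffer = []
            if tag == "th":
                self._row_is_header = True

    def handle_data(self, data):
        # Text outside a caption or cell (e.g. indentation) is ignored.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "caption" and self._buffer is not None:
            self.caption = normalize("".join(self._buffer))
            self._buffer = None
        elif tag in ("th", "td") and self._buffer is not None:
            self._row.append(normalize("".join(self._buffer)))
            self._buffer = None
        elif tag == "tr" and self._row:
            # Only the first row containing <th> cells becomes the header.
            if self._row_is_header and not self.header:
                self.header = self._row
            else:
                self.rows.append(self._row)
            self._row = []


def html_table_to_markdown(html):
    """Return the caption as a bold line, a blank line, then a pipe table."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    lines = []
    if parser.caption:
        lines += ["**" + parser.caption + "**", ""]
    if parser.header:
        lines.append("| " + " | ".join(parser.header) + " |")
        lines.append("| " + " | ".join("---" for _ in parser.header) + " |")
    for row in parser.rows:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)
```

Calling `html_table_to_markdown` on the input HTML above yields the bold caption line, a blank line, and then the header, separator, and two body rows. Whitespace inside the caption and cells is collapsed before emission, which is the normalization step the description refers to.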

Related Issues

Author Checklist

  • Ensure you provide a DCO sign-off for your commits using the --signoff option of git commit.
  • Vital features and changes captured in unit and/or integration tests
  • Commit messages follow AP format
  • Extend the documentation, if necessary
  • Merging to main from fork:branchname

Signed-off-by: Abdelrahman Emad <abdoemad0120@gmail.com>

omda777 commented Jan 16, 2026

Hi @mttrbrts, could you please review this PR?

Development

Successfully merging this pull request may close these issues.

HTML Transform: Erronous HTML Table Parsing
