Unicode character composition fails with NFD #1661

@nicod-pc

Bug Report

Unicode allows several equivalent encodings for umlauts and other accented characters, and JavaScript converts between them with the string normalize function. Text in the decomposed form (NFD) is not rendered correctly.

Description of the problem

Both are valid representations of the same characters:

  • NFD fails: With NFD (e.g. U+0075 + U+0308 for ü) the text is rendered incorrectly. From the ü onward, different characters are printed.
  • NFC works: With NFC (e.g. U+00FC for ü) the characters are printed correctly in the PDF.
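To make the difference between the two forms concrete, here is a minimal sketch that uses only standard JavaScript string methods and no pdfkit calls. The names `nfd`, `nfc` and `codePoints` are illustrative and not part of any library.

```javascript
// Two encodings of the same visible character "ü".
const nfd = 'u\u0308'; // U+0075 (u) followed by U+0308 (combining diaeresis)
const nfc = '\u00FC';  // U+00FC (precomposed ü)

// List the code points of a string as 4-digit uppercase hex strings.
function codePoints(str) {
  return Array.from(str, (ch) =>
    ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

console.log(codePoints(nfd)); // [ '0075', '0308' ]
console.log(codePoints(nfc)); // [ '00FC' ]

// The strings differ as code unit sequences but normalize to each other.
console.log(nfd === nfc);                  // false
console.log(nfd.normalize('NFC') === nfc); // true
console.log(nfc.normalize('NFD') === nfd); // true
```

Both strings display as the same character, so the difference is usually invisible in source code unless the code points are inspected like this.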

Code sample

var doc = new PDFDocument();
var stream = doc.pipe(blobStream());

// draw the same text twice: first in NFD, then in NFC
doc.fontSize(25).text('Text fu\u0308r Bug', 100, 80);  // NFD: u + U+0308
doc.fontSize(25).text('Text f\u00FCr Bug', 100, 120);  // NFC: U+00FC

// end and display the document in the iframe to the right
doc.end();
stream.on('finish', function() {
  iframe.src = stream.toBlobURL('application/pdf');
});

Both lines should look the same, but they don't:
[Image: screenshot of the rendered PDF]

Your environment

I used the browser demo on a Mac. I also ran into the problem when pdfkit is wrapped by pdfmake. I created an issue there too, but the output looks different there.

  • pdfkit version:
  • Node version:
  • Browser version (if applicable):
  • Operating System:

Workaround

Normalize all strings passed to pdfkit to NFC first, e.g. "someString".normalize("NFC").
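One way to apply this in a single place is a small helper, sketched below. `toNFC` and `userInput` are hypothetical names and not part of pdfkit. The commented-out call is the same `doc.text` call used in the code sample above.

```javascript
// Hypothetical helper: normalize strings to NFC, pass other values through.
function toNFC(value) {
  return typeof value === 'string' ? value.normalize('NFC') : value;
}

const decomposed = 'Text fu\u0308r Bug'; // "ü" as u + U+0308 (NFD)
const composed = toNFC(decomposed);      // "ü" as U+00FC (NFC)

console.log(decomposed.length); // 13
console.log(composed.length);   // 12
console.log(composed === 'Text f\u00FCr Bug'); // true

// Usage with pdfkit (not executed here; userInput is a placeholder):
// doc.fontSize(25).text(toNFC(userInput), 100, 80);
```

This only works around the problem on the caller's side. It does not change how pdfkit handles combining characters.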
