@@ -55,7 +55,10 @@ Kpdfsync
5555 Highlighting = 12 67 (Expected, much more acceptable
5656 highlighting)
5757
58- [ ] For some PDF files, org.pdfclown.tools.TextExtractor.extract() is returning null.
58+ [ ] (GitHub Issue #2)
59+ Book: Concrete Mathematics original PDF file.
60+
61+ For some PDF files, org.pdfclown.tools.TextExtractor.extract() is returning null.
5962 This is seen with the Concrete Mathematics original PDF file. May be a TrueType font issue.
6063 Here is the stack trace:
6164 java.lang.NullPointerException
@@ -87,6 +90,10 @@ Kpdfsync
8790 at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:296)
8891 at coderarjob.kpdfsync.lib.annotator.PdfAnnotatorV1.highlight(PdfAnnotatorV1.java:62)
8992 at coderarjob.kpdfsync.poc.MainFrame$2.run(MainFrame.java:172)
93+
94+ Solution:
95+ Running the pdftocairo tool (from the poppler-utils package) on the PDF file solves this error.
96+ Command: pdftocairo -pdf <in pdf file> <out pdf file>
9097
9198[ ] Highlight is not visible on the output PDF file. This was seen on the Concrete Mathematics
9299 cropped PDF file.
@@ -103,7 +110,8 @@ Kpdfsync
103110
104111 When this exception occurs, it occurs around the 73% mark.
105112
106- [ ] EOFException at org.pdfclown.tools.TextExtractor.extract() method. This is seen on
113+ [ ] (GitHub Issue #1)
114+ EOFException at org.pdfclown.tools.TextExtractor.extract() method. This is seen on
107115 'the_evolution_of_operating_system_cropped.pdf' file. Could also be a font issue.
108116 Here is the stack trace
109117java.lang.RuntimeException: java.io.EOFException
@@ -173,3 +181,60 @@ java.io.EOFException
173181 at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:296)
174182 at coderarjob.kpdfsync.lib.annotator.PdfAnnotatorV1.highlight(PdfAnnotatorV1.java:62)
175183 at coderarjob.kpdfsync.poc.MainFrame$2.run(MainFrame.java:172)
184+
185+ Solution:
186+ Running the pdftocairo tool (from the poppler-utils package) on the PDF file solves this error.
187+ Command: pdftocairo -pdf <in pdf file> <out pdf file>
188+
189+ [ ] (GitHub Issue #4)
190+ Book: MMURTL
191+ org.pdfclown.util.NotImplementedException: LZWDecode
192+
193+ Stack trace:
194+ org.pdfclown.util.NotImplementedException: LZWDecode
195+ at org.pdfclown.bytes.filters.Filter.get(Filter.java:74)
196+ at org.pdfclown.objects.PdfStream.getBody(PdfStream.java:193)
197+ at org.pdfclown.objects.PdfStream.getBody(PdfStream.java:155)
198+ at org.pdfclown.documents.contents.Contents$ContentStream.moveNextStream(Contents.java:279)
199+ at org.pdfclown.documents.contents.Contents$ContentStream.<init>(Contents.java:86)
200+ at org.pdfclown.documents.contents.Contents.load(Contents.java:591)
201+ at org.pdfclown.documents.contents.Contents.<init>(Contents.java:366)
202+ at org.pdfclown.documents.contents.Contents.wrap(Contents.java:345)
203+ at org.pdfclown.documents.Page.getContents(Page.java:571)
204+ at org.pdfclown.documents.contents.ContentScanner.<init>(ContentScanner.java:1033)
205+ at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:297)
206+ at coderarjob.kpdfsync.lib.annotator.PdfAnnotatorV1.highlight(PdfAnnotatorV1.java:62)
207+ at coderarjob.kpdfsync.poc.MainFrame$2.run(MainFrame.java:202)
208+ at java.base/java.lang.Thread.run(Thread.java:833)
209+
210+ Solution:
211+ Running the pdftocairo tool (from the poppler-utils package) on the PDF file solves this error.
212+ Command: pdftocairo -pdf <in pdf file> <out pdf file>
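
The pdftocairo workaround noted above could be automated as a preprocessing step before the PDF file is handed to PDF Clown. What follows is a minimal sketch, not code from kpdfsync: the class name PdfToCairoRepair and its methods are hypothetical, and it assumes that pdftocairo is installed and available on the PATH.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper; not part of kpdfsync or PDF Clown.
class PdfToCairoRepair {

    /** Builds the command from the notes: pdftocairo -pdf <in pdf file> <out pdf file>. */
    static List<String> buildCommand(Path input, Path output) {
        return List.of("pdftocairo", "-pdf", input.toString(), output.toString());
    }

    /** Runs pdftocairo (assumed to be on the PATH) and returns its exit code. */
    static int repair(Path input, Path output) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, output))
                .inheritIO() // forward the tool's own messages to this console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length != 2) {
            System.err.println("usage: PdfToCairoRepair <in pdf file> <out pdf file>");
            return;
        }
        int exitCode = repair(Path.of(args[0]), Path.of(args[1]));
        System.out.println("pdftocairo exited with code " + exitCode);
    }
}
```

A non-zero exit code would mean the rewrite failed, in which case the original file should be left untouched and the error reported as before.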
213+
214+ [ ] (GitHub Issue #3)
215+ Book: https://plan9.io/sys/doc/lexnames.pdf
216+
217+ Cannot invoke "org.pdfclown.documents.contents.IContentContext.getContents()"
218+ because "contentContext" is null
219+
220+ Stack Trace:
221+ Exception :Cannot invoke "org.pdfclown.documents.contents.IContentContext.getContents()" because "contentContext" is null
222+ java.lang.NullPointerException: Cannot invoke "org.pdfclown.documents.contents.IContentContext.getContents()" because "contentContext" is null
223+ at org.pdfclown.documents.contents.ContentScanner.<init>(ContentScanner.java:1033)
224+ at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:297)
225+ at coderarjob.kpdfsync.lib.annotator.PdfAnnotatorV1.highlight(PdfAnnotatorV1.java:62)
226+ at coderarjob.kpdfsync.poc.MainFrame$2.run(MainFrame.java:202)
227+ at java.base/java.lang.Thread.run(Thread.java:833)
228+
229+ [ ] Index Out Of Bounds in PdfAnnotatorV1
230+ PDF source :/home/coder/kpdfsync/test-files/Books/Classic Operating Systems_ From Batch Processing To Distributed Systems_cropped.pdf
231+
232+ Exception :index -1, length 0
233+ java.lang.StringIndexOutOfBoundsException: index -1, length 0
234+ at java.base/java.lang.String.checkIndex(String.java:4560)
235+ at java.base/java.lang.AbstractStringBuilder.deleteCharAt(AbstractStringBuilder.java:970)
236+ at java.base/java.lang.StringBuilder.deleteCharAt(StringBuilder.java:298)
237+ at coderarjob.kpdfsync.lib.annotator.PdfAnnotatorV1.doHighlight(PdfAnnotatorV1.java:96)
238+ at coderarjob.kpdfsync.lib.annotator.PdfAnnotatorV1.highlight(PdfAnnotatorV1.java:65)
239+ at coderarjob.kpdfsync.poc.MainFrame$2.run(MainFrame.java:201)
240+ at java.base/java.lang.Thread.run(Thread.java:833)
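
The trace above shows StringBuilder.deleteCharAt being called with index -1 on a builder of length 0. One plausible cause, which is an assumption since the source of PdfAnnotatorV1.doHighlight is not shown here, is a call such as deleteCharAt(sb.length() - 1) on an empty builder. A minimal sketch of a guard, using a hypothetical helper name:

```java
// Hypothetical helper; the real code in PdfAnnotatorV1.doHighlight may differ.
class TrailingCharGuard {

    /** Removes the last character only when the builder is non-empty. */
    static StringBuilder dropLastChar(StringBuilder sb) {
        if (sb.length() > 0) {
            sb.deleteCharAt(sb.length() - 1);
        }
        return sb;
    }

    public static void main(String[] args) {
        System.out.println(dropLastChar(new StringBuilder("abc"))); // prints "ab"
        System.out.println(dropLastChar(new StringBuilder()).length()); // prints 0
    }
}
```

Whether an empty builder at that point is a valid state or a sign of failed text extraction for the page would still need to be checked in the annotator itself.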