Commit 1192028

Merge branch 'feature/list-render-gui' into release/alpha/master
2 parents f4de79b + e88f414 commit 1192028

22 files changed: +1329 -656 lines changed

Manifest.txt

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Main-Class: coderarjob.kpdfsync.poc.Main
+Class-Path: pdfclown.jar
+
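As a rough illustration of what these two attributes carry, the sketch below parses equivalent manifest text with the JDK's `java.util.jar.Manifest` class. The class name `ManifestSketch` is invented for this example, and the `Manifest-Version` line is an assumption about what the `jar` tool adds when it merges the file; it is not part of the commit.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper, not part of kpdfsync.
class ManifestSketch {
    // Parses manifest text and returns its main (top-level) attributes.
    static Attributes mainAttributes(String manifestText) {
        try {
            byte[] bytes = manifestText.getBytes(StandardCharsets.UTF_8);
            return new Manifest(new ByteArrayInputStream(bytes)).getMainAttributes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed final form of the manifest inside kpdfsync.jar.
        String text = "Manifest-Version: 1.0\n"
                    + "Main-Class: coderarjob.kpdfsync.poc.Main\n"
                    + "Class-Path: pdfclown.jar\n\n";
        Attributes attrs = mainAttributes(text);
        // Entry point used by `java -jar`.
        System.out.println(attrs.getValue(Attributes.Name.MAIN_CLASS));
        // Extra jars, resolved relative to the location of the jar itself.
        System.out.println(attrs.getValue(Attributes.Name.CLASS_PATH));
    }
}
```

Because `Class-Path` entries are resolved relative to the jar that declares them, `pdfclown.jar` has to sit next to `kpdfsync.jar`, which is consistent with `pack.sh` copying both into `dist/`.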

README.md

Lines changed: 13 additions & 6 deletions
@@ -1,6 +1,6 @@
 ## About kpdfsync
 
-![Screenshot](/docs/images/screenshot.png)
+![Screenshot](/docs/images/screenshot_alpha.png)
 
 If you use Kindle to read PDF books or documents, you might have seen that the highlights or notes
 made on the Kindle are not saved on the PDF file itself. This means, if you take the file and
@@ -15,16 +15,23 @@ Kindle device.
 Currently it is in development, so not all the features work or are even present. Here is the rough
 roadmap.
 
+## Requirements
+- JRE 1.8 or higher
+- Linux, Mac, Windows
+
 ## Roadmap
 
 - [X] Parsing the Clippings.txt file
 - [X] Search for the highlighted text in a page of the PDF file.
 - [X] Annotate highlight and notes on the PDF file.
-- [X] Graphical User Interface testing.
-- [X] Comments to notes mapping. This is required, because the clippings text file does not provide
-  information which can used to determine which comments are related to which note on a single
-  page.
-- [ ] Debug loggings
+- [X] Graphical User Interface (GUI) testing.
+- [X] Highlights to notes mapping. This is required, because the clippings text file does not
+  provide information which can be used to determine which notes are related to which highlight on a
+  single page. Where a page contains a single note and a single highlight, the pair is created
+  automatically; where there is more than one note, the user can create these associations
+  manually.
+- [X] GUI finalizing for the Alpha release.
+- [X] Debug logging
 - [ ] **Alpha Release**
 
 ----
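The highlight-to-note mapping rule in the roadmap can be sketched as follows. This is only an illustration under stated assumptions: `AutoPairSketch`, `Pair`, and `pairPage` are hypothetical names, highlights and notes are reduced to plain strings, and any page that does not have exactly one highlight and one note is left for manual pairing (the README does not say what happens with one note and several highlights).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the kpdfsync implementation.
class AutoPairSketch {
    // A highlight and its note; note stays null until a pairing exists.
    static final class Pair {
        final String highlight;
        final String note;

        Pair(String highlight, String note) {
            this.highlight = highlight;
            this.note = note;
        }
    }

    // Pairs the highlights and notes found on one page.
    // Only the unambiguous case (one highlight, one note) is paired automatically.
    static List<Pair> pairPage(List<String> highlights, List<String> notes) {
        boolean unambiguous = highlights.size() == 1 && notes.size() == 1;
        List<Pair> result = new ArrayList<>();
        for (String highlight : highlights) {
            result.add(new Pair(highlight, unambiguous ? notes.get(0) : null));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Pair> auto = pairPage(Arrays.asList("h1"), Arrays.asList("n1"));
        System.out.println(auto.get(0).highlight + " -> " + auto.get(0).note);

        List<Pair> manual = pairPage(Arrays.asList("h1", "h2"), Arrays.asList("n1", "n2"));
        for (Pair p : manual) {
            System.out.println(p.highlight + " -> " + p.note); // note is null here
        }
    }
}
```

Entries whose note is still `null` are the ones a user would resolve by hand, for example through the note-selection dialog mentioned in the TODO file.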

TODO

Lines changed: 126 additions & 15 deletions
@@ -1,36 +1,46 @@
 Kpdfsync THINGS-TO-DO
 ---------------------------------------------------------------------------------------------------
 
+# Alpha Release
 # TASKS Estimated Actual
-[X] String comparison algorithm, that can analyze the degree of match.
+[X] String comparison algorithm, that can analyze the degree of match.
     So that minor differences between the pattern and the read text
     from pdf files are handled.
-
 [X] Use PDFClown library to highlight the text which matches the most
     with the highlighted text from My Clippings file.
-
 [X] Parse the 'My Clippings.txt' file.
-
 [-] Gui POC
+[X] Manual and automatic creation of associations between highlights
+    and notes.
+[ ] Use grid layout for displaying and creating page mappings.
+    (Not done, in favor of below)
+[X] Use custom renderer in list box to show highlight note mappings.
+[X] A separate dialog window for selection of notes for a highlight.
+[X] Logging
 
-[ ] Finalize GUI
-
+# Beta Release
+# TASKS Estimated Actual
 [ ] Optimization and cleanup objects.
-
+[ ] Lib - Use Iterator instead of Enumeration. (Not sure)
+[ ] GUI - Status bar showing last error or success message.
+[ ] Lib - parseLine function can be protected. It is public now.
+[ ] Lib - matching BOM bytes can be put inside a method in the
+    ByteOrderMarkTypes enum. It is now separate, in the
+    ByteOrderMark file.
 # BUGS:
-[ ] The string matching algo is too simple, and gives wrong match
-    percentage, if the strings being compared differ in the number
-    of non-whitespace characters. The two indexes get out of sync
-    at the first mismatch and never recover.
+[ ] The string matching algo is too simple, and gives wrong match
+    percentage, if the strings being compared differ in the number
+    of non-whitespace characters. The two indexes get out of sync
+    at the first mismatch and never recover.
     Example:
     PDF text = 123 56 789
     Clipping text = 123 456 789
     % match = 3/8 (Wrong)
     % match = 7/8 (What is expected)
 
-[ ] Related to the above bug, we are highlighting more characters -
-    by that many characters as the diffence in the number of
-    characters, between the text read from the PDF and the pattern
+[ ] Related to the above bug, we are highlighting more characters -
+    by that many characters as the difference in the number of
+    characters, between the text read from the PDF and the pattern
     read from the clippings file.
     The algorithm matches character by character, the pattern and the
     text from the pdf. The matching and thus the highlighting is as
@@ -46,7 +56,37 @@ Kpdfsync
 highlighting)
 
 [ ] For some PDF files, org.pdfclown.tools.TextExtractor.extract() is returning null.
-    This is seen with the Concrete Mathematics original PDF file.
+    This is seen with the Concrete Mathematics original PDF file. May be a TrueType font issue.
+    Here is the stack trace:
+    java.lang.NullPointerException
+        at java.base/java.util.Hashtable.put(Hashtable.java:476)
+        at org.pdfclown.documents.contents.fonts.PfbParser.parse(PfbParser.java:99)
+        at org.pdfclown.documents.contents.fonts.Type1Font.getNativeEncoding(Type1Font.java:96)
+        at org.pdfclown.documents.contents.fonts.Type1Font.loadEncoding(Type1Font.java:141)
+        at org.pdfclown.documents.contents.fonts.SimpleFont.onLoad(SimpleFont.java:118)
+        at org.pdfclown.documents.contents.fonts.Font.load(Font.java:738)
+        at org.pdfclown.documents.contents.fonts.Font.<init>(Font.java:351)
+        at org.pdfclown.documents.contents.fonts.SimpleFont.<init>(SimpleFont.java:62)
+        at org.pdfclown.documents.contents.fonts.Type1Font.<init>(Type1Font.java:75)
+        at org.pdfclown.documents.contents.fonts.Font.wrap(Font.java:249)
+        at org.pdfclown.documents.contents.FontResources.wrap(FontResources.java:72)
+        at org.pdfclown.documents.contents.FontResources.wrap(FontResources.java:1)
+        at org.pdfclown.documents.contents.ResourceItems.get(ResourceItems.java:119)
+        at org.pdfclown.documents.contents.objects.SetFont.getResource(SetFont.java:119)
+        at org.pdfclown.documents.contents.objects.SetFont.getFont(SetFont.java:83)
+        at org.pdfclown.documents.contents.objects.SetFont.scan(SetFont.java:97)
+        at org.pdfclown.documents.contents.ContentScanner.moveNext(ContentScanner.java:1330)
+        at org.pdfclown.documents.contents.ContentScanner$TextWrapper.extract(ContentScanner.java:811)
+        at org.pdfclown.documents.contents.ContentScanner$TextWrapper.extract(ContentScanner.java:817)
+        at org.pdfclown.documents.contents.ContentScanner$TextWrapper.<init>(ContentScanner.java:777)
+        at org.pdfclown.documents.contents.ContentScanner$TextWrapper.<init>(ContentScanner.java:770)
+        at org.pdfclown.documents.contents.ContentScanner$GraphicsObjectWrapper.get(ContentScanner.java:690)
+        at org.pdfclown.documents.contents.ContentScanner$GraphicsObjectWrapper.access$0(ContentScanner.java:682)
+        at org.pdfclown.documents.contents.ContentScanner.getCurrentWrapper(ContentScanner.java:1154)
+        at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:633)
+        at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:296)
+        at coderarjob.kpdfsync.lib.annotator.PdfAnnotatorV1.highlight(PdfAnnotatorV1.java:62)
+        at coderarjob.kpdfsync.poc.MainFrame$2.run(MainFrame.java:172)
 
 [ ] Highlight is not visible on the output PDF file. This was seen on the Concrete Mathematics
     cropped PDF file.
@@ -62,3 +102,74 @@ Kpdfsync
 5. Begin highlighting.
 
 When this exception occurs, it occurs around the 73% mark.
+
+[ ] EOFException at org.pdfclown.tools.TextExtractor.extract() method. This is seen on
+    'the_evolution_of_operating_system_cropped.pdf' file. Could also be a font issue.
+    Here is the stack trace:
+    java.lang.RuntimeException: java.io.EOFException
+        at org.pdfclown.documents.contents.fonts.CffParser.load(CffParser.java:703)
+        at org.pdfclown.documents.contents.fonts.CffParser.<init>(CffParser.java:640)
+        at org.pdfclown.documents.contents.fonts.Type1Font.getNativeEncoding(Type1Font.java:104)
+        at org.pdfclown.documents.contents.fonts.Type1Font.loadEncoding(Type1Font.java:151)
+        at org.pdfclown.documents.contents.fonts.SimpleFont.onLoad(SimpleFont.java:118)
+        at org.pdfclown.documents.contents.fonts.Font.load(Font.java:738)
+        at org.pdfclown.documents.contents.fonts.Font.<init>(Font.java:351)
+        at org.pdfclown.documents.contents.fonts.SimpleFont.<init>(SimpleFont.java:62)
+        at org.pdfclown.documents.contents.fonts.Type1Font.<init>(Type1Font.java:75)
+        at org.pdfclown.documents.contents.fonts.Font.wrap(Font.java:249)
+        at org.pdfclown.documents.contents.FontResources.wrap(FontResources.java:72)
+        at org.pdfclown.documents.contents.FontResources.wrap(FontResources.java:1)
+        at org.pdfclown.documents.contents.ResourceItems.get(ResourceItems.java:119)
+        at org.pdfclown.documents.contents.objects.SetFont.getResource(SetFont.java:119)
+        at org.pdfclown.documents.contents.objects.SetFont.getFont(SetFont.java:83)
+        at org.pdfclown.documents.contents.objects.SetFont.scan(SetFont.java:97)
+        at org.pdfclown.documents.contents.ContentScanner.moveNext(ContentScanner.java:1330)
+        at org.pdfclown.documents.contents.ContentScanner$TextWrapper.extract(ContentScanner.java:811)
+        at org.pdfclown.documents.contents.ContentScanner$TextWrapper.<init>(ContentScanner.java:777)
+        at org.pdfclown.documents.contents.ContentScanner$TextWrapper.<init>(ContentScanner.java:770)
+        at org.pdfclown.documents.contents.ContentScanner$GraphicsObjectWrapper.get(ContentScanner.java:690)
+        at org.pdfclown.documents.contents.ContentScanner$GraphicsObjectWrapper.access$0(ContentScanner.java:682)
+        at org.pdfclown.documents.contents.ContentScanner.getCurrentWrapper(ContentScanner.java:1154)
+        at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:633)
+        at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:296)
+        at coderarjob.kpdfsync.lib.annotator.PdfAnnotatorV1.highlight(PdfAnnotatorV1.java:62)
+        at coderarjob.kpdfsync.poc.MainFrame$2.run(MainFrame.java:172)
+        at java.base/java.lang.Thread.run(Thread.java:833)
+    Caused by: java.io.EOFException
+        at org.pdfclown.bytes.Buffer.readUnsignedShort(Buffer.java:511)
+        at org.pdfclown.documents.contents.fonts.CffParser$Index.parse(CffParser.java:306)
+        at org.pdfclown.documents.contents.fonts.CffParser$Index.parse(CffParser.java:324)
+        at org.pdfclown.documents.contents.fonts.CffParser.load(CffParser.java:669)
+        ... 27 more
+    :: Cause #1
+    java.io.EOFException
+        at org.pdfclown.bytes.Buffer.readUnsignedShort(Buffer.java:511)
+        at org.pdfclown.documents.contents.fonts.CffParser$Index.parse(CffParser.java:306)
+        at org.pdfclown.documents.contents.fonts.CffParser$Index.parse(CffParser.java:324)
+        at org.pdfclown.documents.contents.fonts.CffParser.load(CffParser.java:669)
+        at org.pdfclown.documents.contents.fonts.CffParser.<init>(CffParser.java:640)
+        at org.pdfclown.documents.contents.fonts.Type1Font.getNativeEncoding(Type1Font.java:104)
+        at org.pdfclown.documents.contents.fonts.Type1Font.loadEncoding(Type1Font.java:151)
+        at org.pdfclown.documents.contents.fonts.SimpleFont.onLoad(SimpleFont.java:118)
+        at org.pdfclown.documents.contents.fonts.Font.load(Font.java:738)
+        at org.pdfclown.documents.contents.fonts.Font.<init>(Font.java:351)
+        at org.pdfclown.documents.contents.fonts.SimpleFont.<init>(SimpleFont.java:62)
+        at org.pdfclown.documents.contents.fonts.Type1Font.<init>(Type1Font.java:75)
+        at org.pdfclown.documents.contents.fonts.Font.wrap(Font.java:249)
+        at org.pdfclown.documents.contents.FontResources.wrap(FontResources.java:72)
+        at org.pdfclown.documents.contents.FontResources.wrap(FontResources.java:1)
+        at org.pdfclown.documents.contents.ResourceItems.get(ResourceItems.java:119)
+        at org.pdfclown.documents.contents.objects.SetFont.getResource(SetFont.java:119)
+        at org.pdfclown.documents.contents.objects.SetFont.getFont(SetFont.java:83)
+        at org.pdfclown.documents.contents.objects.SetFont.scan(SetFont.java:97)
+        at org.pdfclown.documents.contents.ContentScanner.moveNext(ContentScanner.java:1330)
+        at org.pdfclown.documents.contents.ContentScanner$TextWrapper.extract(ContentScanner.java:811)
+        at org.pdfclown.documents.contents.ContentScanner$TextWrapper.<init>(ContentScanner.java:777)
+        at org.pdfclown.documents.contents.ContentScanner$TextWrapper.<init>(ContentScanner.java:770)
+        at org.pdfclown.documents.contents.ContentScanner$GraphicsObjectWrapper.get(ContentScanner.java:690)
+        at org.pdfclown.documents.contents.ContentScanner$GraphicsObjectWrapper.access$0(ContentScanner.java:682)
+        at org.pdfclown.documents.contents.ContentScanner.getCurrentWrapper(ContentScanner.java:1154)
+        at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:633)
+        at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:296)
+        at coderarjob.kpdfsync.lib.annotator.PdfAnnotatorV1.highlight(PdfAnnotatorV1.java:62)
+        at coderarjob.kpdfsync.poc.MainFrame$2.run(MainFrame.java:172)
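The string-matching bug recorded in the BUGS list of this TODO file (two indexes that fall out of sync at the first mismatch) can be reproduced in a few lines. The sketch below is not the project's matcher: `MatchSketch` and both method names are hypothetical. It contrasts a lockstep comparison with a longest-common-subsequence (LCS) count, one possible technique that tolerates a missing character. For the TODO's example the lockstep count is 3, as the note reports; the LCS count is 8, not the 7 written in the note, so which count and denominator the project wants remains an open choice.

```java
// Hypothetical sketch, not the kpdfsync pattern matcher.
class MatchSketch {
    // Whitespace is ignored, as in the TODO's example.
    static String strip(String s) {
        return s.replaceAll("\\s+", "");
    }

    // Compares position by position; one missing character shifts every
    // later comparison, so the count never recovers.
    static int lockstepMatches(String text, String pattern) {
        String a = strip(text);
        String b = strip(pattern);
        int count = 0;
        for (int i = 0; i < Math.min(a.length(), b.length()); i++) {
            if (a.charAt(i) == b.charAt(i)) {
                count++;
            }
        }
        return count;
    }

    // Length of the longest common subsequence, by dynamic programming.
    // Characters may be skipped on either side, so matching resynchronizes.
    static int lcsMatches(String text, String pattern) {
        String a = strip(text);
        String b = strip(pattern);
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                dp[i][j] = (a.charAt(i - 1) == b.charAt(j - 1))
                        ? dp[i - 1][j - 1] + 1
                        : Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
        return dp[a.length()][b.length()];
    }

    public static void main(String[] args) {
        String pdfText = "123 56 789";
        String clipping = "123 456 789";
        System.out.println(lockstepMatches(pdfText, clipping)); // prints 3
        System.out.println(lcsMatches(pdfText, clipping));      // prints 8
    }
}
```

The LCS table costs time and memory proportional to the product of the two lengths, which may matter when a clipping is compared against the text of a whole page.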

build.sh

Lines changed: 17 additions & 6 deletions
@@ -10,23 +10,34 @@ find src -name "*.java" -exec sed -i s/\ \*$//g {} \; || exit
 
 export CLASSPATH="lib/pdfclown.jar:$BIN_DIR"
 
+JDK_VER_TARGET=8
+
 # Build AJL
-javac -Xlint -d "$BIN_DIR/" src/coderarjob/ajl/file/*.java || exit
+javac --release $JDK_VER_TARGET -Xlint -d "$BIN_DIR/" \
+      src/coderarjob/ajl/file/*.java || exit
 
 # Build Pattern Matcher
-javac -Xlint -d "$BIN_DIR/" src/coderarjob/kpdfsync/lib/pm/*.java || exit
+javac --release $JDK_VER_TARGET -Xlint -d "$BIN_DIR/" \
+      src/coderarjob/kpdfsync/lib/pm/*.java || exit
 
 # Build Annotator
-javac -Xlint -d "$BIN_DIR/" src/coderarjob/kpdfsync/lib/annotator/*.java || exit
+javac --release $JDK_VER_TARGET -Xlint -d "$BIN_DIR/" \
+      src/coderarjob/kpdfsync/lib/annotator/*.java || exit
 
 # Build Kindle Clippings File Parser
-javac -Xlint -d "$BIN_DIR/" src/coderarjob/kpdfsync/lib/clipparser/*.java || exit
+javac --release $JDK_VER_TARGET -Xlint -d "$BIN_DIR/" \
+      src/coderarjob/kpdfsync/lib/clipparser/*.java || exit
 
 # Build kpdfsync library
-javac -Xlint -d "$BIN_DIR/" src/coderarjob/kpdfsync/lib/*.java || exit
+javac --release $JDK_VER_TARGET -Xlint -d "$BIN_DIR/" \
+      src/coderarjob/kpdfsync/lib/*.java || exit
 
 # Build POC
-javac -Xlint -d "$BIN_DIR/" src/coderarjob/kpdfsync/poc/*.java || exit
+javac --release $JDK_VER_TARGET -Xlint -d "$BIN_DIR/" \
+      src/coderarjob/kpdfsync/poc/*.java || exit
+
+# Copy resources
+cp -r src/coderarjob/kpdfsync/poc/res "$BIN_DIR/coderarjob/kpdfsync/poc" || exit
 
 # Generate tags file
 ctags --recurse ./src || exit

docs/images/screenshot_alpha.png

66.7 KB

pack.sh

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+#!/bin/bash
+
+echo :: Building packages
+
+rm -rf dist
+mkdir dist
+
+pushd build/classes
+
+# Create kpdfsync.jar
+# BasicParser is what that can change. So it is packaged separately.
+jar cfm kpdfsync.jar ../../Manifest.txt \
+    coderarjob
+popd
+
+mv build/classes/kpdfsync.jar ./dist/
+cp lib/pdfclown.jar ./dist/
+
+
+echo :: Building packages completed
+
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+package coderarjob.kpdfsync.poc;
+
+import javax.swing.*;
+import java.awt.Component;
+import java.awt.BorderLayout;
+import java.awt.Color;
+
+public class HighlightNotePairListRenderer extends JPanel implements ListCellRenderer<HighlightNotePair>
+{
+
+  private JLabel highlightLabel;
+  private JLabel pairedNoteLabel;
+  private Color alternateColor;
+
+  public HighlightNotePairListRenderer()
+  {
+    this.setLayout (new BorderLayout());
+
+    highlightLabel = new JLabel();
+    String iconResourceName = "/coderarjob/kpdfsync/poc/res/highlighter.png";
+    highlightLabel.setIcon (new ImageIcon (getClass().getResource (iconResourceName)));
+    highlightLabel.setForeground (Color.BLACK);
+    highlightLabel.setOpaque (false);
+    this.add (highlightLabel, BorderLayout.PAGE_START);
+
+    pairedNoteLabel = new JLabel();
+    pairedNoteLabel.setForeground (Color.DARK_GRAY);
+    pairedNoteLabel.setOpaque (false);
+    this.add (pairedNoteLabel, BorderLayout.PAGE_END);
+
+    alternateColor = new Color (237, 244, 249);
+    this.setBorder (BorderFactory.createEmptyBorder (5, 2, 5, 0));
+  }
+
+  public Component getListCellRendererComponent(JList<? extends HighlightNotePair> list,
+                                                HighlightNotePair value, int index,
+                                                boolean isSelected, boolean cellHasFocus)
+  {
+    highlightLabel.setText (value.getHighlightText());
+    pairedNoteLabel.setText (value.getNoteText());
+
+    Color normalBackgroundColor = (index % 2 == 0) ? alternateColor : list.getBackground();
+
+    if (isSelected)
+      this.setBackground (list.getSelectionBackground());
+    else
+      this.setBackground (normalBackgroundColor);
+
+    return this;
+  }
+}
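For readers unfamiliar with Swing cell renderers, the following reduced sketch shows the same alternating-row idea and how such a renderer is attached to a `JList`. It is an illustration only: `StripedRendererSketch` and `newStripedList` are hypothetical names, and plain `String` rows stand in for `HighlightNotePair`, whose definition is not part of this diff.

```java
import java.awt.Color;
import java.awt.Component;
import javax.swing.JLabel;
import javax.swing.JList;
import javax.swing.ListCellRenderer;

// Hypothetical, simplified stand-in for HighlightNotePairListRenderer.
class StripedRendererSketch extends JLabel implements ListCellRenderer<String> {
    // Same tint as the alternate color used in the commit.
    private final Color alternateColor = new Color(237, 244, 249);

    StripedRendererSketch() {
        // A JLabel is transparent by default, so its background would not be
        // painted. The commit's renderer extends JPanel, which is opaque.
        setOpaque(true);
    }

    @Override
    public Component getListCellRendererComponent(JList<? extends String> list,
                                                  String value, int index,
                                                  boolean isSelected, boolean cellHasFocus) {
        setText(value);
        // Even rows get the tint, odd rows keep the list's own background.
        Color normal = (index % 2 == 0) ? alternateColor : list.getBackground();
        setBackground(isSelected ? list.getSelectionBackground() : normal);
        // One component instance is reused to paint every cell.
        return this;
    }

    // Builds a list whose rows are painted by the striped renderer.
    static JList<String> newStripedList(String... rows) {
        JList<String> list = new JList<>(rows);
        list.setCellRenderer(new StripedRendererSketch());
        return list;
    }
}
```

Because the list reuses one renderer instance for all cells, every property that varies per row (text, background) has to be set on each call, which is what both the sketch and the committed class do.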
