The solution to problem 10 (find unique words) considers a "word" to be anything that matches the pattern r"[0-9a-zA-Z-']+".
Alas, there are many words in shakespeare.txt that contain characters outside of this pattern. The first few are:
- Personæ
- Phœbus
- dæmon
- Cæsar
- Æneas
The pattern stops at every character it does not cover, so these are mis-detected as:
- ['Person']
- ['Ph', 'bus']
- ['d', 'mon']
- ['C', 'sar']
- ['neas']
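
The mismatch can be reproduced with a short script. The sketch below compares the pattern from the solution with one possible Unicode-aware alternative. The names `ASCII_WORD` and `UNICODE_WORD` and the replacement pattern are my own assumptions for illustration, not part of the original solution. The alternative relies on Python 3's `re` treating `\w` as Unicode-aware for `str` patterns, and excludes the underscore that `\w` would otherwise admit.

```python
import re

# Pattern used by the solution: ASCII letters, digits, hyphen, apostrophe.
ASCII_WORD = re.compile(r"[0-9a-zA-Z-']+")

# Hypothetical alternative: any Unicode letter or digit ([^\W_] is "\w minus
# underscore"), plus hyphen and apostrophe.
UNICODE_WORD = re.compile(r"(?:[^\W_]|['-])+")

samples = ["Personæ", "Phœbus", "dæmon", "Cæsar", "Æneas"]

for word in samples:
    # Show how each pattern tokenizes the same input.
    print(word, ASCII_WORD.findall(word), UNICODE_WORD.findall(word))
```

With the alternative pattern each sample comes back as a single token, while the original pattern splits or truncates it at the ligature. Whether digits, hyphens, and apostrophes should count as part of a word is a separate decision that this sketch leaves unchanged.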