Latin OCR training data and tools for Tesseract, based on Nick White's Ancient Greek OCR for Tesseract.
v0.3.0
- move training process into Tesseract's new tesstrain.sh system.
v0.2.2
- add training on more ligatured forms & glyphs, tweak dictionaries.
v0.2.1
- add training on various punctuation marks and new fonts.
v0.2.0
- fix use of Tesseract character blacklist, vastly improving accuracy.
v0.1.0
- rebuild training under stable environment.
v0.1.0-alpha2
- add training on bold and italic font variants.
v0.1.0-alpha1
- initial training file prerelease.