Models
There are currently three kinds of models containing the recurrent neural networks that perform all character recognition supported by kraken: pronn files, which serialize old pickled pyrnn models as protobuf; clstm’s native serialization; and versatile Core ML models.
pyrnn
These are serialized instances of Python lstm.SeqRecognizer objects. Using such a model just entails loading the pickle and calling the appropriate functions to perform recognition, much like a shared library in other programming languages.
Support for these models was dropped in kraken 1.0 as Python 2.7 is being phased out.
pronn
Legacy Python models can be converted to a protobuf-based serialization. These are loadable by kraken 1.0 and will be automatically converted to Core ML.
Protobuf models have several advantages over pickled ones. They are noticeably smaller (1.8 MB instead of 80 MB for the default model), don’t allow arbitrary code execution, and are upward compatible with Python 3. Because they are so much more lightweight, they also load much faster.
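The code-execution point is a general property of Python's pickle format rather than anything specific to kraken. A minimal sketch using only the standard library (the Payload class is hypothetical and deliberately harmless) shows that merely loading a pickle calls a function chosen by whoever created the file:

```python
import pickle


class Payload:
    """Hypothetical stand-in for a tampered 'model' object."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and invokes it on load.
        # A harmless eval() is used here; any importable callable would work.
        return (eval, ("6 * 7",))


data = pickle.dumps(Payload())

# Loading calls eval("6 * 7") and returns its result instead of a Payload.
result = pickle.loads(data)
print(result)
```

A protobuf file, by contrast, is parsed as plain data against a fixed schema, so reading it does not invoke callables named inside the file.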
clstm
clstm is a small and fast implementation of LSTM networks that was used in previous kraken versions. Its model files can be loaded with pytorch-based kraken and will be converted to Core ML.
Core ML
Core ML allows arbitrary network architectures in a compact serialization with metadata. This is the default format in pytorch-based kraken.
Conversion
By default, pronn/clstm models are automatically converted to the new Core ML format when explicitly specified with the -m option to the ocr utility on the command line. The converted models are stored in the user kraken directory (default is ~/.kraken) and will be automatically substituted in future runs.
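To see what has been stored after a conversion, the user directory can simply be listed. The following is a small sketch, not part of kraken: the helper name list_user_models is hypothetical, and it makes no assumption about the file names or extensions kraken gives to converted models.

```python
from pathlib import Path


def list_user_models(model_dir=None):
    """Return the sorted file names found in the user kraken directory."""
    # ~/.kraken is the default location named in the text; pass another
    # path if your installation uses a different directory.
    if model_dir is None:
        model_dir = Path.home() / ".kraken"
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return []  # directory absent: nothing has been stored yet
    return sorted(p.name for p in model_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    for name in list_user_models():
        print(name)
```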
If conversion is not desired, e.g. because there is a bug in the conversion routine, it can be disabled using the --disable-autoconversion switch.
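As a rough illustration, a script that drives kraken could assemble the call as below. This is a sketch under assumptions: build_ocr_command is a hypothetical helper, the file names are placeholders, and the overall layout (kraken -i input output ocr ...) as well as the placement of --disable-autoconversion after the ocr subcommand should be checked against the help output of the installed version.

```python
import shlex


def build_ocr_command(image, output, model, autoconvert=True):
    """Assemble a kraken command line as an argument list."""
    # Assumed layout: kraken -i <input> <output> ocr -m <model>
    cmd = ["kraken", "-i", image, output, "ocr", "-m", model]
    if not autoconvert:
        # Switch named in the text; its position after `ocr` is an assumption.
        cmd.append("--disable-autoconversion")
    return cmd


# Placeholder file names, chosen only for illustration.
cmd = build_ocr_command("page.png", "page.txt", "legacy.pronn", autoconvert=False)
print(shlex.join(cmd))
```

On a machine where kraken is installed, the resulting list can be passed to subprocess.run(cmd, check=True); leaving autoconvert at its default omits the switch and keeps the automatic conversion described above.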