kraken API

Kraken provides routines that third-party tools can use. In general, you can expect functions in the kraken package to remain stable. We will try to keep them backward compatible, but as kraken is still in an early stage of development and the API is still quite rudimentary, nothing can be guaranteed.

kraken.binarization module

kraken.binarization.is_bitonal(im)

Tests a PIL.Image for bitonality.

Parameters: im (PIL.Image) – Image to test
Returns: True if the image contains only two different color values, False otherwise.
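The documented return value can be illustrated without PIL. The following is a minimal sketch on a flat sequence of pixel values. `has_two_values` is a hypothetical helper, not kraken's implementation, and it assumes the wording above is meant literally (exactly two distinct values).

```python
def has_two_values(pixels):
    """Return True if the pixel sequence holds exactly two distinct values.

    Hypothetical stand-in for the documented behaviour of is_bitonal;
    it works on plain Python sequences instead of a PIL.Image.
    """
    seen = set()
    for value in pixels:
        seen.add(value)
        if len(seen) > 2:
            # A third distinct value means the data is not bitonal.
            return False
    return len(seen) == 2


black_and_white = [0, 255, 255, 0, 0, 255]
greyscale = [0, 128, 255, 64]
```

Stopping at the third distinct value avoids scanning the rest of a large greyscale image.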
kraken.binarization.nlbin(im, threshold=0.5, zoom=0.5, escale=1.0, border=0.1, perc=80, range=20, low=5, high=90)

Performs binarization using non-linear processing.

Parameters:
  • im (PIL.Image) –
  • threshold (float) –
  • zoom (float) – Zoom for background page estimation
  • escale (float) – Scale for estimating a mask over the text region
  • border (float) – Ignore this much of the border
  • perc (int) – Percentage for filters
  • range (int) – Range for filters
  • low (int) – Percentile for black estimation
  • high (int) – Percentile for white estimation
Returns:

PIL.Image containing the binarized image
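As a rough illustration of how `low`, `high`, and `threshold` might interact, the sketch below estimates black and white levels from percentiles, rescales the values, and applies a cutoff. It is an assumption-laden simplification, not kraken's implementation: it works on a flat list of grey values in [0, 1], it assumes `threshold` is the cutoff applied after rescaling (the parameter is undocumented above), and it skips the background estimation governed by `zoom`, `escale`, `border`, `perc`, and `range`. The helper names `percentile` and `binarize_values` are hypothetical.

```python
def percentile(values, p):
    """Return the value at integer percentile p (0-100) of a non-empty sequence."""
    ordered = sorted(values)
    # Integer arithmetic keeps the rank deterministic.
    rank = (p * (len(ordered) - 1)) // 100
    return ordered[rank]


def binarize_values(values, threshold=0.5, low=5, high=90):
    """Rescale grey values between estimated black/white levels and threshold them."""
    black = percentile(values, low)    # estimated black level
    white = percentile(values, high)   # estimated white level
    span = (white - black) or 1.0      # avoid division by zero on flat input
    result = []
    for value in values:
        scaled = min(1.0, max(0.0, (value - black) / span))
        result.append(1 if scaled > threshold else 0)
    return result


grey = [0.1, 0.15, 0.2, 0.8, 0.85, 0.9]
bits = binarize_values(grey)
```

Here 1 stands for white and 0 for black; that encoding is a choice made for this sketch only.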

kraken.serialization module

kraken.serialization.serialize(records, image_name=u'', image_size=(0, 0), writing_mode=u'horizontal-tb', scripts=None, template=u'hocr')

Serializes a list of ocr_records into an output document.

Serializes a list of predictions and their corresponding positions by applying some hOCR-specific preprocessing and then rendering them through one of several jinja2 templates.

Parameters:
  • records (iterable) – List of kraken.rpred.ocr_record
  • image_name (str) – Name of the source image
  • image_size (tuple) – Dimensions of the source image
  • writing_mode (str) – Sets the principal layout of lines and the direction in which blocks progress. Valid values are horizontal-tb, vertical-rl, and vertical-lr.
  • scripts (list) – List of scripts contained in the OCR records
  • template (str) – Selector for the serialization format. May be ‘hocr’ or ‘alto’.
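The template-rendering idea can be sketched with the standard library alone. The example below swaps in `string.Template` for jinja2 and uses plain `(text, bbox)` tuples in place of `kraken.rpred.ocr_record` objects; the template string, the tuple layout, and `render_lines` are all hypothetical and only meant to show the shape of the process.

```python
from html import escape
from string import Template

# Hypothetical one-line template; kraken's real templates are jinja2 files.
LINE_TEMPLATE = Template(
    '<span class="ocr_line" title="bbox $x0 $y0 $x1 $y1">$text</span>'
)


def render_lines(records):
    """Render (text, (x0, y0, x1, y1)) pairs into hOCR-like line elements."""
    lines = []
    for text, (x0, y0, x1, y1) in records:
        # Escape the recognized text so it is safe inside markup.
        lines.append(LINE_TEMPLATE.substitute(
            x0=x0, y0=y0, x1=x1, y1=y1, text=escape(text)))
    return "\n".join(lines)


sample = [("Hello & goodbye", (10, 20, 300, 60))]
output = render_lines(sample)
```

Keeping the markup in a template rather than in code is what lets a single selector such as `template` switch between output formats.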

kraken.pageseg module

kraken.rpred module

kraken.transcrib module

Utility functions for ground truth transcription.

kraken.train module

kraken.linegen module

linegen

An advanced line generation tool using Pango for proper text shaping. The actual drawing code was adapted from the create_image utility from nototools available at [0].

[0] https://github.com/googlei18n/nototools

class kraken.linegen.LineGenerator(family='Sans', font_size=32, font_weight=400, language=None)

Bases: future.types.newobject.newobject

Produces degraded line images using a single collection of font families.

render_line(text)

Draws a line onto a Cairo surface which will be converted to a Pillow Image.

Parameters:

text (unicode) – A string which will be rendered as a single line.

Returns:

PIL.Image of mode ‘L’.

Raises:
  • KrakenCairoSurfaceException if the Cairo surface couldn’t be created (usually caused by invalid dimensions).
kraken.linegen.ocropy_degrade(im, distort=1.0, dsigma=20.0, eps=0.03, delta=0.3, degradations=[(0.5, 0.0, 0.5, 0.0)])

Degrades and distorts a line using the same noise model used by ocropus.

Parameters:
  • im (PIL.Image) – Input image
  • distort (float) –
  • dsigma (float) –
  • eps (float) –
  • delta (float) –
  • degradations (list) – List of 4-tuples corresponding to the degradations argument of ocropus-linegen.
Returns:

PIL.Image in mode ‘L’

kraken.linegen.degrade_line(im, mean=0.0, sigma=0.001, density=0.002)

Degrades a line image by adding several kinds of noise.

Parameters:
  • im (PIL.Image) – Input image
  • mean (float) – Mean of distribution for Gaussian noise
  • sigma (float) – Standard deviation for Gaussian noise
  • density (float) – Noise density for salt-and-pepper noise
Returns:

PIL.Image in mode ‘L’
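To make the roles of `mean`, `sigma`, and `density` concrete, here is a minimal pure-Python sketch of Gaussian plus salt-and-pepper noise. It is not kraken's code: it assumes pixel values are floats in [0, 1], operates on a flat list rather than a PIL.Image, and the `add_noise` name and `seed` parameter are hypothetical additions for reproducibility.

```python
import random


def add_noise(pixels, mean=0.0, sigma=0.001, density=0.002, seed=0):
    """Apply Gaussian noise, then salt-and-pepper noise, to values in [0, 1]."""
    rng = random.Random(seed)
    noisy = []
    for value in pixels:
        # Gaussian noise: shift each pixel by a normally distributed amount.
        value = value + rng.gauss(mean, sigma)
        value = min(1.0, max(0.0, value))
        # Salt-and-pepper noise: with probability `density`,
        # force the pixel to pure black or pure white.
        if rng.random() < density:
            value = rng.choice((0.0, 1.0))
        noisy.append(value)
    return noisy


line = [1.0, 1.0, 0.0, 0.0, 1.0, 0.5]
result = add_noise(line)
```

Setting `sigma` and `density` to zero leaves the input unchanged, which is a convenient sanity check when tuning the parameters.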

kraken.linegen.distort_line(im, distort=3.0, sigma=10, eps=0.03, delta=0.3)

Distorts a line image.

Run BEFORE degrade_line as a white border of 5 pixels will be added.

Parameters:
  • im (PIL.Image) – Input image
  • distort (float) –
  • sigma (float) –
  • eps (float) –
  • delta (float) –
Returns:

PIL.Image in mode ‘L’
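The note about the 5-pixel white border implies that the output is larger than the input. The sketch below shows only that size change on a nested list of grey values; `pad_white` is a hypothetical helper, and padding all four sides with the value 255 is an assumption rather than a description of kraken's internals.

```python
def pad_white(rows, border=5, white=255):
    """Surround a 2D list of pixel rows with a white border on all sides."""
    width = len(rows[0]) + 2 * border
    blank = [white] * width
    # Top border rows.
    padded = [list(blank) for _ in range(border)]
    # Original rows with left and right padding.
    for row in rows:
        padded.append([white] * border + list(row) + [white] * border)
    # Bottom border rows.
    padded.extend(list(blank) for _ in range(border))
    return padded


line = [[0, 0, 0], [0, 255, 0]]
padded = pad_white(line)
```

A 2 by 3 input grows to 12 by 13 here, so any step that depends on the image size should run after the distortion.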