Tesseract OCR – Text Recognition on Ubuntu 14.04

  1. Install Tesseract from the command line
    sudo apt-get install tesseract-ocr
  2. Install a language file (e.g. -eng, -deu, -fra, -ita, -nld, -por, -spa, …)
    sudo apt-get install tesseract-ocr-eng
  3. Run OCR on the scanned file (tesseract appends .txt to the output name, so this writes scanned.txt)
    tesseract scan.png scanned -l eng
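
To run the OCR step on more than one scan, a small shell loop can help. The following is only a minimal sketch: the helper names ocr_output_base and ocr_file are made up for illustration, and it assumes the scans are PNG files in the current directory and that the eng language file is installed. Because tesseract adds the .txt suffix to the output name itself, the helper strips the image extension first.

```shell
#!/bin/sh
# Sketch: OCR every PNG in the current directory.
# ocr_output_base and ocr_file are hypothetical helper names.

# tesseract appends ".txt" to the output base on its own, so strip the
# image extension: scan.png -> scan (the text then lands in scan.txt).
ocr_output_base() {
    printf '%s\n' "${1%.*}"
}

# $1: image file, $2: language code (assumed default: eng)
ocr_file() {
    tesseract "$1" "$(ocr_output_base "$1")" -l "${2:-eng}"
}

# Run only if tesseract is installed; skip the literal "*.png"
# that the shell leaves in place when no file matches.
if command -v tesseract >/dev/null 2>&1; then
    for img in *.png; do
        [ -e "$img" ] || continue
        ocr_file "$img" eng
    done
fi
```

To use another language, pass its code as the second argument, for example ocr_file scan.png deu. Running tesseract --list-langs shows which language files are currently installed.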

A useful addition is gImageReader, a graphical front end for Tesseract.

  1. Add the application repository
    sudo add-apt-repository ppa:sandromani/gimagereader
  2. Update the repository sources
    sudo apt-get update
  3. Install the application
    sudo apt-get install gimagereader
