Some references to collect here on the use of general-purpose graphics processors (GPGPUs) in OCR. Nominally, you run some kind of MapReduce-style algorithm on a GPGPU over an image, and text spits out far faster than any ordinary processor could manage; of course, there are lots of details to attend to.
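As an illustrative sketch of that map-and-reduce shape (everything here is hypothetical and not drawn from any real OCR package): split the page into tiles, "recognize" each tile independently in parallel, then reduce by stitching the per-tile text back together in page order.

```python
from concurrent.futures import ThreadPoolExecutor

def recognize_tile(tile):
    # Stand-in for the per-tile recognition kernel a GPU would run;
    # here each tile is just a list of text rows, and "recognition"
    # is joining them back into a string.
    return "".join(tile)

def ocr_map_reduce(image_rows, tile_height=2):
    # Map: cut the page into horizontal tiles and process them in parallel.
    tiles = [image_rows[i:i + tile_height]
             for i in range(0, len(image_rows), tile_height)]
    with ThreadPoolExecutor() as pool:
        pieces = pool.map(recognize_tile, tiles)
    # Reduce: concatenate the per-tile results in their original order.
    return "".join(pieces)
```

On an actual GPGPU the map step would be a kernel launched over thousands of tiles or pixels at once rather than a small host-side thread pool, but the split/process/stitch shape is the same.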
Jike Chong, Bryan Christopher Catanzaro, Narayanan Sundaram, Fares Hedayati and Kurt Keutzer
A new breed of general-purpose manycore computing platform is emerging. Examples include the Niagara from Sun Microsystems, the G80 from Nvidia, the Cell from IBM, and the upcoming Larrabee from Intel. These manycore processors each pack 8-32 relatively simple cores on a chip, can support hundreds of threads, and boast tremendous peak single-chip performance, potentially in the teraFLOPS range. However, traditional algorithms and applications in many domains cannot take advantage of much of the parallelism these platforms provide.
The ParLab at Berkeley was recently founded in part to meet this acute need for novel algorithmic approaches that unleash the performance potential of emerging manycore platforms across a wide range of application domains. It proposes to concentrate on analyzing the communication and computation patterns (or Dwarfs) of important classes of algorithms underlying modern application domains, and to develop techniques for parallelizing them efficiently on general-purpose manycore platforms.
We concentrate on the domain of image recognition and retrieval, using the Intel PIRO content-based image retrieval framework as a motivating application. Specifically, we study the parallelization of classification algorithms for machine learning and develop parallelization techniques that improve the performance of these algorithms and applications on emerging manycore platforms.
The CUDA implementation allows for exceptionally fast image manipulation, cleaning, and segmentation before the image is presented to the template-based OCR system. Traditional CPU-based OCR systems are very slow, especially in the image rotation and cleaning stages, and although cocr is by no means a complete OCR package, it is orders of magnitude faster than the various CPU-based ones I've tried. With a bit more work and the addition of a neural-network OCR stage, it could easily become a system capable of faster-than-realtime OCR. Neural networks are extremely well suited to GPU implementations due to their inherent parallelism. I will post more snippets and modules of the system here over the coming weeks.
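The "cleaning" step is a good example of why GPUs help here: an operation like thresholding is embarrassingly parallel, since each output pixel depends only on its own input pixel, so on a GPU every pixel can get its own thread. A minimal CPU sketch of that per-pixel binarization, using NumPy's vectorized operations as a stand-in for a CUDA kernel (the function name and default threshold are my own, not from cocr):

```python
import numpy as np

def binarize(gray, threshold=128):
    # Per-pixel thresholding: pixels darker than the threshold become
    # black (0), the rest become white (255). Each output pixel depends
    # only on one input pixel, which is exactly the data-parallel shape
    # a GPU kernel exploits; NumPy's vectorization plays that role here.
    return np.where(gray < threshold, 0, 255).astype(np.uint8)
```

A real CUDA version would launch one thread per pixel over the same comparison; the algorithm itself is unchanged.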
I wasn't able to find any evidence that the OCRopus OCR system Google has used and released employs any parallel computing mechanisms, so this still looks like "gee, it should work for someone, if you can throw a PhD student or two at it."