Recognize numbers in images

I've been searching the web for resources on number recognition in images. I found many links offering lots of resources on the topic, but unfortunately they are more confusing than helpful, and I don't know where to start.

I've got an image with 5 numbers in it, undistorted (no CAPTCHA or anything like that). The numbers are black on a white background, written in a standard font.

My first step was to separate the numbers. The algorithm I currently use is quite simple: it checks whether a column is entirely white and therefore a space. Then it trims each character so that there is no white border around it. This works quite well.
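
For readers following along, the column scan and trim described above could be sketched in Java roughly as follows. This is an illustration, not the asker's actual code: the class name `DigitSegmenter`, the `boolean[][]` representation (`true` = black pixel, indexed `[row][column]`), and the `fromRows` helper are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class DigitSegmenter {

    /** Hypothetical helper: builds an ink map from text rows, '#' = black pixel. */
    static boolean[][] fromRows(String... rows) {
        boolean[][] ink = new boolean[rows.length][];
        for (int y = 0; y < rows.length; y++) {
            ink[y] = new boolean[rows[y].length()];
            for (int x = 0; x < rows[y].length(); x++) {
                ink[y][x] = rows[y].charAt(x) == '#';
            }
        }
        return ink;
    }

    /** True if column x contains no black pixel in any row. */
    static boolean isBlankColumn(boolean[][] ink, int x) {
        for (boolean[] row : ink) {
            if (row[x]) return false;
        }
        return true;
    }

    /** True if row y contains no black pixel between columns x0 (inclusive) and x1 (exclusive). */
    static boolean isBlankRow(boolean[][] ink, int y, int x0, int x1) {
        for (int x = x0; x < x1; x++) {
            if (ink[y][x]) return false;
        }
        return true;
    }

    /**
     * Splits the image at entirely white columns and trims each character.
     * Each result is {left, top, right, bottom}; right and bottom are exclusive.
     */
    static List<int[]> segment(boolean[][] ink) {
        List<int[]> boxes = new ArrayList<>();
        int height = ink.length;
        int width = height == 0 ? 0 : ink[0].length;
        int x = 0;
        while (x < width) {
            if (isBlankColumn(ink, x)) {
                x++;            // white column: gap between characters
                continue;
            }
            int left = x;
            while (x < width && !isBlankColumn(ink, x)) x++;
            int right = x;
            // Trim white rows above and below; the run contains ink, so both loops stop.
            int top = 0;
            while (isBlankRow(ink, top, left, right)) top++;
            int bottom = height;
            while (isBlankRow(ink, bottom - 1, left, right)) bottom--;
            boxes.add(new int[] {left, top, right, bottom});
        }
        return boxes;
    }
}
```

Note that this approach assumes adjacent characters never touch or overlap horizontally; in a font with kerning or a noisy scan, a pure column scan would merge characters.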

But now I'm stuck on the actual recognition of the numbers. I don't know the best way of guessing the correct one. I don't think directly comparing to the font is a good idea, because if the numbers differ only a little, it will no longer work.

Could anyone give me a hint on how this is done?

It doesn't matter to the question, but I'll be implementing this in C# or Java. I found some libraries that would do the job, but I'd like to implement it myself, to learn something.

Best answer

Why not look at using an open source OCR engine such as Tesseract?

http://code.google.com/p/tesseract-ocr/

C# Wrapper for Tesseract

http://www.pixel-technology.com/freeware/tessnet2/

Java Wrapper for Tesseract

http://sourceforge.net/projects/tessocrinjava/

While you might not consider using a third-party library as implementing it yourself, there's a tremendous amount of work that goes into just integrating the third-party tool. Keep in mind also that something that may seem simple (recognizing the number 5 versus the number 6) is often very complex; we're talking thousands and thousands of lines of code. At the very least, look at the source code for Tesseract; it'll give you a good reason to want to leverage a third-party library.

Here's another SO question that'll give you some ideas about the algorithms involved: https://stackoverflow.com/questions/850717/what-are-some-popular-ocr-algorithms
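
For a sense of scale, here is a minimal sketch of one of the simplest ideas in this space, and emphatically not what Tesseract does: reduce each trimmed character to a coarse grid of ink densities and pick the reference digit with the nearest grid. Because each cell averages many pixels, small differences between a glyph and the reference font shift the distance only slightly, which speaks to the question's worry about exact comparison. The class name `GridMatcher`, the `boolean[][]` glyph representation (`true` = black), the grid size, and the toy templates in the usage note are all assumptions made for illustration.

```java
public class GridMatcher {

    /** Hypothetical helper: builds a glyph from text rows, '#' = black pixel. */
    static boolean[][] fromRows(String... rows) {
        boolean[][] g = new boolean[rows.length][];
        for (int y = 0; y < rows.length; y++) {
            g[y] = new boolean[rows[y].length()];
            for (int x = 0; x < rows[y].length(); x++) {
                g[y][x] = rows[y].charAt(x) == '#';
            }
        }
        return g;
    }

    /** Downsamples a trimmed glyph to an n x n grid holding the fraction of black pixels per cell. */
    static double[] features(boolean[][] glyph, int n) {
        int h = glyph.length;
        int w = glyph[0].length;
        double[] ink = new double[n * n];
        int[] count = new int[n * n];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int cell = (y * n / h) * n + (x * n / w);
                count[cell]++;
                if (glyph[y][x]) ink[cell]++;
            }
        }
        for (int i = 0; i < ink.length; i++) {
            if (count[i] > 0) ink[i] /= count[i];   // cells with no pixels stay 0
        }
        return ink;
    }

    /** Squared Euclidean distance between two feature vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the index of the template whose grid is nearest to the glyph's grid. */
    static int classify(boolean[][] glyph, boolean[][][] templates, int n) {
        double[] f = features(glyph, n);
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < templates.length; i++) {
            double dist = distance(f, features(templates[i], n));
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return best;
    }
}
```

With `templates` holding one reference glyph per digit (index = digit), `GridMatcher.classify(glyph, templates, 4)` returns the best guess. A sketch like this may be adequate for clean, single-font input such as the question describes, but it has no notion of confidence, skew, or noise, which is part of where the complexity mentioned above comes from.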
