I need an OCR (optical character recognition) tool that can process TIFF files. It must be royalty-free and very inexpensive to use. I expect to use a third-party tool, and I am looking for a developer who can find one for me and show me how to incorporate it into my application.
I must be able to OCR any number of pages in one run without memory problems. (I am OK with OCR'ing one page, saving the result to a file, and then OCR'ing the next.)
I also need the tool to record the character positions, so that I can later use those positions to highlight the words within the document.
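To illustrate the kind of position data I mean, here is a minimal sketch in Python (chosen only for brevity; my application is .NET). It assumes the OCR engine can emit Tesseract-style TSV output, which recent Tesseract command-line builds produce and which reports one bounding box per word rather than per character. Word-level boxes should be enough for highlighting words. The names `WordBox`, `parse_tsv_words`, and `find_word` are hypothetical helpers, not part of any library.

```python
import csv
import io
from typing import List, NamedTuple


class WordBox(NamedTuple):
    """Pixel bounding box of one recognized word (hypothetical helper type)."""
    page: int
    left: int
    top: int
    width: int
    height: int
    text: str


def parse_tsv_words(tsv_text: str) -> List[WordBox]:
    """Extract word-level boxes from Tesseract-style TSV text.

    Assumes the header row names the columns level, page_num, left, top,
    width, height, and text, and that level 5 marks a word row.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    boxes = []
    for row in reader:
        text = (row.get("text") or "").strip()
        # Skip page/block/paragraph/line rows and empty words.
        if row.get("level") != "5" or not text:
            continue
        boxes.append(
            WordBox(
                page=int(row["page_num"]),
                left=int(row["left"]),
                top=int(row["top"]),
                width=int(row["width"]),
                height=int(row["height"]),
                text=text,
            )
        )
    return boxes


def find_word(boxes: List[WordBox], word: str) -> List[WordBox]:
    """Return every box whose text matches `word`, ignoring case."""
    wanted = word.lower()
    return [box for box in boxes if box.text.lower() == wanted]
```

Each box returned by `find_word` gives the rectangle to highlight on the page image. Saving the parsed boxes next to each page would let the highlighting happen later without re-running OCR.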
I would also like it to be easily redistributable, without requiring other components to be installed, such as the C++ runtime libraries. (This is a preference, not a requirement.)
## Deliverables
I have been looking at Tesseract OCR because it is open source. It sounds like a good tool, and there is an open-source .NET 2.0 assembly built on it called Tessnet2.
Apparently, Tessnet2 has a memory leak that prevents you from running OCR on multiple documents at a time. It also requires distributing the C++ runtime libraries. Because Tesseract is free, I am still willing to try it if someone can figure out a workaround for the memory leak.
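One workaround I can imagine, offered only as a sketch and not as a tested solution, is to skip the in-process wrapper and launch the Tesseract command-line program once per page. Any memory the engine leaks is returned to the operating system when that process exits, and each page's result is on disk before the next page starts. The Python below is for illustration (the same pattern should work by starting a process from .NET). It assumes a `tesseract` executable is installed and on the PATH, that it supports the `tsv` output config, and that the TIFF has already been split into single-page files. The names `build_command` and `ocr_pages` are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List


def build_command(image: Path, out_base: Path, exe: str = "tesseract") -> List[str]:
    """Build the command line: tesseract <image> <output base> tsv."""
    return [exe, str(image), str(out_base), "tsv"]


def ocr_pages(
    images: Iterable[str],
    out_dir: str,
    run: Callable = subprocess.run,
) -> List[Path]:
    """OCR each single-page image in its own short-lived process.

    `run` defaults to subprocess.run and can be swapped out for testing.
    Returns the paths where Tesseract is expected to write its TSV files.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    results = []
    for name in images:
        image = Path(name)
        out_base = target / image.stem
        # check=True raises CalledProcessError if Tesseract exits with an error.
        run(build_command(image, out_base), check=True)
        # Tesseract appends ".tsv" to the output base name.
        results.append(Path(str(out_base) + ".tsv"))
    return results
```

A call such as `ocr_pages(["page1.tif", "page2.tif"], "ocr_out")` would leave one TSV file per page in `ocr_out`. This trades some start-up time per page for bounded memory use, and it still requires shipping the Tesseract executable and its language data with the application.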