Using Tesseract OCR in a C# project

Keywords: Tesseract OCR project | Updated: 2023-09-27 18:29:03

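I downloaded Tesseract from here. When I try to add the DLL files to Visual Studio 2012, it reports an error saying the file is not a valid assembly. Can anyone recommend other OCR DLLs and some sample code? I have tried many websites but could not find a good one. Then I found the tessnet2 DLL and used the following code: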

using System;
using System.Collections.Generic;
using System.Drawing;
using tessnet2;

string path = @"C:\pic\mytext.jpg";
Bitmap image = new Bitmap(path);
Tesseract ocr = new Tesseract();
ocr.SetVariable("tessedit_char_whitelist", "0123456789"); // If digit only
ocr.Init(@"C:\tessdata\", "eng", false); // To use correct tessdata
List<tessnet2.Word> result = ocr.DoOCR(image, Rectangle.Empty);
foreach (tessnet2.Word word in result)
    Console.WriteLine("{0} : {1}", word.Confidence, word.Text);

But Visual Studio throws an error saying it is an invalid assembly. Can anyone help me?

EDIT: Only the framework in the properties folder. Thanks in advance.
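The post does not say which DLL was added, so what follows rests on an assumption: one common reason for a "not a valid assembly" error is that the file is a native library rather than a managed .NET assembly, and a native DLL cannot be added as a project reference. A minimal sketch for checking a file before referencing it might look like this. The class and method names (AssemblyCheck, IsManagedAssembly) are hypothetical; AssemblyName.GetAssemblyName is the standard .NET call that throws BadImageFormatException for files that are not valid assemblies.

```csharp
using System;
using System.Reflection;

static class AssemblyCheck
{
    // Hypothetical helper: returns true if the file at 'path' is a
    // managed .NET assembly, false if it is a native DLL or other file.
    public static bool IsManagedAssembly(string path)
    {
        try
        {
            // Reads only the assembly manifest; does not load the assembly.
            AssemblyName.GetAssemblyName(path);
            return true;
        }
        catch (BadImageFormatException)
        {
            // Thrown when the file is not a valid managed assembly.
            return false;
        }
    }
}
```

If the check returns false for a DLL, the file likely has to sit next to the executable and be called through a managed wrapper instead of being referenced directly.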

Using Tesseract OCR in a C# project

I tried the Tesseract .NET wrapper. It has a more pleasant syntax:

using (var engine = new TesseractEngine(pathToLangFolder, "eng", EngineMode.Default))
{
    // have to load Pix via a bitmap since Pix doesn't support loading a stream.
    using (var image = new Bitmap(fileName))
    {
        using (var pix = PixConverter.ToPix(image))
        {
            using (var page = engine.Process(pix))
            {
                Console.WriteLine(page.GetMeanConfidence() + " : " + page.GetText());
            }
        }
    }
}

Why not try OCRSDK? It is a paid service, but a trial is available. Its accuracy when extracting text from images is up to 85%.