Why are the recognition results from tesseract and pytesseract completely different?

This topic was created 2200 days ago; the information mentioned may have changed since then.

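Isn't pytesseract supposed to be a command wrapper around tesseract? Then why, with everything left at the defaults, is pytesseract's recognition accuracy so much worse than tesseract's? The language data is the default too, and I didn't add any parameters. I searched around online and couldn't find an answer.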

8 replies • 2020-05-13 20:05:33 +08:00

wa8n

May 12, 2020 via iPhone

Is it the same image? And what about the version?

cz5424

May 12, 2020 via iPhone

I've run into something similar, but my guess was that the macOS version and the Linux version differ in recognition accuracy.

jacklin96

May 12, 2020

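I haven't tried it with the default parameters. With my own trained data and explicit parameters, there was no real difference in accuracy.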

tony9413

May 12, 2020

Reply #2 has it right.

sadfQED2

May 12, 2020 via Android

Try adding the parameters manually yourself and see, including that text-type parameter.

nicevar

May 12, 2020

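pytesseract is only a tiny bit of code, so there's no need to go searching online for an answer. tesseract is something you configured yourself: are you sure both are calling the same tesseract? Is the configuration data the same, and are the parameters the same? Is the code you got from somewhere processing the image before recognition?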
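The checks suggested above can be sketched roughly as follows: run the tesseract binary directly and through pytesseract with the same explicit language and page segmentation mode, on the same file, and compare the text. This is only a sketch under assumptions: the helper names (`build_cli_command`, `build_pytesseract_config`, `compare`) are made up for illustration, the `--psm` spelling assumes Tesseract 4 or later, and "text-type parameter" in the earlier reply is assumed to mean the page segmentation mode.

```python
import shutil
import subprocess


def build_cli_command(image_path, lang="eng", psm=3, tesseract_cmd="tesseract"):
    """Build the argv list for a direct tesseract run that prints text to stdout."""
    return [tesseract_cmd, image_path, "stdout", "-l", lang, "--psm", str(psm)]


def build_pytesseract_config(psm=3):
    """Build the extra-options string passed to pytesseract's `config` argument."""
    return f"--psm {psm}"


def compare(image_path, lang="eng", psm=3):
    """Run both paths with identical settings and return (cli_text, py_text, same)."""
    import pytesseract  # third-party package, imported lazily

    binary = shutil.which("tesseract")
    if binary is None:
        raise RuntimeError("tesseract binary not found on PATH")

    # Point pytesseract at the exact binary the shell would use.
    pytesseract.pytesseract.tesseract_cmd = binary
    print("tesseract version:", pytesseract.get_tesseract_version())

    cli_text = subprocess.run(
        build_cli_command(image_path, lang, psm, binary),
        capture_output=True, text=True, check=True,
    ).stdout
    # Passing the file path (not a modified PIL image) keeps the input identical.
    py_text = pytesseract.image_to_string(
        image_path, lang=lang, config=build_pytesseract_config(psm)
    )
    return cli_text, py_text, cli_text.strip() == py_text.strip()
```

If the two outputs still differ with the same binary, language and mode, the remaining suspects would be the image itself, for example code that resizes or converts it before handing it to pytesseract.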

Clay0620

May 12, 2020

Are you trying to do OCR? Honestly, if you apply for an API from Baidu or a similar provider, the recognition is quite accurate.

cmmulxuk

May 13, 2020 via Android

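Only one version is installed, and since pytesseract works at all, it shouldn't be a version problem. The issue itself isn't solved, but after preprocessing the images the accuracy went up, so I couldn't be bothered to dig further and I'm just making do with it.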
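The reply above does not say which preprocessing was used. As one common possibility, and purely as an assumption, a grayscale, upscale and threshold pass with Pillow might look like the sketch below. The function names, the scale factor of 2 and the threshold of 128 are illustrative choices, not the poster's actual settings.

```python
def threshold_table(threshold=128):
    """Return a 256-entry lookup table mapping gray levels to pure black or white."""
    return [0 if value < threshold else 255 for value in range(256)]


def preprocess(image_path, scale=2, threshold=128):
    """Grayscale, enlarge, and binarize an image before handing it to OCR."""
    from PIL import Image  # Pillow, third-party, imported lazily

    with Image.open(image_path) as img:
        gray = img.convert("L")  # 8-bit grayscale
        enlarged = gray.resize(
            (gray.width * scale, gray.height * scale), Image.LANCZOS
        )
        # Apply the lookup table so every pixel becomes 0 or 255.
        return enlarged.point(threshold_table(threshold))
```

The returned image could then be passed to `pytesseract.image_to_string`. Which steps actually help depends on the input, so the parameters are worth tuning per image set.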