docTR is a seamless, high-performing and accessible library for OCR-related tasks powered by deep learning.
docTR 0.6.0 has been released. It requires TensorFlow >= 2.9.0 or PyTorch >= 1.8.0.
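The version floor above can be checked before installing. The following is a minimal sketch, not part of docTR: `parse_version`, `meets_minimum`, `usable_backends` and the `MINIMUMS` table are hypothetical helpers, and it assumes the frameworks are installed under the PyPI names `tensorflow` and `torch` (variants such as `tensorflow-cpu` would need their own entry).

```python
from importlib import metadata

# Minimum versions stated in the release note; the keys are assumed PyPI package names.
MINIMUMS = {"tensorflow": "2.9.0", "torch": "1.8.0"}


def parse_version(text):
    """Turn '2.9.0' or '1.13.1+cu117' into a tuple of ints such as (2, 9, 0).

    Simplification: local tags ('+cu117') and pre-release suffixes ('rc1')
    are dropped, so a release candidate compares equal to the final release.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad to three components so that '2.9' compares equal to '2.9.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def usable_backends():
    """List the installed frameworks that satisfy the stated minimum versions."""
    found = []
    for package, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            continue  # framework not installed in this environment
        if meets_minimum(installed, minimum):
            found.append(package)
    return found


if __name__ == "__main__":
    print("Backends meeting the docTR 0.6.0 requirement:", usable_backends())
```

An empty list means neither framework is installed at a sufficient version, in which case one of them should be upgraded before installing this release.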
Release highlights:
Full integration with the Hugging Face Hub
- Loading from the Hub:
from doctr.io import DocumentFile
from doctr.models import ocr_predictor, from_hub
image = DocumentFile.from_images(['data/example.jpg'])
# Load a custom detection model from huggingface hub
det_model = from_hub('Felix92/doctr-torch-db-mobilenet-v3-large')
# Load a custom recognition model from huggingface hub
reco_model = from_hub('Felix92/doctr-torch-crnn-mobilenet-v3-large-french')
# You can easily plug these models into the OCR predictor
predictor = ocr_predictor(det_arch=det_model, reco_arch=reco_model)
result = predictor(image)
- Pushing to the Hub:
from doctr.models import recognition, login_to_hub, push_to_hf_hub
login_to_hub()
my_awesome_model = recognition.crnn_mobilenet_v3_large(pretrained=True)
push_to_hf_hub(my_awesome_model, model_name='doctr-crnn-mobilenet-v3-large-french-v1', task='recognition', arch='crnn_mobilenet_v3_large')
Documentation: https://mindee.github.io/doctr/using_doctr/sharing_models.html
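The loading example stops at `result = predictor(image)`. As a rough sketch of what can follow, the snippet below walks the dictionary form of a result and collects the recognized words. It assumes that `result.export()` returns nested `pages`, `blocks`, `lines` and `words` entries, with each word carrying `value` and `confidence` keys. `collect_words` is a hypothetical helper, and `sample_export` is a hand-written stand-in, not real model output.

```python
def collect_words(exported, min_confidence=0.0):
    """Return the recognized words of an exported result, page by page.

    `exported` is assumed to follow the pages -> blocks -> lines -> words
    nesting; words below `min_confidence` are skipped.
    """
    pages = []
    for page in exported.get("pages", []):
        words = []
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                for word in line.get("words", []):
                    if word.get("confidence", 0.0) >= min_confidence:
                        words.append(word["value"])
        pages.append(words)
    return pages


# Hand-written stand-in for `result.export()`, used only to illustrate the layout.
sample_export = {
    "pages": [
        {
            "page_idx": 0,
            "blocks": [
                {
                    "lines": [
                        {
                            "words": [
                                {"value": "Bonjour", "confidence": 0.98},
                                {"value": "monde", "confidence": 0.42},
                            ]
                        }
                    ]
                }
            ],
        }
    ]
}

if __name__ == "__main__":
    print(collect_words(sample_export))
    print(collect_words(sample_export, min_confidence=0.5))
```

With a real predictor output, the call would be `collect_words(result.export())`. Raising `min_confidence` is a simple way to drop uncertain words before further processing.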
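New models (both frameworks)
- Classification: VisionTransformer (ViT)
- Recognition: Vision Transformer for Scene Text Recognition (ViTSTR)
Bug fixes for recognition models
- The MASTER and SAR architectures now run in both frameworks (TensorFlow and PyTorch)
Release announcement: https://github.com/mindee/doctr/releases/tag/v0.6.0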