For each cleaned entity image, we use a pre-trained PaddleOCR model to extract any text it contains and keep only the detections whose confidence exceeds 80%. We then encode the retained text with a pre-trained multilingual BERT model and take the output of its last layer as the OCR encoding. Finally, the OCR encoding is passed through a trainable feedforward neural network to produce the final OCR embedding.
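The pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `keep_confident` and `build_ocr_embedder`, the checkpoint `bert-base-multilingual-cased`, the layer sizes, the frozen BERT, and the use of the `[CLS]` position as the pooled OCR encoding are all choices made here for illustration. The OCR detections are assumed to have already been converted to `(text, confidence)` pairs, since the raw PaddleOCR output format differs between versions.

```python
from typing import List, Sequence, Tuple

Detection = Tuple[str, float]  # (recognized text, confidence in [0, 1])


def keep_confident(detections: Sequence[Detection],
                   threshold: float = 0.8) -> List[str]:
    """Keep only texts whose confidence is strictly above the threshold."""
    return [text for text, conf in detections if conf > threshold]


def build_ocr_embedder(hidden_dim: int = 256, out_dim: int = 128,
                       model_name: str = "bert-base-multilingual-cased"):
    """Return (embed, ffn): an embedding function and its trainable FFN.

    Heavy dependencies are imported lazily so the filtering step above
    can be used without torch/transformers installed.
    """
    import torch
    from torch import nn
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    bert = AutoModel.from_pretrained(model_name).eval()

    # Feedforward network that maps the OCR encoding to the OCR embedding.
    # The sizes are hypothetical; the source does not specify them.
    ffn = nn.Sequential(
        nn.Linear(bert.config.hidden_size, hidden_dim),
        nn.ReLU(),
        nn.Linear(hidden_dim, out_dim),
    )

    def embed(texts: Sequence[str]) -> "torch.Tensor":
        # Join the retained detections into one string for a single image.
        joined = " ".join(texts)
        inputs = tokenizer(joined, return_tensors="pt",
                           truncation=True, max_length=512)
        with torch.no_grad():  # assumption: BERT stays frozen
            last_hidden = bert(**inputs).last_hidden_state  # (1, seq, hidden)
        # Assumption: use the [CLS] position of the last layer as the encoding.
        ocr_encoding = last_hidden[:, 0]
        return ffn(ocr_encoding)  # (1, out_dim), gradients flow through ffn

    return embed, ffn
```

In use, one would call `keep_confident` on the detections for an image and pass the result to the `embed` function returned by `build_ocr_embedder`; only the parameters of `ffn` would then be updated during training under the assumptions above.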
