Deep Learning for Oracle Bone Inscription Detection: A Comprehensive Study and Optimization

Following the success of deep learning in computer vision, research on detecting and recognizing oracle bone inscriptions has begun to adopt machine learning algorithms to obtain more abstract feature representations. Current deep learning approaches to oracle bone inscription detection are supervised: a deep neural network is trained on a large amount of data annotated with the positions of oracle bone glyphs, and detection is carried out in one or two stages to annotate oracle bone inscriptions automatically. Liu et al. used a deep learning model to detect and recognize specific categories of oracle bone inscriptions [3]. Fujikawa studied the detection and recognition of oracle bone inscriptions using the MobileNet and YOLO models [4]. Meng et al. studied oracle bone inscription recognition with deep learning based on a data augmentation method [5]. Professor Huang Shuangping's team at South China University of Technology combined the R-FCN neural network with the Feature Pyramid Network (FPN) for oracle bone inscription detection [6]. Lin Meng of Ritsumeikan University in Japan established a dataset of 330 oracle bone inscription images and made multiple improvements to SSD to improve detection accuracy [8]. However, handwritten oracle bone inscription detection is a small target detection problem: each inscription to be detected carries very little information, so discriminative features are scarce. Small target detection also requires very accurate prediction of the detection box coordinates, because a small target covers few pixels. For the same offset in the predicted coordinates, the loss in IoU is therefore larger for a small target than for an ordinary one.
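The sensitivity of IoU to coordinate offsets can be illustrated with a small calculation. The sketch below is only an illustration under assumed values: the box sizes (20 and 200 pixels) and the 4-pixel offset are hypothetical and are not taken from our dataset, and the helper names `iou` and `shifted_iou` are introduced here for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so that non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted_iou(side, offset):
    """IoU between a square ground-truth box and the same box shifted
    by `offset` pixels in both x and y (hypothetical illustration)."""
    gt = (0.0, 0.0, float(side), float(side))
    pred = (float(offset), float(offset), float(side + offset), float(side + offset))
    return iou(gt, pred)


if __name__ == "__main__":
    for side in (20, 200):  # assumed box sizes in pixels
        print(side, round(shifted_iou(side, 4), 3))
    # 20 0.471
    # 200 0.924
```

With the same 4-pixel offset, the 20-pixel box falls below the commonly used IoU threshold of 0.5, while the 200-pixel box remains above 0.9.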
In addition, the documents contain many more ordinary handwritten characters (negative samples) than oracle bone inscriptions (positive samples), so the network cannot learn the features of the positive samples well. Moreover, most oracle bone inscriptions appear in handwritten documents, so font styles differ considerably between documents and some handwritten characters resemble oracle bone inscriptions, both of which reduce detection accuracy. To address these problems, we first created a dataset containing 3982 oracle bone inscription images. Using this dataset, we studied the detection performance of three of the most representative general object detection frameworks of recent years: Sparse R-CNN [9], DAMO YOLO [10], and YOLO v8. Then, based on the YOLO v8 framework, we optimized the detection algorithm using sliding window cropping and recognition assistance. We manually expanded the dataset, both to increase its overall size and to add training data for difficult-to-detect characters, with the aim of improving the model's ability to detect them. We also improved the reliability of the dataset through secondary labeling. To reduce the misclassification of characters that resemble oracle bone inscriptions, we used an oracle bone inscription character recognition model to assist detection by filtering out incorrect predictions. Experimental results show that our method improves the accuracy of the model in detecting oracle bone inscriptions and has good robustness and generalization ability.
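To make the idea of sliding window cropping concrete, the following is a minimal sketch of how a large page image can be divided into overlapping windows and how boxes predicted inside a window can be mapped back to page coordinates. It is not the implementation used in our experiments: the window size of 640 pixels, the overlap of 128 pixels, and the function names are assumptions chosen for illustration.

```python
def window_starts(length, window, stride):
    """Start offsets of windows of size `window` that cover [0, length)."""
    if length <= window:
        return [0]
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] + window < length:
        # Add a final window flush with the border so no pixels are skipped.
        starts.append(length - window)
    return starts


def sliding_windows(width, height, window=640, overlap=128):
    """Return crop rectangles (x1, y1, x2, y2) covering the whole image.

    `window` and `overlap` are assumed values, not taken from the paper.
    """
    if overlap >= window:
        raise ValueError("overlap must be smaller than the window size")
    stride = window - overlap
    return [
        (x, y, min(x + window, width), min(y + window, height))
        for y in window_starts(height, window, stride)
        for x in window_starts(width, window, stride)
    ]


def to_full_image(box, tile):
    """Shift a box predicted in tile coordinates back to page coordinates."""
    x1, y1, x2, y2 = box
    ox, oy = tile[0], tile[1]
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy)


if __name__ == "__main__":
    tiles = sliding_windows(1000, 640)
    print(tiles)  # [(0, 0, 640, 640), (360, 0, 1000, 640)]
    print(to_full_image((10, 20, 30, 40), tiles[1]))  # (370, 20, 390, 40)
```

Because adjacent windows overlap, a glyph in the overlap region may be detected twice; such duplicates would typically be merged afterwards, for example with non-maximum suppression over the page-level boxes.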