RTMOSeg: A Transformer-Based Railway Track Multi-Objective Segmentation Network

This paper proposes a Transformer-based network for multi-objective segmentation of railway tracks. First, we construct two railway track component datasets (dataset #1 and dataset #2). We then introduce an end-to-end segmentation network, RTMOSeg, which is trained and tested on these datasets. RTMOSeg consists of an encoder and a decoder. The encoder employs a CNN and a Transformer to extract both local and global feature information from the input images. The decoder incorporates a multi-scale feature fusion module that combines information from multiple layers to improve segmentation performance. Extensive experiments on the constructed datasets demonstrate that RTMOSeg segments multiple target components in railway track images effectively and accurately. The network offers improved methods and ideas for practical railway track multi-target segmentation and detection, and thereby supports more intelligent segmentation and subsequent defect detection. In future work, we will first expand the railway track multi-target image dataset. We will also further optimize the inference speed of the proposed segmentation model to improve its efficiency.
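
To make the idea of multi-scale feature fusion concrete, the following is a minimal sketch in plain Python. It is not the RTMOSeg implementation: the text above does not specify the fusion operator, so the sketch assumes nearest-neighbour upsampling to the finest resolution followed by per-pixel concatenation across scales. The names `upsample_nearest` and `fuse_multiscale` are hypothetical, and each feature map is simplified to a single channel.

```python
from typing import List

FeatureMap = List[List[float]]  # a single-channel H x W feature map


def upsample_nearest(fmap: FeatureMap, factor: int) -> FeatureMap:
    """Repeat every value `factor` times along both spatial axes."""
    out: FeatureMap = []
    for row in fmap:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_multiscale(features: List[FeatureMap]) -> List[List[List[float]]]:
    """Fuse feature maps from several encoder layers (assumed scheme).

    `features[0]` is the finest map; every other map must be smaller by
    an integer factor. Each map is upsampled to the finest resolution,
    then the values at each pixel are concatenated across scales.
    """
    target_h = len(features[0])
    target_w = len(features[0][0])

    resized = []
    for fmap in features:
        factor, remainder = divmod(target_h, len(fmap))
        if remainder or factor * len(fmap[0]) != target_w:
            raise ValueError("each map must divide the finest resolution evenly")
        resized.append(upsample_nearest(fmap, factor))

    # fused[y][x] holds one value per scale for pixel (y, x).
    return [
        [[r[y][x] for r in resized] for x in range(target_w)]
        for y in range(target_h)
    ]


if __name__ == "__main__":
    fine = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    mid = [[10, 20], [30, 40]]
    coarse = [[100]]
    fused = fuse_multiscale([fine, mid, coarse])
    print(fused[1][2])  # per-scale values at pixel (1, 2)
```

In a real network the concatenated values would typically pass through a learned layer before the segmentation head; that step is omitted here because the text does not describe it.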