Viewpoint-Guided Multi-View Prototype Network for 3D Shape Recognition

3D shape recognition is a core task in computer vision: it supports the understanding and analysis of 3D environments by segmenting, recognizing, and localizing objects within 3D scenes. The field focuses on extracting features from 3D data for classification, recognition, and segmentation. Traditional methods relied on manually designed features such as geometric shape descriptors and surface textures; they struggled with complex shapes, were sensitive to noise and deformation, and required considerable manual effort.

The advent of deep learning, fueled by advances in computer graphics, computer vision, and machine learning, has transformed 3D shape recognition. Neural networks learn features automatically, adapt to complex 3D shapes, and extract robust feature representations.

Our Research

This thesis reviews existing deep learning methods for 3D shape analysis and proposes combining them with additional deep learning techniques to improve multi-view 3D shape classification. We introduce a novel viewpoint-guided multi-view prototype network that addresses two key limitations of traditional approaches:

Addressing Viewpoint Relationships: Traditional methods often neglect the intrinsic relationships between different view images. Our model addresses this with a feature scoring unit based on attention weights. The unit estimates how much 3D shape information each view carries and assigns weights accordingly. It thereby amplifies informative viewpoints and reduces the influence of irrelevant information on the aggregated features, which yields a more accurate and stable representation of each sample.
Prototype-Based Viewpoint Weight Guidance: Inspired by prototype networks, known for their use with attention weights and in small-sample (few-shot) learning, we propose a novel viewpoint weight guidance algorithm. We establish a prototype network framework and feed it the original feature representations for learning. This process computes and stores a feature center and a viewpoint weight library for each class. Multiple loss functions with different effects then optimize the feature distribution within the mapping space. Finally, classification is performed by computing the distance between the query sample and the prototype representation of each class.
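
The two ideas above can be illustrated with a minimal sketch in plain Python. It is not the thesis implementation, and it makes several assumptions the text does not state. The scoring unit is reduced to a dot product with a hypothetical learned vector (scoring_vector) followed by a softmax. Each class prototype is taken to be the mean of that class's shape descriptors. The distance is taken to be squared Euclidean. The viewpoint weight library and the loss functions are omitted. All function names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_views(view_features, scoring_vector):
    """Attention-weighted pooling of per-view feature vectors.

    view_features: list of V feature vectors, one per rendered view.
    scoring_vector: hypothetical learned vector (an assumption); its dot
    product with a view feature stands in for the feature scoring unit.
    Returns the pooled shape descriptor and the per-view weights.
    """
    scores = [sum(f * w for f, w in zip(feat, scoring_vector))
              for feat in view_features]
    weights = softmax(scores)
    dim = len(view_features[0])
    pooled = [sum(w * feat[d] for w, feat in zip(weights, view_features))
              for d in range(dim)]
    return pooled, weights


def class_prototypes(features, labels):
    """Assumed prototype definition: the mean descriptor of each class."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(feat))
        for d, value in enumerate(feat):
            acc[d] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}


def classify(query, prototypes):
    """Return the label of the nearest prototype (squared Euclidean, assumed)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda label: sq_dist(query, prototypes[label]))


# Toy usage with made-up 2-D features.
views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = aggregate_views(views, scoring_vector=[2.0, 0.0])

protos = class_prototypes(
    [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]],
    ["chair", "chair", "table", "table"],
)
predicted = classify([1.0, 1.0], protos)
```

In this toy run the first and third views score highest and therefore receive the largest weights, and the query is assigned to the class whose mean descriptor lies closest. In a trained model the scoring parameters and the feature extractor would be learned jointly rather than fixed by hand.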

Experimental Results and Conclusion

We conducted a range of experiments and compared our approach with other advanced methods on the ModelNet10 and ModelNet40 datasets. The results show that the proposed method achieves higher classification accuracy than the compared methods on both datasets.

This research advances 3D shape recognition by introducing a novel viewpoint-guided multi-view prototype network. The approach combines deep learning with attention mechanisms to improve the accuracy and robustness of 3D shape classification from multiple views.
