An Optimized Credit Decision-Making Model for SMEs in Xinjiang Under Inclusive Finance

Inclusive finance grew out of small-scale credit and microfinance. It was introduced to address the imbalanced and insufficient provision of financial services that accompanied China's rapid economic development. This study focuses on small and medium-sized enterprises (SMEs), an important customer group in inclusive finance. SMEs often face constraints on funding and resources, yet they play a crucial role in social and economic development. They therefore need a credit decision-making method that keeps pace with market demand and technological change without sacrificing the profitability or risk management of financial institutions. All data used in this study were collected from SMEs in Xinjiang, a region whose geography, cultural diversity, policy environment, and industrial structure make the credit problems of SMEs under inclusive finance distinctive.

The main research work and innovations can be summarized in the following four aspects:

(1) An optimized default risk assessment index system is proposed for credit decision-making for SMEs in Xinjiang. Existing index systems discriminate poorly between defaulting and non-defaulting SMEs, and their rating indicators are often redundant or overlapping. This study proposes a dynamic feature selection method that combines multiple selection methods to identify both the individual indicators with the strongest default discrimination and the combination of indicators that best identifies default overall, yielding a credit scoring feature set with optimal default discrimination ability. The result is a feature selection framework for SME credit risk evaluation that banks and other financial institutions can apply. It is validated on real data and offers insights for improving model interpretability.
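
As an illustration of the two goals named above (strong individual indicators, little redundancy among them), the following is a minimal sketch of a generic rank-then-filter screen. It is not the dynamic multi-method procedure of this study, whose details are not given here. It assumes that individual discrimination is measured by single-indicator AUC and redundancy by Pearson correlation; the indicator names, the toy values, and the `max_corr` cutoff are hypothetical.

```python
from math import sqrt

def auc(scores, labels):
    """Chance that a random defaulter scores above a random non-defaulter (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def select_features(columns, labels, max_corr=0.8):
    """columns maps indicator name -> values. Returns kept names, strongest first."""
    # Individual discriminative power: distance of the single-indicator AUC from 0.5.
    power = {name: abs(auc(vals, labels) - 0.5) for name, vals in columns.items()}
    ranked = sorted(columns, key=lambda name: power[name], reverse=True)
    selected = []
    for name in ranked:
        # Keep an indicator only if it is not redundant with one already kept.
        if all(abs(pearson(columns[name], columns[s])) <= max_corr for s in selected):
            selected.append(name)
    return selected

# Hypothetical toy data: 1 = default, 0 = non-default.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "debt_ratio": [0.2, 0.3, 0.25, 0.7, 0.8, 0.9],
    "debt_ratio_copy": [0.21, 0.31, 0.24, 0.71, 0.79, 0.92],
    "staff_count": [10, 50, 30, 40, 20, 35],
}
selected = select_features(columns, labels)
print(selected)
```

On this toy data the near-duplicate indicator is dropped because it is almost perfectly correlated with one that is already kept, while the weaker but non-redundant indicator survives.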

(2) Building on the optimized index system and with interpretability in mind, a suitable default prediction model is constructed. Existing default prediction models often fail to address the class imbalance in credit default datasets. To overcome this problem, this study proposes CT-XGBoost, a novel prediction model that improves XGBoost by combining cost-sensitive learning with a threshold method. The experimental data come from the loan default database of a bank in Xinjiang and cover 2017 to 2021. In the experiments, CT-XGBoost achieves an average AUC of 96.38%, outperforming the other default prediction models compared (AUC values from 90.35% to 95.44%).
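
The two ingredients named above, cost-sensitive weighting and threshold adjustment, can be sketched independently of any particular classifier. The example below is a minimal sketch, not the CT-XGBoost implementation: it assumes the minority-class weight is the ratio of non-default to default samples (the kind of value XGBoost's `scale_pos_weight` parameter accepts) and that the decision threshold is chosen to minimize misclassification cost on held-out scores. The cost values, scores, and function names are hypothetical.

```python
def positive_class_weight(labels):
    """Ratio of non-default to default samples, used to up-weight the minority class."""
    pos = sum(labels)
    neg = len(labels) - pos
    return neg / pos

def total_cost(scores, labels, threshold, cost_fn, cost_fp):
    """Misclassification cost when score >= threshold is predicted as default."""
    cost = 0.0
    for s, y in zip(scores, labels):
        predicted_default = s >= threshold
        if y == 1 and not predicted_default:
            cost += cost_fn  # missed default (false negative)
        elif y == 0 and predicted_default:
            cost += cost_fp  # good borrower rejected (false positive)
    return cost

def best_threshold(scores, labels, cost_fn=5.0, cost_fp=1.0):
    """Try every observed score (plus 'reject nobody') and keep the cheapest cut-off."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates, key=lambda t: total_cost(scores, labels, t, cost_fn, cost_fp))

# Hypothetical validation set: 8 non-defaults, 2 defaults, with predicted default scores.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.45, 0.60, 0.40, 0.80]

weight = positive_class_weight(labels)
threshold = best_threshold(scores, labels)
print(weight, threshold, total_cost(scores, labels, 0.5, 5.0, 1.0))
```

With a missed default assumed to cost five times as much as a rejected good borrower, the cheapest cut-off on this toy data falls below the conventional 0.5, which is the usual effect of threshold adjustment on imbalanced default data.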

(3) To further address class imbalance in the context of inclusive finance, this study proposes an SME credit rating model based on anomaly detection, an unsupervised learning approach that is less commonly used in credit assessment. The system comprises data preprocessing, feature extraction, and construction of an enterprise credit risk classification model. After feature ranking, the normal samples are augmented using SMOTE, and a denoising autoencoder (DAE) is used for feature extraction; enterprises are then classified by reconstruction error, which improves the robustness of the model. A multi-level credit safety risk rating system is established using a three-dimensional index to assess enterprise risk, and warning thresholds are introduced for early warning analysis. The model's predictions are reviewed and corrected by an expert group to ensure the accuracy and reliability of the system.
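
The step from reconstruction error to a risk level can be sketched as follows. This is a minimal illustration under stated assumptions, not the rating system of this study: the autoencoder's reconstructions are taken as given, the error is a mean squared error, the two warning thresholds are set from quantiles of the errors on normal enterprises, and a single error score stands in for the three-dimensional index. All names, quantile choices, and values are hypothetical.

```python
import math

def reconstruction_error(x, x_hat):
    """Mean squared difference between a feature vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def quantile(values, q):
    """Nearest-rank quantile of a non-empty list, for 0 < q <= 1."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def risk_level(error, warn, alarm):
    """Map a reconstruction error to a three-level rating using two thresholds."""
    if error >= alarm:
        return "high"
    if error >= warn:
        return "medium"
    return "low"

# Hypothetical reconstruction errors observed on normal (non-default) enterprises.
normal_errors = [0.01, 0.02, 0.02, 0.03, 0.03, 0.04, 0.05, 0.05, 0.06, 0.10]
warn = quantile(normal_errors, 0.8)   # errors at or above this trigger an early warning
alarm = quantile(normal_errors, 1.0)  # errors at or above the normal maximum count as anomalous

# A hypothetical enterprise and three possible reconstructions of its features.
x = [1.0, 0.0, 0.5]
levels = [
    risk_level(reconstruction_error(x, [0.9, 0.1, 0.5]), warn, alarm),
    risk_level(reconstruction_error(x, [0.5, 0.0, 0.5]), warn, alarm),
    risk_level(reconstruction_error(x, [0.0, 0.0, 0.0]), warn, alarm),
]
print(warn, alarm, levels)
```

The design choice illustrated here is that the thresholds are derived only from normal samples, so no default labels are needed to assign a rating, which is what makes the approach usable when defaults are rare.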

(4) To further improve discriminative accuracy and broaden the data sources, this study introduces credit investigation reports as unstructured text data for credit default prediction and examines how much they improve prediction performance. The data consist of enterprise credit information from a rural credit cooperative in Xinjiang, China. The text is analyzed from two perspectives, text attributes and text topics, to extract relevant information, and logistic regression, support vector machines, and neural networks are used for prediction.

In summary, this study provides practical experience in credit decision-making for SMEs, improves model interpretability by grounding the models in an actual index system, and offers financial institutions more accurate and reliable credit decision models for future application. It also outlines directions for the future development of credit evaluation and contributes to the advancement of inclusive finance.