With the continuous advancement of technology, artificial intelligence (AI) is increasingly applied to complex real-world problems. In multi-agent systems, collision avoidance between agents is an essential task. However, traditional multi-agent collision avoidance decision-making methods suffer from strong dependence on the environment, on agent models, and on computational resources, as well as poor scalability, making them ill-suited to complex application scenarios. To address these issues, this study proposes a multi-agent collision avoidance decision-making method based on deep reinforcement learning (DRL). The method combines experimental scenario design with reward function design to improve the reward mechanism of DRL algorithms; by maximizing cumulative reward, the policy is optimized, yielding more stable training, faster convergence, and higher success rates in multi-agent autonomous collision avoidance.

The contributions and innovations of this research are as follows: (1) Chapter 3 proposes a multi-agent collision avoidance decision-making method based on the traditional (single-agent) reinforcement learning algorithm Proximal Policy Optimization (PPO2). The method constructs a partially observable Markov environment suited to such algorithms and models the multi-agent collision avoidance problem through careful experimental scenario and reward function design. Experimental results show that, compared with the Soft Actor-Critic (SAC) and Deep Deterministic Policy Gradient (DDPG) algorithms, the PPO2-based method better achieves autonomous collision avoidance between agents, with a higher success rate in the experimental scenario of Chapter 3.
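The training stability attributed to PPO2 above comes from its clipped surrogate objective. The following is a minimal plain-Python sketch of that objective for illustration only; the function name and inputs are hypothetical, and the thesis's actual implementation is not given in the abstract:

```python
def ppo_clip_loss(ratios, advantages, eps=0.2):
    """Clipped surrogate loss of PPO (minimal sketch).

    ratios:     pi_new(a|s) / pi_old(a|s) for each sampled transition
    advantages: advantage estimates for the same transitions
    eps:        clipping range, typically 0.1-0.3
    """
    losses = []
    for r, adv in zip(ratios, advantages):
        unclipped = r * adv
        # clamp the probability ratio to [1 - eps, 1 + eps]
        clipped = max(min(r, 1.0 + eps), 1.0 - eps) * adv
        # taking the minimum keeps the policy update conservative
        losses.append(-min(unclipped, clipped))
    return sum(losses) / len(losses)
```

For example, with `ratios=[1.5]` and `advantages=[1.0]`, the ratio is clipped to 1.2 and the loss is -1.2; this clipping prevents excessively large policy updates, which is what makes PPO-style training comparatively stable.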
(2) Chapter 4 introduces the challenges that traditional reinforcement learning algorithms face when solving the multi-agent collision avoidance decision-making problem in complex real-world environments, including dimension explosion, high computational complexity, and a drop in collision avoidance success rate. To address these issues, this study proposes a multi-agent collision avoidance decision-making method based on the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. Drawing on ideas from multi-agent reinforcement learning (MARL), the method adopts the centralized training with decentralized execution (CTDE) framework, and a reward function is designed to better suit the collision avoidance problem. In a simulation environment, a three-dimensional experimental scenario with an increasing number of agents and multiple random entry points is designed; the proposed MADDPG-based method is trained there and compared with the PPO2, SAC, and DDPG algorithms. Experimental results show that the MADDPG-based method exhibits good collaborative ability, stable algorithm performance, and a higher collision avoidance success rate, highlighting the superiority of multi-agent deep reinforcement learning in multi-agent collision avoidance decision-making problems.
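The abstract does not state the exact reward function used with MADDPG. A reward shaped for collision avoidance typically rewards progress toward a goal and penalizes entering another agent's safety zone; the sketch below illustrates that general shape only, with all weights, radii, and the function name being hypothetical rather than taken from the thesis:

```python
import math

def collision_avoidance_reward(pos, goal, others, safe_radius=1.0,
                               goal_bonus=10.0, collision_penalty=-10.0):
    """Hypothetical shaped reward for one agent in a 3D scenario.

    pos, goal: 3-tuples of coordinates for this agent and its goal
    others:    positions of all other agents
    """
    dist_goal = math.dist(pos, goal)
    reward = -0.1 * dist_goal           # dense shaping: closer to goal is better
    if dist_goal < 0.5:
        reward += goal_bonus            # sparse bonus for reaching the goal
    for other in others:
        if math.dist(pos, other) < safe_radius:
            reward += collision_penalty  # penalize violating a safety zone
    return reward
```

Under CTDE, each agent would receive such a reward while its critic is trained on the joint observations and actions of all agents; at execution time each actor acts on its own observation only.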
