A practical benchmark system for deep learning protein-ligand docking models should address four key concerns. First, the models must be adapted to realistic molecular docking tasks, such as binding pose prediction and virtual screening. Second, a large amount of data is required to train and evaluate deep learning models. Third, models require well-defined input features extracted from the raw structural data. Fourth, the models must generalize to samples drawn from distributions different from the training data.
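To make the third concern concrete, a toy sketch of structure-based feature extraction is shown below. It computes a single contact-count feature from protein and ligand atom coordinates; the `contact_features` function, the 4 Å cutoff, and the hard-coded coordinates are illustrative assumptions, not part of any specific benchmark. A real pipeline would parse coordinates from PDB and SDF files and produce far richer descriptors.

```python
import math

def contact_features(protein_atoms, ligand_atoms, cutoff=4.0):
    """Count protein-ligand atom pairs within `cutoff` angstroms.

    `protein_atoms` and `ligand_atoms` are lists of (x, y, z) tuples.
    This is a deliberately simple scalar feature; real docking models
    use many such geometric and chemical descriptors as inputs.
    """
    contacts = 0
    for p in protein_atoms:
        for l in ligand_atoms:
            if math.dist(p, l) <= cutoff:
                contacts += 1
    return contacts

# Toy coordinates: two protein atoms and two ligand atoms.
protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand = [(1.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(contact_features(protein, ligand))  # prints 1: only one pair lies within 4 A
```

Even this trivial example shows why feature extraction is a barrier for modelers without domain knowledge: choosing a meaningful cutoff, handling atom types, and parsing structural file formats all require expertise that a benchmark can provide out of the box.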
However, existing datasets and benchmarks are not tailored to deep learning methods and fail to satisfy all of these concerns. Most benchmark datasets contain too few samples to train large-scale deep learning models. Extracting features from existing datasets is also challenging for modelers without domain expertise. Moreover, these datasets may carry inherent flaws, such as data bias and simplistic evaluation criteria, that make them unsuitable for comprehensively and objectively evaluating deep learning protein-ligand docking models.
The PDBbind dataset was the first to systematically annotate protein-ligand complexes in the Protein Data Bank (PDB) with experimental binding data. The Comparative Assessment of Scoring Functions (CASF) benchmark builds on PDBbind to compare scoring functions, but it contains only 285 test complexes, lacks the input features deep learning models require, and does not evaluate generalization performance.
Therefore, there is a need for high-quality datasets that can be used to train and evaluate deep learning protein-ligand docking algorithms.