Please use this identifier to cite or link to this item:
http://hdl.handle.net/20.500.11861/7483
Title: | Low-Quality Structural and Interaction Data Improves Binding Affinity Prediction via Random Forest |
Authors: | Li, Hongjian; Prof. LEUNG Kwong Sak; Wong, Man-Hon; Ballester, Pedro J. |
Issue Date: | 2015 |
Source: | Molecules 2015, 20(6), 10947-10962 |
Journal: | Molecules |
Abstract: | Docking scoring functions can be used to predict the strength of protein-ligand binding. It is widely believed that training a scoring function with low-quality data is detrimental for its predictive performance. Nevertheless, there is a surprising lack of systematic validation experiments in support of this hypothesis. In this study, we investigated to which extent training a scoring function with data containing low-quality structural and binding data is detrimental for predictive performance. We actually found that low-quality data is not only non-detrimental, but beneficial for the predictive performance of machine-learning scoring functions, though the improvement is less important than that coming from high-quality data. Furthermore, we observed that classical scoring functions are not able to effectively exploit data beyond an early threshold, regardless of its quality. This demonstrates that exploiting a larger data volume is more important for the performance of machine-learning scoring functions than restricting to a smaller set of higher data quality. |
Type: | Peer Reviewed Journal Article |
URI: | http://hdl.handle.net/20.500.11861/7483 |
DOI: | 10.3390/molecules200610947 |
Appears in Collections: | Applied Data Science - Publication |
Scopus citations: | 78 (checked on Dec 15, 2024) |