Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11861/7544
Title: Discovering protein-DNA binding sequence patterns using association rule mining
Authors: Prof. LEUNG Kwong Sak 
Wong, Ka-Chun 
Chan, Tak-Ming 
Wong, Man-Hon 
Lee, Kin-Hong 
Lau, Chi-Kong 
Tsui, Stephen K.W. 
Issue Date: 2010
Publisher: Oxford University Press
Source: Nucleic Acids Research. 2010, vol. 38 (19), pp. 6324-6337
Journal: Nucleic Acids Research 
Abstract: Protein-DNA bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) play an essential role in transcriptional regulation. Over the past decades, significant efforts have been made to study the principles of protein-DNA binding. However, it is considered that there are no simple one-to-one rules between amino acids and nucleotides, and many methods impose complicated features beyond sequence patterns. Protein-DNA bindings are formed from associated amino acid and nucleotide sequence pairs, which determine many functional characteristics. It is therefore desirable to investigate associated sequence patterns between TFs and TFBSs. With increasing computational power, the availability of massive experimental databases on DNA and proteins, and mature data mining techniques, we propose a framework to discover associated TF-TFBS binding sequence patterns from TRANSFAC in the most explicit and interpretable form. The framework is based on association rule mining with the Apriori algorithm. The patterns found are evaluated by quantitative measurements at several levels on TRANSFAC. With further independent verification from the literature, the Protein Data Bank and homology modeling, there is strong evidence that the patterns discovered reveal real TF-TFBS bindings across different TFs and TFBSs, which can provide further knowledge for better understanding TF-TFBS bindings. © The Author(s) 2010. Published by Oxford University Press.
Type: Peer Reviewed Journal Article
URI: http://hdl.handle.net/20.500.11861/7544
ISSN: 0305-1048
DOI: 10.1093/nar/gkq500
Appears in Collections: Applied Data Science - Publication

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.