Predicting approximate protein-DNA binding cores using association rule mining

Wong, Po-Yuen; Chan, Tak-Ming; Wong, Man-Hon; Prof. LEUNG Kwong Sak

Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11861/7522

DC Field	Value	Language
dc.contributor.author	Wong, Po-Yuen	en_US
dc.contributor.author	Chan, Tak-Ming	en_US
dc.contributor.author	Wong, Man-Hon	en_US
dc.contributor.author	Prof. LEUNG Kwong Sak	en_US
dc.date.accessioned	2023-03-17T04:21:48Z	-
dc.date.available	2023-03-17T04:21:48Z	-
dc.date.issued	2012	-
dc.identifier.citation	Proceedings - International Conference on Data Engineering 6228148, pp. 965-976	en_US
dc.identifier.issn	10844627	-
dc.identifier.uri	http://hdl.handle.net/20.500.11861/7522	-
dc.description.abstract	The studies of protein-DNA bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) are important bioinformatics topics. High-resolution (length[removed]490) are shown promising in identifying accurate binding cores without using any 3D structures. While the current association rule mining method on this problem addresses exact sequences only, the most recent ad hoc method for approximation does not establish any formal model and is limited by experimentally known patterns. As biological mutations are common, it is desirable to formally extend the exact model into an approximate one. In this paper, we formalize the problem of mining approximate protein-DNA association rules from sequence data and propose a novel efficient algorithm to predict protein-DNA binding cores. Our two-phase algorithm first constructs two compact intermediate structures called frequent sequence tree (FS-Tree) and frequent sequence class tree (FSCTree). Approximate association rules are efficiently generated from the structures and bioinformatics concepts (position weight matrix and information content) are further employed to prune meaningless rules. Experimental results on real data show the performance and applicability of the proposed algorithm. © 2012 IEEE.	en_US
dc.language.iso	en	en_US
dc.relation.ispartof	Proceedings - International Conference on Data Engineering	en_US
dc.title	Predicting approximate protein-DNA binding cores using association rule mining	en_US
dc.type	Conference Paper	en_US
dc.identifier.doi	10.1109/ICDE.2012.86	-
item.fulltext	No Fulltext	-
crisitem.author.dept	Department of Applied Data Science	-
Appears in Collections:	Applied Data Science - Publication
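
The abstract mentions that position weight matrices and information content are used to prune meaningless rules. The record does not give the paper's actual pruning criteria, so the following is only a minimal sketch of the two standard concepts. It assumes aligned, equal-length DNA sites and a uniform background distribution. The function names, the `pseudocount` parameter, and the example sites are hypothetical and not taken from the paper.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # DNA alphabet; a uniform background is assumed


def position_weight_matrix(sites, pseudocount=0.0):
    """Return per-position base frequencies for aligned, equal-length sites."""
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        pwm.append({b: (counts.get(b, 0) + pseudocount) / total for b in ALPHABET})
    return pwm


def information_content(pwm):
    """Return the information content (in bits) of each PWM column.

    IC = log2(|alphabet|) - Shannon entropy of the column, so a fully
    conserved DNA position scores 2 bits and a uniform one scores 0 bits.
    """
    max_entropy = math.log2(len(ALPHABET))
    result = []
    for column in pwm:
        entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
        result.append(max_entropy - entropy)
    return result


if __name__ == "__main__":
    # Hypothetical aligned binding sites, for illustration only.
    example_sites = ["ACGT", "ACGA", "ACGC", "ACGG"]
    pwm = position_weight_matrix(example_sites)
    print(information_content(pwm))
```

In a pruning step of this kind, a candidate pattern whose columns carry little information (close to 0 bits) would be a natural one to discard, though the threshold and the exact scoring used by the authors are not stated in this record.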

SCOPUS citations: 12 (checked on May 18, 2025)
Page view(s): 54 (last week: 0; checked on May 19, 2025)
