Discovering Protein-DNA Binding Cores by Aligned Pattern Clustering

Lee, En-Shiun Annie; Sze-To, Ho-Yin Antonio; Wong, Man-Hon; Leung, Kwong-Sak; Lau, Terrence Chi-Kong; Wong, Andrew K. C.

Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11861/7449

DC Field	Value	Language
dc.contributor.author	Lee, En-Shiun Annie	en_US
dc.contributor.author	Sze-To, Ho-Yin Antonio	en_US
dc.contributor.author	Wong, Man-Hon	en_US
dc.contributor.author	Prof. LEUNG Kwong Sak	en_US
dc.contributor.author	Lau, Terrence Chi-Kong	en_US
dc.contributor.author	Wong, Andrew K. C.	en_US
dc.date.accessioned	2023-03-02T07:30:19Z	-
dc.date.available	2023-03-02T07:30:19Z	-
dc.date.issued	2015	-
dc.identifier.citation	IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2015, vol. 14(2), pp. 254-263	en_US
dc.identifier.uri	http://hdl.handle.net/20.500.11861/7449	-
dc.description.abstract	Understanding binding cores is of fundamental importance in deciphering Protein-DNA (TF-TFBS) binding and gene regulation. Limited by expensive experiments, it is promising to discover them with variations directly from sequence data. Although existing computational methods have produced satisfactory results, they are one-to-one mappings with no site-specific information on residue/nucleotide variations, where these variations in binding cores may impact binding specificity. This study presents a new representation for modeling binding cores by incorporating variations and an algorithm to discover them from only sequence data. Our algorithm takes protein and DNA sequences from TRANSFAC (a Protein-DNA Binding Database) as input; discovers from both sets of sequences conserved regions in Aligned Pattern Clusters (APCs); associates them as Protein-DNA Co-Occurring APCs; ranks the Protein-DNA Co-Occurring APCs according to their co-occurrence, and among the top ones, finds three-dimensional structures to support each binding core candidate. If successful, candidates are verified as binding cores. Otherwise, homology modeling is applied to their close matches in PDB to attain new chemically feasible binding cores. Our algorithm obtains binding cores with higher precision and much faster runtime (≥ 1,600x) than that of its contemporaries, discovering candidates that do not co-occur as one-to-one associated patterns in the raw data. Availability: http://www.pami.uwaterloo.ca/~ealee/files/tcbbPnDna2015/Release.zip	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.relation.ispartof	IEEE/ACM Transactions on Computational Biology and Bioinformatics 14(2), pp. 254-263	en_US
dc.title	Discovering Protein-DNA Binding Cores by Aligned Pattern Clustering	en_US
dc.type	Peer Reviewed Journal Article	en_US
dc.identifier.doi	10.1109/TCBB.2015.2474376	-
item.fulltext	No Fulltext	-
crisitem.author.dept	Department of Applied Data Science	-
Appears in Collections:	Applied Data Science - Publication
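
Illustrative note: the abstract describes associating protein and DNA Aligned Pattern Clusters (APCs) and ranking the pairs by co-occurrence. The Python sketch below is only a toy illustration of that ranking idea, not the authors' implementation (which is in the Release.zip cited above). It assumes, hypothetically, that each APC can be reduced to a set of ungapped pattern variants, that a plain substring match stands in for pattern occurrence, and that a raw count of bound sequence pairs stands in for the paper's co-occurrence score. All names and sequences are invented for the example.

```python
from itertools import product


def occurs(apc, sequence):
    """Return True if any pattern variant in the cluster is a substring of the sequence."""
    return any(pattern in sequence for pattern in apc)


def rank_co_occurring(pairs, protein_apcs, dna_apcs):
    """Rank every (protein APC, DNA APC) combination by co-occurrence count.

    pairs        -- list of (protein_sequence, dna_sequence) tuples known to bind
    protein_apcs -- dict mapping a cluster name to a set of protein pattern variants
    dna_apcs     -- dict mapping a cluster name to a set of DNA pattern variants
    """
    scored = []
    for (p_name, p_apc), (d_name, d_apc) in product(protein_apcs.items(), dna_apcs.items()):
        # Count bound pairs in which both clusters have at least one occurrence.
        count = sum(
            1 for protein, dna in pairs
            if occurs(p_apc, protein) and occurs(d_apc, dna)
        )
        scored.append((p_name, d_name, count))
    # Highest co-occurrence first; ties keep their original order.
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Toy data (invented): three bound protein-DNA sequence pairs.
pairs = [
    ("MKRSRGAC", "TTGACGTCA"),
    ("AAKRARGGC", "CCTGACGTA"),
    ("MLLPQQ", "GGGCCC"),
]
protein_apcs = {"P1": {"KRSRG", "KRARG"}, "P2": {"LPQQ"}}
dna_apcs = {"D1": {"TGACG"}, "D2": {"GGCC"}}

ranking = rank_co_occurring(pairs, protein_apcs, dna_apcs)
for p_name, d_name, count in ranking:
    print(p_name, d_name, count)
```

In this toy input the cluster P1 holds two variants that differ at one position (KRSRG and KRARG), which is the kind of site-specific variation a one-to-one pattern mapping would miss; counting at the cluster level lets both variants contribute to the same P1-D1 association.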


SCOPUS citations: 2 (checked on Nov 17, 2024)
Page view(s): 45 (last week: 0; checked on Nov 21, 2024)
