Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11861/7536
DC Field | Value | Language
dc.contributor.author | Chan, Tak-Ming | en_US
dc.contributor.author | Wong, Ka-Chun | en_US
dc.contributor.author | Lee, Kin-Hong | en_US
dc.contributor.author | Wong, Man-Hon | en_US
dc.contributor.author | Lau, Chi-Kong | en_US
dc.contributor.author | Tsui, Stephen Kwok-Wing | en_US
dc.contributor.author | Prof. LEUNG Kwong Sak | en_US
dc.date.accessioned | 2023-03-23T03:01:05Z | -
dc.date.available | 2023-03-23T03:01:05Z | -
dc.date.issued | 2011 | -
dc.identifier.citation | Bioinformatics, 2011, vol. 27(4), pp. 471-478, Article number btq682 | en_US
dc.identifier.issn | 14602059 | -
dc.identifier.uri | http://hdl.handle.net/20.500.11861/7536 | -
dc.description.abstract | Motivation: The bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) are fundamental protein-DNA interactions in transcriptional regulation. Extensive efforts have been made to better understand the protein-DNA interactions. Recent mining on exact TF-TFBS-associated sequence patterns (rules) has shown great potentials and achieved very promising results. However, exact rules cannot handle variations in real data, resulting in limited informative rules. In this article, we generalize the exact rules to approximate ones for both TFs and TFBSs, which are essential for biological variations. Results: A progressive approach is proposed to address the approximation to alleviate the computational requirements. Firstly, similar TFBSs are grouped from the available TF-TFBS data (TRANSFAC database). Secondly, approximate and highly conserved binding cores are discovered from TF sequences corresponding to each TFBS group. A customized algorithm is developed for the specific objective. We discover the approximate TF-TFBS rules by associating the grouped TFBS consensuses and TF cores. The rules discovered are evaluated by matching (verifying with) the actual protein-DNA binding pairs from Protein Data Bank (PDB) 3D structures. The approximate results exhibit many more verified rules and up to 300% better verification ratios than the exact ones. The customized algorithm achieves over 73% better verification ratios than traditional methods. Approximate rules (64-79%) are shown statistically significant. Detailed variation analysis and conservation verification on NCBI records demonstrate that the approximate rules reveal both the flexible and specific protein-DNA interactions accurately. The approximate TF-TFBS rules discovered show great generalized capability of exploring more informative binding rules. © The Author 2010. Published by Oxford University Press. All rights reserved. | en_US
dc.language.iso | en | en_US
dc.relation.ispartof | Bioinformatics | en_US
dc.title | Discovering approximate-associated sequence patterns for protein-DNA interactions | en_US
dc.type | Peer Reviewed Journal Article | en_US
dc.identifier.doi | 10.1093/bioinformatics/btq682 | -
item.fulltext | No Fulltext | -
crisitem.author.dept | Department of Applied Data Science | -
Appears in Collections: Applied Data Science - Publication

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.