Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11861/7551
Title: N-SAMSAM: A simple and faster algorithm for solving approximate matching in DNA sequences
Authors: Ni, Bing 
Wong M.H. 
Prof. LEUNG Kwong Sak 
Issue Date: 2008
Source: 2008 IEEE Congress on Evolutionary Computation, CEC 2008, pp. 2592 - 2598, 2008 , Article number 4631146
Journal: 2008 IEEE Congress on Evolutionary Computation, CEC 2008 
Abstract: This work proposes a novel algorithm to do approximate matching in a database consisting of multiple sequences. We apply Agrep algorithm in an indexing structure, the r-cut numerical substring array (r-NSA). The structure basically indexes all the substrings of length r. The advantage of using the r-NSA is two-fold: (1) The space requirement of the r-NSA is much smaller than that of the other existing indexing structures, such as the generalized suffix tree. (2) We propose an algorithm to apply Agrep in the r-NSA, in which the substrings are processed sequentially. Since the common substrings are processed only once, the cost of our algorithm is smaller than that of the full scanning search by Agrep. Consequently, the matching time of our algorithm is also reduced. We design experiments to validate and compare the performance of our algorithm against the full scanning search by Agrep. We define the speed-up of our algorithm as the time required by the full scanning search by Agrep over that of our algorithm. We use eight sets of real DNA sequences in our experiments, and the results show that our algorithm achieves significant speed-up. We also investigate the speed-up of difference data sets, and analyze their differences in detail. © 2008 IEEE.
Type: Conference Paper
URI: http://hdl.handle.net/20.500.11861/7551
ISBN: 978-142441823-7
DOI: 10.1109/CEC.2008.4631146
Appears in Collections:Publication

Show full item record

SCOPUSTM   
Citations

3
checked on Jan 3, 2024

Page view(s)

7
checked on Jan 3, 2024

Google ScholarTM

Impact Indices

Altmetric

PlumX

Metrics


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.