Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11861/7597
DC Field | Value | Language
dc.contributor.author | Jin, Huidong | en_US
dc.contributor.author | Wong, Man-Leung | en_US
dc.contributor.author | Prof. LEUNG Kwong Sak | en_US
dc.date.accessioned | 2023-03-27T03:15:26Z | -
dc.date.available | 2023-03-27T03:15:26Z | -
dc.date.issued | 2005 | -
dc.identifier.citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, vol. 27 (11), pp. 1710 - 1719 | en_US
dc.identifier.issn | 01628828 | -
dc.identifier.uri | http://hdl.handle.net/20.500.11861/7597 | -
dc.description.abstract | The scalability problem in data mining involves the development of methods for handling large databases with limited computational resources such as memory and computation time. In this paper, two scalable clustering algorithms, bEMADS and gEMADS, are presented based on the Gaussian mixture model. Both summarize data into subclusters and then generate Gaussian mixtures from their data summaries. Their core algorithm, EMADS, is defined on data summaries and approximates the aggregate behavior of each subcluster of data under the Gaussian mixture model. EMADS is provably convergent. Experimental results substantiate that both algorithms can run several orders of magnitude faster than expectation-maximization with little loss of accuracy. © 2005 IEEE. | en_US
dc.language.iso | en | en_US
dc.relation.ispartof | IEEE Transactions on Pattern Analysis and Machine Intelligence | en_US
dc.title | Scalable model-based clustering for large databases based on data summarization | en_US
dc.type | Peer Reviewed Journal Article | en_US
dc.identifier.doi | 10.1109/TPAMI.2005.226 | -
item.fulltext | No Fulltext | -
crisitem.author.dept | Department of Applied Data Science | -
Appears in Collections: Publication

Scopus citations: 35 (checked on Jan 3, 2024)
Page views: 14 (checked on Jan 3, 2024)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.