Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11861/9019
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Dr. YUEN Man-Ching, Connie | en_US |
| dc.contributor.author | King Irwin | en_US |
| dc.contributor.author | Prof. LEUNG Kwong Sak | en_US |
| dc.date.accessioned | 2024-03-14T01:58:18Z | - |
| dc.date.available | 2024-03-14T01:58:18Z | - |
| dc.date.issued | 2016 | - |
| dc.identifier.citation | Big Data Analytics, 2016, vol. 1, article no. 14. | en_US |
| dc.identifier.issn | 2058-6345 | - |
| dc.identifier.uri | http://hdl.handle.net/20.500.11861/9019 | - |
| dc.description.abstract | Background: To ensure output quality, current crowdsourcing systems rely heavily on redundant answers from multiple workers with varying expertise; however, massive redundancy is expensive and time-consuming. Task recommendation can help requesters receive good-quality output sooner and help workers find suitable tasks faster. To reduce cost, a number of previous works adopted active learning in crowdsourcing systems for quality assurance. Active learning is a learning approach that aims to reach a given accuracy at very low cost. However, previous works do not consider the varying expertise of workers across task categories in real crowdsourcing scenarios, nor do they consider new workers who are unwilling to work on a large number of tasks before receiving a list of recommended tasks. In this paper, we propose ActivePMFv2, Probabilistic Matrix Factorization with Active Learning (version 2), on a task recommendation framework called TaskRec to recommend tasks to workers in crowdsourcing systems for quality assurance. By assigning the most uncertain task for new workers to work on, this paper identifies a flaw in our previous ActivePMFv1, Probabilistic Matrix Factorization with Active Learning (version 1). As a result, ActivePMFv2 can give new workers a list of recommended tasks faster than ActivePMFv1 can. Our factor analysis model considers not only worker task selection preference but also worker performance history. It actively selects the most uncertain task for the most reliable workers to work on in order to retrain the classification model. Moreover, we propose a generic online-updating method for learning the ActivePMFv2 model. The larger the profile of a worker (or task), the less important it is to retrain that profile after each new piece of work. When a worker (or task) has a large profile, our online-updating algorithm retrains the whole feature vector of the worker (or task) and keeps all other entries in the matrix fixed. Our online-updating algorithm applies updates in batches to reduce the running time of model updates. Results: Complexity analysis shows that our model is efficient and scalable to large datasets. In experiments on real-world datasets, the MAE and RMSE of the proposed ActivePMFv2 improve by up to 29 % and 35 %, respectively, compared with ActivePMFv1, which in turn significantly outperforms PMF with other active learning approaches, as shown in previous work. Experimental results show that our online-updating algorithm closely approximates a full retrain of the learning model, while the average runtime of a model update per piece of work done is reduced by more than 80 % (from a few minutes to several seconds). Conclusions: To the best of our knowledge, we are the first to use PMF, active learning and dynamic model update to recommend tasks for quality assurance in crowdsourcing systems for real scenarios. | en_US |
| dc.language.iso | en | en_US |
| dc.relation.ispartof | Big Data Analytics | en_US |
| dc.title | An online-updating algorithm on probabilistic matrix factorization with active learning for task recommendation in crowdsourcing systems | en_US |
| dc.type | Peer Reviewed Journal Article | en_US |
| dc.identifier.doi | https://doi.org/10.1186/s41044-016-0012-2 | - |
| item.fulltext | No Fulltext | - |
| crisitem.author.dept | Department of Applied Data Science | - |
| crisitem.author.dept | Department of Applied Data Science | - |
Appears in Collections: Publication
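
The abstract describes an online update that retrains one worker's (or task's) feature vector while keeping all other entries of the factor matrices fixed, and it reports accuracy as MAE and RMSE. The record does not include the paper's update equations, so the following is only a minimal sketch under stated assumptions: plain gradient descent on a regularized squared error, with hypothetical names (`retrain_worker`, `task_vecs`, `lam`, `lr`, `epochs`) that do not come from the paper.

```python
import math


def predict(worker_vec, task_vec):
    """Predicted score: inner product of the two latent feature vectors."""
    return sum(a * b for a, b in zip(worker_vec, task_vec))


def retrain_worker(worker_vec, task_vecs, ratings, lam=0.1, lr=0.05, epochs=200):
    """Refit one worker's feature vector; every task vector stays fixed.

    ratings maps task id -> observed score for this worker.
    Assumption: gradient descent on squared error with an L2 penalty `lam`;
    the paper's actual update rule may differ.
    """
    u = list(worker_vec)  # copy so the caller's vector is not mutated
    for _ in range(epochs):
        grad = [lam * x for x in u]  # gradient of the L2 penalty term
        for task_id, score in ratings.items():
            v = task_vecs[task_id]
            err = predict(u, v) - score
            for k, vk in enumerate(v):
                grad[k] += err * vk
        u = [x - lr * g for x, g in zip(u, grad)]
    return u


def mae(pairs):
    """Mean absolute error over (observed, predicted) pairs."""
    return sum(abs(o - p) for o, p in pairs) / len(pairs)


def rmse(pairs):
    """Root mean squared error over (observed, predicted) pairs."""
    return math.sqrt(sum((o - p) ** 2 for o, p in pairs) / len(pairs))
```

Because only one vector is refit against fixed task vectors, the cost of such an update grows with the number of that worker's observed scores and not with the size of the whole matrix, which is in line with the runtime reduction the abstract reports for online updating.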
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.