Lam, Cindy Man-Fong; Dr. CHAN Kin Wing; Wong, Sheung Ping
2025-07-25
2024
Lam, M. F., Chan, K. W., & Wong, S. P. (14.12.2024). Analysing non-Chinese speaking students’ spoken Chinese proficiency in Hong Kong using a learner corpus: A focus on negative markers. The 28th International Conference on Yue Dialects, Hong Kong Metropolitan University.
http://hdl.handle.net/20.500.11861/24096
https://www.hkmu.edu.hk/EL/Yue_28/Conf_Handbook.pdf

Advancements in natural language processing technologies have enabled the development of various Cantonese corpora in recent years, providing extensive data for research into the language. Nevertheless, the majority of Cantonese corpora are oriented towards historical texts, native adult discourse, and child first language acquisition. By contrast, very few Cantonese learner corpora exist (Chan et al., 2019; Granger et al., 2015), particularly ones documenting the spoken language output of non-Chinese speaking (NCS) students in Hong Kong. This study aims to address research gaps regarding the language learning challenges that NCS students face through a corpus-based examination of patterns and errors in their speech. The study is based on a self-compiled, small-scale digital corpus of transcribed spoken language data elicited through an oral narration task using the wordless illustrated storybook Frog, Where Are You? (Mayer, 1969). Participants were primary and secondary students from six schools in Hong Kong: 82 NCS students and 22 native Chinese-speaking (CS) students. The corpus comprises over 71,000 tokens of narrative speech data and annotations. Corpus analysis was conducted using the concordance software tool AntConc to determine usage patterns for negative markers across the narratives.
Preliminary findings from keyword-in-context analyses indicate various kinds of usage errors in verbal constructions featuring the predicate negator m4 (唔) and the existential negator mou5 (冇, lit. “not have”). Among the errors in m4-constructions, the most frequently observed was the negator preceding the verbal head and its complement rather than only the complement. For instance, the narratives of 10 NCS students included the construction #唔揾到 (#m4-wan2-dou2, #NEG-find-ACCOMPLISHMENT) in the context of searching for the frog, which incorrectly expresses “inability to find” rather than “failure to find after attempting to do so” (cf. 揾唔到 wan2-m4-dou2, find-NEG-ACCMP). Erroneous [NEG-V-complement] constructions were also found with mou5, resulting in incorrect aspectual and modal configurations. For instance, the construction #冇見到 (#mou5-gin3-dou2, #NEG-see-ACCMP) expresses “had not seen/encountered” (negating a completed event) rather than “failure to see/encounter” (negating an incomplete event or state), which should instead be realised using m4 (cf. Tang 2022:324). These results offer valuable insights into second language development and learning strategies for NCS students in Hong Kong.

en
Analysing non-Chinese speaking students’ spoken Chinese proficiency in Hong Kong using a learner corpus: A focus on negative markers
Conference Paper