Authors: Han, Dong; Dr. YANG Yike
Record date: 2025-07-24
Year: 2024
Citation: Han, D., & Yang, Y. (22 June 2024). Constructing a physiologically validated Cantonese emotional speech dataset. FoCaL-7: The Seventh Forum on Cantonese Linguistics, Hong Kong Shue Yan University.
https://drive.google.com/file/d/1TnoLquj0cbsW-WI_yOpgUiqZYmc-yQSq/view
http://hdl.handle.net/20.500.11861/24062

Introduction
Emotional speech plays a vital role in interpersonal communication and is also important for speech recognition and synthesis. However, little attention has been paid to Cantonese emotional speech. The only existing Cantonese emotional speech dataset [1] did not adopt an effective elicitation method, and it validated the speech by subjective judgements from native speakers. This study aims to construct a Cantonese emotional speech dataset using a well-established emotion elicitation method and to employ physiological measurements to ensure the authenticity and validity of the emotional expressions.

Methods
Participants: This study will recruit 60 native Cantonese speakers (30 males) aged 18 to 25 with no reported hearing or speech disorders.

Recording procedures: Participants will complete two recording sessions while wearing a FlexComp Infiniti Biofeedback System, which collects heart rate (HR) and skin conductance (SC) [2] during recording. First, autobiographical recall instructions for eliciting emotions, modified from [3], will be displayed on a computer screen, guiding participants to recall a past neutral event. Once ready, participants signal the experimenter to start recording and narrate the event without a time limit. The procedure is then repeated for the second session, in which the participant follows the instructions to describe a highly emotional personal event related to one of six basic emotions (anger, disgust, fear, happiness, sadness, surprise [4]). Each participant will be randomly assigned one specific emotion for this session.
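The random assignment of emotions could be implemented in several ways. The following is a minimal sketch, assuming a balanced design in which the 60 participants are split evenly across the six emotions (consistent with the 10 recordings per emotion stated under Segmentation and validation). The function name `assign_emotions`, the participant ID format, and the seed are hypothetical and do not come from the study.

```python
import random

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]


def assign_emotions(participant_ids, seed=None):
    """Randomly assign one emotion per participant with equal group sizes.

    Assumption: the participant count is a multiple of the number of
    emotions, so every emotion receives the same number of speakers.
    """
    if len(participant_ids) % len(EMOTIONS) != 0:
        raise ValueError("participant count must be a multiple of the number of emotions")
    per_emotion = len(participant_ids) // len(EMOTIONS)
    # Build a pool holding each emotion per_emotion times, then shuffle it.
    pool = [emotion for emotion in EMOTIONS for _ in range(per_emotion)]
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    rng.shuffle(pool)
    return dict(zip(participant_ids, pool))


# Hypothetical participant IDs P01 ... P60.
participants = [f"P{i:02d}" for i in range(1, 61)]
assignment = assign_emotions(participants, seed=2024)
```

Shuffling a pre-balanced pool, rather than drawing an emotion independently for each participant, guarantees equal group sizes while keeping the order of assignment random.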
Segmentation and validation: One hundred and twenty recordings, comprising 60 in the neutral state and 10 for each of the six emotions, will be segmented into sentences and verified against the physiological data. HR values will be averaged across each scan, while SC values will be calculated as the normalized area under the curve. Differences in physiological data between the neutral state and a specific emotion will be analyzed using paired t-tests. If a participant's HR or SC changes while expressing an emotion relative to their neutral state [2], the recording will be validated as that emotion and included in the dataset.

Summary
This study aims to construct a Cantonese emotional speech dataset covering six basic emotions and a neutral state, recorded from 60 native Cantonese speakers and validated with physiological measurements. The dataset will contribute to a better understanding of emotional communication and support the development of speech technologies.

Language: en
Title: Constructing a physiologically validated Cantonese emotional speech dataset
Type: Conference Paper
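The physiological measures described under Segmentation and validation (mean HR, normalized area under the SC curve, and a paired t-test between neutral and emotional conditions) can be illustrated with a short sketch. It is not the authors' analysis pipeline. The abstract does not say how the SC area is normalized, so division by recording duration is an assumption, and all function names, the sample rate, and the example values are hypothetical.

```python
import math
import statistics


def mean_hr(hr_samples):
    """Average heart rate over one recording."""
    return statistics.fmean(hr_samples)


def normalized_auc(sc_samples, sample_rate_hz):
    """Trapezoidal area under the SC curve divided by its duration.

    Dividing by duration is an assumed normalization; the abstract does
    not state which normalization is used.
    """
    if len(sc_samples) < 2:
        raise ValueError("need at least two samples")
    dt = 1.0 / sample_rate_hz
    area = sum((a + b) / 2.0 * dt for a, b in zip(sc_samples, sc_samples[1:]))
    duration = dt * (len(sc_samples) - 1)
    return area / duration


def paired_t(neutral, emotional):
    """Return the paired t statistic and degrees of freedom.

    Each position in the two lists must belong to the same participant.
    """
    if len(neutral) != len(emotional) or len(neutral) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    diffs = [e - n for n, e in zip(neutral, emotional)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = statistics.fmean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical per-participant mean HR values (beats per minute).
neutral_hr = [72.0, 68.5, 75.2, 70.1, 66.8]
anger_hr = [78.4, 70.0, 80.1, 74.3, 69.9]
t_stat, dof = paired_t(neutral_hr, anger_hr)
```

The sketch returns only the t statistic and degrees of freedom because the Python standard library has no t-distribution. If SciPy is available, `scipy.stats.ttest_rel` computes the same statistic together with a p-value.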