Title: Identifying exaggeration in ESG reports using machine learning techniques
Authors: Luo, Yunfang; Dr. CUI Xiling, Celine; Liu, Qiang; Dr. ZHOU Qiang; Dr. ZHANG Yingxuan, Cynthia
Date: 2024-10-07
Issued: 2025
Citation: Data and Information Management, 2025, vol. 9(2), article no. 100084.
ISSN: 2543-9251
URI: http://hdl.handle.net/20.500.11861/10508
DOI: 10.1016/j.dim.2024.100084
Access: Open access
Language: en
Type: Peer Reviewed Journal Article
Keywords: Natural Language Processing (NLP); Machine Learning; ESG; Exaggerated Information

Abstract: Exaggeration is a specific way in which companies potentially overstate certain aspects of their actual environmental performance, strategically disclosing positive information about their environmental performance. This research aims to identify instances of exaggerated information within environmental, social, and governance (ESG) reports by employing machine learning techniques. We crawled 594 ESG reports and employed a variety of machine learning algorithms to identify instances of exaggeration. Through cross-validation, we found that random forest exhibits the best performance in predicting exaggeration and ridge regression demonstrates superior performance in predicting the exaggeration scores. A significant contribution of our study is the development of an exaggerated thesaurus tailored specifically to this domain. Ultimately, our study lays a foundation for further investigations into addressing the impact of exaggerated information in ESG reporting.