The Application of Large Language Models in Qualitative Research in Journalism and Communication Studies
Liu Zikun (刘子琨), Fu Donghan (付东晗), Han Yiming (韩一铭)
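Abstract:
The rise of large language models (LLMs) is gradually reshaping qualitative research in journalism and communication studies, with considerable potential in data collection, data generation, and data analysis. Existing research indicates that, in data collection, LLMs can simulate human interviewers and conduct automated interviews, improving efficiency and consistency; they can also serve as fact-checking tools, improving the efficiency and accuracy of verification. In data generation, LLMs can generate and simulate text data and can also fuse and generate multimodal data. In data analysis, LLMs can automatically identify and classify themes, improving analytical efficiency, and can perform text analysis and discourse analysis to reveal the deeper meanings and sentiments behind texts. LLMs can also be applied to image and video data, supporting advanced visual information extraction and multimodal analysis. Although LLMs have brought revolutionary change to qualitative research in journalism and communication studies, their application has limitations. Future research should further explore the value of LLMs in qualitative research in this field in order to realize their potential fully.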
Keywords: large language models; qualitative research; journalism and communication studies