Computational Grounded Methods: Evolution, Comparisons, and Debates
The Methodological Framework of Computational Grounded Theory and Modes of Human-Machine Collaborative Reasoning
These works lay the theoretical foundation of computational grounded theory (CGT), exploring how the inferential logics of social science (induction, deduction, abduction) can be combined with the computational capacity of machine learning. Their focus is on building iterative "human-machine collaboration" frameworks in which algorithms augment the researcher's attention, while qualitative research retains its interpretive depth and its essential aim of theory generation.
- A machine learning model of cultural change: Role of prosociality, political attitudes, and Protestant work ethic (Abhishek Sheetal, Krishna Savani, 2021, The American psychologist)
- Researcher reasoning meets computational capacity: Machine learning for social science (Ian Lundberg, J. Brand, Nanum Jeon, 2022, Social science research)
- With eyes of a machine: A three-step guide for applying machine learning to visual content analysis in social research(Anna Helene Kvist Møller, Massimo Airoldi, 2025, Big Data & Society)
- Machine Learning as Grounded Theory: Human-Centered Interfaces for Social Network Research through Artificial Intelligence(Lorenzo Barberis Canonico, Nathan J. Mcneese, Chris F. Duncan, 2018, Proceedings of the Human Factors and Ergonomics Society Annual Meeting)
- Causal Machine Learning: A Deductive–Inductive Framework for Sociological Research(Nanum Jeon, Jennie E. Brand, 2026, KZfSS Kölner Zeitschrift für Soziologie und Sozialpsychologie)
- Big Data & Inductive Theory Development: Towards Computational Grounded Theory?(N. Berente, S. Seidel, 2014)
- A novel, human-in-the-loop computational grounded theory framework for big social data(Lama Alqazlan, Zheng Fang, Michael Castelle, Rob Procter, 2025, Big Data & Society)
- Text as Data: A New Framework for Machine Learning and the Social Sciences(K. Freeman, 2023, Contemporary Sociology)
- Application of machine learning to understand child marriage in India(A. Raj, Nabamallika Dehingia, Ashutosh Kumar Singh, L. Mcdougal, Julian McAuley, 2020, SSM - Population Health)
- Using Natural Language Processing and Qualitative Analysis to Intervene in Gang Violence: A Collaboration Between Social Work Researchers and Data Scientists(D. Patton, K. McKeown, Owen Rambow, J. Macbeth, 2016, ArXiv)
Empirical Applications of Computational Grounded Methods to Complex Social Issues and Discourse Analysis
This group of studies demonstrates the empirical value of CGT across diverse social settings. The research covers gender and racial discrimination, populist logics, minority stress in LGBTQ+ communities, COVID-19 risk perception, radicalization processes, and public health policy discourse. By combining automated tools (topic models, classifiers) with qualitative interpretation, researchers extract deep social dynamics and cultural meaning from large-scale social media and news data.
- The Language of LGBTQ+ Minority Stress Experiences on Social Media(Koustuv Saha, Sang Chan Kim, Manikanta D. Reddy, A. J. Carter, Eva Sharma, Oliver L. Haimson, M. Choudhury, 2019, Proceedings of the ACM on Human-Computer Interaction)
- The stories about racism and health: the development of a framework for racism narratives in medical literature using a computational grounded theory approach(Caroline Figueroa, Erin Manalo-Pedro, Swetha Pola, Sajia Darwish, Pratik S. Sachdeva, Christian Guerrero, Claudia Von Vacano, Maithili Jha, Fernando De Maio, Chris J Kennedy, 2023, International Journal for Equity in Health)
- A computational grounded theory based analysis of research on China’s old-age social welfare system(Yingying Li, N. Mi, Xinyue Pan, Chao Ma, Zhenyu Sun, Geraldo Timoteo, 2025, Frontiers in Public Health)
- Cyborg Imaginaries: A Computational Grounded Theory of Online Pioneer Community Discussions on Human Augmentation(Giulia Frascaria, Daniela Jaramillo-Dent, Michael Latzer, 2026, AoIR Selected Papers of Internet Research)
- The discursive logics of online populism: social media as a “pressure valve” of public debate in China(Kun He, Scott A. Eldridge, M. Broersma, 2023, Journal of Information Technology & Politics)
- Investigating Perception of Gender Stereotypes in Large Language Models: A Computational Grounded Theory Approach(R. Salvi, Nigel Bosch, 2025, ACM Journal on Responsible Computing)
- Exploring Conceptualizations of COVID-19 Risk in Ideologically Distinct Online Communities: A Computational Grounded Theory Analysis(Tiwaladeoluwa B. Adekunle, Jeremy Foote, Toluwani E. Adekunle, Nathan TeBlunthuis, Laura K. Nelson, 2025, Journal of Medical Internet Research)
- The plurality and shifting of framing genetical modification risks on Chinese social media(Xiaoxiao Cheng, 2024, Health, Risk & Society)
- To the Extreme: Exploring the Rise of a Deviant Culture in a Misogynist Digital Community(Yongren Shi, Kevin Kiley, Stephanie M. DiPietro, 2024, Socius)
- Artificial Intelligence in Brazilian News: A Mixed-Methods Analysis(Raphael Hernandes, Giulio Corsi, 2024, ArXiv)
- From Strange to Normal: Computational Approaches to Examining Immigrant Incorporation Through Shifts in the Mainstream(Andrea Voyer, Zachary D. Kline, Madison Danton, T.G. Volkova, 2022, Sociological Methods & Research)
- Mapping the discursive dimensions of the reproducibility crisis: A mixed methods analysis(Nicole C. Nelson, Kelsey Ichikawa, Julie Chung, M. Malik, 2020, PLoS ONE)
- Unsupervised Machine Learning to Detect and Characterize Barriers to Pre-exposure Prophylaxis Therapy: Multiplatform Social Media Study(Qing Xu, Matthew C. Nali, Tiana J McMann, Hector Godinez, Jiawei Li, Yifan He, Mingxiang Cai, Christine Lee, Christine Merenda, Richardae Araojo, T. Mackey, 2021, JMIR Infodemiology)
- Everyday life information experiences in Twitter: a grounded theory(Faye Q. Miller, Kate Davis, Helen Partridge, 2019, Inf. Res.)
The Evolution of Technical Tools: Computer-Assisted Coding, Interpretable AI, and NLP Pipelines
This strand of the literature focuses on technical innovation aimed at the efficiency and transparency problems of large-scale text processing. The studies cover unsupervised learning (clustering), tensor decomposition, NLP pipelines for specific languages, and LLM-assisted coding tools, seeking to improve the clarity of topic discovery, the accuracy of automated coding, and the interpretability of the computational process.
- Social Science for Natural Language Processing: A Hostile Narrative Analysis Prototype(S. Anning, G. Konstantinidis, Craig Webber, 2021, Proceedings of the 13th ACM Web Science Conference 2021)
- From Data to Discovery: Unsupervised Machine Learning's Role in Social Cognition(Jonathan E. Doriscar, Michalis Mamakos, Sylvia P. Perry, Tessa E. S. Charlesworth, 2025, Social Cognition)
- Identification of social scientifically relevant topics in an interview repository: a natural language processing experiment(Judit Gárdos, Julia Egyed-Gergely, Anna Horváth, Balázs E. Pataki, R. Vajda, András Micsik, 2023, J. Documentation)
- Using Supervised Machine Learning to Code Policy Issues(Bjorn Burscher, R. Vliegenthart, Claes H. de Vreese, 2015, The ANNALS of the American Academy of Political and Social Science)
- Unmasking Machine Learning With Tensor Decomposition: An Illustrative Example for Media and Communication Researchers(Yu Won Oh, Chong Hyun Park, 2025, Media and Communication)
- DelibAnalysis: Understanding the quality of online political discourse with machine learning(Eleonore Fournier-Tombs, G. Di Marzo Serugendo, 2019, Journal of Information Science)
- Using Machine Learning to Support Qualitative Coding in Social Science(N. Chen, Margaret Drouhard, Rafal Kocielnik, J. Suh, Cecilia R. Aragon, 2018, ACM Transactions on Interactive Intelligent Systems (TiiS))
Epistemological Debates, Reflections on Fairness, and the Evaluation of Computational Rigor
This group of studies reflects critical thinking about the application of computational methods in the social sciences. Discussion centers on gender and racial biases latent in algorithms, conflicting definitions of machine-learning fairness, and how new standards of rigor can be established through intersectionality theory and human feedback. The studies also ask whether computational simulation conforms to the scientific method and examine the limits of cluster analysis for sociological theoretical interpretation.
- Investigating gender and racial-ethnic biases in sentiment analysis of language(Steven Zhou, Arushi Srivastava, 2024, Cogent Psychology)
- Sociolinguistic auto-coding has fairness problems too: measuring and mitigating bias(Dan Villarreal, 2024, Linguistics Vanguard)
- A Computational Method for Measuring "Open Codes" in Qualitative Analysis(John Chen, Alexandros Lotsos, Lexie Zhao, Jessica Hullman, Bruce Sherin, Uri Wilensky, Michael Horn, 2024, ArXiv)
- From Narratives to Code: Using Intersectional Methodology (IM) for Algorithmic Designs(Princess Chihurumnaya Samuel, 2025, Proceedings of the ACM Global Computing Education Conference 2025 - Volume 2)
- Studying Culture and Meaning Through Interpretative Computational Methods: From theory to method and back(Jan Goldenstein, Dennis Jancsary, Stine Grodal, Bernard Forgues, P. Devereaux, Dev Jennings, 2026, Organization Studies)
- The Scientific Method in the Science of Machine Learning(Jessica Zosa Forde, Michela Paganini, 2019, ArXiv Preprint)
- Sociology of values: experience of building a taxonomy by using natural language analysis technology(M. Kashina, S. Tkach, 2023, Digital Sociology)
- Sentiment and Language: A Socio-Semiotic Analysis(Junfeng Zhang, 2024, Philosophy Journal)
- Facebook and social representations of Filipino migrant life in Germany: a reflexive computational approach(Audris Umel, 2024, Frontiers in Human Dynamics)
Interdisciplinary Extensions: Measuring Psychological Constructs, Social Simulation, and Governance Practice
This group of works traces the extension of CGT ideas into broader fields. On one side, deep learning and LLMs are used to quantitatively measure complex psychological constructs (such as humility and moral values) and to simulate behavior; on the other, computational methods are applied to urban digital governance, socio-technical risk analysis, and computer-vision reasoning, demonstrating the broad promise of computational grounded methods as interdisciplinary research tools.
- Developing a Text‐Based Measure of Humility in Inquiry Using Computational Grounded Theory(Sarah Bratt, E. Leahey, Charlie Gomez, Jina Lee, Yeaeun Kwon, C. Lassiter, 2024, Proceedings of the Association for Information Science and Technology)
- MoralBERT: A Fine-Tuned Language Model for Capturing Moral Values in Social Discussions(Vjosa Preniqi, Iacopo Ghinassi, Julia Ive, C. Saitis, K. Kalimeri, 2024, Proceedings of the 2024 International Conference on Information Technology for Social Good)
- A Fuzzy-SNA Computational Framework for Quantifying Intimate Relationship Stability and Social Network Threats(Ning Wang, Xiangzhi Kong, 2026, Symmetry)
- Multi-Stage Simulation of Residents' Disaster Risk Perception and Decision-Making Behavior: An Exploratory Study on Large Language Model-Driven Social-Cognitive Agent Framework(Xinjie Zhao, Hao Wang, Chengxiao Dai, Jiacheng Tang, Kaixin Deng, Zhihua Zhong, Fanying Kong, Shiyun Wang, So Morikawa, 2025, Syst.)
- The Role of Influential Actors in Fostering the Polarized COVID-19 Vaccine Discourse on Twitter: Mixed Methods of Machine Learning and Inductive Coding(Loni Hagen, Ashley Fox, Heather O’Leary, DeAndre Dyson, Kimberly Walker, C. Lengacher, R. Hernandez, 2021, JMIR Infodemiology)
- Revealing the Role of Intra-household Dynamics in Computer Adoption: An Inductive Theorization Approach Using Machine Learning in the Indian Context(Sharada Sringeswara, Jang Bahadur Singh, S. Sharma, S. Gouda, 2025, Information Systems Frontiers)
- Engaging Comunalidad as Theory and Praxis in Language Reclamation(María Cecilia Schwedhelm, 2025, International Journal of Literacy, Culture, and Language Education)
- Text Mining for Social Good; Context-aware Measurement of Social Impact and Effects Using Natural Language Processing(R. Rezapour, 2020, Companion Publication of the 2020 Conference on Computer Supported Cooperative Work and Social Computing)
- ML-Schema: Exposing the Semantics of Machine Learning with Schemas and Ontologies(Gustavo Correa Publio, Diego Esteves, Agnieszka Ławrynowicz, Panče Panov, Larisa Soldatova, Tommaso Soru, Joaquin Vanschoren, Hamid Zafar, 2018, ArXiv Preprint)
- Data-theoretic methodology and computational platform to quantify organizational factors in socio-technical risk analysis(J. Pence, T. Sakurahara, Xuefeng Zhu, Z. Mohaghegh, M. Ertem, Cheri Ostroff, E. Kee, 2019, Reliab. Eng. Syst. Saf.)
- Research on the factors influencing government governance effectiveness from a digital governance perspective (李凌丰, 窦文章, 2024, 现代管理 [Modern Management])
- Grounded Reinforcement Learning for Visual Reasoning(Gabriel Sarch, Snigdha Saha, Naitik Khandelwal, Ayush Jain, Michael J. Tarr, Aviral Kumar, Katerina Fragkiadaki, 2025, ArXiv Preprint)
This report synthesizes the full arc of computational grounded theory (CGT), from the construction of its theoretical framework to its interdisciplinary applications. The research has not only established a methodological logic centered on human-machine collaboration and abductive reasoning, but has also achieved deep empirical grounding in fields such as social discourse analysis and the measurement of psychological constructs. At the same time, sustained scholarly debate over the bias, fairness, and epistemological rigor of computational tools has pushed the field from simple automated coding toward a more reflexive, interpretive, and intersectional paradigm of computational social science.
A total of 52 related references.
The availability of big data has significantly influenced the possibilities and methodological choices for conducting large-scale behavioural and social science research. In the context of qualitative data analysis, a major challenge is that conventional methods require intensive manual labour and are often impractical to apply to large datasets. One effective way to address this issue is by integrating emerging computational methods to overcome scalability limitations. However, a critical concern for researchers is the trustworthiness of results when machine learning and natural language processing tools are used to analyse such data. We argue that confidence in the credibility and robustness of results depends on adopting a ‘human-in-the-loop’ methodology that is able to provide researchers with control over the analytical process, while retaining the benefits of using machine learning and natural language processing. With this in mind, we propose a novel methodological framework for computational grounded theory that supports the analysis of large qualitative datasets, while maintaining the rigour of established grounded theory methodologies. To illustrate the framework’s value, we present the results of testing it on a dataset collected from Reddit in a study aimed at understanding tutors’ experiences in the gig economy.
Purpose By the end of 2024, 22% of the Chinese population was aged 60 and above, making old-age social welfare a critical challenge. Despite abundant literature, a gap remains between research and policy. This study applies Nelson’s computational grounded theory to systematically analyze China’s old-age social welfare research and propose targeted policy priorities. Methods We searched Chinese literature (2014–2024) from the Wanfang, CNKI, and CQVIP databases. After preprocessing the abstracts, we applied topic modeling using latent Dirichlet allocation, guided by human analysts. Optimal topics were determined using perplexity and coherence metrics. Researchers then linked each topic to sociologically meaningful concepts to derive abstract policy conclusions. Results A total of 413 articles met eligibility criteria. Seven topics emerged: (1) the theoretical significance of social welfare policy; (2) enhancing rural old-age care; (3) providing care for special groups; (4) promoting a home-community care model; (5) optimizing precision care through collaborative mechanisms; (6) developing community culture; and (7) establishing supply-driven care services. Notably, topics two and seven dominated the literature. Conclusion Based on these themes, we propose policy priorities to enhance comprehensive social welfare programs. China’s big government model—a top-level design involving diverse stakeholders—may serve as an effective framework for addressing a global aging society marked by rising non-communicable diseases and AI-driven economic growth. Moreover, our computer-assisted approach offers a valuable method for information scientists, aiding policymakers in navigating extensive digital data for more cost-effective and timely decision-making.
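The topic-count selection step described above, fitting LDA across candidate numbers of topics and comparing models, can be sketched as follows. This is a minimal illustration using scikit-learn on a handful of invented abstract fragments, not the paper's 413 abstracts; the study also used coherence metrics, which are omitted here (they would typically come from a separate library such as gensim).

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Invented abstract fragments standing in for the preprocessed corpus.
docs = [
    "rural old age care pension services reform",
    "community home care model for elderly residents",
    "social welfare policy theory and significance",
    "care services supply for special groups",
    "community culture and elderly wellbeing",
    "precision care through collaborative mechanisms",
]

vec = CountVectorizer()
X = vec.fit_transform(docs)

def pick_topic_count(X, candidates):
    """Fit one LDA model per candidate topic count; return the count with
    the lowest perplexity (lower is better) along with all scores."""
    scores = {}
    for k in candidates:
        lda = LatentDirichletAllocation(n_components=k, random_state=0).fit(X)
        scores[k] = lda.perplexity(X)
    return min(scores, key=scores.get), scores

best_k, scores = pick_topic_count(X, [2, 3, 4])
print(best_k, {k: round(v, 1) for k, v in scores.items()})
```

On real data, perplexity would be computed on held-out documents and cross-checked against a coherence score before fixing the topic number.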
Artificial Intelligence has expanded its influence far beyond traditional boundaries in our society. One prominent application of artificial intelligence is the use of large language models, which have transcended their initial roles in high-tech industries and academic research and are now actively utilized by individual users. These models have continually improved over the years in their generative capabilities and performance across numerous tasks. However, they still pose a persistent risk of reproducing biases and stereotypes. Previous research has predominantly focused on quantitatively measuring biases in these large language models. In this study, we seek to assess not just the presence of bias itself, but the perception of stereotypes by these models via in-depth exploration of their responses. We demonstrate how the computational grounded theory framework, which integrates qualitative and quantitative approaches, can be applied in this context to assess the conceptualization of stereotypes. Furthermore, we contrast language model results with a survey of 400 human participants who completed prompts similar to those given to the model, in order to understand people’s perception of gender stereotypes. The results indicate substantial similarities between language model and human perceptions of stereotypes, highlighting that a model’s perception stems from societal perception of stereotypes.
Background The COVID-19 pandemic has had a profound impact on societies and economies around the globe, and experts warn about the potential for similar crises in the future. Risk communication theories underscore that while the potential for harm is objective, risk perception is a subjective, socially derived interpretation. While there is broad literature on the social construction of risk, fewer studies examine the role of communities—online or offline—in developing and reinforcing distinct interpretations of the same risk event. During COVID-19, online communities emerged as individuals sought to make sense of the ongoing crisis. These communities offer an opportunity to gain important insights into how the concerned public collectively interprets risk and creates group identities, informing public health strategies. Objective This study aims to, first, explore how online communities with distinct ideologies create and reinforce divergent conceptualizations of risk and, second, identify the role of group identity in shaping the development and communication of risk interpretations in these communities. Methods We used computational grounded theory, a multistep approach that includes pattern detection, hypothesis testing, and pattern confirmation to explore interpretations of risk and group identity in about 500,000 comments from the subreddits r/LockdownSkepticism and r/Masks4All. In the pattern detection step of this study, we grouped comments by the post they were made on and then used latent Dirichlet allocation topic modeling to identify 10 topics based on the frequency of term co-occurrence. In the hypothesis refinement step, we conducted a qualitative thematic analysis of 30 posts under each topic using Braun and Clarke’s approach. Finally, in the pattern confirmation step, we trained a Word2Vec word embedding model to validate emerging themes from the second step.
Results This study found that Masks4All and LockdownSkepticism both centered risk in their conversations, but with divergent concerns related to the threat of COVID-19. While Masks4All emphasized the threat to health, LockdownSkepticism questioned the necessity of preventive measures and focused on other risks: the threat to the economy, educational disruptions, and social isolation. Group identity was also found to shape collective meanings around risk, as community members in both subreddits affirmed group positions and condemned the outgroup. Conclusions This study demonstrated that while both communities were concerned about COVID-19, their perceptions of risk focused on different aspects of the same risk event. This underscores the need for targeted interventions that engage with divergent ideologies and value systems across groups of people.
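The pattern-confirmation step above checks that qualitatively derived themes are reflected in word-embedding geometry: keywords belonging to one theme should sit closer to each other than to keywords of another. The study trains a Word2Vec model; the sketch below substitutes a lightweight count-based embedding (positive pointwise mutual information followed by SVD) so that it runs with NumPy alone, and the six "comments" are invented stand-ins for the roughly 500,000 Reddit comments.

```python
import numpy as np
from itertools import product

# Toy stand-in for the tokenized subreddit comments.
sentences = [
    "masks protect health hospitals overwhelmed".split(),
    "masks health risk hospitals".split(),
    "masks protect health".split(),
    "lockdown hurts economy jobs".split(),
    "economy jobs lost lockdown".split(),
    "lockdown economy schools closed".split(),
]

vocab = sorted({w for s in sentences for w in s})
idx = {w: i for i, w in enumerate(vocab)}

# Within-sentence co-occurrence counts (symmetric, no self-counts).
C = np.zeros((len(vocab), len(vocab)))
for s in sentences:
    for a, b in product(s, s):
        if a != b:
            C[idx[a], idx[b]] += 1

# Positive pointwise mutual information, then SVD for dense word vectors.
total = C.sum()
expected = np.outer(C.sum(axis=1), C.sum(axis=0))
pmi = np.log(np.where(C > 0, C * total / expected, 1.0))
ppmi = np.maximum(pmi, 0.0)
U, S, _ = np.linalg.svd(ppmi)
vecs = U[:, :5] * S[:5]

def sim(a, b):
    """Cosine similarity between two word vectors."""
    va, vb = vecs[idx[a]], vecs[idx[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb) + 1e-9))

# Keywords from the same qualitative theme should score higher
# than keywords drawn from different themes.
print(sim("masks", "health"), sim("masks", "economy"))
```

With a real Word2Vec model, the same check is a one-liner (`model.wv.similarity`), applied to the theme keywords surfaced during the qualitative step.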
Introduction The scientific study of racism as a root cause of health inequities has been hampered by the policies and practices of medical journals. Monitoring the discourse around racism and health inequities (i.e., racism narratives) in scientific publications is a critical aspect of understanding, confronting, and ultimately dismantling racism in medicine. A conceptual framework and multi-level construct is needed to evaluate the changes in the prevalence and composition of racism over time and across journals. Objective To develop a framework for classifying racism narratives in scientific medical journals. Methods We constructed an initial set of racism narratives based on an exploratory literature search. Using a computational grounded theory approach, we analyzed a targeted sample of 31 articles in four top medical journals which mentioned the word ‘racism’. We compiled and evaluated 80 excerpts of text that illustrate racism narratives. Two coders grouped and ordered the excerpts, iteratively revising and refining racism narratives. Results We developed a qualitative framework of racism narratives, ordered on an anti-racism spectrum from impeding anti-racism to strong anti-racism, consisting of 4 broad categories and 12 granular modalities for classifying racism narratives. The broad narratives were “dismissal,” “person-level,” “societal,” and “actionable.” Granular modalities further specified how race-related health differences were related to racism (e.g., natural, aberrant, or structurally modifiable). We curated a “reference set” of example sentences to empirically ground each label. Conclusion We demonstrated racism narratives of dismissal, person-level, societal, and actionable explanations within influential medical articles. Our framework can help clinicians, researchers, and educators gain insight into which narratives have been used to describe the causes of racial and ethnic health inequities, and to evaluate medical literature more critically. 
This work is a first step towards monitoring racism narratives over time, which can more clearly expose the limits of how the medical community has come to understand the root causes of health inequities. This is a fundamental aspect of medicine’s long-term trajectory towards racial justice and health equity.
We describe a project in which we develop a text‐based measure of humility in inquiry (HI) in the context of scholarly communication using corpora of scientific publications. The data and analytic approach we use will circumvent known concerns with self‐reported data on humility levels and will be calculable on a large scale. We use a computational grounded theory approach to develop a text‐based measure of HI. We draw from an annotated corpus of scientific articles in economics, psychology, and sociology (2010–2023), generating three supra‐dimensions of HI (Epistemic, Rhetorical, and Transparent) and several novel sub‐codes of HI. We present our initial analysis with a focus on the three dimensions of HI derived from a computational grounded theory approach. The text‐based measure helps us better understand how contextual factors shape HI and contribute to mixed methods in information science research.
Human Augmentation (HA) technologies, such as Brain-Computer Interfaces (BCIs), neurostimulation devices, and microchip implants, are increasingly discussed in online pioneer communities, where early adopters shape imaginaries of technologically mediated human futures. As part of the broader process of digitalization, HA technologies contribute to the platformization of the human body. While these technologies remain experimental, transhumanists and biohackers engage with them as tools for self-enhancement, body modification, and posthuman evolution. These imaginaries are critical to understanding future adoption, yet remain underexplored in scholarly literature. This study applies computational grounded theory (CGT) to analyze discussions on Reddit, identifying emerging sociotechnical imaginaries of HA technologies. Using BERTopic, a transformer-based topic modeling approach, we extract thematic structures from a dataset of 1,503 posts and 60,327 comments spanning 2008–2025. The imaginaries are then defined through qualitative analysis and iterative refinement of the model, ensuring deeper contextual grounding. Preliminary findings reveal three key dimensions of cyborg imaginaries: (1) Beliefs, including aspirations such as immortality and concerns over job automation; (2) Practices, particularly cognitive and sensory augmentation; and (3) Technological Advances, with discussions centered on BCIs, neural implants, and cybernetic enhancements. This extended abstract presents initial results, contributing to broader discussions on digitalization, human-technology integration, and cyborgization.
Internet technologies have created unprecedented opportunities for people to come together and through their collective effort generate large amounts of data about human behavior. With the increased popularity of grounded theory, many researchers have sought to use ever-larger datasets to analyze and draw patterns about social dynamics. However, the data is simply too big to enable a single human to derive effective models for many complex social phenomena. Computational methods offer a unique opportunity to analyze a wide spectrum of sociological events by leveraging the power of artificial intelligence. Within the human factors community, machine learning has emerged as the dominant AI-approach to deal with big data. However, along with its many benefits, machine learning has introduced a unique challenge: interpretability. The models of macro-social behavior generated by AI are so complex that rarely can they be translated into human understanding. We propose a new method to conduct grounded theory research by leveraging the power of machine learning to analyze complex social phenomena through social network analysis while retaining interpretability as a core feature.
Computational approaches have grown in prominence amidst advancements in new media and technologies and ever-increasing amounts of digital data. This article critically examines these automated techniques, especially the analytical affordances and concerns that such methods introduce to the study of online migrant and mobility discourses. The paper further argues for a mixed methodology anchored on social representations theory—a contextually sensitive framework that enables reflexive use of computational approaches, i.e., to quantitatively analyze while also exploring different layers of cultural and linguistic meanings in online diasporic interactions. With Filipino migrants in Germany as a case study and partner community, the study then demonstrates the combined application of topic modeling and ethnographically inspired qualitative analysis on migrant posts in Facebook. The findings are discussed in the form of a cultural reflection on Filipino values and expectations and an advocacy for mixed methodologies grounded on critical, social, and practice-oriented theories.
This article presents a computational approach to examining immigrant incorporation through shifts in the social “mainstream.” Analyzing a historical corpus of American etiquette books, texts from 1922–2017 describing social norms, we identify mainstream shifts related to long-standing groups which once were and may currently still be seen as immigrant outsiders in the United States: Catholic, Chinese, Irish, Italian, Jewish, Mexican, and Muslim groups. The analysis takes a computational grounded theory approach, combining qualitative readings and computational text analyses. Using word embeddings, we operationalize the chosen groups as focal group concepts. We extract sections of text that are salient to the focal group concepts to create group-specific text corpora. Two computational approaches make it possible to examine mainstream shifts in these corpora. First, we use sentiment analysis to observe the positive sentiment in each corpus and its change over time. Second, we observe changes in each corpus's position on a semantic dimension represented by the poles of “strange” and “normal.” The results indicate mainstream shifts through increases in positive sentiment and movement from strange to normal over time for most of the group-specific corpora. These research techniques can be adapted to other studies of social sentiment and symbolic inclusion.
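The "strange"-to-"normal" semantic dimension used above can be illustrated with a toy projection: define the axis as the vector difference between the pole words and score a corpus by projecting its mean word vector onto that axis. The two-dimensional embedding below is hand-made purely for illustration; the study derives its poles from word embeddings trained on the etiquette-book corpus.

```python
import numpy as np

# Hand-made toy embedding (invented vectors, not trained).
emb = {
    "strange":  np.array([ 1.0, 0.0]),
    "normal":   np.array([-1.0, 0.0]),
    "foreign":  np.array([ 0.8, 0.3]),
    "familiar": np.array([-0.9, 0.2]),
    "neighbor": np.array([-0.7, 0.1]),
}

# The semantic axis runs from the "strange" pole to the "normal" pole.
axis = emb["normal"] - emb["strange"]
axis = axis / np.linalg.norm(axis)

def score(tokens):
    """Mean word vector projected onto the strange-to-normal axis
    (higher values are closer to the 'normal' pole)."""
    vecs = [emb[t] for t in tokens if t in emb]
    return float(np.mean(vecs, axis=0) @ axis)

print(score(["foreign"]), score(["familiar", "neighbor"]))
```

Tracking this projection for each group-specific corpus over time is what reveals the "mainstream shift" the authors describe.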
Recent instances of lethal mass violence have been linked to digital communities dedicated to misogynist and sexist ideologies. These forums often begin with discussions of more conventional or mainstream ideas, raising the question about the process through which these communities transform from relatively benign to extremist. This article presents a study of the Reddit incel community, active from mid-2016 to its ban in late 2017, which evolved from a self-help forum to a hub for extremist ideologies. We use computational grounded theory to deduce empirical patterns in forum composition, psychological states reflected in language use, and semantic content before refining and testing an interactional process that explains this change: a shift away from drawing on real-world experiences in discussion toward a greater reliance on cognitively simple symbols of group membership. This shift, in turn, leads to more discussions centered on deviant ideology. The results confirm that understanding the dynamics of conversation—specifically, how ideas are interpreted, reinforced, and amplified in recurrent, person-to-person interactions—is crucial for understanding cultural change in digital communities. Implications for sociology of groups, culture, and interactions in digital spaces are discussed.
Anchored in framing theory and the public arenas model, this study investigates the representation and temporal evolution of genetic modification (GM) risk frames on Chinese social media. Through analysis of public discussions on GM risks from 2010 to 2020, and utilising an integration of unsupervised machine learning and computational grounded theory methodologies, this study develops a categorisation schema of 13 GM risk frames. These frames span the full lifecycle of the social construction of risk, from identification and definition through assessment, social negotiation, attribution, impact evaluation, to management and mitigation. The findings reveal that GM risk discourses are multifaceted, with systematic differences in frame adoption among social actors including government agencies, experts, media outlets, and the general public. The study demonstrates that GM risk frame evolution aligns closely with public attention cycles, exhibiting three distinct patterns: fluctuating decline, punctuated equilibrium, and fluctuating increase. Additionally, it is found that key events or crises catalyse both quantitative changes in frame prominence and qualitative transformations in how GM risks are framed.
The current surge in Artificial Intelligence (AI) interest, reflected in heightened media coverage since 2009, has sparked significant debate on AI's implications for privacy, social justice, workers' rights, and democracy. The media plays a crucial role in shaping public perception and acceptance of AI technologies. However, research into how AI appears in media has primarily focused on anglophone contexts, leaving a gap in understanding how AI is represented globally. This study addresses this gap by analyzing 3,560 news articles from Brazilian media published between July 1, 2023, and February 29, 2024, from 13 popular online news outlets. Using Computational Grounded Theory (CGT), the study applies Latent Dirichlet Allocation (LDA), BERTopic, and Named-Entity Recognition to investigate the main topics in AI coverage and the entities represented. The findings reveal that Brazilian news coverage of AI is dominated by topics related to applications in the workplace and product launches, with limited space for societal concerns, which mostly focus on deepfakes and electoral integrity. The analysis also highlights a significant presence of industry-related entities, indicating a strong influence of corporate agendas in the country's news. This study underscores the need for a more critical and nuanced discussion of AI's societal impacts in Brazilian media.
LGBTQ+ (lesbian, gay, bisexual, transgender, queer) individuals are at significantly higher risk for mental health challenges than the general population. Social media and online communities provide avenues for LGBTQ+ individuals to have safe, candid, semi-anonymous discussions about their struggles and experiences. We study minority stress through the language of disclosures and self-experiences on the r/lgbt Reddit community. Drawing on Meyer's minority stress theory, and adopting a combined qualitative and computational approach, we make three primary contributions, 1) a theoretically grounded codebook to identify minority stressors across three types of minority stress-prejudice events, perceived stigma, and internalized LGBTphobia, 2) a machine learning classifier to scalably identify social media posts describing minority stress experiences, that achieves an AUC of 0.80, and 3) a lexicon of linguistic markers, along with their contextualization in the minority stress theory. Our results bear implications to influence public health policy and contribute to improving knowledge relating to the mental health disparities of LGBTQ+ populations. We also discuss the potential of our approach to enable designing online tools sensitive to the needs of LGBTQ+ individuals.
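The classifier contribution above can be pictured as a standard supervised text pipeline evaluated by ROC AUC. The sketch below uses a TF-IDF plus logistic-regression baseline on a dozen invented posts; the paper does not specify this exact model, and its reported AUC of 0.80 refers to the authors' Reddit data, not this toy set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Invented posts; label 1 marks a described minority-stress experience.
posts = [
    "faced slurs at work today", "family rejected me after coming out",
    "afraid to hold hands in public", "internalized shame about who i am",
    "coworkers made hostile jokes", "worried people will find out",
    "great concert last night", "new recipe turned out well",
    "looking for good hiking trails", "my cat learned a trick",
    "finished reading a good book", "weekend trip photos",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

X_train, X_test, y_train, y_test = train_test_split(
    posts, labels, test_size=4, stratify=labels, random_state=0)

# TF-IDF features feeding a logistic-regression classifier.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(X_train, y_train)

# Evaluate by area under the ROC curve, the metric reported in the paper.
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(round(auc, 2))
```

The point of such a classifier in the CGT workflow is scalability: once validated against the theoretically grounded codebook, it extends the hand-coded labels to the full corpus.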
This paper explores online bottom-up populism in China by examining the discursive logics of populism that emerge within expressions of populist discontent. Through a conceptualization of the affordances of social media that considers what they enable alongside what they constrain, it uses a computational grounded theory approach to examine individuals’ posts and the use of hashtags in online communication on Sina Weibo around the #DrivingIntoThePalaceMuseum case. Through its analysis, three discursive logics of online populism are identified: antagonistic logic, polarization logic and protest logic. However, while the affordances of social media enable populist discourse polarization, they also enable “depolarization” through the government’s censorship mechanisms. This results in a dynamic bottom-up populism articulation that reflects an awareness of China’s censorship mechanisms. Within the Chinese media environment, this functions as a “pressure valve” releasing the buildup of populist sentiment in a Chinese “social volcano.”
To those involved in discussions about rigor, reproducibility, and replication in science, conversations about the “reproducibility crisis” appear ill-structured. Seemingly very different issues concerning the purity of reagents, accessibility of computational code, or misaligned incentives in academic research writ large are all collected under this label. Prior work has attempted to address this problem by creating analytical definitions of reproducibility. We take a novel empirical, mixed-methods approach to understanding variation in reproducibility discussions, using a combination of grounded theory and correspondence analysis to examine how a variety of authors narrate the story of the reproducibility crisis. Contrary to expectations, this analysis demonstrates that there is a clear thematic core to reproducibility discussions, centered on the incentive structure of science, the transparency of methods and data, and the need to reform academic publishing. However, we also identify three clusters of discussion that are distinct from the main body of articles: one focused on reagents, another on statistical methods, and a final cluster focused on the heterogeneity of the natural world. Although there are discursive differences between scientific and popular articles, we find no strong differences in how scientists and journalists write about the reproducibility crisis. Our findings demonstrate the value of using qualitative methods to identify the bounds and features of reproducibility discourse, and identify distinct vocabularies and constituencies that reformers should engage with to promote change.
Background Since COVID-19 vaccines became broadly available to the adult population, sharp divergences in uptake have emerged along partisan lines. Researchers have indicated a polarized social media presence contributing to the spread of mis- or disinformation as being responsible for these growing partisan gaps in uptake. Objective The major aim of this study was to investigate the role of influential actors in the context of the community structures and discourse related to COVID-19 vaccine conversations on Twitter that emerged prior to the vaccine rollout to the general population, and to discuss implications for vaccine promotion and policy. Methods We collected tweets on COVID-19 between July 1, 2020, and July 31, 2020, a time when attitudes toward the vaccines were forming but before the vaccines were widely available to the public. Using network analysis, we identified different naturally emerging Twitter communities based on their internal information sharing. A PageRank algorithm was used to quantitatively measure the “influentialness” of Twitter accounts and identify the “influencers,” who were then coded into different actor categories. Inductive coding was conducted to describe discourses shared in each of the 7 communities. Results Twitter vaccine conversations were highly polarized, with different actors occupying separate “clusters.” The antivaccine cluster was the most densely connected group. Among the 100 most influential actors, medical experts were outnumbered both by partisan actors and by activist vaccine skeptics or conspiracy theorists. Scientists and medical actors were largely absent from the conservative network, and antivaccine sentiment was especially salient among actors on the political right. Conversations related to COVID-19 vaccines were highly polarized along partisan lines, with “trust” in vaccines being manipulated to the political advantage of partisan actors.
Conclusions These findings are informative for designing improved vaccine communication strategies to be delivered on social media, especially by incorporating influential actors. Although polarization and echo chamber effects are not new in political conversations on social media, it was concerning to observe them in health conversations about COVID-19 vaccines during the vaccine development process.
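The PageRank step the study describes can be sketched as plain power iteration in a few lines of pure Python. This is a generic illustration over a toy directed graph of invented account names, not the study's Twitter data or tooling:

```python
def pagerank(edges, d=0.85, iters=100):
    """Power-iteration PageRank over a directed graph of (source, target) edges,
    e.g. retweet or mention links. Returns a score per node summing to 1."""
    nodes = sorted({n for e in edges for n in e})
    out = {n: [] for n in nodes}
    for s, t in edges:
        out[s].append(t)
    N = len(nodes)
    pr = {n: 1.0 / N for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - d) / N for n in nodes}  # teleportation mass
        for n in nodes:
            if out[n]:
                share = d * pr[n] / len(out[n])
                for t in out[n]:
                    nxt[t] += share
            else:
                # Dangling node: spread its mass evenly so the total stays 1.
                for t in nodes:
                    nxt[t] += d * pr[n] / N
        pr = nxt
    return pr
```

In the study's setting, the highest-scoring nodes would then be hand-coded into actor categories (medical expert, partisan actor, and so on).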
No abstract available
No abstract available
This paper presents a practical guide to machine learning–assisted visual content analysis for social scientists. Combining machine automation with human expertise and reflexivity, the proposed methodological framework bridges the gap between computer vision and social research. Our custom approach combines inductive, deductive, and abductive logics of scientific inquiry and consists of three complementary steps: (a) Pattern exploration—employing unsupervised learning to explore visual patterns within image datasets; (b) Theory-driven image classification—utilizing supervised learning with convolutional neural networks to systematically label visual content; and (c) Context-sensitive interpretation—to provide critical and creative engagement with the patterns identified in the previous steps. We illustrate these three steps, and their various combinations, through empirical examples from a study of visuality in digital diplomacy, and critically discuss the epistemological implications of using machine learning as a method in visual social research.
As online communication data continues to grow, manual content analysis, which is frequently employed in media studies within the social sciences, faces challenges in terms of scalability, efficiency, and coding scope. Automated machine learning can address these issues, but it often functions as a black box, offering little insight into the features driving its predictions. This lack of interpretability limits its application in advancing social science communication research and fostering practical outcomes. Here, explainable AI offers a solution that balances high prediction accuracy with interpretability. However, its adoption in social science communication studies remains limited. This study illustrates tensor decomposition—specifically, PARAFAC2—for media scholars as an interpretable machine learning method for analyzing high-dimensional communication data. By transforming complex datasets into simpler components, tensor decomposition reveals the nuanced relationships among linguistic features. Using a labeled spam review dataset as an illustrative example, this study demonstrates how the proposed approach uncovers patterns overlooked by traditional methods and enhances insights into language use. This framework bridges the gap between accuracy and explainability, offering a robust tool for future social science communication research.
Machine learning (ML) has become increasingly influential to human society, yet the primary advancements and applications of ML are driven by research in only a few computational disciplines. Even applications that affect or analyze human behaviors and social structures are often developed with limited input from experts outside of computational fields. Social scientists—experts trained to examine and explain the complexity of human behavior and interactions in the world—have considerable expertise to contribute to the development of ML applications for human-generated data, and their analytic practices could benefit from more human-centered ML methods. Although a few researchers have highlighted some gaps between ML and social sciences [51, 57, 70], most discussions only focus on quantitative methods. Yet many social science disciplines rely heavily on qualitative methods to distill patterns that are challenging to discover through quantitative data. One common analysis method for qualitative data is qualitative coding. In this article, we highlight three challenges of applying ML to qualitative coding. Additionally, we utilize our experience of designing a visual analytics tool for collaborative qualitative coding to demonstrate the potential in using ML to support qualitative coding by shifting the focus to identifying ambiguity. We illustrate dimensions of ambiguity and discuss the relationship between disagreement and ambiguity. Finally, we propose three research directions to ground ML applications for social science as part of the progression toward human-centered machine learning.
The study of how cognition and society interact is a complex endeavor that demands multiple methods and tools. Yet research in social cognition has only begun to capitalize on unsupervised machine learning (UML) tools that can uncover hidden patterns in data. In this tutorial, we introduce UML as a complementary approach to traditional statistical methods. We illustrate four methods (K-means clustering, Density-Based Spatial Clustering of Applications with Noise [DBSCAN], Principal Component Analysis [PCA], and Market Basket Analysis) applied to data from Project Implicit and the Implicit Association Test. We show how UML can identify patterns and relationships that conventional methods might overlook. Throughout, we provide clear (and openly available) code and highlight important researcher decision points in implementing UML in social cognition work. By bringing the advances of UML into social cognition, we will be better equipped to tackle larger, more diverse, or multilevel data sets that reveal the complexities of our social world.
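Of the four methods the tutorial covers, K-means is the simplest to sketch. Below is a minimal Lloyd's-algorithm implementation on invented 2-D points; it uses a deterministic initialization from the first k points for reproducibility, whereas practical analyses would use k-means++ or multiple random restarts:

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm on 2-D points (didactic sketch).

    Deterministic init from the first k points; returns (centers, labels).
    """
    centers = list(points[:k])

    def d2(p, c):  # squared Euclidean distance
        return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

    for _ in range(iters):
        # Assignment step: each point joins its nearest center's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda j: d2(p, centers[j]))].append(p)
        # Update step: recompute each center as its cluster mean.
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda j: d2(p, centers[j])) for p in points]
    return centers, labels
```

The researcher decision points the tutorial emphasizes (choice of k, feature scaling, initialization) all sit outside this core loop.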
Computational power and big data have created new opportunities to explore and understand the social world. A special synergy is possible when social scientists combine human attention to certain aspects of the problem with the power of algorithms to automate other aspects of the problem. We review selected exemplary applications where machine learning amplifies researcher coding, summarizes complex data, relaxes statistical assumptions, and targets researcher attention to further social science research. We aim to reduce perceived barriers to machine learning by summarizing several fundamental building blocks and their grounding in classical statistics. We present a few guiding principles and promising approaches where we see particular potential for machine learning to transform social science inquiry. We conclude that machine learning tools are increasingly accessible, worthy of attention, and ready to yield new discoveries for social research.
At its most fundamental, "social science is the process of creating generalizable knowledge that explains or predicts societal patterns" (p. 264). Text as Data: A New Framework for Machine Learning and the Social Sciences seeks to provide readers with a model to do just this, but with a relatively untapped form of data, at least for the social sciences. Using text as data happens frequently in the computer science world, and Justin Grimmer, Margaret E. Roberts, and Brandon M. Stewart, the authors of this text, seek to extend known computer science methodology to align with social science methodological principles. The authors bridge this gap by applying our methodological models (some of them, at least) to this novel, time-relevant, and expanding form of data. This is an ambitious text that, at different stages, provides critical insight for undergraduates, graduate students across the social sciences, and practitioners. Text as Data systematically walks readers through the research process, from selection and representation to discovery to measurement and, finally, to inference and prediction. In the first section of the text, they concisely detail this model of research and the justifications behind it for the more novice scholars. The text then introduces each stage of this research process, laying out the assumptions and best practices informing this specific approach with text as data. Common to all of these introductory chapters is the emphasis on the crucial role of the human researcher. The authors do not shy away from a common fear in analyses with "big data," that human work is becoming obsolete and theory is disappearing.
Instead, they make a compelling case that although the analytic processes necessitated by "big data" may seem (and sometimes even be named) as if computers are operating independently of theory and of humans, the social science project will only succeed with the continued and constant engagement of the human-generated ideas behind the projects. Following each of these introductory chapters that adeptly frame the overall endeavor and lay out the novel application of research methods to text data, the authors present a thorough overview of the many ways in which practitioners can pursue research with text data. Here, the authors present work that has already been done in the social sciences (e.g., authorship of the Federalist papers, identifying a model of Congressional ideology from press releases, authorship and tone of tweets from former President Trump) and also work through one or more basic algorithms to link the reader to the algebraic and mathematical progressions that provide the foundation for machine learning (or other similarly opaque procedures). Concluding these detailed presentations of possible steps through the research process, the text progresses to the next step in the research process (i.e., from measurement to inference), clearly linking and overlapping these processes where appropriate. Often methodological training in the social sciences bends in the direction of either inductive or deductive research. Researchers seek, often going to extreme measures, to justify their conceptualization, operationalization, modeling, and interpretation choices prior to embarking on analytical procedures in order to avoid questions of over-fitting, p-hacking, and the like. Alternatively, researchers embark on scholarly pursuits to build theory emerging from their research sites and informants, often utilizing only qualitative techniques to do so.
Especially in elementary methodological training, these two tracks are distinct and, sometimes, juxtaposed as opposites. Not so in this text, where the authors use the emergent and exciting field of text data to emphasize the importance of iterative and sequential scholarship. The authors showcase across these four stages of the research process the opportunities for building a comprehensive research agenda that celebrates multiple approaches.
What attitudes, values, and beliefs serve as key markers of cultural change? To answer this question, we examined 221,485 respondents from the World Values Survey, a multiwave cross-country survey of people's attitudes, values, and beliefs. We trained a machine learning model to classify respondents into seven waves (i.e., periods). Once trained, the machine learning model identified the wave of a separate group of 24,611 respondents with a balanced accuracy of 77%. We then queried the model to identify the attitudes, values, and beliefs that contributed the most to its classification decisions and, therefore, served as markers of cultural change. These included religiosity, social attitudes, political attitudes, independence, life satisfaction, Protestant work ethic, and prosociality. Although past research in cultural change has discussed decreasing religiosity and increasing liberalism and independence, it has not yet identified Protestant work ethic, political orientation, and prosociality as values relevant to cultural change. Thus, the current research points to new directions for future research on cultural change that might not be evident from either a deductive or an inductive approach. This research illustrates that the abductive approach of machine learning, which focuses on the most likely explanations for an outcome, can help generate novel insights.
Background Among racial and ethnic minority groups, the risk of HIV infection is an ongoing public health challenge. Pre-exposure prophylaxis (PrEP) is highly effective for preventing HIV when taken as prescribed. However, there is a need to understand the experiences, attitudes, and barriers of PrEP for racial and ethnic minority populations and sexual minority groups. Objective This infodemiology study aimed to leverage big data and unsupervised machine learning to identify, characterize, and elucidate experiences and attitudes regarding perceived barriers associated with the uptake and adherence to PrEP therapy. This study also specifically examined shared experiences from racial or ethnic populations and sexual minority groups. Methods The study used data mining approaches to collect posts from popular social media platforms such as Twitter, YouTube, Tumblr, Instagram, and Reddit. Posts were selected by filtering for keywords associated with PrEP, HIV, and approved PrEP therapies. We analyzed data using unsupervised machine learning, followed by manual annotation using a deductive coding approach to characterize PrEP and other HIV prevention–related themes discussed by users. Results We collected 522,430 posts over a 60-day period, including 408,637 (78.22%) tweets, 13,768 (2.63%) YouTube comments, 8728 (1.67%) Tumblr posts, 88,177 (16.88%) Instagram posts, and 3120 (0.6%) Reddit posts. After applying unsupervised machine learning and content analysis, 785 posts were identified that specifically related to barriers to PrEP, and they were grouped into three major thematic domains: provider level (13/785, 1.7%), patient level (570/785, 72.6%), and community level (166/785, 21.1%). 
The main barriers identified in these categories included those associated with knowledge (lack of knowledge about PrEP), access issues (lack of insurance coverage, no prescription, and impact of the COVID-19 pandemic), and adherence (subjective reasons for why users terminated PrEP or decided not to start PrEP, such as side effects, alternative HIV prevention measures, and social stigma). Among the 785 PrEP posts, we identified 320 (40.8%) posts where users self-identified as a racial or ethnic minority or as a sexual minority group with their specific PrEP barriers and concerns. Conclusions Both objective and subjective reasons were identified as barriers reported by social media users when initiating, accessing, and adhering to PrEP. Though ample evidence supports PrEP as an effective HIV prevention strategy, user-generated posts nevertheless provide insights into what barriers are preventing broader adoption of PrEP, including topics specific to two groups: sexual minority groups and racial and ethnic minority populations. Results have the potential to inform future health promotion and regulatory science approaches that can reach these HIV and AIDS communities that may benefit from PrEP.
Background Prior research documents that India has the greatest number of girls married as minors of any nation in the world, increasing social and health risks for both these young wives and their children. While the prevalence of child marriage has declined in the nation, more work is needed to accelerate this decline and mitigate the negative consequences of the practice. Expanded targets for intervention require greater identification of these targets. Machine learning can offer insight into identification of novel factors associated with child marriage that can serve as targets for intervention. Methods We applied machine learning methods to retrospective cross-sectional survey data from India on demographics and health, the nationally representative National Family Health Survey, conducted in 2015-16. We analyzed data using a traditional regression model, with child marriage as the dependent variable, and 4000+ variables from the survey as the independent variables. We also used three commonly used machine learning algorithms: Least Absolute Shrinkage and Selection Operator (lasso) or L1-regularized logistic regression models; L2-regularized logistic regression or ridge models; and neural network models. Finally, we developed and applied a novel and rigorous approach involving expert qualitative review and coding of variables generated from an iterative series of regularized models to assess thematically key variable groupings associated with child marriage. Findings Analyses revealed that regularized logistic and neural network applications demonstrated better accuracy and lower error rates than traditional logistic regression, with a greater number of features and variables generated. Regularized models highlight higher fertility and contraception, longer duration of marriage, and geographic and socioeconomic vulnerabilities as key correlates, findings consistent with prior research.
However, our novel method involving expert qualitative coding of variables generated from iterative regularized models and resultant thematic generation offered clarity on variables not focused upon in prior research, specifically non-utilization of health system benefits related to nutrition for mothers and infants. Interpretation Machine learning appears to be a valid means of identifying key correlates of child marriage in India and, via our innovative iterative thematic approach, can be useful to identify novel variables associated with this outcome. Findings related to low nutritional service uptake also demonstrate the need for more focus on public health outreach for nutritional programs tailored to this population.
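The L2-regularized (ridge) logistic models the study iterates over can be sketched with plain batch gradient descent. This toy version uses a single invented numeric feature rather than the survey's 4,000+ variables, leaves the bias unregularized as is conventional, and uses illustrative hyperparameters:

```python
import math


def ridge_logistic(X, y, lam=0.1, lr=0.1, epochs=1000):
    """L2-regularized logistic regression via batch gradient descent (toy sketch).

    X: rows of numeric features; y: 0/1 labels; lam: ridge penalty strength.
    Returns (weights, bias).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(epochs):
        gw = [lam * wj for wj in w]  # gradient of the (lam/2)*||w||^2 penalty
        gb = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            pred = 1.0 / (1.0 + math.exp(-z))  # sigmoid
            err = pred - yi
            for j in range(p):
                gw[j] += err * xi[j] / n
            gb += err / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b
```

The lasso (L1) variant the study also uses differs only in the penalty term, which drives many coefficients exactly to zero and thereby performs the variable selection that feeds the expert coding step.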
This article proposes an automated methodology for the analysis of online political discourse. Drawing from the discourse quality index (DQI) by Steenbergen et al., it applies a machine learning–based quantitative approach to measuring the discourse quality of political discussions online. The DelibAnalysis framework aims to provide an accessible, replicable methodology for the measurement of discourse quality that is both platform and language agnostic. The framework uses a simplified version of the DQI to train a classifier, which can then be used to predict the discourse quality of any non-coded comment in a given political discussion online. The objective of this research is to provide a systematic framework for the automated discourse quality analysis of large datasets and, in applying this framework, to yield insight into the structure and features of political discussions online.
Sociolinguistics researchers can use sociolinguistic auto-coding (SLAC) to predict humans’ hand-codes of sociolinguistic data. While auto-coding promises opportunities for greater efficiency, like other computational methods there are inherent concerns about this method’s fairness – whether it generates equally valid predictions for different speaker groups. Unfairness would be problematic for sociolinguistic work given the central importance of correlating speaker groups to differences in variable usage. The current study examines SLAC fairness through the lens of gender fairness in auto-coding Southland New Zealand English non-prevocalic /r/. First, given that there are multiple, mutually incompatible definitions of machine learning fairness, I argue that fairness for SLAC is best captured by two definitions (overall accuracy equality and class accuracy equality) corresponding to three fairness metrics. Second, I empirically assess the extent to which SLAC is prone to unfairness; I find that a specific auto-coder described in previous literature performed poorly on all three fairness metrics. Third, to remedy these imbalances, I tested unfairness mitigation strategies on the same data; I find several strategies that reduced unfairness to virtually zero. I close by discussing what SLAC fairness means not just for auto-coding, but more broadly for how we conceptualize variation as an object of study.
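The two fairness definitions the study argues for, overall accuracy equality and class accuracy equality, both reduce to comparing accuracies across speaker groups. A stdlib sketch of that comparison (the group labels and codes below are invented, not the Southland /r/ data, and this is a generic illustration rather than the paper's exact metrics):

```python
from collections import defaultdict


def group_accuracies(records):
    """records: (group, true_code, predicted_code) triples.

    Returns (overall accuracy per group, per-class accuracy per (group, class)).
    Overall accuracy equality asks the first dict's values to match across
    groups; class accuracy equality asks the same of the second, class by class.
    """
    hit, tot = defaultdict(int), defaultdict(int)
    chit, ctot = defaultdict(int), defaultdict(int)
    for g, yt, yp in records:
        tot[g] += 1
        hit[g] += (yt == yp)
        ctot[(g, yt)] += 1
        chit[(g, yt)] += (yt == yp)
    overall = {g: hit[g] / tot[g] for g in tot}
    per_class = {k: chit[k] / ctot[k] for k in ctot}
    return overall, per_class
```

Identical overall accuracies can hide large per-class gaps, which is why the study treats the two definitions as distinct requirements.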
No abstract available
Organizational factors, as literature indicates, are significant contributors to risk in high-consequence industries. Therefore, building a theoretical framework equipped with reliable modeling techniques and data analytics to quantify the influence of organizational performance on risk scenarios is important for improving realism in Probabilistic Risk Assessment (PRA). The Socio-Technical Risk Analysis (SoTeRiA) framework theoretically connects the structural (e.g., safety practices) and behavioral (e.g., safety culture) aspects of an organization with PRA. An Integrated PRA (I-PRA) methodological framework is introduced to operationalize SoTeRiA in order to quantify the incorporation of underlying organizational failure mechanisms into risk scenarios. This research focuses on the Data-Theoretic module of I-PRA, which has two sub-modules: (i) DT-BASE: developing detailed causal relationships in SoTeRiA, grounded on theories and equipped with a semi-automated baseline quantification utilizing information extracted from academic articles, industry procedures, and regulatory standards, and (ii) DT-SITE: conducting automated data extraction and inference methods to quantify SoTeRiA causal elements based on site-specific event databases and by Bayesian updating of the DT-BASE baseline quantification. A case study demonstrates the quantification of a nuclear power plant's organizational “training” causal model, which is associated with the training/experience in Human Reliability Analysis, along with a sensitivity analysis to identify critical factors.
In this paper, we present our research using intersectionality as a standpoint for the theoretical and methodological evaluation and design of a human-centered, interactive, web-based, narrative-focused prototype. This prototype system functions as a participatory toolkit through which marginalized users can submit, annotate, and explore personal stories of discriminatory algorithmic encounters. Existing traditional fairness models tend to treat algorithmic bias as a technical defect, something to be corrected by recalibrating datasets or adjusting model weights. However, such approaches often ignore the deeper structural, systemic, sociohistorical roots of inequality and the experiences of those most harmed by technological systems. Our research challenges these reductive paradigms by operationalizing intersectionality as a new methodological framework for bias evaluation and equity-grounded design. By incorporating the lived experiences of algorithmic violence toward Black women and others existing at the intersection of multiple identities into computing workflows, this study aims to shift the paradigms of traditional algorithmic system design and epistemological computing research frameworks, and to design and develop a participatory, narrative-based system that relies on the four strategies of intersectionality methodology. This scholarship employs participatory narrative data collection, reflexive analytics, hybrid integration of natural language processing technology, researcher positionality and reflexive logging, data annotation, and feminist data visualizations. Intersectional ethics of contextualized research frameworks, integrated methodological pluralism, and embedded reflexive documentation are the main contributions of this work. This dissertation not only contributes a functional research tool but also bridges the gap between critical theory and the computation of bias-aware systems.
No abstract available
Qualitative analysis is critical to understanding human datasets in many social science disciplines. A central method in this process is inductive coding, where researchers identify and interpret codes directly from the datasets themselves. Yet, this exploratory approach poses challenges for meeting methodological expectations (such as "depth" and "variation"), especially as researchers increasingly adopt Generative AI (GAI) for support. Ground-truth-based metrics are insufficient because they contradict the exploratory nature of inductive coding, while manual evaluation can be labor-intensive. This paper presents a theory-informed computational method for measuring inductive coding results from humans and GAI. Our method first merges individual codebooks using an LLM-enriched algorithm. It measures each coder's contribution against the merged result using four novel metrics: Coverage, Overlap, Novelty, and Divergence. Through two experiments on a human-coded online conversation dataset, we 1) reveal the merging algorithm's impact on metrics; 2) validate the metrics' stability and robustness across multiple runs and different LLMs; and 3) showcase the metrics' ability to diagnose coding issues, such as excessive or irrelevant (hallucinated) codes. Our work provides a reliable pathway for ensuring methodological rigor in human-AI qualitative analysis.
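The intuition behind measuring a coder against a merged codebook can be sketched with plain set arithmetic. The functions below are simple overlap proxies for Coverage and Novelty only, and they are not the paper's method: the actual metrics operate on codebooks merged by an LLM-enriched algorithm, not on exact-match sets.

```python
def codebook_metrics(coder_codes, merged_codes):
    """Toy set-based stand-ins for two codebook-contribution metrics.

    coverage: share of the merged codebook this coder's codes hit.
    novelty: share of this coder's codes absent from the merged result.
    """
    coder, merged = set(coder_codes), set(merged_codes)
    coverage = len(coder & merged) / len(merged)
    novelty = len(coder - merged) / len(coder)
    return coverage, novelty
```

A high novelty score on this toy version could mean either genuinely new codes or hallucinated ones, which is exactly the diagnostic ambiguity the paper's richer metrics are designed to resolve.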
Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements around important social issues. Controversial topics, including vaccination, abortion, racism, and sexual orientation, often elicit opinions and attitudes that are not solely based on evidence but rather reflect moral worldviews. Recent advances in Natural Language Processing (NLP) show that moral values can be gauged in human-generated textual content. Building on the Moral Foundations Theory (MFT), this paper introduces MoralBERT, a range of language representation models fine-tuned to capture moral sentiment in social discourse. We describe a framework for both aggregated and domain-adversarial training on multiple heterogeneous MFT human-annotated datasets sourced from Twitter (now X), Reddit, and Facebook that broaden textual content diversity in terms of social media audience interests, content presentation and style, and spreading patterns. We show that the proposed framework achieves an average F1 score that is between 11% and 32% higher than lexicon-based approaches, Word2Vec embeddings, and zero-shot classification with large language models such as GPT-4 for in-domain inference. Domain-adversarial training yields better out-of-domain predictions than aggregate training while achieving comparable performance to zero-shot learning. Our approach contributes to annotation-free and effective morality learning, and provides useful insights towards a more comprehensive understanding of moral narratives in controversial social debates using NLP.
We propose a new methodology for analysing hostile narratives by incorporating theories from Social Science into a Natural Language Processing (NLP) pipeline. Drawing upon Peace Research, we use the “Self-Other gradient” from the theory of cultural violence to develop a framework and methodology for analysing hostile narratives. As test data for this development, we contrast Hitler’s Mein Kampf and texts from the “War on Terror” era with non-violent speeches from Martin Luther King. Our experiments with this dataset question the explanatory value of numerical outputs generated by quantitative methods in NLP. In response, we draw upon narrative analysis techniques for the technical development of our pipeline. We experimentally show how analysing narrative clauses has the potential to generate outputs of improved explanatory value relative to purely quantitative methods. To the best of our knowledge, this work constitutes the first attempt to incorporate cultural violence into an NLP pipeline for the analysis of hostile narratives.
Purpose: The present study is about generating metadata to enhance thematic transparency and facilitate research on interview collections at the Research Documentation Centre, Centre for Social Sciences (TK KDK) in Budapest. It explores the use of artificial intelligence (AI) in producing, managing and processing social science data and its potential to generate useful metadata to describe the contents of such archives on a large scale. Design/methodology/approach: The authors combined manual and automated/semi-automated methods of metadata development and curation. The authors developed a suitable domain-oriented taxonomy to classify a large text corpus of semi-structured interviews. To this end, the authors adapted the European Language Social Science Thesaurus (ELSST) to produce a concise, hierarchical structure of topics relevant in the social sciences. The authors identified and tested the most promising natural language processing (NLP) tools supporting the Hungarian language. The results of manual and machine coding will be presented in a user interface. Findings: The study describes how an international social scientific taxonomy can be adapted to a specific local setting and tailored to be used by automated NLP tools. The authors show the potential and limitations of existing and new NLP methods for thematic assignment. The current possibilities of multi-label classification in social scientific metadata assignment are discussed, i.e. the problem of automated selection of relevant labels from a large pool. Originality/value: Interview materials have not yet been used for building manually annotated training datasets for automated indexing of scientifically relevant topics in a data repository. Comparing various automated-indexing methods, this study shows a possible implementation of a researcher tool supporting custom visualizations and the faceted search of interview collections.
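The label-selection problem the authors highlight — choosing relevant labels from a large pool — can be sketched as a thresholded ranking with a top-k fallback. The topic names, scores, and cutoffs below are hypothetical, not taken from the study.

```python
# Sketch of the label-selection step in multi-label indexing: given
# per-topic confidence scores from some upstream classifier (hard-coded
# here), keep every label above a threshold, falling back to the top-k
# so that no document is left unindexed for faceted search.
def select_labels(scores: dict, threshold: float = 0.5, top_k: int = 2) -> list:
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    chosen = [label for label, s in ranked if s >= threshold]
    if not chosen:  # fallback: always assign something
        chosen = [label for label, _ in ranked[:top_k]]
    return chosen

scores = {"migration": 0.81, "labour market": 0.55, "family": 0.22, "housing": 0.07}
labels = select_labels(scores)
```

The threshold/fallback trade-off is exactly the tension the abstract describes: a strict threshold improves precision but risks leaving interviews without any index terms.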
Exposure to information sources of different types and modalities, such as social media, movies, scholarly reports, and interactions with other communities and groups can change a person's values as well as their knowledge and attitude towards various social phenomena. My doctoral research aims to analyze the effect of these stimuli on people and groups by applying mixed-method approaches that include techniques from natural language processing, close reading, and machine learning. The research leverages different types of user-generated texts (i.e., social media and customer reviews), and professionally-generated texts (i.e., scholarly publications and organizational documents) to study (1) the impact of information that aims to advance social good for individuals and society, and (2) the impact of social and individual biases on people's language use. This work contributes to advancing knowledge, theory and computational solutions relevant to the field of computational social science. The approaches and insights discussed can provide a better understanding of people's attitudes and judgments toward issues and events of general interest, which is necessary to develop solutions for minimizing biases, filter bubbles, and polarization while also improving the effectiveness of interpersonal and societal discourse.
Sociology of values: experience of building a taxonomy by using natural language analysis technology
Modern research in the field of sociology of science is becoming more complicated due to the constantly growing publication activity of authors. To track trends in sectoral sociology, scientists turn to scientometric methods, but these alone are not enough. Trends in the development of the sociology of values as a branch of sociology are the subject of the study. The purpose of the work is to assess the possibilities of using natural language analysis methods (NLP/NLA) for thematic and theoretical clustering of research in the sociology of values. The study combined quantitative and qualitative approaches and was carried out in two stages. In the first stage, 121 abstracts of scientific articles were analyzed using text mining, after which the full set was divided into clusters. In the second stage, the results of machine clustering were examined using qualitative text analysis, on the basis of which the limitations and capabilities of the NLP/NLA method for clustering scientific texts were identified. It was found that articles with a more conservative core of theoretical categories (gender studies, migration studies, the theory of globalism) are more amenable to clustering, while theories with a loosely structured and fluid theoretical core (theories using environmental terminology, theories of inequality) are much less amenable to explicit clustering. The results obtained open a new direction for working with large arrays of scientific texts, associated with clustering them using NLP/NLA. Building clusters enables researchers to work with all texts in a given subject area, and not just with the most cited ones. This, in turn, provides visibility for all scientific ideas, including those that have not gained popularity or notability.
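The first, quantitative stage — vectorizing abstracts and grouping them by similarity — can be sketched in pure Python with TF-IDF vectors and greedy cosine-similarity grouping. The three toy "abstracts" and the similarity threshold are illustrative; the study's actual clustering algorithm is not specified in the abstract.

```python
# TF-IDF + greedy cosine clustering: a minimal stand-in for the
# text-mining stage described above.
import math
from collections import Counter

docs = [
    "gender migration inequality values study",
    "gender migration women values study",
    "environmental climate discourse terminology",
]

def tfidf(corpus):
    """One sparse weight dict per document: tf * log(N / df)."""
    tokenized = [doc.split() for doc in corpus]
    df = Counter(t for doc in tokenized for t in set(doc))
    n = len(tokenized)
    return [
        {t: (c / len(doc)) * math.log(n / df[t]) for t, c in Counter(doc).items()}
        for doc in tokenized
    ]

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def greedy_cluster(vectors, threshold=0.2):
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            # compare against the cluster's first member as representative
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

clusters = greedy_cluster(tfidf(docs))
```

The second, qualitative stage then inspects each machine-produced cluster by hand — which is where the study found that loosely structured theoretical cores resist clean separation.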
The U.S. has the highest rate of firearm-related deaths when compared to other industrialized countries. Violence particularly affects low-income, urban neighborhoods in cities like Chicago, which saw a 40% increase in firearm violence from 2014 to 2015, to more than 3,000 shooting victims. While recent studies have found that urban, gang-involved individuals curate a unique and complex communication style within and between social media platforms, organizations focused on reducing gang violence are struggling to keep up with the growing complexity of social media platforms and the sheer volume of data they present. In this paper, we describe the Digital Urban Violence Analysis Approach (DUVVA), a qualitative analysis method developed in a collaboration between data scientists and social work researchers to build a suite of systems for decoding the high-stress language of urban, gang-involved youth. Our approach leverages principles of grounded theory when analyzing approximately 800 tweets posted by Chicago gang members, with participation of youth from Chicago neighborhoods, to create a language resource for natural language processing (NLP) methods. In uncovering this unique language and communication style, we developed automated tools with the potential to detect aggressive language on social media and aid individuals and groups in performing violence prevention and interruption.
The escalating frequency and complexity of natural disasters highlight the urgent need for deeper insights into how individuals and communities perceive and respond to risk information. Yet, conventional research methods—such as surveys, laboratory experiments, and field observations—often struggle with limited sample sizes, external validity concerns, and difficulties in controlling for confounding variables. These constraints hinder our ability to develop comprehensive models that capture the dynamic, context-sensitive nature of disaster decision-making. To address these challenges, we present a novel multi-stage simulation framework that integrates Large Language Model (LLM)-driven social–cognitive agents with well-established theoretical perspectives from psychology, sociology, and decision science. This framework enables the simulation of three critical phases—information perception, cognitive processing, and decision-making—providing a granular analysis of how demographic attributes, situational factors, and social influences interact to shape behavior under uncertain and evolving disaster conditions. A case study focusing on pre-disaster preventive measures demonstrates its effectiveness. By aligning agent demographics with real-world survey data across 5864 simulated scenarios, we reveal nuanced behavioral patterns closely mirroring human responses, underscoring the potential to overcome longstanding methodological limitations and offer improved ecological validity and flexibility to explore diverse disaster environments and policy interventions. While acknowledging the current constraints, such as the need for enhanced emotional modeling and multimodal inputs, our framework lays a foundation for more nuanced, empirically grounded analyses of risk perception and response patterns. By seamlessly blending theory, advanced LLM capabilities, and empirical alignment strategies, this research not only advances the state of computational social simulation but also provides valuable guidance for developing more context-sensitive and targeted disaster management strategies.
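The three-phase pipeline (information perception → cognitive processing → decision-making) can be sketched structurally. Here a rule-based `appraise` function stands in for the LLM call, and all attribute names and thresholds are invented for illustration, not drawn from the paper.

```python
# Structural sketch of a three-phase social-cognitive agent:
# perceive -> appraise -> decide. Rule-based logic replaces the LLM.
from dataclasses import dataclass

@dataclass
class Agent:
    age: int
    prior_experience: bool       # has lived through a disaster before
    trusts_official_sources: bool

def perceive(agent: Agent, warning: dict) -> dict:
    """Phase 1: filter the warning through the agent's information habits."""
    credibility = 0.9 if agent.trusts_official_sources else 0.5
    return {"severity": warning["severity"], "credibility": credibility}

def appraise(agent: Agent, percept: dict) -> float:
    """Phase 2: stand-in for LLM-driven cognitive processing."""
    risk = percept["severity"] * percept["credibility"]
    if agent.prior_experience:
        risk *= 1.3  # experienced agents weight threats more heavily
    return risk

def decide(risk: float, threshold: float = 0.6) -> str:
    """Phase 3: map appraised risk onto a preventive action."""
    return "prepare" if risk >= threshold else "wait"

agent = Agent(age=54, prior_experience=True, trusts_official_sources=True)
action = decide(appraise(agent, perceive(agent, {"severity": 0.7})))
```

In the paper's framework, the appraisal step is where the LLM injects demographically conditioned reasoning; the skeleton above only fixes the interfaces between the three phases.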
The convergence of expanded data availability, technological breakthroughs, and evolving societal dynamics has rekindled scholarly interest in culture and meaning. Accompanying computational methods are advancing rapidly across natural language processing, digital image processing, machine learning, neural networks, and artificial intelligence, which creates unprecedented opportunities for cultural meaning analysis. The introduction to this special issue advances abductive theorizing and reflexive rendering to explore the dynamic interplay of theory generation, measurement, and interpretation, highlighting common themes across its five articles. Our analysis unfolds in three stages. We begin with an abductive examination of the culture and meaning theories covered in the articles, exploring how they shape data selection and preparation, as well as research designs for addressing theoretical questions. Next, we engage in reflexive rendering, scrutinizing data characteristics and representations, methodological approaches, computational methods, and the theoretical artifacts that emerge to advance theory development. Finally, we discuss possible contributions, challenges, and concerns when organizational research examines culture and meaning using computational methods.
While reinforcement learning (RL) over chains of thought has significantly advanced language models in tasks such as mathematics and coding, visual reasoning introduces added complexity by requiring models to direct visual attention, interpret perceptual inputs, and ground abstract reasoning in spatial evidence. We introduce ViGoRL (Visually Grounded Reinforcement Learning), a vision-language model trained with RL to explicitly anchor each reasoning step to specific visual coordinates. Inspired by human visual decision-making, ViGoRL learns to produce spatially grounded reasoning traces, guiding visual attention to task-relevant regions at each step. When fine-grained exploration is required, our novel multi-turn RL framework enables the model to dynamically zoom into predicted coordinates as reasoning unfolds. Across a diverse set of visual reasoning benchmarks--including SAT-2 and BLINK for spatial reasoning, V*bench for visual search, and ScreenSpot and VisualWebArena for web-based grounding--ViGoRL consistently outperforms both supervised fine-tuning and conventional RL baselines that lack explicit grounding mechanisms. Incorporating multi-turn RL with zoomed-in visual feedback significantly improves ViGoRL's performance on localizing small GUI elements and visual search, achieving 86.4% on V*Bench. Additionally, we find that grounding amplifies other visual behaviors such as region exploration, grounded subgoal setting, and visual verification. Finally, human evaluations show that the model's visual references are not only spatially accurate but also helpful for understanding model reasoning steps. Our results show that visually grounded RL is a strong paradigm for imbuing models with general-purpose visual reasoning.
The ML-Schema, proposed by the W3C Machine Learning Schema Community Group, is a top-level ontology that provides a set of classes, properties, and restrictions for representing and interchanging information on machine learning algorithms, datasets, and experiments. It can be easily extended and specialized, and it is also mapped to other, more domain-specific ontologies developed in the area of machine learning and data mining. In this paper we overview existing state-of-the-art machine learning interchange formats and present the first release of ML-Schema, a canonical format resulting from more than seven years of experience among different research institutions. We argue that exposing the semantics of machine learning algorithms, models, and experiments through a canonical format may pave the way to better interpretability and to realistically achieving full interoperability of experiments regardless of platform or adopted workflow solution.
In the quest to align deep learning with the sciences to address calls for rigor, safety, and interpretability in machine learning systems, this contribution identifies key missing pieces: the stages of hypothesis formulation and testing, as well as statistical and systematic uncertainty estimation -- core tenets of the scientific method. This position paper discusses the ways in which contemporary science is conducted in other domains and identifies potentially useful practices. We present a case study from physics and describe how this field has promoted rigor through specific methodological practices, and provide recommendations on how machine learning researchers can adopt these practices into the research ecosystem. We argue that both domain-driven experiments and application-agnostic questions of the inner workings of fundamental building blocks of machine learning models ought to be examined with the tools of the scientific method, to ensure we not only understand effect, but also begin to understand cause, which is the raison d'être of science.
Intimate relationship stability is fundamental to human wellbeing, yet its quantitative assessment faces dual challenges: the inherent subjectivity of psychological constructs and the complexity of social ecosystems. Symmetry, as a fundamental structural feature of social interaction, plays a pivotal role in shaping relational dynamics. To address these limitations, this study proposes an innovative computational framework that integrates Fuzzy Set Theory with Social Network Analysis (SNA). The framework consists of two complementary components: (1) a psychologically grounded fuzzy assessment model that employs differentiated membership functions to transform discrete subjective ratings into continuous and interpretable relationship quality indices and (2) an enhanced Fuzzy C-Means (FCM) threat detection model that utilizes Weighted Mahalanobis Distance to accurately identify and cluster potential interference sources within social networks. Empirical validation using a simulated dataset—comprising typical characteristic samples from 10 couples—demonstrates that the proposed framework not only generates interpretable relationship diagnostics by correcting biases associated with traditional averaging methods, but also achieves high precision in threat identification. The results indicate that stable relationships exhibit greater symmetry in partner interactions, whereas threatened nodes display structural and behavioural asymmetry. This study establishes a rigorous mathematical paradigm—“Subjective Fuzzification → Multidimensional Feature Engineering → Intelligent Clustering”—for relationship science, thereby advancing the field from descriptive analysis toward data-driven, quantitative evaluation and laying a foundation for systematic assessment of relational health.
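A minimal Fuzzy C-Means implementation illustrates the clustering component of the framework above. Standard Euclidean distance is used where the paper substitutes a Weighted Mahalanobis Distance, and the toy 2-D points are not the couples dataset from the study.

```python
# Minimal Fuzzy C-Means: soft memberships u[k][i] of point k in cluster i.
import math
import random

def dist(a, b):
    """Euclidean distance; the paper's variant would weight dimensions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))) or 1e-12

def fcm(points, c=2, m=2.0, iters=50, seed=0):
    random.seed(seed)
    # random initial membership matrix; each row sums to 1
    u = []
    for _ in points:
        row = [random.random() for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    for _ in range(iters):
        # update centers: (u^m)-weighted mean of the points
        centers = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(len(points))]
            total = sum(w)
            centers.append(tuple(
                sum(w[k] * points[k][d] for k in range(len(points))) / total
                for d in range(len(points[0]))
            ))
        # update memberships: u_ki = 1 / sum_j (d_ki / d_kj)^(2/(m-1))
        for k, p in enumerate(points):
            d = [dist(p, centers[i]) for i in range(c)]
            for i in range(c):
                u[k][i] = 1.0 / sum((d[i] / d[j]) ** (2 / (m - 1)) for j in range(c))
    return centers, u

# two clearly separated toy "interaction profile" groups
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
centers, memberships = fcm(points)
```

The soft memberships matter here: a "threatened" node in the paper's sense is one whose membership is split across clusters, i.e. structurally asymmetric rather than cleanly assigned.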
Comunalidad is the result of struggle and collective reflection emerging from the daily resistance and lived experiences of Indigenous peoples in the Sierra Norte of Oaxaca, Mexico (Maldonado, 2010). Like comunalidad, language reclamation is inherently relational and dynamic, deeply connected to identity, autonomy, and self-determination. In this essay, I explore comunalidad as theory, praxis, and pedagogy asking: How might comunalidad, as a relational and situated praxis, inform efforts toward language revitalization? Understanding the foundations and contextual factors that give rise to comunalidad is necessary to illuminate its intersections with and implications for language reclamation. I argue that comunalidad prompts us to conceive of language reclamation as a collective purpose—one that arises from, informs, and strengthens community relational practices and processes. As theory, comunalidad informs language reclamation; as praxis, it actively shapes both language and the process of reclaiming it. In this way, comunalidad emphasizes the need for situated pedagogies rooted in the daily praxis of the community. Overall, a lens of comunalidad provides insight into how language reclamation can function as a collective process and responsibility, building and strengthening community relationality, self-determination, and resistance.
The expression and interpretation of sentiment within language are deeply intertwined with social and cultural contexts, influencing both linguistic theory and practical applications such as sentiment analysis in computer science. This paper explores sentiment through M.A.K. Halliday’s social semiotic framework, revealing how linguistic mechanisms and social contexts shape emotional communication. By applying Halliday’s principles, we enhance understanding of how sentiment is constructed and communicated across diverse contexts, and propose improvements for sentiment analysis tools in natural language processing. Our findings demonstrate that sentiment is a socially embedded phenomenon, reflecting and shaping interpersonal relationships and cultural values, and suggest new directions for integrating socio-semiotic insights into computational models.
Recently, there has been an increase in text analysis and natural language processing for both research and applied practice, especially to quantify emotions in language (i.e. sentiment analysis). Building on different theories of how language and emotions interact and how these interactions differ by gender and race/ethnicity, our study assesses bias in the use of common sentiment analysis tools (e.g. AFINN, NRC). Specifically, we focus on measurement bias and predictive bias between genders and races/ethnicities using a novel real-world dataset of participant interviews in a simulated multi-day team-based competition. There was no evidence of measurement bias by race/ethnicity, but there were some biases by gender; specifically, females tended to express higher mean levels and more variance in emotion. There was no evidence of predictive bias by gender or race/ethnicity, though the latter was marginally significant. We hope this study paves the way towards more inclusive and accurate analytical tools to help researchers reduce demographic biases in their research. These findings also hold importance for organizations in employing equitable tools to better understand the needs of their diverse customers and employees.
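The measurement-bias check described above reduces, at its simplest, to comparing the mean and variance of sentiment scores across demographic groups. The scores below are fabricated to mirror the reported pattern (higher female mean and variance); they are not the study's data, and real AFINN/NRC scores would come from the interview transcripts.

```python
# Group-wise summary of sentiment scores, the core of a measurement-bias
# audit. Scores and group labels are invented for illustration.
from statistics import mean, variance

scores_by_group = {
    "female": [2.1, 3.4, -0.5, 4.0, 1.2, 3.8],
    "male":   [1.0, 1.4, 0.8, 1.2, 1.1, 0.9],
}

summary = {
    group: {"mean": mean(vals), "variance": variance(vals)}
    for group, vals in scores_by_group.items()
}
```

A full audit would follow this with a significance test and, for predictive bias, a check of whether the tool's scores predict outcomes equally well in each group.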
As the internal structure of cities grows increasingly complex, traditional governance models can no longer meet the demands of social governance, and digital governance has become an important approach to building the urban governance landscape. To explore the factors influencing governance effectiveness from the perspective of county-level government digital governance, this paper reviews digital governance theory and the related literature, analyzes the factors that influence government governance effectiveness from a digital governance perspective, and examines the role each element plays in building a government digital governance framework. The results show that governance effectiveness from a digital governance perspective is influenced mainly by six factors: technical capability, organizational structure, policy design, conceptual change, data chain management, and the degree of multi-stakeholder participation.
This report synthesizes the full evolution of computational grounded theory (CGT), from the construction of its theoretical framework to its interdisciplinary applications. The research has not only established a methodological logic centered on "human-machine collaboration" and "abductive reasoning", but has also achieved deep empirical grounding in areas such as social discourse analysis and the measurement of psychological constructs. At the same time, the academic community has engaged in sustained debate over the bias, fairness, and epistemological rigor of computational tools, driving a paradigm shift from purely automated coding toward a more reflexive, interpretive, and intersectional computational social science.