Title

Theme and Topic: How Qualitative Research and Topic Modeling Can Be Brought Together

Authors

Gillies, Marco; Murthy, Dhiraj; Brenton, Harry; Olaniyan, Rapheal

Abstract

Qualitative research is an approach to understanding social phenomena based around human interpretation of data, particularly text. Probabilistic topic modelling is a machine learning approach that is also based around the analysis of text and is often used to understand social phenomena. Both of these approaches aim to extract important themes or topics in a textual corpus, and therefore we may see them as analogous to each other. However, there are also considerable differences in how the two approaches function. One is a highly human interpretive process; the other is automated and statistical. In this paper we use this analogy as the basis for our Theme and Topic system, a tool for qualitative researchers to conduct textual research that integrates topic modelling into an accessible interface. This is an example of a more general approach to the design of interactive machine learning systems, in which existing human professional processes can be used as the model for processes involving machine learning. This has the particular benefit of providing a familiar approach to existing professionals, which may make machine learning seem less alien and easier to learn. Our design approach has two elements. We first investigate the steps professionals go through when performing tasks and design a workflow for Theme and Topic that integrates machine learning. We then design interfaces for topic modelling in which familiar concepts from qualitative research are mapped onto machine learning concepts. This makes the machine learning concepts more familiar and easier for qualitative researchers to learn.
