Paper Title
CASPR: Customer Activity Sequence-based Prediction and Representation
Authors
Abstract
Tasks critical to enterprise profitability, such as customer churn prediction, fraudulent account detection or customer lifetime value estimation, are often tackled by models trained on features engineered from customer data in tabular format. Application-specific feature engineering adds development, operationalization and maintenance costs over time. Recent advances in representation learning present an opportunity to simplify and generalize feature engineering across applications. When applying these advances to tabular data, researchers must contend with data heterogeneity, variations in customer engagement history or the sheer volume of enterprise datasets. In this paper, we propose a novel approach to encode tabular data containing customer transactions, purchase history and other interactions into a generic representation of a customer's association with the business. We then evaluate these embeddings as features to train multiple models spanning a variety of applications. CASPR, Customer Activity Sequence-based Prediction and Representation, applies a Transformer architecture to encode activity sequences, improving model performance and avoiding bespoke feature engineering across applications. Our experiments at scale validate CASPR for both small and large enterprise applications.