Multi-Instance Causal Representation Learning for Instance Label Prediction and Out-of-Distribution Generalization

Title

Multi-Instance Causal Representation Learning for Instance Label Prediction and Out-of-Distribution Generalization

Authors

Zhang, Weijia; Zhang, Xuanhui; Deng, Han-Wen; Zhang, Min-Ling

Abstract

Multi-instance learning (MIL) deals with objects represented as bags of instances and can predict instance labels from bag-level supervision. However, significant performance gaps exist between instance-level MIL algorithms and supervised learners, since instance labels are unavailable in MIL. Most existing MIL algorithms tackle the problem by treating multi-instance bags as harmful ambiguities and predicting instance labels by reducing the supervision inexactness. This work studies MIL from a new perspective: it treats bags as auxiliary information and uses them to identify instance-level causal representations from bag-level weak supervision. We propose the CausalMIL algorithm, which not only excels at instance label prediction but also provides robustness to distribution change by synergistically integrating MIL with the identifiable variational autoencoder. Our approach is based on a practical and general assumption: the prior distribution over the instance latent representations belongs to the non-factorized exponential family conditioned on the multi-instance bags. Experiments on synthetic and real-world datasets demonstrate that our approach significantly outperforms various baselines on instance label prediction and out-of-distribution generalization tasks.
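As an illustration of the setting the abstract describes, the following minimal Python sketch shows how bag-level supervision hides instance labels, and how each instance can be paired with its bag identity so that the bag acts as auxiliary information. It assumes the standard MIL rule that a bag is positive when at least one of its instances is positive, which the abstract does not state explicitly. The names `Bag`, `make_bag`, and `instances_with_bag_index` are hypothetical, and this is not the CausalMIL implementation.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Bag:
    bag_id: int
    instances: List[List[float]]  # feature vectors of the instances in this bag
    label: int                    # bag-level label: the only supervision a learner sees


def bag_label_from_instances(instance_labels: Sequence[int]) -> int:
    # Assumption (standard MIL rule, not taken from the abstract):
    # a bag is positive iff at least one instance is positive.
    return int(any(instance_labels))


def make_bag(bag_id: int,
             instances: Sequence[Sequence[float]],
             hidden_instance_labels: Sequence[int]) -> Bag:
    # The instance labels are used only to derive the bag label and are then
    # discarded, mimicking the weak supervision available in MIL.
    return Bag(bag_id,
               [list(x) for x in instances],
               bag_label_from_instances(hidden_instance_labels))


def instances_with_bag_index(bags: Sequence[Bag]) -> List[Tuple[List[float], int, int]]:
    # Flatten bags into (instance, bag_id, bag_label) triples so that a model
    # could condition on the bag identity as auxiliary information.
    return [(x, b.bag_id, b.label) for b in bags for x in b.instances]


bags = [
    make_bag(0, [[0.1, 0.2], [0.9, 0.8]], [0, 1]),  # one positive instance
    make_bag(1, [[0.0, 0.1]], [0]),                 # no positive instance
]
triples = instances_with_bag_index(bags)
```

Each triple keeps the instance features together with the bag it came from. In this reading, a model with a bag-conditioned prior over latent representations would consume that pairing, while the instance labels themselves stay unobserved.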
