Optimal Confidence Regions for the Multinomial Parameter

Paper Title

Optimal Confidence Regions for the Multinomial Parameter

Authors

Matthew L. Malloy, Ardhendu Tripathy, Robert D. Nowak

Abstract

Construction of tight confidence regions and intervals is central to statistical inference and decision making. This paper develops new theory showing minimum average volume confidence regions for categorical data. More precisely, consider an empirical distribution $\widehat{\boldsymbol{p}}$ generated from $n$ i.i.d. realizations of a random variable that takes one of $k$ possible values according to an unknown distribution $\boldsymbol{p}$. This is analogous to a single draw from a multinomial distribution. A confidence region is a subset of the probability simplex that depends on $\widehat{\boldsymbol{p}}$ and contains the unknown $\boldsymbol{p}$ with a specified confidence. This paper shows how one can construct minimum average volume confidence regions, answering a long-standing question. We also show that the optimality of the regions translates directly to optimal confidence intervals for linear functionals such as the mean, implying sample complexity and regret improvements for adaptive machine learning algorithms.
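To make the setup in the abstract concrete, the following minimal sketch draws $n$ i.i.d. categorical samples, forms the empirical distribution $\widehat{\boldsymbol{p}}$, and checks how often a simple confidence region contains the true $\boldsymbol{p}$. The region used here is an assumption chosen for illustration: a coordinate-wise box from Hoeffding's inequality with a union bound over the $k$ categories. It is not the minimum average volume construction developed in the paper, and it is generally much looser. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def empirical_distribution(samples, k):
    """Return p_hat: the fraction of samples falling in each of k categories."""
    counts = Counter(samples)
    n = len(samples)
    return [counts.get(i, 0) / n for i in range(k)]


def hoeffding_radius(n, k, delta):
    """Per-coordinate half-width from Hoeffding's inequality plus a union bound.

    Guarantees that all k coordinates of p lie within this distance of p_hat
    simultaneously with probability at least 1 - delta.
    """
    return math.sqrt(math.log(2 * k / delta) / (2 * n))


def in_region(p, p_hat, radius):
    """True if every coordinate of p lies within radius of p_hat."""
    return all(abs(a - b) <= radius for a, b in zip(p, p_hat))


def coverage(p, n, delta, trials, seed=0):
    """Estimate by simulation how often the box region contains the true p."""
    rng = random.Random(seed)
    k = len(p)
    radius = hoeffding_radius(n, k, delta)
    hits = 0
    for _ in range(trials):
        # n i.i.d. draws from the categorical distribution p
        samples = rng.choices(range(k), weights=p, k=n)
        if in_region(p, empirical_distribution(samples, k), radius):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    p_true = [0.5, 0.3, 0.2]
    print(coverage(p_true, n=100, delta=0.05, trials=500))
```

The estimated coverage of such a box typically sits well above the nominal $1 - \delta$, which is a sign that the region is larger than it needs to be. Tightening regions of this kind, in the sense of average volume, is the question the abstract addresses.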
