Paper title
Reducing bias and variance in quantile estimates with an exponential model
Paper authors
Paper abstract
Percentiles and, more generally, quantiles are commonly used to summarize data. For most distributions, there is exactly one quantile whose estimate is unbiased. For distributions such as the Gaussian, whose mean and median coincide, that quantile is the median. The literature describes several ways to estimate quantiles from finite samples, and statistics packages implement them. It is possible to leverage the memoryless property of the exponential distribution to design high-quality estimators that are unbiased and have low variance and low mean squared error. Naturally, these estimators outperform the ones in statistical packages when the underlying distribution is exponential. They also happen to generalize well when that assumption is violated.
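To make the bias claim concrete, here is a minimal sketch that computes, in closed form, the bias of a common package-style estimator on exponential data. It is not the estimator proposed in the paper. It only illustrates why the memoryless property makes the bias tractable: the gaps between consecutive order statistics of an exponential sample are independent exponentials, so the expected k-th order statistic is a partial harmonic sum. The function names are hypothetical, and the estimator analyzed is linear interpolation between order statistics at position (n - 1)q, which is assumed here to correspond to the default method of `numpy.quantile`.

```python
import math


def expected_order_stat(k, n, rate=1.0):
    """E[X_(k)], k 1-indexed, for n i.i.d. Exponential(rate) draws.

    By memorylessness, the i-th spacing between order statistics is
    exponential with rate (n - i + 1) * rate, so the expectation is a sum
    of the spacing means.
    """
    return sum(1.0 / (n - i + 1) for i in range(1, k + 1)) / rate


def true_quantile(q, rate=1.0):
    """Exact q-quantile of Exponential(rate), for 0 < q < 1."""
    return -math.log(1.0 - q) / rate


def expected_linear_interp_estimate(q, n, rate=1.0):
    """Expected value of the linear-interpolation quantile estimator.

    Assumption: the estimator interpolates between sorted samples at the
    0-indexed fractional position h = (n - 1) * q.
    """
    h = (n - 1) * q
    j = math.floor(h)          # 0-indexed lower neighbour
    g = h - j                  # interpolation weight on the upper neighbour
    lower = expected_order_stat(j + 1, n, rate)
    upper = expected_order_stat(min(j + 2, n), n, rate)
    return (1.0 - g) * lower + g * upper


def bias(q, n, rate=1.0):
    """Exact bias of the linear-interpolation estimator on exponential data."""
    return expected_linear_interp_estimate(q, n, rate) - true_quantile(q, rate)


if __name__ == "__main__":
    for q in (0.1, 0.5, 0.9):
        print(f"n=10, q={q}: bias = {bias(q, 10):+.4f}")
```

Under these assumptions the bias is generally nonzero and changes sign as q varies, which is consistent with the abstract's statement that a given estimator is unbiased for only one quantile.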