Paper title
Modeling high-dimensional dependence among astronomical data
Paper authors
Paper abstract
Establishing the relationship among a set of experimental quantities is a fundamental issue in many scientific disciplines. In the 2D case, the classical approach is to compute the linear correlation coefficient from a scatterplot. This method, however, implicitly assumes a linear relationship between the variables, and such an assumption is not always correct. With the use of partial correlation coefficients, an extension to the multidimensional case is possible, but the problem of the assumed mutual linear relationship of the variables remains. A relatively recent approach that makes it possible to avoid this problem is the modeling of the joint probability density function (PDF) of the data with copulas. These are functions that contain all the information on the relationship between two random variables. Although in principle this approach can also work with multidimensional data, theoretical as well as computational difficulties often limit its use to the 2D case. In this paper, we consider an approach based on so-called vine copulas, which overcomes this limitation and at the same time is amenable to a theoretical treatment and feasible from the computational point of view. We apply this method to published data on the near-IR and far-IR luminosities and atomic and molecular gas masses of the Herschel reference sample, a volume-limited sample in the nearby Universe. We determine the relationship of the luminosities and gas masses and show that the far-IR luminosity can be considered as the key parameter relating the other three quantities. Once it is removed from the 4D relation, the residual relation among the latter is negligible. This may be interpreted as the correlation between the gas masses and the near-IR luminosity being driven by the far-IR luminosity, likely through the star formation activity of the galaxy.
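The idea of a "key parameter" that drives the dependence among the other quantities can be illustrated with a minimal sketch. It is not the method of the paper: it uses synthetic data rather than the Herschel sample, only three variables, and a Gaussian pair-copula simplification in which the dependence left after conditioning on the driver is summarized by a partial correlation of normal scores. The names `z`, `x`, `y`, `normal_scores`, and `partial_corr` are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist


def pearson(a, b):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def normal_scores(values):
    """Rank-transform to pseudo-observations in (0, 1), then map them to
    standard normal scores. This removes the marginal distributions and
    keeps only the dependence structure (the copula)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    nd = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = nd.inv_cdf(rank / (n + 1))
    return scores


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic data (an assumption, not real measurements): z plays the role
# of the driver; x and y depend on z through nonlinear monotone maps.
rng = random.Random(0)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [math.exp(zi + 0.5 * rng.gauss(0, 1)) for zi in z]
y = [(zi + 0.5 * rng.gauss(0, 1)) ** 3 for zi in z]

# Linear correlation on the raw values is distorted by the nonlinearity.
r_raw = pearson(x, y)

# Dependence measured on normal scores is insensitive to the margins.
sx, sy, sz = normal_scores(x), normal_scores(y), normal_scores(z)
r_xy = pearson(sx, sy)
r_xz = pearson(sx, sz)
r_yz = pearson(sy, sz)

# Residual dependence between x and y once the driver z is accounted for.
r_xy_given_z = partial_corr(r_xy, r_xz, r_yz)

print(f"raw Pearson(x, y)        = {r_raw:.3f}")
print(f"normal-score corr(x, y)  = {r_xy:.3f}")
print(f"partial corr(x, y | z)   = {r_xy_given_z:.3f}")
```

In this toy setup the normal-score correlation between `x` and `y` is strong, while the partial correlation given `z` is close to zero, which is the pattern the abstract describes for the gas masses and near-IR luminosity once the far-IR luminosity is removed. A vine copula generalizes this construction by allowing each pair, conditional or not, to follow a non-Gaussian copula family.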