Title
Improved Preterm Prediction Based on Optimized Synthetic Sampling of EHG Signal
Authors
Abstract
Preterm labor is the leading cause of neonatal morbidity and mortality and has attracted research efforts from many scientific fields. The relationship between uterine contractions and the underlying electrical activity makes the electrohysterogram (EHG) a promising direction for preterm detection and prediction. Due to the scarcity of EHG signals, especially those of preterm patients, synthetic algorithms are applied to create artificial preterm samples in order to remove prediction bias towards term, at the expense of reduced feature effectiveness in machine-learning-based automatic preterm detection. To address this problem, we quantify the effect of synthetic samples (balance coefficient) on feature effectiveness, and form a general performance metric from multiple feature scores, weighted by their contributions to class separation. Combined with activation/inactivation functions that characterize the effect of the abundance of training samples on term and preterm prediction precision, we obtain an optimal sample balance coefficient that trades off the benefit of synthetic samples in removing bias towards the majority class against their side effect of reducing feature importance. A set of numerical tests on the publicly available TPEHG database shows a substantial improvement in prediction precision and verifies the effectiveness of the proposed method.
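The trade-off described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the abstract does not name the synthetic sampling algorithm, the feature scores, or the weights, so the sketch swaps in a simplified SMOTE-like interpolation between random pairs of minority samples (no nearest-neighbor search) and a Fisher-style separation score. The function names `oversample`, `fisher_score`, `weighted_metric`, and `scan_balance` are hypothetical, and the activation/inactivation functions are omitted because the abstract does not define them.

```python
import random
from statistics import mean, pvariance


def oversample(minority, n_majority, balance, rng):
    """Add interpolated minority rows until minority/majority reaches `balance`.

    Assumption: `balance` is the target minority-to-majority ratio (0 < balance <= 1).
    Simplified SMOTE-like step: interpolate between two random minority rows.
    Requires at least two minority rows.
    """
    target = max(len(minority), round(balance * n_majority))
    rows = [list(r) for r in minority]
    while len(rows) < target:
        a, b = rng.sample(minority, 2)
        t = rng.random()
        rows.append([x + t * (y - x) for x, y in zip(a, b)])
    return rows


def fisher_score(majority_col, minority_col):
    """Squared mean difference over summed class variances for one feature."""
    num = (mean(majority_col) - mean(minority_col)) ** 2
    den = pvariance(majority_col) + pvariance(minority_col)
    return num / den if den > 0 else 0.0


def weighted_metric(majority, minority, weights):
    """Weighted sum of per-feature scores (weights are assumed, not from the paper)."""
    maj_cols = list(zip(*majority))
    min_cols = list(zip(*minority))
    return sum(w * fisher_score(a, b)
               for w, a, b in zip(weights, maj_cols, min_cols))


def scan_balance(majority, minority, weights, balances, seed=0):
    """Evaluate the weighted metric for each candidate balance coefficient."""
    rng = random.Random(seed)
    results = []
    for b in balances:
        augmented = oversample(minority, len(majority), b, rng)
        results.append((b, weighted_metric(majority, augmented, weights)))
    return results
```

Calling `scan_balance(term_rows, preterm_rows, weights, [0.2, 0.4, 0.6, 0.8, 1.0])` on feature matrices stored as lists of rows returns one `(balance, metric)` pair per candidate. A full treatment along the lines of the abstract would combine this metric with the precision-related activation/inactivation terms before choosing the coefficient.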